Key Takeaways
- The Claude apps gateway removes per-developer credential management by routing all Claude Code traffic through a single Cloud Run service identity
- Spend limits can be set per user, group, or org with automatic 429 responses when caps are hit
- Policy enforcement happens server-side, making local managed-settings.json edits ineffective
Anthropic has released a Claude apps gateway for Google Cloud that solves the biggest headache of rolling out Claude Code across an organization: per-developer credential management. The gateway sits between local Claude Code clients and GCP, centralizing identity, policy, spend controls, and telemetry in a single self-hosted service.
Individual developers have been able to point Claude Code at Vertex AI for a while. Set CLAUDE_CODE_USE_VERTEX=1, grant the aiplatform.user role, and inference stays inside your GCP perimeter. Simple. But scaling that to hundreds of engineers means distributing service account keys, pushing config files over MDM, and having zero visibility into who's burning through tokens.

What does the gateway actually do?
The gateway handles five jobs that platform teams otherwise cobble together from scratch: identity, policy, telemetry, spend limits, and routing.
For identity, sign-in requests route through your existing IdP, whether that is Google Workspace or any other OIDC provider. The gateway exchanges the IdP token for a short-lived session, so no service account keys or API keys land on developer laptops. Onboarding means adding someone to an IdP group. Offboarding means removing them, after which their next session refresh fails.
Policy lives in a single gateway.yaml file, resolved per group and enforced server-side. The gateway checks availableModels on every /v1/messages call. Developers can edit their local managed-settings.json all they want. It changes nothing. Rule updates propagate to the entire fleet within the hour.
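To make the server-side check concrete, here is a minimal Python sketch of resolving a per-group model allowlist on each request. It is an illustration, not the gateway's code: the GroupPolicy shape, the union-across-groups rule, the 403 status, and the model names are all assumptions, and the real gateway.yaml schema may differ.

```python
from dataclasses import dataclass, field


@dataclass
class GroupPolicy:
    # Hypothetical shape; the real gateway.yaml schema may differ.
    available_models: set[str] = field(default_factory=set)


def allowed_models(policies: dict[str, GroupPolicy], groups: list[str]) -> set[str]:
    """Union of the models permitted by every group the user belongs to.

    Taking the union is an assumption made for this sketch.
    """
    allowed: set[str] = set()
    for group in groups:
        policy = policies.get(group)
        if policy is not None:
            allowed |= policy.available_models
    return allowed


def check_request(
    policies: dict[str, GroupPolicy], groups: list[str], requested_model: str
) -> int:
    """Return an HTTP-style status: 200 if permitted, 403 otherwise (assumed)."""
    return 200 if requested_model in allowed_models(policies, groups) else 403


# Placeholder group and model names.
policies = {
    "eng": GroupPolicy({"model-a"}),
    "research": GroupPolicy({"model-a", "model-b"}),
}
print(check_request(policies, ["eng"], "model-b"))
print(check_request(policies, ["eng", "research"], "model-b"))
```

Because the groups come from the verified session rather than from anything on the laptop, a local settings edit has no input into this decision.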

Telemetry ships over OTLP/HTTP to whatever collector you run, such as Cloud Monitoring, Grafana, or Datadog. The key difference: every claude_code.token.usage metric carries the verified email and groups from the session JWT, not the spoofable OTEL_RESOURCE_ATTRIBUTES that clients can set themselves.
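The attribution idea can be sketched in a few lines of Python: identity attributes are stamped from already-verified session claims and overwrite whatever the client sent. The attribute keys user.email and user.groups and the function name are hypothetical, chosen only for illustration. Only the comma-separated key=value format of OTEL_RESOURCE_ATTRIBUTES is standard OpenTelemetry.

```python
def parse_resource_attributes(raw: str) -> dict[str, str]:
    """Parse the comma-separated key=value format of OTEL_RESOURCE_ATTRIBUTES."""
    attrs: dict[str, str] = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            attrs[key.strip()] = value.strip()
    return attrs


def metric_attributes(client_raw: str, verified_claims: dict) -> dict[str, str]:
    """Merge client attributes with identity taken from verified JWT claims.

    Attribute keys are hypothetical. Verified values always win, so a
    client cannot attribute its usage to someone else.
    """
    attrs = parse_resource_attributes(client_raw)
    attrs["user.email"] = verified_claims["email"]
    attrs["user.groups"] = ",".join(sorted(verified_claims.get("groups", [])))
    return attrs


claims = {"email": "dev@example.com", "groups": ["platform", "eng"]}
print(metric_attributes("user.email=someone-else@example.com,team=payments", claims))
```

Non-identity attributes such as the placeholder team value pass through untouched. Only the fields that drive attribution are overwritten.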
Spend limits get set per user, group, or org through an admin API. The gateway meters tokens against a Cloud SQL ledger and returns a 429 when someone hits their cap. One caveat: costs calculate at list price. Committed-use discounts and negotiated rates don't factor in. Treat it as a runaway-usage guardrail, not a billing reconciliation tool.
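A rough Python sketch of that metering loop follows. It uses an in-memory dictionary in place of the Cloud SQL ledger. The per-token prices are placeholders rather than real list prices, and the scope naming (user:..., group:..., org) is an assumption made for this example.

```python
from dataclasses import dataclass, field

# Placeholder prices in USD per million tokens; not real list prices.
LIST_PRICE_PER_MTOK = {"input": 2.00, "output": 10.00}


def list_price_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost at list price only; discounts and negotiated rates are ignored."""
    total = (
        input_tokens * LIST_PRICE_PER_MTOK["input"]
        + output_tokens * LIST_PRICE_PER_MTOK["output"]
    )
    return total / 1_000_000


@dataclass
class SpendLedger:
    caps: dict[str, float]  # scope -> cap in USD
    spent: dict[str, float] = field(default_factory=dict)

    def admit(self, scopes: list[str]) -> int:
        """Return 429 if any capped scope has reached its cap, else 200."""
        for scope in scopes:
            cap = self.caps.get(scope)
            if cap is not None and self.spent.get(scope, 0.0) >= cap:
                return 429
        return 200

    def record(self, scopes: list[str], cost: float) -> None:
        """Charge the same request cost to the user, group, and org scopes."""
        for scope in scopes:
            self.spent[scope] = self.spent.get(scope, 0.0) + cost


ledger = SpendLedger(caps={"user:dev@example.com": 0.50, "org": 100.0})
scopes = ["user:dev@example.com", "group:eng", "org"]
print(ledger.admit(scopes))  # under every cap before any usage
ledger.record(scopes, list_price_cost(100_000, 50_000))
print(ledger.admit(scopes))  # user scope is now over its cap
```

Checking before the request and recording after it means a single large request can overshoot a cap, which fits the guardrail framing better than exact billing.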
How does the architecture work?
A developer's local (or deployed) Claude process sends inference traffic to the gateway over HTTPS. The gateway runs as a stateless container on Cloud Run. It validates the session bearer token that it issued itself, checks policy, and forwards the request to Agent Platform using the Cloud Run service account.
Cloud SQL holds two things: device-code sign-in state and the spend ledger. Google Workspace gets contacted only at sign-in and token refresh, not on every request. Inference stays in your GCP project. Quota, DPA, billing all unchanged.

For routing, you can set region: global for Agent Platform's global endpoint or add multiple upstreams entries to fail over on 5xx, 429, or timeout in list order.
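The list-order failover rule can be sketched in Python as below. This is an illustration of the described behavior, not the gateway's implementation: the upstream names, the send callback, and the use of 504 as a stand-in status for a timeout are assumptions.

```python
from typing import Callable


class UpstreamTimeout(Exception):
    """Raised by the (hypothetical) send callback when an upstream times out."""


def should_fail_over(status: int) -> bool:
    """Fail over on 429 or any 5xx; other statuses go back to the caller."""
    return status == 429 or 500 <= status <= 599


def forward(upstreams: list[str], send: Callable[[str], int]) -> tuple[str, int]:
    """Try upstreams in list order and return (upstream, status).

    If every upstream fails, the last failure is returned.
    """
    if not upstreams:
        raise ValueError("no upstreams configured")
    last = (upstreams[0], 504)
    for name in upstreams:
        try:
            status = send(name)
        except UpstreamTimeout:
            last = (name, 504)  # stand-in status for a timeout in this sketch
            continue
        if not should_fail_over(status):
            return name, status
        last = (name, status)
    return last


# Placeholder upstream names and canned responses.
responses = {"primary": 503, "secondary": 200}
print(forward(["primary", "secondary"], lambda name: responses[name]))
```

A 4xx other than 429 is returned from the first upstream without trying the next one, since retrying a bad request elsewhere would not help.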
Setting up the gateway on GCP
The deployment breaks into two steps. First, provision the GCP foundation: enable Agent Platform, Cloud SQL, and Secret Manager APIs. Create a claude-gateway service account with roles/aiplatform.user. Spin up a small Cloud SQL Postgres instance for state.
The gateway authenticates to Agent Platform as the Cloud Run service identity. You don't create a service account key. Then create a new OAuth client (Web application type) in the Google Cloud console. The gateway authenticates developers against Google Workspace as an OIDC relying party, and this client issues the client_id and client_secret for that handshake. Those values feed the oidc block in gateway.yaml.

Second, configure the gateway itself. Write gateway.yaml pointing at your Google Workspace OIDC client, the Postgres connection string, and Agent Platform as the upstream. Store it in Secret Manager along with the OIDC client secret, the Postgres URL, and a JWT signing key.
The full gcloud command sequence and complete gateway.yaml reference live in Anthropic's Claude apps gateway on Google Cloud docs.
Who should deploy this?
Teams with more than a handful of Claude Code users. The individual developer flow works fine for small teams where everyone manages their own credentials. Once you're distributing configs over MDM and trying to track who's using what, the overhead tips toward running the gateway.
Regulated industries will care about the identity flow. No credentials on laptops means less surface area for credential theft. The spend caps matter less for compliance than for finance teams who've watched AI costs balloon without attribution.
Logicity's Take
This release reflects where enterprise AI tooling is headed: managed infrastructure that looks like traditional SaaS but runs in your cloud. The comparison here is with tools like Copilot for Business, which handles identity through Microsoft Entra but doesn't give you the same routing and spend control flexibility. Anthropic is betting that platform teams want control, even at the cost of running their own gateway. The Cloud SQL dependency adds operational overhead, but it's a fair trade for organizations that already run Postgres workloads. Pricing for the gateway itself isn't specified. It ships with the claude binary, suggesting no separate license, but inference costs still flow through your Vertex AI commitment.
Frequently Asked Questions
Does the Claude apps gateway require a separate license?
The gateway ships with the standard claude binary. Inference costs still apply through your Vertex AI billing, but no separate gateway license has been announced.
Can the gateway work with non-Google identity providers?
Yes. While the examples use Google Workspace, the gateway supports any OIDC-compliant identity provider.
How accurate are the spend limits?
Spend limits calculate at list price. Committed-use discounts and negotiated rates don't factor in, so treat them as guardrails against runaway usage rather than exact billing reconciliation.
What happens if the gateway goes down?
The gateway is stateless and runs on Cloud Run, so it scales and recovers automatically. However, Cloud SQL availability affects sign-in state and spend ledger access.
Source: Cloud Blog
Manaal Khan
Tech & Innovation Writer
Produced with AI assistance and reviewed by the Logicity editorial team. Learn more in our Editorial Policy.