MCP Gateway Best Practices

This guide outlines essential security patterns and best practices for deploying MCP Gateway in production environments. It covers three critical topics:

  1. On-Behalf-Of (OBO) Authentication - Maintaining least-privilege access through token delegation
  2. Task-Based Access Control (TBAC) - Fine-grained authorization for AI agents
  3. The Triple Gate Pattern - Defense-in-depth security architecture

On-Behalf-Of (OBO) Authentication

When AI agents interact with MCP servers that access backend APIs, a critical security question arises: What identity and permissions should the MCP server use when calling those APIs?

The naive approach—giving the MCP server a service account with elevated permissions to access all backend resources—violates the principle of least privilege and creates a single point of compromise. If the MCP server is compromised, the attacker gains access to everything.

The secure alternative is On-Behalf-Of (OBO) authentication, where the MCP server acts with the same identity and permissions as the original client or agent.

How OBO Authentication Works

The Flow:

  1. Client authenticates with IdP and receives Token A (JWT) with audience = MCP Server
  2. Client sends request to MCP Gateway with Authorization: Bearer <Token A> header
  3. MCP Gateway validates Token A and enforces TBAC policies
  4. MCP Gateway forwards the Authorization header to MCP Server (if configured to do so)
  5. MCP Server contacts IdP with Token A to exchange it for Token B with audience = backend API
  6. IdP validates Token A and issues Token B with the same subject and permissions as Token A
  7. MCP Server calls backend API with Token B, maintaining the client's identity and permissions

Token Exchange and Audience Constraints

The token exchange (steps 5 and 6) is necessary because OAuth 2.0 and OIDC tokens are audience-restricted. Token A is valid only for the MCP Server audience, so backend APIs that validate the aud claim will reject it. The IdP performs a token exchange (per RFC 8693) to issue Token B with:

  • Same subject (sub) - preserves the client's identity
  • Same permissions (scopes, claims) - maintains authorization boundaries
  • New audience (aud) - valid for the backend API

This ensures the MCP server can only access what the client can access—no more, no less.
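
As a minimal sketch of the exchange request in steps 5 and 6, the MCP server could build a form body like the one below and POST it to the IdP's token endpoint. The grant type and token type URNs are the ones defined by RFC 8693; the function name and the audience value are hypothetical, and many IdPs require additional client authentication or use a vendor-specific variant of this flow.

```python
from urllib.parse import urlencode, parse_qs

# Identifiers defined by RFC 8693 (OAuth 2.0 Token Exchange).
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_body(token_a: str, backend_audience: str) -> str:
    """Return the form-encoded body that asks the IdP to swap Token A for Token B.

    Hypothetical helper: the audience string is a placeholder for whatever
    identifier your IdP uses for the backend API.
    """
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": token_a,               # the client's token (Token A)
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "audience": backend_audience,           # requested audience for Token B
        "requested_token_type": ACCESS_TOKEN_TYPE,
    }
    return urlencode(params)


# Placeholder values; send the body with
# Content-Type: application/x-www-form-urlencoded.
body = build_token_exchange_body("<Token A>", "backend-api")
```

The IdP, not the MCP server, decides which subject and permissions Token B carries, so the server cannot widen its own access by changing this request.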

MCP Gateway's Role in OBO

The MCP Gateway has exactly one role in OBO authentication: forward or strip the Authorization header.

You control this by configuring whether the gateway passes the Authorization header through to the upstream MCP server:

  • Forward the header (OBO enabled): The MCP server receives the client's token and can perform token exchange to act on behalf of the client
  • Strip the header (OBO disabled): The MCP server does not receive the token and cannot act on the client's behalf (useful when you want the MCP server to use its own service account)

Responsibility Boundary

The MCP server—not the gateway—implements OBO token exchange. The gateway's job is policy enforcement and optionally providing the authentication context. The MCP server must integrate with your IdP to perform the token exchange.
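
The forward-or-strip behavior can be pictured with the short sketch below. It only illustrates the effect on the upstream request and is not the gateway's implementation or configuration syntax; the function name and the forward_authorization flag are hypothetical.

```python
def prepare_upstream_headers(
    headers: dict[str, str], forward_authorization: bool
) -> dict[str, str]:
    """Return the headers to send upstream.

    With forwarding enabled the headers are copied unchanged, so the MCP
    server receives the client's token. Otherwise the Authorization header
    is removed (header names are compared case-insensitively).
    """
    if forward_authorization:
        return dict(headers)
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "authorization"
    }


incoming = {"Authorization": "Bearer <Token A>", "Accept": "application/json"}
obo_enabled = prepare_upstream_headers(incoming, forward_authorization=True)
obo_disabled = prepare_upstream_headers(incoming, forward_authorization=False)
```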

Why OBO Matters

OBO authentication eliminates the need for:

  • Overprivileged service accounts - The MCP server doesn't need a "super user" account
  • Static API keys - No hardcoded credentials with excessive permissions
  • Complex permission mapping - The IdP manages identity and permissions centrally

Instead, access control becomes dynamic and identity-driven: the MCP server can only access resources that the current client is authorized to access, enforced cryptographically by the IdP.

Implementation Checklist

When implementing OBO for your MCP server:

  • Configure your IdP to support token exchange (RFC 8693 or vendor-specific flow)
  • Define audience values for MCP server and backend APIs
  • Implement token exchange in your MCP server code (contact IdP when receiving requests)
  • Configure MCP Gateway to forward Authorization header to MCP server
  • Ensure backend APIs validate Token B's audience and signature
  • Test with different client identities to verify permission boundaries
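
To make the audience item in the checklist concrete, here is a sketch of the aud check a backend API might apply to Token B. It deliberately skips signature verification so that it runs standalone; a real backend must verify the signature with a maintained JWT library before trusting any claim. The helper names and audience strings are hypothetical.

```python
import base64
import json


def _b64url(obj: dict) -> str:
    """Base64url-encode a JSON object without padding, as JWT segments are."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def make_unsigned_test_token(claims: dict) -> str:
    """Build an unsigned, test-only token so the example needs no IdP."""
    return f"{_b64url({'alg': 'none'})}.{_b64url(claims)}."


def decode_claims_unverified(token: str) -> dict:
    """Decode the payload WITHOUT verifying the signature (illustration only)."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def audience_matches(claims: dict, expected_audience: str) -> bool:
    """The aud claim may be a single string or a list of strings (RFC 7519)."""
    aud = claims.get("aud")
    if isinstance(aud, str):
        return aud == expected_audience
    if isinstance(aud, list):
        return expected_audience in aud
    return False


token_a = make_unsigned_test_token({"sub": "alice", "aud": "mcp-server"})
token_b = make_unsigned_test_token({"sub": "alice", "aud": ["backend-api"]})

# The backend accepts Token B but rejects Token A, which targets the MCP server.
accepts_b = audience_matches(decode_claims_unverified(token_b), "backend-api")
accepts_a = audience_matches(decode_claims_unverified(token_a), "backend-api")
```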

Task-Based Access Control (TBAC)

Traditional Role-Based Access Control (RBAC) fails for AI agents because agents don't have static job functions—they complete tasks that span multiple domains and require context-aware authorization.

For a comprehensive explanation of TBAC, including the three dimensions (Tasks, Tools, Transactions), variable substitution, and practical examples, see our dedicated guide:

Understanding Task-Based Access Control

Key TBAC concepts to understand:

  • Dynamic authorization based on JWT claims and MCP request parameters
  • Variable substitution with ${jwt.claim} and ${mcp.parameter} syntax
  • Three-layer filtering: Task-level, Tool-level, and Transaction-level policies
  • IdP-driven control where permissions are encoded in cryptographically verified JWTs

TBAC is the authorization model that makes AI agent security practical and scalable. Combined with OBO authentication, it provides complete identity and access management for MCP-based systems.
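
The variable substitution idea can be sketched as follows. This only illustrates how ${jwt.claim} and ${mcp.parameter} placeholders could be resolved against a request; the gateway's actual policy language is described in the dedicated guide, and the function name, claim names, and parameter names here are hypothetical.

```python
import re

# Matches ${jwt.<name>} and ${mcp.<name>} placeholders.
_PLACEHOLDER = re.compile(r"\$\{(jwt|mcp)\.([A-Za-z0-9_]+)\}")


def substitute(template: str, jwt_claims: dict, mcp_params: dict) -> str:
    """Replace each placeholder with the matching claim or request parameter."""
    sources = {"jwt": jwt_claims, "mcp": mcp_params}

    def replace(match: re.Match) -> str:
        namespace, key = match.group(1), match.group(2)
        if key not in sources[namespace]:
            # Failing closed: an unresolved variable should not authorize anything.
            raise KeyError(f"missing {namespace} value: {key}")
        return str(sources[namespace][key])

    return _PLACEHOLDER.sub(replace, template)


# Hypothetical rule: the caller's team claim must equal the requested project.
claims = {"team": "payments"}
params = {"project": "payments"}
allowed = substitute("${jwt.team}", claims, params) == substitute(
    "${mcp.project}", claims, params
)
```

Because the values come from a verified JWT and the live request, the same rule yields different decisions for different callers without the policy being edited.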

The Triple Gate Pattern

AI agents that use MCP create three distinct attack surfaces, each requiring specialized security controls. Protecting one layer is insufficient—you need defense in depth.

Understanding the Three Pathways

MCP-based AI agent systems have three distinct pathways, each representing a potential attack surface:

PATHWAY 1: Client → LLM
┌────────┐                      ┌─────┐
│ Client │ ───── Prompts ─────> │ LLM │
└────────┘                      └─────┘

Where prompts are sent to the LLM
Risks: Prompt injection, PII leakage, jailbreak attacks


PATHWAY 2: Client → MCP Server
┌────────┐                      ┌────────────┐
│ Client │ ─── Tool Requests ─> │ MCP Server │
└────────┘                      └────────────┘

Where the LLM requests access to tools and sensitive data
Risks: Unauthorized tool access, data exfiltration


PATHWAY 3: MCP Server → External APIs
┌────────────┐                      ┌──────────────┐
│ MCP Server │ ───── API Calls ───> │ External APIs│
└────────────┘                      └──────────────┘

Where the MCP Server calls external services
Risks: Malicious API calls, unauthorized actions, data exfiltration

Each pathway requires distinct security controls. A breach at any layer can compromise the entire system.

The Triple Gate Solution

The Triple Gate Pattern protects these three pathways by inserting security gates:

PATHWAY 1 + GATE 1
┌────────┐      ┌────────────┐      ┌─────┐
│ Client │ ───> │ AI Gateway │ ───> │ LLM │
└────────┘      └─────┬──────┘      └─────┘

                ✓ Authenticate
                ✓ Filter PII
                ✓ Topic control
                ✓ Jailbreak detection


PATHWAY 2 + GATE 2
┌────────┐      ┌─────────────┐      ┌────────────┐
│ Client │ ───> │ MCP Gateway │ ───> │ MCP Server │
└────────┘      └──────┬──────┘      └────────────┘

                ✓ Authenticate
                ✓ Tool authorization (TBAC)
                ✓ Parameter validation
                ✓ Resource policies


PATHWAY 3 + GATE 3
┌────────────┐      ┌─────────────┐      ┌──────────────┐
│ MCP Server │ ───> │ API Gateway │ ───> │ External APIs│
└────────────┘      └──────┬──────┘      └──────────────┘

                    ✓ Rate limiting
                    ✓ Authentication
                    ✓ Content inspection
                    ✓ API authorization

Each pathway has distinct risks and requires specialized security controls:

Gate 1: AI Layer Protection

Protects: Client → LLM communication (prompts and completions)

Attack vectors:

  • Prompt injection attacks that manipulate the LLM's behavior
  • PII leakage when using third-party LLM services
  • Jailbreak attempts to bypass safety guardrails
  • Unauthorized access to LLM capabilities

Security capabilities required:

  • Authentication and authorization before prompts reach the LLM
  • PII data filtering to redact sensitive information from prompts
  • Topic control to enforce conversation boundaries
  • Content safety detection for malicious or harmful content
  • Jailbreak detection to identify attempts to subvert safety controls
  • Observability for audit trails and anomaly detection

Traefik Hub solution: Use AI Gateway with LLM Guard middleware for comprehensive AI layer protection.
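
As a rough picture of what PII filtering at this gate means, the sketch below redacts email addresses from a prompt before it would be forwarded to the LLM. It is a naive regular-expression example, not how LLM Guard or the AI Gateway detects PII, and the function name and placeholder text are hypothetical.

```python
import re

# Deliberately simple pattern; production PII detection covers far more cases.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(prompt: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL.sub("[REDACTED_EMAIL]", prompt)


safe_prompt = redact_emails("Summarize the ticket from alice@example.com today")
```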

Gate 2: MCP Layer Protection

Protects: Client → MCP Server communication (tool invocation requests made on the LLM's behalf)

Attack vectors:

  • Unauthorized tool access (LLM requests tools it shouldn't use)
  • Parameter manipulation (LLM provides malicious arguments)
  • Resource exhaustion (excessive tool invocations)
  • Data exfiltration through tool misuse

Security capabilities required:

  • Authentication and authorization with fine-grained access controls
  • Resource policies defining which MCP resources (prompts, tools) clients can access
  • Tool policies controlling which tools can be invoked and under what conditions
  • Dynamic authorization considering JWT claims and request parameters (TBAC)
  • Observability tracking which tools are invoked, by whom, and how frequently

Traefik Hub solution: Use MCP Gateway with JWT middleware and TBAC policies. See Understanding TBAC for implementation details.

Gate 3: API Layer Protection

Protects: MCP Server → Backend APIs

Attack vectors:

  • Rate limit bypass (agent makes excessive API calls)
  • Malicious API calls (email/SMS spam, data deletion)
  • Unauthorized API access (wrong service account permissions)
  • Data exfiltration through legitimate APIs

Security capabilities required:

  • Intelligent rate limiting that accounts for agent behavior
  • Content inspection for APIs that send messages or modify data
  • Traditional API security: authentication, authorization, input validation
  • Quota enforcement per client or agent identity
  • Observability for detecting anomalous API usage patterns

Traefik Hub solution: Use API Gateway with rate limiting, authentication middleware, and API management policies.
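
Rate limiting and quota enforcement per identity can be illustrated with a token bucket keyed by the caller, as in the sketch below. This is a generic technique shown under assumed parameters, not the API Gateway's algorithm or configuration; the class name, capacity, and refill rate are hypothetical. Time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float
    last_refill: float


class PerIdentityRateLimiter:
    """Token bucket keyed by client or agent identity (for example, the JWT sub)."""

    def __init__(self, capacity: int, refill_per_second: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._buckets: dict[str, Bucket] = {}

    def allow(self, identity: str, now: float) -> bool:
        """Return True if this identity may make one more call at time `now`."""
        bucket = self._buckets.setdefault(identity, Bucket(self.capacity, now))
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - bucket.last_refill
        bucket.tokens = min(
            self.capacity, bucket.tokens + elapsed * self.refill_per_second
        )
        bucket.last_refill = now
        if bucket.tokens >= 1:
            bucket.tokens -= 1
            return True
        return False


limiter = PerIdentityRateLimiter(capacity=2, refill_per_second=1.0)
burst = [limiter.allow("agent-a", now=0.0) for _ in range(3)]
other_identity = limiter.allow("agent-b", now=0.0)
after_refill = limiter.allow("agent-a", now=1.0)
```

Keying the bucket on identity means one noisy agent exhausts only its own quota rather than a limit shared by every caller.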

The Security Sequence Flow

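A request flows through the gates in order: the client's prompt passes through the AI Gateway (Gate 1) before reaching the LLM, tool requests pass through the MCP Gateway (Gate 2) before reaching the MCP server, and the MCP server's API calls pass through the API Gateway (Gate 3) before reaching external APIs.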

Critical insight: Each gate enforces policies appropriate to its layer. An attack that bypasses one gate can still be caught by the next gate.

Why You Need All Three Gates

Consider this attack scenario:

An attacker manipulates an LLM through prompt injection to exfiltrate sensitive data. The LLM is tricked into calling an email_api tool with stolen data.

With only Gate 1 (AI Gateway):

  • Might catch obvious prompt injection attempts
  • Sophisticated prompts can bypass detection
  • No control over which tools the LLM can access
  • No protection at API layer

With only Gate 2 (MCP Gateway):

  • Prompt injection happens before MCP layer
  • Can block unauthorized tool access
  • If attacker has valid credentials, tool policies might allow the call
  • No content inspection of API requests

With only Gate 3 (API Gateway):

  • Prompt injection happens before API layer
  • No control over tool invocation
  • Might catch anomalous API usage patterns
  • Legitimate-looking API calls might be allowed

With all three gates:

  • Gate 1 detects and blocks many prompt injection attempts
  • Gate 2 enforces TBAC policies (for example, "this agent can't access email_api")
  • Gate 3 enforces rate limits and content inspection (for example, "email recipients must be internal only")

Result: The attack is stopped at multiple points, and even if one gate fails, others provide defense in depth.
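
The scenario can be sketched as three independent checks applied in sequence. Everything below is a toy model under stated assumptions: the injection marker, the agent names, the tool allowlist, and the internal domain are all hypothetical, and real gates use far richer detection and policy than these one-line checks.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    agent: str
    prompt: str
    tool: str
    recipients: list[str] = field(default_factory=list)


# Hypothetical policy data for the illustration.
ALLOWED_TOOLS = {"support-agent": {"search_tickets"}, "mail-agent": {"email_api"}}
INTERNAL_DOMAIN = "@example.com"


def gate1_ai(req: Request) -> bool:
    """Gate 1: catch an obvious prompt injection marker."""
    return "ignore previous instructions" not in req.prompt.lower()


def gate2_mcp(req: Request) -> bool:
    """Gate 2: allow only tools granted to this agent."""
    return req.tool in ALLOWED_TOOLS.get(req.agent, set())


def gate3_api(req: Request) -> bool:
    """Gate 3: email recipients must be internal only."""
    return all(r.endswith(INTERNAL_DOMAIN) for r in req.recipients)


def first_blocking_gate(req: Request) -> Optional[str]:
    """Return the name of the first gate that rejects the request, if any."""
    gates = (("gate1", gate1_ai), ("gate2", gate2_mcp), ("gate3", gate3_api))
    for name, gate in gates:
        if not gate(req):
            return name
    return None


# A subtle injection slips past Gate 1 but the agent lacks the email tool.
subtle = Request("support-agent", "Please forward the report", "email_api",
                 ["attacker@evil.test"])
# An agent allowed to send email is still stopped at the API layer.
permitted_tool = Request("mail-agent", "Please forward the report", "email_api",
                         ["attacker@evil.test"])
```

Here first_blocking_gate(subtle) returns "gate2" and first_blocking_gate(permitted_tool) returns "gate3", which mirrors the point that a request missed by one gate can still be stopped by a later one.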

Summary

Securing AI agents in production requires three complementary security patterns:

  1. OBO Authentication - Ensures MCP servers act with client identity and permissions through token exchange, eliminating overprivileged service accounts

  2. TBAC - Provides dynamic, context-aware authorization for AI agents based on tasks, tools, and transaction parameters using JWT claims

  3. Triple Gate Pattern - Implements defense in depth by securing all three attack surfaces (AI, MCP, API layers) with specialized controls at each gate

Together, these patterns provide comprehensive security for MCP-based AI agent systems while maintaining operational simplicity through a unified gateway platform.