# AI Gateway
As organizations embrace generative AI at scale, they face mounting challenges such as fragmented integrations, escalating inference costs, governance gaps, and inconsistent performance. Traefik Hub’s AI Gateway addresses these issues by providing a unified, secure, and extensible entry point to manage and scale AI workloads across providers, clouds, and edge environments.
Integrated as a service within Traefik Hub, AI Gateway simplifies the management and integration of multiple Large Language Model (LLM) providers by offering a unified API to connect with various AI services. This centralizes configuration, security, and observability for enterprise-grade AI deployments through a single, secure gateway.
Moreover, it can be combined with all existing middlewares to suit different use cases, serving as a versatile building block that complements and enhances your current infrastructure.
## Challenges and How Traefik AI Gateway Solves Them
| Challenge | How Traefik AI Gateway Solves It |
|---|---|
| Multiple integrations across teams | Unified API abstracts LLM complexity, reducing integration overhead. |
| API key sprawl and mismanagement | Secure, centralized credential storage via Kubernetes Secrets ensures proper API key management. |
| Redundant API calls | Semantic Cache Middleware reuses prior responses, cutting down on duplicate calls and inference costs. |
| Governance inconsistencies | Content Guard Middleware filters inputs/outputs against compliance rules, enforcing consistent governance. |
| Lack of observability | Native OpenTelemetry support provides AI-specific metrics for enhanced monitoring and troubleshooting. |
| Vendor lock-in risks | Provider-agnostic API supports model switching, giving you the flexibility to change AI providers as needed. |
| High latency for real-time AI | Edge-compatible deployment minimizes latency, ensuring faster AI inference. |
| Cost inefficiencies in inference | Smart routing and caching optimize expensive API usage, lowering operational costs. |
| Inconsistent versioning of models | Declarative configuration ensures consistency across deployments, reducing versioning issues in AI models. |
## Supported AI Providers
The Traefik Hub AI Gateway currently supports the following AI providers:
- Anthropic
- AzureOpenAI
- Bedrock
- Cohere
- DeepSeek
- Gemini
- Mistral
- Ollama
- OpenAI
- Qwen
## Creating an AI Gateway with the `AIService` CRD
- Enable the AI Gateway feature by upgrading your Traefik Hub deployment:
```bash
helm upgrade traefik -n traefik --wait \
  --reuse-values \
  --set hub.experimental.aigateway=true \
  traefik/traefik
```
The AI Gateway feature is currently marked as experimental. However, it is fully functional and ready for use, and we are committed to maintaining and enhancing this feature. Due to the fast-paced advancements in the AI space, the API may change in future releases to accommodate new developments. We recommend staying updated with the latest documentation to take full advantage of upcoming improvements.
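As a quick sanity check, you can confirm the flag landed in the release values (assuming the release is named `traefik`, as in the command above):

```bash
# Inspect the user-supplied values of the release and look for the AI Gateway flag
helm get values traefik -n traefik | grep -B 1 -A 1 aigateway
```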
- Create a Kubernetes Secret to store your AI provider token or API key.
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: openai-apikey
type: Opaque
stringData:
  key: YOUR_OPENAI_KEY
```
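Apply the manifest in the namespace where the `AIService` will live; the filename below is illustrative, and you can equally create the Secret from a literal value:

```bash
# Apply the Secret manifest (assumes the AIService will live in the traefik namespace)
kubectl apply -n traefik -f openai-apikey-secret.yaml

# Or create it directly without a manifest file
kubectl create secret generic openai-apikey -n traefik \
  --from-literal=key=YOUR_OPENAI_KEY
```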
- Define & apply an
AIService
resource with any of the supported AI providers. For this example, we will be using the OpenAI provider.
```yaml
apiVersion: hub.traefik.io/v1alpha1
kind: AIService
metadata:
  name: ai-openai
  namespace: traefik
spec:
  openai:
    baseURL: "YOUR_BASE_URL"
    token:
      secretName: "openai-apikey" # Make sure to reference the name of the Secret you created to store the API key.
    model: "o1-preview"
```
The `baseURL` key is optional and should only be used if your AI provider is compatible with the OpenAI API. It allows OpenAI-compatible services to integrate without requiring a dedicated configuration.
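As an illustration (the endpoint and names below are hypothetical), a self-hosted OpenAI-compatible inference server could be wired in through the same `openai` block:

```yaml
apiVersion: hub.traefik.io/v1alpha1
kind: AIService
metadata:
  name: ai-local
  namespace: traefik
spec:
  openai:
    # Hypothetical self-hosted endpoint that speaks the OpenAI API
    baseURL: "http://vllm.internal.example:8000/v1"
    token:
      secretName: "local-llm-apikey"
    model: "my-local-model"
```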
See the AIService reference page for more details.
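Once defined, apply the resource and confirm it was accepted; the filename and the `aiservices` resource name used with `kubectl get` are assumptions based on the CRD kind:

```bash
# Apply the AIService manifest
kubectl apply -f ai-openai.yaml

# Verify the resource exists in the traefik namespace
kubectl get aiservices.hub.traefik.io -n traefik
```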
- Attach the `ai-openai` `AIService` we created above to an IngressRoute as a `TraefikService`:
**IngressRoute**

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: ai-test
  namespace: traefik
spec:
  routes:
    - kind: Rule
      match: Host(`ai.localhost`)
      services:
        - kind: TraefikService
          name: traefik-ai-openai@ai-gateway-service
```

**IngressRoute with API Management Enabled**

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  annotations:
    hub.traefik.io/api: "ai@traefik"
  name: ai-test
  namespace: traefik
spec:
  routes:
    - kind: Rule
      match: Host(`ai.localhost`)
      services:
        - kind: TraefikService
          name: traefik-ai-openai@ai-gateway-service
```
- To reference your `AIService` in an IngressRoute, use the following name format: `namespace-ai-service-name@ai-gateway-service` (a short example follows these notes).
- If you have API Management enabled, you can reference your API using an annotation in the following format: `api-name@namespace`.
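For instance, an `AIService` named `ai-bedrock` in an `apps` namespace (a hypothetical example) would be referenced like this:

```yaml
services:
  - kind: TraefikService
    name: apps-ai-bedrock@ai-gateway-service
```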
- Make a request to the `AIService`. The example below queries the OpenAI o1-preview model.
**Request**

```bash
# The Content-Type header ensures the JSON body is forwarded correctly
curl -H 'Content-Type: application/json' -d '{
  "messages": [
    {
      "role": "user",
      "content": "tell me a joke"
    }
  ]
}' http://ai.localhost
```

**Response**

```json
{
  "id": "chatcmpl-AaXqZLqy082BZ81TNICriIzjqvrOD",
  "object": "chat.completion",
  "created": 1733273279,
  "model": "o1-preview-2024-09-12",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Sure! Here's a joke for you:\n\n**Why don't scientists trust atoms?**\n\nBecause they make up everything!"
      },
      "finish_reason": "stop",
      "content_filter_results": {
        "hate": {
          "filtered": false
        },
        "self_harm": {
          "filtered": false
        },
        "sexual": {
          "filtered": false
        },
        "violence": {
          "filtered": false
        },
        "jailbreak": {
          "filtered": false,
          "detected": false
        },
        "profanity": {
          "filtered": false,
          "detected": false
        }
      }
    }
  ],
  "usage": {
    "prompt_tokens": 32,
    "completion_tokens": 419,
    "total_tokens": 451,
    "prompt_tokens_details": {
      "audio_tokens": 0,
      "cached_tokens": 0
    },
    "completion_tokens_details": {
      "audio_tokens": 0,
      "reasoning_tokens": 384
    }
  },
  "system_fingerprint": "fp_e76890f0c3"
}
```
For a comprehensive list of configuration examples and options available for each supported provider, refer to the AI Gateway reference documentation.
## Observability and Monitoring
The Traefik Hub AI Gateway integrates with OpenTelemetry to provide comprehensive usage metrics tailored for Generative AI operations. This allows you to monitor token usage, operation durations, and overall system performance.
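The exact wiring depends on your deployment; as a hedged sketch, recent Traefik Helm charts expose OTLP metrics settings along these lines (the key names and collector endpoint below are assumptions, so verify them against your chart version):

```bash
# Sketch: enable OTLP metrics export; key names depend on the chart version
helm upgrade traefik -n traefik --wait \
  --reuse-values \
  --set metrics.otlp.enabled=true \
  --set metrics.otlp.http.enabled=true \
  --set metrics.otlp.http.endpoint=http://otel-collector.observability:4318/v1/metrics \
  traefik/traefik
```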
See the Metrics page for more information.
## Middlewares
In addition to the standard Traefik Hub middlewares, the AI Gateway can leverage specialized transformations or safeguards tailored to LLM-driven scenarios. The AI Gateway supports a few dedicated middlewares for these use cases:
- **Semantic Cache Middleware**: Reduces repetitive calls to your LLM by caching previous requests and responses based on semantic similarity. Read the Semantic Cache middleware documentation →
- **Content Guard Middleware**: Filters or masks specific sensitive information in incoming or outgoing requests, helping you enforce compliance or policy requirements. This middleware can be used with both AI and regular traffic. Read the Content Guard documentation →
While both can be composed with other Hub features (e.g., rate limiting, authentication), be aware that Semantic Cache applies only to AI workloads, whereas Content Guard also supports standard API Gateway use cases.
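Attaching either middleware to a route follows the standard Traefik pattern. The sketch below assumes a `Middleware` resource named `semantic-cache` has already been created in the `traefik` namespace; its actual options are covered in the linked documentation:

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: ai-test
  namespace: traefik
spec:
  routes:
    - kind: Rule
      match: Host(`ai.localhost`)
      middlewares:
        # Hypothetical Semantic Cache middleware; see its documentation for configuration
        - name: semantic-cache
      services:
        - kind: TraefikService
          name: traefik-ai-openai@ai-gateway-service
```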
## Frequently Asked Questions
- **How do I rotate API tokens without downtime?**

  To rotate API tokens, update the `token` or `apiKey` field in the corresponding `AIService` resource. Traefik AI Gateway will automatically use the new credentials without requiring changes to client applications. A hedged rotation example follows this list.
- **Can I monitor AI service performance?**

  Yes, Traefik AI Gateway integrates with OpenTelemetry to provide detailed metrics on token usage and operation durations. You can visualize these metrics using monitoring tools like Prometheus and Grafana.
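Returning to token rotation: one possible flow, assuming the CRD registers `aiservice` as a resource name and that patching is acceptable in your environment, is to create a new Secret and repoint the `AIService` at it:

```bash
# Create a Secret holding the rotated key (names are illustrative)
kubectl create secret generic openai-apikey-v2 -n traefik \
  --from-literal=key=NEW_OPENAI_KEY

# Repoint the AIService at the new Secret; clients keep calling the same endpoint
kubectl patch aiservice ai-openai -n traefik --type merge \
  -p '{"spec":{"openai":{"token":{"secretName":"openai-apikey-v2"}}}}'
```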
## Related Content
- Learn more about the AI Gateway in its reference documentation.
- Learn more about Traefik Service in its dedicated section.
- Learn more about IngressRoute in its dedicated section.