AI Gateway

AI Gateway is a feature of Traefik Hub that simplifies managing and integrating multiple Large Language Model (LLM) providers.

Configuration Example

  • Define your AIService:

    apiVersion: hub.traefik.io/v1alpha1
    kind: AIService
    metadata:
      name: ai-anthropic
      namespace: traefik
    spec:
      anthropic:
        token: "YOUR_ANTHROPIC_TOKEN"
        model: "anthropic-model-name"

  • Attach your preferred AIService to an IngressRoute as a TraefikService:

    apiVersion: traefik.io/v1alpha1
    kind: IngressRoute
    metadata:
      name: ai-test
      namespace: traefik
    spec:
      routes:
      - kind: Rule
        match: Host(`ai.localhost`)
        services:
        - kind: TraefikService
          # Assumed pattern: <namespace>-<AIService name>@ai-gateway-service,
          # so this points at the ai-anthropic AIService in the traefik namespace.
          name: traefik-ai-anthropic@ai-gateway-service

Configuration options

The table below lists the configuration fields available for the AIService resource:

| AI Providers | Field | Description | Required | Possible Values / Examples |
|---|---|---|---|---|
| OpenAI, Cohere, Anthropic | token | Authentication token for AI providers | Yes | A string provided by the AI provider (e.g., sk-XXXX) |
| Gemini, AzureOpenAI, Mistral | apiKey | API key for AI providers | Yes | A string key provided by the AI provider (e.g., APIKEY123) |
| Ollama, AzureOpenAI | baseUrl | Base URL for AI providers that require it | Yes | A valid URL (e.g., https://api.ollama.com) |
| AzureOpenAI | deploymentName | Deployment name for AzureOpenAI | Yes | A string specifying the deployment (e.g., deployment1) |
| All Providers | model | Specifies the AI model to use | Yes | Model name as a string (e.g., gpt-4) |
| Bedrock | region | AWS region for Bedrock (auto-detected if AWS is the cloud provider) | Yes | AWS region codes (e.g., us-west-2) |
| Bedrock | systemMessage | Enables system message for Bedrock | No | A boolean (e.g., true) |
| All Providers | params.frequencyPenalty | Penalty for frequency in model outputs | No | Float value (e.g., 0.5) |
| All Providers | params.maxTokens | Maximum number of tokens per request | No | Integer value (e.g., 1500) |
| All Providers | params.presencePenalty | Penalty for presence in model outputs | No | Float value (e.g., 0.3) |
| All Providers | params.temperature | Controls randomness in responses | No | Float, typically between 0.0 and 2.0 (e.g., 0.7) |
| All Providers | params.topP | Controls diversity via nucleus sampling | No | Float, typically between 0.0 and 1.0 (e.g., 0.9) |
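
To show how the required and optional fields might combine in one resource, here is a minimal sketch of an AIService for OpenAI that also sets the optional generation parameters. It rests on two assumptions the table does not confirm: that the provider key under `spec` is spelled `openai`, and that the `params.*` fields nest under the provider block in the same way `token` and `model` do in the Anthropic example earlier on this page. The resource name `ai-openai` is purely illustrative.

```yaml
# Hypothetical AIService for OpenAI with optional tuning parameters.
# Assumption: the provider key is `openai` and `params` nests beneath it,
# mirroring the `anthropic` block in the configuration example.
apiVersion: hub.traefik.io/v1alpha1
kind: AIService
metadata:
  name: ai-openai            # illustrative name, not from the original docs
  namespace: traefik
spec:
  openai:
    token: "YOUR_OPENAI_TOKEN"   # required for OpenAI, Cohere, Anthropic
    model: "gpt-4"               # required for all providers
    params:                      # all params are optional
      temperature: 0.7           # typically 0.0 to 2.0
      topP: 0.9                  # typically 0.0 to 1.0
      maxTokens: 1500
      frequencyPenalty: 0.5
      presencePenalty: 0.3
```

If the provider key or the nesting differs in your Traefik Hub version, applying the manifest should fail CRD validation, so check the AIService schema installed in your cluster before relying on this layout.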