Multi-Cluster Traffic Distribution
Multi-Cluster traffic distribution enables automatic cross-cluster service discovery and HTTP traffic routing. A parent Traefik cluster automatically discovers workloads advertised by child clusters running on different infrastructure - VMs, Kubernetes, Docker Compose, or different cloud platforms - and creates services to route traffic to them. This includes any protocol that runs over HTTP, such as gRPC and WebSockets, and works with HTTP/1.1, HTTP/2, and HTTP/3.
Multi-Cluster is a licensed feature: it must be included in your license for both parent and child clusters. Contact the Traefik Labs sales team for access.
This guide walks you through setting up multi-cluster routing and covers common use cases like weighted load balancing, failover, canary deployments, and traffic mirroring.
Overview
In a multi-cluster setup:
- A parent cluster acts as the entry point for all traffic and makes routing decisions
- Child clusters advertise their workloads using Uplink resources
- The parent automatically discovers child workloads and creates services to route traffic to them
Key Concepts
| Concept | Description |
|---|---|
| Uplink | A resource on child clusters that advertises a workload to parent clusters |
| Uplink Entry Point | A specialized entry point on child clusters for inter-cluster communication |
| Auto-generated Services | Services automatically created on the parent when uplinks are discovered |
| Auto-generated Routes | Routes automatically created on the child when an uplink is associated with a route |
Prerequisites
Before setting up multi-cluster routing, ensure you have:
- At least two Traefik Hub instances (one parent, one or more children)
- Network connectivity from the parent cluster to each child cluster on the child's uplink entry point port (e.g., 9443). Traffic flows in one direction only: parent → child. Firewall rules should allow inbound connections on the uplink port on child clusters from the parent cluster.
- TLS certificates for secure inter-cluster communication (mTLS recommended for production)
Step 1: Configure Child Clusters
Each child cluster needs an uplink entry point and uplink resources to advertise its workloads.
Configure Uplink Entry Point
On each child cluster, configure an uplink entry point in the static configuration:
- Helm Chart
- Install Configuration
ports:
multicluster:
port: 9443
uplink: true # Marks this port as an uplink entry point
asDefault: true # Uplinks without explicit entryPoints use this one
expose:
default: true # Exposes this port on the existing LoadBalancer service
http:
tls:
enabled: true # Enables TLS (self-signed certificate by default)
hub:
uplinkEntryPoints:
multicluster:
address: ":9443"
asDefault: true # Uplinks without explicit entryPoints use this one
http:
tls: {} # Enables TLS (self-signed certificate by default)
None of these settings are configured by default — each must be set explicitly. If you define multiple uplink entry points and none has asDefault: true, all of them are used as defaults.
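As a sketch, an uplink can also bind to a specific uplink entry point instead of relying on asDefault. The multicluster-internal name below is hypothetical and would need a matching uplinkEntryPoints entry in the static configuration:

```yaml
# File provider sketch: this uplink selects an uplink entry point explicitly,
# so it is not affected by which entry point is marked asDefault.
http:
  uplinks:
    api-workload:
      entryPoints:
        - multicluster-internal  # hypothetical second uplink entry point
```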
The uplink entry point exposes an internal discovery API and forwards traffic to child routers. If the port is publicly reachable, an attacker could discover advertised routes and send requests to backend services. To prevent unauthorized access:
- Restrict network access to the uplink port using firewall rules or private networks so only the parent cluster can reach it
- Use mTLS in production so the child verifies the parent's client certificate (see Securing Inter-Cluster Communication)
- Add middlewares on child routers (e.g., IP allowlisting, rate limiting) for defense-in-depth — authentication is typically handled at the parent level, but child-side middlewares provide an additional layer of protection
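For the defense-in-depth point, a child-side middleware might look like this (a file provider sketch; the parent-only middleware name and the 10.0.0.0/16 range are illustrative — substitute your parent cluster's actual egress range):

```yaml
# Child cluster routing configuration: only requests originating from the
# parent's network range reach the advertised router.
http:
  middlewares:
    parent-only:
      ipAllowList:
        sourceRange:
          - 10.0.0.0/16  # assumed parent egress range - adjust for your network
  routers:
    api-route:
      rule: PathPrefix(`/api`)
      service: api-backend
      middlewares:
        - parent-only
      uplinks:
        - api-workload
```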
Enable the Multi-Cluster Provider (Child Clusters)
Child clusters must enable the Multi-Cluster provider so the parent cluster can discover their advertised workloads.
- Helm Chart
- Install Configuration
hub:
providers:
multicluster:
enabled: true
# Required when using Uplink CRDs (Kubernetes)
providers:
kubernetesCRD:
enabled: true
hub:
providers:
multicluster: {}
# Required when using Uplink CRDs (Kubernetes)
providers:
kubernetescrd: {}
Create Uplinks and Route Configuration
Each provider exposes the Uplink concept in its own way. On Kubernetes, uplinks are declared as CRDs and referenced from IngressRoutes via annotations. On VMs or other platforms, uplinks and routers are defined through the file provider.
- Kubernetes (CRD)
- File Provider
Create an Uplink resource for each workload you want to advertise to the parent cluster:
apiVersion: hub.traefik.io/v1alpha1
kind: Uplink
metadata:
name: api-workload
namespace: apps
Then connect your router to the uplink using the hub.traefik.io/router.uplinks annotation with the fully-qualified uplink name (<namespace>-<name>):
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: api-route
namespace: apps
annotations:
hub.traefik.io/router.uplinks: "apps-api-workload"
spec:
routes:
- match: PathPrefix(`/api`)
kind: Rule
services:
- name: api-backend
port: 8080
Define uplinks and routers in the file provider configuration:
http:
uplinks:
api-workload:
entryPoints:
- multicluster
routers:
api-route:
rule: PathPrefix(`/api`)
service: api-backend
uplinks:
- api-workload
services:
api-backend:
loadBalancer:
servers:
- url: http://127.0.0.1:8080
The file provider has no namespace concept, so uplink names are used as-is.
When mixing providers (e.g., file on VMs and CRD on Kubernetes), use spec.exposeName on the Kubernetes Uplink to match the file provider name.
See VM to Kubernetes Migration for an example.
When a router references an uplink:
- Do not specify entryPoints on the router (inherited from the uplink)
- Do not specify tls configuration (handled by the uplink entry point)
Step 2: Configure Parent Cluster
The parent cluster needs the Multi-Cluster provider configured to connect to child clusters. Each child address must be reachable from the parent on the uplink port (see Prerequisites and Security Considerations).
- Helm Chart
- Install Configuration
hub:
providers:
multicluster:
enabled: true
pollInterval: 5
pollTimeout: 5
children:
child-1:
address: "https://child1.example.com:9443"
child-2:
address: "https://child2.example.com:9443"
hub:
providers:
multicluster:
pollInterval: 5s
pollTimeout: 5s
children:
child-1:
address: "https://child1.example.com:9443"
child-2:
address: "https://child2.example.com:9443"
If your child clusters use self-signed TLS certificates, the parent cluster will fail to connect with a certificate validation error
(e.g., tls: failed to verify certificate: x509: cannot validate certificate for 10.38.248.230 because it doesn't contain any IP SANs).
To allow connections with self-signed certificates during development or testing, add insecureSkipVerify: true to each child's configuration:
children:
child-1:
address: "https://child1.example.com:9443"
serversTransport:
insecureSkipVerify: true
Warning: Only use insecureSkipVerify in development/testing environments. For production, use properly signed certificates and configure
mutual TLS (mTLS) with certificate authorities.
Route Traffic to Child Clusters
Once configured, the parent automatically creates services for discovered uplinks. Reference these services in your routers:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: api-parent-route
namespace: apps
spec:
entryPoints:
- websecure
routes:
- match: Host(`api.example.com`)
kind: Rule
services:
- name: apps-api-workload@multicluster
kind: TraefikService
tls: {}
The service name follows the pattern <namespace>-<uplink-name>@multicluster.
When running child clusters on different platforms (e.g., Kubernetes and VMs), the service names that appear on the parent must match. Kubernetes Uplinks default to <namespace>-<name>,
but file provider uplinks use the key name as-is with no namespace prefix. Use spec.exposeName on Kubernetes Uplinks to align names across platforms.
See VM to Kubernetes Migration for an example.
Use Cases
Weighted Load Balancing
Distribute traffic across multiple child clusters. By default, traffic is distributed equally. If needed, you can assign a different weight to each cluster's Uplink to control the traffic proportion.
- Child 1 (Primary)
- Child 2 (Secondary)
# On child cluster 1 - receives 90% of traffic
apiVersion: hub.traefik.io/v1alpha1
kind: Uplink
metadata:
name: api-workload
namespace: apps
spec:
weight: 90
# On child cluster 2 - receives 10% of traffic
apiVersion: hub.traefik.io/v1alpha1
kind: Uplink
metadata:
name: api-workload
namespace: apps
spec:
weight: 10
The parent cluster automatically creates a weighted round-robin service that distributes traffic according to these weights. You can also create your own service on the parent that targets the per-child services to override weights or use a different routing strategy such as failover or traffic mirroring (see the Multi-Cluster Provider Reference).
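A parent-side override can be sketched with the file provider, targeting the auto-generated per-child services (this assumes two children named child-1 and child-2, as used elsewhere in this guide):

```yaml
# Parent cluster routing configuration: a custom weighted service over the
# auto-generated per-child services, overriding the uplink-advertised weights.
http:
  services:
    api-custom-weights:
      weighted:
        services:
          - name: apps-api-workload-child-1@multicluster
            weight: 50
          - name: apps-api-workload-child-2@multicluster
            weight: 50
```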
Automatic Failover
Configure failover to automatically route traffic to a backup cluster when the primary becomes unavailable.
On the parent cluster, create a failover service that references the auto-generated per-child services:
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
name: api-failover
namespace: apps
spec:
failover:
service: apps-api-workload-child-1@multicluster
fallback: apps-api-workload-child-2@multicluster
Then reference this failover service in your router:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: api-route
namespace: apps
spec:
entryPoints:
- websecure
routes:
- match: Host(`api.example.com`)
kind: Rule
services:
- name: api-failover
kind: TraefikService
tls: {}
When the primary cluster (child-1) fails health checks or becomes unreachable, traffic automatically shifts to the fallback cluster (child-2).
High Availability Pair
For high availability with two clusters that serve as mutual backups, each cluster can use its local service as primary with the remote cluster as fallback. This creates a bidirectional failover configuration.
On Cluster A (e.g., EU region):
First, configure an Uplink with health checks:
apiVersion: hub.traefik.io/v1alpha1
kind: Uplink
metadata:
name: api-workload
namespace: apps
spec:
entryPoints:
- uplink
exposeName: api-workload-cluster-a
healthCheck:
hostname: "api.cluster-a.example.com"
path: /health
interval: 10s
timeout: 3s
status: 200
port: 443
Then configure the multicluster provider to connect to Cluster B and define a failover service using the file provider:
http:
services:
api-ha-failover:
failover:
service: apps-api-workload@kubernetescrd # Local service
fallback: api-workload-cluster-b@multicluster # Remote cluster
Create an IngressRoute that uses the failover service:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: api-route
namespace: apps
spec:
entryPoints:
- websecure
routes:
- match: Host(`api.example.com`)
kind: Rule
services:
- name: api-ha-failover@file
kind: TraefikService
tls: {}
On Cluster B (e.g., US region):
Configure an Uplink with health checks:
apiVersion: hub.traefik.io/v1alpha1
kind: Uplink
metadata:
name: api-workload
namespace: apps
spec:
entryPoints:
- uplink
exposeName: api-workload-cluster-b
healthCheck:
hostname: "api.cluster-b.example.com"
path: /health
interval: 10s
timeout: 3s
status: 200
port: 443
Then configure the multicluster provider to connect to Cluster A and define the symmetric failover:
http:
services:
api-ha-failover:
failover:
service: apps-api-workload@kubernetescrd # Local service
fallback: api-workload-cluster-a@multicluster # Remote cluster
Create the same IngressRoute configuration:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: api-route
namespace: apps
spec:
entryPoints:
- websecure
routes:
- match: Host(`api.example.com`)
kind: Rule
services:
- name: api-ha-failover@file
kind: TraefikService
tls: {}
Each cluster serves traffic from its local service by default. When health checks detect the local service is unavailable, traffic automatically fails over to the remote cluster.
Canary Deployments
Gradually shift traffic between clusters for canary deployments. This is controlled from the parent cluster rather than the children.
Create a weighted service on the parent:
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
name: api-canary
namespace: apps
spec:
weighted:
services:
- name: apps-api-workload-child-1@multicluster
kind: TraefikService
weight: 90
- name: apps-api-workload-child-2@multicluster
kind: TraefikService
weight: 10
To shift more traffic to the new version, update the weights:
spec:
weighted:
services:
- name: apps-api-workload-child-1@multicluster
kind: TraefikService
weight: 50
- name: apps-api-workload-child-2@multicluster
kind: TraefikService
weight: 50
Cookie-based sticky sessions (client-side session affinity) are not yet available for multi-cluster load balancing. However, server-side stickiness is supported through Highest Random Weight (HRW), which deterministically routes clients to the same child cluster based on request attributes.
For use cases requiring client-side session persistence, consider using application-level session management or implementing stickiness at the child cluster level.
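Stickiness at the child cluster level can be sketched as a sticky cookie on the child's own load balancer (file provider; the cookie name is illustrative). Once HRW has pinned a client to a cluster, affinity then applies among that child's local backends:

```yaml
# Child cluster routing configuration: cookie-based stickiness across the
# child's local backend servers.
http:
  services:
    api-backend:
      loadBalancer:
        sticky:
          cookie:
            name: api_affinity  # illustrative cookie name
            secure: true
            httpOnly: true
        servers:
          - url: http://127.0.0.1:8080
          - url: http://127.0.0.1:8081
```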
Traffic Mirroring
Mirror a percentage of production traffic to a secondary cluster for testing without affecting users.
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
name: api-mirrored
namespace: apps
spec:
mirroring:
name: apps-api-workload-child-1@multicluster
kind: TraefikService
mirrors:
- name: apps-api-workload-child-2@multicluster
kind: TraefikService
percent: 10
This sends all traffic to child-1 while mirroring 10% to child-2. Responses from the mirror are discarded.
Consistent Hashing (HRW) for Stateful Services
For stateful services like MCP (Model Context Protocol) servers where clients must reach the same backend consistently, Traefik Hub supports Highest Random Weight (HRW), also known as rendezvous hashing, a form of consistent hashing. This provides server-side stickiness without requiring cookies, routing requests to the same child cluster based on request attributes such as source IP or headers.
HRW ensures that clients using stateful protocols like MCP maintain their session with the same child cluster, which is essential when the server maintains conversation context and state across multiple requests.
To use HRW with multi-cluster services, create a dedicated TraefikService with highestRandomWeight that references the multi-cluster services:
- IngressRoute
- TraefikService
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: mcp-route
namespace: apps
spec:
entryPoints:
- websecure
routes:
- match: Host(`mcp.example.com`)
kind: Rule
services:
- name: mymcp-hrw-service
kind: TraefikService
middlewares:
- name: mcp-jwt
- name: mcp-gateway
tls: {}
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
name: mymcp-hrw-service
namespace: apps
spec:
highestRandomWeight:
services:
- name: apps-mcp-server-child-1@multicluster
kind: TraefikService
namespace: apps
- name: apps-mcp-server-child-2@multicluster
kind: TraefikService
namespace: apps
When multiple child clusters advertise the same MCP server uplink, HRW deterministically routes each client to the same child based on the client's request characteristics. This maintains session affinity without the overhead of cookie management, making it ideal for API Gateway and MCP Gateway scenarios.
VM to Kubernetes Migration
Migrate workloads from VMs to Kubernetes by running both in parallel and gradually shifting traffic.
On VMs, uplinks are defined through the file provider.
On Kubernetes, the Uplink CRD automatically prefixes the uplink name with the namespace (<namespace>-<name>).
Since the file provider has no namespace concept, you must align the names, so the parent sees both clusters under the same service.
Use the exposeName field on the Kubernetes Uplink to match the file provider name.
- VM Cluster (Routing Configuration)
- Kubernetes Cluster (CRD)
# VM cluster - file provider routing configuration
http:
uplinks:
banking-api:
weight: 80
routers:
banking:
rule: PathPrefix(`/banking`)
service: banking-backend
uplinks:
- banking-api
services:
banking-backend:
loadBalancer:
servers:
- url: http://127.0.0.1:8080
# Kubernetes cluster - Uplink CRD
apiVersion: hub.traefik.io/v1alpha1
kind: Uplink
metadata:
name: banking-api
namespace: apps
spec:
# Without exposeName, the uplink would be advertised as "apps-banking-api".
# Set exposeName to match the VM cluster's uplink name.
exposeName: banking-api
weight: 20
Both clusters now advertise under the name banking-api, and the parent creates a single banking-api@multicluster weighted service. As confidence in the new version grows, adjust weights to shift more traffic to Kubernetes.
Dedicated Infrastructure per Customer
Route specific customers to dedicated clusters based on JWT claims or other request attributes.
On the parent cluster, use Multi-Layer Routing so a parent router authenticates the request and injects claim-derived headers, and a child router makes the routing decision based on those headers. (You can't match on JWT-derived headers in the same router because middleware runs after router matching.)
- Parent Router
- Second-Layer Router (Parent Cluster)
- JWT Middleware
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: api-parent
namespace: apps
spec:
entryPoints:
- websecure
routes:
- match: Host(`api.example.com`)
kind: Rule
middlewares:
- name: jwt-auth
tls: {}
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: api-routing
namespace: apps
spec:
parentRefs:
- name: api-parent
namespace: apps
routes: # child routes are not accessible directly without matching the parent route first (AND relation)
- match: HeadersRegexp(`X-Customer-Tier`, `enterprise`)
kind: Rule
services:
- name: apps-api-workload-dedicated@multicluster
kind: TraefikService
- match: PathPrefix(`/`)
kind: Rule
priority: 0 # acts as a catch-all route for all other customers
services:
- name: apps-api-workload-shared@multicluster
kind: TraefikService
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
name: jwt-auth
namespace: apps
spec:
plugin:
jwt:
source:
header:
name: Authorization
prefix: Bearer
claims:
- name: tier
forwardHeader: X-Customer-Tier
In this example, both routers live on the parent cluster. The parent router authenticates the request and injects the X-Customer-Tier header, and the second-layer router makes the routing decision based on that header,
sending enterprise traffic to a dedicated child cluster (apps-api-workload-dedicated@multicluster) while other traffic goes to the shared cluster (apps-api-workload-shared@multicluster).
This example uses specific per-child services (-dedicated and -shared suffixes) rather than the generic apps-api-workload@multicluster weighted round-robin service.
If you used the generic service for the catch-all route, non-enterprise customers could randomly be routed to the dedicated cluster through load balancing, defeating the purpose of tier-based routing.
The per-child services ensure traffic isolation between customer tiers.
Since JWT authentication runs on the parent cluster, the child cluster's uplink entry point does not enforce it. If the uplink port is publicly reachable, requests sent directly to the child bypass the parent's JWT middleware entirely. Use mTLS and network restrictions on the uplink entry point to ensure only the parent can reach child clusters (see Security Considerations and Securing Inter-Cluster Communication).
Securing Inter-Cluster Communication
For production deployments, secure communication between parent and child clusters using mutual TLS (mTLS). With mTLS, both sides authenticate each other during the TLS handshake:
- The child presents its server certificate to the parent — the parent verifies it against a trusted CA (rootCAs)
- The parent presents its client certificate to the child — the child verifies it against the same or a different trusted CA (clientAuth.caFiles)
This two-way verification ensures that only authorized parent clusters can communicate with child clusters, and that the parent connects to legitimate child clusters.
Certificate Files
mTLS requires the following certificates, which must be generated outside of Traefik (e.g., using openssl or a PKI tool). You can use a single CA for both purposes, or use separate CAs:
| File | Location | Purpose |
|---|---|---|
| ca.crt | Parent + Child | CA certificate that signed both the parent's client cert and the child's server cert (or use separate CAs) |
| client.crt / client.key | Parent | Client certificate the parent presents to the child during the TLS handshake |
| child.crt / child.key | Child | Server certificate the child presents to the parent during the TLS handshake |
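For illustration, the files in the table can be generated with openssl along these lines (a single shared CA; the subjects, hostnames, and validity periods are assumptions — production setups typically use a managed PKI):

```shell
# Create a CA, then sign a client cert for the parent and a server cert
# for the child. Filenames match the table above.
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -days 365 -subj "/CN=multicluster-ca"

# Parent's client certificate
openssl req -newkey rsa:2048 -nodes -keyout client.key -out client.csr \
  -subj "/CN=parent-cluster"
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out client.crt -days 365

# Child's server certificate with a SAN matching its public hostname
openssl req -newkey rsa:2048 -nodes -keyout child.key -out child.csr \
  -subj "/CN=child1.example.com"
printf "subjectAltName=DNS:child1.example.com" > child-san.ext
openssl x509 -req -in child.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out child.crt -days 365 -extfile child-san.ext
```

Mount ca.crt, client.crt, and client.key on the parent, and ca.crt, child.crt, and child.key on the child, matching the paths used in the configuration below.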
Configure mTLS on Parent
On the parent, the serversTransport configures both sides of the parent's TLS behavior:
- rootCAs: CA certificates used to verify the child's server certificate (the standard TLS direction — without this, you'd need insecureSkipVerify: true)
- certificates: Client certificate and key that the parent presents to the child (this is what makes it mTLS — without this, it's one-way TLS/HTTPS)
hub:
providers:
multicluster:
children:
child-1:
address: "https://child1.example.com:9443"
serversTransport:
rootCAs:
- /certs/ca.crt # Verify the child's server certificate
certificates:
- certFile: /certs/client.crt # Present to child as client identity
keyFile: /certs/client.key
Configure mTLS on Child Entry Point
On the child, two things must be configured:
- A TLS option with clientAuth — this tells the child to demand a client certificate from the parent and verify it
- A server certificate — the child's own certificate, presented to the parent during the TLS handshake
Reference a TLS option that requires client certificates on the child's uplink entry point:
- Helm Chart
- Install Configuration
ports:
multicluster:
port: 9443
uplink: true
asDefault: true
expose:
default: true
http:
tls:
enabled: true
options: strict-mtls@file
hub:
uplinkEntryPoints:
multicluster:
address: ":9443"
http:
tls:
options: strict-mtls@file
Then define the strict-mtls TLS option in a file provider configuration on the child cluster:
# Routing configuration (file provider) on the child cluster
tls:
options:
strict-mtls:
clientAuth:
caFiles:
- /certs/ca.crt # Verify the parent's client certificate
clientAuthType: RequireAndVerifyClientCert # Reject connections without a valid client cert
minVersion: VersionTLS12
stores:
default:
defaultCertificate:
certFile: /certs/child.crt # Child's server certificate (presented to parent)
keyFile: /certs/child.key
For Kubernetes, you can define the TLS option as a CRD and reference it using the format namespace-name@kubernetescrd:
apiVersion: traefik.io/v1alpha1
kind: TLSOption
metadata:
name: mtls-uplink
namespace: traefik
spec:
clientAuth:
secretNames:
- mtlsca # Secret containing ca.crt
clientAuthType: RequireAndVerifyClientCert
The child's server certificate must be added to the TLS store (not in the TLSOption):
# Helm values - TLS store configuration
tlsStore:
default:
defaultCertificate:
secretName: default-cert # Default certificate
certificates:
- secretName: uplinkcert # Child's server certificate for mTLS
Then reference the TLS option in your uplink entry point configuration:
# Helm values - Uplink entry point with mTLS
ports:
multicluster:
port: 9443
uplink: true
additionalArguments:
- --hub.uplinkEntryPoints.multicluster.http.tls.options=traefik-mtls-uplink@kubernetescrd
The key setting is clientAuthType: RequireAndVerifyClientCert — this is what enforces the "mutual" part of mTLS. Without it, the child would accept any TLS connection (only HTTPS), even from unauthorized clients. With it,
only clients presenting a certificate signed by the trusted CA (i.e., the parent cluster) can connect to the uplink entry point.
Testing mTLS Connection
To verify that mTLS is properly configured, test the connection to the child's uplink endpoint using curl with the parent's client certificate:
curl --cert client.crt --key client.key --cacert ca.crt https://child-cluster.example.com:9443/api/uplinks
This command:
- Uses the parent's client certificate (--cert client.crt) and private key (--key client.key)
- Verifies the child's server certificate against the trusted CA (--cacert ca.crt); replace this with -k only when testing against a self-signed child certificate
- Connects to the child's uplink entry point on the configured port (e.g., 9443) and queries the /api/uplinks endpoint to verify authentication
If mTLS is correctly configured, the child accepts the connection and returns uplink information. Without a valid client certificate, the connection is rejected.
For complete TLS configuration options, see the Multi-Cluster Provider Reference.
Troubleshooting
Uplinks Not Appearing on Parent
- Verify the child cluster's uplink entry point is reachable from the parent
- Check that the child address in the parent configuration is correct
- Check parent logs for polling errors
Traffic Not Reaching Child Clusters
- Verify the service name follows the pattern <uplink-expose-name>@multicluster (the expose name defaults to <namespace>-<uplink-name>, but can be overridden by spec.exposeName)
- Check that the router on the child references the uplink correctly
- Ensure the child cluster's backend service is healthy
Connection Errors
- Verify TLS certificates are valid and not expired
- Check firewall rules allow traffic on the uplink entry point port
- Ensure the serversTransport configuration matches the child's TLS setup
Related Content
- Multi-Cluster Provider Reference: Complete configuration options
- Uplink Reference: Uplink resource specification
- Uplink Entry Points Reference: Entry point configuration
- Multi-Layer Routing: Hierarchical routing patterns
