Metrics
Export Metrics on your APIs with Traefik Hub & OpenTelemetry
Introduction
Traefik Hub uses OpenTelemetry (sometimes referred to as OTel). OpenTelemetry is an open-source observability framework that allows you to collect, process, and export telemetry data from applications and infrastructure. OpenTelemetry helps you get insights into system performance and behavior.
If at any point while reading this documentation you need additional information on OpenTelemetry, you can refer to its official documentation.
Metrics are used to measure and record quantitative data about the performance and behavior of your APIs. Metrics are aggregated over time, allowing you to understand trends and patterns.
Traefik Hub exposes different kinds of metrics: dataflow metrics, API management metrics, ingress metrics, and Generative AI metrics.
Metrics do not provide detailed information about individual requests or transactions. Metrics provide aggregated information about the overall system or application.
Configuration
To enable OpenTelemetry, adjust the default configuration of your Traefik Hub Gateway Helm Chart with the dedicated options, as in the following examples.
HTTP OpenTelemetry endpoint:

    metrics:
      otlp:
        enabled: true
        http:
          enabled: true
          endpoint: "http://myotlpcollector:4318"

gRPC OpenTelemetry endpoint:

    metrics:
      otlp:
        enabled: true
        grpc:
          enabled: true
          endpoint: "myotlpcollector:55690"
Adjust the Configuration
To adjust the default configuration, please refer to the values files of the Chart.
Dataflow Metrics
Metric | Type | Description |
---|---|---|
traefik_hub_api_requests_total | Counter | The number of handled API requests. |
traefik_hub_api_requests_duration_milliseconds_sum traefik_hub_api_requests_duration_milliseconds_count traefik_hub_api_requests_duration_milliseconds_bucket | Histogram | Processing duration histograms. |
traefik_hub_api_requests_bytes_total | Counter | Requests size (in bytes) handled by APIs (body size). |
traefik_hub_api_responses_bytes_total | Counter | Responses size (in bytes) handled by APIs (body size). |
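The three duration series can be combined into latency figures. The following is a minimal Python sketch, assuming the `_bucket` series holds cumulative counts per upper bound in the Prometheus style (an assumption, not something stated on this page). The helper names and all sample values are hypothetical.

```python
import math


def mean_duration_ms(duration_sum, duration_count):
    """Average request duration: the _sum series divided by the _count series."""
    if duration_count == 0:
        return 0.0
    return duration_sum / duration_count


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound_ms, cumulative_count) pairs sorted by
    upper bound and ending with the +Inf bucket. The value is interpolated
    linearly inside the bucket that contains the target rank.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # The rank falls in the overflow bucket: report the last finite bound.
                return lower_bound
            in_bucket = cumulative - lower_count
            if in_bucket == 0:
                return upper_bound
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, cumulative
    return lower_bound


# Hypothetical values for traefik_hub_api_requests_duration_milliseconds_*
buckets = [(50.0, 60), (100.0, 90), (250.0, 100), (math.inf, 100)]
print(mean_duration_ms(6200.0, 100))
print(estimate_quantile(0.95, buckets))
```

The quantile is only an estimate: its precision depends on how finely the bucket boundaries are spaced around the value of interest.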
Labels
Labels, also known as tags or dimensions, are key-value pairs that provide context to metrics. Labels add additional information to your metrics and traces, allowing you to segment and filter data to gain more insights.
Label | Description | Example |
---|---|---|
code | Request code | "200" |
method | Request Method | "GET" |
protocol | Request protocol | "http", "grpc", ... |
user_id | Unique user identifier | "645133c75d2aee16b07ef3da" The user_id holds the internal ID or the JWT ID, depending on where the user comes from. |
email | User email | "[email protected]" The owner of the API Key or Application consuming the API. |
token_name | Name of the API key used to authenticate. | "my-test-token" |
api_name | Name of the API. | "flight-api" |
api_version_name | The version of the API. | "flight-api-v2" |
api_namespace | Namespace of the API. | "airlines" |
tls_version | TLS version used for the request. | "1.0" |
tls_cipher | TLS cipher used for the request. | "TLS_FALLBACK_SCSV" |
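As an illustration of segmenting a metric by label, the sketch below groups samples of `traefik_hub_api_requests_total` by one label and computes the share of requests whose `code` starts with "5". It is a hedged example: the in-memory sample format, the `error_ratio_by` helper, and the values (including the "ticket-api" name) are hypothetical and do not come from Traefik Hub.

```python
from collections import defaultdict

# Hypothetical samples of traefik_hub_api_requests_total: (labels, value).
samples = [
    ({"api_name": "flight-api", "code": "200", "method": "GET"}, 940.0),
    ({"api_name": "flight-api", "code": "500", "method": "GET"}, 60.0),
    ({"api_name": "ticket-api", "code": "200", "method": "POST"}, 300.0),
]


def error_ratio_by(samples, label):
    """Return {label value: share of requests answered with a 5xx code}."""
    totals = defaultdict(float)
    errors = defaultdict(float)
    for labels, value in samples:
        key = labels.get(label, "")
        totals[key] += value
        if labels.get("code", "").startswith("5"):
            errors[key] += value
    return {key: errors[key] / totals[key] for key in totals if totals[key] > 0}


print(error_ratio_by(samples, "api_name"))
```

Passing a different label name, such as `method`, segments the same samples along another dimension.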
API Management Metrics
Metric | Type | Labels | Description |
---|---|---|---|
traefik_hub_apis_ratio | Gauge | api_namespace | Number of APIs. |
traefik_hub_apis_bundle_ratio | Gauge | - | Number of API Bundles. |
traefik_hub_apis_plans_ratio | Gauge | - | Number of API Plans. |
traefik_hub_api_portal_apis_ratio | Gauge | - | Number of APIs published on API Portals. |
traefik_hub_api_versions_ratio | Gauge | api_name , api_namespace | Number of API versions. |
traefik_hub_api_accesses_ratio | Gauge | - | Number of API Accesses. |
traefik_hub_api_keys_ratio | Gauge | user_id , email | Number of API keys with access to APIs on the agent. |
traefik_hub_users_ratio | Gauge | - | Number of users that belong to a group attached to at least one API Access. |
traefik_hub_user_groups_ratio | Gauge | - | Number of groups attached to at least one API Access. |
traefik_hub_portals_ratio | Gauge | - | Number of API Portals. |
Labels
Labels, also known as tags or dimensions, are key-value pairs that provide context to metrics. Labels add additional information to your metrics and traces, allowing you to segment and filter data to gain more insights.
Label | Description | Example |
---|---|---|
app_id | Unique Identifier for the application | "app-id" |
app_name | Name of the Application | "myApp" |
user_id | Unique user identifier | "645133c75d2aee16b07ef3da" The user_id holds the internal ID or the JWT ID, depending on where the user comes from. |
email | User email | "[email protected]" The owner of the API Key or Application consuming the API. |
token_name | Name of the API key used to authenticate. | "my-test-token" |
api_name | Name of the API. | "flight-api" |
api_version_name | The version of the API. | "flight-api-v2" |
api_namespace | Namespace of the API. | "airlines" |
Ingress Metrics
Global Metrics
Metric | Type | Labels | Description |
---|---|---|---|
traefik_config_reloads_total | Count | | The total count of configuration reloads. |
traefik_config_last_reload_success | Gauge | | The timestamp of the last configuration reload success. |
traefik_open_connections | Gauge | entrypoint , protocol | The current count of open connections, by entrypoint and protocol. |
traefik_tls_certs_not_after | Gauge | | The expiration date of certificates. |
traefik_hub_ingresses_ratio | Gauge | | Number of Ingresses in the cluster. |
traefik_hub_ingress_routes_ratio | Gauge | | Number of Traefik Ingresses in the cluster. |
traefik_hub_services_ratio | Gauge | | Number of Services in the cluster. |
Labels
Here is a comprehensive list of labels that are provided by the global metrics:
Label | Description | Example |
---|---|---|
entrypoint | Entrypoint that handled the connection | "example_entrypoint" |
protocol | Connection protocol | "TCP" |
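A gauge such as `traefik_tls_certs_not_after` can be turned into a "days until expiry" figure. The sketch below assumes the gauge value is a Unix timestamp in seconds and that each series can be identified by a certificate name such as the `cn` label. Both points are assumptions, and the helper names and values are hypothetical.

```python
from datetime import datetime, timezone


def days_until_expiry(not_after, now):
    """Days left before a certificate expires (negative once expired).

    `not_after` is assumed to be a Unix timestamp in seconds.
    """
    expires_at = datetime.fromtimestamp(not_after, tz=timezone.utc)
    return (expires_at - now).total_seconds() / 86400


def expiring_soon(certs, now, threshold_days=30):
    """Return the certificate names that expire within `threshold_days`."""
    return sorted(
        name
        for name, not_after in certs.items()
        if days_until_expiry(not_after, now) < threshold_days
    )


# Hypothetical gauge values keyed by certificate common name.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
certs = {
    "example.com": datetime(2025, 1, 11, tzinfo=timezone.utc).timestamp(),
    "api.example.com": datetime(2025, 6, 1, tzinfo=timezone.utc).timestamp(),
}
print(expiring_soon(certs, now))
```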
OpenTelemetry Semantic Conventions
Traefik Proxy follows official OpenTelemetry semantic conventions v1.23.1.
HTTP Server
Metric | Type | Labels | Description |
---|---|---|---|
http.server.request.duration | Histogram | error.type , http.request.method , http.response.status_code , network.protocol.name , server.address , server.port , url.scheme | Duration of HTTP server requests |
Labels
Here is a comprehensive list of labels that are provided by the metrics:
Label | Description | Example |
---|---|---|
error.type | Describes a class of error the operation ended with | "500" |
http.request.method | HTTP request method | "GET" |
http.response.status_code | HTTP response status code | "200" |
network.protocol.name | OSI application layer or non-OSI equivalent | "http/1.1" |
network.protocol.version | Version of the protocol specified in network.protocol.name | "1.1" |
server.address | Name of the local HTTP server that received the request | "example.com" |
server.port | Port of the local HTTP server that received the request | "80" |
url.scheme | The URI scheme component identifying the used protocol | "http" |
HTTP Client
Metric | Type | Labels | Description |
---|---|---|---|
http.client.request.duration | Histogram | error.type , http.request.method , http.response.status_code , network.protocol.name , server.address , server.port , url.scheme | Duration of HTTP client requests |
Labels
Here is a comprehensive list of labels that are provided by the metrics:
Label | Description | Example |
---|---|---|
error.type | Describes a class of error the operation ended with | "500" |
http.request.method | HTTP request method | "GET" |
http.response.status_code | HTTP response status code | "200" |
network.protocol.name | OSI application layer or non-OSI equivalent | "http/1.1" |
network.protocol.version | Version of the protocol specified in network.protocol.name | "1.1" |
server.address | Name of the local HTTP server that received the request | "example.com" |
server.port | Port of the local HTTP server that received the request | "80" |
url.scheme | The URI scheme component identifying the used protocol | "http" |
HTTP Metrics
On top of the official OpenTelemetry semantic conventions, Traefik provides its own metrics to monitor the incoming traffic.
EntryPoint Metrics
Metric | Type | Labels | Description |
---|---|---|---|
traefik_entrypoint_requests_total | Count | code , method , protocol , entrypoint | The total count of HTTP requests received by an entrypoint. |
traefik_entrypoint_requests_tls_total | Count | tls_version , tls_cipher , entrypoint | The total count of HTTPS requests received by an entrypoint. |
traefik_entrypoint_request_duration_seconds | Histogram | code , method , protocol , entrypoint | Request processing duration histogram on an entrypoint. |
traefik_entrypoint_requests_bytes_total | Count | code , method , protocol , entrypoint | The total size of HTTP requests in bytes handled by an entrypoint. |
traefik_entrypoint_responses_bytes_total | Count | code , method , protocol , entrypoint | The total size of HTTP responses in bytes handled by an entrypoint. |
Router Metrics
Metric | Type | Labels | Description |
---|---|---|---|
traefik_router_requests_total | Count | code , method , protocol , router , service | The total count of HTTP requests handled by a router. |
traefik_router_requests_tls_total | Count | tls_version , tls_cipher , router , service | The total count of HTTPS requests handled by a router. |
traefik_router_request_duration_seconds | Histogram | code , method , protocol , router , service | Request processing duration histogram on a router. |
traefik_router_requests_bytes_total | Count | code , method , protocol , router , service | The total size of HTTP requests in bytes handled by a router. |
traefik_router_responses_bytes_total | Count | code , method , protocol , router , service | The total size of HTTP responses in bytes handled by a router. |
Service Metrics
Metric | Type | Labels | Description |
---|---|---|---|
traefik_service_requests_total | Count | code , method , protocol , service | The total count of HTTP requests processed on a service. |
traefik_service_requests_tls_total | Count | tls_version , tls_cipher , service | The total count of HTTPS requests processed on a service. |
traefik_service_request_duration_seconds | Histogram | code , method , protocol , service | Request processing duration histogram on a service. |
traefik_service_retries_total | Count | service | The count of request retries on a service. |
traefik_service_server_up | Gauge | service , url | Current status of the service's server: 0 for down, 1 for up. |
traefik_service_requests_bytes_total | Count | code , method , protocol , service | The total size of requests in bytes received by a service. |
traefik_service_responses_bytes_total | Count | code , method , protocol , service | The total size of responses in bytes returned by a service. |
Labels
Here is a comprehensive list of labels that are provided by the metrics:
Label | Description | Example |
---|---|---|
cn | Certificate Common Name | "example.com" |
code | Request code | "200" |
entrypoint | Entrypoint that handled the request | "example_entrypoint" |
method | Request Method | "GET" |
protocol | Request protocol | "http" |
router | Router that handled the request | "example_router" |
sans | Certificate Subject Alternative Names | "example.com" |
serial | Certificate Serial Number | "123..." |
service | Service that handled the request | "example_service@provider" |
tls_cipher | TLS cipher used for the request | "TLS_FALLBACK_SCSV" |
tls_version | TLS version used for the request | "1.0" |
url | Service server url | "http://example.com" |
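The `*_total` metrics above are cumulative counts, so they are usually read as a rate between two samples rather than as raw values. The sketch below shows that arithmetic in Python, treating a drop in the value as a counter reset. The helper names, the reset handling, and the sample values are illustrative assumptions and not Traefik behavior documented on this page.

```python
def counter_increase(previous, current):
    """Increase of a cumulative counter between two samples.

    A current value lower than the previous one is treated as a reset
    (for example after a restart), so counting restarts from zero.
    """
    if current < previous:
        return current
    return current - previous


def per_second_rate(previous, current, interval_seconds):
    """Average per-second rate over the interval between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return counter_increase(previous, current) / interval_seconds


# Hypothetical traefik_service_requests_total samples taken 30 seconds apart.
print(per_second_rate(1200.0, 1500.0, 30.0))
# Same interval, but the counter was reset between the two samples.
print(per_second_rate(1500.0, 90.0, 30.0))
```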
Generative AI Metrics
The Generative AI client metrics are only available when you enable the AI Gateway.
The following metrics are specifically designed to monitor Generative AI client applications, adhering to OpenTelemetry’s conventions.
Metric | Type | Description |
---|---|---|
gen_ai.client.token.usage | Histogram | Measures the number of input and output tokens used in AI operations. |
gen_ai.client.operation.duration | Histogram | Measures the duration of Generative AI operations in seconds. |
For more detailed information on the available metrics & their attributes, refer to the OpenTelemetry Gen AI client metrics documentation.
Labels
Label | Description | Example |
---|---|---|
app_id | Unique Identifier for the application | "app-id" |
app_name | Name of the Application | "myApp" |
error.type | Describes a class of error the operation ended with. | timeout; java.net.UnknownHostException; server_certificate_invalid; 500 |
genai.operation.name | The name of the operation performed | "chat" |
genai.response_model | The model used by the AI service in the response | "gpt-4o-2024-08-06" |
genai.request_model | The model requested by the client | "gpt-3.5-turbo" |
genai.system | Identifier for the AI provider system | "openai" |
server.address | The address of the AI service server | "api.openai.com" |
server.port | The port number of the AI service server | 443 |
Related Content
- Learn how to use Prometheus to export your API metrics.