grafana · knylander-grafana · Oct 3, 2025 · Aug 30, 2025 · Sep 4, 2025 · Sep 4, 2025
@@ -135,3 +135,12 @@ query_frontend:
     concurrent_jobs: 8
     target_bytes_per_job: 1.25e+09 # ~1.25GB
 ```
+
+## Sampling and performance optimization
+
+TraceQL metrics queries support sampling hints to improve performance on large datasets. Refer to the [TraceQL metrics sampling](/docs/tempo/<TEMPO_VERSION>/metrics-from-traces/metrics-queries/sampling-guide/) documentation for more information.
+
+When using sampling in your TraceQL metrics queries, consider:
+
+- **Timeout settings:** Sampled queries run faster but may still benefit from adequate timeouts
+- **Concurrent jobs:** Sampling reduces per-job processing time, allowing higher concurrency
@@ -335,3 +335,42 @@ This example means the attribute `resource.cluster` had too many values.
 ```
 { __meta_error="__too_many_values__", resource.cluster=<nil> }
 ```
+
+## Adaptive sampling
+
+TraceQL metrics queries support sampling to optimize performance and control sampling behavior.
+There are three sampling methods available:
+
+- Adaptive sampling using `with(sample=true)`, which automatically determines the optimal sampling strategy based on query characteristics.
+- Fixed span sampling using `with(span_sample=0.xx)`, which selects the specified percentage of spans.
+- Fixed trace sampling using `with(trace_sample=0.xx)`, which selects complete traces for analysis.
+
+Refer to the [TraceQL metrics sampling](/docs/tempo/<TEMPO_VERSION>/metrics-from-traces/metrics-queries/sampling-guide/) documentation for more information.
+
+{{< admonition  type="note" >}}
+Sampling hints only work with TraceQL metrics queries (those using functions like `rate()`, `count_over_time()`, etc.).
+{{< /admonition >}}
+
+### Adaptive sampling: `with(sample=true)`
+
+Automatically determines optimal sampling strategy based on query selectivity and data volume.
+
+```
+{ resource.service.name="frontend" } | rate() with(sample=true)
+```
+
+#### Fixed span sampling: `with(span_sample=0.xx)`
+
+Samples a fixed percentage of spans for span-level aggregations.
+
+```
+{ status=error } | count_over_time() with(span_sample=0.1)
+```
+
+#### Fixed trace sampling: `with(trace_sample=0.xx)`
+
+Samples a fixed percentage of traces for trace-level aggregations.
+
+```
+{ } | count() by (resource.service.name) with(trace_sample=0.05)
+```
@@ -0,0 +1,98 @@
+---
+title: TraceQL metrics sampling
+menuTitle: TraceQL metrics sampling
+description: Optimize TraceQL metrics query performance using sampling hints
+weight: 500
+keywords:
+  - TraceQL metrics
+  - sampling
+  - performance optimization
+  - query optimization
+---
+
+# TraceQL metrics sampling
+
+{{< docs/shared source="tempo" lookup="traceql-metrics-admonition.md" version="<TEMPO_VERSION>" >}}
+
+TraceQL metrics sampling dynamically and automatically chooses how to sample your tracing data to give you the highest quality signal with examining as little data as possible.
+The overall performance improvement depends on the query. Heavy queries, such as `{ } | rate()`, show improvements of 2-4 times.
+
+Sampling intelligently selects a representative subset of data for processing, making it particularly valuable for:
+
+- Real-time dashboards requiring fast refresh rates
+- Exploratory data analysis where approximate results accelerate insights
+- Resource-constrained environments with limited compute capacity
+- Large-scale deployments processing terabytes of trace data daily
+
+Adaptive sampling was featured in the September 2025 Tempo community call. Watch the [recording](https://www.youtube.com/watch?v=7H8JX5FUw08) starting at the 12:00 minute mark to learn more.
+
+Refer to the [TraceQL metrics documentation](https://grafana.com/docs/tempo/<TEMPO_VERSION>/metrics-from-traces/metrics-queries/) to learn more.
+
+{{< youtube id="fdmLmJMlUjI" start="720" >}}
+
+## Sampling methods
+
+There are three sampling methods available:
+
+- Adaptive sampling using `with(sample=true)`, which automatically determines the optimal sampling strategy based on query characteristics.
+- Fixed span sampling using `with(span_sample=0.xx)`, which selects the specified percentage of spans.
+- Fixed trace sampling using `with(trace_sample=0.xx)`, which selects complete traces for analysis.
+
+### How adaptive sampling works
+
+Adaptive sampling, `with(sample=true)`, applies probabilistic sampling at the storage layer.
+This sampling method uses an adaptive probabilistic approach that responds to how common spans and traces matching the query are.
+This approach applies probabilistic sampling at the storage layer, for example, only inspecting `xx%` spans, or `xx%` traces, depending on the needs of the query.
+
+When there is a lot of data, it lowers the sampling rate. When matches are rare it keeps the sampling rate higher, possibly never going below 100%. Therefore, the performance gain depends on the query.
+
+This behavior can be overridden to focus more on fixed span sampling using `with(span_sample=0.xx)` or fixed trace sampling using `with(trace_sample=0.xx)`.
+
+## Before you begin
+
+TraceQL metrics sampling requires:
+
+- Tempo 2.8+ with TraceQL metrics enabled
+- `local-blocks` processor configured in metrics-generator ([documentation](/docs/tempo/<TEMPO_VERSION>/metrics-from-traces/metrics-queries/configure-traceql-metrics/))
+- Grafana 10.4+ or Grafana Cloud for UI integration
+
+You can use the TraceQL query editor in the Tempo data source in Grafana or Grafana Cloud to run the sample queries.
+Refer to [TraceQL queries in Grafana](https://grafana.com/docs/tempo/<TEMPO_VERSION>/traceql/query-editor/) for more information.
+
+## Adaptive sampling using `with(sample=true)`
+
+Adaptive sampling automatically determines the optimal sampling strategy based on query characteristics. It switches between span-level and trace-level sampling as needed and adjusts sampling rates dynamically.
+The goal is for `with(sample=true)` to be safe to include in virtually any query, regardless of scale or selectivity.
+
+```traceql
+{ resource.service.name="checkout-service" } | rate() with(sample=true)
+{ status=error } | count_over_time() by (resource.service.name) with(sample=true)
+```
+
+**Best for:** Most queries. Specifically, all queries returning a single series, and cases where the dynamic sampling rate is important, such as when the traffic has large variations across time or is not known in advance.
+
+**Limitations:** May under-sample rare events depending on the query, if it returns time series with a large difference between the most common and rarest events.
+
+## Fixed span sampling using `with(span_sample=0.xx)`
+
+Fixed span sampling selects the specified percentage of spans.
+
+```traceql
+{ status=error } | rate() by (resource.service.name) with(span_sample=0.1)
+```
+
+**Best for:** Exact control over accuracy and speed when the data characteristics are known in advance.
+
+**Limitations:** May miss important events during low-volume periods and not optimal for naturally selective queries.
+
+## Fixed trace sampling using `with(trace_sample=0.xx)`
+
+Fixed trace sampling selects complete traces for analysis, preserving trace context and relationships between spans within the same request flow.
+
+```traceql
+{ } >> { status=error }  | rate() by (resource.service.name) with(trace_sample=0.1)
+```
+
+**Best for:** Trace-level aggregations, service dependency mapping, and error correlation analysis.
+
+**Limitations:** Not as accurate as span-level sampling when trace sizes vary significantly. Only use for queries requiring it, such as structural or spanset correlation, and prefer adaptive or span-level sampling for all others.