-
Notifications
You must be signed in to change notification settings - Fork 692
[metrics]: enable native histograms for Tempo's promauto-registered metrics #6910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
896ceb9
e7ba032
556365d
fce3f0c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -93,6 +93,7 @@ func IsSingleBinary(target string) bool { | |
| func (t *App) initServer() (services.Service, error) { | ||
| t.cfg.Server.MetricsNamespace = metricsNamespace | ||
| t.cfg.Server.ExcludeRequestInLog = true | ||
| t.cfg.Server.MetricsNativeHistogramFactor = 1.1 | ||
|
|
||
|
Comment on lines
93
to
97
|
||
| if t.cfg.EnableGoRuntimeMetrics { | ||
| // unregister default Go collector | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -66,10 +66,13 @@ var ( | |
| Help: "The total number of blocks cleared.", | ||
| }, []string{"block_type"}) | ||
| metricCompletionSize = promauto.NewHistogram(prometheus.HistogramOpts{ | ||
| Namespace: "tempo_live_store", | ||
| Name: "completion_size_bytes", | ||
| Help: "Size in bytes of blocks completed.", | ||
| Buckets: prometheus.ExponentialBuckets(1024*1024, 2, 10), // from 1MB up to 1GB | ||
| Namespace: "tempo_live_store", | ||
| Name: "completion_size_bytes", | ||
| Help: "Size in bytes of blocks completed.", | ||
| Buckets: prometheus.ExponentialBuckets(1024*1024, 2, 10), // from 1MB up to 1GB | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this true, ie from 1MB to 1GB? Or is this more like 500MB?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IE the first bucket is 1 MiB, so the last one would be 512MiB? Other than that looks good.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch. This is pre-existing, but I've pushed a commit to update the comment. |
||
| NativeHistogramBucketFactor: 1.1, | ||
| NativeHistogramMaxBucketNumber: 100, | ||
| NativeHistogramMinResetDuration: 1 * time.Hour, | ||
| }) | ||
| metricBackPressure = promauto.NewCounterVec(prometheus.CounterOpts{ | ||
| Namespace: "tempo", | ||
|
|
@@ -78,10 +81,14 @@ var ( | |
| Help: "The total amount of time spent waiting to process data from queue", | ||
| }, []string{"reason"}) | ||
| metricTotalBackPressure = promauto.NewHistogram(prometheus.HistogramOpts{ | ||
| Namespace: "tempo", | ||
| Subsystem: "live_store", | ||
| Name: "back_pressure_duration_seconds", | ||
| Help: "Duration of backpressure wait per push", | ||
| Namespace: "tempo", | ||
| Subsystem: "live_store", | ||
| Name: "back_pressure_duration_seconds", | ||
| Help: "Duration of backpressure wait per push", | ||
| Buckets: prometheus.DefBuckets, | ||
| NativeHistogramBucketFactor: 1.1, | ||
| NativeHistogramMaxBucketNumber: 100, | ||
| NativeHistogramMinResetDuration: 1 * time.Hour, | ||
| }) | ||
| ) | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting
t.cfg.Server.MetricsNativeHistogramFactor = 1.1enables native histograms for server request/throughput metrics by default, which adds per-request CPU/memory overhead and (because the upstream field isyaml:"-") can’t be tuned/disabled via Tempo config.Could we consider plumbing a Tempo config/flag for this (even if defaulting to 1.1), so operators can turn it off or adjust the factor if they hit performance or cardinality/memory issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe this is necessary.