Feature: Distributor usage trackers by mdisibio · Pull Request #4162 · grafana/tempo

mdisibio · 2024-10-07T15:53:47Z

What this PR does:
This is a new feature that allows a tenant to accurately track the amount of ingested traffic by a set of custom labels. It's similar to the existing traces_spanmetrics_size_total metric created by the generators but improves on it in some key ways.

**Need**
The core need is to export a set of highly accurate metrics on ingested traffic that tenants can use for cost-attribution. This means that every ingested byte can be attributed to something, i.e. a team or department, so that tracing costs can be reconciled. Any attribute in the tracing data can be used.

**Reasons for a new feature**:

The existing size metric isn't accurate enough. It doesn't include non-span data (i.e. resources and scopes), which can be significant: typically 15+% of the total payload. It also cannot be fixed in the generators because of the way input data is sharded across the generator rings. Each time a batch is split by trace ID, the non-span data is duplicated (the resource-level information is copied to each generator target in order to ensure an internally consistent payload). Trying to account for it then errs on the other side and over-counts the non-span data (85% -> 115%). Therefore we needed a new approach that is 99+% accurate. The only component in Tempo that has the original payload is the distributor, so it is the ideal location to add this functionality.
The labels for usage tracking need to be configurable separately from span metrics. Span metrics typically include labels such as http url or status code, span success/failure, and database targets. This level of detail is fine-grained and geared towards operational needs, which are separate from cost attribution and cost reconciliation.
Rename and combine - To bridge gaps in instrumentation and provide value without application changes, we support the ability to rename and combine different attributes into the same output bucket. The prime case for this is to bridge the gap between OpenTelemetry-style semantic conventions (service.name) and historical conventions (app). The per-tenant dimensions are a map of input->output. If multiple inputs correspond to the same output then they are added together.

**Important concepts about this new feature**

This lays the foundation in the distributor for generic trackers in the future. The only tracker for now is cost attribution, which is controlled by per-tenant overrides. Other trackers could do helpful things like track the adoption of instrumentation libraries, databases, etc.
The trackers are exposed on a new endpoint /usage_metrics. This is so that they aren't mixed with the existing operational /metrics, because they are expected to have much higher cardinality and a different purpose.
Significant work went into the algorithm for measuring non-span data correctly and fairly. Tracing payloads are composed of a batch with resource attributes and many spans with their own attributes (simplifying greatly here). A span is always matched to a single category, but the non-span data (the ~15% of data) cannot be. Therefore we split it proportionally based on the assignment of spans. For example, if the batch contains 10 spans with "foo" and 5 spans with "bar", then the category "foo" gets 67% of the non-span bytes and "bar" gets 33%.
This adds overhead to the write path of the distributor, but it should be minimal. It is effectively a single additional call to proto.Size(), and the series tracking contains buffering that enables it to be zero-allocation for existing series. If you run the benchmark you will see:
BenchmarkUsageTracker 1995282 5904 ns/op 0 B/op 0 allocs/op
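
The proportional split of non-span bytes described above can be sketched roughly as follows. This is only an illustration of the arithmetic, not the code in this PR: the function name `splitNonSpanBytes`, the use of `float64`, and the 300-byte figure are all assumptions made for the example.

```go
package main

import "fmt"

// splitNonSpanBytes distributes nonSpanBytes across categories in proportion
// to the number of spans assigned to each category.
// Hypothetical helper for illustration; not taken from the Tempo codebase.
func splitNonSpanBytes(nonSpanBytes int, spansPerCategory map[string]int) map[string]float64 {
	total := 0
	for _, n := range spansPerCategory {
		total += n
	}

	out := make(map[string]float64, len(spansPerCategory))
	if total == 0 {
		// No spans were assigned, so there is nothing to split against.
		return out
	}
	for category, n := range spansPerCategory {
		out[category] = float64(nonSpanBytes) * float64(n) / float64(total)
	}
	return out
}

func main() {
	// 10 spans matched "foo" and 5 matched "bar"; the batch carries an
	// assumed 300 bytes of resource- and scope-level data.
	shares := splitNonSpanBytes(300, map[string]int{"foo": 10, "bar": 5})
	fmt.Printf("foo=%.0f bar=%.0f\n", shares["foo"], shares["bar"])
}
```

With these inputs "foo" receives two thirds of the non-span bytes and "bar" one third, matching the 67% / 33% example.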

TODOs
~~There is some remaining questions:~~
1. Behavior for unconfigured tenants. Currently it's opt-in: if overrides.cost_attribution.dimensions isn't set then the distributor records nothing. But maybe it makes sense to have default behavior here? Service name may be a good default.
2. Behavior when hitting max cardinality. Currently, if we already have the max number of series, new data is recorded into the unlabeled series, while the existing series keep working. This means that it is not possible to assess the data quality after max cardinality is reached. Thinking about alternative behaviors here.

Which issue(s) this PR fixes:
Fixes #

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

knylander-grafana · 2024-10-07T22:55:28Z

Should we create a doc issue for this when it's ready?

joe-elliott

core code looks great. have some Qs but mainly just needs changelog and docs

mdisibio · 2024-10-09T18:25:35Z

Should we create a doc issue for this when it's ready?

I'm happy to add docs in this PR if we want. At the minimum I will update the config and url sections.

mdisibio · 2024-10-10T18:48:21Z

Pushed some changes to configure via map[string]string instead of []string, to handle the case of many-to-one dimensions with relabel. This allows for the flexibility to accommodate gaps across a tenant's data. For example we could now scan for both labels k8s.namespace.name and app_namespace, and combine both into a final namespace label. There are still some outstanding changes TODO.
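
As a rough illustration of the many-to-one mapping, the sketch below sums bytes into a single `namespace` series regardless of which input attribute carried the value. The `record` type, the `attribute` function, the byte counts, and the "first matching input wins" rule are all assumptions for the example; the sketch handles one output label only and is not the tracker implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a hypothetical stand-in for a chunk of ingested data:
// its attributes plus its size in bytes.
type record struct {
	attrs map[string]string
	bytes int
}

// attribute sums bytes per "output=value" series. dims maps an input
// attribute name to an output label name; several inputs may share one output.
// This sketch assumes a single output label and skips records with no match.
func attribute(records []record, dims map[string]string) map[string]int {
	// Sort the input names so the choice is deterministic when a record
	// carries more than one matching attribute (a choice made for this sketch).
	inputs := make([]string, 0, len(dims))
	for in := range dims {
		inputs = append(inputs, in)
	}
	sort.Strings(inputs)

	totals := make(map[string]int)
	for _, r := range records {
		for _, in := range inputs {
			if v, ok := r.attrs[in]; ok {
				totals[dims[in]+"="+v] += r.bytes
				break // first matching input wins for this record
			}
		}
	}
	return totals
}

func main() {
	dims := map[string]string{
		"k8s.namespace.name": "namespace",
		"app_namespace":      "namespace",
	}
	records := []record{
		{attrs: map[string]string{"k8s.namespace.name": "payments"}, bytes: 100},
		{attrs: map[string]string{"app_namespace": "payments"}, bytes: 50},
		{attrs: map[string]string{"app_namespace": "search"}, bytes: 25},
	}
	totals := attribute(records, dims)
	fmt.Println(totals["namespace=payments"], totals["namespace=search"])
}
```

The two "payments" records use different attribute names but land in the same output series, which is the gap-bridging behavior the map-based configuration is meant to allow.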

mdisibio · 2024-10-22T17:45:11Z

Pushed new behavior for overflow and missing labels:

Missing: we now always output a label=value pair for each configured dimension. If both the span and batch are missing this value, then it is set to dimension="__missing__". Each dimension is handled separately, so dimensionA might be populated, and dimensionB could be "__missing__".
Overflow: when max-cardinality is reached it now goes into the series with each dimension="__overflow__". Previously it went to the unlabeled series.

These two changes ensure that the output metrics always have the expected labels, and are consistent across edge cases.

zalegrala

Solid work.

joe-elliott · 2024-10-24T20:27:11Z

 		return &tempopb.PushResponse{}, nil
 	}
 	// check limits
+	// todo - usage tracker include discarded bytes?


i believe the answer is no? we don't want it to include discarded?

To clarify this TODO is about creating separate metrics for discards.
tempo_usage_tracker_bytes_received_total vs
tempo_usage_tracker_bytes_discarded_total

I think for now let's proceed without it. We can add it later if needed.

joe-elliott

partway done. will review the Observe method next

knylander-grafana

Thank you for updating the doc. Approving for the doc only. At a future time, we should convert all of the links in the API doc from relrefs to standard links. We've changed our recommended way to do linking a while back.

joe-elliott

final thoughts. overall lgtm just had some Qs. i'll drop an approval whenever you're ready

joe-elliott · 2024-10-25T12:50:37Z

+				// to rehash, but couldn't figure out a better way for now.
+				// The difficulty is tracking bucket dirty status while
+				// resetting to batch values and recording the span values.
+				if bucket == nil || !slices.Equal(buffer2, last) {


i feel like the normal case here is that people will configure resource level attributes only so we will almost always use the previously grabbed bucket.

the best idea i have to dodge slices.Equal is if we configured each dimension with resource or span level. then we would only search those attributes that could match and we'd know if we needed to look in span level at all.

that might not work with the broader goals of the project though if we intend people to specify attribute names only w/o the overhead of knowing its span or resource.

Yes, we discussed some of this in #4162 (comment). For now I'd like to leave the configuration as-is (scopeless) because of the broader goals of the project, and also because it's similar to how dimensions for the metrics-generator are configured.

* First working draft of cost attribution usage tracker
* Add missing tracker name label, more efficient batch proportioning, cleanup
* Reduce series hashing
* Fix user-configurable overrides tests for new json element
* lint
* Add per-tenant override for max cardinality
* lint, review feedback
* Default to not enabled, cleanup test config
* Explicitly check for usage_metrics handler
* review feedback
* Update tracker to support many-to-one mapping with relabel
* lint
* New behavior for missing and overflow
* Fix issue where subsequent spans would incorrectly reuse the series of the previous span if they were missing values
* Revert maps back to slices now that we can depend on a dimension always having a value
* Please ignore benchmark profiles
* Tweak config to have specific cost attribution tracker section. Update manifest and config index
* lint
* changelog
* Update api docs for new endpoint
* Review feedback
* review feedback
* Swap loop order for a tad more performance

mdisibio added 5 commits September 18, 2024 15:13

First working draft of cost attribution usage tracker

c17aa36

Add missing tracker name label, more efficient batch proportioning, c…

dd1c660

…leanup

Reduce series hashing

f631f66

Fix user-configurable overrides tests for new json element

9841b90

lint

86a64d6

joe-elliott reviewed Oct 8, 2024

Add per-tenant override for max cardinality

bbae991

electron0zero reviewed Oct 9, 2024

mdisibio added 4 commits October 9, 2024 13:49

lint, review feedback

b972253

Default to not enabled, cleanup test config

4488a7b

Explicitly check for usage_metrics handler

40a3845

review feedback

f35267e

mdisibio added 3 commits October 10, 2024 09:58

Merge branch 'main' into usage-tracker

83856bd

Update tracker to support many-to-one mapping with relabel

73f55a0

lint

5ed9bb8

mdisibio added 8 commits October 21, 2024 09:19

New behavior for missing and overflow

33fa586

Fix issue where subsequent spans would incorrectly reuse the series o…

40a0964

…f the previous span if they were missing values

Revert maps back to slices now that we can depend on a dimension alwa…

7e7e949

…ys having a value

Please ignore benchmark profiles

9c670ec

Tweak config to have specific cost attribution tracker section. Updat…

c7c2001

…e manifest and config index

lint

84225bf

changelog

61608f8

Merge branch 'main' into usage-tracker

4f53dde

mdisibio added 2 commits October 22, 2024 15:04

Update api docs for new endpoint

ba69c98

Merge branch 'main' into usage-tracker

300f2db

mdisibio marked this pull request as ready for review October 22, 2024 19:35

mdisibio requested review from ie-pham, javiermolinar, knylander-grafana, mapno, stoewer, yvrhdn and zalegrala as code owners October 22, 2024 19:35

javiermolinar reviewed Oct 23, 2024

Review feedback

95660e4

zalegrala approved these changes Oct 24, 2024

Comment thread modules/distributor/usage/tracker.go

joe-elliott reviewed Oct 24, 2024

Comment thread modules/distributor/usage/tracker.go

Comment thread modules/distributor/usage/tracker.go

Comment thread modules/distributor/usage/tracker.go

Comment thread modules/distributor/usage/tracker.go

knylander-grafana reviewed Oct 25, 2024

Comment thread docs/sources/tempo/configuration/_index.md Outdated

knylander-grafana reviewed Oct 25, 2024

joe-elliott reviewed Oct 25, 2024

mdisibio added 3 commits October 25, 2024 11:09

review feedback

0941a04

Swap loop order for a tad more performance

4042829

Merge branch 'main' into usage-tracker

eed21b5

mdisibio merged commit b0f06ce into grafana:main Oct 25, 2024

joe-elliott mentioned this pull request Nov 15, 2024

How to Create a Prometheus Alert for Missing Traces for a Specific Component in Tempo? #4322

Closed

electron0zero mentioned this pull request Nov 28, 2024

max query expr electron0zero/tempo#3

Closed

electron0zero mentioned this pull request Jan 29, 2025

contri svs electron0zero/tempo#4

Closed
