Skip to content

[Metrics Generator] Add support for entity limiting#5788

Merged
joe-elliott merged 27 commits intomainfrom
logiraptor/external-metrics-limiter
Nov 14, 2025
Merged

[Metrics Generator] Add support for entity limiting#5788
joe-elliott merged 27 commits intomainfrom
logiraptor/external-metrics-limiter

Conversation

@Logiraptor
Copy link
Copy Markdown
Contributor

@Logiraptor Logiraptor commented Oct 23, 2025

This PR adds a new type of limiter to the metrics generator. It is
designed to limit on "entities" rather than series. In this case an
entity is a single labelset across multiple metrics, excluding external
labels. In effect, it allows the limiter to always generate the full set
of data for a given entity, rather than limiting randomly once the
series limit is triggered.

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Oct 23, 2025

CLA assistant check
All committers have signed the CLA.

@Logiraptor Logiraptor changed the title logiraptor/external metrics limiter [Metrics Generator] Add support for external metrics generator Oct 23, 2025
@github-actions
Copy link
Copy Markdown
Contributor

💻 Deploy preview available ([Metrics Generator] Add support for external metrics generator):

@Logiraptor Logiraptor force-pushed the logiraptor/external-metrics-limiter branch from edc77ae to f2eced7 Compare October 27, 2025 20:40
@Logiraptor Logiraptor changed the title [Metrics Generator] Add support for external metrics generator [Metrics Generator] Add support for entity limiting Oct 27, 2025
@Logiraptor Logiraptor marked this pull request as ready for review October 27, 2025 20:55
This PR adds a new type of limiter to the metrics generator. It is
designed to limit on "entities" rather than series. In this case an
entity is a single labelset across multiple metrics, excluding external
labels. In effect, it allows the limiter to always generate the full set
of data for a given entity, rather than limiting randomly once the
series limit is triggered.

Importantly, this limit is applied at collection time, so it's still
necessary to set a series limit in order to avoid generator OOM kills.
In practice, the series limit should likely be some constant multiple of
the entity limit. e.g. 20x. This allows the system to primarily limit on
entities while still enforcing a ceiling on memory consumption.
@Logiraptor Logiraptor force-pushed the logiraptor/external-metrics-limiter branch from 45a56eb to 737e348 Compare October 30, 2025 12:39
Copy link
Copy Markdown
Collaborator

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair number of performance concerns I would like to be addressed. I'm also concerned that even if the generator does not have max active entities configured it seems to be doing a fair amount of work.

Final question, why don't we just limit this in the same way we do max active series? Can we extend onAddMetricSeries to take an entity hash? and just do the limiting there? it seems like it would have much less impact on the codebase.

Comment thread modules/generator/localentitylimiter/local_entity_limiter.go Outdated
Comment thread modules/generator/registry/registry.go Outdated
Comment thread modules/generator/registry/registry.go Outdated
Comment thread modules/generator/localentitylimiter/local_entity_limiter.go Outdated
Comment thread modules/generator/registry/registry.go Outdated
Comment thread modules/generator/registry/registry.go Outdated
Comment thread modules/generator/registry/registry.go Outdated
Comment thread modules/generator/localentitylimiter/local_entity_limiter.go Outdated
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Oct 31, 2025

💻 Deploy preview deleted ([Metrics Generator] Add support for entity limiting).

@Logiraptor
Copy link
Copy Markdown
Contributor Author

@joe-elliott Based on our conversations in Grafana Labs internal slack, I've made the following changes:

  • Simplified the Limiter interface to the minimum needed for this PR only
  • Extracted series limiting into an implementation of Limiter
  • Added a configuration option to pick which limiter applies

I opted to keep the demand metrics for both limiters always on, this way it's possible to estimate what a good entity limit would be when migrating. I've also moved the series limiter metrics into local_series_limiter.go and added equivalents for entity limiting, so it should all feel familiar for operators when switching.

Copy link
Copy Markdown
Collaborator

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some thoughts. i really like the reorg with the limiter interface and i think this will be np to merge. can we get a changelog entry?

also it might be nice to add a few lines to a doc somewhere that this mode exists and why. unfortunately i couldn't find a good place to add it. maybe just note it in the metrics generator config? a full explanation of the feature and all the details is likely a bit much (unless you are super excited to add all that), but just documenting its existence might be nice to see if we get community adoption.

Comment thread modules/generator/instance.go
Comment thread modules/generator/localentitylimiter/local_entity_limiter.go Outdated
Comment thread modules/generator/localentitylimiter/local_entity_limiter.go
Comment thread modules/generator/localentitylimiter/local_entity_limiter.go Outdated
Comment thread modules/generator/localentitylimiter/local_entity_limiter.go Outdated
Comment thread modules/generator/localentitylimiter/local_entity_limiter.go Outdated
@Logiraptor
Copy link
Copy Markdown
Contributor Author

@joe-elliott I believe I've implemented all your feedback. Regarding docs, I think for now the config docs are sufficient. Once I get some experience with this in practice, and also work on a few other improvements, it might be worth adding / updating a docs page specifically for managing span metrics cardinality, we'll see.

Copy link
Copy Markdown
Contributor

@knylander-grafana knylander-grafana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding comments to the config docs.

Copy link
Copy Markdown
Collaborator

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loki Grafana Tempo Mimir

looks like some minor conflicts to cleanup then we can merge?

@Logiraptor
Copy link
Copy Markdown
Contributor Author

@joe-elliott done! thanks

@joe-elliott joe-elliott merged commit 86e5820 into main Nov 14, 2025
23 checks passed
@joe-elliott joe-elliott deleted the logiraptor/external-metrics-limiter branch November 14, 2025 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants