[Metrics Generator] Add support for entity limiting #5788
Force-pushed from edc77ae to f2eced7.
This PR adds a new type of limiter to the metrics generator. It limits on "entities" rather than series. Here, an entity is a single labelset shared across multiple metrics, excluding external labels. In effect, the limiter always generates the full set of data for a given entity, rather than limiting randomly once the series limit is triggered. Importantly, this limit is applied at collection time, so it is still necessary to set a series limit to avoid generator OOM kills. In practice, the series limit should likely be some constant multiple of the entity limit, e.g. 20x. This lets the system limit primarily on entities while still enforcing a ceiling on memory consumption.
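To make the entity definition concrete, here is a minimal sketch of the idea, not the code in this PR. It assumes labels arrive as a plain map without the metric name, and the names `entityHash`, `entityLimiter`, and `admit` are hypothetical. The hash skips external labels, so every metric generated for the same labelset resolves to the same entity and is admitted or rejected as a unit.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// entityHash hashes a labelset while skipping external labels, so all
// metrics that share the remaining labels map to the same entity.
// Assumption: the metric name is not included in labels.
func entityHash(labels map[string]string, external map[string]struct{}) uint64 {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		if _, skip := external[k]; skip {
			continue
		}
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort for a stable hash

	h := fnv.New64a()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0}) // separator so "ab"+"c" differs from "a"+"bc"
		h.Write([]byte(labels[k]))
		h.Write([]byte{0})
	}
	return h.Sum64()
}

// entityLimiter is a hypothetical limiter that tracks active entities.
type entityLimiter struct {
	max    int // assumption: 0 means no entity limit
	active map[uint64]struct{}
}

func newEntityLimiter(max int) *entityLimiter {
	return &entityLimiter{max: max, active: make(map[uint64]struct{})}
}

// admit reports whether data for the given entity may be generated.
// A known entity is always admitted; a new one only while under the limit.
func (l *entityLimiter) admit(hash uint64) bool {
	if _, ok := l.active[hash]; ok {
		return true
	}
	if l.max > 0 && len(l.active) >= l.max {
		return false
	}
	l.active[hash] = struct{}{}
	return true
}

func main() {
	external := map[string]struct{}{"cluster": {}}
	lim := newEntityLimiter(1)

	checkout := entityHash(map[string]string{"service": "checkout", "cluster": "eu"}, external)
	search := entityHash(map[string]string{"service": "search", "cluster": "eu"}, external)

	fmt.Println(lim.admit(checkout)) // first entity, under the limit
	fmt.Println(lim.admit(checkout)) // same entity again, e.g. for another metric
	fmt.Println(lim.admit(search))   // a second entity while the limit is 1
}
```

Because this sketch only counts entities, it does not bound the number of series per entity, which is why a series limit remains necessary as a memory ceiling.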
Force-pushed from 45a56eb to 737e348.
joe-elliott
left a comment
There are a fair number of performance concerns I would like addressed. I'm also concerned that even when the generator does not have max active entities configured, it seems to be doing a fair amount of work.
Final question: why don't we just limit this the same way we do max active series? Can we extend onAddMetricSeries to take an entity hash and just do the limiting there? It seems like that would have much less impact on the codebase.
@joe-elliott Based on our conversations in the Grafana Labs internal Slack, I've made the following changes:
I opted to keep the demand metrics for both limiters always on, so it's possible to estimate what a good entity limit would be when migrating. I've also moved the series limiter metrics into local_series_limiter.go and added equivalents for entity limiting, so it should all feel familiar to operators when switching.
joe-elliott
left a comment
added some thoughts. i really like the reorg with the limiter interface and i think this will be np to merge. can we get a changelog entry?
also it might be nice to add a few lines to a doc somewhere that this mode exists and why. unfortunately i couldn't find a good place to add it. maybe just note it in the metrics generator config? a full explanation of the feature and all the details is likely a bit much (unless you are super excited to add all that), but just documenting its existence might be nice to see if we get community adoption.
@joe-elliott I believe I've implemented all your feedback. Regarding docs, I think the config docs are sufficient for now. Once I get some experience with this in practice and work on a few other improvements, it might be worth adding or updating a docs page specifically for managing span metrics cardinality. We'll see.
knylander-grafana
left a comment
Thank you for adding comments to the config docs.
@joe-elliott done! thanks