Use BasicLifecycler for distributors and auto-forget #2154
Merged
56quarters merged 4 commits into main on Jun 24, 2022
pstibrany
reviewed
Jun 21, 2022
e461b3e to
2a5c10c
Compare
100669f to
a096530
Compare
c05e273 to
83c065d
Compare
colega
reviewed
Jun 23, 2022
colega
reviewed
Jun 23, 2022
colega
reviewed
Jun 23, 2022
Contributor
LGTM, left some nitpicks, waiting to be un-drafted.
Contributor
Hi, let me know if there's anything useful here: #684. I'd like to close that old PR of mine. I never had time to finish it, plus Lifecycler had a feature that I didn't know how to emulate.
pstibrany
reviewed
Jun 23, 2022
Contributor
Author
Definitely some useful parts! I see some similarity with needing a number of healthy ring members:
I'm not sure what the "MinReadyDuration" feature would look like as a delegate.
colega
approved these changes
Jun 24, 2022
pstibrany
approved these changes
Jun 24, 2022
Use the BasicLifecycler in distributors for managing their lifecycle so that we can take advantage of the "auto-forget" delegates feature. This prevents the ring from filling up with "unhealthy" distributors that are never removed. This wasn't a bug but it was confusing for users and operators.
Fixes #2138
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
315bc07 to
ed3350f
Compare
pstibrany
reviewed
Jun 24, 2022
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
rlex
added a commit
to rlex/mimir
that referenced
this pull request
Jun 28, 2022
* main: (63 commits)
  Add new section on website for links to blog posts, podcasts and talks. (grafana#2216)
  Rename codified errors to errors catalog (grafana#2256)
  Helm: add a step to contributing doc (grafana#2257)
  Signal that 2.2 release is now in progress. (grafana#2254)
  Removed migration of alertmanager local state files from old hierarchy (Cortex 1.8 and earlier) (grafana#2253)
  operations/mimir: Change multi_zone_ingester_max_unavailable to 25 (grafana#2251)
  Helm: weekly release (grafana#2252)
  Jsonnet: Configure ingester max global metadata per user and per metric (grafana#2250)
  Helm: metamonitor naming (grafana#2236)
  Mimir documentation about out-of-order (grafana#2183)
  Vendor latest mimir-prometheus/main (grafana#2243)
  Set CODEOWNERS to primary technical writer (grafana#2242)
  Use BasicLifecycler for distributors and auto-forget (grafana#2154)
  Docs: Basic documentation for deploying the ruler using jsonnet. (grafana#2127)
  Fix post merge reviews on 2187 (grafana#2230)
  Add tests for user metadata in the ingester (grafana#2184)
  Change the error message template for per-tenant limits (grafana#2234)
  helm: meta-monitoring (grafana#2068)
  Article about migrating from Consul to memberlist. Added documentation for /memberlist endpoint. (grafana#2166)
  Update runbooks to mention possibility to investigate memberlist KV store in various alerts (grafana#2158)
  ...
masonmei
pushed a commit
to udmire/mimir
that referenced
this pull request
Jul 11, 2022
Use the BasicLifecycler in distributors for managing their lifecycle so that we can take advantage of the "auto-forget" delegates feature. This prevents the ring from filling up with "unhealthy" distributors that are never removed. This wasn't a bug but it was confusing for users and operators.
Fixes grafana#2138
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
pracucci
reviewed
Jul 13, 2022
// ringAutoForgetUnhealthyPeriods is how many consecutive timeout periods an unhealthy instance
// in the ring will be automatically removed after.
ringAutoForgetUnhealthyPeriods = 10
Collaborator
Looks very high. I would be more aggressive with distributors, like 2 should be enough.
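To make the trade-off in this comment concrete, here is a small sketch of the arithmetic. It assumes the forget delay is the number of periods multiplied by the ring heartbeat timeout, which is how the code comment above reads. The one-minute timeout and the helper name `forgetAfter` are assumptions for illustration and are not taken from the PR.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// periodsInPR is the value of ringAutoForgetUnhealthyPeriods under review.
	periodsInPR = 10
	// periodsSuggested is the more aggressive value proposed in the review.
	periodsSuggested = 2
	// assumedHeartbeatTimeout is an assumption for this example only; the
	// real value comes from the distributor ring configuration.
	assumedHeartbeatTimeout = time.Minute
)

// forgetAfter (hypothetical helper) returns how long an instance can go
// without a heartbeat before it would be removed from the ring.
func forgetAfter(periods int, heartbeatTimeout time.Duration) time.Duration {
	return time.Duration(periods) * heartbeatTimeout
}

func main() {
	fmt.Println("in PR:    ", forgetAfter(periodsInPR, assumedHeartbeatTimeout))
	fmt.Println("suggested:", forgetAfter(periodsSuggested, assumedHeartbeatTimeout))
}
```

Under that assumed timeout, the difference is between a stale distributor lingering for ten minutes and lingering for two.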
What this PR does
Use the BasicLifecycler in distributors for managing their lifecycle so
that we can take advantage of the "auto-forget" delegates feature. This
prevents the ring from filling up with "unhealthy" distributors that are
never removed. This wasn't a bug but it was confusing for users and
operators.
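The idea behind auto-forget can be illustrated with a toy model. This is a minimal sketch and not the dskit implementation: the `instance` and `ringDesc` types, the `forgetUnhealthy` function, and all sample values are hypothetical stand-ins. It only shows the general rule of dropping ring entries whose last heartbeat is older than a forget period.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// instance and ringDesc are hypothetical stand-ins for per-instance ring state.
type instance struct {
	lastHeartbeat time.Time
}

type ringDesc map[string]instance

// Illustrative values only; none of these come from the PR.
const exampleHeartbeatTimeout = time.Minute

const exampleForgetPeriod = 10 * exampleHeartbeatTimeout

var exampleNow = time.Date(2022, time.June, 24, 12, 0, 0, 0, time.UTC)

// forgetUnhealthy deletes every instance whose last heartbeat is older than
// forgetPeriod and returns the removed IDs in sorted order.
func forgetUnhealthy(ring ringDesc, now time.Time, forgetPeriod time.Duration) []string {
	var removed []string
	for id, inst := range ring {
		if now.Sub(inst.lastHeartbeat) > forgetPeriod {
			delete(ring, id) // deleting during range is safe in Go
			removed = append(removed, id)
		}
	}
	sort.Strings(removed)
	return removed
}

// exampleRing builds a ring with one fresh and one long-stale distributor.
func exampleRing(now time.Time) ringDesc {
	return ringDesc{
		"distributor-1": {lastHeartbeat: now.Add(-30 * time.Second)},
		"distributor-2": {lastHeartbeat: now.Add(-25 * time.Minute)},
	}
}

func main() {
	ring := exampleRing(exampleNow)
	removed := forgetUnhealthy(ring, exampleNow, exampleForgetPeriod)
	fmt.Println("removed:", removed, "remaining:", len(ring))
}
```

In this model a healthy instance performs the cleanup as a side effect of its own periodic work, so stale entries disappear without operator action.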
Which issue(s) this PR fixes or relates to
Fixes #2138
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]