tempo: add new alerts and runbooks #6276
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3c8262646c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a9236849ab
- "alert": "TempoMetricsGeneratorProcessorUpdatesFailing"
  "annotations":
    "message": "Metrics-generator processor updates are failing for tenant {{ $labels.tenant }} in {{ $labels.cluster }}/{{ $labels.namespace }}."
    "runbook_url": "https://github.com/grafana/tempo/tree/main/operations/tempo-mixin/runbook.md#TempoMetricsGeneratorProcessorUpdatesFailing"
Complete the compiled alert definition
The compiled alerts.yaml now terminates immediately after the annotations for TempoMetricsGeneratorProcessorUpdatesFailing, so this alert (and anything that should follow) is missing its expr, for, and labels fields. Prometheus rule files require an expr for each alert, so loading this compiled mixin will either fail validation or silently drop the alert group, depending on the loader. Users consuming operations/tempo-mixin-compiled/alerts.yaml will therefore be unable to load the rules. Regenerating the compiled file, or otherwise fixing the truncation, resolves this.
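As a rough illustration of the check described in this comment, the sketch below scans an already-parsed rule file for alerting rules that have no expr. The function name find_incomplete_alerts, the group name, and the ExampleCompleteAlert rule are hypothetical and do not come from this PR; only TempoMetricsGeneratorProcessorUpdatesFailing is taken from the diff. It also assumes the YAML has been loaded into plain dicts and lists by some YAML parser, which is not shown here.

```python
from typing import Any


def find_incomplete_alerts(rule_file: dict[str, Any]) -> list[str]:
    """Return the names of alerting rules that lack a non-empty `expr`.

    `rule_file` is the parsed form of a Prometheus rule file:
    {"groups": [{"name": ..., "rules": [...]}]}.
    """
    incomplete = []
    for group in rule_file.get("groups") or []:
        for rule in group.get("rules") or []:
            if "alert" not in rule:
                continue  # not an alerting rule; this sketch only checks alerts
            expr = rule.get("expr")
            if not isinstance(expr, str) or not expr.strip():
                incomplete.append(rule["alert"])
    return incomplete


# Hypothetical parsed content mimicking the truncation: the last alert
# carries only annotations, with no expr/for/labels.
truncated = {
    "groups": [
        {
            "name": "tempo_alerts",  # hypothetical group name
            "rules": [
                {
                    "alert": "ExampleCompleteAlert",  # hypothetical rule
                    "expr": "vector(1) > 0",
                    "for": "5m",
                    "labels": {"severity": "warning"},
                },
                {
                    "alert": "TempoMetricsGeneratorProcessorUpdatesFailing",
                    "annotations": {"message": "..."},
                },
            ],
        }
    ]
}

print(find_incomplete_alerts(truncated))
# prints ['TempoMetricsGeneratorProcessorUpdatesFailing']
```

In practice, running promtool check rules against the compiled file is the more authoritative validation; the sketch only shows the shape of the problem.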
What this PR does:
It introduces new alerts and runbook entries to make life easier for Tempo operators:
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]