Restore tempo_request_duration_seconds metrics for querier_api_* requests #3403
Merged
yvrhdn merged 2 commits into grafana:main on Feb 16, 2024
yvrhdn commented Feb 16, 2024
    if err != nil {
        return nil, fmt.Errorf("failed to create server: %w", err)
    }
    s.handler = s.externalServer.HTTPServer.Handler
Contributor
Author
@joe-elliott we were really close while pairing. We used s.externalServer.HTTP, which is the router; HTTPServer.Handler is the handler that wraps the router with all the middleware as well.
joe-elliott approved these changes Feb 16, 2024
      type TempoServer interface {
    -     HTTP() *mux.Router
    +     HTTPRouter() *mux.Router
Collaborator
i like this rename for clarity
yvrhdn pushed a commit to yvrhdn/tempo that referenced this pull request Feb 26, 2024
…ests (grafana#3403) * Restore tempo_request_duration_seconds metrics for querier_api_* requests * Update CHANGELOG.md
What this PR does:
Recent changes (I suspect #3300) broke the middleware on our querier requests. We have HTTP middleware that provides tracing and metrics. Requests sent to the querier are routed slightly differently, and because of this change we no longer got metrics for the HTTP routes.
Issue
So tempo_request_duration_seconds does not show up for routes like querier_api_search_tags and querier_api_traces_traceid. GRPC routes were still instrumented correctly (/tempopb.Querier/FindTraceByID and /tempopb.Querier/SearchTags).

How it works
Normal middleware flow:
When we set stream_over_http_enabled: false this will be
When we set stream_over_http_enabled: true we switch up the order a bit:

The bug: requests from query-frontend to querier are handled differently; they are pulled by the frontend worker process.
The frontend worker will pull a request from the frontend, unpack the GRPC requests and then push the HTTP request to the regular HTTP handler. This handler was the router (marked (2)). Since this handler sits after instrumentation, we didn't get any metrics for it. By instead passing the start of the middleware chain ((1)) we fix this.

Adding HTTPRouter and HTTPHandler to TempoServer should make this more explicit.

Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]