Conversation
mdisibio left a comment:
This looks great and I like the way it is controlled by querier features. A few small questions, but none are blocking, so I will go ahead and approve.
zalegrala left a comment:
This looks pretty good to me; a nice improvement. It will be interesting to see the results on the dashboard. I had a question about the context handling, but it is not blocking.
```go
// then error out this upstream request _and_ stream.
case err := <-errs:
	req.err <- err
err = reportResponseUpstream(reqBatch, errs, resps)
```
Do we have a context to pass? Wondering if it might simplify the context handling below.
If the streaming GRPC server connection itself drops or the context is cancelled, then .Send() returns an error and this case is hit:
https://github.com/grafana/tempo/pull/2677/files#diff-0914703aed52090bd72851004df203444207d9d48677c10860b0459afef1a0b9R311
If the request is cancelled upstream then this case is hit:
https://github.com/grafana/tempo/pull/2677/files#diff-0914703aed52090bd72851004df203444207d9d48677c10860b0459afef1a0b9R304
If the requests are cancelled downstream then we get an http response and this case is hit:
https://github.com/grafana/tempo/pull/2677/files#diff-0914703aed52090bd72851004df203444207d9d48677c10860b0459afef1a0b9R304
I think everything is covered.
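To make the three paths concrete, here is a minimal sketch of a select loop that covers them. This is an illustration under assumed names, not Tempo's actual code: `queryStream`, `searchResponse`, `errs`, and `resps` are stand-ins.

```go
package frontend

import "context"

// queryStream is a minimal stand-in for the streaming GRPC server
// connection; an assumption for this sketch, not Tempo's interface.
type queryStream interface {
	Context() context.Context
	Send(resp *searchResponse) error
}

type searchResponse struct{}

// streamResults sketches the three paths described above: upstream
// cancellation (or a dropped connection) surfaces via the stream context,
// errors from downstream querier requests arrive on errs, and normal
// responses are forwarded, where a failed Send also surfaces a dropped
// streaming connection.
func streamResults(stream queryStream, errs <-chan error, resps <-chan *searchResponse) error {
	for {
		select {
		case <-stream.Context().Done():
			// the request was cancelled upstream
			return stream.Context().Err()
		case err := <-errs:
			// a downstream querier request errored out
			return err
		case resp, ok := <-resps:
			if !ok {
				return nil // all responses delivered
			}
			if err := stream.Send(resp); err != nil {
				// the streaming GRPC connection itself dropped
				return err
			}
		}
	}
}
```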
What this PR does:
Batches jobs in the requests from the query-frontend queue to the queriers. Previously, the frontend would send each job one at a time in an individual http request. This PR adds a configurable parameter that allows the frontend to send more than one job at once; a hedged sketch of the idea follows.
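A minimal sketch of the batching idea, assuming hypothetical names (`job`, `queue`, `sendToQuerier`, `maxBatchSize`) rather than the PR's actual types: drain up to `maxBatchSize` jobs from the queue and ship them in a single request instead of one request per job.

```go
package frontend

type job struct{ ID string }

// dispatch pulls jobs off the queue and sends them to a querier in
// batches of up to maxBatchSize, rather than one request per job.
func dispatch(queue <-chan job, maxBatchSize int, sendToQuerier func([]job) error) error {
	for j := range queue {
		batch := []job{j}
		// opportunistically fill the batch without blocking on an empty queue
	fill:
		for len(batch) < maxBatchSize {
			select {
			case next, ok := <-queue:
				if !ok {
					break fill
				}
				batch = append(batch, next)
			default:
				break fill
			}
		}
		if err := sendToQuerier(batch); err != nil {
			return err
		}
	}
	return nil
}
```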
Other changes:
- Adds a new metric, `tempo_query_frontend_actual_batch_size`, to track the actual size of the batches being farmed out to the queriers (see the sketch below).
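As an aside, a minimal sketch of how a histogram like this could be registered and observed with prometheus/client_golang; the bucket layout and the `observeBatch` helper are assumptions for illustration, not the PR's actual code.

```go
package frontend

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// actualBatchSize records how many jobs were packed into each batch sent
// to a querier. The bucket choice here is an assumption for the sketch.
var actualBatchSize = promauto.NewHistogram(prometheus.HistogramOpts{
	Namespace: "tempo",
	Name:      "query_frontend_actual_batch_size",
	Help:      "Number of jobs sent to a querier in a single batch.",
	Buckets:   prometheus.LinearBuckets(1, 1, 10), // 1..10 jobs per batch
})

// observeBatch would be called once per dispatched batch.
func observeBatch(batchLen int) {
	actualBatchSize.Observe(float64(batchLen))
}
```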
Performance testing
The goal of the setup was to create a cluster that could execute the 36k jobs created by the test query simultaneously, so that job throughput from the frontend to the queriers could be tested more directly.
Results
The overall latency of queries where total jobs > total cluster capacity was not reduced as dramatically, but this is a good step in the right direction.
Checklist
- CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]