
[clickhouse][perf] Restructure ClickHouse FindTraceIDs Query to Improve Performance #8125

Merged
yurishkuro merged 7 commits into jaegertracing:main from mahadzaryab1:improve-search-performance
Mar 3, 2026

Conversation

@mahadzaryab1
Collaborator

@mahadzaryab1 mahadzaryab1 commented Mar 2, 2026

Which problem is this PR solving?

Description of the changes

  • Fixes the bottleneck in the search query for ClickHouse (reported/discovered by @jixiuf). Originally, every row matched by the FindTraceIDs query was joined against the trace_id_timestamps table before the number of results was limited. This PR inverts the order of operations: the results are limited first, and timestamps are then looked up only for the limited set of trace IDs.
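
The effect of reordering the two steps can be illustrated with a toy in-memory model. This is only a sketch: `spanRow`, `joinThenLimit`, and `limitThenJoin` are hypothetical names, and counting "lookups" is a simplification that does not reflect how ClickHouse actually executes a join.

```go
package main

import "fmt"

// spanRow is a hypothetical stand-in for one matching row in the spans table.
type spanRow struct{ traceID string }

// joinThenLimit mimics the old shape: every matching span row triggers a
// timestamp lookup, and the limit is applied only afterwards.
func joinThenLimit(rows []spanRow, limit int) (ids []string, lookups int) {
	seen := map[string]bool{}
	var joined []string
	for _, r := range rows {
		lookups++ // one join probe per matching row
		if !seen[r.traceID] {
			seen[r.traceID] = true
			joined = append(joined, r.traceID)
		}
	}
	if len(joined) > limit {
		joined = joined[:limit]
	}
	return joined, lookups
}

// limitThenJoin mimics the new shape: pick distinct trace IDs up to the limit
// first, then look up timestamps only for those IDs.
func limitThenJoin(rows []spanRow, limit int) (ids []string, lookups int) {
	seen := map[string]bool{}
	for _, r := range rows {
		if len(ids) == limit {
			break
		}
		if !seen[r.traceID] {
			seen[r.traceID] = true
			ids = append(ids, r.traceID)
		}
	}
	lookups = len(ids) // one join probe per selected trace ID
	return ids, lookups
}

func main() {
	rows := []spanRow{{"a"}, {"a"}, {"b"}, {"b"}, {"c"}, {"c"}}
	_, before := joinThenLimit(rows, 2)
	_, after := limitThenJoin(rows, 2)
	fmt.Println(before, after)
}
```

In this model both functions return the same trace IDs, but the first one probes the timestamp side once per matching span row, while the second probes it at most `limit` times.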

How was this change tested?

  • Quick benchmarks show a significant speedup across all search queries (benchmark scripts with concrete numbers to follow)

Checklist

AI Usage in this PR (choose one)

See AI Usage Policy.

  • None: No AI tools were used in creating this PR
  • Light: AI provided minor assistance (formatting, simple suggestions)
  • Moderate: AI helped with code generation or debugging specific parts
  • Heavy: AI generated most or all of the code changes

Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
@mahadzaryab1 mahadzaryab1 added the changelog:experimental Change to an experimental part of the code label Mar 2, 2026
@codecov

codecov bot commented Mar 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.69%. Comparing base (3ead761) to head (86a22ee).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8125      +/-   ##
==========================================
+ Coverage   95.67%   95.69%   +0.02%     
==========================================
  Files         317      317              
  Lines       16734    16735       +1     
==========================================
+ Hits        16010    16015       +5     
+ Misses        571      568       -3     
+ Partials      153      152       -1     
| Flag | Coverage Δ |
|---|---|
| badger_v1 | `9.06% <0.00%> (-0.01%)` ⬇️ |
| badger_v2 | `1.04% <0.00%> (-0.01%)` ⬇️ |
| cassandra-4.x-v1-manual | `13.26% <0.00%> (-0.01%)` ⬇️ |
| cassandra-4.x-v2-auto | `1.03% <0.00%> (-0.01%)` ⬇️ |
| cassandra-4.x-v2-manual | `1.03% <0.00%> (-0.01%)` ⬇️ |
| cassandra-5.x-v1-manual | `13.26% <0.00%> (-0.01%)` ⬇️ |
| cassandra-5.x-v2-auto | `1.03% <0.00%> (-0.01%)` ⬇️ |
| cassandra-5.x-v2-manual | `1.03% <0.00%> (-0.01%)` ⬇️ |
| clickhouse | `1.16% <0.00%> (-0.01%)` ⬇️ |
| elasticsearch-6.x-v1 | `16.62% <0.00%> (-0.01%)` ⬇️ |
| elasticsearch-7.x-v1 | `16.65% <0.00%> (-0.01%)` ⬇️ |
| elasticsearch-8.x-v1 | `16.80% <0.00%> (-0.01%)` ⬇️ |
| elasticsearch-8.x-v2 | `1.04% <0.00%> (-0.01%)` ⬇️ |
| elasticsearch-9.x-v2 | `1.04% <0.00%> (-0.01%)` ⬇️ |
| grpc_v1 | `7.80% <0.00%> (-0.01%)` ⬇️ |
| grpc_v2 | `1.04% <0.00%> (-0.01%)` ⬇️ |
| kafka-3.x-v2 | `1.04% <0.00%> (-0.01%)` ⬇️ |
| memory_v2 | `1.04% <0.00%> (-0.01%)` ⬇️ |
| opensearch-1.x-v1 | `16.69% <0.00%> (-0.01%)` ⬇️ |
| opensearch-2.x-v1 | `16.69% <0.00%> (-0.01%)` ⬇️ |
| opensearch-2.x-v2 | `1.04% <0.00%> (-0.01%)` ⬇️ |
| opensearch-3.x-v2 | `1.04% <0.00%> (-0.01%)` ⬇️ |
| query | `1.04% <0.00%> (-0.01%)` ⬇️ |
| tailsampling-processor | `0.52% <0.00%> (-0.01%)` ⬇️ |
| unittests | `94.38% <100.00%> (+0.02%)` ⬆️ |


@github-actions

github-actions bot commented Mar 2, 2026

Metrics Comparison Summary

Total changes across all snapshots: 32

Detailed changes per snapshot

📊 Cassandra

File Name: summary_metrics_snapshot_cassandra
Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 53 metrics

📊 Cassandra

File Name: summary_metrics_snapshot_cassandra
Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 106 metrics

📊 Cassandra

File Name: summary_metrics_snapshot_cassandra
Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 53 metrics

📊 Cassandra

File Name: summary_metrics_snapshot_cassandra
Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 53 metrics

📊 Badger

File Name: summary_metrics_snapshot_badger
Total Changes: 32

  • 🆕 Added: 0 metrics
  • ❌ Removed: 32 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 0 metrics

❌ Removed Metrics

- `jaeger_storage_badger_compaction_current_num_lsm` (2 variants)
View diff sample
-jaeger_storage_badger_compaction_current_num_lsm{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_compaction_current_num_lsm{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_get_num_memtable` (2 variants)
View diff sample
-jaeger_storage_badger_get_num_memtable{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_get_num_memtable{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_get_num_user` (2 variants)
View diff sample
-jaeger_storage_badger_get_num_user{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_get_num_user{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_get_with_result_num_user` (2 variants)
View diff sample
-jaeger_storage_badger_get_with_result_num_user{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_get_with_result_num_user{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_iterator_num_user` (2 variants)
View diff sample
-jaeger_storage_badger_iterator_num_user{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_iterator_num_user{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_put_num_user` (2 variants)
View diff sample
-jaeger_storage_badger_put_num_user{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_put_num_user{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_read_bytes_lsm` (2 variants)
View diff sample
-jaeger_storage_badger_read_bytes_lsm{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_read_bytes_lsm{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_read_bytes_vlog` (2 variants)
View diff sample
-jaeger_storage_badger_read_bytes_vlog{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_read_bytes_vlog{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_read_num_vlog` (2 variants)
View diff sample
-jaeger_storage_badger_read_num_vlog{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_read_num_vlog{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_size_bytes_lsm` (2 variants)
View diff sample
-jaeger_storage_badger_size_bytes_lsm{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_size_bytes_lsm{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_size_bytes_vlog` (2 variants)
View diff sample
-jaeger_storage_badger_size_bytes_vlog{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_size_bytes_vlog{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_write_bytes_l0` (2 variants)
View diff sample
-jaeger_storage_badger_write_bytes_l0{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_bytes_l0{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_write_bytes_user` (2 variants)
View diff sample
-jaeger_storage_badger_write_bytes_user{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_bytes_user{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_write_bytes_vlog` (2 variants)
View diff sample
-jaeger_storage_badger_write_bytes_vlog{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_bytes_vlog{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_write_num_vlog` (2 variants)
View diff sample
-jaeger_storage_badger_write_num_vlog{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_num_vlog{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
- `jaeger_storage_badger_write_pending_num_memtable` (2 variants)
View diff sample
-jaeger_storage_badger_write_pending_num_memtable{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_pending_num_memtable{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}

➡️ View CI artifacts | View Summary Report logs

Code Coverage

Coverage 96.9% (baseline 46.4%)

@mahadzaryab1 mahadzaryab1 marked this pull request as ready for review March 2, 2026 05:52
@mahadzaryab1 mahadzaryab1 requested a review from a team as a code owner March 2, 2026 05:52
Copilot AI review requested due to automatic review settings March 2, 2026 05:52
Contributor

Copilot AI left a comment


Pull request overview

This PR improves the performance of the ClickHouse FindTraceIDs and FindTraces queries by restructuring the SQL to apply the result LIMIT before joining with trace_id_timestamps. Previously, every matching row was joined with trace_id_timestamps before limiting, causing unnecessary work at scale.

Changes:

  • Splits SearchTraceIDs SQL into SearchTraceIDsBase (the inner subquery that filters distinct trace IDs with LIMIT) and SearchTraceIDs (the outer wrapper that joins with trace_id_timestamps using fmt.Sprintf formatting).
  • Updates buildFindTraceIDsQuery in query_builder.go to build the inner subquery separately and wrap it using the new template.
  • Updates all affected snapshot files and test mock keys to reflect the restructured query.
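
The split described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from `queries.go` or `query_builder.go`: the constant names, table aliases, column names (`service_name`, `start`, `end`), and the `buildQuery` helper are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical templates; table and column names are illustrative and are not
// copied from queries.go.
const searchTraceIDsBase = `SELECT DISTINCT s.trace_id FROM spans s WHERE 1=1`

const searchTraceIDs = `SELECT l.trace_id, t.start, t.end
FROM (%s) l
LEFT JOIN trace_id_timestamps t ON l.trace_id = t.trace_id`

// buildQuery appends filters and LIMIT to the inner query using placeholders,
// then wraps the finished subquery in the outer join template.
func buildQuery(service string, limit int) (string, []any) {
	var inner strings.Builder
	args := []any{}
	inner.WriteString(searchTraceIDsBase)
	if service != "" {
		inner.WriteString(" AND s.service_name = ?")
		args = append(args, service)
	}
	inner.WriteString(" LIMIT ?")
	args = append(args, limit)
	return fmt.Sprintf(searchTraceIDs, inner.String()), args
}

func main() {
	q, args := buildQuery("frontend", 20)
	fmt.Println(q)
	fmt.Println(args)
}
```

Only the already-built inner SQL text is passed to `fmt.Sprintf`; user-supplied values stay in `args` as bound parameters, so the formatting step does not interpolate filter values into the query string.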

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| `internal/storage/v2/clickhouse/sql/queries.go` | Splits SearchTraceIDs into SearchTraceIDsBase (inner subquery) and SearchTraceIDs (outer JOIN wrapper template); changes LEFT JOIN to JOIN |
| `internal/storage/v2/clickhouse/tracestore/query_builder.go` | Builds inner subquery separately and wraps with fmt.Sprintf for the outer JOIN |
| `internal/storage/v2/clickhouse/tracestore/driver_test.go` | Adds whitespace normalization to the mock driver's substring matching |
| `internal/storage/v2/clickhouse/tracestore/reader_test.go` | Updates mock query response keys from SearchTraceIDs to SearchTraceIDsBase |
| `internal/storage/v2/clickhouse/tracestore/snapshots/TestFindTraceIDs_2.sql` | Updated snapshot for the new two-level query structure |
| `internal/storage/v2/clickhouse/tracestore/snapshots/TestFindTraces_WithFilters_2.sql` | Updated snapshot for the new two-level query structure |
| `internal/storage/v2/clickhouse/tracestore/snapshots/TestFindTraces_Success/single_span_1.sql` | Updated snapshot for the new two-level query structure |
| `internal/storage/v2/clickhouse/tracestore/snapshots/TestFindTraces_Success/multiple_spans_1.sql` | Updated snapshot for the new two-level query structure |
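
The whitespace normalization mentioned for `driver_test.go` is presumably needed because the inner query, once embedded in the outer template, may carry different indentation and line breaks than the key registered with the mock. One common way to do this in Go is sketched below; `normalizeWhitespace` and `queryMatches` are hypothetical names, not the helpers used in the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeWhitespace collapses every run of spaces, tabs, and newlines into
// a single space and trims both ends.
func normalizeWhitespace(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

// queryMatches reports whether the key appears in the query once both have
// been whitespace-normalized.
func queryMatches(query, key string) bool {
	return strings.Contains(normalizeWhitespace(query), normalizeWhitespace(key))
}

func main() {
	query := "SELECT l.trace_id\nFROM (\n    SELECT DISTINCT trace_id FROM spans\n) l"
	key := "SELECT DISTINCT trace_id\n  FROM spans"
	fmt.Println(queryMatches(query, key))
}
```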


@yurishkuro
Member

This change seems to go in the opposite direction of what we want. The inner query has no time range so it has to search ALL segments in the table. The whole point of a join was to limit the search to a known time range, which would typically be just a few minutes, one or two segments on disk.

@github-actions github-actions bot added the waiting-for-author PR is waiting for author to respond to maintainer's comments label Mar 2, 2026
@mahadzaryab1
Collaborator Author

> This change seems to go in the opposite direction of what we want. The inner query has no time range so it has to search ALL segments in the table. The whole point of a join was to limit the search to a known time range, which would typically be just a few minutes, one or two segments on disk.

@yurishkuro The trace_id_timestamps table is essentially just a cache for looking up a trace's start and end time, and it is ordered by trace_id. The spans table is partitioned by start_time and also has a skip index on it. Previously, the query joined every matching span row with trace_id_timestamps (an expensive operation) before applying LIMIT. The new approach applies LIMIT first and then joins only the resulting trace IDs to look up their timestamps, which is why we see a significant speedup here.

In the approach that you mentioned, are you suggesting that we use the trace_id_timestamps table to narrow down the possible trace_ids based on the time range provided in the query?

@github-actions github-actions bot removed the waiting-for-author PR is waiting for author to respond to maintainer's comments label Mar 3, 2026
@yurishkuro
Member

Sorry, I confused this with the GetTrace method. For FindTraceIDs, yes, you first have to search the spans table and then join with the timestamps table just to return the timestamp range along with each trace ID. Copilot is right that you want a LEFT JOIN so that found trace IDs are not skipped even if we can't find their time range (which is an optimization for Get).
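
The difference between the two join types can be shown with a small in-memory analogy. All names here (`timeRange`, `result`, `innerJoin`, `leftJoin`) are hypothetical and only model the SQL semantics; they are not part of the Jaeger code.

```go
package main

import "fmt"

// timeRange is a hypothetical stand-in for a row of trace_id_timestamps.
type timeRange struct{ start, end int64 }

// result pairs a trace ID with its time range; hasRange is false when no
// timestamp row was found (the NULL side of a LEFT JOIN).
type result struct {
	traceID  string
	tr       timeRange
	hasRange bool
}

// innerJoin drops trace IDs that have no timestamp row.
func innerJoin(ids []string, ts map[string]timeRange) []result {
	var out []result
	for _, id := range ids {
		if tr, ok := ts[id]; ok {
			out = append(out, result{id, tr, true})
		}
	}
	return out
}

// leftJoin keeps every trace ID, leaving the range empty when it is missing.
func leftJoin(ids []string, ts map[string]timeRange) []result {
	var out []result
	for _, id := range ids {
		tr, ok := ts[id]
		out = append(out, result{id, tr, ok})
	}
	return out
}

func main() {
	ids := []string{"a", "b"}
	ts := map[string]timeRange{"a": {100, 200}}
	fmt.Println(len(innerJoin(ids, ts)), len(leftJoin(ids, ts)))
}
```

With a plain JOIN, trace `b` would silently disappear from the search results; with a LEFT JOIN it is still returned, only without a time range.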

@github-actions github-actions bot added the waiting-for-author PR is waiting for author to respond to maintainer's comments label Mar 3, 2026
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
@github-actions github-actions bot removed the waiting-for-author PR is waiting for author to respond to maintainer's comments label Mar 3, 2026
@mahadzaryab1 mahadzaryab1 enabled auto-merge March 3, 2026 05:47
Copilot AI review requested due to automatic review settings March 3, 2026 06:27
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.



@yurishkuro yurishkuro merged commit 50ce6e9 into jaegertracing:main Mar 3, 2026
124 of 127 checks passed

Labels

area/storage changelog:experimental Change to an experimental part of the code performance


3 participants