[fix] Ensure Badger maintenance is stopped before existing Close() #7940

Merged
yurishkuro merged 1 commit into jaegertracing:main from Yashika0724:fix/badger-shutdown-race on Feb 1, 2026

Conversation

@Yashika0724
Contributor

This PR adds synchronization between the Badger storage factory’s background goroutines and store shutdown.

During shutdown (for example, pod termination or process restart), the maintenance or metrics goroutines may still be running while the Badger store is being closed. This change ensures the goroutines are allowed to exit cleanly after shutdown is signaled and before the store is closed.

The behavior and APIs remain unchanged; the update only makes the shutdown sequence safer and more predictable.

Thanks for taking a look, and please let me know if any adjustments are needed.

Signed-off-by: Yashika0724 <ssyashika1311@gmail.com>
@Yashika0724 Yashika0724 requested a review from a team as a code owner January 30, 2026 21:54
@dosubot dosubot bot added the area/storage, enhancement, and storage/badger (Issues related to badger storage) labels on Jan 30, 2026
@Yashika0724
Contributor Author

Hi @yurishkuro ! Thanks for taking the time to review this.
This change just adds proper synchronization during shutdown to avoid background Badger goroutines accessing the store after it’s closed.
Happy to adjust or refine anything if you think it can be improved.

// Close implements io.Closer and closes the underlying storage
func (f *Factory) Close() error {
	close(f.maintenanceDone)
	f.bgWg.Wait() // Wait for background goroutines to finish before closing store
Contributor

I can't understand the need for this! Closing maintenanceDone closes the channel, which in turn stops the goroutines. How does that lead to a race condition? Can you provide steps to reproduce the race condition, or point to the issue thread?

Member

This is a standard pattern that ensures the goroutines have exited (i.e. are definitely not holding onto any locks or resources) before the Close() function exits. This can help avoid situations where the process exits while the maintenance goroutine is still trying to do some writes.

Contributor

got it! thanks

Comment on lines 181 to 191
	err := f.store.Close()

	// Remove tmp files if this was ephemeral storage
	if f.Config.Ephemeral {
		errSecondary := os.RemoveAll(f.tmpDir)
		if err == nil {
			err = errSecondary
		}
	}

	return err
Member

This could've been simplified to use fewer ifs by using errors.Join().

@yurishkuro yurishkuro changed the title from "Fix race between Badger background goroutines and store shutdown" to "[fix] Ensure Badger maintenance is stopped before existing Close()" on Feb 1, 2026
@yurishkuro yurishkuro enabled auto-merge February 1, 2026 16:26
@codecov

codecov bot commented Feb 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.55%. Comparing base (57ef85a) to head (5dfa4c6).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7940      +/-   ##
==========================================
- Coverage   95.58%   95.55%   -0.04%     
==========================================
  Files         316      316              
  Lines       16726    16734       +8     
==========================================
+ Hits        15988    15990       +2     
- Misses        576      580       +4     
- Partials      162      164       +2     
Flag Coverage Δ
badger_v1 9.19% <100.00%> (+0.07%) ⬆️
badger_v2 1.90% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v1-manual 13.40% <0.00%> (-0.02%) ⬇️
cassandra-4.x-v2-auto 1.89% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v2-manual 1.89% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v1-manual 13.40% <0.00%> (-0.02%) ⬇️
cassandra-5.x-v2-auto 1.89% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v2-manual 1.89% <0.00%> (-0.01%) ⬇️
clickhouse 1.98% <0.00%> (-0.01%) ⬇️
elasticsearch-6.x-v1 17.25% <0.00%> (-0.02%) ⬇️
elasticsearch-7.x-v1 17.28% <0.00%> (-0.02%) ⬇️
elasticsearch-8.x-v1 17.43% <0.00%> (-0.02%) ⬇️
elasticsearch-8.x-v2 1.90% <0.00%> (-0.01%) ⬇️
elasticsearch-9.x-v2 1.90% <0.00%> (-0.01%) ⬇️
grpc_v1 8.43% <0.00%> (-0.01%) ⬇️
grpc_v2 1.90% <0.00%> (-0.01%) ⬇️
kafka-3.x-v2 1.90% <0.00%> (-0.01%) ⬇️
memory_v2 1.90% <0.00%> (-0.01%) ⬇️
opensearch-1.x-v1 17.33% <0.00%> (-0.02%) ⬇️
opensearch-2.x-v1 17.33% <0.00%> (-0.02%) ⬇️
opensearch-2.x-v2 1.90% <0.00%> (-0.01%) ⬇️
opensearch-3.x-v2 1.90% <0.00%> (-0.01%) ⬇️
query 1.90% <0.00%> (-0.01%) ⬇️
tailsampling-processor 0.54% <0.00%> (-0.01%) ⬇️
unittests 94.23% <100.00%> (-0.04%) ⬇️


@yurishkuro yurishkuro added this pull request to the merge queue Feb 1, 2026
@github-actions

github-actions bot commented Feb 1, 2026

Metrics Comparison Summary

Total changes across all snapshots: 0

Detailed changes per snapshot

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 53 metrics

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 106 metrics

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 106 metrics

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 53 metrics

Merged via the queue into jaegertracing:main with commit 2d4e012 Feb 1, 2026
64 checks passed
@dosubot

dosubot bot commented Feb 1, 2026

Related Documentation

No published documentation to review for changes on this repository.

Parship12 pushed a commit to Parship12/jaeger that referenced this pull request Feb 11, 2026
…aegertracing#7940)

SoumyaRaikwar pushed a commit to SoumyaRaikwar/jaeger that referenced this pull request Feb 13, 2026
…aegertracing#7940)

OlegChumin pushed a commit to OlegChumin/jaeger that referenced this pull request Feb 28, 2026
…aegertracing#7940)

singhvibhanshu pushed a commit to singhvibhanshu/jaeger that referenced this pull request Mar 18, 2026
…aegertracing#7940)
