Skip to content

ingest: make Enconding sync.Pool more robust#5568

Merged
carles-grafana merged 1 commit intografana:mainfrom
carles-grafana:syncPool_improvements_encoding
Sep 1, 2025
Merged

ingest: make Enconding sync.Pool more robust#5568
carles-grafana merged 1 commit intografana:mainfrom
carles-grafana:syncPool_improvements_encoding

Conversation

@carles-grafana
Copy link
Copy Markdown
Contributor

@carles-grafana carles-grafana commented Aug 22, 2025

We were resetting the length or the underlying slices but not clearing the contents.
This makes a potential future data leak due to a bug less likely.

Also add a test to ensure the data is reset when reusing it.

Benchmark in this branch:

goos: darwin
goarch: arm64
pkg: github.com/grafana/tempo/pkg/ingest
cpu: Apple M3 Pro
BenchmarkEncoderPoolPutDifferentSizes/Small_Original-11         	80590712	        15.56 ns/op	       0 B/op	       0 allocs/op
BenchmarkEncoderPoolPutDifferentSizes/Small_WithClear-11        	48544262	        25.38 ns/op	       0 B/op	       0 allocs/op
BenchmarkEncoderPoolPutDifferentSizes/Medium_Original-11        	19074123	        62.93 ns/op	       0 B/op	       0 allocs/op
BenchmarkEncoderPoolPutDifferentSizes/Medium_WithClear-11       	10028392	       117.1 ns/op	       0 B/op	       0 allocs/op
BenchmarkEncoderPoolPutDifferentSizes/Large_Original-11         	 1676859	       702.9 ns/op	       0 B/op	       0 allocs/op
BenchmarkEncoderPoolPutDifferentSizes/Large_WithClear-11        	 1000000	      1120 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	github.com/grafana/tempo/pkg/ingest	9.330s

Clearing the slices is significantly slower. (60-85% slower). Is the extra safety worth it?

What this PR does:

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@carles-grafana carles-grafana force-pushed the syncPool_improvements_encoding branch from 164fac5 to 511543d Compare August 22, 2025 11:58
@carles-grafana carles-grafana marked this pull request as draft August 22, 2025 12:50
@carles-grafana carles-grafana force-pushed the syncPool_improvements_encoding branch 3 times, most recently from 7b7adf6 to 22bb0b1 Compare August 26, 2025 08:04
@carles-grafana carles-grafana marked this pull request as ready for review August 26, 2025 08:45
@carles-grafana
Copy link
Copy Markdown
Contributor Author

@mdisibio Inlining the reset function didn't provide any significant speedup

@joe-elliott
Copy link
Copy Markdown
Collaborator

For the benchmarks to be meaningful you need to run them a decent number of times in this branch and in main and compare the results.

Something like:

go test -run xxx -bench <your bench> -count=10 -benchmem > before.txt
go test -run xxx -bench <your bench> -count=10 -benchmem > after.txt
benchstat before.txt after.txt

@carles-grafana
Copy link
Copy Markdown
Contributor Author

For the benchmarks to be meaningful you need to run them a decent number of times in this branch and in main and compare the results.

Something like:

go test -run xxx -bench <your bench> -count=10 -benchmem > before.txt
go test -run xxx -bench <your bench> -count=10 -benchmem > after.txt
benchstat before.txt after.txt

Note that in the new benchmark I'm already comparing the new and old implementation of encoderPoolPut (the original is reimplemented in the test file so I don't have to switch branches).

For the existing EncodeDecode benchmark, the effect is not visible:

  benchstat old.txt new.txt                                                    
  goos: darwin
  goarch: arm64
  pkg: github.com/grafana/tempo/pkg/ingest
  cpu: Apple M3 Pro
                  │   old.txt    │            new.txt            │
                  │    sec/op    │   sec/op     vs base          │
  EncodeDecode-11   139.0µ ± 23%   136.0µ ± 8%  ~ (p=0.684 n=10)

                  │   old.txt    │            new.txt             │
                  │     B/op     │     B/op      vs base          │
  EncodeDecode-11   1.242Mi ± 1%   1.255Mi ± 3%  ~ (p=0.796 n=10)

                  │   old.txt   │             new.txt             │
                  │  allocs/op  │  allocs/op   vs base            │
  EncodeDecode-11   2.004k ± 0%   2.004k ± 0%  ~ (p=1.000 n=10) ¹
  ¹ all samples are equal

}

// Original implementation without clear() for comparison
func encoderPoolPutOriginal(req *tempopb.PushBytesRequest) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanup?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an intentional implementation of the original function for the benchmark below.
Do you think it's not worth keeping?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's not unprecedented. here we keep the original implementation of truncateRowNumber:

https://github.com/grafana/tempo/blob/main/pkg/parquetquery/iters.go#L80-L88

but we also do this b/c it allows us to validate the more complex implementation against the original:

https://github.com/grafana/tempo/blob/main/pkg/parquetquery/iters_test.go#L28

i'm not opposed to this and will go ahead and approve, but imo this is a simple enough change that we are good removing the old impl

Comment thread tempodb/encoding/vparquet2/compactor.go Outdated
We were resetting the length or the underlying slices but not clearing the contents.
This makes a potential future data leak due to a bug less likely.

Also add a test to ensure the data is reset when reusing it.
@carles-grafana carles-grafana force-pushed the syncPool_improvements_encoding branch from 22bb0b1 to 2061156 Compare August 29, 2025 14:49
@carles-grafana carles-grafana merged commit 1898635 into grafana:main Sep 1, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants