Skip to content

Fix compactor panic#4915

Merged
joe-elliott merged 10 commits intografana:mainfrom
joe-elliott:fix-panic
Apr 7, 2025
Merged

Fix compactor panic#4915
joe-elliott merged 10 commits intografana:mainfrom
joe-elliott:fix-panic

Conversation

@joe-elliott
Copy link
Copy Markdown
Collaborator

@joe-elliott joe-elliott commented Mar 27, 2025

What this PR does:
Compactors rarely panic with a stack like:

        /enterprise-traces/vendor/github.com/parquet-go/parquet-go/page.go:129 +0xfd
created by github.com/parquet-go/parquet-go.AsyncPages in goroutine 33664
        /enterprise-traces/vendor/github.com/parquet-go/parquet-go/page.go:264 +0x178
github.com/parquet-go/parquet-go.readPages({0x3bc5390, 0xc06ed2f360}, 0xc0b09b2f50, 0xc0acc4b780, 0x3bea488?, 0xc0b09b3030)
        /enterprise-traces/vendor/github.com/parquet-go/parquet-go/file.go:801 +0x9f
github.com/parquet-go/parquet-go.(*FilePages).ReadPage(0xc06ed2f360)
        /enterprise-traces/vendor/github.com/parquet-go/parquet-go/file.go:916 +0x65
github.com/parquet-go/parquet-go.(*FilePages).readPage(0xc06ed2f360, 0xc096766f30, 0xc0795b8000)
        /enterprise-traces/vendor/github.com/parquet-go/parquet-go/buffer.go:532 +0xd7
github.com/parquet-go/parquet-go.(*bufferPool).get(0x5a1ff60?, 0xfffffffffffffffc?)
        /enterprise-traces/vendor/github.com/parquet-go/parquet-go/buffer.go:510 +0x2d
github.com/parquet-go/parquet-go.(*bufferPool).newBuffer(0x5a1ff60, 0xfffffffffffffffc, 0x1000)
goroutine 176675 [running]:   
        
panic: runtime error: makeslice: len out of range

This was caused by the io.BufferedReaderAt not invalidating buffers when an error was received here:

tempo/pkg/io/buffered.go

Lines 64 to 68 in c5e5978

func (r *BufferedReaderAt) populate(buf *readerBuffer) (int, error) {
// Read
n, err := r.ra.ReadAt(buf.buf, buf.off)
return n, err
}

Later it was possible to return data from this buffer as if it were correct even though the buffer was in an unknown state.

Other Changes

  • Removed BufferedWriterWithQueue b/c it was unused.
  • Added a test to replicate the described issue.
  • Renamed open -> openForIteration and added comments to help clarify how the two open funcs are used.
  • Simplified the locking in BufferedReaderAt. Previously it would drop the primary lock for per-buffer locking before it attempted to read from the backend. This was designed to prevent this struct from serializing calls to object storage. I believe this could be maintained with the fix, but it was getting quite complicated. Consider that you would need to handle a buffer being invalid after it had been already selected b/c the read might invalidate it. Since this code was only used in the compactors I favored simplicity over performance. There is a small, but reasonable performance hit.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Comment thread pkg/io/buffered.go
Copy link
Copy Markdown
Contributor

@mapno mapno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@joe-elliott joe-elliott merged commit c253655 into grafana:main Apr 7, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants