Avoid panic by checking whether entry is a span in batchCollector #3847

Merged
stoewer merged 2 commits into grafana:main from stoewer:fix-panic-bug
Jul 5, 2024

@stoewer stoewer commented Jul 5, 2024

What this PR does:
It was observed that queries such as {} | rate() by (event.name) can cause panics in the batchCollector. Here is a stack trace taken from the logs:

All lines were logged at 2024-07-04 14:01:01.425; the frames are listed in call order, innermost first:

panic: interface conversion: interface {} is traceql.Static, not *vparquet4.span

goroutine 62344 [running]:
github.com/grafana/tempo/tempodb/encoding/vparquet4.(*batchCollector).KeepGroup(0xc08a0a7ef0, 0xc0c47cc960)
	/drone/src/tempodb/encoding/vparquet4/block_traceql.go:2707 +0xcb1
github.com/grafana/tempo/pkg/parquetquery.(*LeftJoinIterator).Next(0xc0631158c0)
	/drone/src/pkg/parquetquery/iters.go:1863 +0xd6
github.com/grafana/tempo/pkg/parquetquery.(*JoinIterator).collect(0xc04e206240, {0xc9, 0x0, 0x0, 0x0, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff})
	/drone/src/pkg/parquetquery/iters.go:1741 +0x352
github.com/grafana/tempo/pkg/parquetquery.(*JoinIterator).Next(0xc04e206240)
	/drone/src/pkg/parquetquery/iters.go:1666 +0xa5
github.com/grafana/tempo/tempodb/encoding/vparquet4.(*rebatchIterator).Next(0xc08a0a7f50)
	/drone/src/tempodb/encoding/vparquet4/block_traceql.go:1198 +0x11f
github.com/grafana/tempo/tempodb/encoding/vparquet4.(*spansetIterator).Next(0xc097d151e0?, {0x3324f20?, 0xc08e5dad20?})
	/drone/src/tempodb/encoding/vparquet4/block_traceql.go:1312 +0x1f
github.com/grafana/tempo/pkg/traceql.(*MetricsEvalulator).Do(0xc0bd65db00, {0x3305e18, 0xc08a0a7440}, {0x32e1c60?, 0xc06f4f8cd0?}, 0x0?, 0xc099578f68?)

I wasn't able to reproduce this locally or in tests, but given the log above, the unchecked type assertion to *span in batchCollector.KeepGroup looks like the most likely culprit.
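For illustration, here is a minimal sketch of the kind of check described above. It is not the actual change in this PR: the types span, static and entry and the function collectSpans are hypothetical stand-ins for *vparquet4.span, traceql.Static and the collector's entries. It only shows how the two-value ("comma ok") form of a Go type assertion avoids the panic from the log.

```go
package main

import "fmt"

// span and static are hypothetical stand-ins for *vparquet4.span and
// traceql.Static; they are NOT the real Tempo types.
type span struct{ id int }
type static struct{ value string }

// entry mimics a key/value pair whose value is stored as an empty interface.
type entry struct {
	Key   string
	Value interface{}
}

// collectSpans keeps only the entries whose value really is a *span.
func collectSpans(entries []entry) []*span {
	var spans []*span
	for _, e := range entries {
		// The single-value form, e.Value.(*span), panics with
		// "interface conversion" when Value holds a static.
		// The two-value form reports the mismatch through ok instead.
		s, ok := e.Value.(*span)
		if !ok {
			continue // not a span, skip it
		}
		spans = append(spans, s)
	}
	return spans
}

func main() {
	entries := []entry{
		{Key: "span", Value: &span{id: 1}},
		{Key: "event.name", Value: static{value: "exception"}},
		{Key: "span", Value: &span{id: 2}},
	}
	fmt.Println(len(collectSpans(entries))) // prints 2
}
```

Skipping non-span entries is one possible policy; whether they should be skipped, logged or handled separately depends on what the collector expects to find there.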

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@stoewer stoewer added the type/bug Something isn't working label Jul 5, 2024
@stoewer stoewer merged commit afe4b47 into grafana:main Jul 5, 2024
@mapno mapno mentioned this pull request Jul 24, 2024
3 tasks
@stoewer stoewer deleted the fix-panic-bug branch March 13, 2025 06:24