Skip to content

Updated memcached client to honor a cancelled context#5041

Merged
joe-elliott merged 5 commits intografana:mainfrom
joe-elliott:memcached-honors-ctx
Apr 23, 2025
Merged

Updated memcached client to honor a cancelled context#5041
joe-elliott merged 5 commits intografana:mainfrom
joe-elliott:memcached-honors-ctx

Conversation

@joe-elliott
Copy link
Copy Markdown
Collaborator

@joe-elliott joe-elliott commented Apr 23, 2025

What this PR does:
Updates the memcached client to honor a cancelled context. This prevents a somewhat rare panic from occurring:

	/Users/joe/dev/joe-elliott/tempo/vendor/github.com/parquet-go/parquet-go/page.go:129 +0xfd
created by github.com/parquet-go/parquet-go.AsyncPages in goroutine 610894
	/Users/joe/dev/joe-elliott/tempo/vendor/github.com/parquet-go/parquet-go/page.go:269 +0x178
github.com/parquet-go/parquet-go.readPages({0x3fdf8f0, 0xc0bf919ae0}, 0xc10c1dc310, 0xc095e75880, 0x4007b18?, 0xc10c1dc3f0)
	/Users/joe/dev/joe-elliott/tempo/vendor/github.com/parquet-go/parquet-go/file.go:801 +0x9f
github.com/parquet-go/parquet-go.(*FilePages).ReadPage(0xc0bf919ae0)
	/Users/joe/dev/joe-elliott/tempo/vendor/github.com/parquet-go/parquet-go/file.go:916 +0x65
github.com/parquet-go/parquet-go.(*FilePages).readPage(0xc0bf919ae0, 0xc07d7b5740, 0xc096ba61e0)
	/Users/joe/dev/joe-elliott/tempo/vendor/github.com/parquet-go/parquet-go/buffer.go:534 +0x12d
github.com/parquet-go/parquet-go.(*bufferPool).get(0x623bf20?, 0xfffffffffffffffc?)
goroutine 611543 [running]:

panic: runtime error: slice bounds out of range [:-4]

The believed series of events that create this panic are:

  1. While decoding a header in FilePages.ReadPage the thrift code will read short segments and sometimes individual bytes.
  2. One of these very short reads misses the bufio.Reader's internal buffer triggering a read from the backend.
  3. Cache misses and the backend read fails with context.Canceled b/c the job was canceled for any number of reasons.
  4. Something else successfully reads that exact parquet page and stores it in cache.
  5. The AsyncPages loop comes back around and calls ReadPage again.
  6. Decoder.decode is called again but now the underlying buffer is misaligned. We are N bytes into the header instead of at the start of the header. This wouldn't matter if the backend would just return context.Canceled again but...
  7. This time cache hits and returns successfully because memcached doesn't honor context.

Other notes:

  • The redis client seems to already respect context? It takes one but I've found it difficult to trace exactly how it's using it.
  • Long term we should probably swap to the dskit cache clients

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
@joe-elliott joe-elliott changed the title Memcached honors ctx Updated memcached client to honor a canceled context Apr 23, 2025
@joe-elliott joe-elliott changed the title Updated memcached client to honor a canceled context Updated memcached client to honor a cancelled context Apr 23, 2025
Signed-off-by: Joe Elliott <number101010@gmail.com>
Comment thread pkg/cache/memcached.go
Signed-off-by: Joe Elliott <number101010@gmail.com>
Copy link
Copy Markdown
Contributor

@zalegrala zalegrala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one.

Comment thread pkg/cache/memcached.go Outdated
Signed-off-by: Joe Elliott <number101010@gmail.com>
Comment thread pkg/cache/memcached.go
@joe-elliott joe-elliott merged commit a8f7a60 into grafana:main Apr 23, 2025
15 checks passed
zalegrala pushed a commit to zalegrala/tempo that referenced this pull request Apr 23, 2025
* respect context

Signed-off-by: Joe Elliott <number101010@gmail.com>

* add test

Signed-off-by: Joe Elliott <number101010@gmail.com>

* changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* collapse fetch() calls

Signed-off-by: Joe Elliott <number101010@gmail.com>

* move ctx check inside loop

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants