
caching bucket for parquet chunks file #6805


Merged
merged 4 commits into from
Jun 10, 2025


yeya24
Contributor

@yeya24 yeya24 commented Jun 7, 2025

What this PR does:

Enable the caching bucket for the Parquet queryable. The main benefit is a cache for Parquet chunks. I see a reduction in S3 calls from the Parquet querier, mainly S3 GetRange requests, along with some reduction in CPU usage.

image

The chunks cache hit ratio is also high. However, this was measured under a synthetic load, so the actual improvement might not be as good.

image
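
For readers unfamiliar with the idea, here is a minimal sketch of how a caching bucket can cut range requests and where a hit ratio comes from. It is not the code in this PR: `RangeReader`, `countingStore`, `cachingReader`, and `rangeKey` are hypothetical names made up for illustration, the store is an in-memory fake rather than S3, and the cache is an unbounded map with no eviction or TTL.

```go
package main

import (
	"fmt"
	"sync"
)

// RangeReader is a hypothetical stand-in for an object-store client
// that can read a byte range of a named object.
type RangeReader interface {
	GetRange(name string, off, length int64) ([]byte, error)
}

// countingStore is an in-memory fake backend that counts how many
// range requests actually reach it.
type countingStore struct {
	objects map[string][]byte
	calls   int
}

func (s *countingStore) GetRange(name string, off, length int64) ([]byte, error) {
	s.calls++
	data, ok := s.objects[name]
	if !ok {
		return nil, fmt.Errorf("object %q not found", name)
	}
	if off < 0 || length < 0 || off+length > int64(len(data)) {
		return nil, fmt.Errorf("range [%d, %d) out of bounds for %q", off, off+length, name)
	}
	out := make([]byte, length)
	copy(out, data[off:off+length])
	return out, nil
}

// rangeKey identifies one cached range: same object, offset, and length.
type rangeKey struct {
	name        string
	off, length int64
}

// cachingReader wraps another RangeReader and serves repeated,
// identical range reads from memory (assumption: no eviction).
type cachingReader struct {
	mu     sync.Mutex
	next   RangeReader
	cache  map[rangeKey][]byte
	hits   int
	misses int
}

func newCachingReader(next RangeReader) *cachingReader {
	return &cachingReader{next: next, cache: make(map[rangeKey][]byte)}
}

func (c *cachingReader) GetRange(name string, off, length int64) ([]byte, error) {
	key := rangeKey{name: name, off: off, length: length}

	c.mu.Lock()
	if b, ok := c.cache[key]; ok {
		c.hits++
		c.mu.Unlock()
		// Return a copy so callers cannot mutate the cached bytes.
		return append([]byte(nil), b...), nil
	}
	c.misses++
	c.mu.Unlock()

	// Cache miss: go to the backend, then remember the result.
	b, err := c.next.GetRange(name, off, length)
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.cache[key] = append([]byte(nil), b...)
	c.mu.Unlock()
	return b, nil
}

// HitRatio reports hits / (hits + misses), or 0 before any request.
func (c *cachingReader) HitRatio() float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	total := c.hits + c.misses
	if total == 0 {
		return 0
	}
	return float64(c.hits) / float64(total)
}

func main() {
	store := &countingStore{objects: map[string][]byte{
		"block/chunks.parquet": []byte("0123456789abcdef"),
	}}
	reader := newCachingReader(store)

	// Read the same range several times; only the first read
	// should reach the backend.
	for i := 0; i < 4; i++ {
		if _, err := reader.GetRange("block/chunks.parquet", 4, 8); err != nil {
			panic(err)
		}
	}
	fmt.Println("backend calls:", store.calls, "hit ratio:", reader.HitRatio())
}
```

In this sketch only exact repeats of a range are served from memory, which is why a synthetic load that re-reads the same ranges can show a higher hit ratio than real traffic.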

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@dosubot dosubot bot added the storage/blocks Blocks storage engine label Jun 7, 2025
@yeya24 yeya24 force-pushed the cache-parquet-chunks branch from 3db4282 to 8864dec Compare June 10, 2025 00:49
yeya24 added 2 commits June 9, 2025 20:13
@alanprot
Member

Thanks! Amazing work!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jun 10, 2025
@alanprot alanprot merged commit 648356c into cortexproject:master Jun 10, 2025
17 checks passed
@yeya24 yeya24 deleted the cache-parquet-chunks branch June 10, 2025 16:29
Labels
lgtm This PR has been approved by a maintainer size/M storage/blocks Blocks storage engine

2 participants