Fix addEmptyBuckets from creating too many buckets when given big extended bounds #17718
base: main
❌ Gradle check result for 60c7b21: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
int preEmptyBucketCount = list.size();
// we use counts here only to add those values to the CircuitBreaker, list's count has already been added in #reduce, so we only
// need to add emptyBucketCount
int emptyBucketCount = getTotalBucketCount() - list.size();
if (emptyBucketCount > 0) {
    CircuitBreaker breaker = reduceContext.getBreaker();
    if (breaker != null) {
        breaker.addEstimateBytesAndMaybeBreak(50L * emptyBucketCount, "empty histogram buckets");
    }
    preEmptyBucketCount += emptyBucketCount;
    reduceContext.consumeBucketsAndMaybeBreak(emptyBucketCount);
}
- int preEmptyBucketCount = list.size();
- // we use counts here only to add those values to the CircuitBreaker, list's count has already been added in #reduce, so we only
- // need to add emptyBucketCount
- int emptyBucketCount = getTotalBucketCount() - list.size();
- if (emptyBucketCount > 0) {
-     CircuitBreaker breaker = reduceContext.getBreaker();
-     if (breaker != null) {
-         breaker.addEstimateBytesAndMaybeBreak(50L * emptyBucketCount, "empty histogram buckets");
-     }
-     preEmptyBucketCount += emptyBucketCount;
-     reduceContext.consumeBucketsAndMaybeBreak(emptyBucketCount);
- }
+ final int originalSize = list.size();
+ final int estimateEmptyBucketCount = estimateTotalBucketCount() - originalSize;
+ assert estimateEmptyBucketCount >= 0;
+ CircuitBreaker breaker = reduceContext.getBreaker();
+ if (breaker != null) {
+     // 50 bytes memory usage for each empty bucket
+     breaker.addEstimateBytesAndMaybeBreak(50L * estimateEmptyBucketCount, "empty histogram buckets");
+ }
+ reduceContext.consumeBucketsAndMaybeBreak(estimateEmptyBucketCount);
I think that if emptyBucketCount < 0, the value returned by estimateTotalBucketCount is wrong.
The breaker.addEstimateBytesAndMaybeBreak call can be folded into reduceContext, so you don't need the null check here.
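To illustrate the two review points above (charge the estimated empty buckets before allocating them, and hide the breaker check behind a single call), here is a minimal, self-contained sketch. It is not the OpenSearch code: `Budget`, `reserveEmptyBuckets`, and `fillGaps` are hypothetical names, the limits are arbitrary, and keys are assumed to be sorted multiples of the interval. Only the 50-bytes-per-bucket figure comes from the diff.

```java
import java.util.ArrayList;
import java.util.List;

final class EmptyBucketAccounting {
    // 50 bytes per empty bucket, the figure used in the diff under review.
    static final long BYTES_PER_EMPTY_BUCKET = 50L;

    /** Hypothetical stand-in for a reduce context: one call enforces both the byte and the bucket limit. */
    static final class Budget {
        private final long maxBytes;
        private final long maxBuckets;
        private long usedBytes;
        private long usedBuckets;

        Budget(long maxBytes, long maxBuckets) {
            this.maxBytes = maxBytes;
            this.maxBuckets = maxBuckets;
        }

        /** Callers need no null check; the memory accounting lives in here. */
        void reserveEmptyBuckets(long count) {
            if (count <= 0) {
                return; // nothing to reserve
            }
            long bytes = BYTES_PER_EMPTY_BUCKET * count;
            if (usedBytes + bytes > maxBytes) {
                throw new IllegalStateException("memory limit exceeded for empty histogram buckets");
            }
            if (usedBuckets + count > maxBuckets) {
                throw new IllegalStateException("too many buckets");
            }
            usedBytes += bytes;
            usedBuckets += count;
        }

        long usedBuckets() {
            return usedBuckets;
        }
    }

    /** Fills the gaps between sorted keys, reserving budget for the estimate before allocating anything. */
    static List<Double> fillGaps(List<Double> keys, double interval, Budget budget) {
        if (keys.size() < 2) {
            return new ArrayList<>(keys);
        }
        double first = keys.get(0);
        double last = keys.get(keys.size() - 1);
        long estimatedTotal = Math.round((last - first) / interval) + 1;
        // Fail fast here, before the loop below can allocate a huge list.
        budget.reserveEmptyBuckets(estimatedTotal - keys.size());

        List<Double> filled = new ArrayList<>();
        for (int i = 0; i < keys.size(); i++) {
            double key = keys.get(i);
            filled.add(key);
            if (i + 1 < keys.size()) {
                long steps = Math.round((keys.get(i + 1) - key) / interval);
                for (long j = 1; j < steps; j++) {
                    filled.add(key + j * interval); // empty bucket key
                }
            }
        }
        return filled;
    }

    public static void main(String[] args) {
        Budget budget = new Budget(1_000L, 10L);
        System.out.println(fillGaps(List.of(0.0, 3.0), 1.0, budget));
    }
}
```

With keys 1 and 1000000000 and an interval of 1, the reservation step in this sketch throws before any empty bucket is added to the list.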
int postEmptyBucketCount = list.size() - preEmptyBucketCount;
reduceContext.consumeBucketsAndMaybeBreak(postEmptyBucketCount);
- int postEmptyBucketCount = list.size() - preEmptyBucketCount;
- reduceContext.consumeBucketsAndMaybeBreak(postEmptyBucketCount);
+ int postAddEmptyBucketCount = list.size() - estimateEmptyBucketCount - originalSize;
+ reduceContext.consumeBucketsAndMaybeBreak(postAddEmptyBucketCount);
int i = 0;
double key = min;
while (key < max && i++ < 10) {
    bucketCount++;
    key = nextKey(key);
}

if (bucketCount < 10) {
    return bucketCount;
}
What is the goal of this?
@@ -382,7 +383,46 @@ private double round(double key) {
        return Math.floor((key - emptyBucketInfo.offset) / emptyBucketInfo.interval) * emptyBucketInfo.interval + emptyBucketInfo.offset;
    }

    private void addEmptyBuckets(List<Bucket> list, ReduceContext reduceContext) {
    private int getTotalBucketCount() {
- private int getTotalBucketCount() {
+ private int estimateTotalBucketCount() {
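For readers following the rename, one way an estimate like this could be computed in closed form is sketched below. This is an assumption for illustration, not the method body from the PR: it reuses the rounding formula visible in the `round(double key)` context above, takes `min`, `max`, `interval`, and `offset` as plain parameters, and clamps the result to `Integer.MAX_VALUE` so a very wide range cannot overflow the `int` return type.

```java
final class BucketCountEstimate {
    /** Same rounding formula as the round(double key) context line in the diff. */
    static double round(double key, double interval, double offset) {
        return Math.floor((key - offset) / interval) * interval + offset;
    }

    /**
     * Estimates how many buckets cover [min, max], including empty ones.
     * The parameters are illustrative; the real method reads them from its own fields.
     */
    static int estimateTotalBucketCount(double min, double max, double interval, double offset) {
        if (!(interval > 0) || max < min) {
            return 0; // invalid interval (including NaN) or empty range
        }
        double first = round(min, interval, offset);
        double last = round(max, interval, offset);
        // Adding 0.5 before flooring absorbs small floating-point error in the division.
        double count = Math.floor((last - first) / interval + 0.5) + 1;
        // Clamp instead of overflowing the int return type.
        return count >= Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) count;
    }

    public static void main(String[] args) {
        System.out.println(estimateTotalBucketCount(1, 1_000_000_000, 1, 0));
    }
}
```

Because the estimate is arithmetic only, it costs the same for ten buckets as for a billion, so it can run before any bucket is allocated.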
❌ Gradle check result for ed6093c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
@harshavamsi This is a simple case that still causes the problem:
curl -X PUT "http://localhost:9200/test-index" -H 'Content-Type: application/json' -d'
{
"mappings": {
"properties": {
"value": { "type": "double" },
"timestamp": { "type": "date" }
}
}
}'
curl -X POST "http://localhost:9200/test-index/_doc/1" -H 'Content-Type: application/json' -d'
{
"value": 1,
"timestamp": "2000-01-01T00:00:00Z"
}'
curl -X POST "http://localhost:9200/test-index/_doc/2" -H 'Content-Type: application/json' -d'
{
"value": 1000000000,
"timestamp": "2025-01-01T00:00:00Z"
}'
curl -X POST "http://localhost:9200/test-index/_search" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"value_hist": {
"histogram": {
"field": "value",
"interval": 1
}
}
}
}'
curl -X POST "http://localhost:9200/test-index/_search" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"timestamp_hist": {
"date_histogram": {
"field": "timestamp",
"calendar_interval": "1m"
}
}
}
}'
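A rough count of the buckets these two searches would need, assuming every gap between the two documents is filled with an empty bucket and that timestamps are bucketed in UTC. The class and method names are made up for this sketch; only the field values come from the requests above.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

final class ReproBucketMath {
    /** Buckets needed to cover [min, max] with a fixed numeric interval, both ends included. */
    static long numericBuckets(long min, long max, long interval) {
        return (max - min) / interval + 1;
    }

    /** One-minute buckets needed to cover [first, last], both ends included. */
    static long minuteBuckets(Instant first, Instant last) {
        return ChronoUnit.MINUTES.between(first, last) + 1;
    }

    public static void main(String[] args) {
        // "value" documents: 1 and 1000000000 with "interval": 1
        System.out.println(numericBuckets(1L, 1_000_000_000L, 1L));
        // "timestamp" documents with "calendar_interval": "1m"
        System.out.println(minuteBuckets(
            Instant.parse("2000-01-01T00:00:00Z"),
            Instant.parse("2025-01-01T00:00:00Z")));
    }
}
```

Under those assumptions the histogram spans 1,000,000,000 buckets and the date_histogram spans 13,150,081, so two documents are enough to request far more empty buckets than real ones.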
Thanks for bringing this up, I will include this in the fix!
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
ed6093c to 67e4508
❌ Gradle check result for 67e4508: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
This PR is stalled because it has been open for 30 days with no activity.
Description
Most of the description is in #17702. This PR adds checks that run before any empty buckets are created.
Before creating empty buckets, we estimate how many would be created and add that count to the CircuitBreaker, which can either trip or raise a max_buckets_exception.
Related Issues
Resolves #17702
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.