Star Tree Search changes related to new Aggregations supported #9163

Merged 22 commits on Feb 11, 2025. Showing changes from 3 commits.

54 changes: 50 additions & 4 deletions _search-plugins/star-tree-index.md
@@ -140,14 +140,18 @@

### Supported queries

The following queries are supported as of OpenSearch 2.19:

- [Term query]({{site.url}}{{site.baseurl}}/query-dsl/term/term/)
- [Terms query]({{site.url}}{{site.baseurl}}/query-dsl/term/terms/)
- [Match all docs query]({{site.url}}{{site.baseurl}}/query-dsl/match-all/)
- [Range query]({{site.url}}{{site.baseurl}}/query-dsl/term/range/)

To use a query with a star-tree index, the query's fields must be present in the `ordered_dimensions` section of the star-tree configuration. Queries without aggregations are not supported; each query must be paired with a supported aggregation.
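
As a minimal sketch of this pairing, the following request combines a `terms` query with a `max` aggregation. It assumes, hypothetically, that `status` is listed in `ordered_dimensions` and that `size` is configured in `metrics` with `max` among its `stats`. These names are illustrative and must match your own star-tree configuration:

```json
POST /logs/_search
{
  "size": 0,
  "query": {
    "terms": {
      "status": ["200", "500"]
    }
  },
  "aggs": {
    "max_size": {
      "max": {
        "field": "size"
      }
    }
  }
}
```

The same query sent without an `aggs` section would not be resolved using the star-tree index.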

### Supported aggregations

The following aggregation types can be resolved using a star-tree index.

#### Metric aggregations

The following metric aggregations are supported as of OpenSearch 2.18:

- [Sum]({{site.url}}{{site.baseurl}}/aggregations/metric/sum/)
@@ -156,12 +160,12 @@
- [Value count]({{site.url}}{{site.baseurl}}/aggregations/metric/value-count/)
- [Average]({{site.url}}{{site.baseurl}}/aggregations/metric/average/)

For a metric aggregation to be searchable using a star-tree index:

- The fields must be present in the `metrics` section of the star-tree configuration.
- The metric aggregation type must be part of the `stats` parameter.
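
For illustration, the following fragment shows the part of a star-tree `config` that these two requirements refer to. It is a sketch rather than a complete mapping: the `size` field and the chosen `stats` values are assumptions, and the other `config` keys (such as `ordered_dimensions`) are omitted:

```json
{
  "config": {
    "metrics": [
      {
        "name": "size",
        "stats": ["sum", "max"]
      }
    ]
  }
}
```

With this configuration, `sum` and `max` aggregations on `size` meet both requirements, while a `min` aggregation on `size` would not, because `min` is not listed in `stats`.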

##### Example

The following example gets the sum of all the values in the `size` field for all error logs with `status=500`, using the [example mapping](#example-mapping):

@@ -185,6 +189,48 @@

With a star-tree index, the query traverses to the `status=500` node and retrieves the result from a single pre-aggregated document instead of scanning all of the matching documents, which lowers query latency.

#### Date histogram aggregation with metric aggregations

[Date histogram]({{site.url}}{{site.baseurl}}/aggregations/bucket/date-histogram/) aggregations on calendar intervals, with the preceding metric aggregations as sub-aggregations, are supported as of OpenSearch 2.19.

For a date histogram aggregation to be searchable using a star-tree index:

- The calendar intervals in the star-tree mapping configuration must include either the requested calendar interval or a lower-granularity (finer) interval. For example, a request with a `month` interval can also be resolved from the `day` interval if `day` is present in the star-tree mapping.
- A metric sub-aggregation must be part of the aggregation request.
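
As a sketch of the first requirement, the following fragment shows a `date_dimension` entry in a star-tree `config`. The `@timestamp` field name and the single `day` interval are assumptions chosen for illustration, and the rest of the mapping is omitted:

```json
{
  "config": {
    "date_dimension": {
      "name": "@timestamp",
      "calendar_intervals": ["day"]
    }
  }
}
```

Because `day` is a lower granularity than `month`, a date histogram request with `"calendar_interval": "month"` could still be resolved from this configuration.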

##### Example

The following example gets the sum of the values in the `size` field for each calendar month, across all error logs with `status=500`:

```json
POST /logs/_search
{
  "query": {
    "term": {
      "status": "500"
    }
  },
  "size": 0,
  "aggs": {
    "by_hour": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "month"
      },
      "aggs": {
        "sum_size": {
          "sum": {
            "field": "size"
          }
        }
      }
    }
  }
}
```

Review comments on the `"query": {` line of this example:

Contributor: Is this a valid query? @sandeshkr419, just double-checking. I don't see term/terms, etc.

Collaborator: @bharath-techie: I updated the query to add terms while keeping the method.

Contributor: This is still not a valid query; we don't support range on timestamp. Let's reword this, @Naarcha-AWS.

Collaborator: @bharath-techie: Can you suggest a valid query?

Contributor: This is a valid query that we can use, @Naarcha-AWS:

    {
        "size": 0,
        "query": {
            "range": {
                "status": {
                    "gte": "200",
                    "lte": "400"
                }
            }
        },
        "aggs": {
            "by_hour": {
                "date_histogram": {
                    "field": "@timestamp",
                    "calendar_interval": "month"
                },
                "aggs": {
                    "sum_size": {
                        "sum": {
                            "field": "size"
                        }
                    }
                }
            }
        }
    }

Review comments on the `"status": "500"` line of this example:

Contributor: Can we use a range query or a keyword term query, since a term query is already used in the preceding example? A keyword term query might be a good example.

Collaborator: @bharath-techie: I'll update the example to use a range query.



## Using queries without a star-tree index

Set the `indices.composite_index.star_tree.enabled` setting to `false` to run queries without using a star-tree index.
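
One way to apply this, assuming the setting can be updated dynamically on your cluster (if it cannot, set it in `opensearch.yml` instead), is through the cluster settings API:

```json
PUT /_cluster/settings
{
  "persistent": {
    "indices.composite_index.star_tree.enabled": "false"
  }
}
```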