Commit 9afbb9a

sandeshkr419, bharath-techie, Naarcha-AWS, and natebower authored and committed
Star Tree Search changes related to new Aggregations supported (opensearch-project#9163)
* Update star-tree-index.md Adding example for star-tree supported date histogram as it is supported in OpenSearch 2.19 Signed-off-by: Sandesh Kumar <[email protected]>
* Update star-tree-index.md Add queries included in 2.19 Signed-off-by: Sandesh Kumar <[email protected]>
* typo Signed-off-by: Sandesh Kumar <[email protected]>
* indexing documentation changes Signed-off-by: bharath-techie <[email protected]>
* Technical Writer edits Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Signed-off-by: Naarcha-AWS <[email protected]>
* Update star-tree-index.md Signed-off-by: Naarcha-AWS <[email protected]>
* update example Signed-off-by: Sandesh Kumar <[email protected]>
* rephrasing Co-authored-by: Naarcha-AWS <[email protected]> Signed-off-by: Sandesh Kumar <[email protected]>
* add timestamp in example Signed-off-by: Sandesh Kumar <[email protected]>
* date field decription Signed-off-by: Sandesh Kumar <[email protected]>
* Break up unordered lists in ordered dimensions Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Signed-off-by: Naarcha-AWS <[email protected]>
* date clarification Signed-off-by: Sandesh Kumar <[email protected]>
* space fix Signed-off-by: Sandesh Kumar <[email protected]>
* Update query Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Naarcha-AWS <[email protected]>
* Fix headers. Use range query for second example Signed-off-by: Naarcha-AWS <[email protected]>
* Update star-tree-index.md Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Naarcha-AWS <[email protected]>

---------

Signed-off-by: Sandesh Kumar <[email protected]>
Signed-off-by: bharath-techie <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: bharath-techie <[email protected]>
Co-authored-by: Naarcha-AWS <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Eric Pugh <[email protected]>
1 parent f631ae2 commit 9afbb9a

2 files changed: 105 additions and 11 deletions

_field-types/supported-field-types/star-tree.md

Lines changed: 38 additions & 4 deletions
@@ -37,7 +37,8 @@ PUT logs
3737
"settings": {
3838
"index.number_of_shards": 1,
3939
"index.number_of_replicas": 0,
40-
"index.composite_index": true
40+
"index.composite_index": true,
41+
"index.append_only.enabled": true
4142
},
4243
"mappings": {
4344
"composite": {
@@ -54,6 +55,16 @@ PUT logs
5455
},
5556
{
5657
"name": "port"
58+
},
59+
{
60+
"name": "method"
61+
},
62+
{
63+
"name": "@timestamp",
64+
"calendar_intervals": [
65+
"month",
66+
"day"
67+
]
5768
}
5869
],
5970
"metrics": [
@@ -80,6 +91,10 @@ PUT logs
8091
}
8192
},
8293
"properties": {
94+
"@timestamp": {
95+
"format": "strict_date_optional_time||epoch_second",
96+
"type": "date"
97+
},
8398
"status": {
8499
"type": "integer"
85100
},
@@ -89,6 +104,9 @@ PUT logs
89104
"request_size": {
90105
"type": "integer"
91106
},
107+
"method" : {
108+
"type": "keyword"
109+
},
92110
"latency": {
93111
"type": "scaled_float",
94112
"scaling_factor": 10
@@ -118,17 +136,33 @@ When using the `ordered_dimensions` parameter, follow these best practices:
118136

119137
- The order of dimensions matters. You can define the dimensions ordered from the highest cardinality to the lowest cardinality for efficient storage and query pruning.
120138
- Avoid using high-cardinality fields as dimensions. High-cardinality fields adversely affect storage space, indexing throughput, and query performance.
121-
- Currently, fields supported by the `ordered_dimensions` parameter are all [numeric field types]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/numeric/), with the exception of `unsigned_long`. For more information, see [GitHub issue #15231](https://github.com/opensearch-project/OpenSearch/issues/15231).
122-
- Support for other field types, such as `keyword` and `ip`, will be added in future versions. For more information, see [GitHub issue #16232](https://github.com/opensearch-project/OpenSearch/issues/16232).
123139
- A minimum of `2` and a maximum of `10` dimensions are supported per star-tree index.
124140

141+
The `ordered_dimensions` parameter supports the following field types:
142+
143+
- All numeric field types, excluding `unsigned_long` and `scaled_float`
144+
- `keyword`
145+
- `object`
146+
- `date`, which can use up to three of the following calendar intervals:
147+
  - `year` (of era)
148+
  - `quarter` (of year)
149+
  - `month` (of year)
150+
  - `week` (of week-based year)
151+
  - `day` (of month)
152+
  - `hour` (of day)
153+
  - `half-hour` (of day)
154+
  - `quarter-hour` (of day)
155+
  - `minute` (of hour)
156+
  - `second` (of minute)
157+
158+
Support for other field types, such as `ip`, will be added in future versions. For more information, see [GitHub issue #13875](https://github.com/opensearch-project/OpenSearch/issues/13875).
159+
125160
The `ordered_dimensions` parameter supports the following property.
126161

127162
| Parameter | Required/Optional | Description |
128163
| :--- | :--- | :--- |
129164
| `name` | Required | The name of the field. The field name should be present in the `properties` section as part of the index `mapping`. Ensure that the `doc_values` setting is `enabled` for any associated fields. |
130165

131-
132166
### Metrics
133167

134168
Configure any metric fields on which you need to perform aggregations. `Metrics` are required as part of a star-tree index configuration.

_search-plugins/star-tree-index.md

Lines changed: 67 additions & 7 deletions
@@ -26,7 +26,7 @@ A star-tree index can be used to perform faster aggregations. Consider the follo
2626

2727
Star-tree indexes have the following limitations:
2828

29-
- A star-tree index should only be enabled on indexes whose data is not updated or deleted because updates and deletions are not accounted for in a star-tree index.
29+
- A star-tree index should only be enabled on indexes whose data is not updated or deleted because updates and deletions are not accounted for in a star-tree index. To enforce this policy and use star-tree indexes, set the `index.append_only.enabled` setting to `true`.
3030
- A star-tree index can be used for aggregation queries only if the queried fields are a subset of the star-tree's dimensions and the aggregated fields are a subset of the star-tree's metrics.
3131
- After a star-tree index is enabled, it cannot be disabled. In order to disable a star-tree index, the data in the index must be reindexed without the star-tree mapping. Furthermore, changing a star-tree configuration will also require a reindex operation.
3232
- [Multi-values/array values]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/index/#arrays) are not supported.
@@ -68,6 +68,7 @@ To use a star-tree index, modify the following settings:
6868
- Set the feature flag `opensearch.experimental.feature.composite_index.star_tree.enabled` to `true`. For more information about enabling and disabling feature flags, see [Enabling experimental features]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/).
6969
- Set the `indices.composite_index.star_tree.enabled` setting to `true`. For instructions on how to configure OpenSearch, see [Configuring settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/#static-settings).
7070
- Set the `index.composite_index` index setting to `true` during index creation.
71+
- Set the `index.append_only.enabled` index setting to `true` during index creation.
7172
- Ensure that the `doc_values` parameter is enabled for the `dimensions` and `metrics` fields used in your star-tree mapping.
7273

7374

@@ -81,7 +82,8 @@ PUT logs
8182
"settings": {
8283
"index.number_of_shards": 1,
8384
"index.number_of_replicas": 0,
84-
"index.composite_index": true
85+
"index.composite_index": true,
86+
"index.append_only.enabled": true
8587
},
8688
"mappings": {
8789
"composite": {
@@ -94,6 +96,9 @@ PUT logs
9496
},
9597
{
9698
"name": "port"
99+
},
100+
{
101+
"name": "method"
97102
}
98103
],
99104
"metrics": [
@@ -123,6 +128,9 @@ PUT logs
123128
"size": {
124129
"type": "integer"
125130
},
131+
"method" : {
132+
"type": "keyword"
133+
},
126134
"latency": {
127135
"type": "scaled_float",
128136
"scaling_factor": 10
@@ -131,6 +139,7 @@ PUT logs
131139
}
132140
}
133141
```
142+
{% include copy.html %}
134143

135144
For detailed information about star-tree index mappings and parameters, see [Star-tree field type]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/star-tree/).
136145

@@ -140,14 +149,20 @@ Star-tree indexes can be used to optimize queries and aggregations.
140149

141150
### Supported queries
142151

143-
The following queries are supported as of OpenSearch 2.18:
152+
The following queries are supported as of OpenSearch 2.19:
144153

145154
- [Term query]({{site.url}}{{site.baseurl}}/query-dsl/term/term/)
155+
- [Terms query]({{site.url}}{{site.baseurl}}/query-dsl/term/terms/)
146156
- [Match all docs query]({{site.url}}{{site.baseurl}}/query-dsl/match-all/)
157+
- [Range query]({{site.url}}{{site.baseurl}}/query-dsl/term/range/)
147158

148-
To use a query with a star-tree index, the query's fields must be present in the `ordered_dimensions` section of the star-tree configuration. Queries must also be paired with a supported aggregation.
159+
To use a query with a star-tree index, the query's fields must be present in the `ordered_dimensions` section of the star-tree configuration. Queries must also be paired with a supported aggregation; queries without aggregations cannot use a star-tree index. Currently, queries on `date` fields are not supported. Support will be added in a later version.
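
As a sketch of the pairing requirement, the following request combines a `terms` query on the `status` dimension with a `sum` aggregation on the `size` metric. It assumes the `logs` index from the example mapping, with `status` listed in `ordered_dimensions` and `sum` included in the `stats` of the `size` metric. The status values and the aggregation name `sum_size` are illustrative only:

```json
POST /logs/_search
{
  "size": 0,
  "query": {
    "terms": {
      "status": [500, 503]
    }
  },
  "aggs": {
    "sum_size": {
      "sum": {
        "field": "size"
      }
    }
  }
}
```
{% include copy-curl.html %}

Setting `"size": 0` suppresses the document hits so that only the aggregation result is returned.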
149160

150161
### Supported aggregations
162+
163+
The following aggregations are supported by star-tree indexes.
164+
165+
#### Metric aggregations
151166

152167
The following metric aggregations are supported as of OpenSearch 2.18:
153168
- [Sum]({{site.url}}{{site.baseurl}}/aggregations/metric/sum/)
@@ -156,13 +171,11 @@ The following metric aggregations are supported as of OpenSearch 2.18:
156171
- [Value count]({{site.url}}{{site.baseurl}}/aggregations/metric/value-count/)
157172
- [Average]({{site.url}}{{site.baseurl}}/aggregations/metric/average/)
158173

159-
To use aggregations:
174+
To use searchable aggregations with a star-tree index, make sure you fulfill the following prerequisites:
160175

161176
- The fields must be present in the `metrics` section of the star-tree configuration.
162177
- The metric aggregation type must be part of the `stats` parameter.
163178

164-
### Aggregation example
165-
166179
The following example gets the sum of all the values in the `size` field for all error logs with `status=500`, using the [example mapping](#example-mapping):
167180

168181
```json
@@ -182,9 +195,56 @@ POST /logs/_search
182195
}
183196
}
184197
```
198+
{% include copy.html %}
185199

186200
Using a star-tree index, the result will be retrieved from a single aggregated document as it traverses the `status=500` node, as opposed to scanning through all of the matching documents. This results in lower query latency.
187201

202+
#### Date histograms with metric aggregations
203+
204+
You can use [date histograms]({{site.url}}{{site.baseurl}}/aggregations/bucket/date-histogram/) on calendar intervals with metric sub-aggregations.
205+
206+
To use date histogram aggregations and make them searchable in a star-tree index, remember the following requirements:
207+
208+
- The star-tree mapping configuration must define either the calendar interval used in the request or a finer one. For example, if an aggregation uses the `month` interval, the star-tree search can still resolve it from a finer interval in the mapping, such as `day`.
209+
- A metric sub-aggregation must be part of the aggregation request.
210+
211+
The following example uses a range query to select error logs (those with a `status` between `500` and `599`) and gets the sum of all the values in the `size` field, aggregated for each calendar month:

```json
POST /logs/_search
{
  "size": 0,
  "query": {
    "range": {
      "status": {
        "gte": 500,
        "lte": 599
      }
    }
  },
  "aggs": {
    "by_month": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "month"
      },
      "aggs": {
        "sum_size": {
          "sum": {
            "field": "size"
          }
        }
      }
    }
  }
}
```
245+
{% include copy-curl.html %}
246+
247+
188248
## Using queries without a star-tree index
189249

190250
Set the `indices.composite_index.star_tree.enabled` setting to `false` to run queries without using a star-tree index.
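
A minimal sketch of that change, assuming the setting is dynamic in your version and can be updated through the cluster settings API. If it is static, set it in `opensearch.yml` instead and restart the node:

```json
PUT /_cluster/settings
{
  "persistent": {
    "indices.composite_index.star_tree.enabled": false
  }
}
```
{% include copy-curl.html %}

Using `persistent` keeps the value across cluster restarts. Use `transient` instead if the change should only last until the next full restart.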
