Update sparse_vector field mapping to include default setting for token pruning #129089

Draft: markjhoy wants to merge 2 commits into main

@markjhoy markjhoy commented Jun 6, 2025

Updates the SparseVectorFieldMapper type to include index options for pruning tokens and associated configuration values.

Before this update, token pruning for sparse vector types was only available at query time (see the parameters for the sparse vector query).

With this PR, any new index with a sparse_vector field will have token pruning turned on by default. Indices created before this update keep pruning turned off by default for their existing sparse_vector fields. Pruning options set explicitly in a sparse_vector query still override the index defaults.
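The precedence described above can be sketched in a few lines of Python. This is only an illustration, not the mapper's actual code: the function name `resolve_pruning`, the `index_is_new` flag, and the per-key merge of `pruning_config` are assumptions made for the example. Only the option names and default values come from this PR description.

```python
from typing import Optional

# Default thresholds as stated in the PR description.
DEFAULT_PRUNING_CONFIG = {
    "tokens_freq_ratio_threshold": 5,
    "tokens_weight_threshold": 0.4,
}


def resolve_pruning(
    index_options: Optional[dict],
    query_options: Optional[dict],
    index_is_new: bool = True,  # hypothetical flag: index created after this change
) -> dict:
    """Return the effective pruning settings for one query (illustrative only)."""
    # New indices prune by default; older indices keep pruning off.
    effective = {
        "prune": index_is_new,
        "pruning_config": dict(DEFAULT_PRUNING_CONFIG),
    }
    # Apply index-level options first, then query-level options,
    # so explicit query settings win over the index defaults.
    for layer in (index_options, query_options):
        if not layer:
            continue
        if "prune" in layer:
            effective["prune"] = layer["prune"]
        # Assumption: thresholds are merged per key rather than replaced wholesale.
        effective["pruning_config"].update(layer.get("pruning_config", {}))
    return effective
```

For instance, `resolve_pruning({"prune": True}, {"prune": False})` yields settings with pruning disabled, because the query-level value is applied last.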

Example:

{
  "properties": {
    "example_field": {
      "type": "sparse_vector",
      "index_options": {
        "prune": (boolean, default is `true`),
        "pruning_config": {
          "tokens_freq_ratio_threshold": (integer, range 1-100, default is 5),
          "tokens_weight_threshold": (double, range 0.0-1.0, default is 0.4)
        }
      }
    }
  }
}
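To make the schema above concrete, here is a small Python sketch that builds a mapping body with explicit values and checks them against the ranges listed. The helper name `sparse_vector_mapping` and its parameter names are hypothetical, and the client-side validation is an assumption of this example; the server performs its own checks, which may differ.

```python
import json


def sparse_vector_mapping(
    field: str,
    prune: bool = True,
    freq_ratio: int = 5,
    weight: float = 0.4,
) -> dict:
    """Build a mapping body for one sparse_vector field (illustrative helper)."""
    if not isinstance(prune, bool):
        raise TypeError("prune must be a boolean")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(freq_ratio, bool) or not isinstance(freq_ratio, int):
        raise TypeError("tokens_freq_ratio_threshold must be an integer")
    if not 1 <= freq_ratio <= 100:
        raise ValueError("tokens_freq_ratio_threshold must be in 1-100")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("tokens_weight_threshold must be in 0.0-1.0")
    return {
        "properties": {
            field: {
                "type": "sparse_vector",
                "index_options": {
                    "prune": prune,
                    "pruning_config": {
                        "tokens_freq_ratio_threshold": freq_ratio,
                        "tokens_weight_threshold": weight,
                    },
                },
            }
        }
    }


if __name__ == "__main__":
    # Print a mapping body that uses the documented defaults.
    print(json.dumps(sparse_vector_mapping("example_field"), indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of a mapping request by whichever client is in use.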
