Description
Describe the bug
When I run a hybrid search query through an RRF pipeline and sort the results by `_id`, the scores assigned to the documents change.
GET my-test/_search?search_pipeline=rrf-pipeline
{
"query": {
"hybrid": {
"queries": [
{
"match": {
"content_text": "zoo"
}
},
{
"knn": {
"content_vector": {
"vector": [0.1, 0.2],
"k": 10
}
}
}
]
}
},
"sort": [
"_id"
]
}
Related component
Search
To Reproduce
- Create index
PUT my-test
{
"settings": {
"index": {
"knn": true
}
},
"mappings": {
"properties": {
"content_text": {
"type": "text"
},
"content_vector": {
"type": "knn_vector",
"dimension": 2,
"method": {
"name": "hnsw",
"space_type": "cosinesimil",
"engine": "nmslib"
}
}
}
}
}
- Add data
POST my-test/_doc/1
{
"content_text":"foo",
"content_vector":[0.12, -0.34]
}
POST my-test/_doc/2
{
"content_text":"bar",
"content_vector":[0.13, 0.45]
}
- Put RRF pipeline
PUT /_search/pipeline/rrf-pipeline
{
"description": "Post processor for hybrid RRF search",
"phase_results_processors": [
{
"score-ranker-processor": {
"combination": {
"technique": "rrf"
}
}
}
]
}
- Run hybrid search with sort option.
GET my-test/_search?search_pipeline=rrf-pipeline
{
"query": {
"hybrid": {
"queries": [
{
"match": {
"content_text": "zoo"
}
},
{
"knn": {
"content_vector": {
"vector": [0.1, 0.2],
"k": 10
}
}
}
]
}
},
"sort": [
"_id"
]
}
The result is:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.016393442,
"hits": [
{
"_index": "my-test",
"_id": "1",
"_score": 0.016393442,
"sort": [
"1"
]
},
{
"_index": "my-test",
"_id": "2",
"_score": 0.016129032,
"sort": [
"2"
]
}
]
}
}
But if the query does not have "sort", the request and its response are:
GET my-test/_search?search_pipeline=rrf-pipeline
{
"_source": false,
"query": {
"hybrid": {
"queries": [
{
"match": {
"content_text": "zoo"
}
},
{
"knn": {
"content_vector": {
"vector": [0.1, 0.2],
"k": 10
}
}
}
]
}
}
}
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.016393442,
"hits": [
{
"_index": "my-test",
"_id": "2",
"_score": 0.016393442
},
{
"_index": "my-test",
"_id": "1",
"_score": 0.016129032
}
]
}
}
I expected the only difference to be the order of the hits array, but the scores were different as well (with sort -> without sort):
- id=1 : 0.016393442 -> 0.016129032
- id=2 : 0.016129032 -> 0.016393442
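For reference, the two values look like plain RRF scores of the form 1 / (rank_constant + rank). The sketch below is not taken from the plugin code; it assumes a rank constant of 60 (the report does not set one, and 1/61 and 1/62 match the observed numbers), assumes that "zoo" matches neither "foo" nor "bar" so only the knn sub-query contributes, and ranks the two documents by plain cosine similarity to the query vector. The helper names `cosine` and `rrf_score` are made up for illustration.

```python
import math

# Assumption: rank constant of 60; it is not stated anywhere in this report.
RANK_CONSTANT = 60


def cosine(a, b):
    """Cosine similarity between two 2-dimensional vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def rrf_score(rank, rank_constant=RANK_CONSTANT):
    """RRF contribution of a single sub-query for a document at a 1-based rank."""
    return 1.0 / (rank_constant + rank)


query = [0.1, 0.2]
docs = {"1": [0.12, -0.34], "2": [0.13, 0.45]}

# Rank documents by similarity to the query vector, most similar first.
ranked_ids = sorted(docs, key=lambda doc_id: cosine(query, docs[doc_id]), reverse=True)

# Only the knn sub-query is assumed to return hits, so each document
# gets exactly one RRF term.
scores = {doc_id: rrf_score(rank) for rank, doc_id in enumerate(ranked_ids, start=1)}

print(ranked_ids)
print(scores)
```

Under these assumptions id=2 is the closer vector and gets 1/61, and id=1 gets 1/62, which matches the response without "sort". With "sort" the two values are swapped, which suggests (but does not prove) that the rank is taken from the sorted position instead of the sub-query score.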
Expected behavior
The scores should not change when the sort option is used.
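A minimal sketch of how this expectation could be checked, assuming both responses are available as parsed JSON. The function name `score_mismatches` and the tolerance are hypothetical; the two dictionaries are trimmed copies of the responses shown above.

```python
def score_mismatches(resp_a, resp_b, tol=1e-9):
    """Return {_id: (score_a, score_b)} for hits whose _score differs."""
    scores_a = {h["_id"]: h["_score"] for h in resp_a["hits"]["hits"]}
    scores_b = {h["_id"]: h["_score"] for h in resp_b["hits"]["hits"]}
    return {
        doc_id: (scores_a[doc_id], scores_b[doc_id])
        for doc_id in sorted(scores_a.keys() & scores_b.keys())
        if abs(scores_a[doc_id] - scores_b[doc_id]) > tol
    }


# Trimmed copies of the two responses from this report.
with_sort = {"hits": {"hits": [
    {"_id": "1", "_score": 0.016393442},
    {"_id": "2", "_score": 0.016129032},
]}}
without_sort = {"hits": {"hits": [
    {"_id": "2", "_score": 0.016393442},
    {"_id": "1", "_score": 0.016129032},
]}}

mismatches = score_mismatches(with_sort, without_sort)
print(mismatches)
```

With the expected behavior the returned dictionary would be empty; with the responses above it contains both documents.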
Additional Details
Plugins
7982a63674f5 analysis-sudachi 3.3.0
7982a63674f5 opensearch-alerting 2.19.1.0
7982a63674f5 opensearch-anomaly-detection 2.19.1.0
7982a63674f5 opensearch-asynchronous-search 2.19.1.0
7982a63674f5 opensearch-cross-cluster-replication 2.19.1.0
7982a63674f5 opensearch-custom-codecs 2.19.1.0
7982a63674f5 opensearch-flow-framework 2.19.1.0
7982a63674f5 opensearch-geospatial 2.19.1.0
7982a63674f5 opensearch-index-management 2.19.1.0
7982a63674f5 opensearch-job-scheduler 2.19.1.0
7982a63674f5 opensearch-knn 2.19.1.0
7982a63674f5 opensearch-ltr 2.19.1.0
7982a63674f5 opensearch-ml 2.19.1.0
7982a63674f5 opensearch-neural-search 2.19.1.0
7982a63674f5 opensearch-notifications 2.19.1.0
7982a63674f5 opensearch-notifications-core 2.19.1.0
7982a63674f5 opensearch-observability 2.19.1.0
7982a63674f5 opensearch-performance-analyzer 2.19.1.0
7982a63674f5 opensearch-reports-scheduler 2.19.1.0
7982a63674f5 opensearch-security 2.19.1.0
7982a63674f5 opensearch-security-analytics 2.19.1.0
7982a63674f5 opensearch-skills 2.19.1.0
7982a63674f5 opensearch-sql 2.19.1.0
7982a63674f5 opensearch-system-templates 2.19.1.0
7982a63674f5 query-insights 2.19.1.0
Screenshots
No
Host/Environment
- OS: Ubuntu 22.04
- Version: OpenSearch 2.19.1
Additional context
No