-
Notifications
You must be signed in to change notification settings - Fork 210
Sort And Relevance Cuffoff Part 3 - Add Relevance Cutoff Feature #1246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: mainline
Are you sure you want to change the base?
Conversation
self.assertIn("h1", result_ids) | ||
self.assertIn("h2", result_ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about other records? do we expect them in the result? why don't we just assert the whole result_ids list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, we assert on the exact order of the response documents ids.
self.assertNotIn("l1", ids_with_cutoff) # Should be filtered out | ||
self.assertNotIn("l2", ids_with_cutoff) # Should be filtered out | ||
|
||
# Should still include high relevance docs | ||
self.assertIn("h1", ids_with_cutoff) | ||
self.assertIn("h2", ids_with_cutoff) | ||
self.assertIn("h3", ids_with_cutoff) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit. simple way is to assert the ids_with_cutoff equals to the expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
{"_id": "l2", "content": "Bright morning sunlight streams through", "sort_value": 9.5}, | ||
# High sort, low relevance | ||
] | ||
self.client.index(self.structured_index_name).add_documents( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit. no need to populate docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
# Should include low relevance docs due to override | ||
ids = [hit["_id"] for hit in response["hits"]] | ||
self.assertIn("l1", ids) # Should be included due to minSortCandidates override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we verify the order of the ids list instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
) | ||
ids_no_cutoff = [hit["_id"] for hit in response_no_cutoff["hits"]] | ||
# Should include low relevance docs with high sort values | ||
self.assertIn("l1", ids_no_cutoff[:3]) # Should be in top 3 due to high sort value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why isn't it the top 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
if (relevance_cutoff and isinstance(marqo_index, UnstructuredMarqoIndex) and | ||
not isinstance(marqo_index, SemiStructuredMarqoIndex)): | ||
# Legacy unstructured indexes do not support relevance cutoff | ||
raise core_exceptions.UnsupportedFeatureError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't erroring out for structured indexes due the second condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the condition. Relevance cut-off is only available to unstructured indexes now
@@ -7,6 +7,8 @@ | |||
class RootFields(BaseModel): | |||
total_count: Optional[int] = Field(None, alias='totalCount') | |||
sort_candidates: Optional[int] = Field(None, alias='marqo__sortCandidates') | |||
relevant_candidates: Optional[int] = Field(None, alias='marqo__relevantCandidates') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the Vespa documentation have an opinion on this?
|
||
|
||
class TestSearchRelevanceCutoffFeatureSort1Field(MarqoTestCase): | ||
class TestSearchRelevanceCutoffFeature(MarqoTestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the search result (order and score of each doc) deterministic? what about with multi-shards and multi-replica?
Also, why do we use greaterThan and lessThan to verify expected candidates? Shouldn't we verify with a specific number if the result is deterministic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the tests more restrictive
def tearDown(self): | ||
"""Ensure documents are not changed after each test.""" | ||
for index_name in [self.unstructured_index_name, self.structured_index_name]: | ||
if 30 !=self.monitoring.get_index_stats_by_name(index_name).number_of_documents: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this, why do we need to verify the doc number in setup and tearDown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a pass for setUp
} | ||
|
||
for _ in range(10): | ||
# We run it several times to ensure that the results are consistent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to run multiple times? What could cause the result to be different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
) | ||
|
||
r = self.index.to_vespa_query(self.hybrid_query) | ||
self.assertAlmostEqual(0.001, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0.001 is not too small, will we encounter precision issue when use assertEqual
? I think we should use assertEqual
if it can pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use self.assertEqual
* @return The number of relevant results to keep | ||
*/ | ||
@VisibleForTesting | ||
Integer detectCutoffCount( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit. return a int
instead, so the caller will not need to test if the result is null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
assertThat(result).contains("targetHits: 200"); | ||
assertThat(result).doesNotContain("targetHits: 100"); | ||
assertThat(result).contains("hnsw.exploreAdditionalHits: 1800"); | ||
assertThat(result).doesNotContain("hnsw.exploreAdditionalHits: 1900"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better way is to just assertThat(result).isEqualTo(expectedResult)
to verify it's exactly what we want it to be. Same for the following test cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
String yql3 = "{targetHits:100, hnsw.exploreAdditionalHits:900}"; // efSearch = 1000 | ||
String result3 = callOverwriteTargetHits(yql3, 1000); | ||
assertThat(result3).contains("targetHits:1000"); | ||
assertThat(result3).contains("hnsw.exploreAdditionalHits:0"); // 1000 - 1000 = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a test case that override is larger than efsearch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
…s. Also remove duplicated tests
…ance-cuffoff-part-3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit. I feel like there are duplicate test cases in this tests, (might be auto generated by AI), maybe try ask AI to identify what test cases are duplicate and consolidate them a bit. Also consider using parameterised test to further reduce the size of the test class for easier reading.
} | ||
|
||
@Test | ||
void shouldUpdateHitsWhenSortByCandidatesLowerThanLimitOffset() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
} | ||
|
||
@Test | ||
void shouldHandleNullParametersCorrectly() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a duplicate test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted
class IntegrationRelevanceCutoffAndSortByTest { | ||
|
||
@Test | ||
void shouldUseMaxCandidatesWhenBothFeaturesEnabledRelevanceHigher() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dup with shouldUpdateHitsWhenBothFeaturesEnabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted
} | ||
|
||
@Test | ||
void shouldUseMaxCandidatesWhenBothFeaturesEnabledSortByHigher() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dup with shouldUpdateHitsWhenBothFeaturesEnabledReversed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted
} | ||
|
||
@Test | ||
void shouldValidateTensorTargetHitsBeforeProcessingWithBothFeatures() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dup with shouldThrowExceptionWhenTensorTargetHitsLowerThanLimitOffset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted
Change Summary
This PR implements the relevance-cutoff functionality in Marqo Python and Java. Relevance Cutoff can be used to stub out irrelevant documents in the retrieval stage. Detailed changes are shown as below:
Sorting Enhancements:
sortCandidates
tominSortCandidates
across models, validation logic, and query construction for better clarity. (src/marqo/tensor_search/models/sort_by_model.py
,minSortCandidates
instead ofsortCandidates
.Relevance Cutoff Implementation:
Implement 3 different relevance cutoff methods in Java, namely
"relative_max_score"
,"gap_detection"
, and"mean_std_dev"
,- Algorithm: Uses a dynamic threshold based on the top search result's score
- Implementation:
- Algorithm: Finds the largest score gap between consecutive results (elbow detection)
- Implementation:
- Parameter: None required (no cutoff parameter)
- Algorithm: Statistical approach using mean and standard deviation
- Implementation:
New Response Metadata
All relevance cutoff searches return additional metadata:
_relevantCandidates
: Number of results passing relevance filter_probeCandidates
: Total candidates analyzed during probe phaseNew API Parameter:
relevanceCutoff
API Structure
Related Jira Ticket
ticket link
Checklist
For new field types: