-
Notifications
You must be signed in to change notification settings - Fork 25.2k
kNN vector rescoring for quantized vectors #116663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kNN vector rescoring for quantized vectors #116663
Conversation
…ased DoubleValueSource
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! this is more what I was thinking. Nice use of functionscore!
A couple of things:
- we should only oversample & rescore if its a quantized type (you can see this via indexOptions)
- I still think the API level thing needs to be an object with a parameter.
- We need to make sure this works for knn query and for the top-level KNN object :)
Thanks for taking a look @benwtrent ! I was just trying to validate the overall direction. You're perfectly right in your comments, I was not aiming for that yet 🙂 I'll keep iterating on this idea and open up a proper draft and not a PoC.
Is it OK to tackle that as separate PRs? |
If necessary, for sure. But both hit the same path on the shard level, its all about being able to declare the values at the parser level and propagate them down. If you did separate PRs, you then have separate transport versions, and likely separate capabilities for testing, etc. I am not 100% sure its worth it? |
# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java
hey @benwtrent ! I'm making progress on this. I've added the rescoring oversample to the knn query, knn section and the knn retriever. I still have to fix CI, use it just for quantized vectors, and add proper testing - and I'm already at 38 changed files. Do you still think this should go on a single PR, or should I start working on separate PRs for query, section and retriver? |
The main "logic" changes are done down on the mapper. Which all these would use. The rest is serialization & testing. I think all in one is ok. |
/** | ||
* Wraps a kNN vector query to rescore the results using the non-quantized vectors | ||
*/ | ||
public class KnnRescoreVectorQuery extends Query implements ProfilingQuery { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benwtrent I've created this new query to:
- Wrap the KNN vector query with a function score for rescoring
- Limit the number of results back from each shard when k is specified
LMKWYT!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is good. Once its ready for review, I can give it a thorough look over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some small things in the new query. I haven't fully reviewed. But I will once its ready
byte[] byteTarget, | ||
VectorSimilarityFunction vectorSimilarityFunction, | ||
Integer k, | ||
Query vectorQuery |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the knn query should also be a profiled query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmmm, gotcha. There can be multiple levels of wrapping:
- KnnRescoreVectorQuery
- VectorSimilarityQuery
- EsKnnFloatVectorQuery
We'll probably need VectorSimilarityQuery
to implement ProfilingQuery
as well - right now using a similarity
param means that profiling returns 0 vector op counts.
I'll fix that, thanks for the catch 👍
|
||
if (k == null) { | ||
// No need to calculate top k - let the request size limit the results | ||
return query; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
be sure to account for the inner queries vector ops and ensure that we account for the total vector ops (quantized and non).
scores[i] = scoreDocs[i].score; | ||
} | ||
|
||
vectorOpsCount = scoreDocs.length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems like it should be the inner query vector ops + the total docs gathered.
/** | ||
* Wraps a kNN vector query to rescore the results using the non-quantized vectors | ||
*/ | ||
public class KnnRescoreVectorQuery extends Query implements ProfilingQuery { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is good. Once its ready for review, I can give it a thorough look over.
@benwtrent This is ready for another round of review. Main changes:
LMKWYT! |
# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor things. but this looks good!
...r/src/main/java/org/elasticsearch/index/mapper/vectors/VectorSimilarityFloatValueSource.java
Outdated
Show resolved
Hide resolved
...r/src/main/java/org/elasticsearch/index/mapper/vectors/VectorSimilarityFloatValueSource.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/search/vectors/RescoreKnnVectorQuery.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/search/vectors/RescoreVectorBuilder.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/search/vectors/RescoreVectorBuilder.java
Outdated
Show resolved
Hide resolved
@ParametersFactory | ||
public static Iterable<Object[]> parameters() { | ||
List<Object[]> params = new ArrayList<>(); | ||
params.add(new Object[] { true }); | ||
params.add(new Object[] { false }); | ||
|
||
return params; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤯
And please don't forget about adding docs (either in this PR or in a follow up). |
Co-authored-by: Benjamin Trent <[email protected]>
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
(cherry picked from commit 5996772) # Conflicts: # server/src/main/java/org/elasticsearch/search/vectors/KnnSearchBuilder.java # x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/rrf/RRFRankBuilder.java
* kNN vector rescoring for quantized vectors (#116663) (cherry picked from commit 5996772) # Conflicts: # server/src/main/java/org/elasticsearch/search/vectors/KnnSearchBuilder.java # x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/rrf/RRFRankBuilder.java * FloatVectorValues have a different interface in this Lucene version
(cherry picked from commit 5996772) # Conflicts: # server/src/main/java/org/elasticsearch/search/vectors/KnnSearchBuilder.java # x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/rrf/RRFRankBuilder.java
(cherry picked from commit 5996772) # Conflicts: # server/src/main/java/org/elasticsearch/search/vectors/KnnSearchBuilder.java # x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/rrf/RRFRankBuilder.java Co-authored-by: Felix Barnsteiner <[email protected]>
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of doc ids and corresponding array including scores fo such docs. A binary search is performed on top of the docs array, and such global ids are converted back to segment level ids (subtracting the context docbase) when scoring docs. RescoreKnnVectoryQuery did not sort the array of docs which caused binary search to return non deterministic results, which in turn made us look up wrong docs, something using out of bound ids. One symptom of this was observed in a DFSProfilerIT test failure which triggered a Lucene assertion around doc id being outside of the range of the bitset of live docs. The fix is to simply sort the score docs array before extracting docs ids and scores and providing them to KnnScoreDocQuery upon rewrite. Relates to elastic#116663 Closes elastic#119711
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of doc ids and corresponding array including scores fo such docs. A binary search is performed on top of the docs array, and such global ids are converted back to segment level ids (subtracting the context docbase) when scoring docs. RescoreKnnVectoryQuery did not sort the array of docs which caused binary search to return non deterministic results, which in turn made us look up wrong docs, something using out of bound ids. One symptom of this was observed in a DFSProfilerIT test failure which triggered a Lucene assertion around doc id being outside of the range of the bitset of live docs. The fix is to simply sort the score docs array before extracting docs ids and scores and providing them to KnnScoreDocQuery upon rewrite. Relates to #116663 Closes #119711
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of doc ids and corresponding array including scores fo such docs. A binary search is performed on top of the docs array, and such global ids are converted back to segment level ids (subtracting the context docbase) when scoring docs. RescoreKnnVectoryQuery did not sort the array of docs which caused binary search to return non deterministic results, which in turn made us look up wrong docs, something using out of bound ids. One symptom of this was observed in a DFSProfilerIT test failure which triggered a Lucene assertion around doc id being outside of the range of the bitset of live docs. The fix is to simply sort the score docs array before extracting docs ids and scores and providing them to KnnScoreDocQuery upon rewrite. Relates to elastic#116663 Closes elastic#119711
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of doc ids and corresponding array including scores fo such docs. A binary search is performed on top of the docs array, and such global ids are converted back to segment level ids (subtracting the context docbase) when scoring docs. RescoreKnnVectoryQuery did not sort the array of docs which caused binary search to return non deterministic results, which in turn made us look up wrong docs, something using out of bound ids. One symptom of this was observed in a DFSProfilerIT test failure which triggered a Lucene assertion around doc id being outside of the range of the bitset of live docs. The fix is to simply sort the score docs array before extracting docs ids and scores and providing them to KnnScoreDocQuery upon rewrite. Relates to elastic#116663 Closes elastic#119711
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of doc ids and corresponding array including scores fo such docs. A binary search is performed on top of the docs array, and such global ids are converted back to segment level ids (subtracting the context docbase) when scoring docs. RescoreKnnVectoryQuery did not sort the array of docs which caused binary search to return non deterministic results, which in turn made us look up wrong docs, something using out of bound ids. One symptom of this was observed in a DFSProfilerIT test failure which triggered a Lucene assertion around doc id being outside of the range of the bitset of live docs. The fix is to simply sort the score docs array before extracting docs ids and scores and providing them to KnnScoreDocQuery upon rewrite. Relates to elastic#116663 Closes elastic#119711
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of doc ids and corresponding array including scores fo such docs. A binary search is performed on top of the docs array, and such global ids are converted back to segment level ids (subtracting the context docbase) when scoring docs. RescoreKnnVectoryQuery did not sort the array of docs which caused binary search to return non deterministic results, which in turn made us look up wrong docs, something using out of bound ids. One symptom of this was observed in a DFSProfilerIT test failure which triggered a Lucene assertion around doc id being outside of the range of the bitset of live docs. The fix is to simply sort the score docs array before extracting docs ids and scores and providing them to KnnScoreDocQuery upon rewrite. Relates to #116663 Closes #119711
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of doc ids and corresponding array including scores fo such docs. A binary search is performed on top of the docs array, and such global ids are converted back to segment level ids (subtracting the context docbase) when scoring docs. RescoreKnnVectoryQuery did not sort the array of docs which caused binary search to return non deterministic results, which in turn made us look up wrong docs, something using out of bound ids. One symptom of this was observed in a DFSProfilerIT test failure which triggered a Lucene assertion around doc id being outside of the range of the bitset of live docs. The fix is to simply sort the score docs array before extracting docs ids and scores and providing them to KnnScoreDocQuery upon rewrite. Relates to #116663 Closes #119711
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of doc ids and corresponding array including scores fo such docs. A binary search is performed on top of the docs array, and such global ids are converted back to segment level ids (subtracting the context docbase) when scoring docs. RescoreKnnVectoryQuery did not sort the array of docs which caused binary search to return non deterministic results, which in turn made us look up wrong docs, something using out of bound ids. One symptom of this was observed in a DFSProfilerIT test failure which triggered a Lucene assertion around doc id being outside of the range of the bitset of live docs. The fix is to simply sort the score docs array before extracting docs ids and scores and providing them to KnnScoreDocQuery upon rewrite. Relates to #116663 Closes #119711
Adds a new parameter to kNN based searches (
rescore_vector.num_candidates_factor
) that is used to:This approach uses a
FunctionScoreQuery
to replace scores using aVectorSimilarity
basedDoubleValueSource
.The
DenseVectorFieldMapper
is the one creating the Query - we could have done the rewriting on theKnnVectorQueryBuilder
, but given the new code had to access thevectorSimilarity
and other internals, it felt more natural to include it in the field mapper.The parameter has been added to knn section, knn query and knn retriever.
Usage:
Documentation will be added as a separate PR once this is merged.
Backlog