Skip to content

kNN vector rescoring for quantized vectors #116663

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

carlosdelest
Copy link
Member

@carlosdelest carlosdelest commented Nov 12, 2024

Adds a new parameter to kNN based searches (rescore_vector.num_candidates_factor) that is used to:

  • Multiply the number of candidates in the kNN query
  • Perform an approximate search in the extended candidate set
  • Perform an exact search over the returned results, and get the top k

This approach uses a FunctionScoreQuery to replace scores using a VectorSimilarity based DoubleValueSource.

The DenseVectorFieldMapper is the one creating the Query - we could have done the rewriting on the KnnVectorQueryBuilder, but given the new code had to access the vectorSimilarity and other internals, it felt more natural to include it in the field mapper.

The parameter has been added to knn section, knn query and knn retriever.

Usage:

GET msmarco-v2-bbq/_search
{
    "query": {
        "knn": {
            "field": "emb",
            "query_vector": [...],
            "k": 10,
            "num_candidates": 100,
            "rescore_vector": {
                "num_candidates_factor": 10.0
            }
        }
    }
}

Documentation will be added as a separate PR once this is merged.

Backlog

  • Oversample and rescore if its a quantized type - checking indexOptions
  • Use an object with a parameter on the knn query
  • Add to the kNN section
  • Add to the kNN retriever
  • Implement profiling
  • Testing

Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! this is more what I was thinking. Nice use of functionscore!

A couple of things:

  • we should only oversample & rescore if its a quantized type (you can see this via indexOptions)
  • I still think the API level thing needs to be an object with a parameter.
  • We need to make sure this works for knn query and for the top-level KNN object :)

@carlosdelest
Copy link
Member Author

Thanks for taking a look @benwtrent ! I was just trying to validate the overall direction. You're perfectly right in your comments, I was not aiming for that yet 🙂

I'll keep iterating on this idea and open up a proper draft and not a PoC.

We need to make sure this works for knn query and for the top-level KNN object :)

Is it OK to tackle that as separate PRs?

@benwtrent
Copy link
Member

Is it OK to tackle that as separate PRs?

If necessary, for sure. But both hit the same path on the shard level, its all about being able to declare the values at the parser level and propagate them down. If you did separate PRs, you then have separate transport versions, and likely separate capabilities for testing, etc. I am not 100% sure its worth it?

@carlosdelest carlosdelest changed the title PoC - Vector rescoring in kNN, take 2 WIP - Vector rescoring Nov 13, 2024
@carlosdelest
Copy link
Member Author

carlosdelest commented Nov 18, 2024

hey @benwtrent ! I'm making progress on this. I've added the rescoring oversample to the knn query, knn section and the knn retriever.

I still have to fix CI, use it just for quantized vectors, and add proper testing - and I'm already at 38 changed files. Do you still think this should go on a single PR, or should I start working on separate PRs for query, section and retriver?

@benwtrent
Copy link
Member

I still have to fix CI, use it just for quantized vectors, and add proper testing - and I'm already at 38 changed files. Do you still think this should go on a single PR, or should I start working on separate PRs for query, section and retriver?

The main "logic" changes are done down on the mapper. Which all these would use. The rest is serialization & testing. I think all in one is ok.

/**
* Wraps a kNN vector query to rescore the results using the non-quantized vectors
*/
public class KnnRescoreVectorQuery extends Query implements ProfilingQuery {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benwtrent I've created this new query to:

  • Wrap the KNN vector query with a function score for rescoring
  • Limit the number of results back from each shard when k is specified

LMKWYT!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is good. Once its ready for review, I can give it a thorough look over.

Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some small things in the new query. I haven't fully reviewed. But I will once its ready

byte[] byteTarget,
VectorSimilarityFunction vectorSimilarityFunction,
Integer k,
Query vectorQuery
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the knn query should also be a profiled query

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmm, gotcha. There can be multiple levels of wrapping:

  • KnnRescoreVectorQuery
  • VectorSimilarityQuery
  • EsKnnFloatVectorQuery

We'll probably need VectorSimilarityQuery to implement ProfilingQuery as well - right now using a similarity param means that profiling returns 0 vector op counts.

I'll fix that, thanks for the catch 👍


if (k == null) {
// No need to calculate top k - let the request size limit the results
return query;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be sure to account for the inner queries vector ops and ensure that we account for the total vector ops (quantized and non).

scores[i] = scoreDocs[i].score;
}

vectorOpsCount = scoreDocs.length;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems like it should be the inner query vector ops + the total docs gathered.

/**
* Wraps a kNN vector query to rescore the results using the non-quantized vectors
*/
public class KnnRescoreVectorQuery extends Query implements ProfilingQuery {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is good. Once its ready for review, I can give it a thorough look over.

@carlosdelest
Copy link
Member Author

@benwtrent This is ready for another round of review.

Main changes:

  • Renaming oversample to num_candidates_factor and make it work with num_candidates
  • Changed how queries are wrapped to ensure VectorSimilarityQuery wraps the new RescoreKnnVectorQuery, so similarity is applied to the rescored scoring
  • Updating some YAML tests to use MIP where applicable

LMKWYT!

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor things. but this looks good!

Comment on lines +233 to +240
@ParametersFactory
public static Iterable<Object[]> parameters() {
List<Object[]> params = new ArrayList<>();
params.add(new Object[] { true });
params.add(new Object[] { false });

return params;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤯

@benwtrent
Copy link
Member

And please don't forget about adding docs (either in this PR or in a follow up).

@carlosdelest carlosdelest merged commit 5996772 into elastic:main Dec 11, 2024
15 of 16 checks passed
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 116663

@carlosdelest
Copy link
Member Author

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

carlosdelest added a commit to carlosdelest/elasticsearch that referenced this pull request Dec 11, 2024
(cherry picked from commit 5996772)

# Conflicts:
#	server/src/main/java/org/elasticsearch/search/vectors/KnnSearchBuilder.java
#	x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/rrf/RRFRankBuilder.java
elasticsearchmachine pushed a commit that referenced this pull request Dec 11, 2024
* kNN vector rescoring for quantized vectors (#116663)

(cherry picked from commit 5996772)

# Conflicts:
#	server/src/main/java/org/elasticsearch/search/vectors/KnnSearchBuilder.java
#	x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/rrf/RRFRankBuilder.java

* FloatVectorValues have a different interface in this Lucene version
carlosdelest added a commit to carlosdelest/elasticsearch that referenced this pull request Jan 9, 2025
(cherry picked from commit 5996772)

# Conflicts:
#	server/src/main/java/org/elasticsearch/search/vectors/KnnSearchBuilder.java
#	x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/rrf/RRFRankBuilder.java
elasticsearchmachine pushed a commit that referenced this pull request Jan 9, 2025
(cherry picked from commit 5996772)

# Conflicts:
#	server/src/main/java/org/elasticsearch/search/vectors/KnnSearchBuilder.java
#	x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/rrf/RRFRankBuilder.java

Co-authored-by: Felix Barnsteiner <[email protected]>
javanna added a commit to javanna/elasticsearch that referenced this pull request Feb 14, 2025
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of
doc ids and corresponding array including scores fo such docs. A binary search is
performed on top of the docs array, and such global ids are converted back to
segment level ids (subtracting the context docbase) when scoring docs.

RescoreKnnVectoryQuery did not sort the array of docs which caused binary search
to return non deterministic results, which in turn made us look up wrong docs,
something using out of bound ids. One symptom of this was observed in a DFSProfilerIT
test failure which triggered a Lucene assertion around doc id being outside of the
range of the bitset of live docs.

The fix is to simply sort the score docs array before extracting docs ids and scores
and providing them to KnnScoreDocQuery upon rewrite.

Relates to elastic#116663

Closes elastic#119711
javanna added a commit that referenced this pull request Feb 15, 2025
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of
doc ids and corresponding array including scores fo such docs. A binary search is
performed on top of the docs array, and such global ids are converted back to
segment level ids (subtracting the context docbase) when scoring docs.

RescoreKnnVectoryQuery did not sort the array of docs which caused binary search
to return non deterministic results, which in turn made us look up wrong docs,
something using out of bound ids. One symptom of this was observed in a DFSProfilerIT
test failure which triggered a Lucene assertion around doc id being outside of the
range of the bitset of live docs.

The fix is to simply sort the score docs array before extracting docs ids and scores
and providing them to KnnScoreDocQuery upon rewrite.

Relates to #116663

Closes #119711
javanna added a commit to javanna/elasticsearch that referenced this pull request Feb 15, 2025
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of
doc ids and corresponding array including scores fo such docs. A binary search is
performed on top of the docs array, and such global ids are converted back to
segment level ids (subtracting the context docbase) when scoring docs.

RescoreKnnVectoryQuery did not sort the array of docs which caused binary search
to return non deterministic results, which in turn made us look up wrong docs,
something using out of bound ids. One symptom of this was observed in a DFSProfilerIT
test failure which triggered a Lucene assertion around doc id being outside of the
range of the bitset of live docs.

The fix is to simply sort the score docs array before extracting docs ids and scores
and providing them to KnnScoreDocQuery upon rewrite.

Relates to elastic#116663

Closes elastic#119711
javanna added a commit to javanna/elasticsearch that referenced this pull request Feb 15, 2025
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of
doc ids and corresponding array including scores fo such docs. A binary search is
performed on top of the docs array, and such global ids are converted back to
segment level ids (subtracting the context docbase) when scoring docs.

RescoreKnnVectoryQuery did not sort the array of docs which caused binary search
to return non deterministic results, which in turn made us look up wrong docs,
something using out of bound ids. One symptom of this was observed in a DFSProfilerIT
test failure which triggered a Lucene assertion around doc id being outside of the
range of the bitset of live docs.

The fix is to simply sort the score docs array before extracting docs ids and scores
and providing them to KnnScoreDocQuery upon rewrite.

Relates to elastic#116663

Closes elastic#119711
javanna added a commit to javanna/elasticsearch that referenced this pull request Feb 15, 2025
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of
doc ids and corresponding array including scores fo such docs. A binary search is
performed on top of the docs array, and such global ids are converted back to
segment level ids (subtracting the context docbase) when scoring docs.

RescoreKnnVectoryQuery did not sort the array of docs which caused binary search
to return non deterministic results, which in turn made us look up wrong docs,
something using out of bound ids. One symptom of this was observed in a DFSProfilerIT
test failure which triggered a Lucene assertion around doc id being outside of the
range of the bitset of live docs.

The fix is to simply sort the score docs array before extracting docs ids and scores
and providing them to KnnScoreDocQuery upon rewrite.

Relates to elastic#116663

Closes elastic#119711
elasticsearchmachine pushed a commit that referenced this pull request Feb 15, 2025
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of
doc ids and corresponding array including scores fo such docs. A binary search is
performed on top of the docs array, and such global ids are converted back to
segment level ids (subtracting the context docbase) when scoring docs.

RescoreKnnVectoryQuery did not sort the array of docs which caused binary search
to return non deterministic results, which in turn made us look up wrong docs,
something using out of bound ids. One symptom of this was observed in a DFSProfilerIT
test failure which triggered a Lucene assertion around doc id being outside of the
range of the bitset of live docs.

The fix is to simply sort the score docs array before extracting docs ids and scores
and providing them to KnnScoreDocQuery upon rewrite.

Relates to #116663

Closes #119711
elasticsearchmachine pushed a commit that referenced this pull request Feb 15, 2025
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of
doc ids and corresponding array including scores fo such docs. A binary search is
performed on top of the docs array, and such global ids are converted back to
segment level ids (subtracting the context docbase) when scoring docs.

RescoreKnnVectoryQuery did not sort the array of docs which caused binary search
to return non deterministic results, which in turn made us look up wrong docs,
something using out of bound ids. One symptom of this was observed in a DFSProfilerIT
test failure which triggered a Lucene assertion around doc id being outside of the
range of the bitset of live docs.

The fix is to simply sort the score docs array before extracting docs ids and scores
and providing them to KnnScoreDocQuery upon rewrite.

Relates to #116663

Closes #119711
elasticsearchmachine pushed a commit that referenced this pull request Feb 15, 2025
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of
doc ids and corresponding array including scores fo such docs. A binary search is
performed on top of the docs array, and such global ids are converted back to
segment level ids (subtracting the context docbase) when scoring docs.

RescoreKnnVectoryQuery did not sort the array of docs which caused binary search
to return non deterministic results, which in turn made us look up wrong docs,
something using out of bound ids. One symptom of this was observed in a DFSProfilerIT
test failure which triggered a Lucene assertion around doc id being outside of the
range of the bitset of live docs.

The fix is to simply sort the score docs array before extracting docs ids and scores
and providing them to KnnScoreDocQuery upon rewrite.

Relates to #116663

Closes #119711
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged backport pending >feature :Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants