Vector rescoring oversamples k instead of num_candidates #119835

carlosdelest · 2025-01-09T09:11:19Z

It makes more sense to apply rescoring to an oversampled k instead of num_candidates: rescoring just a fraction of the candidates is more performant and still offers good recall, especially when k is small compared to the number of candidates.

The API changes so that rescore_vector takes oversample instead of num_candidates_factor:

GET msmarco-v2-bbq/_search
{
    "query": {
        "knn": {
            "field": "emb",
            "query_vector": [...],
            "k": 10,
            "num_candidates": 100,
            "rescore_vector": {
                "oversample": 2.5
            }
        }
    }
}

This means that each shard rescores the top k * oversample of the num_candidates it retrieved and returns the top k of them. In the example above, each shard rescores 10 * 2.5 = 25 of its 100 candidates.
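The per-shard behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the Elasticsearch implementation: the function names, the `exact_score` callback, and rounding the window up with `ceil` are all assumptions made for the example.

```python
import math
from typing import Callable, List, Tuple

Candidate = Tuple[str, float]  # (doc_id, approximate score from the quantized index)


def rescoring_window(k: int, oversample: float) -> int:
    # Assumption: the window is k * oversample rounded up.
    return math.ceil(k * oversample)


def rescore_on_shard(
    candidates: List[Candidate],
    k: int,
    oversample: float,
    exact_score: Callable[[str], float],
) -> List[Candidate]:
    """Hypothetical sketch of rescoring on one shard.

    `candidates` stands for the num_candidates hits the shard retrieved.
    Only the best k * oversample of them (by approximate score) are
    rescored with `exact_score`; the top k rescored hits are returned.
    """
    window = rescoring_window(k, oversample)
    # Keep only the best `window` candidates by approximate score.
    best = sorted(candidates, key=lambda c: c[1], reverse=True)[:window]
    # Rescore that subset with the (more expensive) exact similarity.
    rescored = [(doc_id, exact_score(doc_id)) for doc_id, _ in best]
    rescored.sort(key=lambda c: c[1], reverse=True)
    return rescored[:k]


if __name__ == "__main__":
    approx = [("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.6)]
    exact = {"a": 0.5, "b": 0.95, "c": 0.99, "d": 0.1}
    # With k=1 and oversample=2.0 only "a" and "b" are rescored,
    # so "c" is never considered despite its high exact score.
    print(rescore_on_shard(approx, k=1, oversample=2.0, exact_score=exact.__getitem__))
```

The small demo also shows the trade-off: documents outside the oversampled window are never rescored, which is why a larger oversample can improve recall at the cost of more exact-score computations.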

Follow up to #116663

elasticsearchmachine · 2025-01-10T16:12:20Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

benwtrent

Great stuff! You may have issues with the api compat tests and thus they may need to be muted before backport. I am not sure.

carlosdelest · 2025-01-10T20:34:16Z

You may have issues with the api compat tests and thus they may need to be muted before backport. I am not sure.

I hope not as I changed the capability name - I'll keep an eye on this 👍

elasticsearchmachine · 2025-01-10T20:35:10Z

💚 Backport successful

Status	Branch	Result
✅	8.x

Use oversample to modify k instead of num_candidates for rescoring

04ed6e4

elasticsearchmachine added the v9.0.0 label Jan 9, 2025

carlosdelest added the >non-issue, :Search Relevance/Vectors, Team:Search Relevance, v8.18.0, and auto-backport labels Jan 9, 2025

carlosdelest mentioned this pull request Jan 9, 2025

Vector rescoring - Simplify code for k == null #118997

Merged

carlosdelest changed the base branch from main to 8.x January 9, 2025 09:52

carlosdelest changed the base branch from 8.x to main January 9, 2025 09:53

carlosdelest added 4 commits January 9, 2025 18:04

Renaming typo

6ed95a4

Fix test

e28d3a1

Change capability name so BwC tests run correctly

ab7ca58

Merge branch 'main' into non-issue/rescore-vector-oversample-using-k

9d9836a

carlosdelest marked this pull request as ready for review January 10, 2025 16:11

carlosdelest requested a review from benwtrent January 10, 2025 16:13

benwtrent approved these changes Jan 10, 2025

carlosdelest merged commit 8ca062a into elastic:main Jan 10, 2025
16 checks passed

carlosdelest added a commit to carlosdelest/elasticsearch that referenced this pull request Jan 10, 2025

Vector rescoring oversamples k instead of num_candidates (elastic#119835)

ff562f6

carlosdelest mentioned this pull request Jan 10, 2025

[8.x] Vector rescoring oversamples k instead of num_candidates (#119835) #119996

Merged

elasticsearchmachine pushed a commit that referenced this pull request Jan 10, 2025

Vector rescoring oversamples k instead of num_candidates (#119835) (#119996)

dc5c8b3

This was referenced Jan 13, 2025

[8.x] Vector rescoring oversamples k instead of num_candidates #119887

Closed

[Docs] kNN vector rescoring for quantized vectors #118425

Merged

This was referenced Jan 20, 2025

Update knn search and query template autocomplete elastic/kibana#207187

Merged

Add rescore_vector to knn query, knn section and knn retriever elastic/elasticsearch-specification#3553

Merged
