LUCENE-8563: Remove k1+1 from the numerator of BM25Similarity #511
lucene/MIGRATE.txt
@@ -150,3 +150,11 @@ in order to support ToParent/ToChildBlockJoinQuery.

Normalization is now type-safe, with CharFilterFactory#normalize() returning a Reader and
TokenFilterFactory#normalize() returning a TokenFilter.

## k1+1 constant factor removed from BM25 similarity numerator
Can you append (LUCENE-8563) ## at the end of the line?
Sure!
lucene/MIGRATE.txt
constant factor was removed from the numerator of the scoring formula.
Ordering of results is preserved unless scores are computed from multiple
fields using different similarities. The previous behaviour is now exposed
through the LegacyBM25Similarity class.
maybe add that it can be found in the lucene-misc jar?
Yes, I wasn't sure how to phrase that. Will add.
Merged.
Patch for https://issues.apache.org/jira/browse/LUCENE-8563.
This PR removes the k1+1 factor from the numerator of BM25Similarity and adds a new LegacyBM25Similarity under misc that exposes the old behaviour. Note that I haven't found a way to easily reproduce the previous behaviour in the explain method, so I left that part out of LegacyBM25Similarity for now.
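
For readers who want to see why dropping the factor leaves ordering intact within a single similarity, here is a minimal standalone sketch. It does not use Lucene classes: the class name `Bm25NumeratorSketch`, its method names, and the sample values are all hypothetical, and the formula is the textbook BM25 term score, ignoring boosts and Lucene's norm encoding. Under those assumptions, the old score is simply (k1 + 1) times the new one.

```java
public class Bm25NumeratorSketch {

    // Length-normalisation term shared by both variants: k1 * (1 - b + b * dl / avgdl).
    public static double lengthNorm(double k1, double b, double docLen, double avgDocLen) {
        return k1 * (1 - b + b * docLen / avgDocLen);
    }

    // Old behaviour (assumed textbook form): the numerator carries the constant (k1 + 1) factor.
    public static double legacyScore(double idf, double freq, double k1, double b,
                                     double docLen, double avgDocLen) {
        return idf * freq * (k1 + 1) / (freq + lengthNorm(k1, b, docLen, avgDocLen));
    }

    // New behaviour: the same formula without the (k1 + 1) factor.
    public static double currentScore(double idf, double freq, double k1, double b,
                                      double docLen, double avgDocLen) {
        return idf * freq / (freq + lengthNorm(k1, b, docLen, avgDocLen));
    }

    public static void main(String[] args) {
        // Illustrative values only; they are not taken from the patch.
        double k1 = 1.2, b = 0.75, idf = 2.0, avgDocLen = 120;
        double[][] docs = { {3, 100}, {1, 40}, {5, 300} }; // {term frequency, document length}
        for (double[] d : docs) {
            double legacy = legacyScore(idf, d[0], k1, b, d[1], avgDocLen);
            double current = currentScore(idf, d[0], k1, b, d[1], avgDocLen);
            // The ratio should equal k1 + 1 for every document, up to rounding.
            System.out.printf("freq=%.0f len=%.0f legacy=%.4f current=%.4f ratio=%.4f%n",
                    d[0], d[1], legacy, current, legacy / current);
        }
    }
}
```

Because the ratio is the same constant for every document, ranking within one similarity is unchanged. When clauses on different fields use similarities with different k1 values (or a non-BM25 similarity), their scores are no longer rescaled by a common factor before being summed, which is the case in which ordering can change.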