[ML] Fix weights to maximize minimum recall for multiclass classification when the training data is missing classes #1239

Merged
merged 2 commits into from
May 15, 2020

@tveasey tveasey commented May 14, 2020

This is a follow-on from #1231.

We were still running into problems if the Java side doesn't sample a class at all in the training data. This points to a problem with the stratified sampling implementation in Java, but we need to make the weight calculation defensive regardless.

Since this is a correction to change #1113, which hasn't been released, I've marked this as a non-issue, but it must go out at the same time in 7.8.
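
To illustrate what a defensive weight calculation might look like when a class has no training examples, here is a minimal Python sketch. It is not the ml-cpp implementation: it substitutes plain inverse-frequency weighting for the minimum-recall optimisation, and the function name `class_weights`, its parameters, and the choice of a neutral weight of 1.0 for unseen classes are all assumptions made for illustration.

```python
from collections import Counter


def class_weights(training_labels, all_classes):
    """Hypothetical helper: inverse-frequency class weights that tolerate
    classes which never appear in the training labels."""
    counts = Counter(training_labels)
    present = [c for c in all_classes if counts[c] > 0]

    # Degenerate case: nothing was sampled, so fall back to uniform weights.
    if not present:
        return {c: 1.0 for c in all_classes}

    # Inverse-frequency weights, computed only for classes we actually saw.
    raw = {c: 1.0 / counts[c] for c in present}

    # Normalise so the weights of the present classes average to 1.
    scale = len(present) / sum(raw.values())
    weights = {c: raw[c] * scale for c in present}

    # Assumption: a class with no training examples gets a neutral weight
    # of 1.0 instead of triggering a division by zero.
    for c in all_classes:
        weights.setdefault(c, 1.0)
    return weights


if __name__ == "__main__":
    # Class "c" is declared but never sampled into the training data.
    print(class_weights(["a", "a", "a", "b"], ["a", "b", "c"]))
```

The point of the sketch is only the guard: the missing class is excluded from the calculation and given a fallback value, so the remaining weights stay finite and well defined.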

@tveasey tveasey changed the title from "[ML] Fix weights to maximize minimum recall for multiclass classification when the training data is some of the missing classes" to "[ML] Fix weights to maximize minimum recall for multiclass classification when the training data is missing classes" May 14, 2020
@tveasey tveasey requested a review from valeriy42 May 14, 2020 14:43

@valeriy42 valeriy42 left a comment

LGTM. Good work on catching the corner case.

tveasey commented May 15, 2020

The only CI failure was a Gradle issue on aarch64, so I'm going to merge.

@tveasey tveasey merged commit 505960a into elastic:master May 15, 2020
@tveasey tveasey deleted the minimum-recall-weights-robustness branch May 15, 2020 11:09
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request May 15, 2020
tveasey added a commit that referenced this pull request May 15, 2020
…tion when the training data is missing classes (#1256)

Backport #1239.
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request May 15, 2020
tveasey added a commit that referenced this pull request May 16, 2020
…tion when the training data is missing classes (#1262)

Backport #1239.