[ML] Fix weights to maximize minimum recall for multiclass classification #1231

tveasey · 2020-05-12T10:11:50Z

This is a correction for #1113.

We were computing the objective on the full training set rather than on the sampled mask. Normally this is fine: because we use a stratified sample, the difference acts roughly like a constant scaling of the objective. However, when there is very little training data we can run into problems, such as a class that is present in the training set being missing from the sample set. It also means this code is much slower than intended on very large data sets.
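To illustrate the failure mode, here is a minimal Python sketch. It is not the ml-cpp implementation: the names `recalls`, `min_recall`, `PROBS`, `LABELS`, `SAMPLE_MASK` and `FULL_MASK` are hypothetical, and the data are made up. It assumes the predicted class is the argmax of weight times predicted probability, and evaluates the minimum recall once on a sample mask and once on every row.

```python
# Illustrative sketch only; all names and data here are hypothetical.

def recalls(weights, probs, labels, mask):
    """Per-class recall over the rows selected by mask.

    The predicted class is the argmax of weight * probability.
    Classes with no selected rows are absent from the result.
    """
    hits, counts = {}, {}
    for row_probs, label, use in zip(probs, labels, mask):
        if not use:
            continue
        scores = [w * p for w, p in zip(weights, row_probs)]
        predicted = max(range(len(scores)), key=scores.__getitem__)
        counts[label] = counts.get(label, 0) + 1
        hits[label] = hits.get(label, 0) + (1 if predicted == label else 0)
    return {c: hits[c] / counts[c] for c in counts}


def min_recall(weights, probs, labels, mask):
    """The objective: the smallest per-class recall over the selected rows."""
    return min(recalls(weights, probs, labels, mask).values())


# Tiny three-class data set; class 2 has a single row.
PROBS = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.2, 0.3],
]
LABELS = [0, 0, 1, 1, 2]
SAMPLE_MASK = [True, False, True, True, False]  # the sample misses class 2
FULL_MASK = [True] * len(LABELS)

UNIT_WEIGHTS = [1.0, 1.0, 1.0]
on_sample = min_recall(UNIT_WEIGHTS, PROBS, LABELS, SAMPLE_MASK)
on_full = min_recall(UNIT_WEIGHTS, PROBS, LABELS, FULL_MASK)
```

With so few rows the two objectives range over different sets of classes, so they are no longer related by a simple scaling. Evaluating on the full mask also touches every row, which is where the extra cost on large data sets comes from.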

Change #1113 hasn't been released, so I've marked this as a non-issue, but this fix must go out at the same time, in 7.8.

droberts195

LGTM

Maximise minimum recall for multiclass should have been computing the objective on the sample set

c38a1dd

tveasey added >bug review >non-issue v8.0.0 :ml/DataFrameAnalysis v7.8.0 v7.9.0 labels May 12, 2020

droberts195 approved these changes May 12, 2020

Relax test threshold slightly

fc36e30

tveasey merged commit 9e68204 into elastic:master May 12, 2020

tveasey deleted the maximize-minimum-recall-bug branch May 12, 2020 16:29

tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request May 13, 2020

[ML] Fix weights to maximize minimum recall for multiclass classification (elastic#1231)

94e6494

tveasey mentioned this pull request May 13, 2020

[7.9][ML] Fix weights to maximize minimum recall for multiclass classification #1235

Merged

tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request May 13, 2020

[ML] Fix weights to maximize minimum recall for multiclass classification (elastic#1231)

2f63912

tveasey mentioned this pull request May 13, 2020

[7.8][ML] Fix weights to maximize minimum recall for multiclass classification #1236

Merged

tveasey added a commit that referenced this pull request May 13, 2020

[ML] Fix weights to maximize minimum recall for multiclass classification (#1235) Backport #1231.

21b7c55

tveasey added a commit that referenced this pull request May 13, 2020

[ML] Fix weights to maximize minimum recall for multiclass classification (#1236) Backport #1231.

770e2a2

tveasey mentioned this pull request May 14, 2020

[ML] Fix weights to maximize minimum recall for multiclass classification when the training data is missing classes #1239

Merged
