[ML] AucRoc gives misleading results when num_top_classes is set too low. #63306
Comments
Pinging @elastic/ml-core (:ml)
I think having a special value of But if we ever increase the number of classes we allow, this default value would have to change. I think decoupling them is prudent.
I think so too. I've implemented this idea on the C++ side in elastic/ml-cpp#1526
All the changes are in, so I consider this issue solved.
In the case of multiclass classification, the calculation of AucRoc should require that the class in question appears in every document's top classes array, so that we know its probability for every document.
Otherwise, the results are not correct or, in some cases, as pointed out by @wwang500, the evaluation request fails because it cannot find even a single document with the class in question listed in top classes.
The solution is to set num_top_classes so that it is greater than or equal to the total number of classes. We should minimize the surprise for users, though, and possibly apply a sensible default ourselves.
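To illustrate the effect, here is a minimal Python sketch. It is not the Elasticsearch implementation: the helper names (`auc_roc`, `truncate`, `one_vs_rest_auc`), the toy documents, and the pairwise AUC formula are all assumptions made for the example. It computes a one-vs-rest AUC ROC for class `c` using only the documents whose truncated top classes still contain `c`.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: (actual class, full predicted probabilities).
DOCS: List[Tuple[str, Dict[str, float]]] = [
    ("c", {"a": 0.5, "b": 0.3, "c": 0.2}),
    ("c", {"a": 0.1, "b": 0.2, "c": 0.7}),
    ("a", {"a": 0.6, "b": 0.1, "c": 0.3}),
    ("b", {"a": 0.2, "b": 0.7, "c": 0.1}),
    ("a", {"a": 0.45, "b": 0.15, "c": 0.4}),
]


def auc_roc(pairs: List[Tuple[float, bool]]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [score for score, is_pos in pairs if is_pos]
    neg = [score for score, is_pos in pairs if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative document")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def truncate(probs: Dict[str, float], k: int) -> Dict[str, float]:
    """Keep only the k most probable classes, mimicking num_top_classes = k."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(top)


def one_vs_rest_auc(
    docs: List[Tuple[str, Dict[str, float]]], cls: str, k: int
) -> Tuple[float, int]:
    """Return (AUC for cls, number of documents that could be used)."""
    pairs = []
    for actual, probs in docs:
        top = truncate(probs, k)
        if cls in top:  # documents without cls in top classes are lost
            pairs.append((top[cls], actual == cls))
    return auc_roc(pairs), len(pairs)


if __name__ == "__main__":
    print(one_vs_rest_auc(DOCS, "c", k=3))  # all classes kept
    print(one_vs_rest_auc(DOCS, "c", k=2))  # class "c" missing from two documents
```

With this toy data, keeping all three classes uses all five documents and gives an AUC of about 0.67, while `k=2` silently drops two documents and reports a perfect 1.0. With `k=1` only one document still lists `c`, so the sketch raises an error, analogous to the failing evaluation request described above.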