-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Add MlClientDocumentationIT tests for classification. #47569
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add MlClientDocumentationIT tests for classification. #47569
Conversation
41c48de
to
d39605a
Compare
65152a7
to
c8df7f8
Compare
c8df7f8
to
230020a
Compare
Pinging @elastic/ml-core (:ml) |
Pinging @elastic/es-docs (>docs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I left a few nits and questions. Maybe nothing needs to be changed if it's obvious which of the duplicate class names to use, not possible to link to bookmarks on external sites, and we don't care about full stops at the end of numbered items. But they are things to at least consider.
...high-level/src/test/java/org/elasticsearch/client/documentation/MlClientDocumentationIT.java
Show resolved
Hide resolved
{ | ||
// tag::evaluate-data-frame-evaluation-regression | ||
Evaluation evaluation = | ||
new Regression( // <1> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Regression
class is not fully qualified. But I don't think the doc examples include the import
s. So this doesn't make it clear which package to choose when typing Regression
into an IDE and it suggests two possible classes that could be imported.
It might be best to rename one of the classes, or else fully qualify the name here as well as where the other one is used in the docs.
(Same for Classification
on line 3341.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
I've fully qualified the names Regression
and Classification
in this file for now.
LMK if you like the idea of renaming Regression
to RegressionEvaluation
and Classification
to ClassificationEvaluation
(or maybe have a different idea for naming). Then I could move on with renaming.
<2> Name of the field in the index. Its value denotes the actual (i.e. ground truth) label for an example. Must be either true or false. | ||
<3> Name of the field in the index. Its value denotes the probability (as per some ML algorithm) of the example being classified as positive. | ||
<4> The remaining parameters are the metrics to be calculated based on the two fields described above. | ||
<5> https://en.wikipedia.org/wiki/Precision_and_recall[Precision] calculated at thresholds: 0.4, 0.5 and 0.6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to link to the #Precision
bookmark on this page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean, instead of wikipedia link, or in addition?
Such a section does not exist yet on our page. Should I add it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can link to a specific bookmark on the Wikipedia page like this:
https://en.wikipedia.org/wiki/Precision_and_recall#Precision
I'm not sure it's possible in Asciidoc though. Maybe the #
causes a problem. If not don't worry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's what you meant.
Sure, done.
<3> Name of the field in the index. Its value denotes the probability (as per some ML algorithm) of the example being classified as positive. | ||
<4> The remaining parameters are the metrics to be calculated based on the two fields described above. | ||
<5> https://en.wikipedia.org/wiki/Precision_and_recall[Precision] calculated at thresholds: 0.4, 0.5 and 0.6 | ||
<6> https://en.wikipedia.org/wiki/Precision_and_recall[Recall] calculated at thresholds: 0.5 and 0.7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to link to the #Recall
bookmark on this page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my questions above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I'm happy to merge if the docs team is happy with the numbered lists.
e2e5d40
to
9a6a33c
Compare
9a6a33c
to
36e6446
Compare
run elasticsearch-ci/packaging-sample-matrix |
run elasticsearch-ci/packaging-sample |
This PR enhances client documentation tests with the new
classification
analysis type:testPutDataFrameAnalytics
testEvaluateDataFrame
,testEvaluateDataFrame_Classification
,testEvaluateDataFrame_Regression
Additionally, it adds basic java rest high-level docs related to
classification
.Relates #46735