-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Static pipeline column indexers, binary/regression evaluators #869
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
internal IndexHelper(SchemaBearing<T> schematized) | ||
{ | ||
Indices = StaticPipeInternalUtils.MakeAnalysisInstance<T>(out var rec); | ||
// We don't really care about this, but we need to work over something. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't really care about this, but we need to work over something. [](start = 19, length = 68)
I'm sorry, I completely don't understand what is going on. The comment adds negative clarity. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, you're constructing a method just to manufacture a ParameterInfo
. Okay. Then you're calling GetNamesValues
, but you only ever need values.
At least add a comment to GetNamesValues
please
In reply to: 216415488 [](ancestors = 216415488)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class Result | ||
{ | ||
/// <summary> | ||
/// Gets the area under the ROC curve. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gets [](start = 16, length = 4)
not necessary on this PR, but maybe we can provide links to the longer explanation of the metrics (wikipedia or the MSFT doc on RML, maybe?), since those comments will be the user-facing documentation about the metrics (I assume) #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this might be helpful. Maybe we can review later, frankly I don't quite know how to insert links in XML docs since AFAIK <a
is not a supported tag.
In reply to: 216470620 [](ancestors = 216470620)
/// precision/recall curve. It is increasingly used in the machine learning community, particularly | ||
/// for imbalanced datasets where one class is observed more frequently than the other. On these | ||
/// datasets, AUPRC can highlight performance differences that are lost with AUC. | ||
/// </remarks> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, really like the explanation of the metric. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually just copy pasted it from the soon-to-be-deprecated metrics structure. ;) I'm not sure whom you should thank then. @zeahmed maybe? Someone else? #Closed
/// The log-loss reduction is scaled relative to a classifier that predicts the prior for every example: | ||
/// (LL(prior) - LL(classifier)) / LL(prior) | ||
/// This metric can be interpreted as the advantage of the classifier over a random prediction. | ||
/// E.g., if the RIG equals 20, it can be interpreted as "the probability of a correct prediction is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
" [](start = 69, length = 1)
I think you might need to XML escape -> " #Resolved
/// <param name="data">The data to evaluate.</param> | ||
/// <param name="label">The index delegate for the label column.</param> | ||
/// <param name="pred">The index delegate for columns from calibrated prediction of a binary classifier. | ||
/// Under typical scenarios, this will just be the same tuple of results returned from the trainer.</param> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under typical scenarios, this will just be the same tuple of results returned from the trainer [](start = 12, length = 94)
When I read this, I can't help thinking: under what scenarios this won't be the case. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, you could imagine someone composing their own dataset, not training anything or doing anything but loading a file. But I didn't feel the need to call this out.
In reply to: 216473441 [](ancestors = 216473441)
/// </summary> | ||
/// <remarks> | ||
/// The absolute loss is defined as | ||
/// L1 = (1/m) * sum( abs( yi - y'i)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
' [](start = 45, length = 1)
XML escape the apostrophe in those metrics: ' #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Though the comment does appear to render correctly from the place where I copied it.
In reply to: 216474189 [](ancestors = 216474189)
@@ -13,7 +14,9 @@ namespace Microsoft.ML.Data.StaticPipe.Runtime | |||
{ | |||
/// <summary> | |||
/// Utility methods for components that want to expose themselves in the idioms of the statically-typed pipelines. | |||
/// These utilities are meant to be called by and useful to component authors, not users of those components. | |||
/// These utilities are meant to be called by and useful to component authors, not users of those components. The | |||
/// purpose is not to keep them hidden per se, but rather out of the face of users that are just trying to use the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of the face of users [](start = 62, length = 24)
out of the attention of users #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I did not describe this well -- I think the flaw in my writing goes a bit beyond simple word choice. Let me rephrase this sentence entirely.
In reply to: 216475276 [](ancestors = 216475276)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Infrastructure for column indexers * Binary classification and regression evaluators
77da04d
to
8a516f9
Compare
8a516f9
to
6914856
Compare
…#869) Static pipeline column indexers, binary/regression evaluators * Infrastructure for column indexers * Binary classification and regression evaluators
Fixes #868.