Skip to content

Static pipeline column indexers, binary/regression evaluators #869

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 10, 2018

Conversation

TomFinley
Copy link
Contributor

  • Infrastructure for column indexers
  • Binary classification and regression evaluators

Fixes #868.

@TomFinley TomFinley self-assigned this Sep 8, 2018
@TomFinley TomFinley added API Issues pertaining the friendly API and removed Answered labels Sep 8, 2018
internal IndexHelper(SchemaBearing<T> schematized)
{
Indices = StaticPipeInternalUtils.MakeAnalysisInstance<T>(out var rec);
// We don't really care about this, but we need to work over something.
Copy link
Contributor

@Zruty0 Zruty0 Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really care about this, but we need to work over something. [](start = 19, length = 68)

I'm sorry, I completely don't understand what is going on. The comment adds negative clarity. #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, you're constructing a method just to manufacture a ParameterInfo. Okay. Then you're calling GetNamesValues, but you only ever need values.

At least add a comment to GetNamesValues please


In reply to: 216415488 [](ancestors = 216415488)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very well.


In reply to: 216416576 [](ancestors = 216416576,216415488)

Copy link
Contributor

@Zruty0 Zruty0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

public class Result
{
/// <summary>
/// Gets the area under the ROC curve.
Copy link
Member

@sfilipi sfilipi Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gets [](start = 16, length = 4)

not necessary on this PR, but maybe we can provide links to the longer explanation of the metrics (wikipedia or the MSFT doc on RML, maybe?), since those comments will be the user-facing documentation about the metrics (I assume) #Pending

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this might be helpful. Maybe we can review later, frankly I don't quite know how to insert links in XML docs since AFAIK <a is not a supported tag.


In reply to: 216470620 [](ancestors = 216470620)

/// precision/recall curve. It is increasingly used in the machine learning community, particularly
/// for imbalanced datasets where one class is observed more frequently than the other. On these
/// datasets, AUPRC can highlight performance differences that are lost with AUC.
/// </remarks>
Copy link
Member

@sfilipi sfilipi Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, really like the explanation of the metric. #Resolved

Copy link
Contributor Author

@TomFinley TomFinley Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually just copy pasted it from the soon-to-be-deprecated metrics structure. ;) I'm not sure whom you should thank then. @zeahmed maybe? Someone else? #Closed

/// The log-loss reduction is scaled relative to a classifier that predicts the prior for every example:
/// (LL(prior) - LL(classifier)) / LL(prior)
/// This metric can be interpreted as the advantage of the classifier over a random prediction.
/// E.g., if the RIG equals 20, it can be interpreted as "the probability of a correct prediction is
Copy link
Member

@sfilipi sfilipi Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

" [](start = 69, length = 1)

I think you might need to XML escape -> " #Resolved

/// <param name="data">The data to evaluate.</param>
/// <param name="label">The index delegate for the label column.</param>
/// <param name="pred">The index delegate for columns from calibrated prediction of a binary classifier.
/// Under typical scenarios, this will just be the same tuple of results returned from the trainer.</param>
Copy link
Member

@sfilipi sfilipi Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under typical scenarios, this will just be the same tuple of results returned from the trainer [](start = 12, length = 94)

When I read this, I can't help thinking: under what scenarios this won't be the case. #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, you could imagine someone composing their own dataset, not training anything or doing anything but loading a file. But I didn't feel the need to call this out.


In reply to: 216473441 [](ancestors = 216473441)

/// </summary>
/// <remarks>
/// The absolute loss is defined as
/// L1 = (1/m) * sum( abs( yi - y'i))
Copy link
Member

@sfilipi sfilipi Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

' [](start = 45, length = 1)

XML escape the apostrophe in those metrics: ' #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Though the comment does appear to render correctly from the place where I copied it.


In reply to: 216474189 [](ancestors = 216474189)

@@ -13,7 +14,9 @@ namespace Microsoft.ML.Data.StaticPipe.Runtime
{
/// <summary>
/// Utility methods for components that want to expose themselves in the idioms of the statically-typed pipelines.
/// These utilities are meant to be called by and useful to component authors, not users of those components.
/// These utilities are meant to be called by and useful to component authors, not users of those components. The
/// purpose is not to keep them hidden per se, but rather out of the face of users that are just trying to use the
Copy link
Member

@sfilipi sfilipi Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of the face of users [](start = 62, length = 24)

out of the attention of users #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I did not describe this well -- I think the flaw in my writing goes a bit beyond simple word choice. Let me rephrase this sentence entirely.


In reply to: 216475276 [](ancestors = 216475276)

Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

* Infrastructure for column indexers
* Binary classification and regression evaluators
@TomFinley TomFinley merged commit 627ad79 into dotnet:master Sep 10, 2018
@TomFinley TomFinley deleted the tfinley/Eval branch September 10, 2018 23:05
Zruty0 pushed a commit to Zruty0/machinelearning that referenced this pull request Sep 11, 2018
…#869)

Static pipeline column indexers, binary/regression evaluators

* Infrastructure for column indexers
* Binary classification and regression evaluators
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
API Issues pertaining the friendly API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants