Skip to content

Add an example for static pipeline with in-memory data and show how to get class probabilities #1953

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jan 8, 2019

Conversation

wschin
Copy link
Member

@wschin wschin commented Dec 22, 2018

Fixes #1947 and fixes #1881.

@TomFinley
Copy link
Contributor

TomFinley commented Dec 23, 2018

This may be worthwhile as an example, but it would be better if we could create a static pipeline out of a type directly, since the entire purpose of the static pipeline idiom is type safety and compile time checks, which should be achievable, and yet which this does not do. So perhaps a note in the sample that it is not intended to be a permanent solution, and is intended not a serious suggestion forever but more as a stopgap solution until the static pipeline evolves further?

@wschin
Copy link
Member Author

wschin commented Jan 2, 2019

@TomFinley, I added your comment to an inline comment. Do you think it's ok now?

/// <summary>
/// Helper function used to generate <see cref="NativeExample"/>s.
/// </summary>
private static List<NativeExample> GenerateRandomExamples(int count)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private static List GenerateRandomExamples(int count) [](start = 5, length = 71)

I'd move this in the SampleUtils project, and reference it from the test, and from here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we move this to SamplesUtils, then user will not able to see a full example in doc.


In reply to: 244875572 [](ancestors = 244875572)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe the main purpose of those examples in the docs are to get copy/pasted in a VS project, add the necessary references and study it there, otherwise int he web page many things don't make sense.

There is also guideline to keep those examples small, and too the point, otherwise their length is an immediate turnoff for average user.


In reply to: 245743767 [](ancestors = 245743767,244875572)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Will move NativeExample and its generator to SamplesUtils.


In reply to: 245796925 [](ancestors = 245796925,245743767,244875572)

{
// Create a general context for ML.NET operations. It can be used for exception tracking and logging,
// as a catalog of available operations and as the source of randomness.
var mlContext = new MLContext(seed: 1, conc: 1);
Copy link
Member

@sfilipi sfilipi Jan 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seed: 1, conc: 1 [](start = 42, length = 16)

seed: 1, conc: 1 [](start = 42, length = 16)

let's omit those in samples, and go with defaults as much as we can. #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.


In reply to: 244875689 [](ancestors = 244875689)

var mlContext = new MLContext(seed: 1, conc: 1);

// Context for calling static classifiers. It contains constructors of classifiers and evaluation utilities.
var ctx = new MulticlassClassificationContext(mlContext);
Copy link
Member

@sfilipi sfilipi Jan 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this, and use mlContext.MulticlassClassificationContext further down #Closed


// Show prediction result for the 3rd example.
var nativePrediction = nativePredictions[2];
Console.WriteLine("Our predicted label to this example is {0} with probability {1}",
Copy link
Member

@sfilipi sfilipi Jan 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Console.WriteLine(" [](start = 11, length = 20)

have a comment for the WriteLines showing here directly what the output would be. #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. Will do the same for Line 159 below.


In reply to: 244876086 [](ancestors = 244876086)

return examples;
}

public void MultiClassLightGbmStaticPipelineWithInMemoryData()
Copy link
Member

@sfilipi sfilipi Jan 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MultiClassLightGbmStaticPipelineWithInMemoryData [](start = 20, length = 48)

i think most of the emphasis in 1.0 is on the dynamic API. Would this example add value as a dynamic sample? #WontFix

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe in another PR if really needed.


In reply to: 244876215 [](ancestors = 244876215)


namespace Microsoft.ML.Samples.Static
{
class LightGBMMulticlassWithInMemoryData
Copy link
Member

@sfilipi sfilipi Jan 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LightGBMMulticlassWithInMemoryData [](start = 10, length = 34)

this needs to be referenced from some other file, otherwise it won't display in the documentation.

Maybe from the LightGBMStatics catalog, if we have one already? #Closed

Copy link
Member Author

@wschin wschin Jan 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do it following pattern for MF. Please take a look at LightGbmStaticExtensions.cs in the next iteration.


In reply to: 244876344 [](ancestors = 244876344)

@TomFinley
Copy link
Contributor

TomFinley commented Jan 7, 2019

using Microsoft.ML.Data;

Header? #ByDesign


Refers to: docs/samples/Microsoft.ML.Samples/Static/LightGBMMulticlassWithInMemoryData.cs:1 in 5aa58c3. [](commit_id = 5aa58c3, deletion_comment = False)

Copy link
Contributor

@TomFinley TomFinley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wschin , looks generally good. Only thing is the header.

@TomFinley
Copy link
Contributor

Yeah, that'll do.


In reply to: 450924472 [](ancestors = 450924472)

@wschin
Copy link
Member Author

wschin commented Jan 7, 2019

using Microsoft.ML.Data;

Example files usually do contain a header because they could be encapsulated in another class' doc string.


In reply to: 452028645 [](ancestors = 452028645)


Refers to: docs/samples/Microsoft.ML.Samples/Static/LightGBMMulticlassWithInMemoryData.cs:1 in 5aa58c3. [](commit_id = 5aa58c3, deletion_comment = False)

Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@wschin wschin merged commit 17bdb98 into dotnet:master Jan 8, 2019
@wschin wschin deleted the an-example branch January 8, 2019 19:35
@ghost ghost locked as resolved and limited conversation to collaborators Mar 25, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Need Example of Generating DataView and Feeding It to Static Pipeline Multi classification - Probability
3 participants