-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add an example for static pipeline with in-memory data and show how to get class probabilities #1953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This may be worthwhile as an example, but it would be better if we could create a static pipeline out of a type directly, since the entire purpose of the static pipeline idiom is type safety and compile time checks, which should be achievable, and yet which this does not do. So perhaps a note in the sample that it is not intended to be a permanent solution, and is intended not a serious suggestion forever but more as a stopgap solution until the static pipeline evolves further? |
@TomFinley, I added your comment to an inline comment. Do you think it's ok now? |
/// <summary> | ||
/// Helper function used to generate <see cref="NativeExample"/>s. | ||
/// </summary> | ||
private static List<NativeExample> GenerateRandomExamples(int count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private static List GenerateRandomExamples(int count) [](start = 5, length = 71)
I'd move this in the SampleUtils project, and reference it from the test, and from here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we move this to SamplesUtils, then user will not able to see a full example in doc.
In reply to: 244875572 [](ancestors = 244875572)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe the main purpose of those examples in the docs are to get copy/pasted in a VS project, add the necessary references and study it there, otherwise int he web page many things don't make sense.
There is also guideline to keep those examples small, and too the point, otherwise their length is an immediate turnoff for average user.
In reply to: 245743767 [](ancestors = 245743767,244875572)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Will move NativeExample and its generator to SamplesUtils.
In reply to: 245796925 [](ancestors = 245796925,245743767,244875572)
{ | ||
// Create a general context for ML.NET operations. It can be used for exception tracking and logging, | ||
// as a catalog of available operations and as the source of randomness. | ||
var mlContext = new MLContext(seed: 1, conc: 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seed: 1, conc: 1 [](start = 42, length = 16)
seed: 1, conc: 1 [](start = 42, length = 16)
let's omit those in samples, and go with defaults as much as we can. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var mlContext = new MLContext(seed: 1, conc: 1); | ||
|
||
// Context for calling static classifiers. It contains constructors of classifiers and evaluation utilities. | ||
var ctx = new MulticlassClassificationContext(mlContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this, and use mlContext.MulticlassClassificationContext further down #Closed
|
||
// Show prediction result for the 3rd example. | ||
var nativePrediction = nativePredictions[2]; | ||
Console.WriteLine("Our predicted label to this example is {0} with probability {1}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Console.WriteLine(" [](start = 11, length = 20)
have a comment for the WriteLines showing here directly what the output would be. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return examples; | ||
} | ||
|
||
public void MultiClassLightGbmStaticPipelineWithInMemoryData() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MultiClassLightGbmStaticPipelineWithInMemoryData [](start = 20, length = 48)
i think most of the emphasis in 1.0 is on the dynamic API. Would this example add value as a dynamic sample? #WontFix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
namespace Microsoft.ML.Samples.Static | ||
{ | ||
class LightGBMMulticlassWithInMemoryData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LightGBMMulticlassWithInMemoryData [](start = 10, length = 34)
this needs to be referenced from some other file, otherwise it won't display in the documentation.
Maybe from the LightGBMStatics catalog, if we have one already? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do it following pattern for MF. Please take a look at LightGbmStaticExtensions.cs in the next iteration.
In reply to: 244876344 [](ancestors = 244876344)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @wschin , looks generally good. Only thing is the header.
Yeah, that'll do. In reply to: 450924472 [](ancestors = 450924472) |
Example files usually do contain a header because they could be encapsulated in another class' doc string. In reply to: 452028645 [](ancestors = 452028645) Refers to: docs/samples/Microsoft.ML.Samples/Static/LightGBMMulticlassWithInMemoryData.cs:1 in 5aa58c3. [](commit_id = 5aa58c3, deletion_comment = False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixes #1947 and fixes #1881.