Multiclass Classification Samples Update #3322

artidoro · 2019-04-12T23:37:05Z

Tracked under #2522

This PR adds samples for LbfgsMaximumEntropy and SdcaNonCalibrated trainers.

This PR also removes the dependency on Samples Utils from the other multiclass classification samples and adds .tt files for all multiclass classification samples.

Note that this PR does not cover Naive Bayes, which is in progress in #3246.

artidoro · 2019-04-13T00:04:29Z

.../samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/SdcaNonCalibrated.cs

+            // Expected output:
+            //  Micro Accuracy: 0.91
+            //  Macro Accuracy: 0.91
+            //  Log Loss: 0.00


Notice that there is no EvaluateNonCalibrated method in the multiclass classification catalog.
The LogLoss metric does not make sense in this case.

I opened an issue #3323 to track this problem.
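To illustrate the concern, here is a minimal standalone sketch. It is not ML.NET's evaluator: the `LogLoss` helper, the clamping step, and the score values are all assumptions made for illustration. Log loss is the negative log of the probability assigned to the true class, so it presupposes calibrated probabilities; feeding it raw scores produces a number with no probabilistic meaning.

```csharp
using System;

public static class LogLossSketch
{
    // Hypothetical helper: per-example log loss, i.e. -log(p) where p is the
    // value the model assigned to the true class. The value is clamped to
    // [epsilon, 1] so that Math.Log always receives a valid argument.
    public static double LogLoss(double[] outputs, int trueClass)
    {
        const double epsilon = 1e-15;
        double p = Math.Min(Math.Max(outputs[trueClass], epsilon), 1.0);
        return -Math.Log(p);
    }

    public static void Main()
    {
        // Calibrated output: probabilities that sum to 1 (made-up values).
        double[] calibrated = { 0.7, 0.2, 0.1 };

        // Non-calibrated output: raw scores, not probabilities (made-up values).
        double[] rawScores = { 2.3, -0.4, 0.9 };

        // Meaningful: -log(0.7), a penalty for imperfect confidence.
        Console.WriteLine(LogLoss(calibrated, trueClass: 0));

        // Not meaningful: the score 2.3 is clamped to 1, so the loss collapses
        // to zero regardless of how good the model actually is.
        Console.WriteLine(LogLoss(rawScores, trueClass: 0));
    }
}
```

Under these assumptions, any raw score above 1 for the true class yields a loss of zero, which is one way a sample could end up printing a log loss that looks perfect but says nothing about the model.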

wschin · 2019-04-13T00:44:31Z

...soft.ML.Samples/Dynamic/Trainers/MulticlassClassification/MulticlassClassification.ttinclude

@@ -44,7 +51,12 @@ namespace Samples.Dynamic.Trainers.MulticlassClassification
            var options = new <#=TrainerOptions#>;

            // Define the trainer.
-            var pipeline = mlContext.MulticlassClassification.Trainers.<#=Trainer#>(options);
+            var pipeline = 
+			        // Convert the string labels into key types.


This line is not aligned. #Resolved

It's done on purpose, so that we can comment before the two estimators added to the pipeline.

In reply to: 275097557

codecov · 2019-04-13T00:46:12Z

Codecov Report

Merging #3322 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3322   +/-   ##
=======================================
  Coverage   72.69%   72.69%           
=======================================
  Files         807      807           
  Lines      145172   145172           
  Branches    16225    16225           
=======================================
  Hits       105537   105537           
  Misses      35220    35220           
  Partials     4415     4415

Flag	Coverage Δ
#Debug	`72.69% <ø> (ø)`	⬆️
#production	`68.23% <ø> (ø)`	⬆️
#test	`88.97% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
...oft.ML.StandardTrainers/StandardTrainersCatalog.cs	`92.34% <ø> (ø)`	⬆️

wschin · 2019-04-13T00:55:41Z

                mlContext.Transforms.Conversion.MapValueToKey("Label")

nameof(DataPoint.Label) #Resolved

Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/MulticlassClassification.ttinclude:39 in 5975856.

wschin · 2019-04-13T00:56:27Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/LightGbm.cs

+                    // Convert the string labels into key types.
+                    mlContext.Transforms.Conversion.MapValueToKey("Label")
+                    // Apply LightGbm multiclass trainer.
+                    .Append(mlContext.MulticlassClassification.Trainers.LightGbm());


LightGbm()

Better to specify column names using nameof. #Resolved
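A sketch of what the `nameof` suggestion could look like follows. The `DataPoint` class here is a hypothetical stand-in for the sample's data class (its exact shape is not shown in this thread), and the snippet assumes the Microsoft.ML and Microsoft.ML.LightGbm packages are referenced.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical data class; the real sample's DataPoint may differ.
public class DataPoint
{
    public string Label { get; set; }

    [VectorType(20)]
    public float[] Features { get; set; }
}

public static class NameofPipelineSketch
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        var pipeline =
            // Convert the string labels into key types. nameof keeps the
            // column name in sync with the property if it is ever renamed.
            mlContext.Transforms.Conversion.MapValueToKey(nameof(DataPoint.Label))
            // Apply the LightGbm multiclass trainer with explicit column names.
            .Append(mlContext.MulticlassClassification.Trainers.LightGbm(
                labelColumnName: nameof(DataPoint.Label),
                featureColumnName: nameof(DataPoint.Features)));
    }
}
```

Compared with the string literal "Label", `nameof(DataPoint.Label)` is checked by the compiler, so a typo or a renamed property fails at build time rather than at fit time.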

yaeldekel · 2019-04-15T17:41:29Z

...amples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/LbfgsMaximumEntropy.cs

+            var model = pipeline.Fit(trainingData);
+
+            // Create testing data. Use different random seed to make it different from training data.
+            var testData = mlContext.Data.LoadFromEnumerable(GenerateRandomDataPoints(500, seed:123));


nit: add a space here. #Resolved

yaeldekel · 2019-04-15T17:48:48Z

...amples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/LbfgsMaximumEntropy.cs

+            // Define the trainer.
+            var pipeline =
+                    // Convert the string labels into key types.
+                    mlContext.Transforms.Conversion.MapValueToKey("Label")


MapValueToKey

I'm not sure whether this is an issue, and there may be other samples for this, but would it make sense to pass the IDataView keyData argument to this method, to show how the user can avoid a pass over the data to get the labels when the set of labels is known? #Resolved

I think it's important to show that use of the method, and I hope there is a sample for it under MapValueToKey. However, I don't think this is the right place to include such a sample.
What I could do instead is load a key type directly rather than using MapValueToKey. Would that be better?

In reply to: 275474158
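For reference, the `keyData` usage raised in this thread could be sketched roughly as below. This is an illustration under assumptions, not code from the PR: `DataPoint` and `LabelValue` are hypothetical classes, the label values are made up, and the snippet assumes the Microsoft.ML package is referenced. It relies on `keyData` being a single-column IDataView that holds the known label values.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical data class; the real sample's DataPoint may differ.
public class DataPoint
{
    public string Label { get; set; }

    [VectorType(20)]
    public float[] Features { get; set; }
}

// Hypothetical single-column row type used only to build the keyData view.
public class LabelValue
{
    public string Value { get; set; }
}

public static class KeyDataSketch
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // The set of labels is known up front (made-up values), so it is
        // supplied as a single-column IDataView.
        var keyData = mlContext.Data.LoadFromEnumerable(new[]
        {
            new LabelValue { Value = "AA" },
            new LabelValue { Value = "BB" },
            new LabelValue { Value = "CC" },
        });

        var pipeline =
            // Build the value-to-key mapping from keyData rather than from
            // the training data.
            mlContext.Transforms.Conversion.MapValueToKey(
                outputColumnName: nameof(DataPoint.Label),
                keyData: keyData)
            .Append(mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy(
                labelColumnName: nameof(DataPoint.Label),
                featureColumnName: nameof(DataPoint.Features)));
    }
}
```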

artidoro added the documentation label Apr 12, 2019

artidoro requested review from wschin, shmoradims, ganik and zeahmed April 12, 2019 23:37

artidoro self-assigned this Apr 12, 2019

artidoro force-pushed the multiclassamples branch from 40725ca to 2bf1485 April 13, 2019 00:01

artidoro force-pushed the multiclassamples branch from 2bf1485 to 5975856 April 13, 2019 00:06

wschin reviewed Apr 13, 2019

wschin approved these changes Apr 13, 2019

wschin reviewed Apr 13, 2019

yaeldekel reviewed Apr 15, 2019

shmoradims mentioned this pull request Apr 15, 2019

Docs and samples for the API reference site (P0 & P1 Trainers) #2522 (Closed)

artidoro force-pushed the multiclassamples branch from 5975856 to ec560ed April 15, 2019 23:15

shmoradims approved these changes Apr 15, 2019

yaeldekel approved these changes Apr 16, 2019

artidoro force-pushed the multiclassamples branch from ec560ed to 7682d7f April 16, 2019 18:42

artidoro added 3 commits April 16, 2019 15:20

multiclasssamples 73dc2d4

review comments c871f69

rebase fix 41f062f

artidoro force-pushed the multiclassamples branch from 7682d7f to 41f062f April 16, 2019 22:22

artidoro merged commit 8644b3b into dotnet:master Apr 16, 2019

ghost locked as resolved and limited conversation to collaborators Mar 22, 2022
