Added dynamic API snippets to cookbook #1538

shmoradims · 2018-11-06T01:00:57Z

Added the dynamic API equivalent for every snippet that used the static API, except the snippet that uses onFit, which the dynamic API does not support yet.

sfilipi · 2018-11-06T07:16:29Z

docs/code/MlNetCookBook.md


 // Create the reader: define the data columns and where to find them in the text file.
-var reader = TextLoader.CreateReader(env, ctx => (
+var reader = mlContext.Data.TextReader(ctx => (


mlContext [](start = 13, length = 9)

The examples in this file are actually tests in the CookbookExamples.cs. I believe we want to keep them in sync. #Closed

We want the cookbook to show the preferred way to load data. It would be confusing for users to see both mlContext.Data.TextReader and TextLoader.CreateReader. I'd rather change the test instead.

In reply to: 231016690 [](ancestors = 231016690)

sfilipi · 2018-11-06T07:20:44Z

docs/code/MlNetCookBook.md

+var mlContext = new MLContext();
+
+// Create the reader: define the data columns and where to find them in the text file.
+var reader = new TextLoader(mlContext, new TextLoader.Arguments


TextLoader [](start = 17, length = 10)

mlContext.Data.TextReader #Closed

updated them all.

In reply to: 231017639 [](ancestors = 231017639)

sfilipi · 2018-11-06T07:23:11Z

docs/code/MlNetCookBook.md

+// This will give the entire dataset: make sure to only take several rows
+// in case the dataset is huge. This is similar to the static API, except
+// you have to specify the column name and type.
+var featureColumns = transformedData.GetColumn<string[]>(mlContext, "AllFeatures")


string [](start = 47, length = 6)

Does this work? I believe I had to do ReadOnlyMemoryOf #Closed

Surprisingly, yes. I added actual tests for it.

In reply to: 231018069 [](ancestors = 231018069)

sfilipi · 2018-11-06T07:24:26Z

docs/code/MlNetCookBook.md

+    // Add the SDCA regression trainer.
+    .Append(mlContext.Regression.Trainers.StochasticDualCoordinateAscent(label: "Target", features: "FeatureVector"))
+
+// Step three. Train the pipeline.


Train the [](start = 15, length = 9)

Fit the training data to the pipeline. Did you want to do transform? #Closed

Fixed the text.
Not doing any transforms. Just writing the equivalent of the static version.

In reply to: 231018287 [](ancestors = 231018287)
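
A minimal sketch of the Fit versus Transform distinction discussed in this thread. `MeanCenterEstimator` and `MeanCenterTransformer` are hypothetical types written only for illustration; they are not ML.NET classes, and the real estimators may behave differently.

```csharp
using System;
using System.Linq;

// Hypothetical transformer: holds the state learned during Fit and applies it to data.
public sealed class MeanCenterTransformer
{
    private readonly double _mean;

    public MeanCenterTransformer(double mean) { _mean = mean; }

    // Transform produces new values; the input array is left untouched.
    public double[] Transform(double[] values) => values.Select(v => v - _mean).ToArray();
}

// Hypothetical estimator: Fit only learns from the data and returns a transformer.
public sealed class MeanCenterEstimator
{
    public MeanCenterTransformer Fit(double[] values) =>
        new MeanCenterTransformer(values.Average());
}

public static class FitTransformDemo
{
    public static void Main()
    {
        var data = new[] { 1.0, 2.0, 3.0 };

        // Step 1: fit. Nothing is transformed yet; we only get a trained object back.
        MeanCenterTransformer model = new MeanCenterEstimator().Fit(data);

        // Step 2: transform. This is a separate, explicit call.
        double[] centered = model.Transform(data);
        Console.WriteLine(string.Join(", ", centered));
    }
}
```

Under this assumption, a comment such as "fit the pipeline" describes step 1 only, which matches the reply that no transform is performed in the snippet.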

sfilipi · 2018-11-06T07:26:04Z

docs/code/MlNetCookBook.md

+    .Append(new ValueToKeyMappingEstimator(mlContext, "Label"), TransformerScope.TrainTest)
+    // Use the multi-class SDCA model to predict the label using features.
+    .Append(mlContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent())
+    // Apply the inverse conversion from 'PredictedLabel' key back to string value.


key [](start = 58, length = 3)

column #Closed

sfilipi · 2018-11-06T07:29:09Z

docs/code/MlNetCookBook.md

+// Read the data.
+var data = reader.Read(dataPath);
+
+// Inspect the categorical columns to check that they are correctly read.


categorical columns [](start = 15, length = 19)

by looking at the first 10 records. #Closed

sfilipi · 2018-11-06T07:29:45Z

docs/code/MlNetCookBook.md

+var transformedData = dynamicPipeline.Fit(data).Transform(data);
+
+// Inspect some columns of the resulting dataset.
+var categoricalBags = transformedData.GetColumn<float[]>(mlContext, "CategoricalBag").Take(10).ToArray();


.Take(10) [](start = 85, length = 9)

Should we remove those? #Closed

I think it's good: otherwise we'll materialize the entire column of data, which may be large.

In reply to: 231019258 [](ancestors = 231019258)
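
A minimal sketch of why `.Take(10)` matters, assuming the column is exposed as a lazily evaluated `IEnumerable<T>`. `ReadColumn` and `RowsProduced` are hypothetical stand-ins used only for illustration; they are not the ML.NET `GetColumn` implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyColumnDemo
{
    // Counts how many rows the hypothetical source has actually produced.
    public static int RowsProduced;

    // Hypothetical stand-in for a lazily read column: rows are generated on demand.
    public static IEnumerable<float> ReadColumn(int totalRows)
    {
        for (int i = 0; i < totalRows; i++)
        {
            RowsProduced++;
            yield return i * 0.5f;
        }
    }

    public static void Main()
    {
        RowsProduced = 0;

        // Take(10) stops pulling from the source once ten rows have been collected,
        // so ToArray() never walks the remaining rows.
        float[] firstTen = ReadColumn(1000000).Take(10).ToArray();

        Console.WriteLine($"Kept {firstTen.Length} rows; source produced {RowsProduced} rows.");
    }
}
```

Calling `ToArray()` without `Take(10)` in this sketch would enumerate all one million rows and hold them in memory, which is the cost the reply above is pointing at.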

sfilipi · 2018-11-06T07:30:58Z

docs/code/MlNetCookBook.md

+    .Append(mlContext.Transforms.Text.NormalizeText("Message", "NormalizedMessage"))
+
+    // NLP pipeline 1: bag of words.
+    .Append(new WordBagEstimator(mlContext, "NormalizedMessage", "BagOfWords"))


WordBagEstimator [](start = 16, length = 16)

Is this not part of the Text catalog? #Closed

+1

In reply to: 231019528 [](ancestors = 231019528)

Not in the catalog yet. Senja will add it soon.

In reply to: 231225526 [](ancestors = 231225526,231019528)

sfilipi · 2018-11-06T07:31:19Z

docs/code/MlNetCookBook.md

+    .Append(new WordBagEstimator(mlContext, "NormalizedMessage", "BagOfWords"))
+
+    // NLP pipeline 2: bag of bigrams, using hashes instead of dictionary indices.
+    .Append(new WordHashBagEstimator(mlContext, "NormalizedMessage", "BagOfBigrams", 


WordHashBagEstimator [](start = 16, length = 20)

I think this is part of the text catalog. #Closed

Not in the catalog yet. Senja will add it soon.

In reply to: 231019583 [](ancestors = 231019583)

sfilipi · 2018-11-06T07:32:36Z

docs/code/MlNetCookBook.md

+    // Note that the label is text, so it needs to be converted to key.
+    .Append(new ValueToKeyMappingEstimator(mlContext, "Label"), TransformerScope.TrainTest)
+    // Use the multi-class SDCA model to predict the label using features.
+    .Append(mlContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent());


.Append(m [](start = 4, length = 9)

Does it need KeyToVal at the end? #Resolved

The static pipeline version above it doesn't have the KeyToVal for this example, so I'm not adding it. This one is looking at averaged accuracies, so the string value of the label is not as important.
The example in "How do I use the model to make one prediction" adds the ToKey, which looks at the transformed data.

In reply to: 231019826 [](ancestors = 231019826)

If they are looking for an example of how to do multiclass classification and they use this, users will have trouble making sense of their predictions. I think we have had several bugs along those lines.

In reply to: 231210991 [](ancestors = 231210991,231019826)
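
A minimal sketch of the concern above: once text labels are converted to keys, a predicted key is hard to read unless it is mapped back to its label. `LabelKeyMap`, `ToKey`, and `ToValue` are hypothetical names written for illustration; this is not how ML.NET implements its key types, and the zero-based numbering here is a simplification.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of ML.NET: maps string labels to integer keys and back.
public sealed class LabelKeyMap
{
    private readonly Dictionary<string, int> _keyByLabel = new Dictionary<string, int>();
    private readonly List<string> _labelByKey = new List<string>();

    // Returns the key for a label, assigning the next free key on first sight.
    public int ToKey(string label)
    {
        if (!_keyByLabel.TryGetValue(label, out int key))
        {
            key = _labelByKey.Count;
            _keyByLabel[label] = key;
            _labelByKey.Add(label);
        }
        return key;
    }

    // Inverse conversion: turns a key back into the original label.
    public string ToValue(int key) => _labelByKey[key];
}

public static class LabelKeyDemo
{
    public static void Main()
    {
        var map = new LabelKeyMap();
        foreach (var label in new[] { "setosa", "versicolor", "setosa", "virginica" })
        {
            map.ToKey(label);
        }

        int predictedKey = 2; // Pretend a trainer predicted this key.
        Console.WriteLine($"{predictedKey} -> {map.ToValue(predictedKey)}");
    }
}
```

Without the inverse step, a user sees only the integer, which is the kind of confusion the comment above describes.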

JRAlexander

@shmoradims @Zruty0 - Will the following be updated with dynamic examples?

How do I verify the model quality?
How do I load data from multiple files?
What if my training data is not in a text file?

JRAlexander · 2018-11-06T17:14:41Z

docs/code/MlNetCookBook.md

+var reader = mlContext.Data.TextReader(new TextLoader.Arguments
+{
+    Column = new[] {
+        // A boolean column depicting the 'label'.


Shouldn't these be DataKind.Text and DataKind.Boolean instead? #Resolved

We have some equivalent names. We tend to prefer the short two-letter version.

TX = 11,
TXT = TX,
Text = TX,

BL = 12,
Bool = BL,

In reply to: 231213230 [](ancestors = 231213230)

Is that because not every type has an equivalent name? #Resolved

[John are you using codeflow? If not, your comments will come as different threads.]

I clarified with Pete. We are planning to rename DataKind to be closer to .NET types. We'll need to update all of these once that change happens.

In reply to: 231290740 [](ancestors = 231290740)
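
A minimal sketch of how equivalent enum names like the ones quoted in this thread behave in C#. `SampleKind` is a hypothetical enum that copies only the members shown above; it is not the real `DataKind` type.

```csharp
using System;

// Hypothetical enum mirroring the excerpt quoted in this thread.
public enum SampleKind
{
    TX = 11,
    TXT = TX,
    Text = TX,

    BL = 12,
    Bool = BL,
}

public static class EnumAliasDemo
{
    // Aliases share one underlying value, so they compare equal.
    public static bool SameValue(SampleKind a, SampleKind b) => a == b;

    public static void Main()
    {
        Console.WriteLine(SameValue(SampleKind.TX, SampleKind.Text)); // True
        Console.WriteLine((int)SampleKind.Bool);                      // 12

        // Parsing any of the alias names yields the same value.
        object parsed = Enum.Parse(typeof(SampleKind), "TXT");
        Console.WriteLine(parsed.Equals(SampleKind.Text));            // True
    }
}
```

One consequence of aliasing is that calling `ToString()` on such a value may return any of the equivalent names, so code should not rely on which spelling is printed.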

Zruty0 · 2018-11-06T17:39:00Z

docs/code/MlNetCookBook.md

+```csharp
+// Create a new context for ML.NET operations. It can be used for exception tracking and logging, 
+// as a catalog of available operations and as the source of randomness.
+var mlContext = new MLContext();


var [](start = 0, length = 3)

Please also add these snippets as tests to CookbookSamples: this way we ensure that they compile, and are updated when the API gets updated. #Resolved

Yes, I already created a new set of tests for the dynamic API.

In reply to: 231221675 [](ancestors = 231221675)

Zruty0 · 2018-11-06T17:40:23Z

docs/code/MlNetCookBook.md

+// We 'start' the pipeline with the output of the reader.
+var dynamicPipeline =
+    // First 'normalize' the data (rescale to be
+    // between -1 and 1 for all examples), and then train the model.


, and then train the model. [](start = 41, length = 27)

not needed #Resolved

Removed from both static and dynamic snippets.

In reply to: 231222135 [](ancestors = 231222135)

Zruty0 · 2018-11-06T17:48:35Z

docs/code/MlNetCookBook.md

+    // Concatenate all the features together into one column 'Features'.
+    mlContext.Transforms.Concatenate("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth")
+    // Note that the label is text, so it needs to be converted to key.
+    .Append(new ValueToKeyMappingEstimator(mlContext, "Label"), TransformerScope.TrainTest)


ValueToKeyMappingEstimator [](start = 16, length = 26)

mlContext.Transforms.Categorical.MapValueToKey #Resolved

Changed.

In reply to: 231225091 [](ancestors = 231225091)

Zruty0

sfilipi · 2018-11-06T20:58:17Z

docs/code/MlNetCookBook.md

+    // Apply the inverse conversion from 'PredictedLabel' column back to string value.
+    .Append(mlContext.Transforms.Conversion.MapKeyToValue(("PredictedLabel", "Data")));
+
+// Train the model.


Train the model. [](start = 3, length = 16)

technically you'd have to Transform if you say "train the model". #Resolved

Keeping it the same as Pete's words from the static pipeline.

In reply to: 231290065 [](ancestors = 231290065)

sfilipi · 2018-11-06T20:59:07Z

docs/code/MlNetCookBook.md

+    .Append(mlContext.BinaryClassification.Trainers.FastTree(numTrees: 50));
+
+// Train the model.
+var model = fullLearningPipeline.Fit(data);


.Fit(data) [](start = 32, length = 10)

Same here, Transform. #Resolved

Resolved offline.

In reply to: 231290355 [](ancestors = 231290355)

sfilipi

Shahab Moradi added 9 commits November 5, 2018 10:46

Fixed "How do I look at the intermediate data?"

7629533

Fixed "How do I train a regression model?"

7a51c13

Fixed "How do I use the model to make one prediction?"

59c4f0d

Small fix

3f05a01

Merge branch 'master' into update_cookbook

da83130

Fixed normalization examples

6045166

Updated categorical examples

63d0536

Updated text example

c6e0284

Updated CV

cccf6c9

shmoradims requested review from Zruty0, sfilipi and JRAlexander November 6, 2018 01:01

shmoradims changed the title ~~Updated cookbook with dynamic API~~ Added dynamic API snippets to cookbook Nov 6, 2018

sfilipi reviewed Nov 6, 2018

Shahab Moradi added 2 commits November 6, 2018 08:25

Changed all the readers to mlContext.Data.TextReader

eb5f02b

Some PR comment fixes

1f21c84

JRAlexander reviewed Nov 6, 2018

Zruty0 reviewed Nov 6, 2018

Zruty0 approved these changes Nov 6, 2018

Shahab Moradi added 3 commits November 6, 2018 10:49

Updated static cookbook tests to match cookbook

bf4bc63

Added tests for dynamic API snippets, and fixed the bugs in md file.

04bafe2

Cosmetic changes

ef26ab7

sfilipi reviewed Nov 6, 2018

sfilipi approved these changes Nov 6, 2018

Shahab Moradi added 2 commits November 6, 2018 13:33

Addressed the final comments.

e3a34ae

Merge branch 'master' into update_cookbook

6ecc5e3

shmoradims merged commit 88ad2c2 into dotnet:master Nov 7, 2018

shmoradims deleted the update_cookbook branch December 12, 2018 22:07

ghost locked as resolved and limited conversation to collaborators Mar 27, 2022
