XML documentation references cs code for examples #1105

Merged
merged 10 commits into from
Oct 4, 2018
15 changes: 14 additions & 1 deletion Microsoft.ML.sln
@@ -119,10 +119,14 @@ Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.OnnxTransform"
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.DnnAnalyzer", "src\Microsoft.ML.DnnAnalyzer\Microsoft.ML.DnnAnalyzer\Microsoft.ML.DnnAnalyzer.csproj", "{73DAAC82-D308-48CC-8FFE-3B037F8BBCCA}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Microsoft.ML.OnnxTransformTest", "test\Microsoft.ML.OnnxTransformTest\Microsoft.ML.OnnxTransformTest.csproj", "{49D03292-8AFE-4B82-823C-D047BF8420F7}"
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.OnnxTransformTest", "test\Microsoft.ML.OnnxTransformTest\Microsoft.ML.OnnxTransformTest.csproj", "{49D03292-8AFE-4B82-823C-D047BF8420F7}"
Member Author

@sfilipi sfilipi Oct 1, 2018

Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}")

revert #Resolved

Member

No, please keep this. This is a good change. All our .csproj files should have 9A19103F-16F7-4668-BE54-9A1E7A4F7556 for the GUID.



EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.Benchmarks.Tests", "test\Microsoft.ML.Benchmarks.Tests\Microsoft.ML.Benchmarks.Tests.csproj", "{B6C83F04-A04B-4F00-9E68-1EC411F9317C}"
EndProject
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "samples", "samples", "{DA452A53-2E94-4433-B08C-041EDEC729E6}"
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.Samples", "docs\samples\Microsoft.ML.Samples.StaticPipe\Microsoft.ML.Samples.csproj", "{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Any CPU = Debug|Any CPU
@@ -459,6 +463,14 @@ Global
{B6C83F04-A04B-4F00-9E68-1EC411F9317C}.Release|Any CPU.Build.0 = Release|Any CPU
{B6C83F04-A04B-4F00-9E68-1EC411F9317C}.Release-Intrinsics|Any CPU.ActiveCfg = Release|Any CPU
{B6C83F04-A04B-4F00-9E68-1EC411F9317C}.Release-Intrinsics|Any CPU.Build.0 = Release|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Debug|Any CPU.Build.0 = Debug|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Debug-Intrinsics|Any CPU.ActiveCfg = Debug|Any CPU
Member

@eerhardt eerhardt Oct 3, 2018

FYI - you'll need to sync up the .sln file. There is 1 change you will need to make here - on the right side of these lines you'll need -Intrinsics after Debug and Release. See my change here: #1032 for examples. #Resolved
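
As a sketch of what this comment asks for, the -Intrinsics rows for the new sample project would map to the matching -Intrinsics configurations on the right-hand side. The GUID is the sample project's GUID from this diff; the exact final lines are an assumption inferred from the comment, not copied from #1032:

```text
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Debug-Intrinsics|Any CPU.ActiveCfg = Debug-Intrinsics|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Debug-Intrinsics|Any CPU.Build.0 = Debug-Intrinsics|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Release-Intrinsics|Any CPU.ActiveCfg = Release-Intrinsics|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Release-Intrinsics|Any CPU.Build.0 = Release-Intrinsics|Any CPU
```

The plain Debug and Release rows stay as they are; only the rows whose left side already names an -Intrinsics solution configuration change.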

{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Debug-Intrinsics|Any CPU.Build.0 = Debug|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Release|Any CPU.ActiveCfg = Release|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Release|Any CPU.Build.0 = Release|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Release-Intrinsics|Any CPU.ActiveCfg = Release|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Release-Intrinsics|Any CPU.Build.0 = Release|Any CPU
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE
@@ -510,6 +522,7 @@ Global
{73DAAC82-D308-48CC-8FFE-3B037F8BBCCA} = {09EADF06-BE25-4228-AB53-95AE3E15B530}
{49D03292-8AFE-4B82-823C-D047BF8420F7} = {AED9C836-31E3-4F3F-8ABC-929555D3F3C4}
{B6C83F04-A04B-4F00-9E68-1EC411F9317C} = {AED9C836-31E3-4F3F-8ABC-929555D3F3C4}
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2} = {DA452A53-2E94-4433-B08C-041EDEC729E6}
EndGlobalSection
GlobalSection(ExtensibilityGlobals) = postSolution
SolutionGuid = {41165AF1-35BB-4832-A189-73060F82B01D}
60 changes: 60 additions & 0 deletions docs/samples/Microsoft.ML.Samples.StaticPipe/DatasetCreator.cs
@@ -0,0 +1,60 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using System;
using System.IO;
using System.Text;

namespace Microsoft.ML.Samples.StaticPipe
{
public static class DatasetCreator
Member Author

DatasetCreator

@shauheen, @GalOshri What do we want to do with the datasets for the samples? This is not a real solution, because this function won't be displayed in, or referenceable from, the samples.

Shall we create a small NuGet package with 6 functions that generate those datasets, and import that in our samples?
scikit-learn examples always import from the datasets package.

Shall we package/redistribute the datasets we are using for testing? (I bet we'd want to stay out of the legal work for that.)

{
public static (string trainPath, string testPath) CreateRegressionDataset()
{
// Create a small sample dataset and write it to file.
string trainDataPath = @"RegressionTrainDataset.txt";
string testDataPath = @"RegressionTestDataset.txt";

string header = "feature_a, feature_b, target";

int a = 0;
int b = 0;
int target = 0;

var csvTrain = new StringBuilder().AppendLine(header);
var csvTest = new StringBuilder().AppendLine(header);

Random rnd = new Random();
for (int i = 0; i < 1000; i++)
{
a = rnd.Next(i - 5, i + 5);
b = rnd.Next(0, 10);

target = 2*a + b;

if (i % 15 == 0)
csvTest.AppendLine($"{a}, {b}, {target}");
else
csvTrain.AppendLine($"{a}, {b}, {target}");
}


if (!File.Exists(trainDataPath))
File.WriteAllText(trainDataPath, csvTrain.ToString());
else
{
throw new Exception("Train dataset file already exists");
}

if (!File.Exists(testDataPath))
File.WriteAllText(testDataPath, csvTest.ToString());
else
{
throw new Exception("Test dataset file already exists");
}

return (trainDataPath, testDataPath);
}
}
}
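
The generation rule in CreateRegressionDataset (target = 2*a + b, with every 15th row held out for testing) can be sketched without touching the file system. This is only an illustration: the RegressionDataSketch type, the Generate method, and the fixed seed are hypothetical and not part of the PR.

```csharp
using System;
using System.Collections.Generic;

public static class RegressionDataSketch
{
    // Hypothetical in-memory variant of CreateRegressionDataset: same rule
    // (target = 2*a + b, rows with i % 15 == 0 go to the test set), but
    // seeded for repeatability and with no files written.
    public static (List<(int a, int b, int target)> train,
                   List<(int a, int b, int target)> test) Generate(int seed = 0, int rows = 1000)
    {
        var train = new List<(int a, int b, int target)>();
        var test = new List<(int a, int b, int target)>();
        var rnd = new Random(seed);

        for (int i = 0; i < rows; i++)
        {
            int a = rnd.Next(i - 5, i + 5); // upper bound is exclusive
            int b = rnd.Next(0, 10);
            var row = (a, b, target: 2 * a + b);

            if (i % 15 == 0)
                test.Add(row);
            else
                train.Add(row);
        }

        return (train, test);
    }

    public static void Main()
    {
        var (train, test) = Generate();
        Console.WriteLine($"train rows: {train.Count}, test rows: {test.Count}");
        // prints "train rows: 933, test rows: 67"
    }
}
```

With 1000 rows, the multiples of 15 from 0 to 990 give 67 test rows, leaving 933 for training, regardless of the seed.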
@@ -0,0 +1,20 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>netcoreapp2.1</TargetFramework>
<OutputType>Exe</OutputType>
</PropertyGroup>

<ItemGroup>
<ProjectReference Include="..\..\..\src\Microsoft.ML.StandardLearners\Microsoft.ML.StandardLearners.csproj" />

<NativeAssemblyReference Include="CpuMathNative" />

<ProjectReference Include="..\..\..\src\Microsoft.ML.Analyzer\Microsoft.ML.Analyzer.csproj">
<ReferenceOutputAssembly>false</ReferenceOutputAssembly>
<OutputItemType>Analyzer</OutputItemType>
</ProjectReference>

</ItemGroup>

</Project>
16 changes: 16 additions & 0 deletions docs/samples/Microsoft.ML.Samples.StaticPipe/Program.cs
@@ -0,0 +1,16 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using Microsoft.ML.Samples.StaticPipe;

namespace Microsoft.ML.Samples
{
internal static class Program
{
static void Main(string[] args)
{
Trainers.SdcaRegression();
}
}
}
72 changes: 72 additions & 0 deletions docs/samples/Microsoft.ML.Samples.StaticPipe/Trainers.cs
@@ -0,0 +1,72 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using Microsoft.ML.Runtime.Data;
using Microsoft.ML.Runtime.Learners;
using Microsoft.ML.StaticPipe;
using System;

namespace Microsoft.ML.Samples.StaticPipe
{
public static class Trainers
{
public static void SdcaRegression()
{
var (trainDataPath, testDataPath) = DatasetCreator.CreateRegressionDataset();

// Create the ML.NET IHostEnvironment object needed for the pipeline.
var env = new ConsoleEnvironment(seed: 0);
Member

@eerhardt eerhardt Oct 3, 2018

(nit) Do we want ConsoleEnvironment in the samples, or LocalEnvironment? #Resolved

Member Author

Personally I like the ConsoleEnvironment, so users see the ML.NET output during training. Is there an argument more in favor of LocalEnvironment?



Member

@eerhardt eerhardt Oct 3, 2018

I think the argument is that it isn't typical for libraries to write to the console. For example, if I use EntityFramework to read/write to a database, it doesn't by default print SQL statements to the console.

Also, it assumes that you are in a console app, when it is way more common to be in a UI app, or on an ASP.NET service. #Resolved

Member Author

Thanks, I'll change it.
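
The design point raised in this thread, that a library should stay silent unless the caller opts in to output, can be illustrated with a small sketch. QuietTrainer and its Log event are hypothetical names used only for this example; they are not ML.NET types and do not describe how LocalEnvironment or ConsoleEnvironment are implemented.

```csharp
using System;

// Hypothetical library type: it reports progress through an event and never
// writes to Console itself, so it behaves the same in a console app, a UI app,
// or a web service.
public sealed class QuietTrainer
{
    public event Action<string> Log;

    public int Train(int iterations)
    {
        for (int i = 1; i <= iterations; i++)
            Log?.Invoke($"iteration {i} of {iterations}");
        return iterations;
    }
}

public static class Program
{
    public static void Main()
    {
        var trainer = new QuietTrainer();

        // A console sample opts in to output explicitly.
        trainer.Log += Console.WriteLine;
        trainer.Train(3);
    }
}
```

A caller that never subscribes to Log gets no output at all, which matches the default behavior the reviewer describes for libraries such as Entity Framework.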




// creating the ML context, based on the task performed.
var regressionContext = new RegressionContext(env);

// Creating a data reader, based on the format of the data
var reader = TextLoader.CreateReader(env, c => (
label: c.LoadFloat(2),
features: c.LoadFloat(0, 1)
),
separator: ',', hasHeader: true);

// Read the data
var trainData = reader.Read(new MultiFileSource(trainDataPath));

// The predictor that gets produced out of training
LinearRegressionPredictor pred = null;

// Create the estimator
var learningPipeline = reader.MakeNewEstimator()
.Append(r => (r.label, score: regressionContext.Trainers.Sdca(
r.label,
r.features,
l1Threshold: 0f,
maxIterations: 100,
onFit: p => pred = p)
)
);

// fit this pipeline to the training data
var model = learningPipeline.Fit(trainData);

// check the weights that the model learned
VBuffer<float> weights = default;
pred.GetFeatureWeights(ref weights);

Console.WriteLine($"weight 0 - {weights.Values[0]}");
Console.WriteLine($"weight 1 - {weights.Values[1]}");

// test the model we just trained, using the test file.
var testData = reader.Read(new MultiFileSource(testDataPath));
var data = model.Transform(testData);

// Evaluate how the model is doing on the test data.
var metrics = regressionContext.Evaluate(data, r => r.label, r => r.score);

Console.WriteLine($"L1 - {metrics.L1}");
Console.WriteLine($"L2 - {metrics.L2}");
Console.WriteLine($"LossFunction - {metrics.LossFn}");
Console.WriteLine($"RMS - {metrics.Rms}");
Console.WriteLine($"RSquared - {metrics.RSquared}");
}
}
}
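
Because the sample dataset is generated with target = 2*a + b, the two weights printed by the sample should land near 2 and 1. The sketch below checks that expectation with an ordinary least-squares fit in plain C#. It is not SDCA and does not use ML.NET; ExpectedWeightsSketch, Fit, and the fixed seed are hypothetical. SDCA is iterative, regularized, and also fits a bias, so its weights will usually be close to these values rather than identical.

```csharp
using System;

public static class ExpectedWeightsSketch
{
    // Solves the 2x2 normal equations for y = w0*a + w1*b (no bias term).
    public static (double w0, double w1) Fit(double[] a, double[] b, double[] y)
    {
        double saa = 0, sab = 0, sbb = 0, say = 0, sby = 0;
        for (int i = 0; i < y.Length; i++)
        {
            saa += a[i] * a[i];
            sab += a[i] * b[i];
            sbb += b[i] * b[i];
            say += a[i] * y[i];
            sby += b[i] * y[i];
        }

        double det = saa * sbb - sab * sab;
        double w0 = (say * sbb - sab * sby) / det;
        double w1 = (saa * sby - sab * say) / det;
        return (w0, w1);
    }

    public static void Main()
    {
        const int rows = 1000;
        var a = new double[rows];
        var b = new double[rows];
        var y = new double[rows];

        // Same generation rule as the sample; the fixed seed is an addition.
        var rnd = new Random(0);
        for (int i = 0; i < rows; i++)
        {
            a[i] = rnd.Next(i - 5, i + 5);
            b[i] = rnd.Next(0, 10);
            y[i] = 2 * a[i] + b[i];
        }

        var (w0, w1) = Fit(a, b, y);
        Console.WriteLine($"w0 = {w0}, w1 = {w1}");
    }
}
```

If the weights reported by the trained model differ substantially from 2 and 1, that points to a data-loading or column-mapping problem rather than noise, since the generated targets contain none.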
6 changes: 6 additions & 0 deletions src/Microsoft.ML.StandardLearners/Standard/SdcaStatic.cs
@@ -33,6 +33,12 @@ public static partial class RegressionTrainers
/// the linear model that was trained. Note that this action cannot change the result in any way; it is only a way for the caller to
/// be informed about what was learnt.</param>
/// <returns>The predicted output.</returns>
/// <example>
/// <format type="text/markdown">
/// <![CDATA[
/// [!code-csharp[SDCA](../../../docs/samples/Microsoft.ML.Samples.StaticPipe/Trainers.cs?range=5-8,12-70) "The SDCA regression example."]
/// ]]></format>
/// </example>
public static Scalar<float> Sdca(this RegressionContext.RegressionTrainers ctx,
Scalar<float> label, Vector<float> features, Scalar<float> weights = null,
float? l2Const = null,
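
The range=5-8,12-70 query in the example reference above selects lines of Trainers.cs by number, so it has to be updated whenever lines are added to or removed from that file. DocFX-style code references can also select a named region instead. The sketch below assumes that the documentation build used here honors the name query option, which this PR does not confirm, and the region name SdcaRegressionSample is hypothetical:

```text
In Trainers.cs:

    #region SdcaRegressionSample
    public static void SdcaRegression()
    {
        ...
    }
    #endregion

In the XML documentation:

    /// [!code-csharp[SDCA](../../../docs/samples/Microsoft.ML.Samples.StaticPipe/Trainers.cs?name=SdcaRegressionSample) "The SDCA regression example."]
```

The trade-off is that a single named region cannot skip lines in the middle the way range=5-8,12-70 does, so the using directives and the method body would need separate regions or separate references.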