Skip to content

XML documentation references cs code for examples #1105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Oct 4, 2018

Conversation

sfilipi
Copy link
Member

@sfilipi sfilipi commented Oct 1, 2018

Addresses #637 by creating the docs/samples older where we can host the projects for our examples.

Added one such example for SDCA regression.

@sfilipi sfilipi added the documentation Related to documentation of ML.NET label Oct 1, 2018

namespace Microsoft.ML.Samples.StaticPipe
{
public static class DatasetCreator
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DatasetCreator [](start = 24, length = 14)

@shauheen, @GalOshri What do we want to do with the datasets for the samples? This is not a solution, as this function won't display/be referenceable by the samples.

Shall we create a small nuget package with 6 functions that generate those datasets, and import that in our samples?
scikit examples always import from the datasets package.

Shall we package/re-distribute the datasets we are using for testing?(i bet we'd want to stay out of the legal work for that.)

/// <![CDATA[
/// [!code-csharp[SDCA Regression](../../../docs/samples/Microsoft.ML.Samples.StaticPipe/Trainers.cs?range=5-8,12-14,25-79) "The SDCA regression example."]
/// ]]></format>
/// </example>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dend, @mairaw does the path look correct? In the examples it says it should be relative to the file containing the XML; but when I configured the CRR in the docs, there is a

"path_to_root": "docs/samples"

property. Does the path need to be relative to that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know @sfilipi. The way we use it in our docs is a bit different. Tagging @JRAlexander to help investigate.
We also do use the notation of ~ to start at the root as well.

Microsoft.ML.sln Outdated
@@ -119,10 +119,14 @@ Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.OnnxTransform"
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.DnnAnalyzer", "src\Microsoft.ML.DnnAnalyzer\Microsoft.ML.DnnAnalyzer\Microsoft.ML.DnnAnalyzer.csproj", "{73DAAC82-D308-48CC-8FFE-3B037F8BBCCA}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Microsoft.ML.OnnxTransformTest", "test\Microsoft.ML.OnnxTransformTest\Microsoft.ML.OnnxTransformTest.csproj", "{49D03292-8AFE-4B82-823C-D047BF8420F7}"
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.OnnxTransformTest", "test\Microsoft.ML.OnnxTransformTest\Microsoft.ML.OnnxTransformTest.csproj", "{49D03292-8AFE-4B82-823C-D047BF8420F7}"
Copy link
Member Author

@sfilipi sfilipi Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") [](start = 0, length = 50)

revert #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, please keep this. This is a good change. All our .csproj files should have 9A19103F-16F7-4668-BE54-9A1E7A4F7556 for the GUID.


In reply to: 221743776 [](ancestors = 221743776)

<NativeAssemblyReference Include="FastTreeNative" />
<NativeAssemblyReference Include="CpuMathNative" />
<NativeAssemblyReference Include="FactorizationMachineNative" />
<NativeAssemblyReference Include="LdaNative" />
Copy link
Member Author

@sfilipi sfilipi Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove, keep only the needed one, CpuMathNative. #Resolved

<PropertyGroup>
<TargetFramework>netcoreapp2.0</TargetFramework>
<OutputType>Exe</OutputType>
<StartupObject>Microsoft.ML.Samples.StaticPipe.Trainers</StartupObject>
Copy link
Member

@eerhardt eerhardt Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this line necessary? #Resolved

<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>netcoreapp2.0</TargetFramework>
Copy link
Member

@eerhardt eerhardt Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be netcoreapp2.1. We shouldn't be using 2.0 anymore. 2.0 has EOL'd today: https://blogs.msdn.microsoft.com/dotnet/2018/06/20/net-core-2-0-will-reach-end-of-life-on-september-1-2018/ #Resolved

<ItemGroup>
<ProjectReference Include="..\..\..\src\Microsoft.ML.Data\Microsoft.ML.Data.csproj" />
<ProjectReference Include="..\..\..\src\Microsoft.ML.ImageAnalytics\Microsoft.ML.ImageAnalytics.csproj" />
<ProjectReference Include="..\..\..\src\Microsoft.ML.LightGBM\Microsoft.ML.LightGBM.csproj" />
Copy link
Member

@eerhardt eerhardt Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How were these ProjectReferences chosen? I don't see LightGBM or ImageAnalytics used in the .csproj. #Resolved

public static (string trainPath, string testPath) CreateRegressionDataset()
{
// creating a small sample dataset, and writting it to file
string trainDataPath = @"c:\temp\RegressionTrainDataset.txt";
Copy link
Member

@eerhardt eerhardt Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be writing to C:\temp. Instead we should write into the repo's bin/obj directory. #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some people may not have a C drive. Some people might not be using Windows.


In reply to: 221757977 [](ancestors = 221757977)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eric, I am leaving it just as a file in the current folder, since before merging, we should decide whether to package the datasets we have in a nuget, or package files we create, like this one for every task ..


In reply to: 221758066 [](ancestors = 221758066,221757977)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closing it, since this is captured in #1137


In reply to: 221764548 [](ancestors = 221764548,221758066,221757977)

@eerhardt
Copy link
Member

eerhardt commented Oct 1, 2018

using System;

Need a copyright header on all C# files. #Resolved


Refers to: docs/samples/Microsoft.ML.Samples.StaticPipe/DatasetCreator.cs:1 in c758415. [](commit_id = c758415, deletion_comment = False)

/// <example>
/// <format type="text/markdown">
/// <![CDATA[
/// [!code-csharp[SDCA Regression](../../../docs/samples/Microsoft.ML.Samples.StaticPipe/Trainers.cs?range=5-8,12-14,25-79) "The SDCA regression example."]
Copy link
Member

@eerhardt eerhardt Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We aren't actually using the class SdcaRegressionTrainer in this example code. I think that would be an anti-pattern in the docs. If I want to see an example of using class Foo, the example should have the class Foo in it, IMO. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I will create a separate sample for this class, and leave the existing one just for the extension.

the new sample will use the constructor of SdcaRegressionTrainers directly to create the object, and invoke fit on it, maybe.


In reply to: 221758621 [](ancestors = 221758621)

@shauheen shauheen modified the milestone: 1018 Oct 2, 2018
{
public static class Trainers
{
public static void Main()
Copy link
Member

@eerhardt eerhardt Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this "Main" method should be moved to a separate file. If we ever added new lines to this method (like if we wanted to run some other sample), then it will mess the line numbers up in:

        /// <![CDATA[
        ///  [!code-csharp[SDCA Regression](../../../docs/samples/Microsoft.ML.Samples.StaticPipe/Trainers.cs?range=5-8,12-14,25-79) "The SDCA regression example."]
        /// ]]></format>
```` #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking more about it, we probably don;t need Main at all. We just care that this code compiles, afa running it, will probably do it only if we change things and want to validate the results.


In reply to: 222419899 [](ancestors = 222419899)

@eerhardt
Copy link
Member

eerhardt commented Oct 3, 2018

How do you feel about having just Microsoft.ML.Samples.csproj? And most (if not all) our doc samples be contained in a single project? That way people don't have to copy this infrastructure for each little thing we want to create a sample for. #Resolved


Refers to: docs/samples/Microsoft.ML.Samples.StaticPipe/Microsoft.ML.Samples.StaticPipe.csproj:1 in af43f81. [](commit_id = af43f81, deletion_comment = False)

…t is directly showing that. Splitting main in a separate file.
Microsoft.ML.sln Outdated
@@ -459,6 +463,14 @@ Global
{B6C83F04-A04B-4F00-9E68-1EC411F9317C}.Release|Any CPU.Build.0 = Release|Any CPU
{B6C83F04-A04B-4F00-9E68-1EC411F9317C}.Release-Intrinsics|Any CPU.ActiveCfg = Release|Any CPU
{B6C83F04-A04B-4F00-9E68-1EC411F9317C}.Release-Intrinsics|Any CPU.Build.0 = Release|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Debug|Any CPU.Build.0 = Debug|Any CPU
{E96D2EF3-F5D2-4BEE-8D2B-32C32A6344D2}.Debug-Intrinsics|Any CPU.ActiveCfg = Debug|Any CPU
Copy link
Member

@eerhardt eerhardt Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI - you'll need to sync up the .sln file. There is 1 change you will need to make here - on the right side of these lines you'll need -Intrinsics after Debug and Release. See my change here: #1032 for examples. #Resolved

var (trainDataPath, testDataPath) = DatasetCreator.CreateRegressionDataset();

//creating the ML.Net IHostEnvironment object, needed for the pipeline
var env = new ConsoleEnvironment(seed: 0);
Copy link
Member

@eerhardt eerhardt Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) do we want ConsoleEnvironment in the samples? or LocalEnvironment? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I like the ConsoleEnvironmnet, so they see the ML.Net output during training. Is there an argument more in favor of LocalEnvironment?


In reply to: 222474325 [](ancestors = 222474325)

Copy link
Member

@eerhardt eerhardt Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the argument is that it isn't typical for libraries to write to the console. For example, if I use EntityFramework to read/write to a database, it doesn't by default print SQL statements to the console.

Also, it assumes that you are in a console app, when it is way more common to be in a UI app, or on an ASP.NET service. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I'll change it.


In reply to: 222493698 [](ancestors = 222493698)

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@eerhardt
Copy link
Member

eerhardt commented Oct 3, 2018

Since this file is now named Microsoft.ML.Samples.csproj, can we change the folder name as well to match? (remove StaticPipe) #Resolved


Refers to: docs/samples/Microsoft.ML.Samples.StaticPipe/Microsoft.ML.Samples.csproj:1 in 1e6b160. [](commit_id = 1e6b160, deletion_comment = False)

Microsoft.ML.sln Outdated
{4B101D58-E7E4-4877-A536-A9B41E2E82A3}.Release|Any CPU.ActiveCfg = Release|Any CPU
{4B101D58-E7E4-4877-A536-A9B41E2E82A3}.Release|Any CPU.Build.0 = Release|Any CPU
{4B101D58-E7E4-4877-A536-A9B41E2E82A3}.Release-Intrinsics|Any CPU.ActiveCfg = Release-Intrinsics|Any CPU
{4B101D58-E7E4-4877-A536-A9B41E2E82A3}.Release-Intrinsics|Any CPU.Build.0 = Release-Intrinsics|Any CPU
Copy link
Member Author

@sfilipi sfilipi Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix #Resolved

Copy link
Contributor

@artidoro artidoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@sfilipi
Copy link
Member Author

sfilipi commented Oct 4, 2018

thanks @artidoro and @eerhardt

@sfilipi sfilipi merged commit c87b869 into dotnet:master Oct 4, 2018
@sfilipi
Copy link
Member Author

sfilipi commented Oct 5, 2018

Resolves #637

@sfilipi sfilipi deleted the docsSamples branch October 20, 2018 03:14
@ghost ghost locked as resolved and limited conversation to collaborators Mar 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
documentation Related to documentation of ML.NET
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants