Comments added to LearningPipeline class to make Intellisense more helpful. #50
Changes from 7 commits
d80ade9
7856abd
d090170
77d21ff
5471ed1
203d9fe
5c67672
98fc8d0
e7dc707
5089e98
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -26,25 +26,115 @@ public ScorerPipelineStep(Var<IDataView> data, Var<ITransformModel> model) | |
public Var<ITransformModel> Model { get; } | ||
} | ||
|
||
|
||
/// <summary> | ||
/// LearningPipeline class is used to define the steps needed to perform desired machine learning task.<para/> | ||
Class names should appear in |
||
/// The steps are defined by adding a data loader (e.g. <see cref="TextLoader"/>) followed by zero or more transforms (e.g. <see cref="Microsoft.ML.Transforms.TextFeaturizer"/>) | ||
/// and atmost one trainer/learner (e.g. <see cref="Microsoft.ML.Trainers.FastTreeBinaryClassifier"/>) in the pipeline. | ||
type-o |
||
/// | ||
/// Data can be analyzed at every step by inspecting the LearningPipeline object in VS.Net debugger. | ||
I'm not sure this line provides much value. Do you think we need it? |
||
/// <example> | ||
In https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/xmldoc/example it shows that |
||
/// <para/> | ||
/// For example,<para/> | ||
/// <code> | ||
/// var pipeline = new LearningPipeline(); | ||
/// pipeline.Add(new TextLoader <SentimentData> (dataPath, separator: ",")); | ||
/// pipeline.Add(new TextFeaturizer("Features", "SentimentText")); | ||
/// pipeline.Add(new FastTreeBinaryClassifier()); | ||
/// | ||
/// var model = pipeline.Train<SentimentData, SentimentPrediction>(); | ||
/// </code> | ||
/// </example> | ||
/// </summary> | ||
[DebuggerTypeProxy(typeof(LearningPipelineDebugProxy))] | ||
public class LearningPipeline : ICollection<ILearningPipelineItem> | ||
{ | ||
private List<ILearningPipelineItem> Items { get; } = new List<ILearningPipelineItem>(); | ||
|
||
/// <summary> | ||
/// Construct an empty LearningPipeline object. | ||
Generally use |
||
/// </summary> | ||
public LearningPipeline() | ||
{ | ||
} | ||
|
||
/// <summary> | ||
/// Get the count of ML components in the LearningPipeline object | ||
/// </summary> | ||
public int Count => Items.Count; | ||
public bool IsReadOnly => false; | ||
|
||
/// <summary> | ||
/// Add a data loader, transform or trainer into the pipeline. | ||
/// Possible data loader(s), transforms and trainers options are | ||
/// <para> | ||
/// Data Loader: | ||
/// <see cref="Microsoft.ML.TextLoader{TInput}" /> | ||
/// etc. | ||
/// </para> | ||
/// <para> | ||
/// Transforms: | ||
/// <see cref="Microsoft.ML.Transforms.Dictionarizer"/>, | ||
/// <see cref="Microsoft.ML.Transforms.CategoricalOneHotVectorizer"/> | ||
/// <see cref="Microsoft.ML.Transforms.MinMaxNormalizer"/>, | ||
/// <see cref="Microsoft.ML.Transforms.ColumnCopier"/>, | ||
/// <see cref="Microsoft.ML.Transforms.ColumnConcatenator"/>, | ||
/// <see cref="Microsoft.ML.Transforms.TextFeaturizer"/>, | ||
/// etc. | ||
/// </para> | ||
/// <para> | ||
/// Trainers: | ||
/// <see cref="Microsoft.ML.Trainers.AveragedPerceptronBinaryClassifier"/>, | ||
/// <see cref="Microsoft.ML.Trainers.LogisticRegressor"/>, | ||
/// <see cref="Microsoft.ML.Trainers.StochasticDualCoordinateAscentClassifier"/>, | ||
/// <see cref="Microsoft.ML.Trainers.FastTreeRegressor"/>, | ||
/// etc. | ||
/// </para> | ||
/// For a complete list of transforms and trainers, please see "Microsoft.ML.Transforms" and "Microsoft.ML.Trainers" namespaces. | ||
/// </summary> | ||
/// <param name="item"></param> | ||
public void Add(ILearningPipelineItem item) => Items.Add(item); | ||
|
||
/// <summary> | ||
/// Remove all the transforms/trainers from the pipeline. | ||
Not loaders? It seems like this would also clear any loaders. #Closed |
||
/// </summary> | ||
public void Clear() => Items.Clear(); | ||
|
||
/// <summary> | ||
/// Check if a specific loader/transform/trainer is in the pipeline? | ||
/// </summary> | ||
/// <param name="item">Any ML component (data loader, transform or trainer) defined as ILearningPipelineItem.</param> | ||
When having type names in documentation, it may be good practice for us to use the |
||
/// <returns>true/false</returns> | ||
That the return value is either true or false is implied by the return value being
Again, I typically copy the MSDN docs: https://msdn.microsoft.com/en-us/library/k5cf1d56(v=vs.110).aspx
|
||
public bool Contains(ILearningPipelineItem item) => Items.Contains(item); | ||
|
||
/// <summary> | ||
/// Copy the pipeline items into an array. | ||
/// </summary> | ||
/// <param name="array">Array the items are copied to.</param> | ||
/// <param name="arrayIndex">Index to start copying from.</param> | ||
In the phrase "Index to start copying from", the word "from" suggests that this is an index on the source, but it is actually an index on the destination. Perhaps "into" rather than "from" would be clearer... plus an explicit
I typically copy the MSDN docs for comments like this. See ICollection.CopyTo
|
||
public void CopyTo(ILearningPipelineItem[] array, int arrayIndex) => Items.CopyTo(array, arrayIndex); | ||
public IEnumerator<ILearningPipelineItem> GetEnumerator() => Items.GetEnumerator(); | ||
|
||
/// <summary> | ||
/// Remove an item from the pipeline. | ||
/// </summary> | ||
/// <param name="item">ILearningPipelineItem to remove.</param> | ||
/// <returns>true/false</returns> | ||
public bool Remove(ILearningPipelineItem item) => Items.Remove(item); | ||
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator(); | ||
|
||
/// <summary> | ||
/// Train the model using the ML components in the pipeline. | ||
/// </summary> | ||
/// <typeparam name="TInput">Type of data instances the model will be trained on. It's a custom type defined by the user according to the structure of data. | ||
/// <para/> | ||
/// E.g. please see "Microsoft.ML.Scenarios.ScenarioTests.SentimentData" in "Microsoft.ML.Tests.csproj" for input type definition for sentiment classification task. | ||
I don't think typical users will have access to the repo. So putting a reference like this in the XML ref docs wouldn't help that much. Maybe a FWLink to the 'getting started' doc instead? |
||
/// The type is defined for a .csv file that contains sentiment classification data with Sentiment and SentimentText as two columns in the .csv file. | ||
/// </typeparam> | ||
/// <typeparam name="TOutput">Output type. The prediction will be returned based on this type. | |
/// E.g. for sentiment classification scenario, the prediction type is defined at "Microsoft.ML.Scenarios.ScenarioTests.SentimentPrediction" in "Microsoft.ML.Tests.csproj". | |
/// </typeparam> | ||
/// <returns>PredictionModel object. This is the model object used for prediction on new instances. </returns> | ||
public PredictionModel<TInput, TOutput> Train<TInput, TOutput>() | ||
where TInput : class | ||
where TOutput : class, new() | ||
|
"perform desired" => "perform a desired"? #Closed