Documentation for BinaryClassification.AveragedPerceptron (V2) #2517
For sample code, changed numIterations to 10 as per Justin's request
@@ -41,12 +41,21 @@ public sealed class AveragedPerceptronTrainer : AveragedLinearTrainer<BinaryPred

public sealed class Options : AveragedLinearArguments
{
/// <summary>
/// The custom <a href="tmpurl_loss">loss</a>.
tmpurl_loss [](start = 36, length = 11)
What's the story with this tmpurl? #Resolved
Shahab is working on a concepts doc. The tmpurl links are placeholders for referencing those concepts.
In reply to: 256202773 [](ancestors = 256202773)
Codecov Report
@@            Coverage Diff             @@
##           master    #2517      +/-   ##
==========================================
- Coverage   71.45%   71.42%   -0.03%
==========================================
  Files         800      801       +1
  Lines      141700   141749      +49
  Branches    16135    16135
==========================================
- Hits       101247   101240       -7
- Misses      35987    36041      +54
- Partials     4466     4468       +2
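The percentages in the report can be cross-checked against the raw counts. The sketch below assumes coverage is computed as hits divided by lines (the report does not state the formula), and the function name is illustrative only.

```python
# Cross-check of the Codecov figures quoted in the report.
# Assumption: coverage % = hits / lines * 100, rounded to two decimals.

def coverage_percent(hits: int, lines: int) -> float:
    """Return line coverage as a percentage rounded to two decimals."""
    return round(hits / lines * 100, 2)

# Hits, misses and partials should add up to the total line count.
master_total = 101247 + 35987 + 4466
pr_total = 101240 + 36041 + 4468

base = coverage_percent(101247, 141700)  # master
head = coverage_percent(101240, 141749)  # this PR
delta = round(head - base, 2)

print(master_total, pr_total)
print(base, head, delta)
```

Under that assumption the counts reproduce the reported 71.45%, 71.42% and -0.03%.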
// Setting the seed to a fixed number in this example to make outputs deterministic.
var mlContext = new MLContext(seed: 0);

// Download and featurize the dataset
// Downl [](start = 12, length = 8)
Period at the end of all comments. #Resolved
var metrics = mlContext.BinaryClassification.EvaluateNonCalibrated(dataWithPredictions, "IsOver50K");
SamplesUtils.ConsoleUtils.PrintMetrics(metrics);

// Output:
Output [](start = 15, length = 6)
Expected output, then an extra space on the output below. #Resolved
Add a space between Expected output and the line below it.
In reply to: 256538945 [](ancestors = 256538945,256221990)
Not sure what you mean. This is what the other samples look like.
In reply to: 256622797 [](ancestors = 256622797,256538945,256221990)
"occupation", "relationship", "ethnicity", "native-country", "age", "education-num",
"capital-gain", "capital-loss", "hours-per-week"))
// Min-max normalized all the features
.Append(mlContext.Transforms.Normalize("Features"));
.Append [](start = 16, length = 7)
Can you add a CopyColumns("Label", "IsOver50K")? That way you don't have to specify the label or remember what it is. #Resolved
@@ -190,17 +190,46 @@ public static class StandardLearnersCatalog
}

/// <summary>
/// Predict a target using a linear binary classification model trained with the AveragedPerceptron trainer.
/// Predict a target using a linear binary classification model trained with averaged perceptron trainer.
The #Resolved
/// </summary>
/// <remarks>
/// Perceptron is a classification algorithm that makes its predictions by finding a separating hyperplane.
Perceptron [](start = 12, length = 11)
The Perceptron #Resolved
///
/// The perceptron is an online algorithm, which means it processes the instances in the training set one at a time.
/// It starts with a set of initial weights (zero, random, or initialized from a previous learner). Then, for each example in the training set, the weighted sum of the features (sigma[0, D-1] (w_i * f_i)) is computed.
/// If this value has the same sign as the label of the current example, the weights remain the same.If they have opposite signs,
.If [](start = 108, length = 4)
Space on "same. If" #Resolved
/// The perceptron is an online algorithm, which means it processes the instances in the training set one at a time.
/// It starts with a set of initial weights (zero, random, or initialized from a previous learner). Then, for each example in the training set, the weighted sum of the features (sigma[0, D-1] (w_i * f_i)) is computed.
/// If this value has the same sign as the label of the current example, the weights remain the same.If they have opposite signs,
/// the weights vector is updated by either subtracting or adding (if the label is negative or positive, respectively) the feature vector of the current example,
subtracting or adding [](start = 52, length = 21)
"adding or subtracting" (common form of speech) #Resolved
/// It starts with a set of initial weights (zero, random, or initialized from a previous learner). Then, for each example in the training set, the weighted sum of the features (sigma[0, D-1] (w_i * f_i)) is computed.
/// If this value has the same sign as the label of the current example, the weights remain the same.If they have opposite signs,
/// the weights vector is updated by either subtracting or adding (if the label is negative or positive, respectively) the feature vector of the current example,
/// multiplied by a factor 0 < a <= 1, called the learning rate.In a generalization of this algorithm, the weights are updated by adding the feature vector multiplied by the learning rate,
e.In [](start = 76, length = 5)
Space #Resolved
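For readers following the remarks text under review, here is a minimal sketch of the update rule it describes: zero initial weights, a weighted sum per example, and an add-or-subtract update scaled by the learning rate when the signs disagree. It is a language-neutral illustration in Python, not the ML.NET implementation. The function name, the +1/-1 label encoding, and treating a score of exactly zero as a mistake are all assumptions, and the averaging step of the averaged perceptron is not shown because the quoted text does not reach it.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, label), label is +1 or -1


def train_perceptron(
    examples: Sequence[Example],
    learning_rate: float = 1.0,
    num_iterations: int = 10,
) -> List[float]:
    """Plain perceptron training as described in the remarks (hypothetical helper)."""
    weights = [0.0] * len(examples[0][0])  # start from zero weights
    for _ in range(num_iterations):        # passes through the training set
        for features, label in examples:   # online: one example at a time
            score = sum(w * f for w, f in zip(weights, features))
            if score * label > 0:
                continue  # same sign as the label: weights stay unchanged
            # Opposite sign (or zero, by assumption): add the scaled feature
            # vector for a positive label, subtract it for a negative one.
            weights = [w + learning_rate * label * f
                       for w, f in zip(weights, features)]
    return weights


data: List[Example] = [
    ([1.0, 1.0], 1),
    ([-1.0, -1.0], -1),
    ([1.0, 0.5], 1),
    ([-0.5, -1.0], -1),
]
weights = train_perceptron(data)
print(weights)
```

On this toy data only the first example triggers an update, because the weights it produces already separate the other three.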
/// <param name="catalog">The binary classification catalog trainer object.</param>
/// <param name="labelColumn">The name of the label column, or dependent variable.</param>
/// <param name="featureColumn">The features, or independent variables.</param>
/// <param name="lossFunction">The custom loss.</param>
/// <param name="lossFunction">The custom <a href="tmpurl_loss">loss</a>. If <see langword="null"/>, hinge loss will be used resulting in max-margin averaged perceptron.</param>
The [](start = 38, length = 4)
A custom #Resolved
@@ -219,10 +248,18 @@ public static class StandardLearnersCatalog
}

/// <summary>
/// Predict a target using a linear binary classification model trained with the AveragedPerceptron trainer.
/// Predict a target using a linear binary classification model trained with averaged perceptron trainer using advanced options.
/// For usage details, please see <see cref="AveragedPerceptron(BinaryClassificationCatalog.BinaryClassificationTrainers, string, string, string, IClassificationLoss, float, bool, float, int)"/>
For usage details [](start = 12, length = 17)
This goes in remarks, so it doesn't show up in the table of contents. #Resolved
@@ -17,38 +17,89 @@ namespace Microsoft.ML.Trainers.Online
{
public abstract class AveragedLinearArguments : OnlineLinearArguments
{
/// <summary>
/// <a href="tmpurl_lr">Learning rate</a>.
Learning rate [](start = 32, length = 13)
Remarks? #Resolved
IMO, if it is short, it can stay in summary. This does need more text, if that's what you mean by remarks.
In reply to: 256223629 [](ancestors = 256223629)
I was thinking the same as Senja. The remark would simply be the link, so we might as well leave it in the summary.
In reply to: 256251344 [](ancestors = 256251344,256223629)
/// Determine whether to decrease the <see cref="LearningRate"/> or not.
/// </summary>
/// <value>
/// <see langword="true" /> to decrease the <see cref="LearningRate"/> as iterations progress; otherwise, <see langword="false" />.
<see lan [](start = 12, length = 8)
Remarks on exactly how it decreases? #Resolved
// - Prediction calibration to produce probabilities. Off by default, if on, uses exponential (aka Platt) calibration.
/// <include file='doc.xml' path='doc/members/member[@name="AP"]/*' />
/// <summary>
/// This is averaged perceptron trainer.
This is averaged perceptron trainer. [](start = 8, length = 36)
See the email on standard summary text. #Resolved
@@ -19,27 +19,41 @@ namespace Microsoft.ML.Trainers.Online

public abstract class OnlineLinearArguments : LearnerInputBaseWithLabel
{
/// <summary>
/// Number of training iterations through the data.
Number of training iterations [](start = 12, length = 29)
Number of passes through the dataset. #Resolved
/// <summary>
/// <see langword="true" /> to shuffle data for each training iteration; otherwise, <see langword="false" />.
/// Default is <see langword="true" />.
Default is <see lan [](start = 11, length = 20)
in value #Resolved
// In this examples we will use the adult income dataset. The goal is to predict
// if a person's income is above $50K or not, based on different pieces of information about that person.
// For more details about this dataset, please see https://archive.ics.uci.edu/ml/datasets/adult
Would this be better atop Example()? #Resolved
/// <param name="catalog">The binary classification catalog trainer object.</param>
/// <param name="labelColumn">The name of the label column, or dependent variable.</param>
/// <param name="featureColumn">The features, or independent variables.</param>
/// <param name="lossFunction">The custom loss.</param>
/// <param name="lossFunction">The custom <a href="tmpurl_loss">loss</a>. If <see langword="null"/>, hinge loss will be used resulting in max-margin averaged perceptron.</param>
tmpurl_loss [](start = 59, length = 11)
replace before checking in #WontFix
We don't have the final links yet. The find-and-replace needs to be done later.
In reply to: 256250676 [](ancestors = 256250676)
/// <param name="learningRate">The learning Rate.</param>
/// <param name="decreaseLearningRate">Decrease learning rate as iterations progress.</param>
/// <param name="l2RegularizerWeight">L2 regularization weight.</param>
/// <param name="learningRate"><a href="tmpurl_lr">Learning rate</a>.</param>
tmpurl_lr [](start = 48, length = 9)
replace #WontFix
/// <example>
/// <format type="text/markdown">
/// <]
A [](start = 127, length = 1)
Trainers/AveragedPerceptron.cs #Resolved
@@ -19,27 +19,41 @@ namespace Microsoft.ML.Trainers.Online

public abstract class OnlineLinearArguments : LearnerInputBaseWithLabel
OnlineLinearArguments [](start = 26, length = 21)
xml #Resolved
@@ -17,38 +17,89 @@ namespace Microsoft.ML.Trainers.Online
{
public abstract class AveragedLinearArguments : OnlineLinearArguments
AveragedLinearArguments [](start = 26, length = 23)
xml #Resolved
@@ -219,10 +248,18 @@ public static class StandardLearnersCatalog
}

/// <summary>
/// Predict a target using a linear binary classification model trained with the AveragedPerceptron trainer.
/// Predict a target using a linear binary classification model trained with averaged perceptron trainer using advanced options.
/// For usage details, please see <see cref="AveragedPerceptron(BinaryClassificationCatalog.BinaryClassificationTrainers, string, string, string, IClassificationLoss, float, bool, float, int)"/>
/// </summary>
I would prefer that we have the full text on all extensions, and not redirect people to the other one just to read a paragraph. This is where the doc.xml includes can come in handy. #Resolved
As discussed offline, moving this text to the estimator class.
In reply to: 256252156 [](ancestors = 256252156)
[Argument(ArgumentType.AtMostOnce, HelpText = "Number of iterations", ShortName = "iter", SortOrder = 50)]
[TGUI(Label = "Number of Iterations", Description = "Number of training iterations through data", SuggestedSweeps = "1,10,100")]
[TlcModule.SweepableLongParamAttribute("NumIterations", 1, 100, stepSize: 10, isLogScale: true)]
public int NumIterations = OnlineDefaultArgs.NumIterations;
NumIterations [](start = 19, length = 13)
Set the Name attribute to the old name every time you change one of them, for maml backwards compat. #Resolved
I've covered it in AveragedPerceptronWithOptions.cs sample. We have public losses but they don't look good. I'll bring it up in the scrum.
In reply to: 463025252 [](ancestors = 463025252)
Refers to: src/Microsoft.ML.StandardLearners/StandardLearnersCatalog.cs:238 in 59673b8. [](commit_id = 59673b8, deletion_comment = False)
var metrics = mlContext.BinaryClassification.EvaluateNonCalibrated(dataWithPredictions);
SamplesUtils.ConsoleUtils.PrintMetrics(metrics);

// Expected output:
Expected output: [](start = 15, length = 16)
space #Pending
@@ -86,6 +87,57 @@ public static string DownloadSentimentDataset()
public static string DownloadAdultDataset()
=> Download("https://raw.githubusercontent.com/dotnet/machinelearning/244a8c2ac832657af282aa312d568211698790aa/test/data/adult.train", "adult.txt");

public static IDataView LoadFeaturizedAdultDataset(MLContext mlContext)
LoadFeaturizedAdultDataset [](start = 32, length = 26)
xml doc since this is public #Resolved
/// <summary>
/// Arguments class for online linear trainers.
/// </summary>
public abstract class OnlineLinearArguments : LearnerInputBaseWithLabel
Not really. This serves as the base class (for the online linear learners) from which the public-facing Options class is derived.
It should be internal, though. I believe Issue #2264 should take care of making it internal.
I would like to disagree regarding internal. Why? If you make it internal, all these properties will disappear for the user, and they are good properties.
What is wrong with having a hierarchy of classes?
In reply to: 256639493 [](ancestors = 256639493)
I was going by smell test -- I do not see any samples (yet) of how this class would be helpful for a public facing API. I may be wrong though.
Would like to hear @yaeldekel / @codemzs thoughts on this since they are working on "lockdown" issues.
If we do want to keep this public, perhaps we should rename it to OnlineLinearOptions?
We have the following structure:
OnlineLinear Options -> LinearSVM Options
|
|____> AverageLinear Options--> AveragePerceptron Options
|
|____________> OnlineGradientDescent Options
The base options contain the properties that belong to all of them; the derived classes contain the trainer-specific options.
The only way to get rid of this tree structure is to create new flattened objects, each of which carries all of its required properties itself. But that would also mean repeating some properties across these flattened structures.
In reply to: 256646030 [](ancestors = 256646030)
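The trade-off in this thread can be illustrated with a small sketch: shared settings live once on a base options class and are inherited, rather than repeated on each flattened class. This is a Python illustration only, not the ML.NET types. The class names, field names and default values are hypothetical placeholders, apart from shuffle defaulting to true and a null loss meaning hinge loss, which the quoted doc comments state. OnlineGradientDescent is left out because the flattened diagram does not show clearly which base it hangs off.

```python
from dataclasses import dataclass, fields


@dataclass
class OnlineLinearOptions:
    """Hypothetical base: settings shared by every online linear trainer."""
    number_of_iterations: int = 1   # placeholder default
    shuffle: bool = True            # doc comment says the default is true


@dataclass
class LinearSvmOptions(OnlineLinearOptions):
    """Derives directly from the base; the extra field is made up."""
    svm_specific_setting: float = 0.0


@dataclass
class AveragedLinearOptions(OnlineLinearOptions):
    """Adds the settings shared by the averaged trainers."""
    learning_rate: float = 1.0            # placeholder default
    decrease_learning_rate: bool = False  # placeholder default


@dataclass
class AveragedPerceptronOptions(AveragedLinearOptions):
    """Leaf class: only the trainer-specific option is declared here."""
    loss_function: str = "hinge"  # stands in for "null means hinge loss"


opts = AveragedPerceptronOptions(number_of_iterations=10)
field_names = [f.name for f in fields(opts)]
print(field_names)
```

The leaf class declares one field but exposes five, which is the "good properties" argument for keeping the base classes visible.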
/// <summary>
/// Number of passes through the training dataset.
/// </summary>
[Argument(ArgumentType.AtMostOnce, HelpText = "Number of iterations", ShortName = "iter, numIterations", SortOrder = 50)]
numIterations", [](start = 96, length = 16)
Add a Name attribute property, for backwards compatibility, here and below in InitialWeightsDiameter.
#Resolved
ShortName and Name work the same way, so you can have it in either place.
On the other hand, ShortName implies that it is a short name, and numIterations is not so short.
Also, I'm not sure whether the presence of the space would make the system happy: iter, numIterations -> iter,numIterations
#Resolved
I'll remove the space. numIterations is now "short", given the new name NumberOfIterations :)
In reply to: 257009222 [](ancestors = 257009222)
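Whether the space matters depends on how the argument parser splits the alias list, which this thread does not settle. The sketch below shows the failure mode under the assumption of a parser that splits on commas without trimming. Both helper functions are hypothetical and are not the actual command-line parser.

```python
from typing import List


def split_aliases_naive(short_name: str) -> List[str]:
    """Hypothetical parser that splits on commas and keeps whitespace."""
    return short_name.split(",")


def split_aliases_trimmed(short_name: str) -> List[str]:
    """Hypothetical parser that also trims each alias."""
    return [alias.strip() for alias in short_name.split(",")]


naive = split_aliases_naive("iter, numIterations")
trimmed = split_aliases_trimmed("iter, numIterations")

print(naive)    # ['iter', ' numIterations']
print(trimmed)  # ['iter', 'numIterations']
print("numIterations" in naive, "numIterations" in trimmed)  # False True
```

Writing the list as iter,numIterations sidesteps the question either way.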
#2483 got messed up with a bad rebase. Recreating the PR here.
Docs & sample for BinaryClassification.AveragedPerceptron. Related to #1209.