XML documentation for SDCA regression trainer. #3403

Merged
merged 3 commits into from
Apr 21, 2019
25 changes: 25 additions & 0 deletions docs/api-reference/algo-details-sdca.md
@@ -0,0 +1,25 @@
### Training Algorithm Details
This trainer is based on the Stochastic Dual Coordinate Ascent (SDCA) method, a
state-of-the-art optimization technique for convex objective functions. The
algorithm scales to large data sets that do not fit in memory, thanks to a
semi-asynchronous implementation that supports multi-threading.

Convergence is ensured by periodically synchronizing the primal and dual
variables in a separate thread. Several loss functions are also available.
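
To make the relationship between primal and dual variables concrete, here is a minimal single-threaded sketch of dual coordinate ascent for squared loss with L2 regularization. It is not the ML.NET implementation: the class name `SdcaSketch`, the method `Train`, and its parameters are hypothetical, the examples are visited in a fixed order rather than sampled, and the per-coordinate update is the closed form for squared loss given in the Shalev-Shwartz and Zhang paper linked at the end of this section.

```csharp
using System;

// Hypothetical teaching sketch; NOT the ML.NET SDCA implementation.
public static class SdcaSketch
{
    // Minimizes (1/n) * sum_i (w.x_i - y_i)^2 + (lambda/2) * ||w||^2
    // by ascending the dual one coordinate (one example) at a time,
    // visiting the examples in a fixed order (no shuffling, one thread).
    public static double[] Train(double[][] x, double[] y, double lambda, int epochs)
    {
        int n = x.Length;
        int d = x[0].Length;
        var w = new double[d];      // primal weights
        var alpha = new double[n];  // one dual variable per training example

        for (int epoch = 0; epoch < epochs; epoch++)
        {
            for (int i = 0; i < n; i++)
            {
                double dot = 0.0, normSq = 0.0;
                for (int j = 0; j < d; j++)
                {
                    dot += w[j] * x[i][j];
                    normSq += x[i][j] * x[i][j];
                }

                // Closed-form maximizer of the dual along coordinate i (squared loss).
                double delta = (y[i] - dot - 0.5 * alpha[i])
                               / (0.5 + normSq / (lambda * n));

                alpha[i] += delta;

                // Keep the invariant w = (1 / (lambda * n)) * sum_i alpha_i * x_i.
                for (int j = 0; j < d; j++)
                    w[j] += delta * x[i][j] / (lambda * n);
            }
        }

        return w;
    }
}
```

Because the visiting order is fixed and only one thread is used, calling `SdcaSketch.Train` twice on the same inputs returns identical weights. The `lambda * n` term in the denominator of the step also hints at why stronger L2 regularization tends to speed up convergence.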

Note that SDCA is a stochastic, streaming optimization algorithm. Because
training stops at a finite convergence tolerance, the result depends on the
order of the training data. In strongly convex optimization the optimal
solution is unique, so every run converges toward the same solution. Even in
non-strongly-convex cases, the solutions are equally good from run to run.
For reproducible results, set 'Shuffle' to False and 'NumThreads' to 1.
Elastic net regularization can be specified by the 'L2Const' and 'L1Threshold'
parameters. Note that 'L2Const' affects the rate of convergence: in general,
the larger 'L2Const' is, the faster SDCA converges.
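
The reproducibility and regularization settings above might be applied as in the following sketch. It assumes the property names of `SdcaRegressionTrainer.Options` in released Microsoft.ML 1.x packages (`Shuffle`, `NumberOfThreads`, `L2Regularization`, `L1Regularization`), which differ from the names quoted in the text. The wrapper class, the column names "Label" and "Features", and the regularization values are illustrative assumptions only.

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers;

// Hypothetical helper; requires the Microsoft.ML NuGet package.
public static class ReproducibleSdcaSketch
{
    public static SdcaRegressionTrainer Create(MLContext mlContext)
    {
        var options = new SdcaRegressionTrainer.Options
        {
            LabelColumnName = "Label",        // assumed column name
            FeatureColumnName = "Features",   // assumed column name

            // Fixed example order and a single thread for reproducible runs.
            Shuffle = false,
            NumberOfThreads = 1,

            // Elastic net: L2 and L1 terms together (illustrative values).
            L2Regularization = 0.1f,
            L1Regularization = 0.01f
        };

        return mlContext.Regression.Trainers.Sdca(options);
    }
}
```

A larger `L2Regularization` value would be expected to speed up convergence, at the cost of shrinking the weights more strongly.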

For more information, see:
* [Scaling Up Stochastic Dual Coordinate
Ascent.](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/main-3.pdf)
* [Stochastic Dual Coordinate Ascent Methods for Regularized Loss
Minimization.](http://www.jmlr.org/papers/volume14/shalev-shwartz13a/shalev-shwartz13a.pdf)
23 changes: 22 additions & 1 deletion src/Microsoft.ML.StandardTrainers/Standard/SdcaRegression.cs
@@ -24,7 +24,28 @@ namespace Microsoft.ML.Trainers
/// <summary>
/// The <see cref="IEstimator{TTransformer}"/> for training a regression model using the stochastic dual coordinate ascent method.
/// </summary>
/// <include file='doc.xml' path='doc/members/member[@name="SDCA_remarks"]/*' />
/// <remarks>
/// <format type="text/markdown"><![CDATA[
/// To create this trainer, use [Sdca](xref:Microsoft.ML.StandardTrainersCatalog.Sdca(Microsoft.ML.RegressionCatalog.RegressionTrainers,System.String,System.String,System.String,Microsoft.ML.Trainers.ISupportSdcaRegressionLoss,System.Nullable{System.Single},System.Nullable{System.Single},System.Nullable{System.Int32}))
/// or [Sdca(Options)](xref:Microsoft.ML.StandardTrainersCatalog.Sdca(Microsoft.ML.RegressionCatalog.RegressionTrainers,Microsoft.ML.Trainers.SdcaRegressionTrainer.Options)).
///
/// [!include[io](~/../docs/samples/docs/api-reference/io-columns-regression.md)]
///
/// ### Trainer Characteristics
/// | | |
/// | -- | -- |
/// | Machine learning task | Regression |
/// | Is normalization required? | Yes |
/// | Is caching required? | No |
/// | Required NuGet in addition to Microsoft.ML | None |
///
/// [!include[io](~/../docs/samples/docs/api-reference/algo-details-sdca.md)]
/// ]]>
/// </format>
/// </remarks>
/// <seealso cref="StandardTrainersCatalog.Sdca(RegressionCatalog.RegressionTrainers, string, string, string, ISupportSdcaRegressionLoss, float?, float?, int?)"/>
/// <seealso cref="StandardTrainersCatalog.Sdca(RegressionCatalog.RegressionTrainers, SdcaRegressionTrainer.Options)"/>
/// <seealso cref="Options"/>
public sealed class SdcaRegressionTrainer : SdcaTrainerBase<SdcaRegressionTrainer.Options, RegressionPredictionTransformer<LinearRegressionModelParameters>, LinearRegressionModelParameters>
{
internal const string LoadNameValue = "SDCAR";
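
As a usage sketch for the remarks above: the characteristics table says normalization is required, so a pipeline might normalize the features before appending the trainer created through the simple `Sdca` overload. The wrapper class and the column names "Label" and "Features" are assumptions, and `NormalizeMinMax` is only one of several normalizers available in released Microsoft.ML packages.

```csharp
using Microsoft.ML;

// Hypothetical helper; requires the Microsoft.ML NuGet package.
public static class SdcaPipelineSketch
{
    public static IEstimator<ITransformer> Build(MLContext mlContext)
    {
        // Scale the feature vector first, since the trainer expects normalized input.
        return mlContext.Transforms.NormalizeMinMax("Features")
            .Append(mlContext.Regression.Trainers.Sdca(
                labelColumnName: "Label",
                featureColumnName: "Features"));
    }
}
```

Calling `Fit` on the returned estimator with an `IDataView` that contains those two columns would train the model.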
16 changes: 8 additions & 8 deletions src/Microsoft.ML.StandardTrainers/StandardTrainersCatalog.cs
@@ -129,11 +129,11 @@ public static SgdNonCalibratedTrainer SgdNonCalibrated(this BinaryClassification
}

/// <summary>
/// Predict a target using a linear regression model trained with <see cref="SdcaRegressionTrainer"/>.
/// Create <see cref="SdcaRegressionTrainer"/>, which predicts a target using a linear regression model.
/// </summary>
/// <param name="catalog">The regression catalog trainer object.</param>
/// <param name="labelColumnName">The name of the label column.</param>
/// <param name="featureColumnName">The name of the feature column.</param>
/// <param name="labelColumnName">The name of the label column. The column data must be <see cref="System.Single"/>.</param>
/// <param name="featureColumnName">The name of the feature column. The column data must be a known-sized vector of <see cref="System.Single"/>.</param>
/// <param name="exampleWeightColumnName">The name of the example weight column (optional).</param>
/// <param name="lossFunction">The <a href="https://en.wikipedia.org/wiki/Loss_function">loss</a> function minimized in the training process. For example, the default <see cref="SquaredLoss"/> yields a least-squares trainer.</param>
/// <param name="l2Regularization">The L2 weight for <a href='https://en.wikipedia.org/wiki/Regularization_(mathematics)'>regularization</a>.</param>
@@ -160,7 +160,7 @@ public static SdcaRegressionTrainer Sdca(this RegressionCatalog.RegressionTraine
}

/// <summary>
/// Predict a target using a linear regression model trained with <see cref="SdcaRegressionTrainer"/> and advanced options.
/// Create <see cref="SdcaRegressionTrainer"/> with advanced options, which predicts a target using a linear regression model.
/// </summary>
/// <param name="catalog">The regression catalog trainer object.</param>
/// <param name="options">Trainer options.</param>
@@ -181,11 +181,11 @@ public static SdcaRegressionTrainer Sdca(this RegressionCatalog.RegressionTraine
}

/// <summary>
/// Predict a target using a linear classification model trained with <see cref="SdcaLogisticRegressionBinaryTrainer"/>.
/// Create <see cref="SdcaLogisticRegressionBinaryTrainer"/>, which predicts a target using a linear classification model.
/// </summary>
/// <param name="catalog">The binary classification catalog trainer object.</param>
/// <param name="labelColumnName">The name of the label column.</param>
/// <param name="featureColumnName">The name of the feature column.</param>
/// <param name="labelColumnName">The name of the label column. The column data must be <see cref="System.Boolean"/>.</param>
/// <param name="featureColumnName">The name of the feature column. The column data must be a known-sized vector of <see cref="System.Single"/>.</param>
/// <param name="exampleWeightColumnName">The name of the example weight column (optional).</param>
/// <param name="l2Regularization">The L2 weight for <a href='https://en.wikipedia.org/wiki/Regularization_(mathematics)'>regularization</a>.</param>
/// <param name="l1Regularization">The L1 <a href='https://en.wikipedia.org/wiki/Regularization_(mathematics)'>regularization</a> hyperparameter. Higher values tend to produce a sparser model.</param>
@@ -211,7 +211,7 @@ public static SdcaLogisticRegressionBinaryTrainer SdcaLogisticRegression(
}

/// <summary>
/// Predict a target using a linear classification model trained with <see cref="SdcaLogisticRegressionBinaryTrainer"/> and advanced options.
/// Create <see cref="SdcaLogisticRegressionBinaryTrainer"/> using advanced options, which predicts a target using a linear classification model.
/// </summary>
/// <param name="catalog">The binary classification catalog trainer object.</param>
/// <param name="options">Trainer options.</param>