Commit 6cb5a37

Add LR XML doc (#3385)
1 parent 5241462 commit 6cb5a37

2 files changed: +57 −6 lines changed

src/Microsoft.ML.StandardTrainers/Standard/LogisticRegression/LogisticRegression.cs

+53 −2
@@ -27,8 +27,55 @@
 namespace Microsoft.ML.Trainers
 {
 
- /// <include file='doc.xml' path='doc/members/member[@name="LBFGS"]/*' />
- /// <include file='doc.xml' path='docs/members/example[@name="LogisticRegressionBinaryClassifier"]/*' />
+ /// <summary>
+ /// The <see cref="IEstimator{TTransformer}"/> to predict a target using a linear logistic regression model trained with the L-BFGS method.
+ /// </summary>
+ /// <remarks>
+ /// <format type="text/markdown"><![CDATA[
+ /// To create this trainer, use [LbfgsLogisticRegression](xref:Microsoft.ML.StandardTrainersCatalog.LbfgsLogisticRegression(Microsoft.ML.BinaryClassificationCatalog.BinaryClassificationTrainers,System.String,System.String,System.String,System.Single,System.Single,System.Single,System.Int32,System.Boolean))
+ /// or [LbfgsLogisticRegression(Options)](xref:Microsoft.ML.StandardTrainersCatalog.LbfgsLogisticRegression(Microsoft.ML.BinaryClassificationCatalog.BinaryClassificationTrainers,Microsoft.ML.Trainers.LbfgsLogisticRegressionBinaryTrainer.Options)).
+ ///
+ /// [!include[io](~/../docs/samples/docs/api-reference/io-columns-binary-classification.md)]
+ ///
+ /// ### Trainer Characteristics
+ /// | | |
+ /// | -- | -- |
+ /// | Machine learning task | Binary classification |
+ /// | Is normalization required? | Yes |
+ /// | Is caching required? | No |
+ /// | Required NuGet in addition to Microsoft.ML | None |
+ ///
+ /// ### Scoring Function
+ /// Linear logistic regression is a variant of the linear model. It maps a feature vector $\boldsymbol{x} \in {\mathbb R}^n$ to a scalar via $\hat{y}\left(\boldsymbol{x}\right) = \boldsymbol{w}^T \boldsymbol{x} + b = \sum_{j=1}^n w_j x_j + b$,
+ /// where $x_j$ is the $j$-th feature's value, the $j$-th element of $\boldsymbol{w}$ is the $j$-th feature's coefficient, and $b$ is a learnable bias.
+ /// The corresponding probability of getting a true label is $\frac{1}{1 + e^{-\hat{y}\left(\boldsymbol{x}\right)}}$.
+ ///
+ /// ### Training Algorithm Details
+ /// The optimization technique implemented is based on [the limited memory Broyden-Fletcher-Goldfarb-Shanno method (L-BFGS)](https://en.wikipedia.org/wiki/Limited-memory_BFGS).
+ /// L-BFGS is a [quasi-Newton method](https://en.wikipedia.org/wiki/Quasi-Newton_method) that replaces the expensive computation of the Hessian matrix with an approximation, while still enjoying a fast convergence rate like [Newton's method](https://en.wikipedia.org/wiki/Newton%27s_method_in_optimization), where the full Hessian matrix is computed.
+ /// Since the L-BFGS approximation uses only a limited number of historical states to compute the next step direction, it is especially suited for problems with high-dimensional feature vectors.
+ /// The number of historical states is a user-specified parameter; using a larger number may lead to a better approximation of the Hessian matrix but also a higher computation cost per step.
+ ///
+ /// Regularization is a method that can render an ill-posed problem more tractable by imposing constraints that provide information to supplement the data, and it prevents overfitting by penalizing the model's magnitude, usually measured by some norm function.
+ /// This can improve the generalization of the model learned by selecting the optimal complexity in the bias-variance tradeoff.
+ /// Regularization works by adding a penalty associated with the coefficient values to the error of the hypothesis.
+ /// An accurate model with extreme coefficient values is penalized more, while a less accurate model with more conservative values is penalized less.
+ ///
+ /// This learner supports [elastic net regularization](https://en.wikipedia.org/wiki/Elastic_net_regularization): a linear combination of the L1-norm (LASSO), $|| \boldsymbol{w} ||_1$, and the L2-norm (ridge), $|| \boldsymbol{w} ||_2^2$, regularizations.
+ /// L1-norm and L2-norm regularizations have different effects and uses that are complementary in certain respects.
+ /// Using the L1-norm can increase the sparsity of the trained $\boldsymbol{w}$.
+ /// When working with high-dimensional data, it shrinks the small weights of irrelevant features to 0, so no resources are spent on those features when making predictions.
+ /// If L1-norm regularization is used, the training algorithm used is [OWL-QN](http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.68.5260).
+ /// L2-norm regularization is preferable for data that is not sparse, and it strongly penalizes large weights.
+ ///
+ /// An aggressive regularization (that is, assigning large coefficients to the L1-norm or L2-norm regularization terms) can harm predictive capacity by excluding important variables from the model.
+ /// Therefore, choosing the right regularization coefficients is important when applying logistic regression.
+ /// ]]>
+ /// </format>
+ /// </remarks>
+ /// <seealso cref="Microsoft.ML.StandardTrainersCatalog.LbfgsLogisticRegression(BinaryClassificationCatalog.BinaryClassificationTrainers, string, string, string, float, float, float, int, bool)"/>
+ /// <seealso cref="Microsoft.ML.StandardTrainersCatalog.LbfgsLogisticRegression(BinaryClassificationCatalog.BinaryClassificationTrainers, LbfgsLogisticRegressionBinaryTrainer.Options)"/>
+ /// <seealso cref="Options"/>
 public sealed partial class LbfgsLogisticRegressionBinaryTrainer : LbfgsTrainerBase<LbfgsLogisticRegressionBinaryTrainer.Options,
 BinaryPredictionTransformer<CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator>>,
 CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator>>
@@ -39,6 +86,10 @@ public sealed partial class LbfgsLogisticRegressionBinaryTrainer : LbfgsTrainerB
 internal const string Summary = "Logistic Regression is a method in statistics used to predict the probability of occurrence of an event and can "
 + "be used as a classification algorithm. The algorithm predicts the probability of occurrence of an event by fitting data to a logistical function.";
 
+ /// <summary>
+ /// Options for the <see cref="LbfgsLogisticRegressionBinaryTrainer"/> as used in
+ /// <see cref="Microsoft.ML.StandardTrainersCatalog.LbfgsLogisticRegression(BinaryClassificationCatalog.BinaryClassificationTrainers, LbfgsLogisticRegressionBinaryTrainer.Options)"/>
+ /// </summary>
 public sealed class Options : OptionsBase
 {
 /// <summary>
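
The second hunk above documents the Options type consumed by the advanced-options overload. As a rough illustration of how the history size and the elastic-net penalties discussed in the remarks surface through that type, here is a hedged C# sketch; the option names (LabelColumnName, FeatureColumnName, HistorySize, L1Regularization, L2Regularization, EnforceNonNegativity) and the values are assumptions based on ML.NET's option naming and are not shown in this diff:

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers;

var mlContext = new MLContext();

// Assumed option names; the diff documents the Options type but does not list its members.
var options = new LbfgsLogisticRegressionBinaryTrainer.Options
{
    LabelColumnName = "Label",
    FeatureColumnName = "Features",
    HistorySize = 50,          // number of historical L-BFGS states kept for the Hessian approximation
    L1Regularization = 0.1f,   // LASSO term; larger values push more coefficients of w toward zero
    L2Regularization = 0.1f,   // ridge term; penalizes large weights
    EnforceNonNegativity = false
};

// The advanced-options overload documented in StandardTrainersCatalog.cs below.
var trainer = mlContext.BinaryClassification.Trainers.LbfgsLogisticRegression(options);
```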

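The scoring function described in the new remarks is a dot product plus a bias, mapped to a probability by the logistic (sigmoid) function. Below is a minimal, self-contained sketch of that arithmetic; the weight, bias, and feature values are made up for illustration, and the sign convention assumes a larger score means a higher probability of the true label:

```csharp
using System;

// Hypothetical learned parameters and input; real values come from a trained model.
float[] weights = { 0.5f, -1.2f, 0.3f };   // coefficients w
float bias = 0.1f;                          // learnable bias b
float[] features = { 1.0f, 0.4f, 2.0f };    // feature vector x

// Score: y(x) = w^T x + b = sum_j w_j * x_j + b
float score = bias;
for (int j = 0; j < weights.Length; j++)
    score += weights[j] * features[j];

// Probability of the true (positive) label: the sigmoid of the score.
double probability = 1.0 / (1.0 + Math.Exp(-score));

Console.WriteLine($"score = {score:F4}, probability = {probability:F4}");
```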
src/Microsoft.ML.StandardTrainers/StandardTrainersCatalog.cs

+4 −4
@@ -518,11 +518,11 @@ public static OnlineGradientDescentTrainer OnlineGradientDescent(this Regression
 }
 
 /// <summary>
- /// Predict a target using a linear binary classification model trained with the <see cref="Trainers.LbfgsLogisticRegressionBinaryTrainer"/> trainer.
+ /// Create <see cref="Trainers.LbfgsLogisticRegressionBinaryTrainer"/>, which predicts a target using a linear binary classification model trained over boolean label data.
 /// </summary>
 /// <param name="catalog">The binary classification catalog trainer object.</param>
- /// <param name="labelColumnName">The name of the label column.</param>
- /// <param name="featureColumnName">The name of the feature column.</param>
+ /// <param name="labelColumnName">The name of the label column. The column data must be <see cref="System.Boolean"/>.</param>
+ /// <param name="featureColumnName">The name of the feature column. The column data must be a known-sized vector of <see cref="System.Single"/>.</param>
 /// <param name="exampleWeightColumnName">The name of the example weight column (optional).</param>
 /// <param name="enforceNonNegativity">Enforce non-negative weights.</param>
 /// <param name="l1Regularization">The L1 <a href='https://en.wikipedia.org/wiki/Regularization_(mathematics)'>regularization</a> hyperparameter. Higher values will tend to lead to more sparse model.</param>
@@ -552,7 +552,7 @@ public static LbfgsLogisticRegressionBinaryTrainer LbfgsLogisticRegression(this
 }
 
 /// <summary>
- /// Predict a target using a linear binary classification model trained with the <see cref="Trainers.LbfgsLogisticRegressionBinaryTrainer"/> trainer.
+ /// Create <see cref="Trainers.LbfgsLogisticRegressionBinaryTrainer"/> with advanced options, which predicts a target using a linear binary classification model trained over boolean label data.
 /// </summary>
 /// <param name="catalog">The binary classification catalog trainer object.</param>
 /// <param name="options">Advanced arguments to the algorithm.</param>
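
The updated parameter docs spell out the required column types: a Boolean label and a known-sized vector of Single for the features. A hedged end-to-end sketch of the simple overload under those constraints; the DataPoint class, column names, and toy values are illustrative and not part of this diff:

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext();

// Toy in-memory data matching the documented column types.
var samples = new[]
{
    new DataPoint { Label = true,  Features = new float[] { 1f, 2f, 3f } },
    new DataPoint { Label = false, Features = new float[] { 0f, 1f, 0f } },
};
var data = mlContext.Data.LoadFromEnumerable(samples);

// Simple overload; hyperparameters keep their documented defaults.
var trainer = mlContext.BinaryClassification.Trainers.LbfgsLogisticRegression(
    labelColumnName: "Label", featureColumnName: "Features");

var model = trainer.Fit(data);

class DataPoint
{
    // The label column data must be Boolean, per the updated docs.
    public bool Label { get; set; }

    // The feature column data must be a known-sized vector of Single, per the updated docs.
    [VectorType(3)]
    public float[] Features { get; set; }
}
```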
