Cleaned LightGBM documentation #2886
/// Gradient boosting decision tree.
/// </summary>
/// <remarks>
/// For details, please see <a href="https://en.wikipedia.org/wiki/Gradient_boosting#Gradient_tree_boosting">here</a>.
replace with "Gradient_tree_boosting" #Resolved
/// </summary>
/// <value>
/// 0 means disable bagging. N means perform bagging at every N iterations.
/// To enable bagging, <see cref="SubsampleFraction"/> should also be set to a value less than 1.0.
should also be set to a value less than 1.0
Just curious, where did this information come from? #Resolved
Three sources:
* https://github.com/Microsoft/LightGBM/blob/master/docs/Parameters.rst
* https://lightgbm.readthedocs.io/en/latest/Parameters.html
* ML.NET source code
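For reference, the rule discussed in this thread (bagging only takes effect when the frequency is non-zero and the fraction is below 1.0) can be sketched in a few lines. This is a minimal Python illustration, not ML.NET or LightGBM code: the function names are hypothetical, and resampling at iterations 0, N, 2N, ... is an assumption based on the parameter docs linked above.

```python
def bagging_is_active(subsample_frequency, subsample_fraction):
    """Return True when the two settings together enable bagging.

    Hypothetical helper: a frequency of 0 disables bagging, and a
    fraction of 1.0 means every row is used, so nothing is subsampled.
    """
    if subsample_frequency < 0:
        raise ValueError("subsample_frequency must be >= 0")
    if not 0.0 < subsample_fraction <= 1.0:
        raise ValueError("subsample_fraction must be in (0, 1]")
    return subsample_frequency > 0 and subsample_fraction < 1.0


def resampling_iterations(subsample_frequency, total_iterations):
    """List the 0-based iterations at which a new subsample would be drawn.

    Assumption: a fresh subsample is drawn every `subsample_frequency`
    iterations, starting at iteration 0.
    """
    if subsample_frequency <= 0:
        return []
    return [i for i in range(total_iterations) if i % subsample_frequency == 0]
```

Under these assumptions, setting only the frequency (and leaving the fraction at 1.0) leaves bagging off, which is the point the doc comment is making.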
/// </summary>
/// <param name="catalog">The <see cref="RegressionCatalog"/>.</param>
/// <param name="options">Advanced options to the algorithm.</param>
/// <param name="options">Trainer options.</param>
Trainer
either estimator or algorithm. I think the other catalogs use "Algorithm advanced options" #WontFix
I've been changing them to "Trainer Options" based on Rogan's comment on AveragedPerceptron. I don't want to change everything back. "Trainer" sounds good enough, unless there's a strong objection.
@@ -11360,7 +11360,7 @@
{
"Name": "CustomGains",
"Type": "String",
"Desc": "Comma seperated list of gains associated to each relevance label.",
"Desc": "Comma separated list of gains associated to each relevance label.",
separated
Nice #Resolved
src/Microsoft.ML.LightGBM/doc.xml
Outdated
-->
<member name="LightGBM_remarks">
<remarks>
LightGBM is an open source implementation of boosted trees. For implementation details, please see
boosted trees
gradient boosted decision trees. (or Gradient Boosting Machine for GBM?) #Resolved
changed to "Gradient Boosting Decision Tree" based on Tom's paper title.
/// <remarks>
/// LightGBM is an external library that's integrated with ML.NET. For detailed information about the parameters
/// please see https://github.com/Microsoft/LightGBM/blob/master/docs/Parameters.rst.
/// </remarks>
Is there version info anywhere that we could put here, so that it's clear which version we have and it gets updated automatically when we update LightGBM? #Pending
I don't think we have an auto-update mechanism for external repo versions in the docs. Right now it's at 2.2.3, which is the latest. If the LightGBM version is included in the text, I can't think of a way to force-update the docs here.
It's also easy to look at the LightGBM NuGet version that comes with the ML.NET NuGet packages. I think that's the most reliable way to find the version.
src/Microsoft.ML.LightGBM/doc.xml
Outdated
<member name="LightGBM_remarks">
<remarks>
LightGBM is an open source implementation of boosted trees. For implementation details, please see
<a href='https://lightgbm.readthedocs.io/en/latest/index.html'>LightGBM's official documentation</a>.
documentation
Also link to the paper with Guolin and Tom. #Resolved
[Argument(ArgumentType.AtMostOnce, HelpText = "Rounds of early stopping, 0 will disable it.",
ShortName = "es")]
public int EarlyStoppingRound = 0;

[Argument(ArgumentType.AtMostOnce, HelpText = "Comma seperated list of gains associated to each relevance label.", ShortName = "gains")]
/// <summary>
/// Comma separated list of gains associated with each relevance label. Used only by <see cref="LightGbmRankingTrainer"/>.
a sep
hyphen #Resolved
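As an aside on what a "list of gains associated with each relevance label" means in practice: the sketch below parses such a string and uses it in a standard DCG computation, where the label at each ranked position is mapped to its gain and discounted by position. It is a minimal Python illustration under stated assumptions, not the ML.NET or LightGBM implementation; the function names and the example gains string are hypothetical.

```python
import math


def parse_gains(custom_gains):
    """Parse a comma-separated gains string into a list indexed by relevance label."""
    return [float(token) for token in custom_gains.split(",") if token.strip()]


def dcg(labels_in_ranked_order, gains):
    """Discounted cumulative gain for labels listed in ranked order.

    Each integer label is looked up in `gains`, then discounted by
    log2(position + 2), with positions starting at 0.
    """
    return sum(
        gains[label] / math.log2(position + 2)
        for position, label in enumerate(labels_in_ranked_order)
    )


# Hypothetical gains: label 0 -> 0, label 1 -> 1, label 2 -> 3, label 3 -> 7.
example_gains = parse_gains("0,1,3,7")
example_score = dcg([3, 0, 1], example_gains)
```

With the hypothetical string above, a document labeled 3 contributes far more to the score than one labeled 1, which is why the option matters only for the ranking trainer.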
[Argument(ArgumentType.AtMostOnce, HelpText = "L2 Regularization for categorical split.")]
[TlcModule.Range(Min = 0.0)]
[TlcModule.SweepableDiscreteParam("CatL2", new object[] { 0.1, 0.5, 1, 5, 10 })]
public double L2CategoricalRegularization = 10;

/// <summary>
/// The random seed for LightGBM to use.
The random seed for LightGBM to use.
Please specify what happens if the seed is not set by the user. #Resolved
@@ -14,20 +14,20 @@ namespace Microsoft.ML
public static class LightGbmExtensions
{
/// <summary>
/// Predict a target using a decision tree regression model trained with the <see cref="LightGbmRegressorTrainer"/>.
/// Predict a target using a boosted decision tree regression model trained with the <see cref="LightGbmRegressorTrainer"/>.
a boosted decision tree re
"gradient boosting decision tree" to be consistent with other docs. #Resolved
@@ -46,14 +46,14 @@ public static class LightGbmExtensions
}

/// <summary>
/// Predict a target using a decision tree regression model trained with the <see cref="LightGbmRegressorTrainer"/>.
/// Predict a target using a boosted decision tree regression model trained with the <see cref="LightGbmRegressorTrainer"/> and advanced options.
oosted decision tree regression
Ditto #Resolved
🕐
🍇 📚 🎰
(closest emojis to GBM I could find)
Codecov Report
@@ Coverage Diff @@
## master #2886 +/- ##
=========================================
Coverage ? 71.81%
=========================================
Files ? 812
Lines ? 142644
Branches ? 16090
=========================================
Hits ? 102433
Misses ? 35827
Partials ? 4384
[Argument(ArgumentType.AtMostOnce,
HelpText = "Minimum sum of instance weight(hessian) needed in a child. If the tree partition step results in a leaf " +
"node with the sum of instance weight less than min_child_weight, then the building process will give up further partitioning. In linear regression mode, " +
"this simply corresponds to minimum number of instances needed to be in each node. The larger, the more conservative the algorithm will be.")]
[TlcModule.Range(Min = 0.0)]
public double MinimumChildWeight = 0.1;

/// <summary>
/// The frequency of performing subsamplig (bagging).
subsamplig
subsampling.
#Resolved
Overall LGTM. I expect the descriptions/comments for each item were taken from the LightGBM documentation, so I am not commenting on those. Let me know if you need that.
LightGBM API, trainers, boosters, and options documentation. Part of #2522.
Sources of documentation are: