Skip to content

Cleaned LightGBM documentation #2886

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 11, 2019
Merged

Conversation

shmoradims
Copy link

@shmoradims shmoradims commented Mar 7, 2019

LightGBM API, trainers, boosters, and options documentation. Part of #2522.
Source of documentation are:

/// Gradient boosting decision tree.
/// </summary>
/// <remarks>
/// For details, please see <a href="https://en.wikipedia.org/wiki/Gradient_boosting#Gradient_tree_boosting">here</a>.
Copy link
Member

@sfilipi sfilipi Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace with "Gradient_tree_boosting" #Resolved

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


In reply to: 263686400 [](ancestors = 263686400)

/// </summary>
/// <value>
/// 0 means disable bagging. N means perform bagging at every N iterations.
/// To enable bagging, <see cref="SubsampleFraction"/> should also be set to a value less than 1.0.
Copy link
Member

@sfilipi sfilipi Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should also be set to a value less than 1.0 [](start = 71, length = 43)

Just curious, where did this information come from? #Resolved

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sfilipi
sfilipi previously approved these changes Mar 8, 2019
Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

/// </summary>
/// <param name="catalog">The <see cref="RegressionCatalog"/>.</param>
/// <param name="options">Advanced options to the algorithm.</param>
/// <param name="options">Trainer options.</param>
Copy link
Member

@sfilipi sfilipi Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trainer [](start = 34, length = 7)

either estimator or algorithm. I think the other catalogs use "Algorithm advanced options" #WontFix

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been changing them to "Trainer Options" based on Rogan's comment on AveragedPercentron. I don't want to change back everything. Trainer sounds good enough, unless there's a strong objection.


In reply to: 263689283 [](ancestors = 263689283)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is preferred to "advanced".


In reply to: 263905459 [](ancestors = 263905459,263689283)

@sfilipi sfilipi dismissed their stale review March 8, 2019 08:04

revoking review

@@ -11360,7 +11360,7 @@
{
"Name": "CustomGains",
"Type": "String",
"Desc": "Comma seperated list of gains associated to each relevance label.",
"Desc": "Comma separated list of gains associated to each relevance label.",
Copy link
Contributor

@rogancarr rogancarr Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separated [](start = 25, length = 9)

Nice #Resolved

-->
<member name="LightGBM_remarks">
<remarks>
LightGBM is an open source implementation of boosted trees. For implementation details, please see
Copy link
Contributor

@rogancarr rogancarr Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

boosted trees [](start = 53, length = 13)

gradient boosted decision trees. (or Gradient Boosting Machine for GBM?) #Resolved

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to "Gradient Boosting Decision Tree" based on Tom's paper title.


In reply to: 263944693 [](ancestors = 263944693)

/// <remarks>
/// LightGBM is an external library that's integrated with ML.NET. For detailed information about the parameters
/// please see https://github.com/Microsoft/LightGBM/blob/master/docs/Parameters.rst.
/// </remarks>
Copy link
Contributor

@rogancarr rogancarr Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a version info anywhere that we could put here so that it's clear what version we have and it's auto-updated when we update LightGBM? #Pending

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have an auto-update mechanism for external repo version in docs. Right now it's at 2.2.3 which is the latest. I can't think of a way to force-update the docs here, if LightGBM version is included in text.

It's also easy to look at the LightGBM nuget version that comes with ML.NET nugets. That's the most reliable way to find the version I think.


In reply to: 263945106 [](ancestors = 263945106)

<member name="LightGBM_remarks">
<remarks>
LightGBM is an open source implementation of boosted trees. For implementation details, please see
<a href='https://lightgbm.readthedocs.io/en/latest/index.html'>LightGBM's official documentation</a>.
Copy link
Contributor

@rogancarr rogancarr Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

documentation [](start = 91, length = 13)

Also link to the paper with Guolin and Tom. #Resolved

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point.


In reply to: 263945392 [](ancestors = 263945392)

[Argument(ArgumentType.AtMostOnce, HelpText = "Rounds of early stopping, 0 will disable it.",
ShortName = "es")]
public int EarlyStoppingRound = 0;

[Argument(ArgumentType.AtMostOnce, HelpText = "Comma seperated list of gains associated to each relevance label.", ShortName = "gains")]
/// <summary>
/// Comma separated list of gains associated with each relevance label. Used only by <see cref="LightGbmRankingTrainer"/>.
Copy link
Contributor

@rogancarr rogancarr Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a sep [](start = 16, length = 5)

hyphen #Resolved

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


In reply to: 263946259 [](ancestors = 263946259)

[Argument(ArgumentType.AtMostOnce, HelpText = "L2 Regularization for categorical split.")]
[TlcModule.Range(Min = 0.0)]
[TlcModule.SweepableDiscreteParam("CatL2", new object[] { 0.1, 0.5, 1, 5, 10 })]
public double L2CategoricalRegularization = 10;

/// <summary>
/// The random seed for LightGBM to use.
Copy link
Contributor

@rogancarr rogancarr Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The random seed for LightGBM to use. [](start = 12, length = 36)

Please specify what happens if the seed is not set by the user. #Resolved

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


In reply to: 263946446 [](ancestors = 263946446)

@@ -14,20 +14,20 @@ namespace Microsoft.ML
public static class LightGbmExtensions
{
/// <summary>
/// Predict a target using a decision tree regression model trained with the <see cref="LightGbmRegressorTrainer"/>.
/// Predict a target using a boosted decision tree regression model trained with the <see cref="LightGbmRegressorTrainer"/>.
Copy link
Contributor

@rogancarr rogancarr Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a boosted decision tree re [](start = 34, length = 27)

"gradient boosting decision tree" to be consistent with other docs. #Resolved

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


In reply to: 263946628 [](ancestors = 263946628)

@@ -46,14 +46,14 @@ public static class LightGbmExtensions
}

/// <summary>
/// Predict a target using a decision tree regression model trained with the <see cref="LightGbmRegressorTrainer"/>.
/// Predict a target using a boosted decision tree regression model trained with the <see cref="LightGbmRegressorTrainer"/> and advanced options.
Copy link
Contributor

@rogancarr rogancarr Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oosted decision tree regression [](start = 38, length = 31)

Ditto #Resolved

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


In reply to: 263946785 [](ancestors = 263946785)

Copy link
Contributor

@rogancarr rogancarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

@rogancarr rogancarr dismissed their stale review March 8, 2019 22:09

revoking review

Copy link
Contributor

@rogancarr rogancarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍇 📚 🎰

(closest emojis to GBM i could find)

@codecov
Copy link

codecov bot commented Mar 8, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@f5b9e43). Click here to learn what that means.
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master    #2886   +/-   ##
=========================================
  Coverage          ?   71.81%           
=========================================
  Files             ?      812           
  Lines             ?   142644           
  Branches          ?    16090           
=========================================
  Hits              ?   102433           
  Misses            ?    35827           
  Partials          ?     4384
Flag Coverage Δ
#Debug 71.81% <ø> (?)
#production 67.95% <ø> (?)
#test 86.24% <ø> (?)
Impacted Files Coverage Δ
src/Microsoft.ML.LightGBM/LightGbmBinaryTrainer.cs 83.52% <ø> (ø)
...Microsoft.ML.LightGBM/LightGbmMulticlassTrainer.cs 87.94% <ø> (ø)
...Microsoft.ML.LightGBM/LightGbmRegressionTrainer.cs 74.35% <ø> (ø)
src/Microsoft.ML.LightGBM/LightGbmArguments.cs 89.63% <ø> (ø)
src/Microsoft.ML.LightGBM/LightGbmCatalog.cs 100% <ø> (ø)
...rc/Microsoft.ML.LightGBM/LightGbmRankingTrainer.cs 73.46% <ø> (ø)

[Argument(ArgumentType.AtMostOnce,
HelpText = "Minimum sum of instance weight(hessian) needed in a child. If the tree partition step results in a leaf " +
"node with the sum of instance weight less than min_child_weight, then the building process will give up further partitioning. In linear regression mode, " +
"this simply corresponds to minimum number of instances needed to be in each node. The larger, the more conservative the algorithm will be.")]
[TlcModule.Range(Min = 0.0)]
public double MinimumChildWeight = 0.1;

/// <summary>
/// The frequency of performing subsamplig (bagging).
Copy link
Contributor

@zeahmed zeahmed Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

subsamplig [](start = 48, length = 10)

subsampling.
#Resolved

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch.


In reply to: 263963937 [](ancestors = 263963937)

Copy link
Contributor

@zeahmed zeahmed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. I expect the description/comments for each item was taken from LightGBM documentation. So, I am not putting comments around that. Let me know if you need that.

@shmoradims shmoradims merged commit 97c067c into dotnet:master Mar 11, 2019
@shmoradims shmoradims deleted the lightgbm_docs2 branch March 11, 2019 20:51
@shauheen shauheen added this to the 0319 milestone Mar 12, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants