
Polish early stop rules in fast tree #2851


Merged
merged 5 commits into from
Mar 6, 2019


@wschin (Member) commented Mar 5, 2019

Fixes #2520. The pattern implemented in this PR is:

        [BestFriend]
        [Argument(ArgumentType.Multiple, HelpText = "Early stopping rule. (Validation set (/valid) is required.)", ShortName = "esr", NullName = "<Disable>")]
        [TGUI(Label = "Early Stopping Rule", Description = "Early stopping rule. (Validation set (/valid) is required.)")]
        internal IEarlyStoppingCriterionFactory EarlyStoppingRuleFactory;

        /// <summary>
        /// The underlying state of <see cref="EarlyStoppingRuleFactory"/> and <see cref="EarlyStoppingRule"/>.
        /// </summary>
        private EarlyStoppingRuleBase _earlyStoppingRuleBase;

        /// <summary>
        /// Early stopping rule used to terminate training process once meeting a specified criterion. Possible choices are
        /// <see cref="EarlyStoppingRuleBase"/>'s implementations such as <see cref="TolerantEarlyStoppingRule"/> and <see cref="GeneralityLossRule"/>.
        /// </summary>
        public EarlyStoppingRuleBase EarlyStoppingRule
        {
            get { return _earlyStoppingRuleBase;  }
            set
            {
                _earlyStoppingRuleBase = value;
                EarlyStoppingRuleFactory = _earlyStoppingRuleBase.BuildFactory();
            }
        }

As you can see, EarlyStoppingRuleFactory (used by the old infrastructure) stays internal and is exposed to users only indirectly, through the new public EarlyStoppingRule property.

@wschin wschin added the API Issues pertaining the friendly API label Mar 5, 2019
@wschin wschin self-assigned this Mar 5, 2019
@Ivanidzo4ka (Contributor) commented

@zeahmed @ganik You are both working on the same problem, in the loss function and the text featurizer. It would be nice if we stayed in sync on the solution.

@wschin wschin force-pushed the polish-early-stop-fast-tree branch from f2da1e7 to 5a6c9c6 Compare March 5, 2019 18:54
@@ -6622,7 +6622,7 @@
"Default": "GradientDescent"
},
{
- "Name": "EarlyStoppingRule",
+ "Name": "EarlyStoppingRuleFactory",
"Type": {
"Kind": "Component",
"ComponentKind": "EarlyStoppingCriterion"

@ganik (Member) commented Mar 5, 2019

"ComponentKind": "EarlyStoppingCriterion"

The manifest is broken. There is a reference here to ComponentKind "EarlyStoppingCriterion", but the components for this component kind have been removed from the manifest. This would cause the entry-point compiler to fail to find matching components when parsing the entry-point graph. #Resolved

Member (Author) replied:

No (at least at iteration 3), those definitions of EarlyStoppingCriterion are still there. Please try searching for "EarlyStoppingCriterion".



@@ -482,9 +482,12 @@ public enum OptimizationAlgorithmType { GradientDescent, AcceleratedGradientDesc
/// <summary>
/// Early stopping rule. (Validation set (/valid) is required).
/// </summary>
[BestFriend]
[Argument(ArgumentType.Multiple, HelpText = "Early stopping rule. (Validation set (/valid) is required.)", ShortName = "esr", NullName = "<Disable>")]

@ganik (Member) commented Mar 5, 2019

Add the attribute Name = "EarlyStoppingRule". This would keep core_manifest.json the same and avoid big changes in NimbusML. #Resolved

Member (Author) replied:

Fixed, but I'd expect that tons of breaking changes to the entry points have already happened elsewhere.



codecov bot commented Mar 5, 2019

Codecov Report

Merging #2851 into master will increase coverage by <.01%.
The diff coverage is 59.82%.

@@            Coverage Diff             @@
##           master    #2851      +/-   ##
==========================================
+ Coverage    71.7%    71.7%   +<.01%     
==========================================
  Files         810      810              
  Lines      142473   142542      +69     
  Branches    16111    16121      +10     
==========================================
+ Hits       102159   102210      +51     
- Misses      35889    35900      +11     
- Partials     4425     4432       +7

Flag         Coverage Δ
#Debug       71.7% <59.82%> (ø) ⬆️
#production  67.93% <57.27%> (ø) ⬆️
#test        85.91% <100%> (ø) ⬆️

Impacted Files                                         Coverage Δ
....Core.Tests/UnitTests/TestEarlyStoppingCriteria.cs  100% <100%> (ø) ⬆️
test/Microsoft.ML.Functional.Tests/Validation.cs       100% <100%> (ø) ⬆️
src/Microsoft.ML.FastTree/FastTree.cs                  80.75% <100%> (ø) ⬆️
src/Microsoft.ML.FastTree/FastTreeRanking.cs           48.19% <20%> (-0.24%) ⬇️
src/Microsoft.ML.FastTree/FastTreeTweedie.cs           56.29% <20%> (-0.6%) ⬇️
...soft.ML.FastTree/Training/EarlyStoppingCriteria.cs  71.73% <56.62%> (-11.47%) ⬇️
src/Microsoft.ML.FastTree/FastTreeRegression.cs        54.5% <80%> (+0.56%) ⬆️
src/Microsoft.ML.FastTree/FastTreeArguments.cs         85.38% <80%> (-0.22%) ⬇️
src/Microsoft.ML.FastTree/BoostingFastTree.cs          75.24% <83.33%> (+0.5%) ⬆️
...StandardLearners/Standard/LinearModelParameters.cs  60.63% <0%> (-0.27%) ⬇️
... and 2 more

@wschin wschin requested a review from rogancarr March 5, 2019 21:07
/// <summary>
/// Create <see cref="IEarlyStoppingCriterionFactory"/> for supporting legacy infra built upon <see cref="IComponentFactory"/>.
/// </summary>
internal abstract IEarlyStoppingCriterionFactory BuildFactory();

@ganik (Member) commented Mar 6, 2019

internal abstract IEarlyStoppingCriterionFactory BuildFactory();

How would third-party devs create their own EarlyStoppingRuleBase implementations to supply to the FastTree algorithms? This internal abstract method will prevent that. #Resolved

Contributor replied:

Component authorship is not a goal for v1, even at the level of people being able to create their own ITransformer implementations. This is fine. In fact, I would go even further by giving this thing a private protected constructor.
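
For readers unfamiliar with the two access modifiers discussed here, the following is a minimal sketch of how they close a public base class to outside implementers. It is not the code from this PR: `IRuleFactory`, `StoppingRuleBase`, and `ToleranceRule` are hypothetical stand-ins, and only the method name `BuildFactory` is borrowed from the diff above. It assumes C# 7.2 or later, where `private protected` is available.

```csharp
// Hypothetical stand-ins for illustration; only the name BuildFactory comes from this PR.
internal interface IRuleFactory
{
    double Threshold { get; }
}

public abstract class StoppingRuleBase
{
    // private protected (C# 7.2+): callable only from derived types in this assembly.
    private protected StoppingRuleBase() { }

    // internal abstract: code outside this assembly cannot override this member,
    // so it cannot produce a concrete subclass.
    internal abstract IRuleFactory BuildFactory();
}

public sealed class ToleranceRule : StoppingRuleBase
{
    public double Threshold { get; }

    public ToleranceRule(double threshold)
    {
        Threshold = threshold;
    }

    // The rule hands its parameters to the factory used by the internal infrastructure.
    internal override IRuleFactory BuildFactory() => new Factory(Threshold);

    private sealed class Factory : IRuleFactory
    {
        public Factory(double threshold) { Threshold = threshold; }
        public double Threshold { get; }
    }
}
```

Under these assumptions, callers in other assemblies can construct and pass around `ToleranceRule`, but they cannot derive their own type from `StoppingRuleBase`.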



Member (Author) replied:

Sounds good, Tom. I also think that creating a proper stopping rule is a challenge for many C# developers.



Member replied:

In that case, if you have only a restricted set of EarlyStoppingRule implementations, why don't you go with an enum? I don't understand writing such complicated code where an enum would suffice.



Member (Author) replied:

I explained in the original issue (#2520) why this can't be an enum. A stopping rule can itself contain state, parameters, and so on; it's more than an enum.
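
To make the point about state concrete, here is a minimal sketch of a stopping rule that carries both a parameter and mutable state, which a plain enum value cannot hold. This is not the ML.NET implementation: `PatienceRule`, `ShouldStop`, and the "higher score is better" convention are assumptions made only for illustration.

```csharp
using System;

// Hypothetical illustration only; not the EarlyStoppingRuleBase API from this PR.
public sealed class PatienceRule
{
    // Parameter: how many consecutive non-improving checks to tolerate.
    private readonly int _patience;

    // State: best validation score seen so far and checks since it was seen.
    private double _bestScore = double.NegativeInfinity;
    private int _checksSinceBest;

    public PatienceRule(int patience)
    {
        if (patience < 1)
            throw new ArgumentOutOfRangeException(nameof(patience));
        _patience = patience;
    }

    // Returns true when training should stop. Higher scores are assumed to be better.
    public bool ShouldStop(double validationScore)
    {
        if (validationScore > _bestScore)
        {
            _bestScore = validationScore;
            _checksSinceBest = 0;
            return false;
        }

        _checksSinceBest++;
        return _checksSinceBest >= _patience;
    }
}
```

Two instances created with different `patience` values behave differently, and each one remembers its own history across calls; an enum member could only name the kind of rule, not carry this data.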



Member replied:

I would like to discuss this further. You are exposing EarlyStoppingRule but don't allow devs to use it. If you decoupled the factory from the class/interface, you would get a smaller code change and would allow third-party devs to create their own stopping rules, the same as I did for LossFunction.



Member (Author) replied:

Just a note from our internal discussion: the public surface area should be as small as possible while all vital functions are still maintained. Thus, we don't allow implementations of this base class.



Member replied:

I am OK with this as long as this pattern is an exception and doesn't propagate to things like LossFunction.



Member (Author) replied:

This may not affect LossFunction because it's an interface. If LossFunction were a class, the same pattern might apply. As Tom and Ivan have mentioned elsewhere, the public surface area should be as small as possible. On the other hand, another pattern used here, a public property that hides a field, may affect LossFunction.



@wschin wschin force-pushed the polish-early-stop-fast-tree branch from 7470bb7 to 5b071bf Compare March 6, 2019 17:47
@TomFinley (Contributor) left a comment

Thank you @wschin !

@Ivanidzo4ka (Contributor) left a comment

:shipit:

@wschin wschin merged commit 887aad2 into dotnet:master Mar 6, 2019
@wschin wschin deleted the polish-early-stop-fast-tree branch March 6, 2019 22:24
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022