Add cancellation signal checkpoints in FastTree. #3028

Merged: 5 commits, Mar 22, 2019

Conversation

codemzs
Member

@codemzs codemzs commented Mar 20, 2019

fixes #3027

Please read the issue before reviewing this PR.

@codecov

codecov bot commented Mar 20, 2019

Codecov Report

Merging #3028 into master will increase coverage by 0.07%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3028      +/-   ##
==========================================
+ Coverage   72.41%   72.49%   +0.07%     
==========================================
  Files         803      804       +1     
  Lines      143851   144086     +235     
  Branches    16173    16179       +6     
==========================================
+ Hits       104171   104452     +281     
+ Misses      35258    35221      -37     
+ Partials     4422     4413       -9

Flag           Coverage Δ
#Debug         72.49% <100%> (+0.07%) ⬆️
#production    68.13% <100%> (+0.04%) ⬆️
#test          88.69% <ø> (+0.08%) ⬆️

Impacted Files                                            Coverage Δ
.../TreeLearners/FastForestLeastSquaresTreeLearner.cs     88.57% <100%> (ø) ⬆️
src/Microsoft.ML.FastTree/RandomForest.cs                 100% <100%> (ø) ⬆️
src/Microsoft.ML.FastTree/BoostingFastTree.cs             75.49% <100%> (+0.24%) ⬆️
.../TreeLearners/LeastSquaresRegressionTreeLearner.cs     57.59% <100%> (+0.24%) ⬆️
src/Microsoft.ML.FastTree/FastTree.cs                     80.67% <100%> (+0.01%) ⬆️
src/Microsoft.ML.Maml/MAML.cs                             24.75% <0%> (-1.46%) ⬇️
...soft.ML.Data/DataView/DataViewConstructionUtils.cs     85.27% <0%> (-0.9%) ⬇️
.../Standard/LogisticRegression/LbfgsPredictorBase.cs     71.26% <0%> (ø) ⬆️
src/Microsoft.ML.FastTree/TreeTrainersCatalog.cs          94.18% <0%> (ø) ⬆️
...soft.ML.Data/DataLoadSave/DataOperationsCatalog.cs     73.23% <0%> (ø) ⬆️
... and 12 more

@codemzs codemzs requested a review from wschin March 20, 2019 17:36
@codemzs codemzs requested review from sfilipi and shauheen March 21, 2019 21:44
@TomFinley
Contributor

TomFinley commented Mar 22, 2019

Hi @codemzs, this looks pretty good, I think. Checking for cancellation during the histogram calculation is good, but as far as I can see this change does not cover dataset preparation at all, and that step can take a significant amount of time. Your analysis from the issue might not have captured it, but on many datasets it does take a while, and because it happens at the start of the operation, it is one of the most likely places where a cancellation might actually occur. What do you think?

@codemzs
Member Author

codemzs commented Mar 22, 2019

@TomFinley You are right. My analysis was done without disk transpose, so I missed the transpose function. I have now added checkpoints there as well.
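
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of cooperative cancellation checkpoints. It is written in Python for brevity and is not the ML.NET code from this PR. The names `OperationCanceled`, `checkpoint`, `prepare_dataset`, `build_histograms`, and `train` are hypothetical, and the "histogram" is just a count of pre-binned feature values. The point is only that both the preparation (transpose) phase and the histogram phase poll the same signal.

```python
import threading


class OperationCanceled(Exception):
    """Raised when a checkpoint observes that cancellation was requested."""


def checkpoint(cancel: threading.Event) -> None:
    # Hypothetical helper: a cheap check placed where stopping is safe.
    if cancel.is_set():
        raise OperationCanceled("training was canceled")


def prepare_dataset(rows, cancel):
    """Transpose row-major data to column-major, checking once per column."""
    num_cols = len(rows[0]) if rows else 0
    columns = []
    for j in range(num_cols):
        # Preparation runs first, so a cancellation request often lands here.
        checkpoint(cancel)
        columns.append([row[j] for row in rows])
    return columns


def build_histograms(columns, num_bins, cancel):
    """Count how many values fall in each bin, checking once per feature."""
    histograms = []
    for column in columns:
        checkpoint(cancel)
        counts = [0] * num_bins
        for value in column:
            counts[value] += 1
        histograms.append(counts)
    return histograms


def train(rows, num_bins, num_iterations, cancel):
    """Toy training loop: prepare once, then rebuild histograms each iteration."""
    columns = prepare_dataset(rows, cancel)
    histograms = None
    for _ in range(num_iterations):
        checkpoint(cancel)
        histograms = build_histograms(columns, num_bins, cancel)
    return histograms
```

A caller would share one `threading.Event` with the training thread and call `cancel.set()` from elsewhere. The next checkpoint reached then raises `OperationCanceled`, whichever phase is running at that moment.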

Contributor

@TomFinley TomFinley left a comment

Very good, thank you @codemzs !!

Member

@sfilipi sfilipi left a comment

:shipit:

@codemzs codemzs merged commit 1f6b3be into dotnet:master Mar 22, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022

Successfully merging this pull request may close these issues.

Cancellation checkpoints in FastTree
3 participants