-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Binary FastTree/Forest samples using T4 templates. #3035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
yield return new DataPoint | ||
{ | ||
Label = label, | ||
// Create random features that are correlated with label. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add Note that data points with Label=false tend to have larger feature values.
#Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This example requires installation of additional NuGet package | ||
// <a href="https://www.nuget.org/packages/Microsoft.ML.FastTree/">Microsoft.ML.FastTree</a>. | ||
public static void Example() | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to execute those Example()
s in a test? Just want to make sure they are always executable. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var mlContext = new MLContext(seed: 0); | ||
|
||
// Create a list of training examples. | ||
var examples = GenerateRandomDataPoints(1000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
examples [](start = 16, length = 8)
dataPoints? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
|
||
// Class used to capture predictions. | ||
private class Prediction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prediction [](start = 22, length = 10)
Not sure if we want to produce probability
for calibrated models or socre
for non-calibrated models. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had probabilities originally. But was surprised that FastForest wasn't calibrated and FastTree was. So for the sake of sharing the same T4 template and not having too much conditional formatting I'm using PredictedLabel which all binary trainers are guaranteed to produce. Scores are a bit confusing because it's not clear what -50 or +40.1 mean without calibration.
In reply to: 267449313 [](ancestors = 267449313)
public static class FastTreeWithOptions | ||
{ | ||
// This example requires installation of additional NuGet package | ||
// <a href="https://www.nuget.org/packages/Microsoft.ML.FastTree/">Microsoft.ML.FastTree</a>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very useful comment! #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! My only major concern is that we might need tests to make sure those samples can always be executed correctly in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks great now. I think .ttinclude can be made more generic as we create more samples. Thanks for doing it @shmoradims.
Codecov Report
@@ Coverage Diff @@
## master #3035 +/- ##
==========================================
+ Coverage 72.41% 72.41% +<.01%
==========================================
Files 803 803
Lines 143851 143851
Branches 16173 16173
==========================================
+ Hits 104171 104172 +1
+ Misses 35258 35257 -1
Partials 4422 4422
|
Related to #2522. The *.cs files are auto-generated. Please review the .tt and .ttinclude files.