Checking in the samples generated during bug bash for MissingNa, Repl… #2960
var samples = new List<DataPoint>()
{
    new DataPoint(){ Label = 3, Features = new float[3] {1, 1, 0} },
If the Label column is not used in the following transform, we can remove it completely.
You have an in-memory class definition, in-memory data set creation, and in-memory prediction. What else can I ask for?
// 'true' where the value in the input column is NaN. This value can be used
// to replace missing values with other values.

IEstimator<ITransformer> pipeline = mlContext.Transforms.IndicateMissingValues("MissingIndicator", "Features");
IEstimat [](start = 12, length = 8)
Blank line above #Resolved
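For readers following the thread, here is a minimal sketch of the behaviour the comment in the sample describes: an output flag that is 'true' wherever the input value is NaN. It uses plain LINQ rather than ML.NET, so `MissingIndicatorSketch` and `IndicateMissing` are hypothetical names for illustration only and are not part of the library or of this PR.

```csharp
using System;
using System.Linq;

public static class MissingIndicatorSketch
{
    // Hypothetical helper (not an ML.NET API): returns a flag per slot,
    // true where the corresponding input value is NaN.
    public static bool[] IndicateMissing(float[] features) =>
        features.Select(value => float.IsNaN(value)).ToArray();
}
```

With the illustrative input `{ 1f, float.NaN, 0f }`, this yields `{ false, true, false }`.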

// a small printing utility
Func<object[], string> vectorPrinter = (object[] vector) =>
{
Break out of main code path and into a helper. #Pending
I feel like the main logic is above. Breaking it out would only change which one users see first: the definition of the printing helper or the printing itself.
In reply to: 265789218 [](ancestors = 265789218)
// And finally, we can write out the rows of the dataset, looking at the columns of interest.
foreach (var row in rowEnumerable)
{
    Console.WriteLine($"Label: {row.Label} Features: {vectorPrinter(row.Features.Cast<object>().ToArray())} MissingIndicator: {vectorPrinter(row.MissingIndicator.Cast<object>().ToArray())}");
.Cast().ToArray() [](start = 92, length = 25)
This is a bit confusing for a sample, IMHO. Maybe better to just have two helper functions? #Pending
It feels self-explanatory, since it casts and then calls ToArray. Adding yet another helper that does the same thing might make the sample look less professional.
In reply to: 265789625 [](ancestors = 265789625)
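One possible middle ground between the two positions, sketched under assumptions: a single generic helper would accept both the `float` and `bool` vectors directly, so the call sites would not need `.Cast<object>().ToArray()`. The body of `vectorPrinter` is not shown in this thread, so the comma-separated output format below is an assumption, and `VectorFormatting` and `FormatVector` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class VectorFormatting
{
    // Hypothetical helper: joins the elements of any vector into one string.
    // The ", " separator is an assumption; the sample's actual format is not shown.
    public static string FormatVector<T>(IEnumerable<T> vector) =>
        string.Join(", ", vector);
}
```

A call site could then read `VectorFormatting.FormatVector(row.Features)` and `VectorFormatting.FormatVector(row.MissingIndicator)`, at the cost of moving the formatting logic out of the main code path.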
{
    // Create a new ML context, for ML.NET operations. It can be used for exception tracking and logging,
    // as well as the source of randomness.
    var ml = new MLContext();
ml [](start = 16, length = 2)
mlContext #Resolved
Just some nits!
Codecov Report
@@ Coverage Diff @@
## master #2960 +/- ##
==========================================
+ Coverage 72.29% 72.29% +<.01%
==========================================
Files 796 796
Lines 142349 142349
Branches 16051 16051
==========================================
+ Hits 102905 102908 +3
+ Misses 35063 35062 -1
+ Partials 4381 4379 -2
@@ -6,7 +6,7 @@ internal static class Program
 {
     static void Main(string[] args)
     {
-        CustomMapping.Example();
+        ReplaceMissingValues.Example();
ReplaceMissingValues [](start = 12, length = 20)
Please don't change this file. It creates unnecessary merge conflicts.
Towards #1209
Gathering the work of PR: #2814, #2779 and #2773