Samples for CustomMapping, IndicateMissingValues, ReplaceMissingValues #3216
// ReplaceMissingValues is used to create a column where missing values are replaced according to the ReplacementMode.
var defaultPipeline = mlContext.Transforms.ReplaceMissingValues(new[] {
    new InputOutputColumnPair("MissingReplaced1", "Features1"),
    new InputOutputColumnPair("MissingReplaced2", "Features2")
Optional: we don't have a multi-column sample that transforms in place. Can we transform the data in place and drop MissingReplaced1/MissingReplaced2?
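A minimal sketch of what such an in-place variant could look like, assuming that an InputOutputColumnPair built from a single name uses that name for both input and output. The DataPoint class, the sample values, and the InPlaceReplaceSketch wrapper are hypothetical and not taken from the PR.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical wrapper class for illustration; not part of the PR.
public static class InPlaceReplaceSketch
{
    // Hypothetical row type with two fixed-size vector columns.
    public class DataPoint
    {
        [VectorType(3)]
        public float[] Features1 { get; set; }

        [VectorType(2)]
        public float[] Features2 { get; set; }
    }

    public static List<DataPoint> Run()
    {
        var mlContext = new MLContext();
        var samples = new List<DataPoint>
        {
            new DataPoint { Features1 = new float[] { 1, 1, 0 }, Features2 = new float[] { 1, 1 } },
            new DataPoint { Features1 = new float[] { 0, float.NaN, 1 }, Features2 = new float[] { float.NaN, 1 } },
        };
        var data = mlContext.Data.LoadFromEnumerable(samples);

        // Naming only the output column makes input and output share a name,
        // so the replaced values shadow the original column.
        var pipeline = mlContext.Transforms.ReplaceMissingValues(new[]
        {
            new InputOutputColumnPair("Features1"),
            new InputOutputColumnPair("Features2")
        });

        var transformed = pipeline.Fit(data).Transform(data);

        // Read the rows back into the same type, since no new columns were added.
        return mlContext.Data
            .CreateEnumerable<DataPoint>(transformed, reuseRowObject: false)
            .ToList();
    }
}
```

Because the output reuses the input names, the same row type works for loading and for reading the result, which is the main convenience of the in-place form.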
Thank you for the review comments. I have updated the code accordingly.
<PropertyGroup>
  <TargetFramework>netcoreapp2.1</TargetFramework>
  <OutputType>Exe</OutputType>
  <AssemblyOriginatorKeyFile>$(ToolsDir)Test.snk</AssemblyOriginatorKeyFile>
By default the assemblies are signed using Open.snk. However, if we want to register an assembly in the component catalog using mlContext.ComponentCatalog.RegisterAssembly(), the assembly needs to pass the following condition:
machinelearning/src/Microsoft.ML.Core/ComponentModel/ComponentCatalog.cs
Lines 1025 to 1034 in b861b5d
private static bool CanContainExtensions(Assembly assembly)
{
    if (assembly.FullName.StartsWith("Microsoft.ML.", StringComparison.Ordinal)
        && HasMLNetPublicKey(assembly))
    {
        return false;
    }
    return true;
}
So it needs to have a different signature.
After discussing with Eric, it could be even simpler not to sign this assembly. We don't need strong naming for the Samples assembly, as it should not be referenced by external code anyway.
// Features: [-1, 2, -3] MissingReplaced: [-1, 2, -3]
// Features: [-1, NaN, -3] MissingReplaced: [-1, 0, -3]

// Mean ReplaceMode:
Isn't there one more mode?
There is also maximum and minimum, but I don't think it adds much to add them to this sample.
For the documentation, if we have more than one, we should have all of them — right now, it looks inconsistent or incomplete.
I would rather remove the mean replacement mode in this sample then.
We are not trying to exhaust all the possible settings in our samples. Would that be fine?
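For reference, the replacement modes mentioned in this thread can be compared on a single scalar column. This is a minimal sketch, assuming the modes are exposed as MissingValueReplacingEstimator.ReplacementMode values (DefaultValue, Mean, Minimum, Maximum). The DataPoint and Replaced classes, the sample values, and the ReplacementModeSketch wrapper are hypothetical and not taken from the PR.

```csharp
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Transforms;

// Hypothetical wrapper class for illustration; not part of the PR.
public static class ReplacementModeSketch
{
    // Hypothetical input row with one scalar column.
    public class DataPoint
    {
        public float Value { get; set; }
    }

    // Hypothetical output row: the original column plus the filled copy.
    public class Replaced
    {
        public float Value { get; set; }
        public float Filled { get; set; }
    }

    // Returns the value substituted for the single NaN row under the given mode.
    public static float FillWith(MissingValueReplacingEstimator.ReplacementMode mode)
    {
        var mlContext = new MLContext();
        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new DataPoint { Value = 2 },
            new DataPoint { Value = float.NaN },
            new DataPoint { Value = 6 },
        });

        // Write the result to a new column so the original stays visible.
        var pipeline = mlContext.Transforms.ReplaceMissingValues("Filled", "Value", mode);
        var transformed = pipeline.Fit(data).Transform(data);

        return mlContext.Data
            .CreateEnumerable<Replaced>(transformed, reuseRowObject: false)
            .ElementAt(1)
            .Filled;
    }
}
```

Calling FillWith once per mode shows how the substituted value depends on the statistics of the non-missing rows, which is the difference the reviewers are weighing against sample length.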
private class TransformedData : InputData
{
    public bool IsUnderThirty { get; set; }

Extra empty line.
// Expected output:
// Features1: [1, 1, 0] MissingIndicator1: [False, False, False] Features2: [1, 1] MissingIndicator2: [False, False]
// Features1: [0, NaN, 1] MissingIndicator1: [False, True, False] Features2: [NaN, 1] MissingIndicator2: [True, False]
// Features1: [-1, NaN, -3] MissingIndicator1: [False, True, False] Features2: [1, ∞] MissingIndicator2: [False, False]
Nit: the spacing for MissingIndicator doesn't align with the lines above.
Codecov Report
@@            Coverage Diff             @@
##           master    #3216      +/-   ##
==========================================
- Coverage   72.62%   72.62%   -0.01%
==========================================
  Files         807      807
  Lines      145080   145080
  Branches    16213    16213
==========================================
- Hits       105369   105365       -4
- Misses      35294    35298       +4
  Partials     4417     4417
Related to #1209
Fixes #3117
Made samples for the multi-column setting of:
ReplaceMissingValues
IndicateMissingValues
I also made a sample to save and load the CustomMapping estimator.
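A minimal sketch of the save-and-load flow for a custom mapping, assuming the mapping is exposed through a CustomMappingFactory subclass and that the loading context registers the containing assembly with ComponentCatalog.RegisterAssembly, as discussed in the review. The "IsUnderThirty" contract name, the Age property, the temp-file path, and the CustomMappingSaveLoadSketch wrapper are hypothetical and not taken from the PR.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Transforms;

// Hypothetical wrapper class for illustration; not part of the PR.
public static class CustomMappingSaveLoadSketch
{
    public class InputData
    {
        public float Age { get; set; }
    }

    public class CustomMappingOutput
    {
        public bool IsUnderThirty { get; set; }
    }

    public class TransformedData : InputData
    {
        public bool IsUnderThirty { get; set; }
    }

    // The attribute's contract name must match the one passed to CustomMapping,
    // so the loader can find this factory in the registered assembly.
    [CustomMappingFactoryAttribute("IsUnderThirty")]
    public class IsUnderThirtyCustomAction : CustomMappingFactory<InputData, CustomMappingOutput>
    {
        public static void CustomAction(InputData input, CustomMappingOutput output)
            => output.IsUnderThirty = input.Age < 30;

        public override Action<InputData, CustomMappingOutput> GetMapping()
            => CustomAction;
    }

    public static List<bool> Run()
    {
        var mlContext = new MLContext();
        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new InputData { Age = 26 },
            new InputData { Age = 35 },
        });

        var pipeline = mlContext.Transforms.CustomMapping(
            new IsUnderThirtyCustomAction().GetMapping(), contractName: "IsUnderThirty");
        var transformer = pipeline.Fit(data);

        // Hypothetical location for the saved model.
        var modelPath = Path.Combine(Path.GetTempPath(), "custom_mapping_sketch.zip");
        mlContext.Model.Save(transformer, data.Schema, modelPath);

        // A fresh context must know where the factory lives before loading.
        var newContext = new MLContext();
        newContext.ComponentCatalog.RegisterAssembly(typeof(IsUnderThirtyCustomAction).Assembly);
        var loaded = newContext.Model.Load(modelPath, out var inputSchema);

        var transformed = loaded.Transform(data);
        return newContext.Data
            .CreateEnumerable<TransformedData>(transformed, reuseRowObject: false)
            .Select(row => row.IsUnderThirty)
            .ToList();
    }
}
```

The registration step is where the signing discussion matters: per the CanContainExtensions check quoted in the review, an assembly whose name starts with "Microsoft.ML." and that carries the ML.NET public key is skipped, so the factory would not be found.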