More Normalizer Scrubbing #2888
```diff
@@ -55,7 +55,7 @@ public static void Example()
 //0.165 0.117 -0.547 0.014

 // A pipeline to project Features column into L-p normalized vector.
-var lpNormalizePipeline = ml.Transforms.LpNormalize(nameof(SamplesUtils.DatasetUtils.SampleVectorOfNumbersData.Features), normKind: Transforms.LpNormalizingEstimatorBase.NormFunction.L1);
+var lpNormalizePipeline = ml.Transforms.LpNormalize(nameof(SamplesUtils.DatasetUtils.SampleVectorOfNumbersData.Features), norm: Transforms.LpNormalizingEstimatorBase.NormFunction.L1);

 // The transformed (projected) data.
 transformedData = lpNormalizePipeline.Fit(trainData).Transform(trainData);
 // Getting the data of the newly created column, so we can preview it.
```

Review comment: We should be using long-form names here. #Resolved

Reply: Ok. In reply to: 264826528
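For context on what the sample above computes: with `NormFunction.L1`, each feature vector is rescaled so that the sum of the absolute values of its components equals 1. A minimal sketch of that math in plain Python (the function name here is hypothetical, not the ML.NET API):

```python
# Illustrative sketch of L1 normalization, the technique selected by
# NormFunction.L1 in the diff above. Not ML.NET code.

def l1_normalize(features):
    """Scale a vector so its L1 norm (sum of absolute values) is 1."""
    norm = sum(abs(x) for x in features)
    if norm == 0:
        return list(features)  # leave an all-zero vector unchanged
    return [x / norm for x in features]

row = [1.0, -2.0, 1.0]
print(l1_normalize(row))  # each component divided by |1| + |-2| + |1| = 4
# → [0.25, -0.5, 0.25]
```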
Review comment: In the field, we usually use the term `Standardize` to reflect this normalization technique. This is very "statistics-y", but it does seem to be standard. How would everyone feel about changing "MeanVariance" to "Standardize", or at least offering a "Standardize" alias? #Resolved
Reply: I don't think MVN is a bad name, because it's already an operator in neural network frameworks (e.g., ONNX, Caffe, CoreML).

Just for reference:

https://apple.github.io/coremltools/coremlspecification/sections/NeuralNetwork.html#meanvariancenormalizelayerparams

https://github.com/onnx/onnx/blob/master/docs/Operators.md#MeanVarianceNormalization

In reply to: 264826182
Reply: That's a good point. Plus it's more precise than `Standardize`. Let's keep it MVN. We can always add an alias for `Standardize` if people are up in arms. In reply to: 264891367
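The mean-variance normalization (MVN) technique discussed above is what statisticians call standardization: subtract the mean, divide by the standard deviation. A minimal sketch in plain Python (hypothetical helper name, not the ML.NET or ONNX API):

```python
# Illustrative sketch of mean-variance normalization ("standardization").
# Each value becomes (x - mean) / std, so the output has mean 0 and
# (approximately) unit variance.
import math

def mean_variance_normalize(values, eps=1e-9):
    """Return (x - mean) / std for each x; eps guards against zero variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / n
    std = math.sqrt(var)
    return [(x - mean) / (std + eps) for x in values]

col = [2.0, 4.0, 6.0]
normalized = mean_variance_normalize(col)
# mean = 4, std = sqrt(8/3); the result is centered at 0 with unit variance
```

This matches the MVN operator definitions linked in the thread (ONNX and CoreML both compute the same subtract-mean, divide-by-std transform), which is why the reviewers felt the MVN name is both established and more precise than a `Standardize` alias.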