-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Converted LpNorm, GcNorm and Whitening transforms into transformers/estimators… #961
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -88,5 +88,40 @@ public void TextTokenizationWorkout() | |||
CheckEquality("Text", "tokenized.tsv"); | |||
Done(); | |||
} | |||
|
|||
[Fact] | |||
public void LpGcNormAndWhiteningWorkout() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LpGcNormAndWhiteningWorkout [](start = 20, length = 27)
Test is failing right now. There is data mismatch. Looking into it. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// <param name="subMean">Subtract mean from each value before normalizing.</param> | ||
/// <param name="useStdDev">Normalize by standard deviation rather than L2 norm.</param> | ||
/// <param name="scale">Scale features by this value.</param> | ||
public static Vector<float> GcNormalize(this Vector<float> input, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GcNormalize [](start = 36, length = 11)
While 'Lp' is fine (it's a mathematical notation L_p), 'Gc' is probably not. 'GlobalContrast' ? Does it even make sense to have both as extensions though? They seem to share a lot in common #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they have different set of parameters and mixing them into one would not look good. I have renamed GcNormalize
to GlobalContrastNormalize
.
In reply to: 219260214 [](ancestors = 219260214)
/// <param name="maxRows">Max number of rows.</param> | ||
/// <param name="saveInverse">Whether to save inverse (recovery) matrix.</param> | ||
/// <param name="pcaNum">PCA components to retain.</param> | ||
public static Vector<float> Whiten(this Vector<float> input, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whiten [](start = 36, length = 6)
this sounds like a wacky thing to do to an arbitrary float array, is it not?
Do you think there is a good name that's more elaborate?
And if not, maybe consider not having the pigstension at all? #Resolved
var dataSource = new MultiFileSource(dataPath); | ||
|
||
var reader = TextLoader.CreateReader(env, | ||
c => (label: c.LoadFloat(11), features: c.LoadFloat(0, 10)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
11 [](start = 41, length = 2)
these are 0-based, are you sure? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are i think.
features are from 0-10 and label is 11. Do you mean something else?
In reply to: 219263470 [](ancestors = 219263470)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @zeahmed !
This PR fixes #959