-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Projection documentation #3232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Projection documentation #3232
Conversation
// Convert training data to IDataView, the general data type used in ML.NET. | ||
var data = mlContext.Data.LoadFromEnumerable(samples); | ||
// NormalizeLpNorm normalize rows individually by rescaling them to unit norm. | ||
// Performs the following operaion on a row X: Y = (X - M) / D where M is mean, and D is selected norm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the selected norm? Is it norm of the feature vector in a row being processed? Also, what are the shapes of X, Y, M, and D? #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mean
-> mean vector
D is selected norm
-> D is calculated value of selected norm parameter
Does that sound better?
In reply to: 273152443 [](ancestors = 273152443)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Norm on what? A column? A row? Is it a scalar? In tensor computation, norm operation can produce another tensor.
Say, if I have rows, x_1, x_2, x_3. Is M=1/3 (x_1 + x_2 + x_3) true? Or M=ReduceSum(x_i) for the x subscripted by i? In addition, is D=||x_i||_2 for the x subscripted by i?
In reply to: 273153905 [](ancestors = 273153905,273152443)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Please move the final answer to what NormalizeLpNorm does' to the section for the estimator.
In reply to: 273181155 [](ancestors = 273181155,273153905,273152443)
{ | ||
public static class ApproximatedKernelMap | ||
{ | ||
public static void Example() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public static void Example() | |
// Transform feature vector to another non-linear space. See https://people.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf. | |
public static void Example() |
This transform is non-trivial, so some references are required. #Resolved
|
||
private class DataPoint | ||
{ | ||
[VectorType(7)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
7 [](start = 24, length = 1)
It shouldn't work! #Resolved
|
||
private class DataPoint | ||
{ | ||
[VectorType(7)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it shouldn't work! #Resolved
//-0.0119, 0.5867, 0.4942, 0.7041 | ||
// 0.4720, 0.5639, 0.4346, 0.2671 | ||
//-0.2243, 0.7071, 0.7053, -0.1681 | ||
// 0.0846, 0.5836, 0.6575, 0.0581 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you move these lines below the foreach
loop and use:
// Expected output:
Could you do the same for the other files? #Resolved
Codecov Report
@@ Coverage Diff @@
## master #3232 +/- ##
==========================================
- Coverage 72.62% 72.62% -0.01%
==========================================
Files 807 807
Lines 145080 145080
Branches 16213 16213
==========================================
- Hits 105369 105365 -4
- Misses 35294 35297 +3
- Partials 4417 4418 +1
|
@@ -279,7 +279,7 @@ internal static LpNormNormalizingEstimator NormalizeLpNorm(this TransformsCatalo | |||
/// <example> | |||
/// <format type="text/markdown"> | |||
/// <] | |||
/// [!code-csharp[GlobalContrastNormalize](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/Transforms/NormalizeGlobalContrast.cs)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GlobalContrastNormalize [](start = 26, length = 23)
NormalizeGlobalContrast #Resolved
@@ -249,7 +249,7 @@ public static class NormalizationCatalog | |||
/// <example> | |||
/// <format type="text/markdown"> | |||
/// <] | |||
/// [!code-csharp[LpNormalize](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/Transforms/NormalizeLpNorm.cs)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LpNormalize [](start = 26, length = 11)
NormalizeLpNorm #Resolved
// Performs the following operaion on a row X: Y = (X - M(X)) / D(X) | ||
// where M(X) is scalar value of mean for current row, | ||
// and D(X) is scalar value of selected `norm` parameter . | ||
var approximation = mlContext.Transforms.NormalizeLpNorm("Features", norm: LpNormNormalizingEstimatorBase.NormFunction.L1, ensureZeroMean: true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ensureZeroMean [](start = 135, length = 14)
What does EnsureZeroMean do? Subtract the mean? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move parameter details to xml docstring.
In reply to: 273741392 [](ancestors = 273741392,273740225)
@@ -26,7 +26,7 @@ public static class KernelExpansionCatalog | |||
/// <example> | |||
/// <format type="text/markdown"> | |||
/// <] | |||
/// [!code-csharp[CreateRandomFourierFeatures](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/Transforms/ApproximatedKernelMap.cs)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CreateRandomFourierFeatures [](start = 26, length = 27)
ApproximatedKernelMap #Resolved
foreach (var row in column) | ||
Console.WriteLine(string.Join(", ", row.Select(x => x.ToString("f4")))); | ||
// Expected output: | ||
// -0.0119, 0.5867, 0.4942, 0.7041 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[](start = 14, length = 1)
Space Space #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to align numbers, so one space was taken by minus sign.
In reply to: 273740487 [](ancestors = 273740487)
// NormalizeLpNorm normalize rows individually by rescaling them to unit norm. | ||
// Performs the following operaion on a row X: Y = scale *(X - M(X)) / D(X) | ||
// where M(X) is scalar value of mean for current row, | ||
// and D(X) is scalar value of either Standard deviation or L2 norm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment looks like a copy/paste holdover. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
}; | ||
// Convert training data to IDataView, the general data type used in ML.NET. | ||
var data = mlContext.Data.LoadFromEnumerable(samples); | ||
// NormalizeLpNorm normalize rows individually by rescaling them to unit norm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NormalizeLpNorm [](start = 15, length = 15)
old name? #Resolved
}; | ||
// Convert training data to IDataView, the general data type used in ML.NET. | ||
var data = mlContext.Data.LoadFromEnumerable(samples); | ||
// NormalizeLpNorm normalize rows individually by rescaling them to unit norm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
normalize [](start = 31, length = 9)
normalizes #Resolved
// NormalizeLpNorm normalize rows individually by rescaling them to unit norm. | ||
// Performs the following operaion on a row X: Y = scale *(X - M(X)) / D(X) | ||
// where M(X) is scalar value of mean for current row if ensureZeroMean = true or 0 othewise | ||
// and D(X) is scalar value of either Standard deviation or L2 norm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's actually drop detailed algorithm descriptions inside examples. such details belong to the section and we don't want to repeat them again here. #Resolved
@@ -276,10 +281,16 @@ internal static LpNormNormalizingEstimator NormalizeLpNorm(this TransformsCatalo | |||
/// <param name="ensureZeroMean">If <see langword="true"/>, subtract mean from each value before normalizing and use the raw input otherwise.</param> | |||
/// <param name="ensureUnitStandardDeviation">If <see langword="true"/>, resulted vector's standard deviation would be one. Otherwise, resulted vector's L2-norm would be one.</param> | |||
/// <param name="scale">Scale features by this value.</param> | |||
/// <remarks> | |||
/// This transform performs the following operation on a row X: Y = scale * (X - M(X)) / D(X) | |||
/// where M(X) is scalar value of mean for current row if <paramref name="ensureZeroMean"/>set to <see langword="true"/> or <value>0</value> othewise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// where M(X) is scalar value of mean for current row if <paramref name="ensureZeroMean"/>set to <see langword="true"/> or <value>0</value> othewise | |
/// where M(X) is scalar value of mean for all elements in the current row if <paramref name="ensureZeroMean"/>set to <see langword="true"/> or <value>0</value> othewise | |
``` #Resolved |
/// This transform performs the following operation on a row X: Y = scale * (X - M(X)) / D(X) | ||
/// where M(X) is scalar value of mean for current row if <paramref name="ensureZeroMean"/>set to <see langword="true"/> or <value>0</value> othewise | ||
/// D(X) is scalar value of standard deviation for row if <paramref name="ensureUnitStandardDeviation"/> set to <see langword="true"/> or | ||
/// L2 norm value for this row if it set to <see langword="false"/> and scale is <paramref name="scale"/>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// L2 norm value for this row if it set to <see langword="false"/> and scale is <paramref name="scale"/>. | |
/// L2 norm of this row vector if <paramref name="ensureUnitStandardDeviation"/> set to <see langword="false"/>. "scale" is defined by <paramref name="scale"/>. | |
``` #Resolved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Towards #1209