Towards #3204 - Documentation for MLContext.Transforms.Categorical #3388
Codecov Report
@@ Coverage Diff @@
## master #3388 +/- ##
==========================================
+ Coverage 72.69% 72.7% +<.01%
==========================================
Files 807 807
Lines 145172 145171 -1
Branches 16225 16225
==========================================
+ Hits 105533 105542 +9
+ Misses 35223 35217 -6
+ Partials 4416 4412 -4
Codecov Report
@@ Coverage Diff @@
## master #3388 +/- ##
=======================================
Coverage 72.76% 72.76%
=======================================
Files 808 808
Lines 145452 145452
Branches 16244 16244
=======================================
Hits 105842 105842
Misses 35189 35189
Partials 4421 4421
nit: There are some very long lines in this document; could you break them up? #Resolved Refers to: src/Microsoft.ML.Transforms/CategoricalCatalog.cs:30 in 75fb8c4.
75fb8c4
to
ae63aea
Compare
///
/// The output of this transform is specified by <xref:Microsoft.ML.Transforms.OneHotEncodingEstimator.OutputKind>:
///
/// - <xref:Microsoft.ML.Transforms.OneHotEncodingEstimator.OutputKind.Indicator> produces an [indicator vector](https://en.wikipedia.org/wiki/Indicator_vector).
-
just checking: is this valid for markdown lists? I've always used *
Both produce the same bullet
/// - <xref:Microsoft.ML.Transforms.OneHotEncodingEstimator.OutputKind.Key> produces keys in a <xref:Microsoft.ML.Data.KeyDataViewType> column.
/// If the input column is a vector, the output contains a vector of [keys](xref:Microsoft.ML.Data.KeyDataViewType), where each slot of the
/// vector corresponds to the respective slot of the input vector.
/// If a category is not found in the built dictionary, it is assigned the value zero.
zero
suggest adding: which represents the key missing from the dictionary.
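For readers following the thread, the behaviour described in the quoted doc comment can be sketched in plain Python. This is only an illustration of the semantics, not ML.NET code: the function names are hypothetical, the first-occurrence key ordering is an assumption, and the all-zeros indicator vector for an unseen category is also an assumption rather than something the doc comment states.

```python
def build_dictionary(values):
    """Assign 1-based keys to categories in order of first occurrence (assumed ordering)."""
    dictionary = {}
    for v in values:
        if v not in dictionary:
            dictionary[v] = len(dictionary) + 1
    return dictionary


def encode_key(dictionary, value):
    """Map a value (or each slot of a vector) to its key; unseen categories get 0."""
    if isinstance(value, (list, tuple)):
        # Vector input: each output slot corresponds to the same input slot.
        return [dictionary.get(v, 0) for v in value]
    return dictionary.get(value, 0)


def encode_indicator(dictionary, value):
    """Map a scalar value to an indicator vector with one slot per known category."""
    vec = [0] * len(dictionary)
    key = dictionary.get(value, 0)
    if key:  # assumption: an unseen category leaves every slot at 0
        vec[key - 1] = 1
    return vec


colors = build_dictionary(["red", "green", "red", "blue"])
print(encode_key(colors, "green"))
print(encode_key(colors, ["blue", "purple", "red"]))
print(encode_indicator(colors, "green"))
```

Here "purple" never appears in the data used to build the dictionary, so it maps to key zero, which is the "missing" value the review comment asks the documentation to call out.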
The mutual information of two random variables X and Y is a measure of the mutual dependence between the variables.
Formally, the mutual information can be written as:
</para>
<para>I(X;Y) = E[log(p(x,y)) - log(p(x)) - log(p(y))]</para>
Wasn't there a latex formula that Senja has created just recently for this? https://github.com/dotnet/machinelearning/pull/3448/files
This isn't applicable to the categorical transforms
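As a quick check of the formula quoted above, here is a minimal Python sketch that evaluates I(X;Y) for a small discrete joint distribution. The function name and the table layout (`joint[i][j]` holding p(X=i, Y=j)) are assumptions made for illustration; the expectation is taken over the joint distribution and the result is in nats.

```python
import math


def mutual_information(joint):
    """Return I(X;Y) in nats for a joint probability table joint[i][j] = p(X=i, Y=j)."""
    px = [sum(row) for row in joint]        # marginal p(x)
    py = [sum(col) for col in zip(*joint)]  # marginal p(y)
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:  # terms with p(x,y) = 0 contribute nothing
                total += pxy * (math.log(pxy) - math.log(px[i]) - math.log(py[j]))
    return total


independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]
print(mutual_information(independent))
print(mutual_information(dependent))
```

Independent variables give a value of zero, while two perfectly dependent binary variables give log 2, which matches the reading of mutual information as a measure of dependence.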
Documentation for Categorical transforms, using the template given in #3204 and implemented in #3316.