-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Converted normalizers to be estimators #797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Column = new[]{ | ||
new TextLoader.Column("float1", DataKind.R4, 1), | ||
new TextLoader.Column("float4", DataKind.R4, new[]{new TextLoader.Range(1, 4) }), | ||
new TextLoader.Column("float4", DataKind.R4, new[]{new TextLoader.Range(1, 4) }), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it on purpose to have two float4? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public readonly string Output; | ||
public readonly long MaxTrainingExamples; | ||
|
||
protected ColumnBase(string input, string output, long maxTrainingExamples) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
protected [](start = 12, length = 9)
Is there any reason not to make this and all these new public abstract classes you're creating subclassable by literally anyone? I don't see so, at least in the code written here. So how about a nice private protected
to ease your conscience? :) #Resolved
@@ -13,22 +13,21 @@ | |||
using Microsoft.ML.Runtime.Model.Pfa; | |||
using Microsoft.ML.Runtime.Internal.Internallearn; | |||
using Newtonsoft.Json.Linq; | |||
using Microsoft.ML.Runtime.EntryPoints; | |||
using System.Linq; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like how we start with system namespaces, then we get into Microsoft namespaces, then towards the end we go back to system. It sort of reminds me of one of those stories where the protagonist starts at home, goes off to the big city, then comes right back where they started after many adventures, but everything's a bit different. So please keep this. #Resolved
/// </summary> | ||
public delegate void SignatureLoadColumnFunction(ModelLoadContext ctx, IHost host, ColumnType typeSrc); | ||
|
||
public interface IColumnFunctionBuilder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IColumnFunctionBuilder [](start = 21, length = 22)
I wonder if we can make all these interfaces internal. I know the idea was "hey we'll eventually have a bunch of transforms use this stuff" but that's never going to happen. As it is just keeping it public just sort of muddies the waters. Then maybe we can nuke them if ever we decide to reimplement normalizers to vaguely resemble how all the other transforms are implemented. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Zruty0 ... looks mostly good, except I hope we can lock down some of what is currently public for obvious reasons.
NormalizeTransform.BinNormalizerUserName, "BinNormalizer", NormalizeTransform.BinNormalizerShortName)] | ||
|
||
[assembly: LoadableClass(NormalizeTransform.SupervisedBinNormalizerSummary, typeof(NormalizeTransform), typeof(NormalizeTransform.SupervisedBinArguments), typeof(SignatureDataTransform), | ||
NormalizeTransform.SupervisedBinNormalizerUserName, "SupervisedBinNormalizer", NormalizeTransform.SupervisedBinNormalizerShortName)] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, why is this going away? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This guy needs label column for training. I think I'll have a separate estimator for it, so that I don't have to pollute the 'normal' normalizers.
In reply to: 214791608 [](ancestors = 214791608)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, 'transformer'. I think we should instead get rid of this comment, since this method is not going to be in the public API. In reply to: 418247722 [](ancestors = 418247722) Refers to: src/Microsoft.ML.Data/Transforms/NormalizeColumn.cs:250 in c4cd6b7. [](commit_id = c4cd6b7, deletion_comment = False) |
Normalizers (except SupervisedBinning) become estimators.