OneVersusAllModelParameters Strongly Typed #3928
Contracts.Check(IsValid(vm, ref inputType), "Predictor doesn't implement the expected interface");
_mappers[i] = vm;
_mappers[i] = vm as IValueMapperDist;
When you see a cast which must be successful, please use (T)vm rather than vm as T.
Line 591 uses the Contracts.Check to determine if the cast was successful. In fact, the cast is already happening on line 590, so I don't need to do it here again. I will remove the cast there. Does that work for you?
Sounds good.
@@ -227,24 +222,25 @@ public override MulticlassPredictionTransformer<OneVersusAllModelParameters> Fit
var transformer = TrainOne(ch, Trainer, td, i);
featureColumn = transformer.FeatureColumnName;
}
predictors[i] = TrainOne(ch, Trainer, td, i).Model as T;
predictors[i] = TrainOne(ch, Trainer, td, i).Model as T;
predictors[i] = (T)TrainOne(ch, Trainer, td, i).Model;
This also happens on line 152; I will fix it in both locations.
Log-loss: 0.393949
Log-loss reduction: 0.637753
Precision ||0.8750 |0.0000 |0.5091 |
Accuracy(micro-avg): 0.620253
This drop is large. I would assume this change shouldn't dramatically affect prediction quality... any idea why?
Before it was taking an uncalibrated trainer, calibrating it automatically, and then using the calibrated trainer to get these results. With the changes in this PR, it no longer automatically calibrates a trainer. So these results are using the raw values instead of calibrated ones. That is why the drop happens. If you passed in a calibrated trainer, the output would be the same.
I know the calibrator is no longer present. As a post-processing step, a calibrator should not affect the relative ranking of scores. Say we have three classes, A, B, and C. If a calibrator is appended, their scores might be 0.1, 0.5, 0.8; without a calibrator, the scores might become -100, 100, 10000. If my assumption is correct, there should be no accuracy change. Did I miss something?
I believe this behavior is expected, but I will look more into this and see what I find.
So after doing more research, this change matches what is currently in the master branch.
The current file in master is produced with a calibrated learner. To confirm my changes are not impacting accuracy, I checked out master, changed the tests that create these baselines to not use probabilities, and ran them. The output from master without probabilities matches this file exactly.
To repro this, run the MulticlassReductionTest in the TestPredictors class. Make sure that you update TestLearners.Ova and TestLearners.OvaWithFastForest to not use probabilities, like so: SubComponent("OVA", "p=AvgPer{ lr=0.8 } useProb=-") and SubComponent("OVA", "p=FastForest{ } useProb=-"). These are found in the Learners.cs class. If these numbers are not what they should be, then there is a bug in the non-calibrated version of these trainers, but if it is a bug it already exists in master and my code is not changing it.
@@ -284,7 +284,7 @@ private protected static TCalibrator GetCalibrator(IHostEnvironment env, ModelLo
}
}

internal abstract class ValueMapperCalibratedModelParametersBase<TSubModel, TCalibrator> :
public abstract class ValueMapperCalibratedModelParametersBase<TSubModel, TCalibrator> :
This and the changes below expose some very complicated classes to the users. From my understanding, the major things to be exposed in OVA are TModel[] and CalibratedModelParametersBase<TSubModel, TCalibrator>, not the classes here. Is it possible to keep those classes internal? I know some trainers produce ValueMapperCalibratedModelParametersBase, but it can be cast to CalibratedModelParametersBase<TSubModel, TCalibrator>.
Good point. I have changed it back to internal and made everything that is public use CalibratedModelParametersBase instead.
Accuracy(micro-avg): 0.666667
Accuracy(macro-avg): 0.666667
Log-loss: 17.052120
Log-loss reduction: -14.521509
Can you benchmark this PR before committing? All of the unit tests indicate a large drop in metrics.
See the comment I made to Wei-Shing above. The drop is also present in master when not using a calibrated learner. Part of this PR makes it so that you have to pass in a calibrated learner if you want one; the OneVersusAll model no longer auto-creates one for you. Because of this, I had to change the tests to not use a calibrated learner as well, hence the drop in metrics. This can be reproduced in master by following the steps listed above.
This demonstrates you can match the same low score. Can the new version match the much higher scores of the old version?
I would recommend modifying the unit tests so that the old high scores are maintained.
The problem is that it is now a two-step process: you have to create the calibrated trainer, and then use that in the OneVersusAll trainer. In this particular case, the test is using the command line parameters p=AvgPer..., and I couldn't see a way to use the command line parameters to specify a calibrated trainer instead of a non-calibrated one. Do you know if we have it set up so that we can do that? If so, could you point me in the right direction? From what I saw, it didn't look like we had the option to specify a calibrated trainer directly from the command line.
That sounds like a breaking change for MAML users (mostly internal users). I do not think it's possible to calibrate a trainer within OVA in MAML, besides telling OVA to calibrate it for you. Is there a way to keep the MAML interface as it was (and achieve the goals of this PR)?
Speaking of, what is the goal of the PR? Seems we're just making it very hard for users to get good scores, whereas the current behavior simply gives them a good score by default.
The goal of this PR is to have the OVAModelParameters strongly typed so that the users can access the underlying models. Right now they are cast as an array of object, so the user cannot do anything with them without knowing their type and casting it. This PR strongly types OVAModelParameters and anything that uses that so the model can be accessed directly. I will add unit tests with a calibrated trainer showing the results are the same as what is there currently.
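For illustration, here is a rough sketch (not code from this PR) of the kind of access this is aiming for. The member names SubModelParameters and Calibrator are taken from the existing tests; the trainer choice and overall shape are assumptions:

```csharp
using System;
using Microsoft.ML;

// Sketch only: assumes the PR's strongly typed OneVersusAllModelParameters<T>,
// with SubModelParameters/Calibrator named as in the existing tests.
public static class OvaTypedAccessSketch
{
    public static void Inspect(MLContext mlContext, IDataView trainData)
    {
        // A calibrated binary trainer; under this PR its model type is
        // expected to flow through to the OVA model parameters.
        var binaryTrainer = mlContext.BinaryClassification.Trainers.LbfgsLogisticRegression();
        var ova = mlContext.MulticlassClassification.Trainers.OneVersusAll(binaryTrainer);

        var model = ova.Fit(trainData);

        // Previously the sub-models came back as object and had to be cast;
        // with strongly typed parameters they can be used directly.
        foreach (var subModel in model.Model.SubModelParameters)
            Console.WriteLine(subModel.Calibrator.GetType().Name);
    }
}
```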
Sounds like a good goal. It'll be great to see unit tests showing the same good scores as before. Thanks for all your work.
Is there a way to keep the same interface/behavior for MAML folks?
This change may also affect NimbusML. They may have their own OVA, at which point they would be unaffected.
/// </summary>
public sealed class OneVersusAllModelParameters :
public sealed class OneVersusAllModelParameters<T> :
Is this change backward compatible? What if a user is using OneVersusAllModelParameters (no <T>)?
It is not backwards compatible. After discussing the change with @codemzs we decided to leave this as a breaking change.
@eerhardt, any comment?
I've made a similar comment here. I don't believe making an API breaking change on a minor version release is the right decision.
@michaelgsharp Can you please rebase this PR? It shows as 189 files modified and contains commits already in master.
Rebase is done. Just making sure all the builds pass.
@@ -0,0 +1,5 @@
Compat issues with assembly Microsoft.ML.StandardTrainers:
MembersMustExist : Member 'Microsoft.ML.StandardTrainersCatalog.OneVersusAll<TModel>(Microsoft.ML.MulticlassClassificationCatalog.MulticlassClassificationTrainers, Microsoft.ML.Trainers.ITrainerEstimator<Microsoft.ML.Data.BinaryPredictionTransformer<TModel>, TModel>, System.String, System.Boolean, Microsoft.ML.IEstimator<Microsoft.ML.ISingleFeaturePredictionTransformer<Microsoft.ML.Calibrators.ICalibrator>>, System.Int32, System.Boolean)' does not exist in the implementation but it does exist in the contract.
The recommendation here from the .NET team is to not introduce a breaking change. It breaks our guarantee of having stable APIs (see https://devblogs.microsoft.com/dotnet/announcing-ml-net-1-0-rc-machine-learning-for-net/#updates-in-mlnet-10-rc-timeframe).
If we decide we must make a breaking change, it should be done on a major version change. Please see https://semver.org/. We should not be making an API breaking change on a minor or patch version update.
We should not be making an API breaking change on a minor or patch version update.
Agreed. My recommendation is to work on a process for breaking changes. We don't want to go back to where .NET Framework was (i.e. no breaking changes), but we need to make it predictable for customers. My general recommendation would be: obsolete the API in a point release, leave the behavior unchanged, and remove the API in the next major version (2.x). Don't perform API- or behavior-breaking changes in point releases; nobody can reason about that.
@@ -313,7 +313,7 @@ public static (Vector<float> score, Key<uint, TVal> predictedLabel)
int? minimumExampleCountPerLeaf = null,
double? learningRate = null,
int numberOfIterations = Defaults.NumberOfIterations,
Action<OneVersusAllModelParameters> onFit = null)
Action<OneVersusAllModelParameters<CalibratedModelParametersBase<LightGbmBinaryModelParameters, PlattCalibrator>>> onFit = null)
Why does this have to be a PlattCalibrator?
Same question in the LightGbmMulticlassTrainer.cs file.
So the LightGbm stuff was already hard-coded in master in the functions themselves. If you look at line 197 in the master branch in LightGbmMulticlassTrainer, it is hard-coded there to return FeatureWeightsCalibratedModelParameters<LightGbmBinaryModelParameters, PlattCalibrator>. My change just puts that type in the function signatures as well. From what I saw, with LightGbm you do not have the option to pass in a calibrator and have to take this default.
@@ -655,7 +655,7 @@ private void MixMatch(string dataPath)
var binaryTrainer = mlContext.BinaryClassification.Trainers.AveragedPerceptron("Label", "Features");

// Append the OVA learner to the pipeline.
dynamicPipe = dynamicPipe.Append(mlContext.MulticlassClassification.Trainers.OneVersusAll(binaryTrainer));
dynamicPipe = dynamicPipe.Append(mlContext.MulticlassClassification.Trainers.OneVersusAll(binaryTrainer, useProbabilities: false));
Whenever you update the CookbookSamples.cs file, it usually means you need to update the https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetCookBook.md file as well.
@@ -340,8 +340,8 @@ public void LightGbmMulticlassEstimatorCorrectSigmoid()
var model = trainer.Fit(transformedDataView, transformedDataView);

// The slope in the all the calibrators should be equal to the negative of the sigmoid passed into the trainer.
Assert.True(model.Model.SubModelParameters.All(predictor =>
((FeatureWeightsCalibratedModelParameters<LightGbmBinaryModelParameters, PlattCalibrator>)predictor).Calibrator.Slope == -sigmoid));
// Since the Model/SubModelParameters are strongly typed, no cast is needed.
This comment doesn't appear to add value after this change. Someone in a few months will read this, and not understand why it is talking about casting.
Good point. I hadn't thought about it like that before. Removed that comment.
/// <param name="labelColumnName">The name of the label column.</param>
/// <param name="imputeMissingLabelsAsNegative">Whether to treat missing labels as having negative labels, instead of keeping them missing.</param>
/// <param name="maximumCalibrationExampleCount">Number of instances to train the calibrator.</param>
/// <param name="useProbabilities">Use probabilities (vs. raw outputs) to identify top-score category.</param>
/// <param name="useProbabilities">Use probabilities (vs. raw outputs) to identify top-score category. If specified, the type of the model must support probabilities or a runtime error will be thrown.</param>
It seems like the default should be changed from true to false. Having a default that throws an exception doesn't seem like great behavior IMO.
I haven't run the benchmarks myself, though it seems like a large drop in metrics if you use uncalibrated models.
So the options would be:
- Default to false -- and give poor metrics (which makes the OVA rather useless)
- Keep the default as true -- and throw an error if an uncalibrated model is used (forcing the user down a path to success)
Granted, similar to wschin's comment, I wasn't expecting a very large drop in metrics from using uncalibrated models, as the sort order of the base models' scores should be roughly the same.
If we are moving forward with no longer auto-calibrating in OVA, we should also have a sample and a unit test showing how to calibrate manually.
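For reference, a minimal sketch of what such a sample might look like under the PR's new behavior (the trainer names are taken from elsewhere in this thread; this is not code from the PR):

```csharp
using Microsoft.ML;

// Sketch only: assumes this PR's behavior where OneVersusAll no longer
// auto-calibrates and the caller chooses a calibrated or raw setup.
public static class OvaCalibrationSketch
{
    public static void Train(MLContext mlContext, IDataView trainData)
    {
        // Option 1: pass a trainer that already produces calibrated models,
        // so the OVA scores are probabilities (the old default behavior).
        var calibratedTrainer = mlContext.BinaryClassification.Trainers.LbfgsLogisticRegression();
        var ovaCalibrated = mlContext.MulticlassClassification.Trainers.OneVersusAll(calibratedTrainer);
        var calibratedModel = ovaCalibrated.Fit(trainData);

        // Option 2: pass an uncalibrated trainer and opt out of probabilities,
        // so the OVA ranks classes by raw scores (what the updated tests do).
        var rawTrainer = mlContext.BinaryClassification.Trainers.AveragedPerceptron();
        var ovaRaw = mlContext.MulticlassClassification.Trainers.OneVersusAll(rawTrainer, useProbabilities: false);
        var rawModel = ovaRaw.Fit(trainData);
    }
}
```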
Closing this PR since the code has substantially changed. I made a brand new one: #4013.
Fixes #2467
We used to remove all type information when we constructed our OneVersusAllModelParameters. This prevented users from accessing the inner models without run-time casting. This PR makes the OneVersusAllModelParameters strongly typed, as well as the OneVersusAllTrainer and the LightGbmMulticlassTrainer, since they use the OneVersusAllModelParameters. This change is a breaking change to the current public API. The LightGbmMulticlassTrainer model is typed as FeatureWeightsCalibratedModelParameters<LightGbmBinaryModelParameters, PlattCalibrator>, as that is what it is hard-coded to return on line 190. The OneVersusAllTrainer determines the type automatically if you use the OneVersusAll method in the StandardTrainersCatalog.

As part of this change, the OneVersusAllTrainer will no longer auto-calibrate the model; you must pass in an already-calibrated trainer if you want the model to be calibrated.

DISCUSSION POINT - Currently the useProbabilities flag is still allowed. You must set it to false with a non-calibrated trainer or an error will be thrown. If your trainer is calibrated, you can pass either true or false and it will use either raw values or calibrated ones accordingly. Do we still want to allow this? Or would it be better to remove the useProbabilities flag and have non-calibrated trainers always return raw results and calibrated trainers always return calibrated results? This would simplify the call and prevent errors where you forget to set useProbabilities to false with a non-calibrated trainer, but it would also prevent you from getting raw results from a calibrated trainer.

Since the OneVersusAllTrainer no longer auto-calibrates the trainer, the testing files we use for comparison were also updated to reflect that.

Some calibrators were changed from internal scope to public scope, as we now return them as a type for the user to use.

The test for this is just adding a line to the existing LightGbmMulticlassEstimatorCorrectSigmoid test showing that casting is not needed to access the internal model.
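For concreteness, a sketch of what the no-cast assertion could look like; the generic type and member names are assumptions based on the old test and this PR, not the final code:

```csharp
using System.Linq;
using Microsoft.ML.Calibrators;
using Microsoft.ML.Trainers;
using Microsoft.ML.Trainers.LightGbm;
using Xunit;

// Sketch only: assumes the PR's OneVersusAllModelParameters<T> exposes a typed
// SubModelParameters collection; member names follow the old test.
internal static class SigmoidAssertionSketch
{
    public static void AssertSlopes(
        OneVersusAllModelParameters<FeatureWeightsCalibratedModelParameters<LightGbmBinaryModelParameters, PlattCalibrator>> ovaModel,
        double sigmoid)
    {
        // The old test had to cast each predictor from object before reading
        // Calibrator.Slope; with strong typing the cast goes away.
        Assert.True(ovaModel.SubModelParameters.All(predictor => predictor.Calibrator.Slope == -sigmoid));
    }
}
```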