Skip to content

ML.NET public API exposes parameter weights which is not used. #2175

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
abgoswam opened this issue Jan 17, 2019 · 6 comments
Closed

ML.NET public API exposes parameter weights which is not used. #2175

abgoswam opened this issue Jan 17, 2019 · 6 comments
Assignees
Labels
bug Something isn't working

Comments

@abgoswam
Copy link
Member

abgoswam commented Jan 17, 2019

For some of the learners e.g. SdcaRegression ML.NET public API exposes parameter weights

public static SdcaRegressionTrainer StochasticDualCoordinateAscent(this RegressionContext.RegressionTrainers ctx,
string labelColumn = DefaultColumnNames.Label,
string featureColumn = DefaultColumnNames.Features,
string weights = null,
ISupportSdcaRegressionLoss loss = null,
float? l2Const = null,

However, the advanced Options for SdcaRegressionTrainer do not have a field for 'WeightColumn'. Also, I believe the algo itself does not use weights (need to verify)

We need to scrub our public API for such spurious uses of weights parameter.

EDIT :

  • Based on investigations (see below), we see that the SdcaRegressionTrainer does use the weights column. So the proper fix would be to ensure that the weight column passed in the constructor gets used by the trainer.
  • For the example above, we need to add WeightColumn in the advanced Options for SdcaRegressionTrainer and verify it gets used by the SDCA trainer.

@sfilipi @glebuk @TomFinley

@abgoswam abgoswam added the API Issues pertaining the friendly API label Jan 17, 2019
@glebuk
Copy link
Contributor

glebuk commented Jan 18, 2019

There are several additional problems with weigh parameters::

  • Name of the argument should be consistent with outer column names, it should end on *Column.
  • method should only have the parameter when algo uses it.
  • the term "weight" is ambigious with model weights. Perhaps we should disambiguate this further by calling it exampleWeightColumn
  • Confusing behavior between providing input values null and "weight":
  1. if arg is not specified and Weight column exists - column is used
  2. if arg is not specified and Weight column does NOT exist - no error
  3. if arg is specified and Weight column exist - column is used
  4. if arg is specified and Weight column does NOT exist - crash!

This comes from the command line days. However it does not make much sense from the API point of veiw.

@glebuk
Copy link
Contributor

glebuk commented Jan 18, 2019

seems related to issue #2177

@abgoswam abgoswam self-assigned this Jan 24, 2019
@TomFinley
Copy link
Contributor

TomFinley commented Jan 25, 2019

However, the advanced Options for SdcaRegressionTrainer do not have a field for 'WeightColumn'. Also, I believe the algo itself does not use weights (need to verify)

Hi @abgoswam ... from what I see, it does. See here, in the SdcaTrainerBase.TrainCore method:

invariants[idx] = Loss.ComputeDualUpdateInvariant(invariantCoeff * normSquared * lambdaNInv * GetInstanceWeight(cursor));

If the "connection" from weights in the constructor to weights in the trainer has been broken, that is what must be fixed. You should not remove the option -- you should make it work, right? At least so I would think.

@abgoswam
Copy link
Member Author

abgoswam commented Jan 27, 2019

@TomFinley . Correct. As you pointed out, for this issue we need a fix whereby the weight column passed in the constructor gets used by the trainer.

I have edited the bug description accordingly.

@abgoswam
Copy link
Member Author

@glebuk . I created a separate issue #2257 for the 2 comments above regarding naming convention for the weights parameter

  • Name of the argument should be consistent with outer column names, it should end on *Column.
  • The term "weight" is ambigious with model weights. Perhaps we should disambiguate this further by calling it exampleWeightColumn

@abgoswam
Copy link
Member Author

abgoswam commented Feb 21, 2019

Removing api label, added bug label. Hence removing it from Project13 as well.

@abgoswam abgoswam added bug Something isn't working and removed API Issues pertaining the friendly API labels Feb 21, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 25, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants