Skip to content

fix: supplying hyperparameters to training step constructor drops hyperparameters specified in estimator #144

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 18, 2021

Conversation

shivlaks
Copy link
Contributor

@shivlaks shivlaks commented Jun 14, 2021

Summary

Hyperparameters can be specified in the estimator object and hyperparameters property.
Both of which are taken in the constructor of the TrainingStep class.

The current behaviour drops any hyperparameters that were specified in the estimator if the property
is set in the TrainingStep constructor. This is undesirable as the estimators often specify algorithm specific hyperparameters out of the box that we don't want to drop.

This change merges the hyperparameters in the constructor as well as the estimator that is used in TrainingStep.
If there are duplicate keys, the hyperparameters specified in the constructor will be used.

Closes #99, #72


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@shivlaks shivlaks requested a review from wong-a June 14, 2021 21:45
parameters['HyperParameters'] = hyperparameters
merged_hyperparameters = {}
if estimator.hyperparameters() is not None:
merged_hyperparameters.update(estimator.hyperparameters())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re:

should we warn/error instead of an in-place merge when we find duplicate keys?

Throwing an error wouldn't be helpful; it's more breaking (you can't create what you could before) and doesn't address the desired functionality in #99. Logging might at an INFO level may be helpful for duplicate keys.

We should also document if not already that parameters gets totally overridden by training_config. This isn't the case for all service integration steps. Maybe we should adopt an update strategy there too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was originally thinking that we should log it as a warning since INFO tends to generally also include a lot of junk but not strongly opinionated.

I'll re-spin the update calls to assemble the dicts so that they log something on duplicate keys.
absolutely agree that we need to document.

This isn't the case for all service integration steps. Maybe we should adopt an update strategy there too?

great call. wasn't on my radar, but I'm in favour of adopting the update strategy. The most consistent it is across things in the SDK, the more intuitive and idiomatic it will feel for users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make the changes to service integration steps in a separate PR. let me know if you had a different thought/idea of where we should be documenting this behaviour @wong-a

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the other way around actually. In all service integration steps besides sagemaker, the constructor accepts a parameters argument that becomes Parameters . We can update sagemaker step classes to accept a parameters dict which can be an escape hatch for full API coverage or override any explicitly exposed arguments.

For example, DynamoDBUpdateItem the caller must specify all parameters in the parameters field. The constructor doesn't have a table_name argument to set or other required fields:
https://github.com/aws/aws-step-functions-data-science-sdk-python/blob/main/src/stepfunctions/steps/service.py#L140-L167

Whereas the sagemaker steps always construct parameters using the sagemaker SDK and some special arguments in the constructor. You could provide parameters because the constructor accepts **kwargs, but it won't do anything.
https://github.com/aws/aws-step-functions-data-science-sdk-python/blob/main/src/stepfunctions/steps/sagemaker.py#L477-L479

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean. I'm in favour of adding that parameters property and will address it in another PR. Escape hatches are powerful because it'll give users a path forward without requiring first class support to be developed and released.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's out of scope of this PR. Can you create an issue for tracking?

@shivlaks shivlaks changed the title [DRAFT] fix: supplying hyperparameters to training step constructor drops hyperparameters specified in estimator fix: supplying hyperparameters to training step constructor drops hyperparameters specified in estimator Jun 17, 2021
@shivlaks shivlaks marked this pull request as ready for review June 17, 2021 16:43
Copy link
Contributor

@wong-a wong-a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM besides making documentation clearer

wong-a
wong-a previously approved these changes Jun 17, 2021
Copy link
Contributor

@wong-a wong-a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

< ship it >
 ---------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

ca-nguyen
ca-nguyen previously approved these changes Jun 18, 2021
@shivlaks shivlaks dismissed stale reviews from ca-nguyen and wong-a via c431f7c June 18, 2021 05:40
@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-sEHrOdk7acJc
  • Commit ID: c431f7c
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@shivlaks shivlaks merged commit 349fc11 into main Jun 18, 2021
shivlaks added a commit that referenced this pull request Jun 21, 2021
bump for `v2.2.0` which includes the following changes:

Features
* Placeholders in TrainingStep to set S3 location for InputDataConfig and OutputDataConfig (#142)
* EventBridge service integration (#147)

Fixes
* supplying hyperparameters to training step constructor drops hyperparameters specified in estimator (#144)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enhancement: TrainingStep hyperparameters updating the estimator ones
4 participants