Skip to content

Commit 2758879

Browse files
authored
Merge branch 'main' into master
2 parents e67804b + f8bbfaf commit 2758879

11 files changed

+1171
-23
lines changed

CONTRIBUTING.md

+36
Original file line numberDiff line numberDiff line change
@@ -14,11 +14,15 @@ information to effectively respond to your bug report or contribution.
1414
* [Contributing via Pull Requests (PRs)](#contributing-via-pull-requests-prs)
1515
* [Pulling Down the Code](#pulling-down-the-code)
1616
* [Running the Unit Tests](#running-the-unit-tests)
17+
* [Running Unit Tests and Debugging in PyCharm](#running-unit-tests-and-debugging-in-pycharm)
1718
* [Running the Integration Tests](#running-the-integration-tests)
1819
* [Making and Testing Your Change](#making-and-testing-your-change)
1920
* [Committing Your Change](#committing-your-change)
2021
* [Sending a Pull Request](#sending-a-pull-request)
2122
* [Finding Contributions to Work On](#finding-contributions-to-work-on)
23+
* [Setting Up Your Development Environment](#setting-up-your-development-environment)
24+
* [Setting Up Your Environment for Debugging](#setting-up-your-environment-for-debugging)
25+
* [PyCharm](#pycharm)
2226
* [Code of Conduct](#code-of-conduct)
2327
* [Security Issue Notifications](#security-issue-notifications)
2428
* [Licensing](#licensing)
@@ -65,6 +69,11 @@ You can also run a single test with the following command: `tox -e py36 -- -s -v
6569
* Note that the coverage test will fail if you only run a single test, so make sure to surround the command with `export IGNORE_COVERAGE=-` and `unset IGNORE_COVERAGE`
6670
* Example: `export IGNORE_COVERAGE=- ; tox -e py36 -- -s -vv tests/unit/test_sagemaker_steps.py::test_training_step_creation_with_model ; unset IGNORE_COVERAGE`
6771

72+
#### Running Unit Tests and Debugging in PyCharm
73+
You can also run the unit tests with the following options:
74+
* Right click on a test file in the Project tree and select `Run/Debug 'pytest' for ...`
75+
* Right click on the test definition and select `Run/Debug 'pytest' for ...`
76+
* Click on the green arrow next to test definition
6877

6978
### Running the Integration Tests
7079

@@ -168,6 +177,33 @@ Please remember to:
168177
Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels ((enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ['help wanted'](https://github.com/aws/aws-step-functions-data-science-sdk-python/labels/help%20wanted) issues is a great place to start.
169178
170179
180+
## Setting Up Your Development Environment
181+
182+
### Setting Up Your Environment for Debugging
183+
184+
Setting up your IDE for debugging tests locally will save you a lot of time.
185+
You might be able to `Run` and `Debug` the tests directly in your IDE with your default settings, but if it's not the case,
186+
follow the steps described in this section.
187+
188+
#### PyCharm
189+
1. Set your Default test runner to `pytest` in _Preferences → Tools → Python Integrated Tools_
190+
1. If you are using `PyCharm Professional Edition`, go to _Preferences → Build, Execution, Deployment → Python Debugger_ and set the options with following values:
191+
192+
| Option | Value |
193+
|:------------------------------------------------------------ |:----------------------|
194+
| Attach subprocess automatically while debugging | `Enabled` |
195+
| Collect run-time types information for code insight | `Enabled` |
196+
| Gevent compatible | `Disabled` |
197+
| Drop into debugger on failed tests | `Enabled` |
198+
| PyQt compatible | `Auto` |
199+
| For Attach to Process show processes with names containing | `python` |
200+
201+
This will allow you to break into all subprocesses of the process being debugged and preserve functions types while debugging.
202+
1. Debug tests in PyCharm as per [Running Unit Tests and Debugging in PyCharm](#running-unit-tests-and-debugging-in-pycharm)
203+
204+
_Note: This setup was tested and confirmed to work with
205+
`PyCharm 2020.3.5 (Professional Edition)` and `PyCharm 2021.1.1 (Professional Edition)`_
206+
171207
## Code of Conduct
172208

173209
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).

doc/services.rst

+22
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,8 @@ This module provides classes to build steps that integrate with Amazon DynamoDB,
88

99
- `Amazon DynamoDB <#amazon-dynamodb>`__
1010

11+
- `Amazon EKS <#amazon-eks>`__
12+
1113
- `Amazon EMR <#amazon-emr>`__
1214

1315
- `Amazon EventBridge <#amazon-eventbridge>`__
@@ -29,6 +31,26 @@ Amazon DynamoDB
2931

3032
.. autoclass:: stepfunctions.steps.service.DynamoDBUpdateItemStep
3133

34+
35+
Amazon EKS
36+
----------
37+
.. autoclass:: stepfunctions.steps.service.EksCallStep
38+
39+
.. autoclass:: stepfunctions.steps.service.EksCreateClusterStep
40+
41+
.. autoclass:: stepfunctions.steps.service.EksCreateFargateProfileStep
42+
43+
.. autoclass:: stepfunctions.steps.service.EksCreateNodeGroupStep
44+
45+
.. autoclass:: stepfunctions.steps.service.EksDeleteClusterStep
46+
47+
.. autoclass:: stepfunctions.steps.service.EksDeleteFargateProfileStep
48+
49+
.. autoclass:: stepfunctions.steps.service.EksDeleteNodegroupStep
50+
51+
.. autoclass:: stepfunctions.steps.service.EksRunJobStep
52+
53+
3254
Amazon EMR
3355
-----------
3456
.. autoclass:: stepfunctions.steps.service.EmrCreateClusterStep

src/stepfunctions/steps/__init__.py

+11
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,17 @@
1919
from stepfunctions.steps.sagemaker import TrainingStep, TransformStep, ModelStep, EndpointConfigStep, EndpointStep, TuningStep, ProcessingStep
2020
from stepfunctions.steps.compute import LambdaStep, BatchSubmitJobStep, GlueStartJobRunStep, EcsRunTaskStep
2121
from stepfunctions.steps.service import DynamoDBGetItemStep, DynamoDBPutItemStep, DynamoDBUpdateItemStep, DynamoDBDeleteItemStep
22+
23+
from stepfunctions.steps.service import (
24+
EksCallStep,
25+
EksCreateClusterStep,
26+
EksCreateFargateProfileStep,
27+
EksCreateNodeGroupStep,
28+
EksDeleteClusterStep,
29+
EksDeleteFargateProfileStep,
30+
EksDeleteNodegroupStep,
31+
EksRunJobStep,
32+
)
2233
from stepfunctions.steps.service import EmrCreateClusterStep, EmrTerminateClusterStep, EmrAddStepStep, EmrCancelStepStep, EmrSetClusterTerminationProtectionStep, EmrModifyInstanceFleetByNameStep, EmrModifyInstanceGroupByNameStep
2334
from stepfunctions.steps.service import EventBridgePutEventsStep
2435
from stepfunctions.steps.service import GlueDataBrewStartJobRunStep

src/stepfunctions/steps/fields.py

-1
Original file line numberDiff line numberDiff line change
@@ -60,7 +60,6 @@ class Field(Enum):
6060
HeartbeatSeconds = 'heartbeat_seconds'
6161
HeartbeatSecondsPath = 'heartbeat_seconds_path'
6262

63-
6463
# Retry and catch fields
6564
ErrorEquals = 'error_equals'
6665
IntervalSeconds = 'interval_seconds'

src/stepfunctions/steps/sagemaker.py

+27-18
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@
1919
from stepfunctions.inputs import Placeholder
2020
from stepfunctions.steps.states import Task
2121
from stepfunctions.steps.fields import Field
22-
from stepfunctions.steps.utils import tags_dict_to_kv_list
22+
from stepfunctions.steps.utils import merge_dicts, tags_dict_to_kv_list
2323
from stepfunctions.steps.integration_resources import IntegrationPattern, get_service_integration_arn
2424

2525
from sagemaker.workflow.airflow import training_config, transform_config, model_config, tuning_config, processing_config
@@ -30,6 +30,7 @@
3030

3131
SAGEMAKER_SERVICE_NAME = "sagemaker"
3232

33+
3334
class SageMakerApi(Enum):
3435
CreateTrainingJob = "createTrainingJob"
3536
CreateTransformJob = "createTransformJob"
@@ -479,7 +480,9 @@ class ProcessingStep(Task):
479480
Creates a Task State to execute a SageMaker Processing Job.
480481
"""
481482

482-
def __init__(self, state_id, processor, job_name, inputs=None, outputs=None, experiment_config=None, container_arguments=None, container_entrypoint=None, kms_key_id=None, wait_for_completion=True, tags=None, **kwargs):
483+
def __init__(self, state_id, processor, job_name, inputs=None, outputs=None, experiment_config=None,
484+
container_arguments=None, container_entrypoint=None, kms_key_id=None, wait_for_completion=True,
485+
tags=None, **kwargs):
483486
"""
484487
Args:
485488
state_id (str): State name whose length **must be** less than or equal to 128 unicode characters. State names **must be** unique within the scope of the whole state machine.
@@ -491,15 +494,18 @@ def __init__(self, state_id, processor, job_name, inputs=None, outputs=None, exp
491494
outputs (list[:class:`~sagemaker.processing.ProcessingOutput`]): Outputs for
492495
the processing job. These can be specified as either path strings or
493496
:class:`~sagemaker.processing.ProcessingOutput` objects (default: None).
494-
experiment_config (dict, optional): Specify the experiment config for the processing. (Default: None)
495-
container_arguments ([str]): The arguments for a container used to run a processing job.
496-
container_entrypoint ([str]): The entrypoint for a container used to run a processing job.
497-
kms_key_id (str): The AWS Key Management Service (AWS KMS) key that Amazon SageMaker
497+
experiment_config (dict or Placeholder, optional): Specify the experiment config for the processing. (Default: None)
498+
container_arguments ([str] or Placeholder): The arguments for a container used to run a processing job.
499+
container_entrypoint ([str] or Placeholder): The entrypoint for a container used to run a processing job.
500+
kms_key_id (str or Placeholder): The AWS Key Management Service (AWS KMS) key that Amazon SageMaker
498501
uses to encrypt the processing job output. KmsKeyId can be an ID of a KMS key,
499502
ARN of a KMS key, alias of a KMS key, or alias of a KMS key.
500503
The KmsKeyId is applied to all outputs.
501504
wait_for_completion (bool, optional): Boolean value set to `True` if the Task state should wait for the processing job to complete before proceeding to the next step in the workflow. Set to `False` if the Task state should submit the processing job and proceed to the next step. (default: True)
502-
tags (list[dict], optional): `List to tags <https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html>`_ to associate with the resource.
505+
tags (list[dict] or Placeholder, optional): `List to tags <https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html>`_ to associate with the resource.
506+
parameters(dict, optional): The value of this field is merged with other arguments to become the request payload for SageMaker `CreateProcessingJob<https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateProcessingJob.html>`_.
507+
You can use `parameters` to override the value provided by other arguments and specify any field's value dynamically using `Placeholders<https://aws-step-functions-data-science-sdk.readthedocs.io/en/stable/placeholders.html?highlight=placeholder#stepfunctions.inputs.Placeholder>`_.
508+
503509
"""
504510
if wait_for_completion:
505511
"""
@@ -518,22 +524,25 @@ def __init__(self, state_id, processor, job_name, inputs=None, outputs=None, exp
518524
SageMakerApi.CreateProcessingJob)
519525

520526
if isinstance(job_name, str):
521-
parameters = processing_config(processor=processor, inputs=inputs, outputs=outputs, container_arguments=container_arguments, container_entrypoint=container_entrypoint, kms_key_id=kms_key_id, job_name=job_name)
527+
processing_parameters = processing_config(processor=processor, inputs=inputs, outputs=outputs, container_arguments=container_arguments, container_entrypoint=container_entrypoint, kms_key_id=kms_key_id, job_name=job_name)
522528
else:
523-
parameters = processing_config(processor=processor, inputs=inputs, outputs=outputs, container_arguments=container_arguments, container_entrypoint=container_entrypoint, kms_key_id=kms_key_id)
529+
processing_parameters = processing_config(processor=processor, inputs=inputs, outputs=outputs, container_arguments=container_arguments, container_entrypoint=container_entrypoint, kms_key_id=kms_key_id)
524530

525531
if isinstance(job_name, Placeholder):
526-
parameters['ProcessingJobName'] = job_name
532+
processing_parameters['ProcessingJobName'] = job_name
527533

528534
if experiment_config is not None:
529-
parameters['ExperimentConfig'] = experiment_config
530-
535+
processing_parameters['ExperimentConfig'] = experiment_config
536+
531537
if tags:
532-
parameters['Tags'] = tags_dict_to_kv_list(tags)
533-
534-
if 'S3Operations' in parameters:
535-
del parameters['S3Operations']
536-
537-
kwargs[Field.Parameters.value] = parameters
538+
processing_parameters['Tags'] = tags if isinstance(tags, Placeholder) else tags_dict_to_kv_list(tags)
539+
540+
if 'S3Operations' in processing_parameters:
541+
del processing_parameters['S3Operations']
542+
543+
if Field.Parameters.value in kwargs and isinstance(kwargs[Field.Parameters.value], dict):
544+
# Update processing_parameters with input parameters
545+
merge_dicts(processing_parameters, kwargs[Field.Parameters.value])
538546

547+
kwargs[Field.Parameters.value] = processing_parameters
539548
super(ProcessingStep, self).__init__(state_id, **kwargs)

0 commit comments

Comments
 (0)