Commit 3665703

Python script to upload sample data to S3 for training and hpo jobs (#38)
1 parent 6a68963 commit 3665703

File tree: 15 files changed, +82 -20 lines changed

Diff for: samples/README.md

+20 -1

@@ -6,6 +6,7 @@ This sample demonstrates how to start jobs using your own script, packaged in a
 
 Follow the instructions on our [installation page](/README.md#getting-started) to create a Kubernetes cluster and install sagemaker controller.
 
+### SageMaker execution IAM role
 Run the following commands to create a SageMaker execution IAM role which is used by SageMaker service to access AWS resources.
 
 ```
@@ -20,7 +21,25 @@ SAGEMAKER_EXECUTION_ROLE_ARN=$(aws iam get-role --role-name ${SAGEMAKER_EXECUTIO
 
 echo $SAGEMAKER_EXECUTION_ROLE_ARN
 ```
-Note down the execution role ARN to use in samples
+Note down the execution role ARN to use in samples.
+
+### S3 Bucket
+Run the following commands to create an S3 bucket which is used by SageMaker service to access and upload data. Use the region you used during installation unless you are trying cross-region access. SageMaker resources will be created in this region.
+```
+export AWS_DEFAULT_REGION=<REGION>
+export S3_BUCKET_NAME="ack-data-bucket-$AWS_DEFAULT_REGION"
+
+# [Option 1] if your region is us-east-1
+aws s3api create-bucket --bucket $S3_BUCKET_NAME --region $AWS_DEFAULT_REGION
+
+# [Option 2] if your region is NOT us-east-1
+aws s3api create-bucket --bucket $S3_BUCKET_NAME --region $AWS_DEFAULT_REGION \
+    --create-bucket-configuration LocationConstraint=$AWS_DEFAULT_REGION
+
+echo $S3_BUCKET_NAME
+```
+Note down your S3 bucket name which will be used in the samples.
+
 ### Creating your first Job
 
 The easiest way to start is taking a look at the sample training jobs and its corresponding [README](/samples/training/README.md)
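The bucket-creation commands above branch on region because S3 CreateBucket treats us-east-1 specially: it is the default and must not receive a LocationConstraint, while every other region must pass one. A minimal sketch of that rule in Python (the helper name `create_bucket_params` is illustrative, not part of the samples):

```python
def create_bucket_params(bucket: str, region: str) -> dict:
    """Build S3 CreateBucket arguments.

    Mirrors Option 1 / Option 2 above: us-east-1 must omit
    CreateBucketConfiguration; any other region must name itself
    as the LocationConstraint.
    """
    params = {"Bucket": bucket}
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params


# The returned dict matches what the two aws s3api invocations send.
print(create_bucket_params("ack-data-bucket-us-east-1", "us-east-1"))
print(create_bucket_params("ack-data-bucket-eu-west-1", "eu-west-1"))
```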

Diff for: samples/batch_transform/README.md

+1 -1

@@ -4,7 +4,7 @@ This sample demonstrates how to start batch transform jobs using your own batch-
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md).
 
 You will also need a model in SageMaker for this sample. If you do not have one you must first create a [model](/samples/model/README.md).
Diff for: samples/endpoint/README.md

+1 -1

@@ -4,7 +4,7 @@ This sample demonstrates how to create Endpoints using your own Endpoint_base/co
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md).
 
 You will also need a model in SageMaker for this sample. If you do not have one you must first create a [model](/samples/model/README.md)

Diff for: samples/hyperparameter_tuning/README.md

+9 -1

@@ -4,7 +4,14 @@ This sample demonstrates how to start hyperparameter jobs using your own hyperpa
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md).
+
+### Upload S3 Data
+
+You will need training data uploaded to an S3 bucket. Make sure you have AWS credentials and have the bucket in the same region where you plan to create SageMaker resources. Run the following python script to upload sample data to your S3 bucket.
+```
+python3 ../training/s3_sample_data.py $S3_BUCKET_NAME
+```
 
 ### Get an Image
 
@@ -20,6 +27,7 @@ A container image URL and tag looks has the following structure:
 In the `my-hyperparameter-job.yaml` file, modify the placeholder values with those associated with your account and hyperparameter job.
 
 ### Enabling Spot Training
+
 In the `my-hyperparameter-job.yaml` file under `spec.trainingJobDefinition` add `enableManagedSpotTraining` and set the value to true. You will also need to specify a `spec.trainingJobDefinition.stoppingCondition.maxRuntimeInSeconds` and `spec.trainingJobDefinition.stoppingCondition.maxWaittimeInSeconds`
 
 ## Submitting your Hyperparameter Job
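The upload step above writes the sample data under a fixed prefix, and the job spec's train and validation channels point at the resulting locations. A small sketch of those URIs (the helper name `channel_uris` is illustrative; the `sagemaker/xgboost` prefix comes from s3_sample_data.py):

```python
PREFIX = "sagemaker/xgboost"  # prefix used by s3_sample_data.py


def channel_uris(bucket: str) -> dict:
    """S3 URIs the job spec's s3URI fields should reference."""
    return {
        "train": f"s3://{bucket}/{PREFIX}/train",
        "validation": f"s3://{bucket}/{PREFIX}/validation",
    }


print(channel_uris("ack-data-bucket-us-east-1")["train"])
# s3://ack-data-bucket-us-east-1/sagemaker/xgboost/train
```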

Diff for: samples/hyperparameter_tuning/my-hyperparameter-job.yaml

+6 -4

@@ -63,9 +63,9 @@ spec:
         s3DataSource:
           s3DataType: S3Prefix
           # The source of the training data
-          s3URI: s3://<YOUR BUCKET/PATH>
+          s3URI: s3://<YOUR BUCKET>/sagemaker/xgboost/train
           s3DataDistributionType: FullyReplicated
-          contentType: text/csv
+          contentType: text/libsvm
           compressionType: None
           recordWrapperType: None
           inputMode: File
@@ -74,9 +74,9 @@ spec:
         s3DataSource:
           s3DataType: S3Prefix
           # The source of the validation data
-          s3URI: s3://<YOUR BUCKET/PATH>
+          s3URI: s3://<YOUR BUCKET>/sagemaker/xgboost/validation
           s3DataDistributionType: FullyReplicated
-          contentType: text/csv
+          contentType: text/libsvm
           compressionType: None
           recordWrapperType: None
           inputMode: File
@@ -87,5 +87,7 @@ spec:
       instanceType: ml.m4.xlarge
       instanceCount: 1
       volumeSizeInGB: 25
+    stoppingCondition:
+      maxRuntimeInSeconds: 3600
     enableNetworkIsolation: true
     enableInterContainerTrafficEncryption: false

Diff for: samples/job_definitions/data_quality/README.md

+1 -1

@@ -4,7 +4,7 @@ This sample demonstrates how to start data quality job definitions using your ow
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md).
 
 You will need an [Endpoint](/samples/endpoint/README.md) configured in SageMaker and you will need to run a baselining job to generate baseline statistics and constraints.

Diff for: samples/job_definitions/model_bias/README.md

+1 -1

@@ -4,7 +4,7 @@ This sample demonstrates how to start model-bias job definitions using your own
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md).
 
 You will also need an [Endpoint](/samples/endpoint/README.md) configured in SageMaker and you will need to run a baselining job to generate baseline constraints only.

Diff for: samples/job_definitions/model_explainability/README.md

+1 -1

@@ -4,7 +4,7 @@ This sample demonstrates how to start model-explainability job definitions using
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md).
 
 You will also need an [Endpoint](/samples/endpoint/README.md) configured in SageMaker and you will need to run a baselining job to generate baseline constraints only.

Diff for: samples/job_definitions/model_quality/README.md

+1 -1

@@ -4,7 +4,7 @@ This sample demonstrates how to start model-quality job definitions using your o
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md).
 
 You will also need an [Endpoint](/samples/endpoint/README.md) configured in SageMaker and you will need to run a baselining job to generate baseline constraints only.

Diff for: samples/model/README.md

+1 -1

@@ -4,7 +4,7 @@ This sample demonstrates how to start models/create a model using your own model
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md).
 
 ### Get an Image

Diff for: samples/monitoring_schedule/README.md

+1 -1

@@ -4,7 +4,7 @@ This sample demonstrates how to start Monitoring scheduling using your own Monit
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md). It also assumes you have a job definition of one of these [types](/samples/job_definitions).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md). It also assumes you have a job definition of one of these [types](/samples/job_definitions).
 
 ### Updating the Monitoring Specification

Diff for: samples/processing/README.md

+1 -1

@@ -4,7 +4,7 @@ This sample demonstrates how to start processing jobs using your own processing
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md).
 
 You will need to upload [kmeans_preprocessing.py](/samples/processing/kmeans_preprocessing.py) to an S3 bucket and update the s3Input s3URI path.

Diff for: samples/training/README.md

+8 -1

@@ -4,7 +4,14 @@ This sample demonstrates how to start training jobs using your own training scri
 
 ## Prerequisites
 
-This sample assumes that you have completed the [common prerequisties](/samples/README.md).
+This sample assumes that you have completed the [common prerequisites](/samples/README.md).
+
+### Upload S3 Data
+
+You will need training data uploaded to an S3 bucket. Make sure you have AWS credentials and have the bucket in the same region where you plan to create SageMaker resources. Run the following python script to upload sample data to your S3 bucket.
+```
+python3 s3_sample_data.py $S3_BUCKET_NAME
+```
 
 ### Get an Image

Diff for: samples/training/my-training-job.yaml

+4 -4

@@ -38,16 +38,16 @@ spec:
       s3DataSource:
         s3DataType: S3Prefix
         # The input path of our train data
-        s3URI: s3://<YOUR BUCKET/PATH>
+        s3URI: s3://<YOUR BUCKET>/sagemaker/xgboost/train
         s3DataDistributionType: FullyReplicated
-        contentType: text/csv
+        contentType: text/libsvm
         compressionType: None
   - channelName: validation
     dataSource:
      s3DataSource:
        s3DataType: S3Prefix
        # The input path of our validation data
-        s3URI: s3://<YOUR BUCKET/PATH>
+        s3URI: s3://<YOUR BUCKET>/sagemaker/xgboost/validation
        s3DataDistributionType: FullyReplicated
-        contentType: text/csv
+        contentType: text/libsvm
        compressionType: None
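The contentType changes from text/csv to text/libsvm because the sample MNIST data is uploaded in LibSVM format: each line is a label followed by sparse `index:value` feature pairs. A minimal sketch of a parser for that format (illustrative only, not part of the samples):

```python
def parse_libsvm_line(line: str):
    """Parse one LibSVM-format line: '<label> <idx>:<val> <idx>:<val> ...'.

    Returns the label as a float and the sparse features as an
    index -> value dict.
    """
    label, *pairs = line.split()
    features = {int(i): float(v) for i, v in (p.split(":") for p in pairs)}
    return float(label), features


label, feats = parse_libsvm_line("1 3:0.5 7:1.0")
# label == 1.0, feats == {3: 0.5, 7: 1.0}
```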

Diff for: samples/training/s3_sample_data.py

+26

@@ -0,0 +1,26 @@
+import urllib.request
+from urllib.parse import urlparse
+import sys
+
+# Gets your bucket name from command line
+try:
+    bucket = str(sys.argv[1])
+except Exception as error:
+    print("Please pass your bucket name as a commandline argument")
+    sys.exit(1)
+
+# Download dataset from pinned commit
+url = "https://github.com/aws/amazon-sagemaker-examples/raw/af6667bd0be3c9cdec23fecda7f0be6d0e3fa3ea/sagemaker-debugger/xgboost_realtime_analysis/data_utils.py"
+urllib.request.urlretrieve(url, "data_utils.py")
+
+from data_utils import load_mnist, upload_to_s3
+
+prefix = "sagemaker/xgboost"
+train_file, validation_file = load_mnist()
+upload_to_s3(train_file, bucket, f"{prefix}/train/mnist.train.libsvm")
+upload_to_s3(validation_file, bucket, f"{prefix}/validation/mnist.validation.libsvm")
+
+# Remove downloaded file
+import os
+
+os.remove("data_utils.py")
