Feature group sample #46
Merged
ack-bot
merged 3 commits into
aws-controllers-k8s:main
from
BriannaRoskind:feature_group_sample
Jul 17, 2021
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
# Feature Group Sample | ||
|
||
This sample demonstrates how to create a feature group using the AWS Controllers for Kubernetes (ACK) service controller for Amazon SageMaker. | ||
|
||
Inspiration for this sample was taken from the notebook on [Fraud Detection with Amazon SageMaker FeatureStore](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-featurestore/sagemaker_featurestore_fraud_detection_python_sdk.html). | ||
|
||
## Prerequisites | ||
|
||
This sample assumes that you have completed the [common prerequisites](https://github.com/aws-controllers-k8s/sagemaker-controller/blob/main/samples/README.md). | ||
|
||
### Create an S3 bucket: | ||
|
||
This example uses the offline store, so you need an S3 bucket. [Follow these directions](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) to create one through the S3 console, an AWS SDK, or the AWS CLI. | ||
|
||
### Updating the Feature Group Specification: | ||
|
||
In the `my-feature-group.yaml` file, replace the placeholder values with those associated with your account and feature group. | ||
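If you prefer to script this step, the following is a minimal sketch that substitutes the three placeholders found in `my-feature-group.yaml`. The helper name `fill_placeholders` and the example values are hypothetical and not part of the sample; plain string replacement is used here only because the placeholders are unique tokens.

```python
# Placeholder tokens exactly as they appear in my-feature-group.yaml.
PLACEHOLDERS = (
    "<YOUR FEATURE GROUP NAME>",
    "<YOUR BUCKET>",
    "<YOUR SAGEMAKER ROLE ARN>",
)


def fill_placeholders(template: str, values: dict) -> str:
    """Return the template text with each placeholder replaced by its value.

    Hypothetical helper: raises ValueError if the template contains a
    placeholder for which no value was supplied.
    """
    missing = [p for p in PLACEHOLDERS if p in template and p not in values]
    if missing:
        raise ValueError("No value supplied for: " + ", ".join(missing))
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template
```

A possible usage is to read the file, call `fill_placeholders` with your own name, bucket, and role ARN, and write the result back before running `kubectl apply`.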
|
||
## Creating your Feature Group | ||
|
||
|
||
### Create a Feature Group: | ||
|
||
To submit your prepared feature group specification, apply it to your Kubernetes cluster as follows: | ||
|
||
``` | ||
$ kubectl apply -f my-feature-group.yaml | ||
featuregroup.sagemaker.services.k8s.aws/my-feature-group created | ||
``` | ||
|
||
### List Feature Groups: | ||
|
||
To list all feature groups created using the ACK controller, use the following command: | ||
|
||
``` | ||
$ kubectl get featuregroup | ||
``` | ||
|
||
### Describe a Feature Group: | ||
|
||
To get more details about the feature group once it's submitted, such as its status, errors, or parameters, use the following command: | ||
|
||
``` | ||
$ kubectl describe featuregroup my-feature-group | ||
``` | ||
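For scripted checks, the same information can be read from `kubectl get featuregroup my-feature-group -o json`. The sketch below summarizes such a resource once it has been parsed into a dictionary. It assumes the controller reports `featureGroupStatus` and `failureReason` under `status`, alongside a `conditions` list; those field names are assumptions, so verify them against the output of your own cluster. The function name is hypothetical.

```python
def summarize_feature_group(resource: dict) -> dict:
    """Pick out the name, status fields, and conditions of a FeatureGroup.

    `resource` is the parsed JSON of a single FeatureGroup object.
    The status field names used here are assumptions (see the text above);
    any field that is absent is reported as None.
    """
    status = resource.get("status", {})
    # Map each condition type to its status string, e.g. "True" or "False".
    conditions = {
        c.get("type"): c.get("status") for c in status.get("conditions", [])
    }
    return {
        "name": resource.get("metadata", {}).get("name"),
        "featureGroupStatus": status.get("featureGroupStatus"),
        "failureReason": status.get("failureReason"),
        "conditions": conditions,
    }
```

One way to use it is to pipe the `kubectl` output into a script that calls `json.load(sys.stdin)` and passes the result to `summarize_feature_group`.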
|
||
## Ingesting Data into your Feature Group | ||
|
||
|
||
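Note that the controller does **not** support data ingestion, so this sample includes a separate script, `data_ingestion.py`, that writes records through the SageMaker Feature Store runtime client. | ||
To ingest data from the `my-sample-data.csv` file into your feature group, use the following command: | ||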
|
||
``` | ||
|
||
$ python3 data_ingestion.py -i my-sample-data.csv -fg my-feature-group | ||
``` | ||
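The ingestion script turns each CSV row into the list of `FeatureName`/`ValueAsString` pairs that it passes to `put_record`. The following sketch isolates that conversion so it can be inspected without AWS credentials; the function names `row_to_record` and `records_from_csv` are hypothetical and do not appear in the sample script.

```python
import csv
import io


def row_to_record(row: dict) -> list:
    """Convert one CSV row into FeatureName/ValueAsString pairs."""
    return [
        {"FeatureName": name, "ValueAsString": value}
        for name, value in row.items()
    ]


def records_from_csv(file_handle):
    """Yield one record per data row of a CSV file with a header line."""
    for row in csv.DictReader(file_handle, skipinitialspace=True):
        yield row_to_record(row)


# In-memory stand-in for the first data row of my-sample-data.csv.
sample = io.StringIO("TransactionID,EventTime\n1,1623434915\n")
records = list(records_from_csv(sample))
```

Every value stays a string, because the record format carries values as text regardless of the feature type declared in the feature group.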
|
||
## Deleting your Feature Group | ||
|
||
To delete the feature group, use the following command: | ||
|
||
``` | ||
$ kubectl delete featuregroup my-feature-group | ||
featuregroup.sagemaker.services.k8s.aws "my-feature-group" deleted | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
#!/usr/bin/env python3 | ||
|
||
import argparse | ||
import boto3 | ||
import csv | ||
|
||
sagemaker_featurestore_runtime_client = boto3.Session().client( | ||
    service_name="sagemaker-featurestore-runtime") | ||
|
||
# Initialize the parser. | ||
parser = argparse.ArgumentParser() | ||
parser.add_argument("-i", "--input_file", required=True, help="Path to a csv file containing data for ingestion.") | ||
parser.add_argument("-fg", "--feature_group_name", required=True, help="Name of the feature group to write data to.") | ||
|
||
# Read arguments from the command line. | ||
args = parser.parse_args() | ||
|
||
# Write each row of the csv file to the feature group as one record. | ||
with open(args.input_file, newline="") as file_handle: | ||
    for row in csv.DictReader(file_handle, skipinitialspace=True): | ||
        # Build the list of FeatureName/ValueAsString pairs for this row. | ||
        record = [] | ||
        for featureName, valueAsString in row.items(): | ||
            record.append({ | ||
                'FeatureName': featureName, | ||
                'ValueAsString': valueAsString | ||
            }) | ||
        sagemaker_featurestore_runtime_client.put_record( | ||
            FeatureGroupName=args.feature_group_name, | ||
            Record=record) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
apiVersion: sagemaker.services.k8s.aws/v1alpha1 | ||
kind: FeatureGroup | ||
metadata: | ||
  name: <YOUR FEATURE GROUP NAME> | ||
spec: | ||
  eventTimeFeatureName: EventTime | ||
  featureDefinitions: | ||
    - featureName: TransactionID | ||
      featureType: Integral | ||
    - featureName: EventTime | ||
      featureType: Fractional | ||
  featureGroupName: <YOUR FEATURE GROUP NAME> | ||
  recordIdentifierFeatureName: TransactionID | ||
  offlineStoreConfig: | ||
    s3StorageConfig: | ||
      s3URI: s3://<YOUR BUCKET>/feature-group-data | ||
  onlineStoreConfig: | ||
    enableOnlineStore: True | ||
  roleARN: <YOUR SAGEMAKER ROLE ARN> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
TransactionID,EventTime | ||
1,1623434915 | ||
2,1623435267 | ||
3,1623435284 |
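Before ingesting, it can help to confirm that each CSV value parses as the feature type declared in the specification (`Integral` for `TransactionID`, `Fractional` for `EventTime`). The sketch below is a hypothetical pre-check, not part of the sample; the helper name `validate_rows` and the line-numbering convention (header on line 1, no multi-line fields) are assumptions.

```python
import csv
import io

# Feature types as declared in my-feature-group.yaml.
FEATURE_TYPES = {"TransactionID": "Integral", "EventTime": "Fractional"}

# Python parser used to check each feature type.
PARSERS = {"Integral": int, "Fractional": float, "String": str}


def validate_rows(file_handle, feature_types: dict) -> list:
    """Return a list of error messages; an empty list means every row parsed."""
    errors = []
    reader = csv.DictReader(file_handle, skipinitialspace=True)
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        for name, feature_type in feature_types.items():
            value = row.get(name)
            if value is None:
                errors.append(f"line {line_no}: missing value for {name}")
                continue
            try:
                PARSERS[feature_type](value)
            except ValueError:
                errors.append(
                    f"line {line_no}: {name}={value!r} is not {feature_type}"
                )
    return errors


# In-memory copy of the sample data shown above.
sample = io.StringIO(
    "TransactionID,EventTime\n1,1623434915\n2,1623435267\n3,1623435284\n"
)
sample_errors = validate_rows(sample, FEATURE_TYPES)
```

This only checks that values parse locally; the service applies its own validation when records are written.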
If this is taken from a notebook, can you please add a reference/link to it?
It'll be useful for us as well. We need to start doing this for all samples.