feat(event_source): Add support for S3 batch operations #3572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
leandrodamascena
merged 20 commits into
aws-powertools:develop
from
sbailliez:feat/s3-batch-operations
Jan 17, 2024
Commits
9d8e9a4 Add support for S3 Batch Operations event and response along with uni… (sbailliez)
319a048 Add documentation with example based on the AWS S3 documentation (sbailliez)
d834c96 Use unquote_plus and add unit test for key encoded with space (sbailliez)
0045c06 Merge branch 'aws-powertools:develop' into feat/s3-batch-operations (sbailliez)
7af9a73 Merge branch 'develop' into feat/s3-batch-operations (leandrodamascena)
f3955eb Merge branch 'develop' into feat/s3-batch-operations (leandrodamascena)
da5105a Merge branch 'develop' into feat/s3-batch-operations (leandrodamascena)
eee9536 Merge branch 'develop' into feat/s3-batch-operations (leandrodamascena)
688e746 Merge branch 'develop' into feat/s3-batch-operations (leandrodamascena)
a401507 Initial refactor (leandrodamascena)
bd29d0a Changing the DX to improve usability (leandrodamascena)
dfb4618 Documentation (leandrodamascena)
b778609 Adding parser (leandrodamascena)
fe5424f Small refactor (leandrodamascena)
b4996a3 Merge branch 'develop' into feat/s3-batch-operations (leandrodamascena)
c49adb7 Merge branch 'develop' into feat/s3-batch-operations (leandrodamascena)
deb270a Addressing Ruben's feedback - Docs and examples (leandrodamascena)
12e81d4 Addressing Ruben's feedback - Docs and examples (leandrodamascena)
57681c4 Addressing Ruben's feedback - Code (leandrodamascena)
a99594f Addressing Ruben's feedback - Code (leandrodamascena)
220 changes: 220 additions & 0 deletions
aws_lambda_powertools/utilities/data_classes/s3_batch_operation_event.py
@@ -0,0 +1,220 @@
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Optional, Tuple
from urllib.parse import unquote_plus

from typing_extensions import Literal

from aws_lambda_powertools.utilities.data_classes.common import DictWrapper


class S3BatchOperationJob(DictWrapper):
    @property
    def id(self) -> str:  # noqa: A003
        return self["id"]

    @property
    def user_arguments(self) -> Optional[Dict[str, str]]:
        """Get user arguments provided for this job (only for invocation schema 2.0)"""
        return self.get("userArguments")


class S3BatchOperationTask(DictWrapper):
    @property
    def task_id(self) -> str:
        """Get the task id"""
        return self["taskId"]

    @property
    def s3_key(self) -> str:
        """Get the object key, decoded with unquote_plus using strict UTF-8 encoding"""
        # Note: the AWS documentation example uses unquote, but this contradicts
        # what happens in practice. The key is URL-encoded with %20 in the inventory
        # file, but in the event it is sent with +, so use unquote_plus.
        return unquote_plus(self["s3Key"], encoding="utf-8", errors="strict")

    @property
    def s3_version_id(self) -> Optional[str]:
        """Object version if bucket is versioning-enabled, otherwise null"""
        return self.get("s3VersionId")

    @property
    def s3_bucket_arn(self) -> Optional[str]:
        """Get the s3 bucket arn (present only for invocationSchemaVersion '1.0')"""
        return self.get("s3BucketArn")

    @property
    def s3_bucket(self) -> str:
        """
        Get the s3 bucket, either from 's3Bucket' property (invocationSchemaVersion '2.0')
        or from 's3BucketArn' (invocationSchemaVersion '1.0')
        """
        if self.s3_bucket_arn:
            # An S3 bucket ARN has the form arn:aws:s3:::bucket-name
            return self.s3_bucket_arn.split(":::")[-1]
        return self["s3Bucket"]


class S3BatchOperationEvent(DictWrapper):
    """Amazon S3BatchOperation Event

    Documentation:
    --------------
    - https://docs.aws.amazon.com/AmazonS3/latest/userguide/batch-ops-invoke-lambda.html
    """

    @property
    def invocation_id(self) -> str:
        """Get the identifier of the invocation request"""
        return self["invocationId"]

    @property
    def invocation_schema_version(self) -> str:
        """
        Get the schema version for the payload that Batch Operations sends when invoking an
        AWS Lambda function. Either '1.0' or '2.0'.
        """
        return self["invocationSchemaVersion"]

    @property
    def tasks(self) -> Iterator[S3BatchOperationTask]:
        """Iterate over the s3 batch operation tasks in this event"""
        for task in self["tasks"]:
            yield S3BatchOperationTask(task)

    @property
    def task(self) -> S3BatchOperationTask:
        """Get the first s3 batch operation task"""
        return next(self.tasks)

    @property
    def job(self) -> S3BatchOperationJob:
        """Get the s3 batch operation job"""
        return S3BatchOperationJob(self["job"])


# List of valid result codes. Used in both S3BatchOperationResult and S3BatchOperationResponse
VALID_RESULT_CODE_TYPES: Tuple[str, str, str] = ("Succeeded", "TemporaryFailure", "PermanentFailure")


@dataclass(repr=False, order=False)
class S3BatchOperationResult:
    task_id: str
    result_code: Literal["Succeeded", "TemporaryFailure", "PermanentFailure"]
    result_string: Optional[str] = None

    def __post_init__(self):
        if self.result_code not in VALID_RESULT_CODE_TYPES:
            raise ValueError(f"Invalid result_code: {self.result_code}")

    def asdict(self) -> Dict[str, Any]:
        return {
            "taskId": self.task_id,
            "resultCode": self.result_code,
            "resultString": self.result_string,
        }

    @classmethod
    def as_succeeded(cls, task: S3BatchOperationTask, result_string: Optional[str] = None) -> "S3BatchOperationResult":
        """Create a `Succeeded` result for a given task"""
        return S3BatchOperationResult(task.task_id, "Succeeded", result_string)

    @classmethod
    def as_permanent_failure(
        cls,
        task: S3BatchOperationTask,
        result_string: Optional[str] = None,
    ) -> "S3BatchOperationResult":
        """Create a `PermanentFailure` result for a given task"""
        return S3BatchOperationResult(task.task_id, "PermanentFailure", result_string)

    @classmethod
    def as_temporary_failure(
        cls,
        task: S3BatchOperationTask,
        result_string: Optional[str] = None,
    ) -> "S3BatchOperationResult":
        """Create a `TemporaryFailure` result for a given task"""
        return S3BatchOperationResult(task.task_id, "TemporaryFailure", result_string)


@dataclass(repr=False, order=False)
class S3BatchOperationResponse:
    """S3 Batch Operations response object

    Documentation:
    --------------
    - https://docs.aws.amazon.com/lambda/latest/dg/services-s3-batch.html
    - https://docs.aws.amazon.com/AmazonS3/latest/userguide/batch-ops-invoke-lambda.html#batch-ops-invoke-lambda-custom-functions
    - https://docs.aws.amazon.com/AmazonS3/latest/API/API_control_LambdaInvokeOperation.html#AmazonS3-Type-control_LambdaInvokeOperation-InvocationSchemaVersion

    Parameters
    ----------
    invocation_schema_version : str
        Specifies the schema version for the payload that Batch Operations sends when invoking
        an AWS Lambda function, either '1.0' or '2.0'. This must be copied from the event.

    invocation_id : str
        The identifier of the invocation request. This must be copied from the event.

    treat_missing_keys_as : Literal["Succeeded", "TemporaryFailure", "PermanentFailure"]
        Undocumented parameter; defaults to "PermanentFailure".

    results : List[S3BatchOperationResult]
        Results of each S3 Batch Operations task.
        Optional at construction time; results can be added later with `add_result`.

    Examples
    --------

    **S3 Batch Operations**

    ```python
    from aws_lambda_powertools.utilities.typing import LambdaContext
    from aws_lambda_powertools.utilities.data_classes import (
        S3BatchOperationEvent,
        S3BatchOperationResponse,
        S3BatchOperationResult,
    )

    def lambda_handler(event: dict, context: LambdaContext):
        s3_event = S3BatchOperationEvent(event)
        response = S3BatchOperationResponse(s3_event.invocation_schema_version, s3_event.invocation_id)
        result = None

        task = s3_event.task
        try:
            do_work(task.s3_bucket, task.s3_key)
            result = S3BatchOperationResult.as_succeeded(task)
        except TimeoutError as e:
            result = S3BatchOperationResult.as_temporary_failure(task, str(e))
        except Exception as e:
            result = S3BatchOperationResult.as_permanent_failure(task, str(e))
        finally:
            response.add_result(result)

        return response.asdict()
    ```
    """

    invocation_schema_version: str
    invocation_id: str
    treat_missing_keys_as: Literal["Succeeded", "TemporaryFailure", "PermanentFailure"] = "PermanentFailure"
    results: List[S3BatchOperationResult] = field(default_factory=list)

    def __post_init__(self):
        if self.treat_missing_keys_as not in VALID_RESULT_CODE_TYPES:
            raise ValueError(f"Invalid treat_missing_keys_as: {self.treat_missing_keys_as}")

    def add_result(self, result: S3BatchOperationResult):
        self.results.append(result)

    def asdict(self) -> Dict:
        # Exactly one result is required per response
        if not self.results:
            raise ValueError("Response must have one result")
        if len(self.results) > 1:
            raise ValueError("Response cannot have more than one result")

        return {
            "invocationSchemaVersion": self.invocation_schema_version,
            "treatMissingKeysAs": self.treat_missing_keys_as,
            "invocationId": self.invocation_id,
            "results": [result.asdict() for result in self.results],
        }
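For readers who want to see the response shape without installing the library, the dictionary returned by `S3BatchOperationResponse.asdict()` above can be approximated with the standard library alone. This is a minimal sketch, not the Powertools implementation: `build_response` and `sample_event` are hypothetical names introduced for illustration, and the sketch assumes an event that carries a single task.

```python
from typing import Any, Dict, Optional

# Result codes accepted by the classes in the diff above.
VALID_RESULT_CODES = ("Succeeded", "TemporaryFailure", "PermanentFailure")


def build_response(
    event: Dict[str, Any],
    result_code: str,
    result_string: Optional[str] = None,
    treat_missing_keys_as: str = "PermanentFailure",
) -> Dict[str, Any]:
    """Hypothetical helper: build the response dict for a single-task event."""
    if result_code not in VALID_RESULT_CODES:
        raise ValueError(f"Invalid result_code: {result_code}")
    if treat_missing_keys_as not in VALID_RESULT_CODES:
        raise ValueError(f"Invalid treat_missing_keys_as: {treat_missing_keys_as}")

    task = event["tasks"][0]  # assumption: one task per invocation
    return {
        # Both values are copied from the incoming event
        "invocationSchemaVersion": event["invocationSchemaVersion"],
        "treatMissingKeysAs": treat_missing_keys_as,
        "invocationId": event["invocationId"],
        "results": [
            {
                "taskId": task["taskId"],
                "resultCode": result_code,
                "resultString": result_string,
            },
        ],
    }


# Illustrative event with made-up identifiers
sample_event = {
    "invocationSchemaVersion": "2.0",
    "invocationId": "example-invocation-id",
    "job": {"id": "example-job-id"},
    "tasks": [{"taskId": "example-task-id", "s3Key": "a/b.txt", "s3Bucket": "example-bucket"}],
}

response = build_response(sample_event, "Succeeded", "processed")
```

The validation mirrors the `__post_init__` checks in the diff, so an unknown result code fails before any payload is built.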
@@ -0,0 +1,15 @@
{
  "invocationSchemaVersion": "1.0",
  "invocationId": "YXNkbGZqYWRmaiBhc2RmdW9hZHNmZGpmaGFzbGtkaGZza2RmaAo",
  "job": {
    "id": "f3cc4f60-61f6-4a2b-8a21-d07600c373ce"
  },
  "tasks": [
    {
      "taskId": "dGFza2lkZ29lc2hlcmUK",
      "s3Key": "prefix/dataset/dataset.20231222.json.gz",
      "s3VersionId": "1",
      "s3BucketArn": "arn:aws:s3:::powertools-dataset"
    }
  ]
}
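As a rough illustration of how a schema 1.0 task like the one above is read, the sketch below derives the bucket name from `s3BucketArn` and decodes `s3Key` with `unquote_plus`, following the same approach as the `s3_bucket` and `s3_key` properties in the diff. `describe_task` is a hypothetical helper, and the second task (with `+` and `%2B` in the key) is an invented input used only to show the decoding.

```python
from typing import Dict, Tuple
from urllib.parse import unquote_plus


def describe_task(task: Dict[str, str]) -> Tuple[str, str]:
    """Hypothetical helper: return (bucket, decoded key) for a schema 1.0 task."""
    # An S3 bucket ARN looks like arn:aws:s3:::bucket-name
    bucket = task["s3BucketArn"].split(":::")[-1]
    # '+' is decoded to a space and %XX escapes are decoded as UTF-8
    key = unquote_plus(task["s3Key"], encoding="utf-8", errors="strict")
    return bucket, key


# Task taken from the schema 1.0 payload above
fixture_task = {
    "taskId": "dGFza2lkZ29lc2hlcmUK",
    "s3Key": "prefix/dataset/dataset.20231222.json.gz",
    "s3VersionId": "1",
    "s3BucketArn": "arn:aws:s3:::powertools-dataset",
}
bucket, key = describe_task(fixture_task)

# Invented task whose key contains an encoded space ('+') and a literal plus ('%2B')
encoded_task = {
    "s3BucketArn": "arn:aws:s3:::powertools-dataset",
    "s3Key": "my+folder/report%2B2023.csv",
}
_, decoded_key = describe_task(encoded_task)
```

With plain `unquote`, the `+` in the invented key would be left untouched, which is why the diff prefers `unquote_plus`.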
@@ -0,0 +1,19 @@
{
  "invocationSchemaVersion": "2.0",
  "invocationId": "YXNkbGZqYWRmaiBhc2RmdW9hZHNmZGpmaGFzbGtkaGZza2RmaAo",
  "job": {
    "id": "f3cc4f60-61f6-4a2b-8a21-d07600c373ce",
    "userArguments": {
      "k1": "v1",
      "k2": "v2"
    }
  },
  "tasks": [
    {
      "taskId": "dGFza2lkZ29lc2hlcmUK",
      "s3Key": "prefix/dataset/dataset.20231222.json.gz",
      "s3VersionId": null,
      "s3Bucket": "powertools-dataset"
    }
  ]
}
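The schema 2.0 payload differs from 1.0 in three visible ways: the task carries `s3Bucket` instead of `s3BucketArn`, the job may carry `userArguments`, and `s3VersionId` can be null. A minimal standard-library sketch of code that tolerates both versions follows; `resolve_bucket` and `get_user_arguments` are hypothetical helpers, and returning an empty dict when `userArguments` is absent is a choice made for this sketch (the `user_arguments` property in the diff returns `None` instead).

```python
import json
from typing import Any, Dict

# Trimmed copy of the schema 2.0 payload above
EVENT_V2 = json.loads(
    """
    {
      "invocationSchemaVersion": "2.0",
      "job": {
        "id": "f3cc4f60-61f6-4a2b-8a21-d07600c373ce",
        "userArguments": {"k1": "v1", "k2": "v2"}
      },
      "tasks": [
        {
          "taskId": "dGFza2lkZ29lc2hlcmUK",
          "s3Key": "prefix/dataset/dataset.20231222.json.gz",
          "s3VersionId": null,
          "s3Bucket": "powertools-dataset"
        }
      ]
    }
    """
)


def resolve_bucket(task: Dict[str, Any]) -> str:
    """Hypothetical helper: bucket name for either schema version."""
    arn = task.get("s3BucketArn")  # present only in schema 1.0
    if arn:
        return arn.split(":::")[-1]
    return task["s3Bucket"]  # schema 2.0


def get_user_arguments(event: Dict[str, Any]) -> Dict[str, str]:
    """Hypothetical helper: user arguments, or an empty dict if none were sent."""
    return event["job"].get("userArguments") or {}


task_v2 = EVENT_V2["tasks"][0]
bucket_v2 = resolve_bucket(task_v2)
bucket_v1 = resolve_bucket({"s3BucketArn": "arn:aws:s3:::powertools-dataset"})
arguments = get_user_arguments(EVENT_V2)
version_id = task_v2.get("s3VersionId")  # JSON null is parsed as None
```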