Feature request: Add support to mask/encrypt/decrypt Pydantic models, Dataclasses, and standard Python classes in the DataMasking utility #3473
Comments
We've added this to our backlog, and we intend to work on this early next year.
Hey @leandrodamascena, whenever you have time this week, could you please leave a comment with some more details about what needs to be done / implemented as part of this issue? This will help potential contributors to orient themselves.
Hey everyone! Our current implementation supports dicts and lists, as they can be directly converted. However, we recognize that customers may want to use more complex data structures such as Pydantic models, dataclasses, or custom Python classes. To accommodate these cases, we need to implement a pre-processing step that prepares the data before it is submitted for erase, encrypt, or decrypt.

This solution addresses the input data challenge, but it's important to note that we cannot guarantee the data will remain a valid instance of the original Pydantic model after processing. The encryption/erase process may alter the data type in ways that break the model's validation rules. For example:

    from __future__ import annotations

    from aws_lambda_powertools.utilities.data_masking import DataMasking
    from pydantic import BaseModel


    class MyModel(BaseModel):
        name: str
        age: int


    data_masker = DataMasking()
    data = MyModel(name="powertools", age=5)
    erased = data_masker.erase(data, fields=["age"])
    print(erased)
    # output: {'name': 'powertools', 'age': '*****'}

Note that, for now, we can implement this support only for the input data, not for the output. To do this, I suggest creating a new function called prepare_data:

    from typing import Any


    def prepare_data(data: Any) -> Any:
        # Convert from dataclasses
        if hasattr(data, "__dataclass_fields__"):
            import dataclasses

            return dataclasses.asdict(data)

        # Convert from Pydantic model
        if callable(getattr(data, "model_dump", None)):
            return data.model_dump()

        # Convert from event source data class
        if callable(getattr(data, "dict", None)):
            return data.dict()

        return data

After that, call this method first, before the data is submitted for erase, encrypt, or decrypt. We also need to add more tests here https://github.com/aws-powertools/powertools-lambda-python/tree/develop/tests/functional/data_masking to make sure it works as expected.
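The conversion step proposed above can be sketched with the standard library alone. This is a minimal illustration, not the implementation that was merged: `Customer`, `FakeModel`, and `Plain` are hypothetical stand-ins, and the `vars()` fallback for plain class instances is an assumption added here because the issue title mentions standard Python classes while the snippet above returns them unchanged.

```python
import dataclasses
from typing import Any

# Built-in values that the masking step can already handle as-is.
_PASSTHROUGH = (dict, list, tuple, set, str, bytes, int, float, bool, type(None))


def prepare_data(data: Any) -> Any:
    """Convert supported objects to plain dicts; return anything else unchanged."""
    # Classes themselves (as opposed to instances) are never converted.
    if isinstance(data, type) or isinstance(data, _PASSTHROUGH):
        return data

    # Dataclass instances: asdict() also converts nested dataclasses.
    if dataclasses.is_dataclass(data):
        return dataclasses.asdict(data)

    # Duck-typed check for objects exposing model_dump(), e.g. Pydantic v2 models.
    model_dump = getattr(data, "model_dump", None)
    if callable(model_dump):
        return model_dump()

    # Objects exposing dict(), as in the snippet above.
    to_dict = getattr(data, "dict", None)
    if callable(to_dict):
        return to_dict()

    # ASSUMPTION (not in the thread): plain class instances fall back to a
    # shallow copy of their instance attributes.
    if hasattr(data, "__dict__"):
        return dict(vars(data))

    return data


@dataclasses.dataclass
class Customer:  # hypothetical example type
    name: str
    age: int


class FakeModel:
    """Hypothetical stand-in that only mimics a model_dump() method."""

    def __init__(self, name: str, age: int) -> None:
        self.name = name
        self.age = age

    def model_dump(self) -> dict:
        return {"name": self.name, "age": self.age}


class Plain:  # hypothetical plain Python class
    def __init__(self, name: str, age: int) -> None:
        self.name = name
        self.age = age


print(prepare_data(Customer("powertools", 5)))   # {'name': 'powertools', 'age': 5}
print(prepare_data(FakeModel("powertools", 5)))  # {'name': 'powertools', 'age': 5}
print(prepare_data(Plain("powertools", 5)))      # {'name': 'powertools', 'age': 5}
```

Checking built-in containers first keeps `dict` subclasses from being mistaken for plain objects; whether that ordering suits the real utility would need to be confirmed by the functional tests mentioned above.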
@leandrodamascena, I see this is a pretty old issue. I can pick this up.
Sure, go ahead @VatsalGoel3.
…and standard classes (aws-powertools#3473)
… standard classes (#6413)
* feat(data-masking): support masking of Pydantic models, dataclasses, and standard classes (#3473)
* feat(data_masking): support complex input types via robust prepare_data() with and updated tests
* docs(data-masking): add support docs for Pydantic, dataclasses, and custom classes and updated test code
* docs(data-masking): update examples to use Lambda function entry points for supported input types and updated codebase
* refactoring prepare_data method
---------
Co-authored-by: Leandro Damascena <[email protected]>
Use case
Currently, the DataMasking utility only supports operations on traversable Python types such as list, dict, and str. This is a limitation for customers who want to integrate the DataMasking utility with their existing Pydantic models, dataclasses, or standard Python classes.
Solution/User Experience
Add support for masking, encrypting, and decrypting Pydantic models, dataclasses, and standard Python classes.
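As the comment thread notes, masked output may no longer satisfy the original type. The following stdlib-only sketch shows that effect with a dataclass. `Customer` and `erase_fields` are hypothetical names used for illustration; `erase_fields` is a simplified stand-in and not the DataMasking API.

```python
import dataclasses
from typing import Any


@dataclasses.dataclass
class Customer:  # hypothetical example type
    name: str
    age: int


def erase_fields(data: dict[str, Any], fields: list[str]) -> dict[str, Any]:
    """Hypothetical stand-in for an erase step: replace listed keys with '*****'."""
    return {key: ("*****" if key in fields else value) for key, value in data.items()}


original = Customer(name="powertools", age=5)
masked = erase_fields(dataclasses.asdict(original), fields=["age"])

# The result is a plain dict, and 'age' is now a str rather than an int.
print(type(masked).__name__)         # dict
print(type(masked["age"]).__name__)  # str

# Rebuilding the dataclass succeeds because dataclasses do not validate types,
# but the instance no longer matches its own annotations.
rebuilt = Customer(**masked)
print(isinstance(rebuilt.age, int))  # False
```

A validating model would typically reject the masked value for an integer field, which is why the thread scopes the support to input conversion only.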
Alternative solutions
No response
Acknowledgment