Structuring a dict containing class instances #634

Open
AlbertoRFer opened this issue Mar 18, 2025 · 4 comments


@AlbertoRFer

I am dealing with some unstructured data that is supposed to be converted to some object instances of classes A and B.

@attrs.define
class A:
    id: str


@attrs.define
class B:
    member: A

This is an example of how the unstructured data might look:

data_A = [{"id": "A_1"}, {"id": "A_2"}]
data_B = [{"member": "A_1"}, {"member": "A_1"}]

In this example the desired output is a list of instances of B where both instances hold a reference to the same instance of A. I will use cattrs to structure instances of A and B. To do that I write the following:

a_map = {element["id"]: cattrs.structure(element, A) for element in data_A}

b_list = []
for element in data_B:
    member_id = element["member"]
    member_instance = a_map[member_id]
    new_element = {"member": member_instance}
    b_instance = cattrs.structure(new_element, B)
    b_list.append(b_instance)

Now this does not work because cattrs will try to structure the A instances in the new_element dictionaries. I can write a custom hook to solve this and use the custom converter to structure both A and B.

def structure_instances(val, type):
    if isinstance(val, type):
        return val

    return cattrs.structure(val, type)


new_converter = cattrs.Converter()
new_converter.register_structure_hook(A, structure_instances)

This works but has some downsides:

  • If I always expect an instance of A in the unstructured data for B, I cannot enforce that check, because a plain dictionary will also be accepted, so I lose some validation there.
  • If I already have a custom converter with other custom hooks (I do), then using the default cattrs.structure is not possible.

Some workarounds that I have thought of:

  • Creating an individual custom converter for each class. This solves both downsides, but then I need to register the custom hooks in every custom converter. Each additional class with the same issue means creating yet another custom converter.
  • Registering the custom hook just before structuring B and re-registering the default one afterwards. This also solves both issues with less code duplication and only one custom converter, but I am not sure it is the best way of doing it.

For completeness' sake, this is my code using the second approach:

import attrs
import cattrs


@attrs.define
class A:
    id: str


@attrs.define
class B:
    member: A


data_A = [{"id": "A_1"}, {"id": "A_2"}]
data_B = [{"member": "A_1"}, {"member": "A_2"}]


def structure_instances(val, type):
    if not isinstance(val, type):
        raise ValueError(f"{val} is not a {type}")

    return val


new_converter = cattrs.Converter()
# Register other custom hooks

a_map = {element["id"]: new_converter.structure(element, A) for element in data_A}

b_list = []
new_converter.register_structure_hook(A, structure_instances)
for element in data_B:
    member_id = element["member"]
    member_instance = a_map[member_id]
    new_element = {"member": member_instance}
    b_instance = new_converter.structure(new_element, B)
    b_list.append(b_instance)
new_converter.register_structure_hook(A, cattrs.structure)

Is there a better way of solving this problem?

@Tinche
Member

Tinche commented Mar 23, 2025

Definitely an interesting problem. First question: is your data definition recursive (in particular, for B)? In other words, when you structure B, does the hook for B get called more than once for each top-level instance of B?

@AlbertoRFer
Author

I am not sure that I understand the question properly, but I guess the answer is no. B's attributes are all standard types (int, str, list) except for the attribute/s containing references to A.

@Tinche
Member

Tinche commented Mar 25, 2025

I asked because there's a cute trick you can use if you don't want to change an existing converter: you can generate a hook but not register it. You can just call it directly. This doesn't work if your data is recursive and your hook expects to be able to call itself through the converter.

Another solution might be copying an existing converter using converter.copy() and then modifying it.

Playing around with your problem, here's how I would possibly solve it:

from threading import local

import attrs

import cattrs


@attrs.define
class A:
    id: str


@attrs.define
class B:
    member: A


data_A = [{"id": "A_1"}, {"id": "A_2"}]
data_B = [{"member": "A_1"}, {"member": "A_2"}]

converter = cattrs.Converter()

registry_of_a = local()

hook_for_b = cattrs.gen.make_dict_structure_fn(
    B,
    converter,
    member=cattrs.override(struct_hook=lambda a_id, _: registry_of_a.a_dict[a_id]),
)


def structure_data():
    a_instances = converter.structure(data_A, list[A])
    a_dict = {a.id: a for a in a_instances}

    registry_of_a.a_dict = a_dict
    try:
        b_instances = [hook_for_b(val) for val in data_B]

        print(b_instances)
    finally:
        del registry_of_a.a_dict


structure_data()

We use a thread local variable to keep our registry of As during the structuring process. If you don't need it to be threadsafe you can just use a module-level variable.

We generate a custom hook for B, overriding only the structure handler for member. This handler uses our thread local to look up instances of A.

Then we structure the entire thing in structure_data.

What do you think?

@AlbertoRFer
Author

AlbertoRFer commented Mar 31, 2025

If I understand correctly, with this hook we are overriding how the member attribute of B will be structured. If that is the case, then why call it as a standalone function instead of registering it with the converter? By only changing the way the member attribute is structured I can achieve what I want without ever changing how instances of A are structured, right?

It seems to me that using the cattrs.gen.make_dict_structure_fn() function is exactly what I was looking for. A way to override how an attribute is structured but still use my custom converter that has other hooks registered. All without affecting how other classes are structured. Did I understand correctly?

In case I don't want to use a global variable to store the a_dict, what is the normal approach to generate the hook? I am thinking of just making a closure for this and re-registering the hook whenever a_dict changes. I am not sure if there is a better way.

Thanks in advance.
