-
-
Notifications
You must be signed in to change notification settings - Fork 118
Better Union resolution #423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Yes, managed to circumvent this with factories, but I still think that should be configurable out of the box, especially when the documentation regarding hooks and factories is not very clear, and the solution was ultimately found by trial and error. If anyone wonders what I did: from typing import Type, Union, get_origin, get_args
from cattrs import structure, global_converter
def is_union(type_: Type) -> bool:
return get_origin(type_) is Union
def union_structure_hook_factory(_):
def union_hook(data, type_):
args = get_args(type_)
if data is None: # we don't try to coerce None into anything
return None
for t in args:
try:
return structure(data, t)
except Exception:
continue
raise ValueError(f"Could not cast {data} to {type_}")
return union_hook
global_converter.register_structure_hook_factory(is_union, union_structure_hook_factory) After that the Union resolution works (mostly) like a charm. |
Howdy! Thanks for your interest in cattrs. The good news is in the next release at least the first example will work out of the box if you use a preconfigured converter (like from I'm vaguely aware of the strategy you mention; I think Pydantic used to use it (and maybe they still do, I don't know). I really dislike it for a number of reasons though. Firstly, it makes the ordering inside of a union matter. In the Python type system, in general, Secondly, it's inefficient. The logic here will be O(n) where n is the size of the union, and if your value is at the end of the union, the structuring logic will hit the sad path for all other members of the union. The sad path is generally slow since it raises exceptions. Thirdly, cattrs generally deals in coercion, not validation. (I think other libraries like Pydantic v2 are following in this nowadays, but again, not sure.) There are good reasons for this which I won't go into here. But this means if you do And lastly, I think it just leads to loss of information. For example, if your type is For these reasons I feel this approach is an attractive nuisance and I would be reluctant to have cattrs do this by default. That said, the point of cattrs is to be flexible. If you're willing to contribute a strategy that enables this I'd be glad to provide review and feedback. (A strategy is a high-level customization that you can apply to any converter to enable some behavior.) I think you have a solid start in your second comment. |
Thank you for detailed answer! I can totally see your reasoning for avoiding this behavior as default. However, I still think it should be implemented in some form or other, even with this points in mind. Speaking of the equivalency of the types, you are right that ordering will add some cognitive load, but it kind of makes sense. Let me explain. Here's an example just to illustrate what I mean: @define
class A:
date_field: Union[datetime, str]
def __attrs_post_init__(self): # or we could have an attrs field with a converter specified
if isinstance(self.date_field, str):
self.date_field = datetime.strptime(self.date_field, '%Y-%m-%d')
def apply_offset(self, offset: timedelta = timedelta(days=1)):
self.date_field += offset So for me that would mean: "This code mainly works with Regarding the efficiency of the solution, or lack of thereof, from a user experience stand point I'd probably have some default strategy can be inefficient or has some nuances to keep in mind, but at least it works, instead forcing you to bang your head into the wall and having to provide some custom boilerplate for exactly the same thing. As you said, cattrs is designed to be flexible, so if you don't like the defaults - you just override them with your custom hooks. As for coercion / loss of information - cannot really comment on those, but I do feel that if you have some sort of conversion in your code, you should be fine most of the time. And if you don't, why would you want to unite different types in the first place? Anyway, if you wish to see this sort of thing added to cattrs, I can probably contribute this as a "strategy". However, it's not that clear from documentation how these strategies are supposed to be formatted. There should be some rules for this, right? |
Description
Right now I'm looking to migrate from dataclass + dacite to attrs + cattrs, but there's a feature in dacite that I miss in cattrs: consequtive resolution of Unions. What dacite does is that it takes each type in the Union and tries to match it to the data. If nothing is matched - the error is raised. This behavior seemed pretty intuitive for me.
However, if cattrs meets even a really basic Union like
Union[str, int, float]
and data like'123'
it will pretend it doesn't know how to deal with this one, even if the resolution is really straightforward. Yes, you can probably write a hook for that, but manually describing each Union is really undesirable and error-prone. Maybe hook factory solves this, but I'm still figuring this stuff out, and my idea is that this should not be as complicated as it is right now.What I Did
The text was updated successfully, but these errors were encountered: