Add support for custom options #119
Separate comment to keep the first post clean. Towards point (3), I started working on a refactor of:

```python
@dataclass
class Message:
    """Representation of a protobuf message."""

    parent: Union[OutputFile, Message]
    proto_obj: DescriptorProto
    path: List[int]
    fields: List[Union[Field, Message]] = field(default_factory=list)

    @property  # really this should be functools.cached_property, but that's >=3.8 only
    def proto_name(self) -> str:
        return self.proto_obj.name

    @property
    def py_name(self) -> str:
        return pythonize_class_name(self.proto_name)

    def process_options(self, ??) -> ??:
        pass
```

Where the …
Hi Adrian!

**Options and extensibility**

Yes! It's a great idea, and it would be great to support extending betterproto with options.

**OOP**

Great initiative! I was looking forward to seeing some progress towards making the project more OOP. #110 still needs to be merged, so take care that this doesn't make your work incompatible. Because I may not have time to look into that PR until next week, perhaps we could start by laying out some classes and their responsibilities?

I like the idea of a message object where we can store information about both the source (proto file) and the output (package, file path, needed imports, etc.). However, I would suggest separating this refactor from the support of extensions (i.e. OOP first, then extensibility), so that we can merge the OOP improvements even if we haven't worked out the extensions yet.
This does sound interesting; I just have no idea how it works, so I'd have to look into it further. Even if one plugin was able to get the output from another, we would have to somehow encode the custom options into the Python source or the request/response objects, at which point I feel like we'd still be coming up with some meta language. I think the same applies to any type of hooks or pre-made structures: we would have to come up with (and justify) a syntax/structure for encoding and managing the custom options.

Since it would be hard to justify a universal approach that works with all possible uses of custom options, I was leaning towards delegating that work to the implementer of the custom options. In order to do that, I think I actually have a somewhat working version of …
@boukeversteegh I opened a PR at #121 to discuss modularization as a first step.
I like the sound of starting with a pure refactor to make the code easier to work with (maybe in multiple stages) before adding functionality.

Concerning each of the three approaches to extending or building on top of betterproto for custom-option-based behavior:

A) Running multiple plugins in sequence has the advantage of minimal coupling between protoc and extensions. I'd suggest looking at protoc-docs-plugin as a relatively simple example of this. That's not to say it's a simple task, but it should still be possible to produce an easy-to-use template for creating new extensions.

B) Creating an architecture for directly extending betterproto at compile time would obviously require a greater number of explicit API design choices to make (and live with thereafter), but might have the advantage of being more powerful and maybe nicer in the end if it works well. One way of realising this (similar to how I've seen such things done before) would be for betterproto to look at compile time for packages on sys.path with names derived from the custom option. So, for example, on encountering a custom option, betterproto would do something along the lines of:

```python
try:
    extension = importlib.import_module(f"betterproto_extension_{custom_option_name}")
    annotated_entity = extension.apply(annotated_entity)
except ImportError:
    pass  # ignore
```

where annotated_entity is the object modelling the message/field in the brave new OOP implementation. This mechanism has the advantage of being super simple to work with, but the potential disadvantage of being a bit implicit and uncontrolled WRT option names vs ownership of packages on PyPI.

C) Applying a generic annotation to generated code might lie somewhere in between options A & B in terms of coupling, required complexity, and power, though my (not very well substantiated) intuition is that it would be a worse tradeoff and a less elegant solution than either of them overall.
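For illustration, a minimal sketch of what such an extension package might look like under this naming scheme; the package name, the `apply` hook, and the attributes on `annotated_entity` are all assumptions about the future OOP model, not an existing betterproto API:

```python
# betterproto_extension_my_option/__init__.py -- hypothetical package name,
# discovered by the importlib lookup sketched above.


def apply(annotated_entity):
    """Receive the object modelling the annotated message/field and return a
    (possibly modified) version of it. Both attributes used below are assumed."""
    value = annotated_entity.options.get("my_option")
    if value is not None:
        # e.g. stash extra generated code derived from the option's value
        annotated_entity.extra_code = f"# constrained by my_option={value!r}"
    return annotated_entity
```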
@adriangb I would be very curious to see a working proof of concept of something that extends betterproto. I saw the empty method, but at the moment I don't see how it could be used by an extension. All in due time, of course. I might prototype something myself if I get around to it; it's an interesting challenge in itself.

@adriangb @nat-n I agree with your observations that defining an API surface will require careful deliberation and that it will be a long-term commitment. The impact of breaking the extension API would possibly be higher than breaking the betterproto API itself: upgrading users can always refactor their own code when upgrading betterproto, but if the plugins they depend on are broken, they will be stuck. Not being able to upgrade means the library is at end of life, so that's not an option imo.

With that in mind, I am most positive towards suggestion 2 by @nat-n. The extension point is straightforward, although we will still need to define an API surface for the extension to interact with. The smaller a surface we can come up with, the better, in my opinion. Anything we expose will be forever frozen and will slow down development. At this stage I think we can't afford that, unless we expose literally nothing of betterproto's internals.

I thought we could do this by passing the compiled result to the extension, along with a list of options that we found (Idea 1, basically). The extension could then modify the code at will. But I realize that this would, instead of exposing a few well-defined objects as the API surface, make the plugins depend on the specific generated code (e.g. via search and replace, or modifying the syntax tree). This would be even worse, as we then cannot change the compiled output for fear of breaking extensions.

So in conclusion, it seems better to define some surface than none (i.e. than using the entire output as the interface), and the smaller and more explicit the surface, the better.
I definitely think this is a problem that requires careful deliberation. One option is to simply attach the reference like this:

```python
@dataclass
class MyMessage(betterproto.Message):
    myfield: str = betterproto.string_field(options={"package.message.field": "value"})
```

And then let the implementer figure out how to deal with that. We could make some suggestions, such as:

```python
from xyz import MyMessage

class MessageWithOptions(MyMessage, MyOptionsParser):
    pass
```

Where …

Another option, which would work at compile time, would be to give each message/service/etc. in these new OOP classes an attribute that is an arbitrary text field that will be added at the end of the betterproto-generated text, and then provide a single API point that allows modifying this field at compile time based on the custom options parsed.

I hope this all makes sense; I'm on mobile but can try to write up something more detailed later today.
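To make the mixin idea a bit more concrete, here is a rough sketch of what an implementer-supplied `MyOptionsParser` could look like, assuming (purely hypothetically) that the generated code ends up stashing the options dict in each dataclass field's `metadata` under an `"options"` key:

```python
import dataclasses


class MyOptionsParser:
    """Hypothetical mixin: collects per-field custom options from the dataclass
    field metadata of whatever generated message class it is mixed into."""

    @classmethod
    def field_options(cls) -> dict:
        options = {}
        for f in dataclasses.fields(cls):  # works because the message class is a dataclass
            field_opts = f.metadata.get("options")  # assumed metadata key
            if field_opts:
                options[f.name] = field_opts
        return options
```

An implementer could then turn `field_options()` into whatever runtime behavior they need, e.g. validators or checks in `__post_init__`.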
But also, it may be more practical to look at how other proto compilers allow their extensions to hook in, rather than reinventing the wheel 😝
@adriangb I'm not sure I can picture what your last point entails. Contrary to my previous comment about what I labeled as approach C (which your latest comment expanded on), maybe this approach is worth considering in that it is quite low commitment with a small footprint on the API, and doesn't preclude adding more powerful compile-time mechanisms later. Though I'm not sure how one would use it. One problem with the way you suggested it is that overriding …

Could you perhaps tell us a bit about your own use case to help make the discussion more concrete?
Of course! Do you know of any other Python-based proto compilers that have similar issues? I'd love to take a look and learn from others. I am too unfamiliar with the space to really know which ones are good to look at...
Very understandable, I do think I wasn't very clear. Basically, I was suggesting that we give the new OOP objects an attribute which holds arbitrary text:

```python
class Field(Message):
    options_code: str = ""
```

At some point, somehow, we'd let the custom options parser modify this text. Then in the Jinja template:

```
class {{ message.py_name }}(betterproto.Message):
    {% if message.options_code %}
    {{ message.options_code }}
    {% endif %}
```

But really that custom options parser would still need access to …
My use case is to convert protoc-gen-validate custom options into Pydantic validators (see my comment in PGV). There are many ways to do this, as we have been discussing. That said, I think it is important for this library to consider all use cases, not just my own.

As far as I understand, custom options are little more than comments on a field/message; the implementer can choose to do with them whatever they want. I feel that the best solution would be one that fully allows the implementer to inject whatever behavior they want, be that a decorator on the class, a new method in the class, etc.

Hope this helps clarify!
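For concreteness, the kind of generated output this use case is after might look roughly like the following; the message, field, and constraint are made up for illustration, and this is not what betterproto or protoc-gen-validate emit today:

```python
from pydantic import BaseModel, validator


class User(BaseModel):
    # hypothetical field whose .proto definition carried a constraint such as
    # (validate.rules).string.max_len = 16
    name: str

    @validator("name")
    def name_max_len(cls, v: str) -> str:
        if len(v) > 16:
            raise ValueError("name must be at most 16 characters")
        return v
```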
Another option that might be nice, but would require more extensive API planning and refactoring, is to have each one of these OOP classes render itself:

```python
class MessageCompiler:
    def render(self):
        class_header = f"class {self.py_name}(betterproto.Message):\n"
        body = ""
        body += f'    """{self.comment}"""\n'
        for field in self.fields:
            body += INDENT * 4 + field.render()
        return class_header + body
```

This could also be done with Jinja templates, I guess:

```python
from .templates import MessageCompilerTemplate

class MessageCompiler:
    render_template = jinja2.Environment().from_string(MessageCompilerTemplate)

    def render(self):
        return self.render_template.render(message=self)
```

The downstream plugin/options handler would have to override the …
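As a rough illustration of how a downstream options handler could plug into this, here is a sketch of a subclass; the class name and the `custom_options` attribute are hypothetical, following the sketches above:

```python
class ValidatedMessageCompiler(MessageCompiler):
    """Hypothetical extension-supplied compiler: reuses the normal rendering
    and appends code derived from custom options found on each field."""

    def render(self) -> str:
        code = super().render()
        for field in self.fields:
            for name, value in getattr(field, "custom_options", {}).items():
                # e.g. emit a validator, a property, or just a marker comment
                code += f"{INDENT}# option {name} = {value!r}\n"
        return code
```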
Now that #121 is merged, I think tackling this is a good next step, since it will be very informative in designing any further changes to the plugin. Currently, I am thinking the option I proposed in #119 (comment) makes the most sense within the context of #121. I'm going to attempt an implementation and see how it works out.
I like this idea. I had something along these lines in mind as well, sort of like how components compose in UI frameworks like ReactJS. The API will take some iteration to get right, but this could also be quite nice for handling a few other things that are planned, like nested classes.
Do you have any strong feelings regarding the use of Jinja templates within the class vs. just using f-strings and such (kind of how I did for the message fields in #121)? My gut feeling is to keep the rendering within the dataclasses pure Python and use a Jinja template to render the overall output file.
Any progress so far? I was just wondering: why can't options be attached to the field metadata? It might be a bit hard to get them out of the dataclass, but at least it should be very easy to implement.
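Getting them back out of a dataclass is actually fairly mechanical; a minimal sketch using only stdlib dataclasses, with a made-up field and an assumed `"options"` metadata key (not current betterproto output):

```python
import dataclasses


@dataclasses.dataclass
class Example:
    # hypothetical: options attached when the field is declared
    name: str = dataclasses.field(default="", metadata={"options": {"max_len": 16}})


# ...and read back via the dataclass machinery:
for f in dataclasses.fields(Example):
    print(f.name, f.metadata.get("options"))  # -> name {'max_len': 16}
```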
Hi, awesome work, and I appreciate the effort! We have a bunch of option fields in our models and it would be awesome if betterproto could render them in any form (personally I like the `@property` approach). So is there any progress on this, or is it just stalled without a clear path forward?
Custom options are a vanilla proto feature documented here. I personally have a use case for them to apply constraints, similar to protoc-gen-validate. I think supporting generic custom options would be a great addition to python-betterproto.

In order for this to happen, I see a couple of changes that would be needed:

1. The custom options need to be known (i.e. their extension definitions compiled and imported) before `CodeGeneratorRequest.ParseFromString` is called. This requirement is documented in "Python ParseFromString not parsing custom options" (protocolbuffers/protobuf#3321); see the sketch below.
2. `plugin.py` needs the ability to parse the custom options and somehow store them.
3. `plugin.py` would need to be made extendable so that individual implementations can decide how to parse the custom options and how to use them in the generation of the output `.py` files.

Is there any interest in this? I'd be willing to try and work on something.
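As a sketch of what point (1) involves in practice (this is not betterproto code; `my_options_pb2` and `my_rules` are hypothetical stand-ins for whatever generated module defines the custom option):

```python
import sys

# Importing the generated module that defines the custom option registers the
# extension with the protobuf runtime, so ParseFromString below can decode the
# option instead of treating it as an unknown field (protocolbuffers/protobuf#3321).
import my_options_pb2  # hypothetical

from google.protobuf.compiler import plugin_pb2

request = plugin_pb2.CodeGeneratorRequest()
request.ParseFromString(sys.stdin.buffer.read())

for proto_file in request.proto_file:
    for message in proto_file.message_type:
        for field in message.field:
            if field.options.HasExtension(my_options_pb2.my_rules):
                rules = field.options.Extensions[my_options_pb2.my_rules]
                # hand `rules` off to whatever generates the output .py files
```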