-
Notifications
You must be signed in to change notification settings - Fork 229
REF: Make plugin models self-render #131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -281,6 +284,42 @@ def __post_init__(self): | |||
self.deprecated = self.proto_obj.options.deprecated | |||
super().__post_init__() | |||
|
|||
def compile(self, indent_level: int = 0) -> str: | |||
"""Construct Python code from MessageCompiler instance.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Construct Python code from MessageCompiler instance.""" | |
"""Generate Python code for this MessageCompiler instance.""" |
@@ -281,6 +284,42 @@ def __post_init__(self): | |||
self.deprecated = self.proto_obj.options.deprecated | |||
super().__post_init__() | |||
|
|||
def compile(self, indent_level: int = 0) -> str: | |||
"""Construct Python code from MessageCompiler instance.""" | |||
out = f"class {self.py_name}(betterproto.Message):" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear this is OK, certainly not worse than the jinja template. Building up a string line by line is clear enough, if a bit visually noisy, and a bit inefficient with all that copying.
However I wonder if it could be at least as clear and also more concise by building up and joining a list like:
result = [f"class {self.py_name}(betterproto.Message):"]
if self.comment:
result.append(self.comment)
if self.fields:
result.append(
field.compile(indent_level=indent_level + 1) for field in self.fields
)
...
return "\n".join(result)
Or even get a little declarative DSL going to manage the indentation and compiling of child compilers like:
code = CodeBuilder(indent=indent_level)
code(f"class {self.py_name}(betterproto.Message):")
if self.comment:
code(self.comment, space=1) # space adds extra empty lines after
if self.fields:
code(*self.fields, indent=1) # *args can be Compilers, and indents are cumulative
else:
code("pass", indent=1)
if self.deprecated or list(self.deprecated_fields):
code("def __post_init__(self) -> None:", indent=1)
code("super().__post_init__()", indent=2)
...
return code.build()
Given a "simple" helper along the lines of:
class CodeBuilder:
def __init__(self, indent:str):
self._base_indent = indent
self._lines = []
def __call__(*lines, *, indent: int = 0, space: int = 0)
for line in lines:
...
def compile(): str
...
I'm not saying it's necessary, and there's a risk of a getting carried away, but it could be nice. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the CodeBuilder
because it can be used to clean up new lines and track indentations. I think that as long as it's clear that it's only function is building the code and formatting (i.e. it has no knowledge of what the lines of code it is being fed mean) the scope should be manageable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding performance, while we should obviously try to optimize what can be optimized, my personal feeling is that speed here should not be as much of a concern as easy to understand code.
@nat-n what do you think of a context manager based indenter/spacer: Extensive Examplefrom contextlib import contextmanager
from typing import Iterator
class CodeBuilder:
"""Helper to organize and format lines of code."""
def __init__(self, indent: str = " " * 4):
self._base_indent = indent
self._lines = []
self._indent_level: int = 0
@contextmanager
def pre_space(self, space: int = 1) -> Iterator[None]:
self._lines.extend([""] * space)
yield
@contextmanager
def post_space(self, space: int = 1) -> Iterator[None]:
yield
self._lines.extend([""] * space)
@contextmanager
def indented_block(self) -> Iterator[None]:
self._indent_level += 1
yield
self._indent_level -= 1
def add_lines(self, *lines) -> None:
for line in lines:
self._lines.append((self._base_indent * self._indent_level) + line)
def compile(self):
return "\n".join(self._lines)
def get_lines(self):
return self._lines
class MessageCompiler:
def __init__(self, fields):
self.fields = fields
def get_lines(self):
code = CodeBuilder()
with code.post_space():
code.add_lines("class MyMessage(betterproto.message):")
with code.pre_space(), code.indented_block():
code.add_lines("# I'm a comment!")
with code.pre_space(), code.indented_block():
for field in self.fields:
code.add_lines(*field.get_lines())
return code.get_lines()
def compile(self):
return "\n".join(self.get_lines())
class FieldCompiler:
def __init__(self, name):
self.name = name
def get_lines(self):
return [f'field_{self.name} = "{self.name.upper()}"']
def compile(self):
return "\n".join(self.get_lines())
fields = [FieldCompiler("somefield"), FieldCompiler("anotherfield")]
message = MessageCompiler(fields=fields)
print(message.compile())
I think this gives a good amount of freedom while still being easy to understand. For example, the following are equivalent: code = CodeBuilder()
with code.post_space():
code.add_lines("class MyMessage(betterproto.message):")
with code.indented_block():
with code.pre_space():
code.add_lines("# I'm a comment!")
with code.pre_space(),
for field in self.fields:
code.add_lines(*field.get_lines())
return code.get_lines() code = CodeBuilder()
with code.post_space():
code.add_lines("class MyMessage(betterproto.message):")
with code.pre_space(), code.indented_block():
code.add_lines("# I'm a comment!")
with code.pre_space(), code.indented_block():
for field in self.fields:
code.add_lines(*field.get_lines())
return code.get_lines() |
@adriangb I guess the context manager is nice in that the code ends up sort of mirroring the resulting code in term of indent level. Looking again at my example it seems like an obvious way to explicitly express an aspect of the structure that's currently expressed through repetition. I'm not sure the post_space/pre_space makes sense as a context manager, since it doesn't alter the contents in the same sense, It could just be a call to "add an empty line" (although the relationship of the space to the code that "causes" is lost). Why indent the whole block just to append an empty line to the end? Honestly, the idea is good, but I don't like it as much, in that there's more for the reader to make sense of both visually and structurally. My taste is for a bit more minimalism, but I could understand if my example seemed to go too far to the point of looking cryptic. However there's something to be said for the simplicity of expressing the whole process in terms of just two operations on the builder "add stuff" (with certain properties) and "get the result". Within such as simple interaction I think a little magic (such as abstracting calls to compile sub-components) is nice to have without being too surprising. What do you think? |
I hear you on Extensive Examplefrom contextlib import contextmanager
from typing import Iterator
class CodeBuilder:
"""Helper to organize and format lines of code."""
def __init__(self, indent: str = " " * 4):
self._base_indent = indent
self._lines = []
self._indent_level: int = 0
def add_blank(self):
self._lines.append("")
@contextmanager
def indented_block(self) -> Iterator[None]:
self._indent_level += 1
yield
self._indent_level -= 1
def add_lines(self, *lines) -> None:
for line in lines:
self._lines.append((self._base_indent * self._indent_level) + line)
def compile(self):
return "\n".join(self._lines)
def get_lines(self):
return self._lines
class MessageCompiler:
def __init__(self, fields=None, messages=None):
self.fields = fields or []
self.messages = messages or []
def get_lines(self):
code = CodeBuilder()
code.add_lines("class MyMessage(betterproto.message):")
with code.indented_block():
code.add_blank()
code.add_lines("# I'm a class comment!")
code.add_blank()
if self.fields or self.messages:
for field in self.fields:
code.add_lines(*field.get_lines())
for message in self.messages:
code.add_blank()
code.add_lines(*message.get_lines())
else:
code.add_lines("pass")
code.add_blank()
return code.get_lines()
def compile(self):
return "\n".join(self.get_lines())
class FieldCompiler:
def __init__(self, name):
self.name = name
def get_lines(self):
return [f'field_{self.name} = "{self.name.upper()}"']
def compile(self):
return "\n".join(self.get_lines())
inner_message = MessageCompiler(fields=[])
fields = [FieldCompiler("somefield"), FieldCompiler("anotherfield")]
messages = [inner_message]
outer_message = MessageCompiler(fields=fields, messages=messages)
print(outer_message.compile())
|
Generally, I think it makes sense to have some type of helper for indentation and spacing since all of the compilers are going to be doing that and having random |
One thing I noticed that might be an issue: we'll have to rework the current |
Hmm, personally I feel that adding this abstraction layer is a step backwards in terms of simplicity and readability. Consider why we don't generate HTML page like that anymore and every single web framework uses template engines or html builders. If we really need to go this route, I'd go more in the direction of constructing an Abstract Syntax Tree (Python already has a built in package for it) and let a renderer like black turn it into a string. I'm not sure the python AST will be easy enough to work with, but it would be a well documented structure that supports all python by design. I haven't considered all perspectives deeply, so this is mostly my gut reaction, and I'm willing to change my mind. I do however like the idea of each object to selfrender, but if we could do that with templates still (compare with vue components), we might have the best of both worlds. Seems like the nesting is the biggest problem here. What if we return a rendered template from each object, and apply the indentation to the result? |
I'm not opposed to something like that, but then we really would be increasing the complexity/minimum level of understanding required.
We could embed the existing templates into these "compiler" classes and render that way, still using jinja. That's also (probably) a smaller diff.
That the idea here, have the parent object apply the indentation. I think it would be the same with a template. Overall not opposed to the idea of still using templates, as long as each one of these "compiler" classes had it's own template that can be self-contained so that things like nested message definitions would be possible. Personally, my main goal here is to be able to extend for parsing of custom options. As far as I can tell, either option should allow that. |
I think we're all agreed that self-rendering components is a step forward. My first thought on this was that each component would have it's own jinja template, though I'm not sure that's optimal, since I don't feel jinja works as elegantly for python as for html. Still a step forward from what we have. I am particularly cautious of using an AST. Modelling program syntax is a much larger problem than string building or templating. Of course if there's a sufficiently lightweight, simple, and flexible off the shelf AST library could be quite a nice way of doing it, but might still be overkill. My intuition is that a good balance of simplicity/clarity/expressiveness/efficiency is most easily achieved within the paradigm of a string builder –– basically a convenient API around appending to a list of strings to then concatenate. There's not a lot of abstraction required. My favourite realisation of this is still basically of what I proposed, where indent beyond the local baseline (of the compile method) is annotated per call to add lines. It could be thought of like specifying "padding" directly on a JSX component. That said annotating those lines via a context manager is also fine. Within this framework I think it's simpler to pass the baseline indentation to the rendering component (keep the indent_level arg) rather than indent the output after the fact. Especially since indentation is only sometimes related to component boundaries. I find this an interesting topic to explore, but if it gets tiresome, I wouldn't be fundamentally opposed to merging some variation of any of the proposals so far. |
@nat-n I think we agree regarding Jinja and AST: the former can be a bit ugly (do we just define a bunch of strings in a Regarding the string builder helper proposals or the specific internals used to build a string that we have been discussing in this PR and in your above comment, I don't 100% understand everything you are saying, so I'm sorry for any miscommunication. In any case, whatever you feel is best I am happy with, as long as it will eventually allow for extending the plugin to process custom options and transform them into injection of arbitrary code.
I think the one big pro of the Jinja approach is that we aren't proposing anything fundamentally new but rather just refactoring what already exists. I think the bar for that should be a lot lower (i.e. if the API isn't great or super flexible that's okay, it's just a first step). |
I'm going to close this. As per offline discussion with @nat-n, I don't think this is the path forward. |
Applying one of the proposals in #119.
This would be a pure-python way of making the dataclasses self render. The other option would be to continue to use Jinja templates. I don't feel strongly about either, I just feel that we should pick something that is extensible and compatible with #119.
@nat-n @boukeversteegh could you take a look at this and give some feedback on this path?