REF: Make plugin models self-render #131

adriangb · 2020-07-30T18:08:09Z

Applying one of the proposals in #119.

This would be a pure-python way of making the dataclasses self render. The other option would be to continue to use Jinja templates. I don't feel strongly about either, I just feel that we should pick something that is extensible and compatible with #119.

@nat-n @boukeversteegh could you take a look at this and give some feedback on this path?

nat-n · 2020-08-02T15:06:00Z

src/betterproto/plugin/models.py

@@ -281,6 +284,42 @@ def __post_init__(self):
        self.deprecated = self.proto_obj.options.deprecated
        super().__post_init__()

+    def compile(self, indent_level: int = 0) -> str:
+        """Construct Python code from MessageCompiler instance."""


Suggested change

"""Construct Python code from MessageCompiler instance."""

"""Generate Python code for this MessageCompiler instance."""

nat-n · 2020-08-02T15:34:45Z

src/betterproto/plugin/models.py

@@ -281,6 +284,42 @@ def __post_init__(self):
        self.deprecated = self.proto_obj.options.deprecated
        super().__post_init__()

+    def compile(self, indent_level: int = 0) -> str:
+        """Construct Python code from MessageCompiler instance."""
+        out = f"class {self.py_name}(betterproto.Message):"


To be clear this is OK, certainly not worse than the jinja template. Building up a string line by line is clear enough, if a bit visually noisy, and a bit inefficient with all that copying.

However I wonder if it could be at least as clear and also more concise by building up and joining a list like:

result = [f"class {self.py_name}(betterproto.Message):"] if self.comment: result.append(self.comment) if self.fields: result.append( field.compile(indent_level=indent_level + 1) for field in self.fields ) ... return "\n".join(result)

Or even get a little declarative DSL going to manage the indentation and compiling of child compilers like:

code = CodeBuilder(indent=indent_level) code(f"class {self.py_name}(betterproto.Message):") if self.comment: code(self.comment, space=1) # space adds extra empty lines after if self.fields: code(*self.fields, indent=1) # *args can be Compilers, and indents are cumulative else: code("pass", indent=1) if self.deprecated or list(self.deprecated_fields): code("def __post_init__(self) -> None:", indent=1) code("super().__post_init__()", indent=2) ... return code.build()

Given a "simple" helper along the lines of:

class CodeBuilder: def __init__(self, indent:str): self._base_indent = indent self._lines = [] def __call__(*lines, *, indent: int = 0, space: int = 0) for line in lines: ... def compile(): str ...

I'm not saying it's necessary, and there's a risk of a getting carried away, but it could be nice. What do you think?

I like the CodeBuilder because it can be used to clean up new lines and track indentations. I think that as long as it's clear that it's only function is building the code and formatting (i.e. it has no knowledge of what the lines of code it is being fed mean) the scope should be manageable.

Regarding performance, while we should obviously try to optimize what can be optimized, my personal feeling is that speed here should not be as much of a concern as easy to understand code.

adriangb · 2020-08-03T17:11:48Z

@nat-n what do you think of a context manager based indenter/spacer:

Extensive Example

from contextlib import contextmanager
from typing import Iterator


class CodeBuilder:
    """Helper to organize and format lines of code."""

    def __init__(self, indent: str = " " * 4):
        self._base_indent = indent
        self._lines = []
        self._indent_level: int = 0
    
    @contextmanager
    def pre_space(self, space: int = 1) -> Iterator[None]:
        self._lines.extend([""] * space)
        yield

    @contextmanager
    def post_space(self, space: int = 1) -> Iterator[None]:
        yield
        self._lines.extend([""] * space)
    
    @contextmanager
    def indented_block(self) -> Iterator[None]:
        self._indent_level += 1
        yield
        self._indent_level -= 1

    def add_lines(self, *lines) -> None:
        for line in lines:
            self._lines.append((self._base_indent * self._indent_level) + line)

    def compile(self):
        return "\n".join(self._lines)

    def get_lines(self):
        return self._lines


class MessageCompiler:

    def __init__(self, fields):
        self.fields = fields

    def get_lines(self):
        code = CodeBuilder()
        with code.post_space():
            code.add_lines("class MyMessage(betterproto.message):")
            with code.pre_space(), code.indented_block():
                code.add_lines("# I'm a comment!")
            with code.pre_space(), code.indented_block():
                for field in self.fields:
                    code.add_lines(*field.get_lines())
        return code.get_lines()

    def compile(self):
        return "\n".join(self.get_lines())


class FieldCompiler:

    def __init__(self, name):
        self.name = name
    
    def get_lines(self):
        return [f'field_{self.name} = "{self.name.upper()}"']

    def compile(self):
        return "\n".join(self.get_lines())

fields = [FieldCompiler("somefield"), FieldCompiler("anotherfield")]
message = MessageCompiler(fields=fields)

print(message.compile())

class MyMessage(betterproto.message):

    # I'm a comment!

    field_somefield = "SOMEFIELD"
    field_anotherfield = "ANOTHERFIELD"

I think this gives a good amount of freedom while still being easy to understand. For example, the following are equivalent:

code = CodeBuilder()
with code.post_space():
    code.add_lines("class MyMessage(betterproto.message):")
    with code.indented_block():
        with code.pre_space():
            code.add_lines("# I'm a comment!")
        with code.pre_space(), 
            for field in self.fields:
                code.add_lines(*field.get_lines())
return code.get_lines()

code = CodeBuilder()
with code.post_space():
    code.add_lines("class MyMessage(betterproto.message):")
    with code.pre_space(), code.indented_block():
        code.add_lines("# I'm a comment!")
    with code.pre_space(), code.indented_block():
        for field in self.fields:
            code.add_lines(*field.get_lines())
return code.get_lines()

nat-n · 2020-08-03T18:13:52Z

@adriangb I guess the context manager is nice in that the code ends up sort of mirroring the resulting code in term of indent level. Looking again at my example it seems like an obvious way to explicitly express an aspect of the structure that's currently expressed through repetition.

I'm not sure the post_space/pre_space makes sense as a context manager, since it doesn't alter the contents in the same sense, It could just be a call to "add an empty line" (although the relationship of the space to the code that "causes" is lost). Why indent the whole block just to append an empty line to the end?

Honestly, the idea is good, but I don't like it as much, in that there's more for the reader to make sense of both visually and structurally.

My taste is for a bit more minimalism, but I could understand if my example seemed to go too far to the point of looking cryptic. However there's something to be said for the simplicity of expressing the whole process in terms of just two operations on the builder "add stuff" (with certain properties) and "get the result". Within such as simple interaction I think a little magic (such as abstracting calls to compile sub-components) is nice to have without being too surprising.

What do you think?

adriangb · 2020-08-03T18:40:40Z

I hear you on post_space/pre_space. Easy enough to just replace with add_blank or something.

Extensive Example

from contextlib import contextmanager
from typing import Iterator


class CodeBuilder:
    """Helper to organize and format lines of code."""

    def __init__(self, indent: str = " " * 4):
        self._base_indent = indent
        self._lines = []
        self._indent_level: int = 0
    
    def add_blank(self):
        self._lines.append("")
    
    @contextmanager
    def indented_block(self) -> Iterator[None]:
        self._indent_level += 1
        yield
        self._indent_level -= 1

    def add_lines(self, *lines) -> None:
        for line in lines:
            self._lines.append((self._base_indent * self._indent_level) + line)

    def compile(self):
        return "\n".join(self._lines)

    def get_lines(self):
        return self._lines


class MessageCompiler:

    def __init__(self, fields=None, messages=None):
        self.fields = fields or []
        self.messages = messages or []

    def get_lines(self):
        code = CodeBuilder()
        code.add_lines("class MyMessage(betterproto.message):")
        with code.indented_block():
            code.add_blank()
            code.add_lines("# I'm a class comment!")
            code.add_blank()
            if self.fields or self.messages:
                for field in self.fields:
                    code.add_lines(*field.get_lines())
                for message in self.messages:
                    code.add_blank()
                    code.add_lines(*message.get_lines())
            else:
                code.add_lines("pass")
        code.add_blank()
        return code.get_lines()

    def compile(self):
        return "\n".join(self.get_lines())


class FieldCompiler:

    def __init__(self, name):
        self.name = name
    
    def get_lines(self):
        return [f'field_{self.name} = "{self.name.upper()}"']

    def compile(self):
        return "\n".join(self.get_lines())

inner_message = MessageCompiler(fields=[])
fields = [FieldCompiler("somefield"), FieldCompiler("anotherfield")]
messages = [inner_message]
outer_message = MessageCompiler(fields=fields, messages=messages)

print(outer_message.compile())

class MyMessage(betterproto.message):

    # I'm a class comment!

    field_somefield = "SOMEFIELD"
    field_anotherfield = "ANOTHERFIELD"

    class MyMessage(betterproto.message):
    
        # I'm a class comment!
    
        pass

adriangb · 2020-08-03T18:41:41Z

Generally, I think it makes sense to have some type of helper for indentation and spacing since all of the compilers are going to be doing that and having random "\n" thrown in there makes it seem... busy.

adriangb · 2020-08-03T18:46:21Z

One thing I noticed that might be an issue: we'll have to rework the current get_comments function to accurately reflow comments. As is, I believe it isn't reflowing comments correctly (perhaps I broke it in my previous PR).

boukeversteegh · 2020-08-03T19:12:36Z

Hmm, personally I feel that adding this abstraction layer is a step backwards in terms of simplicity and readability.
This can (and I'm afraid will) go out of hand very quickly as it turns out we need more and more builders/formatters/generators to generate the strings as we like.

Consider why we don't generate HTML page like that anymore and every single web framework uses template engines or html builders.

If we really need to go this route, I'd go more in the direction of constructing an Abstract Syntax Tree (Python already has a built in package for it) and let a renderer like black turn it into a string. I'm not sure the python AST will be easy enough to work with, but it would be a well documented structure that supports all python by design.

I haven't considered all perspectives deeply, so this is mostly my gut reaction, and I'm willing to change my mind.

I do however like the idea of each object to selfrender, but if we could do that with templates still (compare with vue components), we might have the best of both worlds.

Seems like the nesting is the biggest problem here. What if we return a rendered template from each object, and apply the indentation to the result?

adriangb · 2020-08-03T23:37:39Z

If we really need to go this route, I'd go more in the direction of constructing an Abstract Syntax Tree (Python already has a built in package for it) and let a renderer like black turn it into a string. I'm not sure the python AST will be easy enough to work with, but it would be a well documented structure that supports all python by design.

I'm not opposed to something like that, but then we really would be increasing the complexity/minimum level of understanding required.

I do however like the idea of each object to selfrender, but if we could do that with templates still (compare with vue components), we might have the best of both worlds.

We could embed the existing templates into these "compiler" classes and render that way, still using jinja. That's also (probably) a smaller diff.

Seems like the nesting is the biggest problem here. What if we return a rendered template from each object, and apply the indentation to the result?

That the idea here, have the parent object apply the indentation. I think it would be the same with a template.

Overall not opposed to the idea of still using templates, as long as each one of these "compiler" classes had it's own template that can be self-contained so that things like nested message definitions would be possible. Personally, my main goal here is to be able to extend for parsing of custom options. As far as I can tell, either option should allow that.

nat-n · 2020-08-04T09:38:59Z

I think we're all agreed that self-rendering components is a step forward. My first thought on this was that each component would have it's own jinja template, though I'm not sure that's optimal, since I don't feel jinja works as elegantly for python as for html. Still a step forward from what we have.

I am particularly cautious of using an AST. Modelling program syntax is a much larger problem than string building or templating. Of course if there's a sufficiently lightweight, simple, and flexible off the shelf AST library could be quite a nice way of doing it, but might still be overkill.

My intuition is that a good balance of simplicity/clarity/expressiveness/efficiency is most easily achieved within the paradigm of a string builder –– basically a convenient API around appending to a list of strings to then concatenate. There's not a lot of abstraction required.

My favourite realisation of this is still basically of what I proposed, where indent beyond the local baseline (of the compile method) is annotated per call to add lines. It could be thought of like specifying "padding" directly on a JSX component. That said annotating those lines via a context manager is also fine.

Within this framework I think it's simpler to pass the baseline indentation to the rendering component (keep the indent_level arg) rather than indent the output after the fact. Especially since indentation is only sometimes related to component boundaries.

I find this an interesting topic to explore, but if it gets tiresome, I wouldn't be fundamentally opposed to merging some variation of any of the proposals so far.

adriangb · 2020-08-04T15:52:14Z

@nat-n I think we agree regarding Jinja and AST: the former can be a bit ugly (do we just define a bunch of strings in a .py file?) while the latter can be a bit complex/overkill.

Regarding the string builder helper proposals or the specific internals used to build a string that we have been discussing in this PR and in your above comment, I don't 100% understand everything you are saying, so I'm sorry for any miscommunication. In any case, whatever you feel is best I am happy with, as long as it will eventually allow for extending the plugin to process custom options and transform them into injection of arbitrary code.

I find this an interesting topic to explore, but if it gets tiresome, I wouldn't be fundamentally opposed to merging some variation of any of the proposals so far.

I think the one big pro of the Jinja approach is that we aren't proposing anything fundamentally new but rather just refactoring what already exists. I think the bar for that should be a lot lower (i.e. if the API isn't great or super flexible that's okay, it's just a first step).

adriangb · 2020-08-30T20:11:47Z

I'm going to close this. As per offline discussion with @nat-n, I don't think this is the path forward.

adriangb added 3 commits July 30, 2020 12:56

initial test

0fa8d25

fix deprecated fields

9f73a68

blacken

5b465ca

nat-n reviewed Aug 2, 2020

View reviewed changes

adriangb closed this Aug 30, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

REF: Make plugin models self-render #131

REF: Make plugin models self-render #131

adriangb commented Jul 30, 2020

nat-n Aug 2, 2020

nat-n Aug 2, 2020 •

edited

Loading

adriangb Aug 3, 2020 •

edited

Loading

adriangb Aug 3, 2020 •

edited

Loading

adriangb commented Aug 3, 2020 •

edited

Loading

nat-n commented Aug 3, 2020 •

edited

Loading

adriangb commented Aug 3, 2020 •

edited

Loading

adriangb commented Aug 3, 2020

adriangb commented Aug 3, 2020

boukeversteegh commented Aug 3, 2020

adriangb commented Aug 3, 2020 •

edited

Loading

nat-n commented Aug 4, 2020

adriangb commented Aug 4, 2020

adriangb commented Aug 30, 2020

	"""Construct Python code from MessageCompiler instance."""
	"""Generate Python code for this MessageCompiler instance."""

REF: Make plugin models self-render #131

REF: Make plugin models self-render #131

Conversation

adriangb commented Jul 30, 2020

nat-n Aug 2, 2020

Choose a reason for hiding this comment

nat-n Aug 2, 2020 • edited Loading

Choose a reason for hiding this comment

adriangb Aug 3, 2020 • edited Loading

Choose a reason for hiding this comment

adriangb Aug 3, 2020 • edited Loading

Choose a reason for hiding this comment

adriangb commented Aug 3, 2020 • edited Loading

nat-n commented Aug 3, 2020 • edited Loading

adriangb commented Aug 3, 2020 • edited Loading

adriangb commented Aug 3, 2020

adriangb commented Aug 3, 2020

boukeversteegh commented Aug 3, 2020

adriangb commented Aug 3, 2020 • edited Loading

nat-n commented Aug 4, 2020

adriangb commented Aug 4, 2020

adriangb commented Aug 30, 2020

nat-n Aug 2, 2020 •

edited

Loading

adriangb Aug 3, 2020 •

edited

Loading

adriangb Aug 3, 2020 •

edited

Loading

adriangb commented Aug 3, 2020 •

edited

Loading

nat-n commented Aug 3, 2020 •

edited

Loading

adriangb commented Aug 3, 2020 •

edited

Loading

adriangb commented Aug 3, 2020 •

edited

Loading