Replies: 1 comment
-
Since this language server project is somehow connected to supabase and supabase encourages to use declarative schemas, I wanted to ask if it would be (besides your mentioned limitations, theoretically) possible to define a directory or schema (for example supabase/schemas) that can (somehow) extend/overwrite the ground truth provided by the database. In that case, database types/symbols managed by the developer declaratively could be mapped to the code/files. The latter would have the benefit that a saved file change could be immediately available to the language server, so that developers are not required to run a migration after every (small) change and that required changes in dependent sql structures would be immediately visible. It also could enable many unique features that are not supported by any existing sql tooling tool and therefore set this project apart from others in the long term - like goto definitions that actually jump to the source code instead of generating a definition from the database. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
I did some research about language server architecture and want to share my findings and a proposal for this project.
First, I will present some resources I used to get a better understanding and links to other projects that can serve as inspiration, before describing the special requirements a Postgres language server has and finally presenting my proposal.
Please note that this is just the result of a bit of research. I am in no means an expert and am very happy to get feedback on this.
This blog post on Three Architectures for a responsive IDE is a great starting point. It describes different architectures with their benefits and drawbacks. I can also recommend going through its references:
Another great resource is this blog post that explores a new idea for a languages server data model for the zig language server.
One important learning is that even though every language server is basically doing the same thing, the design of the language determines the architecture of the underlying data model. For example, if the language supports header files and has a declaration before use rule, the language server can compile the headers once and cache them. When the user types within a file, the compiler restarts from the point just after the header section of the very same file. All other files and headers are read from cache, and the compilation unit is reasonably small enough.
This does not work for every language though. For example, in Rust, the compilation unit is a whole crate, not a single file. And there is no declaration before use rule.
But we do not want to build a language server for C++ or Rust, but Postgres. And there is one very significant difference. Usually, types are defined within the codebase, and resolved from there. In the example below, the language server first has to compile
foo.ts
and up a data base withFoo
to resolve the type ofbar
and check whether it is correct.So the data model of the languages server is built by parsing and analysing the entire codebase. For Postgres (or any SQL dialect really), this is very different for the most part: while types can be defined within the source code e.g. within a migration file by using declarative schema management, the database itself is the single source of truth, and there is no relation between files. The variety of schema and migration management tools employed today means we cannot make a lot of assumptions about how the source code is structured and whether it reflects the current state.
There are three exceptions to this. First, types that are declared or altered in the same sql source before the current statement. Take for example
Until this migration is executed, the schema cache does not include the column email and our language server will report an error that the column does not exist on contact. What we could do to fix this is to alter the schema cache with every statement above the one currently analysed. From my experience, this is not a super important case, since ddl and dml are rarely written in the same file. And a developer could always execute the ddl statements once to update the schema cache before writing any dml statements.
The second execution that breaks the “no relationship between files” assumption is \include, which allows executing another sql file as it would have been part of the current one. Again, this is only relevant for ddl statements, and can be supported by simply treating the statements from the imported file as part of the current file.
The third exception is plpsql functions, e.g.
where local variable declarations are possible. At first, we can rely on https://github.com/okbob/plpgsql_check/tree/master for the static analysis. Eventually, we might want to write our own plpgsql linter. But either way: all types are declared local to the create function statement, or can be queried from the database schema.
Based on the above, we can make the following assumptions:
Given the assumptions, I propose the following architecture:
That’s it. Its still very high level, but I will start will implementing some proof of concepts now and would love some feedback!
Best
Philipp
Beta Was this translation helpful? Give feedback.
All reactions