|
| 1 | +# dts-tree-sitter |
| 2 | + |
| 3 | +**dts-tree-sitter** generates TypeScript `.d.ts` files for interacting with the AST from a given tree-sitter grammar. |
| 4 | + |
| 5 | +## Usage |
| 6 | + |
| 7 | +```sh |
| 8 | +npm i @asgerf/dts-tree-sitter |
| 9 | + |
| 10 | +npx @asgerf/dts-tree-sitter INPUT > OUTPUT.d.ts |
| 11 | +``` |
| 12 | + |
| 13 | +where `INPUT` is used to locate a `node-types.json` file in one of the following locations: |
| 14 | +- `${INPUT}` |
| 15 | +- `${INPUT}/node-types.json` |
| 16 | +- `${INPUT}/src/node-types.json` |
| 17 | +- `node_modules/${INPUT}/src/node-types.json` |
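The lookup described above can be pictured as a list of candidate paths derived from `INPUT`. The sketch below is only an illustration: `candidatePaths` is a hypothetical helper, and the assumption that the tool probes the locations in the listed order is not confirmed by this README.

```typescript
// Hypothetical helper (not part of dts-tree-sitter): lists the paths that
// would be probed for a given INPUT, in the order the README lists them.
// The actual probing order used by the tool is an assumption.
function candidatePaths(input: string): string[] {
  return [
    input,
    `${input}/node-types.json`,
    `${input}/src/node-types.json`,
    `node_modules/${input}/src/node-types.json`,
  ];
}

const paths = candidatePaths("tree-sitter-javascript");
console.log(paths.join("\n"));
```

For a package installed through npm, the last candidate is the one expected to exist, which is why passing just the package name works in the example below.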
| 18 | + |
| 19 | +## Example |
| 20 | + |
| 21 | +Typings for the `tree-sitter-javascript` grammar can be generated like this: |
| 22 | +```sh |
| 23 | +npm i tree-sitter-javascript |
| 24 | +npx @asgerf/dts-tree-sitter tree-sitter-javascript > generated.d.ts |
| 25 | +``` |
| 26 | + |
| 27 | +In the resulting `.d.ts` file, two of the node types look like this: |
| 28 | +```ts |
| 29 | +export interface ClassDeclarationNode extends SyntaxNodeBase { |
| 30 | +  type: SyntaxType.ClassDeclaration; |
| 31 | +  bodyNode: ClassBodyNode; |
| 32 | +  decoratorNodes?: DecoratorNode[]; |
| 33 | +  nameNode: IdentifierNode; |
| 34 | +} |
| 35 | + |
| 36 | +export interface ClassBodyNode extends SyntaxNodeBase { |
| 37 | +  type: SyntaxType.ClassBody; |
| 38 | +  memberNodes?: (MethodDefinitionNode | PublicFieldDefinitionNode)[]; |
| 39 | +} |
| 40 | +``` |
| 41 | + |
| 42 | +This can be used like this (see [full example](examples/javascript/index.ts)): |
| 43 | +```ts |
| 44 | +import * as g from "./generated"; |
| 45 | + |
| 46 | +function getMemberNames(node: g.ClassDeclarationNode): string[] { |
| 47 | +  const result: string[] = []; |
| 48 | +  for (const member of node.bodyNode.memberNodes ?? []) { // memberNodes is optional |
| 49 | +    if (member.type === g.SyntaxType.MethodDefinition) { |
| 50 | +      result.push(member.nameNode.text); |
| 51 | +    } else { |
| 52 | +      result.push(member.propertyNode.text); |
| 53 | +    } |
| 54 | +  } |
| 55 | +  return result; |
| 56 | +} |
| 57 | +``` |
| 58 | + |
| 59 | +Observe TypeScript do its magic: the type check in the `if` narrows the type of `member` to `MethodDefinitionNode` |
| 60 | +in the 'then' branch, and to `PublicFieldDefinitionNode` in the 'else' branch. |
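The narrowing relies on each node interface declaring a distinct literal `type`, which makes the union a discriminated union. Here is a self-contained sketch of that mechanism. The types are simplified stand-ins rather than the generated declarations, and the string values given to `SyntaxType` are assumptions.

```typescript
// Simplified stand-ins for the generated types; the real declarations are
// larger, and the enum's string values here are assumptions.
enum SyntaxType {
  MethodDefinition = "method_definition",
  PublicFieldDefinition = "public_field_definition",
}

interface TextNode {
  text: string;
}

interface MethodDefinitionNode {
  type: SyntaxType.MethodDefinition;
  nameNode: TextNode;
}

interface PublicFieldDefinitionNode {
  type: SyntaxType.PublicFieldDefinition;
  propertyNode: TextNode;
}

type Member = MethodDefinitionNode | PublicFieldDefinitionNode;

function memberName(member: Member): string {
  if (member.type === SyntaxType.MethodDefinition) {
    // Here `member` is narrowed to MethodDefinitionNode.
    return member.nameNode.text;
  }
  // Here `member` is narrowed to PublicFieldDefinitionNode.
  return member.propertyNode.text;
}

const members: Member[] = [
  { type: SyntaxType.MethodDefinition, nameNode: { text: "render" } },
  { type: SyntaxType.PublicFieldDefinition, propertyNode: { text: "state" } },
];
const names = members.map(memberName);
console.log(names);
```

Swapping the two property accesses makes the compiler reject the function, which is the safety the generated typings are meant to provide.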
| 61 | + |
| 62 | +## Typed Tree Cursors |
| 63 | + |
| 64 | +Tree-sitter's `TreeCursor` allows fast traversal of an AST, and has two properties with correlated types: `nodeType` and `currentNode`. |
| 65 | +Once you've checked `nodeType`, it's annoying to have to cast `currentNode` to the corresponding type right afterwards: |
| 66 | +```ts |
| 67 | +if (cursor.nodeType === g.SyntaxType.Function) { |
| 68 | +  let node = cursor.currentNode as g.FunctionNode; // annoying cast |
| 69 | +} |
| 70 | +``` |
| 71 | + |
| 72 | +There's another way, which is handy in large switches: Cast the cursor itself to a `TypedTreeCursor` before switching on `nodeType`. |
| 73 | +Then the guarded use of `currentNode` has the expected type. For example: |
| 74 | +```ts |
| 75 | +function printDeclaredNames() { |
| 76 | +  let cursor = tree.walk(); |
| 77 | +  do { |
| 78 | +    const c = cursor as g.TypedTreeCursor; |
| 79 | +    switch (c.nodeType) { |
| 80 | +      case g.SyntaxType.ClassDeclaration: |
| 81 | +      case g.SyntaxType.FunctionDeclaration: |
| 82 | +      case g.SyntaxType.VariableDeclarator: { |
| 83 | +        let node = c.currentNode; |
| 84 | +        console.log(node.nameNode.text); |
| 85 | +        break; |
| 86 | +      } |
| 87 | +    } |
| 88 | +  } while (gotoPreorderSucc(cursor)); |
| 89 | +} |
| 90 | +``` |
| 91 | +- `node` gets the type `ClassDeclarationNode | FunctionDeclarationNode | VariableDeclaratorNode`. |
| 92 | +- This allows safe access to `node.nameNode`, since each of those types has a `name` field (exposed as `nameNode`). |
| 93 | +- We don't pay the cost of invoking `currentNode` for other types of nodes. |
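The snippet above calls `gotoPreorderSucc`, which is not defined in this README. One plausible way to write such a helper is sketched below, using only the cursor methods `gotoFirstChild`, `gotoNextSibling` and `gotoParent`. The `CursorLike` interface and the toy `ToyCursor` class are assumptions added so the sketch runs without a parser. The helper in the full example may differ.

```typescript
// Minimal subset of the cursor API the helper relies on (an assumption made
// so the sketch does not depend on the tree-sitter package).
interface CursorLike {
  gotoFirstChild(): boolean;
  gotoNextSibling(): boolean;
  gotoParent(): boolean;
}

// Moves the cursor to the next node in pre-order.
// Returns false once the whole tree has been visited.
function gotoPreorderSucc(cursor: CursorLike): boolean {
  if (cursor.gotoFirstChild()) return true;
  while (!cursor.gotoNextSibling()) {
    if (!cursor.gotoParent()) return false;
  }
  return true;
}

// Toy in-memory tree and cursor, used only to demonstrate the helper.
interface ToyNode {
  type: string;
  children: ToyNode[];
}

class ToyCursor implements CursorLike {
  // Path from the root to the current node, with each node's sibling index.
  private path: { node: ToyNode; index: number }[];

  constructor(root: ToyNode) {
    this.path = [{ node: root, index: 0 }];
  }

  get nodeType(): string {
    return this.path[this.path.length - 1].node.type;
  }

  gotoFirstChild(): boolean {
    const current = this.path[this.path.length - 1].node;
    if (current.children.length === 0) return false;
    this.path.push({ node: current.children[0], index: 0 });
    return true;
  }

  gotoNextSibling(): boolean {
    if (this.path.length < 2) return false; // the root has no siblings
    const parent = this.path[this.path.length - 2].node;
    const next = this.path[this.path.length - 1].index + 1;
    if (next >= parent.children.length) return false;
    this.path[this.path.length - 1] = { node: parent.children[next], index: next };
    return true;
  }

  gotoParent(): boolean {
    if (this.path.length < 2) return false;
    this.path.pop();
    return true;
  }
}

const leaf = (type: string): ToyNode => ({ type, children: [] });
const root: ToyNode = {
  type: "program",
  children: [
    { type: "class_declaration", children: [leaf("identifier"), leaf("class_body")] },
    { type: "function_declaration", children: [leaf("identifier")] },
  ],
};

const visited: string[] = [];
const toyCursor = new ToyCursor(root);
do {
  visited.push(toyCursor.nodeType);
} while (gotoPreorderSucc(toyCursor));
console.log(visited.join(" "));
```

Because the helper only moves the cursor, no node objects are created during traversal. A node is only produced when `currentNode` is read inside a matching `case`.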
| 94 | + |
| 95 | +## Troubleshooting |
| 96 | + |
| 97 | +### I get an error about "excessive stack depth" during compilation |
| 98 | + |
| 99 | +This happens if you compare types from the general `tree-sitter.d.ts` file with those from the generated `.d.ts` file. |
| 100 | +Every type from `tree-sitter.d.ts` has a stronger version in the generated file; make sure you don't mix and match. |
| 101 | + |
| 102 | + |
| 103 | +### I get `UnnamedNode` types in places where I don't expect them |
| 104 | + |
| 105 | +This can happen if the grammar contains rules and literals with the same name. For example, this grammar rule |
| 106 | +```js |
| 107 | + func: $ => seq('func', $.name, $.body) |
| 108 | +``` |
| 109 | +will produce a named node with type `func`, while the `'func'` literal will produce an unnamed node with type `func` as well. |
| 110 | + |
| 111 | +This means a check like `node.type === 'func'` is not an exact type check, and the type of `node` will only be restricted to `FuncNode | UnnamedNode<'func'>`. This is _not_ a bug in the generated `.d.ts` file: there really are two kinds of nodes you need to handle after that check. |
| 112 | + |
| 113 | +Some possible solutions are: |
| 114 | +- Change the grammar to avoid rules with the same name as a keyword. |
| 115 | +- Write the check as `node.isNamed && node.type === 'func'`. |
| 116 | +- Change the declared type of `node` from `SyntaxNode` to `NamedNode`. |
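The second workaround can be sketched as follows. The types are simplified stand-ins, and the assumption that the generated `UnnamedNode` declares `isNamed: false` is inferred from the suggestion above rather than taken from the generated file.

```typescript
// Simplified stand-ins for the generated types; the literal `isNamed` types
// are an assumption about how the generated declarations are shaped.
interface FuncNode {
  type: "func";
  isNamed: true;
  nameNode: { text: string };
}

interface NameNode {
  type: "name";
  isNamed: true;
  text: string;
}

interface UnnamedNode<T extends string = string> {
  type: T;
  isNamed: false;
  text: string;
}

type SyntaxNode = FuncNode | NameNode | UnnamedNode<"func">;

function funcName(node: SyntaxNode): string | undefined {
  // `node.type === "func"` alone leaves FuncNode | UnnamedNode<"func">;
  // adding the `isNamed` check narrows `node` to FuncNode only.
  if (node.isNamed && node.type === "func") {
    return node.nameNode.text;
  }
  return undefined;
}

const named: SyntaxNode = { type: "func", isNamed: true, nameNode: { text: "main" } };
const keyword: SyntaxNode = { type: "func", isNamed: false, text: "func" };
console.log(funcName(named), funcName(keyword));
```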