| 1 | +# Changing CPython's grammar |
| 2 | + |
| 3 | +There's more to changing Python's grammar than editing |
| 4 | +[`Grammar/python.gram`](../Grammar/python.gram). |
| 5 | +Below is a checklist of things that may need to change. |
| 6 | + |
| 7 | +> [!NOTE] |
| 8 | +> |
| 9 | +> Many of these changes require re-generating some of the derived |
| 10 | +> files. If things mysteriously don't work, it may help to run |
| 11 | +> ``make clean``. |
| 12 | + |
| 13 | +## Checklist |
| 14 | + |
| 15 | +* [`Grammar/python.gram`](../Grammar/python.gram): The grammar definition, |
| 16 | + with actions that build AST nodes. |
| 17 | + After changing it, run ``make regen-pegen`` (or ``build.bat --regen`` on Windows) |
| 18 | + to regenerate [`Parser/parser.c`](../Parser/parser.c). |
| 19 | + (This runs Python's parser generator, [`Tools/peg_generator`](../Tools/peg_generator).) |
| 20 | + |
| 21 | +* [`Grammar/Tokens`](../Grammar/Tokens) is where new token types are defined. After |
| 22 | + changing it, run ``make regen-token`` to regenerate |
| 23 | + [`Include/internal/pycore_token.h`](../Include/internal/pycore_token.h), |
| 24 | + [`Parser/token.c`](../Parser/token.c), [`Lib/token.py`](../Lib/token.py) |
| 25 | + and [`Doc/library/token-list.inc`](../Doc/library/token-list.inc). |
| 26 | + If you change both ``python.gram`` and ``Tokens``, run ``make regen-token`` |
| 27 | + before ``make regen-pegen``. |
| 28 | + On Windows, ``build.bat --regen`` will regenerate both at the same time. |
| 29 | + |
| 30 | +* [`Parser/Python.asdl`](../Parser/Python.asdl) may need changes to match the grammar. |
| 31 | + Then run ``make regen-ast`` to regenerate |
| 32 | + [`Include/internal/pycore_ast.h`](../Include/internal/pycore_ast.h) and |
| 33 | + [`Python/Python-ast.c`](../Python/Python-ast.c). |
| 34 | + |
| 35 | +* [`Parser/lexer/`](../Parser/lexer/) contains the tokenization code. |
| 36 | + This is where you would add a new type of comment or string literal, for example. |
| 37 | + |
| 38 | +* [`Python/ast.c`](../Python/ast.c) will need changes to validate AST objects |
| 39 | + involved with the grammar change. |
| 40 | + |
| 41 | +* [`Python/ast_unparse.c`](../Python/ast_unparse.c) will need changes to unparse |
| 42 | + AST nodes involved with the grammar change ("unparsing" is used to turn annotations |
| 43 | + into strings per [PEP 563](https://peps.python.org/pep-0563/)). |
| 44 | + |
| 45 | +* The [compiler](compiler.md) may need changes whenever the AST |
| 46 | + changes. |
| 47 | + |
| 48 | +* ``_Unparser`` in the [`Lib/ast.py`](../Lib/ast.py) file may need changes |
| 49 | + to accommodate any modifications in the AST nodes. |
| 50 | + |
| 51 | +* [`Doc/library/ast.rst`](../Doc/library/ast.rst) may need to be updated |
| 52 | + to reflect changes to AST nodes. |
| 53 | + |
| 54 | +* Add some usage of your new syntax to ``Lib/test/test_grammar.py``. |
| 55 | + |
| 56 | +* Certain changes may require tweaks to the library module |
| 57 | + [`pyclbr`](https://docs.python.org/3/library/pyclbr.html#module-pyclbr). |
| 58 | + |
| 59 | +* [`Lib/tokenize.py`](../Lib/tokenize.py) needs changes to match changes |
| 60 | + to the tokenizer. |
| 61 | + |
| 62 | +* Documentation must be written! Specifically, one or more of the pages in |
| 63 | + [`Doc/reference/`](../Doc/reference/) will need to be updated. |
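
After rebuilding, a quick sanity check from the freshly built interpreter can catch an item the checklist missed. The sketch below is illustrative only: `roundtrips` and `exact_token_names` are hypothetical helper names, not part of CPython, and the sample uses existing syntax (the walrus operator) as a stand-in for whatever construct you are adding. It exercises `Lib/tokenize.py` and the `_Unparser` behind `ast.unparse`.

```python
import ast
import io
import token
import tokenize


def roundtrips(source: str) -> bool:
    """Return True if parse -> unparse -> parse yields an equivalent AST."""
    tree = ast.parse(source)
    regenerated = ast.unparse(tree)  # backed by _Unparser in Lib/ast.py
    # ast.dump() omits line/column attributes by default, so formatting
    # differences between the two sources do not affect the comparison.
    return ast.dump(tree) == ast.dump(ast.parse(regenerated))


def exact_token_names(source: str) -> list[str]:
    """Return the exact token type name of every token in `source`."""
    readline = io.StringIO(source).readline
    return [token.tok_name[tok.exact_type]
            for tok in tokenize.generate_tokens(readline)]


# Stand-in sample; replace it with a snippet that uses your new syntax.
sample = "if (n := 10) > 5:\n    pass\n"
print(roundtrips(sample))
print(exact_token_names("(n := 10)\n"))
```

If the round trip fails for your new syntax, `_Unparser` is the likely culprit. An unexpected or missing token name points at `Grammar/Tokens` or `Lib/tokenize.py`.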