|
| 1 | +====================== |
| 2 | +Inline script metadata |
| 3 | +====================== |
| 4 | + |
| 5 | +.. warning:: |
| 6 | + This specification has been **provisionally accepted**. It is subject |
| 7 | + to being changed or abandoned. See the |
| 8 | + `PEP 723 conditional acceptance thread <pep723-thread_>`_ for details. |
| 9 | + |
| 10 | +.. _pep723-thread: https://discuss.python.org/t/36763 |
| 11 | + |
| 12 | +This specification defines a metadata format that can be embedded in single-file |
| 13 | +Python scripts to assist launchers, IDEs and other external tools which may need |
| 14 | +to interact with such scripts. |
| 15 | + |
| 16 | + |
| 17 | +Specification |
| 18 | +============= |
| 19 | + |
| 20 | +This specification defines a metadata comment block format (loosely inspired by |
| 21 | +`reStructuredText Directives`__). |
| 22 | + |
| 23 | +__ https://docutils.sourceforge.io/docs/ref/rst/directives.html |
| 24 | + |
| 25 | +Any Python script may have top-level comment blocks that MUST start with the |
| 26 | +line ``# /// TYPE`` where ``TYPE`` determines how to process the content. That |
| 27 | +is: a single ``#``, followed by a single space, followed by three forward |
| 28 | +slashes, followed by a single space, followed by the type of metadata. The block |
| 29 | +MUST end with the line ``# ///``. That is: a single ``#``, followed by a single |
| 30 | +space, followed by three forward slashes. The ``TYPE`` MUST consist only of |
| 31 | +ASCII letters, numbers and hyphens. |
| 32 | + |
| 33 | +Every line between these two lines (``# /// TYPE`` and ``# ///``) MUST be a |
| 34 | +comment starting with ``#``. If there are characters after the ``#`` then the |
| 35 | +first character MUST be a space. The embedded content is formed by taking away |
| 36 | +the first two characters of each line if the second character is a space, |
| 37 | +otherwise just the first character (which means the line consists of only a |
| 38 | +single ``#``). |
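As a minimal sketch of the stripping rule above, the embedded content could be recovered from the comment lines as follows. The helper name ``extract_content`` is hypothetical and not part of this specification; it assumes the lines between the start and end lines are passed without their line terminators.

```python
def extract_content(comment_lines: list[str]) -> str:
    """Strip the comment prefix from each line of a metadata block body."""
    stripped = []
    for line in comment_lines:
        if not line.startswith("#"):
            raise ValueError(f"not a comment line: {line!r}")
        if len(line) > 1 and line[1] != " ":
            raise ValueError(f"character after '#' must be a space: {line!r}")
        # Drop "# " when a space follows the "#", otherwise drop the lone "#".
        stripped.append(line[2:] if len(line) > 1 else line[1:])
    return "\n".join(stripped)
```

A line consisting of a lone ``#`` therefore becomes an empty line in the embedded content.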
| 39 | + |
| 40 | +Precedence for an ending line ``# ///`` is given when the next line is not |
| 41 | +a valid embedded content line as described above. For example, the following |
| 42 | +is a single fully valid block: |
| 43 | + |
| 44 | +.. code:: python |
| 45 | +
|
| 46 | + # /// some-toml |
| 47 | + # embedded-csharp = """ |
| 48 | + # /// <summary> |
| 49 | + # /// text |
| 50 | + # /// |
| 51 | + # /// </summary> |
| 52 | + # public class MyClass { } |
| 53 | + # """ |
| 54 | + # /// |
| 55 | +
|
| 56 | +A starting line MUST NOT be placed between another starting line and its ending |
| 57 | +line. In such cases tools MAY produce an error. Unclosed blocks MUST be ignored. |
| 58 | + |
| 59 | +When there are multiple comment blocks of the same ``TYPE`` defined, tools MUST |
| 60 | +produce an error. |
| 61 | + |
| 62 | +Tools reading embedded metadata MAY respect the standard Python encoding |
| 63 | +declaration. If they choose not to do so, they MUST process the file as UTF-8. |
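One way a tool might implement this choice is sketched below. The function name ``read_script_text`` and its ``honor_declaration`` parameter are hypothetical; the sketch relies on the standard library's ``tokenize.open``, which detects the encoding from a BOM or coding declaration and otherwise defaults to UTF-8.

```python
import tokenize
from pathlib import Path


def read_script_text(path: str, honor_declaration: bool = True) -> str:
    """Return the text of a script, decoded per the rule above."""
    if honor_declaration:
        # tokenize.open() honors a BOM or a "coding:" declaration in the
        # first two lines of the file and falls back to UTF-8.
        with tokenize.open(path) as f:
            return f.read()
    # Tools that ignore the declaration must treat the file as UTF-8.
    return Path(path).read_text(encoding="utf-8")
```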
| 64 | + |
| 65 | +This is the canonical regular expression that MAY be used to parse the |
| 66 | +metadata: |
| 67 | + |
| 68 | +.. code:: text |
| 69 | +
|
| 70 | + (?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$ |
| 71 | +
|
| 72 | +In circumstances where there is a discrepancy between the text specification |
| 73 | +and the regular expression, the text specification takes precedence. |
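To illustrate how the regular expression treats an inner ``# ///`` line, the following sketch applies it to the nested example shown earlier. The ``some-toml`` type is used purely for illustration and is not a standardized type, so a real tool would not read such a block; the ``SCRIPT`` constant and the variable names are assumptions of this sketch.

```python
import re

REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$'

# The nested example from above, followed by ordinary code.
SCRIPT = '''\
# /// some-toml
# embedded-csharp = """
# /// <summary>
# /// text
# ///
# /// </summary>
# public class MyClass { }
# """
# ///

print("hello")
'''

matches = list(re.finditer(REGEX, SCRIPT))
block_type = matches[0].group('type')
# The greedy content group consumes the inner "# ///" line as content;
# only the last "# ///" of the comment run closes the block.
content_lines = matches[0].group('content').splitlines()
```

Here the whole comment run is found as a single block of type ``some-toml``, and the inner ``# ///`` line is part of its content rather than an ending line.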
| 74 | + |
| 75 | +Tools MUST NOT read from metadata blocks with types that have not been |
| 76 | +standardized by this specification or future ones. |
| 77 | + |
| 78 | +pyproject type |
| 79 | +-------------- |
| 80 | + |
| 81 | +The first type of metadata block is named ``pyproject``, which represents |
| 82 | +content similar to what one would see in a ``pyproject.toml`` file. |
| 83 | + |
| 84 | +The embedded TOML document MAY include the ``[run]`` and ``[tool]`` tables. |
| 85 | + |
| 86 | +The :ref:`tool table <pyproject-tool-table>` MAY be used by any tool, |
| 87 | +script runner or otherwise, to configure behavior. |
| 88 | + |
| 89 | +The ``[run]`` table MAY include the following optional fields: |
| 90 | + |
| 91 | +* ``dependencies``: A list of strings that specifies the runtime dependencies |
| 92 | + of the script. Each entry MUST be a valid |
| 93 | + :ref:`dependency specifier <dependency-specifiers>`. |
| 94 | +* ``requires-python``: A string that specifies the Python version(s) with which |
| 95 | + the script is compatible. The value of this field MUST be a valid |
| 96 | + :ref:`version specifier <version-specifiers>`. |
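As a rough sketch, a tool might check the shape of the ``[run]`` table before acting on it. The function name ``check_run_table`` is hypothetical, and the sketch assumes the embedded content has already been parsed into a dictionary (for example with ``tomllib``). It only checks value types; verifying that each entry is a valid dependency or version specifier is left out and would need a dedicated parser.

```python
def check_run_table(document: dict) -> list[str]:
    """Return a list of problems found in the optional [run] table."""
    problems = []
    run = document.get("run", {})
    if not isinstance(run, dict):
        return ["run must be a table"]

    # Both fields are optional, so missing keys are not reported.
    deps = run.get("dependencies", [])
    if not isinstance(deps, list) or not all(isinstance(d, str) for d in deps):
        problems.append("run.dependencies must be a list of strings")

    requires = run.get("requires-python", "")
    if not isinstance(requires, str):
        problems.append("run.requires-python must be a string")

    return problems
```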
| 97 | + |
| 98 | +Any future specifications that define additional fields for the ``[run]`` table |
| 99 | +when used in a ``pyproject.toml`` file MUST include the aforementioned fields |
| 100 | +exactly as specified. The fields defined by this specification are equally |
| 101 | +applicable to full-fledged projects and to single-file scripts. |
| 102 | + |
| 103 | +Script runners MUST error if the specified ``dependencies`` cannot be provided. |
| 104 | +Script runners SHOULD error if no version of Python that satisfies the specified |
| 105 | +``requires-python`` can be provided. |
| 106 | + |
| 107 | +Example |
| 108 | +------- |
| 109 | + |
| 110 | +The following is an example of a script with an embedded ``pyproject.toml``: |
| 111 | + |
| 112 | +.. code:: python |
| 113 | +
|
| 114 | + # /// pyproject |
| 115 | + # [run] |
| 116 | + # requires-python = ">=3.11" |
| 117 | + # dependencies = [ |
| 118 | + # "requests<3", |
| 119 | + # "rich", |
| 120 | + # ] |
| 121 | + # /// |
| 122 | +
|
| 123 | + import requests |
| 124 | + from rich.pretty import pprint |
| 125 | +
|
| 126 | + resp = requests.get("https://peps.python.org/api/peps.json") |
| 127 | + data = resp.json() |
| 128 | + pprint([(k, v["title"]) for k, v in data.items()][:10]) |
| 129 | +
|
| 130 | +The following is an example of a proposed syntax for single-file Rust |
| 131 | +projects that embeds their equivalent of ``pyproject.toml``, which is called |
| 132 | +``Cargo.toml``: |
| 133 | + |
| 134 | +.. code:: rust |
| 135 | +
|
| 136 | + #!/usr/bin/env cargo |
| 137 | +
|
| 138 | + //! ```cargo |
| 139 | + //! [dependencies] |
| 140 | + //! regex = "1.8.0" |
| 141 | + //! ``` |
| 142 | +
|
| 143 | + fn main() { |
| 144 | + let re = Regex::new(r"^\d{4}-\d{2}-\d{2}$").unwrap(); |
| 145 | + println!("Did our date match? {}", re.is_match("2014-01-01")); |
| 146 | + } |
| 147 | +
|
| 148 | +Reference Implementation |
| 149 | +======================== |
| 150 | + |
| 151 | +The following is an example of how to read the metadata on Python 3.11 or |
| 152 | +higher. |
| 153 | + |
| 154 | +.. code:: python |
| 155 | +
|
| 156 | + import re |
| 157 | + import tomllib |
| 158 | +
|
| 159 | + REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$' |
| 160 | +
|
| 161 | + def read(script: str) -> dict | None: |
| 162 | +     name = 'pyproject' |
| 163 | +     matches = list( |
| 164 | +         filter(lambda m: m.group('type') == name, re.finditer(REGEX, script)) |
| 165 | +     ) |
| 166 | +     if len(matches) > 1: |
| 167 | +         raise ValueError(f'Multiple {name} blocks found') |
| 168 | +     elif len(matches) == 1: |
| 169 | +         content = ''.join( |
| 170 | +             line[2:] if line.startswith('# ') else line[1:] |
| 171 | +             for line in matches[0].group('content').splitlines(keepends=True) |
| 172 | +         ) |
| 173 | +         return tomllib.loads(content) |
| 174 | +     else: |
| 175 | +         return None |
| 176 | +
|
| 177 | +Tools such as package managers or dependency update automation in CI will |
| 178 | +often edit dependencies. The following is a crude example of modifying the content |
| 179 | +using the ``tomlkit`` library__. |
| 180 | + |
| 181 | +__ https://tomlkit.readthedocs.io/en/latest/ |
| 182 | + |
| 183 | +.. code:: python |
| 184 | +
|
| 185 | + import re |
| 186 | +
|
| 187 | + import tomlkit |
| 188 | +
|
| 189 | + REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$' |
| 190 | +
|
| 191 | + def add(script: str, dependency: str) -> str: |
| 192 | +     match = re.search(REGEX, script) |
| 193 | +     content = ''.join( |
| 194 | +         line[2:] if line.startswith('# ') else line[1:] |
| 195 | +         for line in match.group('content').splitlines(keepends=True) |
| 196 | +     ) |
| 197 | +
|
| 198 | +     config = tomlkit.parse(content) |
| 199 | +     config['run']['dependencies'].append(dependency) |
| 200 | +     new_content = ''.join( |
| 201 | +         f'# {line}' if line.strip() else f'#{line}' |
| 202 | +         for line in tomlkit.dumps(config).splitlines(keepends=True) |
| 203 | +     ) |
| 204 | +
|
| 205 | +     start, end = match.span('content') |
| 206 | +     return script[:start] + new_content + script[end:] |
| 207 | +
|
| 208 | +Note that this example used a library that preserves TOML formatting. This is |
| 209 | +not a requirement for editing by any means but rather is a "nice to have" |
| 210 | +feature. |
| 211 | + |
| 212 | +The following is an example of how to read a stream of arbitrary metadata |
| 213 | +blocks. |
| 214 | + |
| 215 | +.. code:: python |
| 216 | +
|
| 217 | + import re |
| 218 | + from typing import Iterator |
| 219 | +
|
| 220 | + REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$' |
| 221 | +
|
| 222 | + def stream(script: str) -> Iterator[tuple[str, str]]: |
| 223 | +     for match in re.finditer(REGEX, script): |
| 224 | +         yield match.group('type'), ''.join( |
| 225 | +             line[2:] if line.startswith('# ') else line[1:] |
| 226 | +             for line in match.group('content').splitlines(keepends=True) |
| 227 | +         ) |
| 228 | +
|
| 229 | +
|
| 230 | +Recommendations |
| 231 | +=============== |
| 232 | + |
| 233 | +Tools that support managing different versions of Python should attempt to use |
| 234 | +the highest available version of Python that is compatible with the script's |
| 235 | +``requires-python`` metadata, if defined. |
| 236 | + |
| 237 | + |
| 238 | +History |
| 239 | +======= |
| 240 | + |
| 241 | +This specification was originally defined as :pep:`723`. |