Merge pull request #1388 from jeanas/pep723

willingc · web-flow · commit a081a119fadc · 2023-11-14T17:43:12.000Z
Import PEP 723 specification
diff --git a/source/specifications/declaring-build-dependencies.rst b/source/specifications/declaring-build-dependencies.rst
@@ -50,6 +50,8 @@ should consider it an error.
 
 .. TODO: move elsewhere
 
+.. _pyproject-tool-table:
+
 tool table
 ----------
 
diff --git a/source/specifications/index.rst b/source/specifications/index.rst
@@ -21,6 +21,7 @@ Package Distribution Metadata
    dependency-specifiers
    declaring-build-dependencies
    declaring-project-metadata
+   inline-script-metadata
    platform-compatibility-tags
 
 Package Installation Environment Metadata
diff --git a/source/specifications/inline-script-metadata.rst b/source/specifications/inline-script-metadata.rst
@@ -0,0 +1,241 @@
+======================
+Inline script metadata
+======================
+
+.. warning::
+   This specification has been **provisionally accepted**. It is subject
+   to being changed or abandoned. See the
+   `PEP 723 conditional acceptance thread <pep723-thread_>`_ for details.
+
+.. _pep723-thread: https://discuss.python.org/t/36763
+
+This specification defines a metadata format that can be embedded in single-file
+Python scripts to assist launchers, IDEs and other external tools which may need
+to interact with such scripts.
+
+
+Specification
+=============
+
+This specification defines a metadata comment block format (loosely inspired by
+`reStructuredText Directives`__).
+
+__ https://docutils.sourceforge.io/docs/ref/rst/directives.html
+
+Any Python script may have top-level comment blocks that MUST start with the
+line ``# /// TYPE`` where ``TYPE`` determines how to process the content. That
+is: a single ``#``, followed by a single space, followed by three forward
+slashes, followed by a single space, followed by the type of metadata. The block
+MUST end with the line ``# ///``. That is: a single ``#``, followed by a single
+space, followed by three forward slashes. The ``TYPE`` MUST only consist of
+ASCII letters, numbers and hyphens.
+
+Every line between these two lines (``# /// TYPE`` and ``# ///``) MUST be a
+comment starting with ``#``. If there are characters after the ``#`` then the
+first character MUST be a space. The embedded content is formed by taking away
+the first two characters of each line if the second character is a space,
+otherwise just the first character (which means the line consists of only a
+single ``#``).
+
+Precedence for an ending line ``# ///`` is given when the next line is not
+a valid embedded content line as described above. For example, the following
+is a single fully valid block:
+
+.. code:: python
+
+    # /// some-toml
+    # embedded-csharp = """
+    # /// <summary>
+    # /// text
+    # ///
+    # /// </summary>
+    # public class MyClass { }
+    # """
+    # ///
+
+A starting line MUST NOT be placed between another starting line and its ending
+line. In such cases tools MAY produce an error. Unclosed blocks MUST be ignored.
+
+When there are multiple comment blocks of the same ``TYPE`` defined, tools MUST
+produce an error.
+
+Tools reading embedded metadata MAY respect the standard Python encoding
+declaration. If they choose not to do so, they MUST process the file as UTF-8.
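+
+As a non-normative sketch of the two options above, the following helper reads
+a script either by honoring its encoding declaration or by decoding it strictly
+as UTF-8. The function name ``read_script_text`` and its ``honor_declaration``
+parameter are hypothetical and are not defined by this specification; the
+standard library's ``tokenize.open`` is used here as one way to detect a
+declared encoding.
+
+.. code:: python
+
+    import tokenize
+    from pathlib import Path
+
+    def read_script_text(path: str, honor_declaration: bool = True) -> str:
+        """Return the text of the script at ``path``."""
+        if honor_declaration:
+            # tokenize.open() detects a coding declaration or BOM and
+            # falls back to UTF-8 when neither is present.
+            with tokenize.open(path) as f:
+                return f.read()
+        # Otherwise the file is processed as UTF-8, as required above.
+        return Path(path).read_text(encoding='utf-8')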
+
+This is the canonical regular expression that MAY be used to parse the
+metadata:
+
+.. code:: text
+
+    (?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$
+
+In circumstances where there is a discrepancy between the text specification
+and the regular expression, the text specification takes precedence.
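+
+As a non-normative illustration, the following sketch applies the regular
+expression above to the ``some-toml`` example shown earlier. The names
+``SAMPLE`` and ``find_blocks`` are hypothetical and exist only for this
+illustration. Because the content group is greedy, the ``# ///`` line inside
+the block is treated as content and only the final ``# ///`` ends the block.
+
+.. code:: python
+
+    import re
+
+    REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$'
+
+    # The single-block example from earlier, written as a plain string.
+    SAMPLE = '\n'.join([
+        '# /// some-toml',
+        '# embedded-csharp = """',
+        '# /// <summary>',
+        '# /// text',
+        '# ///',
+        '# /// </summary>',
+        '# public class MyClass { }',
+        '# """',
+        '# ///',
+        '',
+    ])
+
+    def find_blocks(script: str) -> list[tuple[str, str]]:
+        """Return (type, still-commented content) pairs for each block."""
+        return [
+            (m.group('type'), m.group('content'))
+            for m in re.finditer(REGEX, script)
+        ]
+
+    blocks = find_blocks(SAMPLE)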
+
+Tools MUST NOT read from metadata blocks with types that have not been
+standardized by this specification or future ones.
+
+pyproject type
+--------------
+
+The first type of metadata block is named ``pyproject`` which represents
+content similar to what one would see in a ``pyproject.toml`` file.
+
+This document MAY include the ``[run]`` and ``[tool]`` tables.
+
+The :ref:`tool table <pyproject-tool-table>` MAY be used by any tool,
+script runner or otherwise, to configure behavior.
+
+The ``[run]`` table MAY include the following optional fields:
+
+* ``dependencies``: A list of strings that specifies the runtime dependencies
+  of the script. Each entry MUST be a valid
+  :ref:`dependency specifier <dependency-specifiers>`.
+* ``requires-python``: A string that specifies the Python version(s) with which
+  the script is compatible. The value of this field MUST be a valid
+  :ref:`version specifier <version-specifiers>`.
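+
+As a non-normative sketch, a tool might check the shape of these fields after
+parsing the TOML content. The function name ``check_run_table`` is hypothetical.
+This sketch only checks TOML value types; confirming that each string is a valid
+dependency or version specifier would need a dedicated parser (for example the
+third-party ``packaging`` library, which is an assumption and not used here).
+
+.. code:: python
+
+    def check_run_table(metadata: dict) -> list[str]:
+        """Return a list of problems found in the optional ``[run]`` table."""
+        run = metadata.get('run', {})
+        if not isinstance(run, dict):
+            return ['run must be a table']
+
+        problems = []
+        # Both fields are optional, so missing keys are not reported.
+        deps = run.get('dependencies', [])
+        if not (isinstance(deps, list) and all(isinstance(d, str) for d in deps)):
+            problems.append('run.dependencies must be a list of strings')
+        requires = run.get('requires-python', '')
+        if not isinstance(requires, str):
+            problems.append('run.requires-python must be a string')
+        return problems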
+
+Any future specifications that define additional fields for the ``[run]`` table
+when used in a ``pyproject.toml`` file MUST include the aforementioned fields
+exactly as specified. The fields defined by this specification are as
+applicable to full-fledged projects as they are to single-file scripts.
+
+Script runners MUST error if the specified ``dependencies`` cannot be provided.
+Script runners SHOULD error if no version of Python that satisfies the specified
+``requires-python`` can be provided.
+
+Example
+-------
+
+The following is an example of a script with an embedded ``pyproject.toml``:
+
+.. code:: python
+
+    # /// pyproject
+    # [run]
+    # requires-python = ">=3.11"
+    # dependencies = [
+    #   "requests<3",
+    #   "rich",
+    # ]
+    # ///
+
+    import requests
+    from rich.pretty import pprint
+
+    resp = requests.get("https://peps.python.org/api/peps.json")
+    data = resp.json()
+    pprint([(k, v["title"]) for k, v in data.items()][:10])
+
+The following is an example of a proposed syntax for single-file Rust
+projects that embeds their equivalent of ``pyproject.toml``, which is called
+``Cargo.toml``:
+
+.. code:: rust
+
+    #!/usr/bin/env cargo
+
+    //! ```cargo
+    //! [dependencies]
+    //! regex = "1.8.0"
+    //! ```
+
+    use regex::Regex;
+
+    fn main() {
+        let re = Regex::new(r"^\d{4}-\d{2}-\d{2}$").unwrap();
+        println!("Did our date match? {}", re.is_match("2014-01-01"));
+    }
+
+Reference Implementation
+========================
+
+The following is an example of how to read the metadata on Python 3.11 or
+higher.
+
+.. code:: python
+
+   import re
+   import tomllib
+
+   REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$'
+
+   def read(script: str) -> dict | None:
+       name = 'pyproject'
+       matches = list(
+           filter(lambda m: m.group('type') == name, re.finditer(REGEX, script))
+       )
+       if len(matches) > 1:
+           raise ValueError(f'Multiple {name} blocks found')
+       elif len(matches) == 1:
+           content = ''.join(
+               line[2:] if line.startswith('# ') else line[1:]
+               for line in matches[0].group('content').splitlines(keepends=True)
+           )
+           return tomllib.loads(content)
+       else:
+           return None
+
+Tools such as package managers or dependency update automation in CI will often
+edit dependencies. The following is a crude example of modifying the content
+using the ``tomlkit`` library__.
+
+__ https://tomlkit.readthedocs.io/en/latest/
+
+.. code:: python
+
+   import re
+
+   import tomlkit
+
+   REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$'
+
+   def add(script: str, dependency: str) -> str:
+       match = re.search(REGEX, script)
+       content = ''.join(
+           line[2:] if line.startswith('# ') else line[1:]
+           for line in match.group('content').splitlines(keepends=True)
+       )
+
+       config = tomlkit.parse(content)
+       config['run']['dependencies'].append(dependency)
+       new_content = ''.join(
+           f'# {line}' if line.strip() else f'#{line}'
+           for line in tomlkit.dumps(config).splitlines(keepends=True)
+       )
+
+       start, end = match.span('content')
+       return script[:start] + new_content + script[end:]
+
+Note that this example uses a library that preserves TOML formatting. This is
+not a requirement for editing by any means but rather is a "nice to have"
+feature.
+
+The following is an example of how to read a stream of arbitrary metadata
+blocks.
+
+.. code:: python
+
+   import re
+   from typing import Iterator
+
+   REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$'
+
+   def stream(script: str) -> Iterator[tuple[str, str]]:
+       for match in re.finditer(REGEX, script):
+           yield match.group('type'), ''.join(
+               line[2:] if line.startswith('# ') else line[1:]
+               for line in match.group('content').splitlines(keepends=True)
+           )
+
+
+Recommendations
+===============
+
+Tools that support managing different versions of Python should attempt to use
+the highest available version of Python that is compatible with the script's
+``requires-python`` metadata, if defined.
+
+
+History
+=======
+
+This specification was originally defined as :pep:`723`.