bpo-39452: rewrite and expand __main__.rst #26883

Merged
merged 51 commits into from
Aug 24, 2021
Changes from 49 commits
02e9edf
__main__docs: intro and first secton
jdevries3133 Jun 21, 2021
c95f69b
bpo-44494: rewrite of Doc/library/__main__.rst (first draft)
jdevries3133 Jun 23, 2021
235e866
bpo-44494: add blurb
jdevries3133 Jun 23, 2021
a292ab6
Update __main__.rst
geryogam Jun 30, 2019
d7a1999
Update __main__.rst
geryogam Jun 30, 2019
8398f08
Take Steven d’Aprano’s review into account
geryogam Sep 16, 2020
1fcb2af
Rewrap lines
geryogam Sep 16, 2020
2bde063
Remove trailing whitespaces
geryogam Sep 17, 2020
d29fd2a
bpo-39452: rewrite and expansion of __main__.rst
jdevries3133 Jun 23, 2021
4c60f2c
mention runpy
jdevries3133 Jun 29, 2021
2b5f710
add "design patterns" section, fix section title hierarchies
jdevries3133 Jun 29, 2021
7e495d7
add sentence about console_scripts
jdevries3133 Jun 29, 2021
56afeaa
misc formatting; change "Design Patterns" to "Idiomatic Usage"
jdevries3133 Jun 30, 2021
2a398ef
add section about sys.exit(main()) convention
jdevries3133 Jun 30, 2021
1a9956c
make last paragraph 'idiomatic usage'; add comment about maybe deleting
jdevries3133 Jun 30, 2021
c4b5cea
fix linting error (default context used in comment)
jdevries3133 Jun 30, 2021
1f012b4
revise example so that main() does not take arguments
jdevries3133 Jun 30, 2021
f095362
add console_scripts section, remove bad old example
jdevries3133 Jun 30, 2021
c42b706
minor proofreading changes
jdevries3133 Jun 30, 2021
7fe7f1c
implement changes suggest by @merwok
jdevries3133 Jun 30, 2021
c063da1
fix: typos
jdevries3133 Jun 30, 2021
7c6b451
fix wording, slim down example, add reference to relative import docs
jdevries3133 Jul 7, 2021
06bcb09
respond to review from @pradyunsg
jdevries3133 Jul 21, 2021
80756b3
Merge remote-tracking branch 'upstream/main' into bpo-39452__main__docs
jdevries3133 Jul 31, 2021
647c471
add `import __main__` section
jdevries3133 Aug 1, 2021
457bbc9
revisions and proofreading
jdevries3133 Aug 1, 2021
3d9b3b9
eliminate opinionated section about idiomatic usage of `__main__.py`
jdevries3133 Aug 1, 2021
dd68513
proofread `__main__.py` section
jdevries3133 Aug 1, 2021
6ee7090
Merge branch 'main' of github.com:python/cpython into bpo-39452__main…
jdevries3133 Aug 4, 2021
757b03a
incorporate suggested changes from @Fidget-Spinner
jdevries3133 Aug 10, 2021
8e86468
incorporate suggested changes from @yaseppochi
jdevries3133 Aug 10, 2021
f33a081
fix formatting
jdevries3133 Aug 10, 2021
14bad85
name equals main
jdevries3133 Aug 12, 2021
d150674
Merge branch 'main' of github.com:python/cpython into bpo-39452__main…
jdevries3133 Aug 12, 2021
7b987cb
Merge branch 'bpo-39452__main__docs' of github.com:jdevries3133/cpyth…
jdevries3133 Aug 12, 2021
b9db705
also change reference to name equals main section
jdevries3133 Aug 12, 2021
168c774
implement feedback from @holdenweb, python-dev, and @merwork
jdevries3133 Aug 12, 2021
c450171
fix trailing whitespace
jdevries3133 Aug 12, 2021
077e7a4
Remove .bak file
ambv Aug 24, 2021
eb42489
Thorough editing pass
ambv Aug 24, 2021
7c61e78
Move `__main__.py` section above the `import __main__` section
ambv Aug 24, 2021
45a9425
s/command line/command-line/
ambv Aug 24, 2021
073e9d7
Mention asyncio.__main__
ambv Aug 24, 2021
ccb9004
Restore proper document name
ambv Aug 24, 2021
f8630fa
Use proper .. seealso:: sections.
ambv Aug 24, 2021
1e86e02
Appease `make suspicious`
ambv Aug 24, 2021
9c87442
Spell out examples of top-level code environments
ambv Aug 24, 2021
0d4fc8a
Appease double dot alignment aesthetics
ambv Aug 24, 2021
4e51333
Replace lies with truth
ambv Aug 24, 2021
6f0f82c
Improve flow introducing what "top-level code environment" is
ambv Aug 24, 2021
46e7668
Use module names consistently in example
ambv Aug 24, 2021
372 changes: 355 additions & 17 deletions Doc/library/__main__.rst
@@ -1,25 +1,363 @@

:mod:`__main__` --- Top-level code environment
==============================================

.. module:: __main__
   :synopsis: The environment where top-level code is run. Covers command-line
              interfaces, import-time behavior, and ``__name__ == '__main__'``.

--------------

In Python, the special name ``__main__`` is used for two important constructs:

1. the name of the top-level environment of the program, which can be
checked using the ``__name__ == '__main__'`` expression; and
2. the ``__main__.py`` file in Python packages.

Both of these mechanisms are related to Python modules: how users interact with
them and how they interact with each other. They are explained in detail
below. If you're new to Python modules, see the tutorial section
:ref:`tut-modules` for an introduction.


.. _name_equals_main:

``__name__ == '__main__'``
---------------------------

When a Python module or package is imported, ``__name__`` is set to the
module's name. Usually, this is the name of the Python file itself without the
``.py`` extension::

   >>> import configparser
   >>> configparser.__name__
   'configparser'

If the file is part of a package, ``__name__`` will also include the parent
package's path::

   >>> from concurrent.futures import process
   >>> process.__name__
   'concurrent.futures.process'

In some circumstances, ``__name__`` is set to the string ``'__main__'``.
``__main__`` is the name of the environment where top-level code is run.
"Top-level code" is the first user-specified Python module that starts running.
It's "top-level" because it imports all other modules that the program needs.
Sometimes "top-level code" is called an *entry point* to the application.

The top-level code environment can be:

* the scope of an interactive prompt::

   >>> __name__
   '__main__'

* the Python module passed to the Python interpreter as a file argument:

   .. code-block:: shell-session

      $ python3 helloworld.py
      Hello, world!

* the Python module or package passed to the Python interpreter with the
  :option:`-m` argument:

   .. code-block:: shell-session

      $ python3 -m tarfile
      usage: tarfile.py [-h] [-v] (...)

* Python code read by the Python interpreter from standard input:

   .. code-block:: shell-session

      $ echo "import this" | python3
      The Zen of Python, by Tim Peters

      Beautiful is better than ugly.
      Explicit is better than implicit.
      ...

* Python code passed to the Python interpreter with the :option:`-c` argument:

   .. code-block:: shell-session

      $ python3 -c "import this"
      The Zen of Python, by Tim Peters

      Beautiful is better than ugly.
      Explicit is better than implicit.
      ...

In each of these situations, the top-level module's ``__name__`` is set to
``'__main__'``.
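
As a rough illustration of the cases above, the sketch below runs one file
under two different module names with :func:`runpy.run_path`. The file name
``whoami.py`` and the variable ``observed_name`` are made up for this example,
and passing a ``run_name`` other than ``'__main__'`` only approximates what a
regular import does.

.. code-block:: python

   import runpy
   import tempfile
   from pathlib import Path

   # Hypothetical module body: it only records the name it was given.
   SOURCE = "observed_name = __name__\n"

   def observe_names() -> tuple[str, str]:
       """Run the same file under two different module names."""
       with tempfile.TemporaryDirectory() as tmp:
           path = Path(tmp) / "whoami.py"
           path.write_text(SOURCE)
           # run_name='__main__' mimics ``python3 whoami.py``.
           as_script = runpy.run_path(str(path), run_name="__main__")
           # Any other run_name stands in for an ordinary import here.
           as_import = runpy.run_path(str(path), run_name="whoami")
       return as_script["observed_name"], as_import["observed_name"]

   print(observe_names())

The same source file sees ``'__main__'`` in the first run and ``'whoami'`` in
the second, which is the difference the ``__name__`` check relies on.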

As a result, a module can discover whether or not it is running in the
top-level environment by checking its own ``__name__``, which allows a common
idiom for conditionally executing code when the module is not initialized from
an import statement::

   if __name__ == '__main__':
       # Execute when the module is not initialized from an import statement.
       ...

.. seealso::

   For a more detailed look at how ``__name__`` is set in all situations, see
   the tutorial section :ref:`tut-modules`.


Idiomatic Usage
^^^^^^^^^^^^^^^

Some modules contain code that is intended for script use only, like parsing
command-line arguments or fetching data from standard input. When a module
like this is imported from a different module, for example to unit test
it, the script code unintentionally executes as well.

This is where using the ``if __name__ == '__main__'`` code block comes in
handy. Code within this block won't run unless the module is executed in the
top-level environment.

Putting as few statements as possible in the block below ``if __name__ ==
'__main__'`` can improve code clarity and correctness. Most often, a function
named ``main`` encapsulates the program's primary behavior::

   # echo.py

   import shlex
   import sys

   def echo(phrase: str) -> None:
      """A dummy wrapper around print."""
      # for demonstration purposes, you can imagine that there is some
      # valuable and reusable logic inside this function
      print(phrase)

   def main() -> int:
       """Echo the input arguments to standard output"""
       phrase = shlex.join(sys.argv)
       echo(phrase)
       return 0

   if __name__ == '__main__':
       sys.exit(main())  # next section explains the use of sys.exit

Note that if the module didn't encapsulate code inside the ``main`` function
but instead put it directly within the ``if __name__ == '__main__'`` block,
the ``phrase`` variable would be global to the entire module. This is
error-prone as other functions within the module could be unintentionally using
the global variable instead of a local name. A ``main`` function solves this
problem.

Using a ``main`` function has the added benefit of the ``echo`` function itself
being isolated and importable elsewhere. When ``echo.py`` is imported, the
``echo`` and ``main`` functions will be defined, but neither of them will be
called, because ``__name__ != '__main__'``.


Packaging Considerations
^^^^^^^^^^^^^^^^^^^^^^^^

``main`` functions are often used to create command-line tools by specifying
them as entry points for console scripts. When this is done,
`pip <https://pip.pypa.io/>`_ inserts the function call into a template script,
where the return value of ``main`` is passed into :func:`sys.exit`.
For example::

   sys.exit(main())

Since the call to ``main`` is wrapped in :func:`sys.exit`, the expectation is
that your function will return some value acceptable as an input to
:func:`sys.exit`; typically, an integer or ``None`` (which is implicitly
returned if your function does not have a return statement).

By proactively following this convention ourselves, our module will have the
same behavior when run directly (i.e. ``python3 echo.py``) as it will have if
we later package it as a console script entry-point in a pip-installable
package.

In particular, be careful about returning strings from your ``main`` function.
:func:`sys.exit` will interpret a string argument as a failure message, so
your program will have an exit code of ``1``, indicating failure, and the
string will be written to :data:`sys.stderr`. The ``echo.py`` example from
earlier exemplifies using the ``sys.exit(main())`` convention.
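
The exit-status behavior described above can be observed with a small sketch
that runs one-line programs in a child interpreter. The helper name
``run_snippet`` and the message text are made up for this example; only
:func:`subprocess.run` and :func:`sys.exit` come from the standard library.

.. code-block:: python

   import subprocess
   import sys

   def run_snippet(code: str) -> subprocess.CompletedProcess:
       """Run ``code`` in a fresh interpreter and capture its streams."""
       return subprocess.run(
           [sys.executable, "-c", code],
           capture_output=True,
           text=True,
       )

   # Stands in for a main() that returns a string: the string goes to
   # stderr and the exit status is 1.
   failed = run_snippet("import sys; sys.exit('something went wrong')")

   # Stands in for a main() without a return statement (returns None):
   # the exit status is 0.
   succeeded = run_snippet("import sys; sys.exit(None)")

   print(failed.returncode, failed.stderr.strip(), succeeded.returncode)

Returning an integer from ``main`` keeps the meaning of the exit status
explicit and avoids the accidental failure shown in the first call.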

.. seealso::

   `Python Packaging User Guide <https://packaging.python.org/>`_
   contains a collection of tutorials and references on how to distribute and
   install Python packages with modern tools.


``__main__.py`` in Python Packages
----------------------------------

If you are not familiar with Python packages, see section :ref:`tut-packages`
of the tutorial. Most commonly, the ``__main__.py`` file is used to provide
a command-line interface for a package. Consider the following hypothetical
package, "bandclass":

.. code-block:: text

   bandclass
     ├── __init__.py
     ├── __main__.py
     └── student.py

``__main__.py`` will be executed when the package itself is invoked
directly from the command line using the :option:`-m` flag. For example:

.. code-block:: shell-session

   $ python3 -m bandclass

This command will cause ``__main__.py`` to run. How you utilize this mechanism
will depend on the nature of the package you are writing, but in this
hypothetical case, it might make sense to allow the teacher to search for
students::

   # bandclass/__main__.py

   import sys
   from .student import search_students

   # sys.argv[0] is the program name; the first real argument is sys.argv[1].
   student_name = sys.argv[1] if len(sys.argv) >= 2 else ''
   print(f'Found student: {search_students(student_name)}')

Note that ``from .student import search_students`` is an example of a relative
import. This import style can be used when referencing modules within a
package. For more details, see :ref:`intra-package-references` in the
:ref:`tut-modules` section of the tutorial.
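
To see the :option:`-m` mechanism end to end, the following sketch builds a
throwaway package and runs it in a child interpreter. The package name
``bandclass_demo`` and the helper ``run_package_with_m`` are hypothetical, and
``PYTHONPATH`` is set explicitly only so that the child process can find the
temporary directory.

.. code-block:: python

   import os
   import subprocess
   import sys
   import tempfile
   from pathlib import Path

   def run_package_with_m() -> str:
       """Create a tiny package and execute it with ``python -m``."""
       with tempfile.TemporaryDirectory() as tmp:
           pkg = Path(tmp) / "bandclass_demo"  # hypothetical package name
           pkg.mkdir()
           (pkg / "__init__.py").write_text("")
           # __main__.py reports the names it was given by the interpreter.
           (pkg / "__main__.py").write_text("print(__name__, __package__)\n")
           result = subprocess.run(
               [sys.executable, "-m", "bandclass_demo"],
               cwd=tmp,
               env={**os.environ, "PYTHONPATH": tmp},
               capture_output=True,
               text=True,
               check=True,
           )
       return result.stdout.strip()

   print(run_package_with_m())

The child process reports ``__main__`` as the module name and the package name
as ``__package__``, which is what lets the relative import in the example
above resolve.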

Idiomatic Usage
^^^^^^^^^^^^^^^

The contents of ``__main__.py`` typically aren't fenced with
``if __name__ == '__main__'`` blocks. Instead, those files are kept short and
import functions to execute from other modules. Those other modules can then be
easily unit-tested and are properly reusable.

If used, an ``if __name__ == '__main__'`` block will still work as expected
for a ``__main__.py`` file within a package, because its ``__name__``
attribute will include the package's path if imported::

   >>> import asyncio.__main__
   >>> asyncio.__main__.__name__
   'asyncio.__main__'

This won't work for ``__main__.py`` files in the root directory of a .zip file
though. Hence, for consistency, a minimal ``__main__.py`` like the :mod:`venv`
one mentioned below is preferred.

.. seealso::

   See :mod:`venv` for an example of a package with a minimal ``__main__.py``
   in the standard library. It doesn't contain an ``if __name__ == '__main__'``
   block. You can invoke it with ``python3 -m venv [directory]``.

   See :mod:`runpy` for more details on the :option:`-m` flag to the
   interpreter executable.

   See :mod:`zipapp` for how to run applications packaged as *.zip* files. In
   this case Python looks for a ``__main__.py`` file in the root directory of
   the archive.
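
The zip case can be tried with a short sketch that writes an archive by hand
and passes it to a child interpreter. The archive name ``app.zip`` and the
helper ``run_zip_archive`` are made up for this example; :mod:`zipapp` offers
a more complete way to build such archives.

.. code-block:: python

   import subprocess
   import sys
   import tempfile
   import zipfile
   from pathlib import Path

   def run_zip_archive() -> str:
       """Build a zip with a root-level __main__.py and execute it."""
       with tempfile.TemporaryDirectory() as tmp:
           archive = Path(tmp) / "app.zip"  # hypothetical archive name
           with zipfile.ZipFile(archive, "w") as zf:
               # The interpreter looks for this file in the archive root.
               zf.writestr("__main__.py", "print(__name__)\n")
           result = subprocess.run(
               [sys.executable, str(archive)],
               capture_output=True,
               text=True,
               check=True,
           )
       return result.stdout.strip()

   print(run_zip_archive())

Here the root-level ``__main__.py`` runs with ``__name__`` set to
``'__main__'``, and there is no enclosing package that could give it a
qualified name.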



``import __main__``
-------------------

Regardless of which module a Python program was started with, other modules
running within that same program can import the top-level environment's scope
(:term:`namespace`) by importing the ``__main__`` module. This doesn't import
a ``__main__.py`` file but rather whichever module received the special
name ``'__main__'``.

Here is an example module that consumes the ``__main__`` namespace::

   # namely.py

   import __main__

   def did_user_define_their_name():
       return 'my_name' in dir(__main__)

   def print_user_name():
       if not did_user_define_their_name():
           raise ValueError('Define the variable `my_name`!')

       if '__file__' in dir(__main__):
           print(__main__.my_name, "found in file", __main__.__file__)
       else:
           print(__main__.my_name)

Example usage of this module could be as follows::

   # start.py

   import sys

   from namely import print_user_name

   # my_name = "Dinsdale"

   def main():
       try:
           print_user_name()
       except ValueError as ve:
           return str(ve)

   if __name__ == "__main__":
       sys.exit(main())

Now, if we started our program, the result would look like this:

.. code-block:: shell-session

   $ python3 start.py
   Define the variable `my_name`!

The exit code of the program would be 1, indicating an error. Uncommenting the
line with ``my_name = "Dinsdale"`` fixes the program and now it exits with
status code 0, indicating success:

.. code-block:: shell-session

   $ python3 start.py
   Dinsdale found in file /path/to/start.py

Note that importing ``__main__`` doesn't cause any issues with unintentionally
running top-level code meant for script use which is put in the
``if __name__ == "__main__"`` block of ``start.py``. Why does this work?

Python inserts an empty ``__main__`` module in :attr:`sys.modules` at
interpreter startup, and populates it by running top-level code. In our example
this is the ``start.py`` file which runs line by line and imports ``namely``.
In turn, ``namely.py`` imports ``__main__`` (which is ``start.py``). That's an
import cycle! Fortunately, since the partially populated ``__main__``
module is present in :attr:`sys.modules`, Python passes that to ``namely.py``.
See :ref:`Special considerations for __main__ <import-dunder-main>` in the
import system's reference for details on how this works.
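
A minimal check of the mechanism described above, using only the standard
library: ``import __main__`` does not execute anything again, it returns the
module object that is already registered under that name. The variable name
``same_object`` is chosen for this example.

.. code-block:: python

   import sys

   import __main__

   # The import statement returns the existing entry from sys.modules,
   # so both names refer to one and the same module object.
   same_object = sys.modules["__main__"] is __main__

   print(same_object)

Because the lookup is served from the module cache, the
``if __name__ == "__main__"`` block of the top-level file is not run a second
time.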

The Python REPL is another example of a "top-level environment", so anything
defined in the REPL becomes part of the ``__main__`` scope::

   >>> import namely
   >>> namely.did_user_define_their_name()
   False
   >>> namely.print_user_name()
   Traceback (most recent call last):
   ...
   ValueError: Define the variable `my_name`!
   >>> my_name = 'Jabberwocky'
   >>> namely.did_user_define_their_name()
   True
   >>> namely.print_user_name()
   Jabberwocky

Note that in this case the ``__main__`` scope doesn't contain a ``__file__``
attribute as it's interactive.

The ``__main__`` scope is used in the implementation of :mod:`pdb` and
:mod:`rlcompleter`.
2 changes: 2 additions & 0 deletions Doc/reference/import.rst
@@ -975,6 +975,8 @@ should expose ``XXX.YYY.ZZZ`` as a usable expression, but .moduleY is
not a valid expression.


.. _import-dunder-main:

Special considerations for __main__
===================================

1 change: 1 addition & 0 deletions Doc/tools/susp-ignored.csv
@@ -110,6 +110,7 @@ howto/pyporting,,::,Programming Language :: Python :: 3
howto/regex,,::,
howto/regex,,:foo,(?:foo)
howto/urllib2,,:password,"""joe:password@example.com"""
library/__main__,,`,
library/ast,,:upper,lower:upper
library/ast,,:step,lower:upper:step
library/audioop,,:ipos,"# factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)],"