stubtest crash with special unicode characters #19071

Open
Avasam opened this issue May 10, 2025 · 3 comments · May be fixed by #19085

Comments

@Avasam
Contributor

Avasam commented May 10, 2025

Crash Report

I tried running stubtest on typeshed's networkx stubs

Traceback

networkx... Note: networkx is not currently tested on win32 in typeshed's CI.
(49.63 s) fail

**********************************************************************

Commands run:

C:\Users\Avasam\AppData\Local\Temp\stubtest-y0dwwp7i\Scripts\pip.exe install networkx[]==3.4.2 mypy==1.15.0 numpy>=1.20 pandas
MYPYPATH=stubs\networkx C:\Users\Avasam\AppData\Local\Temp\stubtest-y0dwwp7i\Scripts\python.exe -m mypy.stubtest --mypy-config-file C:\Users\Avasam\AppData\Local\Temp\tmpyz4as0cw --custom-typeshed-dir . networkx --allowlist stubs\networkx\@tests\stubtest_allowlist.txt

**********************************************************************

Command output:

error: networkx.readwrite.text.AsciiDirectedGlyphs.vertical_edge is not present in stub
Stub: in file E:\Users\Avasam\Documents\Git\typeshed\stubs\networkx\networkx\readwrite\text.pyi
MISSING
Runtime:
'!'

error: networkx.readwrite.text.AsciiUndirectedGlyphs.vertical_edge is not present in stub
Stub: in file E:\Users\Avasam\Documents\Git\typeshed\stubs\networkx\networkx\readwrite\text.pyi
MISSING
Runtime:
'|'

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\Avasam\AppData\Local\Temp\stubtest-y0dwwp7i\Lib\site-packages\mypy\stubtest.py", line 2127, in <module>
    sys.exit(main())
             ~~~~^^
  File "C:\Users\Avasam\AppData\Local\Temp\stubtest-y0dwwp7i\Lib\site-packages\mypy\stubtest.py", line 2123, in main
    return test_stubs(parse_options(sys.argv[1:]))
  File "C:\Users\Avasam\AppData\Local\Temp\stubtest-y0dwwp7i\Lib\site-packages\mypy\stubtest.py", line 2015, in test_stubs
    print(error.get_description(concise=args.concise))
    ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Avasam\AppData\Roaming\uv\data\python\cpython-3.13.1-windows-x86_64-none\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\u257d' in position 263: character maps to <undefined>

**********************************************************************

Python version: Python 3.13.1 (main, Dec 19 2024, 14:38:48) [MSC v.1942 64 bit (AMD64)]

Ran with the following environment:
mypy==1.15.0
mypy_extensions==1.1.0
networkx==3.4.2
numpy==2.2.5
pandas==2.2.3
pip==25.1.1
python-dateutil==2.9.0.post0
pytz==2025.2
six==1.17.0
typing_extensions==4.13.2
tzdata==2025.2

To Reproduce

  1. Clone typeshed at commit ee5fcf264bcfddb14259f9f20601781f84834f29
  2. Comment out the networkx.readwrite.text.* line in stubs/networkx/@tests/stubtest_allowlist.txt
  3. Run ./tests/stubtest_third_party.py networkx

Your Environment

  • Mypy version used: mypy 1.15.0 (compiled: yes)
  • Mypy command-line flags: python.exe -m mypy.stubtest --mypy-config-file C:\Users\Avasam\AppData\Local\Temp\tmpyz4as0cw --custom-typeshed-dir . networkx --allowlist stubs\networkx\@tests\stubtest_allowlist.txt
  • Mypy configuration options from mypy.ini (and other config files): None afaik
  • Python version used: Python 3.13.1
  • Operating system and version: Version 10.0.19045 Build 19045

Could be related to #11031
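
The UnicodeEncodeError in the traceback can be reproduced without stubtest or networkx. The following is a minimal sketch, not mypy's code: `write_with_encoding` is a hypothetical helper that prints through an in-memory text stream, standing in for a Windows console whose stdout uses cp1252. It assumes only that the failing character is the '\u257d' named in the error message.

```python
import io

GLYPH = "\u257d"  # the character named in the UnicodeEncodeError above


def write_with_encoding(text: str, encoding: str, errors: str = "strict") -> bytes:
    """Print text through a stream with the given encoding and return the bytes.

    Hypothetical helper: the in-memory stream stands in for sys.stdout.
    """
    raw = io.BytesIO()
    # newline="\n" keeps the output identical on every platform.
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline="\n")
    print(text, file=stream)
    stream.flush()
    data = raw.getvalue()
    stream.detach()
    return data


if __name__ == "__main__":
    try:
        write_with_encoding(GLYPH, "cp1252")
    except UnicodeEncodeError as exc:
        print("cp1252 stream failed:", exc)
    # A UTF-8 stream, or a lossy error handler, avoids the crash.
    print(write_with_encoding(GLYPH, "utf-8"))
    print(write_with_encoding(GLYPH, "cp1252", errors="backslashreplace"))
```

With a strict cp1252 stream the print call raises, which matches the last frames of the traceback; a UTF-8 stream or the backslashreplace error handler lets the same text through.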

@sterliakov
Collaborator

sterliakov commented May 10, 2025

Does it still crash if you run the Windows equivalent of export PYTHONUTF8=1 (docs) before running stubtest?

A sensible default for mypy would be to use open(encoding="utf-8") for every file that can contain Python code: Python sources are UTF-8 by default, and (hopefully) no one uses # -*- coding comments to set a non-UTF-8 source encoding nowadays (we can add parsing of those later if needed). This has the same root cause as #11031, but different symptoms.
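
As an illustration of the open(encoding="utf-8") suggestion, here is a minimal sketch that writes and reads a small source file with an explicit encoding, so the result does not depend on the platform default. The `roundtrip` helper and the file name are hypothetical and are not taken from mypy.

```python
import tempfile
from pathlib import Path


def roundtrip(text: str, encoding: str = "utf-8") -> str:
    """Write text to a temporary file and read it back with an explicit encoding."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "glyphs.py"  # hypothetical file name
        # Passing encoding explicitly avoids falling back to the locale
        # encoding (cp1252 on the machine in this report).
        with open(path, "w", encoding=encoding, newline="\n") as f:
            f.write(text)
        with open(path, encoding=encoding) as f:
            return f.read()


if __name__ == "__main__":
    source = 'vertical_edge = "\u257d"\n'
    assert roundtrip(source) == source
```

Without the encoding argument, the same write can fail on a system whose default encoding cannot represent the character.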

@Avasam
Contributor Author

Avasam commented May 12, 2025

Adding {"PYTHONUTF8": "1"} to the env parameter of the subprocess.run call that invokes stubtest did prevent the crash: python/typeshed@080fb80#diff-468e55534823989bcfd0a85e82ccf8e4376c4115cc80c27a64795e35c1b5853c
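
The workaround can be sketched as follows. This is not the code from the linked typeshed commit: `run_in_utf8_mode` is a hypothetical helper that starts a child interpreter with PYTHONUTF8=1 added to a copy of the current environment, and the child only reports whether UTF-8 mode is active.

```python
import os
import subprocess
import sys


def run_in_utf8_mode(code: str) -> str:
    """Run a Python snippet in a child process with UTF-8 mode enabled."""
    # Copy the parent environment so the child still finds what it needs
    # (for example SYSTEMROOT on Windows), then force UTF-8 mode.
    env = {**os.environ, "PYTHONUTF8": "1"}
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # sys.flags.utf8_mode is 1 when the interpreter runs in UTF-8 mode.
    print(run_in_utf8_mode("import sys; print(sys.flags.utf8_mode)"))
```

Copying os.environ rather than passing a dict with only PYTHONUTF8 is a deliberate choice here: replacing the whole environment can break the child process on some platforms.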

@sterliakov
Collaborator

Amazing! I'll try to open an exploration PR in a few days. Any file that contains Python code obtained from .py files should be UTF-8-encoded, excluding cases with magic coding comments. Defaulting to the system encoding is worse than defaulting to UTF-8: Python code has been UTF-8 on all platforms for a long time. This won't be a problem in 3.15 (PEP 686), but we probably don't want to wait that long :)
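
For the magic coding comment case mentioned above, the standard library already exposes the detection logic. The sketch below uses tokenize.detect_encoding; the `source_encoding` wrapper is a hypothetical name, and nothing here is taken from the linked PR.

```python
import io
import tokenize


def source_encoding(source: bytes) -> str:
    """Return the encoding declared by a Python source, defaulting to UTF-8."""
    # detect_encoding reads at most two lines, looking for a BOM or a
    # "coding" comment, and returns (encoding, lines_read).
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding


if __name__ == "__main__":
    print(source_encoding(b"x = 1\n"))
    print(source_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n"))
```

A tool that defaults to UTF-8 could call something like this only for files that actually carry a coding comment.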

@sterliakov sterliakov linked a pull request May 12, 2025 that will close this issue