Adjust SCC setup to enable earlier collections.abc import in typeshed #14088

Merged

Michael0x2a
Collaborator

Fixes #11860 (?)

Typeshed is currently unable to import Sequence, MutableSequence, or ByteString from collections.abc within builtins.pyi. It seems this is because:

  1. In order to analyze collections.abc, we first need to analyze collections.

  2. Since collections is a package containing an __init__.pyi file, the add_implicit_module_attrs function will try adding the __path__ variable to the symbol table.

  3. The __path__ variable has type builtins.str. But str is a subclass of Sequence, which we have not analyzed yet since we're still in the middle of analyzing collections and collections.abc.
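As a rough illustration of the three steps above, the cycle can be modelled as a tiny dependency graph. This is only a sketch: the `DEPS` graph and the helper names are hypothetical and much simpler than mypy's real import graph, and the `builtins` to `collections.abc` edge stands for the import typeshed would like to add.

```python
# Hypothetical, simplified graph: "A": ["B"] means A needs B to be analyzed.
DEPS = {
    "builtins": ["collections.abc"],  # the import typeshed would like to add
    "collections.abc": ["collections", "_collections_abc"],  # submodule needs its parent package
    "collections": ["builtins"],  # implicit __path__ attribute refers to builtins.str
    "_collections_abc": ["builtins"],
}


def reachable(start, deps):
    """Return every module reachable from `start` by following one or more edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in deps.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def cycle_members(module, deps):
    """Return the modules that share a dependency cycle with `module`."""
    forward = reachable(module, deps)
    return sorted(m for m in forward if module in reachable(m, deps))


print(cycle_members("builtins", DEPS))
```

In this toy graph all four modules end up in one cycle, which is why none of them can be fully analyzed before the others.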

This diff tries repairing this by:

  1. Adding _collections_abc and collections.abc to the set of special-cased core modules we deliberately process early.

  2. Modifying add_implicit_module_attrs so it does the same trick we do for the __doc__ symbol and falls back to using an UnboundType if builtins.str is not defined yet.
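The two changes can be sketched with a toy model. Nothing below is mypy's actual code: `EARLY_MODULES`, `order_modules`, `Resolved`, `Unbound`, and `lookup_or_unbound` are hypothetical stand-ins chosen for illustration, with `Unbound` playing the role of an UnboundType placeholder.

```python
from dataclasses import dataclass

# Hypothetical list of modules to process first (not mypy's real list).
EARLY_MODULES = ["_collections_abc", "builtins", "collections", "collections.abc"]


def order_modules(modules, early=EARLY_MODULES):
    """Move special-cased modules to the front, keeping the rest in their original order."""
    rank = {name: i for i, name in enumerate(early)}
    # sorted() is stable, so modules that are not special-cased keep their relative order.
    return sorted(modules, key=lambda m: rank.get(m, len(early)))


@dataclass(frozen=True)
class Resolved:
    """Stand-in for a type whose definition has already been analyzed."""
    fullname: str


@dataclass(frozen=True)
class Unbound:
    """Stand-in for a placeholder that is bound to a real type later."""
    name: str


def lookup_or_unbound(symbols, fullname):
    """Return the resolved type if it is defined, else an unbound placeholder."""
    if fullname in symbols:
        return Resolved(fullname)
    return Unbound(fullname)


print(order_modules(["os", "collections.abc", "builtins", "_collections_abc"]))
print(lookup_or_unbound(set(), "builtins.str"))
print(lookup_or_unbound({"builtins.str"}, "builtins.str"))
```

The point of the placeholder is that adding the implicit attribute no longer requires `builtins.str` to exist at that moment; the name can be resolved on a later pass.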

To be 100% honest, I'm not really sold on this PR for a few reasons:

  • I was able to test these changes manually, but wasn't sure how to write tests for them.
  • We have 3-4 subtly different lists of "core modules" scattered throughout mypy. For example, see CORE_BUILTIN_MODULES in mypy/build.py or try grepping for the string "typing" in the mypy dir. Arguably, we should defer landing this PR until we've had a chance to consolidate these lists and confirm there are no additional places where we need to special-case _collections_abc, collections, and collections.abc.
  • PEP 585 attempted to declare that we should one day remove entries like Sequence from the typing module, but this realistically never seems achievable given that (a) it would break backwards compat and (b) there doesn't seem to be any incentive for users to proactively switch. In that case, is there any pressing reason to change typeshed?
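For context on the last point, here is a small runtime check, assuming a Python version where `typing.Sequence` still exists as an alias. It shows that both spellings resolve to the same ABC and that `str` is a `Sequence`, which is the relationship behind the cycle described earlier.

```python
import collections.abc
import typing

# The typing alias points at the collections.abc class at runtime.
origin = typing.get_origin(typing.Sequence[int])
same_class = origin is collections.abc.Sequence

# str is registered as a Sequence, so analyzing str requires knowing Sequence.
str_is_sequence = issubclass(str, collections.abc.Sequence)

print(same_class, str_is_sequence)
```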

Regardless, this is a crash and my goal atm is to de-crash mypy, so I'm throwing this over the wall.

@github-actions
Contributor

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

@AlexWaygood
Member

AlexWaygood commented Nov 14, 2022

> is there any pressing reason to change typeshed?

FWIW, I agree that this is about as low-priority as a crash report can be (and I'm the original filer of the report). It would be nice to be able to be consistent in typeshed, but there's definitely no urgent need to change anything :)

Member

@ilevkivskyi ilevkivskyi left a comment

LG as a medium term fix.

@AlexWaygood AlexWaygood added the affects-typeshed label (Anything that blocks a typeshed change) Dec 11, 2022
@AlexWaygood AlexWaygood mentioned this pull request Jan 26, 2023
17 tasks
@JelleZijlstra JelleZijlstra merged commit 6442b02 into python:master Jan 26, 2023
AlexWaygood pushed a commit to AlexWaygood/mypy that referenced this pull request Jan 26, 2023
…python#14088)

JukkaL pushed a commit that referenced this pull request Jan 27, 2023
…ort in typeshed (#14531)

A backport of #14088 to the `release-1.0` branch.

Co-authored-by: Michael Lee
Development

Successfully merging this pull request may close these issues.

Crash when (Mutable)Sequence is imported from collections.abc in typeshed/stdlib/builtins.pyi