Skip to content

Remove distutils path patching #1321

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Dec 31, 2021
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions astroid/interpreter/_import/spec.py
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,11 @@
import zipimport
from functools import lru_cache

try:
from setuptools import _distutils
except ImportError:
_distutils = None

from . import util

ModuleType = enum.Enum(
Expand Down Expand Up @@ -167,6 +172,9 @@ def contribute_to_path(self, spec, processed):
# distutils is patched inside virtualenvs to pick up submodules
# from the original Python, not from the virtualenv itself.
path = list(distutils.__path__)
elif spec.name == "distutils" and _distutils:
# distutils is patched in setuptools >= 60.0.0 to _distutils
path = list(_distutils.__path__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh I wonder if we should just always use distutils.__path__. We do import it earlier, so it's most likely the one we actually want. From what I can tell, the first condition is equal to the else case anyway.

        elif spec.name == "distutils":
            path = list(distutils.__path__)

Defaulting to the _distutils path from setuptools alone can however be dangerous. It was added in v48.0.0 but only became the default in v60.0.0. So just in theory, if the first condition wouldn't hit, it would pick the wrong path. In that case you probably should add an additional condition to make sure the setuptools distutils is actual the distutils we are using. With that we likely end up with the simplified case from above again.

        elif spec.name == "distutils" and _distutils and _distutils.__path__ == distutils.__path__:
            path = list(distutils.__path__)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, just doing elif spec.name == "distutils" would remove a setuptools dependency. -> #1320
Maybe we should just test it in production and see if we get any new bug reports πŸ€”

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not completely following you here. Especially with all the latest hot fixes and releases I'm not sure what actually needs to be fixed.

For pylint everything seems fine. We might be incompatible with setuptools~=60.0 but due to all the recent releases it would be quite a big effort to find the specific release which fixed the bug that made the tbump upgrade fail. I feel like we can let that be, right?

For astroid this test is still failing for setuptools==60.2.0. I agree that the first if seems to do the same as the else. That can likely be removed, leaving only a if for _distutils to be necessary. Do you have any idea how to do that if without depending on setuptools but being compatible with both v47 and v60?

Please correct me where I'm wrong. This has been a difficult issue to keep track off and its getting late πŸ˜„

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been a difficult issue to keep track off and its getting late πŸ˜„

Same 😴 I can try to summarize what I think is going on here.

What do we know?
Starting with v60 setuptools patches (or better "does some sketchy things") to replace the stdlib version of distutils with their own copy. This is vendored from pypa/distutils and "should" be functionally identical with the stdlib version. If at all, it might include some more bugfixes that won't get included into stdlib.

Why?
distuils has been deprecated and is slated for removal in 3.12. Everyone who is still using it should either switch to the setuptools version (or another similar one) or even better replace those function calls altogether. Most, but not all, functions already have equivalents in the stdlib. https://www.python.org/dev/peps/pep-0632/#migration-advice
Longterm, the maintainers of setuptools also plan to discontinue distutils, but for the meantime it will continue to work.

What is actually failing?
As best as I can tell, only the astroid test case. Since the included distutils copy is functional identical to the stdlib it doesn't really matter too much that the spec path points towards the stdlib while the module is coming from setuptools._distutils.

What failed previously?
I only have an educated guess here. There seems to be a habit among certain packages to ship with their own version of distutils. In particular a (presumably old) version of virtualenv seem to have done so. It seems astroid had inferred the virtualenv version as spec.location instead of the stdlib one. That caused ImportErrors as the virtualenv version wasn't a complete copy.

Is that still an issue?
I don't know πŸ€·πŸ»β€β™‚οΈ

What should the goal here be for astroid?
Pylint shouldn't report a false-positive ImportError for things like distutils.util.

What path for distutils should astroid return?
I'm uncertain about this one. If the goal is to prevent the false-positive and enable inference, any "good" copy of distutils would probably work. It doesn't really matter too much if it's the one from the stdlib or setuptools, since I don't think the main interface is going to change.

What do we do now?
Some options

  1. Disable the test. As explained in the section before it doesn't matter to much that it's inferred as the stdlib version although technically it's using the setuptools one.
  2. Patch the test. Instead of comparing it with the actual distutils.__path__, we could use a good python file to find the stdlib location and built the expected stdlib distutils path from there.
import os
[f"{os.__file__.rsplit('/', 1)[0]}/distutils"]
  1. Use the actual setuptools path if it's imported from setuptools. Basically the proposed fix. We try to import distutils twice, once from distutils and another time from setuptools._distutils. If both match, we choose to use the setuptools one for the path. Although probably the "most correct one", it requires us to do an expensive import twice.
  2. YOLO it. We just remove all custom handling for distutils and see if we get any bug reports. I suspect / hope not but if we do, we at least would know what needs fixing.

Is the first conditional for distutils still needed?
πŸ€·πŸ»β€β™‚οΈ

My preference?
I believe all of the 4 options above should work. If we don't want to risk it all with (4), option (2) might be an idea. We can always come back to it in case we start getting bug reports eventually.

Did I miss something?
Likely πŸ˜„ At this point I don't even know if I've fully understood what's going on. It's also much too late to be writing such long posts 😴

else:
path = [spec.location]
return path
Expand Down