Order of installation sources is undocumented #3299

Closed
grothesque opened this issue Dec 11, 2015 · 16 comments
Labels
auto-locked Outdated issues that have been locked by automation type: docs Documentation related

Comments

@grothesque

It seems that current pip documentation does not specify whether the command

pip install hello

will install the package "hello" from PyPI, or from the local path "hello", if both exist. I believe that this aspect of pip deserves to be documented well.

@grothesque grothesque changed the title Order of installation is undocumented Order of installation sources is undocumented Dec 11, 2015
@pfmoore
Member

pfmoore commented Dec 11, 2015

Agreed this isn't immediately obvious. However, if you think about it, what you're asking pip to do is to install a requirement hello. In order to do this it checks both PyPI and the local filesystem. There's no order imposed at this stage, both are considered equally valid options. Which one is chosen is based on which has the latest version (via the metadata).

In theory there is a choice to be made if the local filesystem and PyPI have the same version - but it shouldn't actually matter then which pip chooses - as the version is identical, the result should be the same either way (and with the caches involved, it's not even obvious that the local filesystem is faster...)

But it would certainly be useful to document this a bit more clearly - I went hunting for a while before I realised what I said above. Maybe some explanation at https://pip.pypa.io/en/stable/reference/pip_install/#finding-packages would be worthwhile? Would you be interested in suggesting some text (either as a PR, or if you prefer just here and I'll turn it into a PR)?

@pfmoore
Member

pfmoore commented Dec 11, 2015

Thinking some more, my suggestion for some text would be:

Pip looks for packages in a number of places, on PyPI, in the local filesystem, and in any additional repositories specified via --find-links or --index-url. There is no ordering in the locations that are searched, rather they are all checked, and the "best" match for the requirements (in terms of version number - see PEP 440 for details) is selected.

Comments?

@grothesque
Author

Thanks for the quick reaction and the explanation. You see, even though I am experienced with many packaging tools, it was not at all clear to me that pip considers the various sources in parallel. I would rather have expected that it first looks locally, and only if there's nothing there, in the index.

I think your text is very good. Just one comment: I find that currently it is also not very clear that --no-index disables PyPI. Perhaps one could add behind the first mention of PyPI:

(if it has not been disabled via --no-index)

@pfmoore
Member

pfmoore commented Dec 11, 2015

OK, I've created #3300. I'll let Travis check I didn't make any stupid typos, then commit it.

Thanks for the suggestion!

@grothesque
Author

#3300 looks good. I realize, however, that some subtle issues are still not clear (at least to me). I believe that these things should be crystal clear just from reading the documentation, so please bear with me.

So pip install foo will install the best-fitting version of package foo, from any active source. With the default sources, if there is a local directory foo, this will be examined as well. If the local directory foo contains a version that is older than that on PyPI, the version from PyPI will be fetched and installed. But what if there is a local file foo.whl? And what if there is, say, a file foo-1.5.1-cp27-none-win32.whl? Will those also be examined, or are only files examined whose filename matches the package name?

But then, what if the directory foo contains a package called bla?

If I say pip install foo-1.5.1-cp27-none-win32.whl, but foo 1.5.2 is available on PyPI, will it be installed instead? What if there is also a local directory foo with version 1.5.3?

One more in the same spirit: Say I download one of the wheels for Gevent from http://www.lfd.uci.edu/~gohlke/pythonlibs/#gevent, but I forget to download its dependency greenlet from the same source. Now I execute pip install gevent-1.1rc1-cp34-none-win_amd64.whl. This will fetch and install greenlet from PyPI unless --no-deps is given, right?

@pfmoore
Member

pfmoore commented Dec 12, 2015

The directory that is examined is actually ".". A directory in the current directory called foo is the package foo, and it has a version determined by what setup.py delivers. It doesn't "contain" a version, in the sense I assume you mean - nor is it possible for it to contain a package bla.

foo.whl isn't a wheel (because the filename doesn't match the required format). OTOH foo-1.5.1-cp27-none-win32.whl is, it's a wheel for package foo version 1.5.1, compatible with Python 2.7 on win32.
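To illustrate the filename rule described above, here is a minimal sketch in Python of splitting a wheel filename into the fields PEP 427 defines ({distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl). It is not pip's own code: the names parse_wheel_filename and WheelName are hypothetical, and the sketch only checks the overall shape of the name, not whether each tag is meaningful.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    """Fields encoded in a wheel filename (hypothetical helper type)."""
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> Optional[WheelName]:
    """Return the fields of a PEP 427 style filename, or None if it does not fit."""
    if not filename.endswith(".whl"):
        return None
    parts = filename[: -len(".whl")].split("-")
    if not all(parts):
        # An empty field (e.g. "foo--cp27-none-any.whl") is not acceptable.
        return None
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        # Too few or too many fields, as with the renamed "foo.whl".
        return None
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


print(parse_wheel_filename("foo-1.5.1-cp27-none-win32.whl"))
print(parse_wheel_filename("foo.whl"))
```

With this sketch, the first name yields distribution foo, version 1.5.1 and the tags cp27, none and win32, while foo.whl is rejected because the name carries no version or tags at all.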

I see why you might be having trouble understanding all the implications here, but I think it's because you are misunderstanding fundamental concepts (what is a package, what is a requirement...) that we shouldn't be explaining in detail in the install page, but should be made clear somewhere in the manual (I don't know if they are, I haven't checked).

pip install foo-1.5.1-cp27-none-win32.whl is installing a specific wheel, not a requirement, which is subtly different, and almost certainly isn't explained well - but I'm not sure how to improve it. Specifically it's an instance of pip install [options] <archive url/path> rather than pip install [options] <requirement specifier> [package-index-options], and so behaves somewhat differently.

You're right in your understanding of the Gevent case.

I think I'll merge the update I made, but beyond that I'm not sure what (or how much) extra needs changing. A PR clarifying some of the points would be welcome, but we should remember that this is a manual page, not a tutorial, and it's reasonable to assume some familiarity with the underlying concepts and principles of packaging (explanations of which are in the realm of the packaging user guide).

@xavfernandez xavfernandez added the type: docs Documentation related label Dec 12, 2015
@grothesque
Author

Thanks again for the explanation. I agree of course that reference documentation should not be a tutorial, but then reference documentation should strive to completely specify the behavior while remaining terse.

Please do go ahead merging your PR. Consider the following comments as inspiration for another "issue" that I believe should be opened regarding pip documentation.

It is true that I have very little experience with pip, but I do have considerable experience with packaging Python software: I did the (rather complex) packaging of https://pypi.python.org/pypi/kwant/ for example. If after reading the output of pip help install and pip_install.rst I am still confused about central aspects of the program's behavior, this is a clue that the documentation could be improved. This is what I am trying to help with. It is difficult for me to make a PR myself, as I would have first to understand how pip behaves completely.

I think a lot could be gained by adding (to the "Description" section of pip help install and pip_install.rst) a terse narrative of what pip actually does when called in the ways mentioned in "Usage". This narrative should briefly explain what all the placeholders in the "Usage" section mean (standard practice for manpages). In the case of pip, it should be clear, for example, which arguments count as "requirement specifiers" and which as "archive url/path", etc., and what is done with these arguments.

I couldn't find any documentation that states that pip searches the current directory for packages like you say. I believe this should be mentioned at a prominent place. Does this for example mean that if the current directory contains packages that are dependencies of the one that one is trying to install, these dependencies might be installed from that source? If this is indeed the case this should be made clear somehow.

Above, you say that a directory "foo" cannot contain a python package "bla". Where is this explained in the docs? To my knowledge the name of a Python package is defined as the name argument to distutils.setup or setuptools.setup. So a directory "foo-new" can very well contain a package named "foo".

I noted that PEP 427 specifies a naming scheme for wheel files. So, does pip enforce this scheme? That is, while it's valid to execute pip install foo-1.5.1-cp27-none-win32.whl with the mentioned whl-file in the current directory, would pip install foo.whl (after renaming the file) be invalid? If this is so, then pip help install should say something like this:

Usage:
pip install [options] ...
(...)

Description:
(...) A is a file that follows the naming convention specified by PEP 427. (...)

@pfmoore
Member

pfmoore commented Dec 13, 2015

Thanks for the suggestions. I see where you're coming from - the problem we have is that most of the people developing pip are a bit too close to the problem to know what's difficult to understand. We have people working on the "what does a beginner need to know" side of the question (there is work going on with the PUG in that area) but "people experienced with Python packaging but not with pip" is a different (and possibly much bigger) group of users that we haven't really addressed, so your feedback is really helpful.

I agree with your suggestion on having a description of what each of the things in "Usage" does. I'll see if I can write some words around that.

Rethinking your search path question in the context of your suggestion that we need to define what each of the forms in the "usage" section does, makes me realise that my original explanation (and hence, some of the details in my PR) was wrong. The problem is that once we take a step back, your question about the search path doesn't actually make sense - there isn't really a single search path.

The thing is, pip install foo can be either pip install <requirement> or pip install <local project path>. In the first case, it looks for a requirement foo in one of the given index locations. In that case, it's looking for (sdists or) wheels, which it identifies as being for project foo by the filename. In the case of wheels, this is mandated by the spec, but for sdists it's a bit more informal (and a sdist named foo-X.Y.zip could in theory contain sources that build a project named bar according to setup.py, but that's a messy side-case we'd expect a proper sdist spec to remove). It doesn't look at the local directory at all. When it finds all valid candidates, it chooses the best by looking for the latest version that satisfies all the identified constraints.
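The selection step described above ("the latest version that satisfies all the identified constraints") can be sketched in Python as follows. This is an illustration under loud simplifying assumptions, not pip's implementation: versions are plain dotted integers rather than full PEP 440 versions, constraints are given as (operator, version) pairs, and the names version_key, satisfies and pick_best are hypothetical.

```python
import operator

# Comparison operators supported by this simplified sketch.
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def version_key(version):
    """Turn '1.5.0' into (1, 5) so that versions compare numerically.

    Trailing zeros are dropped so that '2.0' and '2' compare equal.
    Only dotted integers are handled, which is far less than PEP 440 allows.
    """
    parts = [int(piece) for piece in version.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def satisfies(version, constraints):
    """Check a version against every (operator, bound) pair."""
    key = version_key(version)
    return all(_OPS[op](key, version_key(bound)) for op, bound in constraints)


def pick_best(candidates, constraints):
    """Return the latest candidate that meets all constraints, or None."""
    matching = [v for v in candidates if satisfies(v, constraints)]
    return max(matching, key=version_key) if matching else None


available = ["1.5.1", "1.5.2", "1.10.0", "2.0"]
print(pick_best(available, [(">=", "1.5"), ("<", "2")]))
```

Note that the comparison is numeric per component, so 1.10.0 ranks above 1.5.2 even though it would sort lower as a string.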

But in the case of pip install <local project path>, foo is the name of a local path (i.e., subdirectory foo in the current directory). That is just installed directly (currently by running setup.py to build a wheel and then installing that wheel, unless wheel caching is disabled, in which case we run setup.py install). There's no search, no selection of a candidate, or anything like that. However, as part of installing foo, pip does identify any dependencies of foo. Dependencies are located by looking at the project's metadata - and dependencies are always requirements, not local paths, so they go down the "search the index" route.
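The distinction drawn in the two paragraphs above (requirement versus local project path, plus the archive form mentioned earlier in the thread) can be sketched in Python as below. This only mirrors the description given in this comment; the exact rules pip applies are not stated in the thread and may differ between pip versions. The function classify_argument, the ARCHIVE_SUFFIXES tuple and the returned labels are all hypothetical.

```python
import os
import tempfile

# Assumed list of archive suffixes, for illustration only.
ARCHIVE_SUFFIXES = (".whl", ".zip", ".tar.gz", ".tar.bz2")


def classify_argument(arg, cwd):
    """Guess which usage form a 'pip install' argument falls under.

    A simplified reading of the comment above, not pip's actual logic.
    """
    if arg.endswith(ARCHIVE_SUFFIXES):
        # A specific file or URL: installed as given, no candidate search.
        return "archive url/path"
    if os.path.isfile(os.path.join(cwd, arg, "setup.py")):
        # A directory holding a setup.py: installed directly.
        return "local project path"
    # Anything else is treated as a requirement to look up in the index.
    return "requirement specifier"


with tempfile.TemporaryDirectory() as cwd:
    # Create an empty project directory "foo" containing a setup.py.
    os.mkdir(os.path.join(cwd, "foo"))
    with open(os.path.join(cwd, "foo", "setup.py"), "w") as handle:
        handle.write("")
    for arg in ("foo", "hello", "foo-1.5.1-cp27-none-win32.whl"):
        print(arg, "->", classify_argument(arg, cwd))
```

In this sketch only the "requirement specifier" case would lead to a search and a version comparison; the other two cases install exactly what was named, and any dependencies they declare are then handled as requirements.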

So my previous explanation (and my PR) was wrong. I'll need to make a new PR that clarifies - I'll do so by basing it on your suggestion of explaining each of the usage scenarios in turn. I'd appreciate your review on that PR, when I produce it - this sort of area is where (as we've seen!) it's very easy for the developers to assume too much knowledge, and not explain clearly enough. And the key thing is that we say what you need to know in a way that you can understand.

Regarding your point on the naming scheme for wheel files. I'm still not 100% sure what you're getting at here. As you say, PEP 427 specifies a naming scheme for wheels. But it's not so much that pip enforces that scheme, as that pip expects it. You suggest renaming foo-1.5.1-cp27-none-win32.whl as foo.whl. But if you do that it's not a valid wheel any more and pip's behaviour when asked to install it is undefined, just as it would be if you said pip install abc.csv. Maybe it's unusual for the name of a file to be constrained by a file spec as well as the contents, I don't know, but it is what the wheel spec says. So renaming wheels is not allowed (except in certain very specific cases, such as to change the compatibility metadata that is encoded in the filename, and if you do that you're taking responsibility for doing it correctly). But none of this (IMO) needs to be in the pip install manpage, as it's not about pip, but about what a wheel is. If it's possible to explain it very briefly, then a reminder note would be appropriate, but the wheel PEP should be the definitive document.

Maybe when saying what pip install [options] <archive url/path> does, we could say

The archive url/path must point to a valid sdist or wheel (see the relevant specifications for what counts as valid). The filename of the archive is defined by the format specification, so arbitrary filenames are not allowed here. (Note that there is no formal specification at present for an sdist; what constitutes a valid sdist is basically "something setup.py sdist produces".)

How does that sound?

@xavfernandez
Member

The archive url/path must point to a valid sdist or wheel

Don't forget that pip install [options] <archive url/path> can also install a directory containing a setup.py (e.g. pip install .).

@pfmoore
Member

pfmoore commented Dec 13, 2015

Thanks. It's awfully easy to miss important details like this when trying to summarise. Which is why review of any changes here is so important (and also why I'm reluctant to go into too much detail - I'm definitely not familiar with all the nuances here so I'm likely to miss things).

@pfmoore
Member

pfmoore commented Dec 13, 2015

... although rereading the manpage, that would be an instance of pip install [options] [-e] <local project path> ... so actually I didn't get this wrong. I just didn't include full context...

@grothesque
Author

@pfmoore, your text sounds good to me. I'm available to review the PR.

@pfmoore
Member

pfmoore commented Dec 18, 2015

@cgroth Thanks, I haven't forgotten this, but I've been a bit busy the last week or so. I'll ping when I have a PR ready.

@pfmoore
Member

pfmoore commented Jan 15, 2016

@cgroth I'm just getting back to this - sorry for the delay.

I've created PR #3366 that covers the following points:

  • How pip chooses whether an argument is a requirement, a local filename, or whatever
  • How pip works out what project name/version a file or URL is for
  • How pip satisfies requirements (latest version that satisfies all constraints)

Let me know what you think.

@grothesque
Author

Looks good to me, thanks a lot!

@xavfernandez
Member

Closing as the need seems to have been addressed. Comment/reopen if needed.

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 4, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jun 4, 2019