Add new option --rebuild (or --never-binary) to the install command #13088

Open
paugier opened this issue Nov 20, 2024 · 4 comments
Labels
state: needs discussion This needs some more discussion type: feature request Request for a new feature

Comments

paugier commented Nov 20, 2024

What's the problem this feature will solve?

I am trying to split issue #12954 into smaller, more tractable issues.

As discussed in detail in #12954, the option --no-binary was used, in particular on HPC systems, to rebuild a package from source with different build arguments or different libraries (for example hdf5 sequential/parallel, or openMPI/MPICH).

However, since 23.1 (https://pip.pypa.io/en/stable/news/#v23-1), "--no-binary does not disable the cache of locally built wheels anymore. It only means “don’t download wheels”. (#11453)".

As pointed out here, web documentation and supercomputer tutorials still use --no-binary for this purpose because of the old behavior (before 23.1) and the misleading name of this option (--no-binary only means “don’t download wheels”).

For this purpose ("reinstall a package from source"), one now needs to run things like:

pip cache remove h5py; pip install h5py --no-binary h5py --force
pip cache remove pyFFTW; pip install pyfftw --no-binary pyfftw --force
pip cache remove fluidfft_fftwmpi*; pip install fluidfft-fftwmpi --no-binary fluidfft-fftwmpi --force

Such commands are quite ugly, long and error-prone. Note that the cache patterns have to be spelled "pyFFTW" and "fluidfft_fftwmpi", whereas the packages are named pyfftw and fluidfft-fftwmpi (#13086).
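The spelling mismatch can be made concrete with a small sketch. It assumes, as the commands above suggest, that the `pip cache remove` pattern must follow the spelling used in the cached wheel's file name, while `pip install` and `--no-binary` accept the normalized project name (PEP 503 style). The helper `rebuild_commands` is hypothetical and only assembles the two command lines of the workaround; it does not run them. `--force-reinstall` is the full spelling of the `--force` abbreviation used above.

```python
import re
import shlex


def normalize(name: str) -> str:
    """Normalize a project name: lowercase, runs of '-', '_', '.' become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def rebuild_commands(project: str, cache_pattern: str) -> list[list[str]]:
    """Return the two pip invocations of the workaround (hypothetical helper).

    cache_pattern is passed separately because it may need a spelling
    that differs from the normalized project name.
    """
    name = normalize(project)
    return [
        ["pip", "cache", "remove", cache_pattern],
        ["pip", "install", name, "--no-binary", name, "--force-reinstall"],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for argv in rebuild_commands("fluidfft_fftwmpi", "fluidfft_fftwmpi*"):
        print(shlex.join(argv))
```

Printing through `shlex.join` quotes the `*` pattern, so the shell would pass it to pip unexpanded.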

Describe the solution you'd like

I propose to add a --rebuild option so that

pip install pyfftw --rebuild

(re)installs pyfftw from source (no wheel at all).

It seems to me that it would be convenient to be able to use --rebuild alone (like above, equivalent to pip install pyfftw --rebuild pyfftw) or as

pip install fluidsim[mpi] --rebuild mpi4py --rebuild pyfftw --rebuild h5py

like for --no-binary.

I think

pip install pyfftw mpi4py h5py --rebuild

should be equivalent to

pip install pyfftw mpi4py h5py --rebuild pyfftw --rebuild mpi4py --rebuild h5py
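The equivalence proposed above can be written down as a small resolution rule. This is only a sketch of the requested semantics: `--rebuild` is not an existing pip option, and `requirement_name` and `packages_to_rebuild` are hypothetical names. The requirement parsing is deliberately simplistic (it keeps the leading project name and drops extras and version specifiers).

```python
import re
from typing import List, Optional, Set


def requirement_name(requirement: str) -> str:
    """Extract and normalize the project name from a simple requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def packages_to_rebuild(requirements: List[str], rebuild: Optional[List[str]]) -> Set[str]:
    """Resolve the proposed --rebuild semantics (hypothetical, not a pip feature).

    rebuild is None  -> option not given, nothing is rebuilt
    rebuild == []    -> bare --rebuild, applies to every requested requirement
    rebuild == [...] -> only the named packages are rebuilt
    """
    if rebuild is None:
        return set()
    if not rebuild:
        return {requirement_name(r) for r in requirements}
    return {requirement_name(r) for r in rebuild}
```

Under this rule, a bare `--rebuild` with `pyfftw mpi4py h5py` and an explicit `--rebuild` for each of the three resolve to the same set, while `fluidsim[mpi]` with three explicit names rebuilds only those dependencies and not fluidsim itself.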

Alternative Solutions

Another name, such as --never-binary, could be used.

Member

sbidoul commented Nov 23, 2024

Since the issue seems very much related to caching, I wonder if options to provide finer grained control of the cache of locally built wheels would address the use case too.

sbidoul added the "state: needs discussion" label and removed the "S: needs triage" label on Nov 23, 2024

Member

pfmoore commented Nov 23, 2024

I agree with @sbidoul here. I would prefer to see this named in a way that makes it clear that it's about using the cache. So something like --no-build-cache PKG (with the usual :all: keyword to disable all build cache use) would work better for me.

The invocation would then be:

pip install fluidsim[mpi] --no-build-cache mpi4py --no-build-cache pyfftw --no-build-cache h5py

For me, --rebuild sounds too much like it would also rebuild an already-installed copy of the package.

Author

paugier commented Jan 27, 2025

With all the respect that I have for you @pfmoore and @sbidoul, it seems to me that you take the point of view of pip developers (who know the internal details) to think about the design of the user API.

From the point of view of users, we don't care about caching. What some users want is the possibility to ask for the installation of a few packages with new builds from source (and clearly, most of the time, --no-build-cache :all: is not what we want).

IMHO, we really need to find a good solution to avoid things like:

pip cache remove h5py; pip install h5py --no-binary h5py --force

and take into account the needs of users who used --no-binary to ask for a new build.

What about --new-build?

With --no-build-cache, to modify an environment, I would still need

pip install mpi4py pyfftw h5py --no-build-cache mpi4py --no-build-cache pyfftw --no-build-cache h5py --force

(which is ugly and would force reinstallation of many other dependencies, which is not what we want)

pip install mpi4py pyfftw h5py --new-build

is much simpler and cleaner.

Member

pfmoore commented Jan 27, 2025

I think that worrying about the name is probably a mistake at this point. We need a clean design first, and then someone needs to create a PR. Until both of those things happen, worrying about the option name is probably premature.

As far as design is concerned, my first question is, does the user always want every project to be rebuilt from source? Even dependencies? That's a genuine question, I have no knowledge of HPC-style workflows. If so, then I guess a --new-build option might make sense.

But if we ever want to allow previous builds of anything to be used (for example, a pure Python dependency like requests), then we need some way for the user to say what they want rebuilding, and our current design for that is a multiple-use option where you specify one package at a time (with an :all: shortcut). If this isn't acceptable, you'll need to clarify what in this specific case is sufficiently unique to justify a different UI from the rest of pip.
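The "multiple-use option, one package at a time, with an :all: shortcut" design can be sketched as follows. This is an illustration only: `--no-build-cache` is the name floated in this thread and does not exist in pip, the parser is a standalone `argparse` stand-in rather than pip's own option handling, and `excluded_from_cache` and `uses_build_cache` are hypothetical helpers.

```python
import argparse
import re
from typing import List, Set


def canonical(name: str) -> str:
    """Normalize a project name so that 'H5py' and 'h5py' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser with a repeatable, hypothetical --no-build-cache option."""
    parser = argparse.ArgumentParser(prog="install-sketch")
    parser.add_argument("--no-build-cache", action="append", default=[], metavar="PKG")
    return parser


def excluded_from_cache(values: List[str]) -> Set[str]:
    """Collect the packages named across all uses of the option."""
    excluded = set()
    for value in values:
        for item in value.split(","):
            item = item.strip()
            if item:
                # Keep the ':all:' keyword verbatim, normalize real names.
                excluded.add(item if item == ":all:" else canonical(item))
    return excluded


def uses_build_cache(package: str, excluded: Set[str]) -> bool:
    """Return True if a previously built wheel may be reused for this package."""
    return ":all:" not in excluded and canonical(package) not in excluded
```

With this shape, a pure Python dependency such as requests keeps using its cached build unless it is named explicitly or `:all:` is given.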

"it seems to me that you take the point of view of pip developers (who know the internal details) to think about the design of the user API."

I hope we don't - I certainly try not to. But I'll concede that most users shouldn't need to know or care about the wheel cache. Having said that, though, most users don't reconfigure build options between installs. So using the cache is a significant benefit, at no real cost, for most of our users. For those users like yourself where it isn't a benefit, we should have an option to bypass the cache, I agree. But without some perspective on how you view what you're doing (which you gave in your last message) it's hard to know what form the UI should take.

I'll also note that pip is intended as an install tool, not as a build tool. The fact that pip builds packages when there are no binaries available is intended itself to be an implementation detail the user shouldn't care about. So from that perspective, using --no-binary at all is already getting involved in implementation details, and is not something we'd expect the average user to worry about.

But anyway, I think the next steps are for someone (I don't know if it's something you are offering to do) to:

  1. Work out a design for the feature. Start by answering my questions above, and don't worry too much at this stage about the option name.
  2. Create a PR implementing the feature. If you're concerned about wasting effort should the feature get rejected, post your design here first, and get agreement on your approach before writing the code.
