
Feature: Deterministic Builds #5102

Open
Syeberman opened this issue Mar 21, 2018 · 5 comments
Labels
state: needs discussion (This needs some more discussion); type: feature request (Request for a new feature)

Comments

@Syeberman

This has been raised before, and closed, but I'm hoping you can reconsider adding something like pip install --no-deps=error, or perhaps pip check --ignore-versions. Let me explain.

In our build we attain reproducibility by fixing our package versions in requirements.txt. Unfortunately, if someone forgets to list a package dependency in this file, it is implicitly installed by pip install. This implicit installation likely uses the latest version of the dependency, bypassing our reproducibility requirements.
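The pinning convention described above can be checked mechanically. The following is a minimal sketch, assuming a requirements.txt made of simple `name==version` lines; the function name `unpinned_requirements` is hypothetical, and URL or VCS requirements are not handled.

```python
import re

# Matches "name==version", with an optional extras suffix such as "name[extra]==1.0".
_PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return the requirement lines that are not pinned to one exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("#") or line.startswith("-"):
            continue  # blank lines, comments, and options such as "-r other.txt"
        match = _PINNED.match(line)
        # A wildcard such as "==1.*" is not an exact pin, so report it as well.
        if match is None or "*" in match.group(0):
            problems.append(line)
    return problems
```

A build step could fail whenever this returns a non-empty list. Note that this only catches loosely specified entries; it cannot detect a dependency that is missing from the file entirely, which is the gap the rest of this issue is about.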

Fortunately, --no-deps exists, ensuring that missing dependencies are not implicitly installed. This is a good thing, but it shifts the problem to runtime, where we won't see a failure until the code trips over a missing import. Ideally, this sort of error should occur during build time.

Fortunately, pip check exists, which will fail if there are missing dependencies. However, pip check also fails if the dependencies aren't the correct versions, as determined by each package. This would be OK if packages properly managed their dependency requirements, but as stated in your blog post, people misunderstand the difference between setup.py and requirements.txt. Just last week, we had a build failure because a package introduced a python-dateutil<2.7.0 restriction in its setup.py so that it could continue to run Python 2.6 in its CI builds.

Even if all packages did properly manage their dependency requirements, there's still a possibility of conflicting requirements between two packages, which would cause pip check to fail. One of the things I love about pip is that it gives the user power to override and resolve these conflicts by simply using the first specification it finds. However, even once these conflicts are resolved, pip check is still going to fail.

A pip install --no-deps=error would solve this problem by disabling implicit installations of dependencies and turning them into a build error. A pip check --ignore-versions would solve this problem by only checking if a package's dependencies exist, disregarding any version constraints. Either one of these solutions can pave the way to truly reproducible builds.
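Neither flag exists in pip; both are proposals. As a rough illustration of what an existence-only check could look like, here is a minimal sketch. The function names are hypothetical, the input is assumed to be a mapping from each installed package to its declared requirement strings, and environment markers other than `extra` are not evaluated, so platform-specific dependencies may be reported as missing.

```python
import re


def _normalize(name):
    """Normalize a project name for comparison (PEP 503 style)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def _requirement_name(requirement):
    """Return the project name of a requirement string, or None to skip it."""
    if re.search(r"\bextra\s*==", requirement):
        return None  # only needed for an optional extra, so not mandatory
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return match.group(1) if match else None


def missing_dependencies(requires_by_package):
    """Map each package to the dependencies that are not installed at all.

    Version specifiers are deliberately ignored: a dependency counts as
    satisfied as soon as any version of it is present.
    """
    installed = {_normalize(name) for name in requires_by_package}
    missing = {}
    for package, requirements in requires_by_package.items():
        for requirement in requirements:
            name = _requirement_name(requirement)
            if name is not None and _normalize(name) not in installed:
                missing.setdefault(package, []).append(name)
    return missing
```

On Python 3.8 or later, the input mapping can be built from the current environment with the standard library, for example `{d.metadata["Name"]: d.requires or [] for d in importlib.metadata.distributions()}`. Running this after `pip install --no-deps -r requirements.txt` and failing the build on a non-empty result approximates the requested behavior.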

pradyunsg added the "type: feature request" label on Apr 7, 2018
pradyunsg added the "S: needs triage" label on Jun 28, 2018
@chrahunt
Member

If I'm understanding correctly, your main objective is to be able to reproduce a build/runtime environment including direct and transitive dependencies.

To that end, pipenv may be a much better tool for the job. Its Pipfile.lock provides the kind of reproducibility you're looking for, and it makes things easier, since in most cases you'll only have to manage the list of direct dependencies yourself.

chrahunt added the "state: needs discussion" label on Jul 26, 2019
ghost removed the "S: needs triage" label on Jul 26, 2019
@HectorOrdonezCergentis

HectorOrdonezCergentis commented Sep 2, 2021

Why does pip not include this functionality? Pipenv is a better tool for that only because pip does not do it. Is there a reason for pip to avoid deterministic builds?

Other package managers like composer in php and bundler in ruby do it.

I personally find deterministic builds to be a necessity in software projects, so it was a shock for me to find out this is not included in pip

Edit: changed the wording since initially this was quite an aggressive comment.

@potiuk
Contributor

potiuk commented Sep 2, 2021

> It is ridiculous that we have to add another dependency to our projects (pipenv) just to have deterministic builds, which is such a basic aspect in software development.

Comment from my side. I see it completely differently from my experience and needs (and it is not at all ridiculous).

Supporting deterministic builds is not as easy as you might think when you want to make them part of a CI build, for example (which is the main purpose). That is especially true when you have many people and complex, often conflicting dependencies, you want to give people the possibility of changing those dependencies, and you want to keep up with the latest versions. I like the approach and capabilities of pip and the fact that it does not force an "opinionated" way of doing this on you.

You can only get those "deterministic" builds when you go with a very opinionated approach to how the "maintenance" of the dependencies should look. For complex projects with complex dependencies, that might not work.

When you get into complex projects (for example, Apache Airflow), neither pipenv, nor poetry, nor any other tool we tried works well, because they are too opinionated in how they treat libraries vs. applications. pip, on the other hand, is super versatile and has features such as pip freeze, constraints files (--constraint), and the eager upgrade strategy that make it possible for us to have a complex but well-working solution. Only thanks to that can we get this working: https://airflow.apache.org/docs/apache-airflow/stable/installation.html#constraints-files (and we explicitly excluded pipenv/poetry as installation tools until they support something similar to the constraints approach).
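For readers unfamiliar with constraints files: a constraints file controls which version of a package is installed if it gets installed, without itself causing anything to be installed. A minimal sketch of assembling such an install command follows; the helper name `pip_install_command` and the file name `constraints.txt` are hypothetical, and this is not Airflow's actual tooling.

```python
import sys


def pip_install_command(requirements, constraints_file):
    """Build a pip command that installs packages under a constraints file."""
    return [
        sys.executable, "-m", "pip", "install",
        *requirements,
        # Constraints pin the versions of anything that gets installed,
        # including transitive dependencies, without installing anything extra.
        "--constraint", constraints_file,
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)`. Regenerating `constraints.txt` periodically from a known-good environment (for instance with pip freeze) keeps transitive dependencies pinned while setup.py stays loosely specified.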

Also, if you have a simple case, it's super easy to run pip freeze > requirements.txt to get a deterministic build. If you want a simple solution with pip, it's a pretty good idea to just have a setup.py with dynamic builds and a periodically refreshed requirements.txt.

I really like the fact that pip is the "Swiss Army knife" of installation, which means you can use it in both simple and complex scenarios, and that it is not as opinionated as other tools. And I think it is perfectly fine to choose another tool if the "opinionated" approach is OK for you (for many projects, pipenv/poetry are good choices).

@pfmoore
Member

pfmoore commented Sep 2, 2021

> that is not a reason for pip to stop being lazy about an essential aspect in any respectable package manager

Please consider the tone of your comments. There is no reason to insult the pip maintainers and their efforts just because a feature you'd like hasn't been implemented. There are probably millions of people who use pip who haven't complained about the lack of this feature, so calling it "essential" is at best an exaggeration.

@HectorOrdonezCergentis

@potiuk Thank you for explaining that, to support deterministic builds, package managers have to take opinionated approaches to how they resolve dependencies. I also appreciate the use case you provided, which shows it is not a theoretical scenario.

In my opinion, it is not necessarily bad for a dependency manager to take an opinionated approach to allow for deterministic builds, as long as it allows for a custom solution.

@pfmoore You are absolutely right: this was closer to a rant than constructive feedback on my part, the consequence of several hours of struggling to set up an environment with pipenv and docker. No excuse for the aggressive comment; I apologize.

6 participants