CI: Use mamba-org/provision-with-micromamba to setup micromamba #2435

Merged
merged 61 commits into from
Apr 23, 2023

Conversation

seisman
Member

seisman commented Mar 17, 2023

Description of proposed changes

Use the provision-with-micromamba action to set up micromamba, which is faster than the current conda-incubator/setup-miniconda action, which sets up mamba.

Comparisons between the old (using mamba) and the new (using micromamba) runs

Here are the times to set up the virtual environment using micromamba and mamba:

| Platform | micromamba | mamba |
|----------|------------|-------|
| Linux    | 100 s      | 150 s |
| macOS    | 200 s      | 330 s |
| Windows  | 200 s      | 330 s |

After enabling caching of the downloads and the environment, the workflow takes 2-3 minutes to create the cache tarballs and upload them, so that run as a whole may be even slower. However, on the next run that uses the downloads and environment caches (e.g., https://github.com/GenericMappingTools/pygmt/actions/runs/4459966220/usage), setting up the environment takes only 25 s (Linux), 72 s (macOS) and 260 s (Windows).

Reference: https://github.com/mamba-org/provision-with-micromamba
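For readers unfamiliar with the action, a minimal job using it might look like the sketch below. The `v15` tag is the version linked elsewhere in this thread; the job name, runner, step names and the final `run` command are assumptions made for illustration and are not taken from this PR's diff.

```yaml
# Hypothetical job: only the action name comes from this thread.
jobs:
  docs:
    runs-on: ubuntu-latest
    defaults:
      run:
        # A login shell is used so that the micromamba environment
        # is activated in the steps that follow the setup step.
        shell: bash -l {0}
    steps:
      - uses: actions/checkout@v3

      - name: Setup Micromamba
        uses: mamba-org/provision-with-micromamba@v15
        with:
          # environment.yml is the file the action reads by default
          environment-file: environment.yml

      - name: Show Python version
        run: python --version
```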

Addresses #584.

seisman marked this pull request as ready for review March 17, 2023 14:54
seisman added the "maintenance" label (Boring but important stuff for the core devs) Mar 17, 2023
seisman added this to the 0.10.0 milestone Mar 19, 2023
seisman requested a review from a team April 17, 2023 05:53

runs-on: ${{ matrix.os }}
strategy:
  fail-fast: false
  matrix:
    python-version: ['3.11']
Member Author

I don't think we will build docs on multiple Python versions, so no need to define python-version in matrix.

Member

weiji14 left a comment

Thanks @seisman, I've been thinking about using provision-with-micromamba for a while to speed up the CI tests. But I'm a little unsure about maintaining more environment.yml files in the repo because it'll require more jumping back and forth. I'm trying to think whether there's a cleaner way to handle our matrix builds somehow...

Comment on lines +101 to +102
cache-downloads: true
cache-env: true
Member

According to https://github.com/mamba-org/provision-with-micromamba/tree/v15#cache-env-key:

With the default environment cache key, separate caches will be created for each operating system (eg., Linux) and platform (eg., x64) and day (eg., 2022-01-31), and the cache will be invalidated whenever the contents of environment-file or extra-specs change.

And at https://github.com/mamba-org/provision-with-micromamba/tree/v15#notes-on-caching / mamba-org/provision-with-micromamba#38 (comment)

  • Env caching should always work OK given how Conda works, and assuming that tar + untar doesn't lose file system permissions etc. One caveat is that you might have older versions of dependencies but that is limited to the cache TTL (currently hardcoded to 1 day). If you make any changes to your Conda env folder for whatever horrible reason, the changes are included in the cached version and your build may break.

  • Download caching is generally slower because it only eliminates the download time. Actually it doesn't eliminate it, only (presumably) reduce it because it's loaded from a "faster" network (Azure Blob storage?!) Potentially it's even slower than an uncached install because download + extraction are possibly done with less parallelism. In practice it always seems to be faster though. Advantage of download caching is that you are always up to date.

So we want to use both env and download caching? We could try both for now, and can modify things later if it becomes problematic.
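Enabling both caches, as suggested here, could look roughly like the following step. The two `cache-*` inputs are the ones under review in this comment; the step name and the `environment-file` value are assumptions for illustration, and the comments only paraphrase the documentation quoted above.

```yaml
- name: Setup Micromamba
  uses: mamba-org/provision-with-micromamba@v15
  with:
    environment-file: environment.yml
    # Cache the downloaded package archives. The environment is still
    # solved on every run, so dependencies stay up to date.
    cache-downloads: true
    # Cache the whole installed environment. Per the quoted docs, the
    # default key is separate per OS, platform and day, and the cache is
    # invalidated when environment-file or extra-specs change.
    cache-env: true
```

If the environment cache turns out to cause stale-dependency problems, dropping `cache-env` while keeping `cache-downloads` would be the more conservative setting.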

Member Author

seisman commented Apr 21, 2023

> But I'm a little unsure about maintaining more environment.yml files in the repo because it'll require more jumping back and forth.

There are two different options to specify the dependencies:

  1. List all dependencies in environment.yml
  2. List all dependencies in the extra-specs key

Option 2 doesn't work well for the ci_tests.yaml workflow, mainly because the extra-specs key requires one package per line, so there is no easy way to use ${{ matrix.optional-packages }} in extra-specs (see my earlier attempts in f7e9764 and b05467a). That's why I had to choose option 1 for the ci_tests.yaml workflow.

For the ci_doctests.yaml workflow, we already have the ci/requirements/docs.yml file, so it's straightforward to reuse it.
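For reference, option 1 relies on a standard conda environment file. A minimal sketch of such a file is shown below; the environment name and the package list are placeholders chosen for illustration and do not reflect the actual contents of ci/requirements/docs.yml.

```yaml
# Hypothetical environment file passed to the action via `environment-file`
name: pygmt-docs
channels:
  - conda-forge
dependencies:
  # One spec per list item; the versions and packages here are examples only
  - python=3.11
  - gmt
  - sphinx
```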

Member Author

seisman commented Apr 21, 2023

> Option 2 doesn't work well for the ci_tests.yaml workflow, mainly because the extra-specs key requires one package per line, so there is no easy way to use ${{ matrix.optional-packages }} in extra-specs (see my earlier attempts in f7e9764 and b05467a). That's why I had to choose option 1 for the ci_tests.yaml workflow.

The problem can be solved if we use multiline-string syntax for optional packages. See https://github.com/GenericMappingTools/pygmt/pull/2507/files for a proof of concept. It means we can list all packages in the workflow file.
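The multiline-string idea can be sketched as follows. This is not copied from the linked proof of concept: the job name, runner, Python versions and package names are placeholders, and the sketch assumes the action tolerates the blank line left in `extra-specs` when `optional-packages` is empty.

```yaml
# Sketch only: versions and package names are placeholders
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - python-version: '3.8'
            optional-packages: ''
          - python-version: '3.11'
            # Block scalar: one package per line, as extra-specs expects
            optional-packages: |
              geopandas
              ipython
    defaults:
      run:
        shell: bash -l {0}
    steps:
      - uses: actions/checkout@v3

      - name: Setup Micromamba
        uses: mamba-org/provision-with-micromamba@v15
        with:
          environment-name: pygmt
          # Do not read environment.yml in addition to extra-specs
          environment-file: false
          channels: conda-forge
          extra-specs: |
            python=${{ matrix.python-version }}
            gmt
            numpy
            ${{ matrix.optional-packages }}
```

Because the matrix value is itself a multi-line string, the expression expands to several lines inside `extra-specs`, which keeps the whole dependency list in the workflow file.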

with:
  environment-name: pygmt
  environment-file: false
Member Author

environment-file must be set to false; otherwise the action reads the environment.yml file in addition to the packages listed in extra-specs.

Comment on lines -66 to -68
# Show installed pkg information for postmortem diagnostic
- name: List installed packages
  run: mamba list
Member Author

seisman commented Apr 21, 2023

The mamba list step is removed because all packages are already listed in the "Environment Info" section of the Setup Micromamba step.

seisman merged commit 5d2d8ba into main Apr 23, 2023
seisman deleted the try-micromamba branch April 23, 2023 23:43
seisman removed the "needs review" label (This PR has higher priority and needs review.) Apr 23, 2023
weiji14 added a commit that referenced this pull request May 12, 2023