Initial GPU support #1967

Merged
merged 51 commits into from
Aug 30, 2024
Changes from 32 commits
Commits
22a5807
Initial implementation of a GPU version of Buffer and NDBuffer
akshaysubr Jun 14, 2024
d8cc79f
Adding cupy as an optional dependency
akshaysubr Jun 14, 2024
4d2b8c7
Adding GPU prototype test
akshaysubr Jun 14, 2024
36b1cb2
Adding GPU memory store implementation
akshaysubr Jun 14, 2024
04001b4
Addressing comments
akshaysubr Jun 17, 2024
74a13c4
Making GpuMemoryStore tests conditional on cupy being available
akshaysubr Jun 17, 2024
bdc0a24
Adding test checking that existing host memory codecs use the gpu_buf…
akshaysubr Jun 18, 2024
d900aa3
Reducing code and docs duplication
akshaysubr Jun 28, 2024
0eca795
Formatting
akshaysubr Jun 28, 2024
d9ed6c4
Fixing silent rebase conflicts
akshaysubr Jun 28, 2024
5405e38
Reducing code duplication in GpuMemoryStore
akshaysubr Jun 28, 2024
2858701
Refactoring to an abstract Buffer class and concrete CPU and GPU impl…
akshaysubr Jul 8, 2024
4e18098
Templating store tests on Buffer type
akshaysubr Jul 8, 2024
35948d4
Changing imports to prevent circular dependencies
akshaysubr Jul 8, 2024
bd2a20b
Fixing unsafe calls to Buffer abstract methods in metadata.py and gro…
akshaysubr Jul 15, 2024
828401f
Preventing calls to abstract classmethods of Buffer and NDBuffer
akshaysubr Jul 15, 2024
02a6e9d
Fixing some more unsafe usage of Buffer abstract class
akshaysubr Aug 9, 2024
ff40d3c
Initial testing with cirun based GPU CI
akshaysubr Aug 9, 2024
e5cfd2f
Reverting to basic ubuntu machine image on GCP
akshaysubr Aug 9, 2024
d473a3d
Switching to cuda image from the docker registry
akshaysubr Aug 9, 2024
2a2e399
Revert "Switching to cuda image from the docker registry"
akshaysubr Aug 9, 2024
b89ab9a
Revert "Reverting to basic ubuntu machine image on GCP"
akshaysubr Aug 9, 2024
c5a387d
Revert "Initial testing with cirun based GPU CI"
akshaysubr Aug 9, 2024
72d172d
Adding pytest mark for GPU tests
akshaysubr Aug 9, 2024
3db61bd
Updating GPU memory store test with gpu mark
akshaysubr Aug 9, 2024
425c3f8
Adding GPU workflow that only runs GPU tests
akshaysubr Aug 9, 2024
75b0ad7
First pass at fixing merge conflicts, still many changes needed
akshaysubr Aug 20, 2024
c8c7e6d
Formatting
akshaysubr Aug 21, 2024
25a67ca
Fixing mypy errors in buffer code
akshaysubr Aug 23, 2024
ce7f5e2
Merging again with v3
akshaysubr Aug 23, 2024
ac061d9
Fixing errors in test_buffer.py
akshaysubr Aug 23, 2024
523d8d5
Fixing errors in test_buffer.py
akshaysubr Aug 23, 2024
b559ee4
Fixing store test errors
akshaysubr Aug 23, 2024
26a74f4
Fixing stateful store test
akshaysubr Aug 23, 2024
7307833
Fixing config test
akshaysubr Aug 23, 2024
f6fddd9
Fixing group tests
akshaysubr Aug 23, 2024
2b1fe14
Fixing indexing tests
akshaysubr Aug 23, 2024
abd135f
Manually installing cupy in the GPU workflow
akshaysubr Aug 23, 2024
1db58e7
Ablating GPU test matrix and adding gpu optional dependencies to the …
akshaysubr Aug 24, 2024
296bd02
Adding some more logging to debug GPU test failures
akshaysubr Aug 26, 2024
b33c887
Adding GA step to install the CUDA toolkit
akshaysubr Aug 26, 2024
c894f60
Merging with v3
akshaysubr Aug 26, 2024
e0da0fb
Adding a separate gputest hatch environment to simplify GPU testing
akshaysubr Aug 27, 2024
07277af
Fixing error in cuda-toolkit step
akshaysubr Aug 28, 2024
6e49e85
Downgrading to CUDA 12.4.1 in cuda-toolkit GA
akshaysubr Aug 28, 2024
02c319c
Trying manual install of the CUDA toolkit
akshaysubr Aug 29, 2024
e82ddc1
Updating environment variables with CUDA installation
akshaysubr Aug 29, 2024
7854ce9
Removing PATH env and setting it only through GITHUB_PATH
akshaysubr Aug 29, 2024
9688ad6
Merge branch 'v3' into gpu-buffer-implementation
akshaysubr Aug 29, 2024
3852c9f
Fixing issue from merge conflict
akshaysubr Aug 29, 2024
2e8069c
Merge branch 'v3' into gpu-buffer-implementation
d-v-b Aug 30, 2024
48 changes: 48 additions & 0 deletions .github/workflows/gpu_test.yml
@@ -0,0 +1,48 @@
+# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
+# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
+
+name: GPU Test V3
+
+on:
+  push:
+    branches: [ v3 ]
+  pull_request:
+    branches: [ v3 ]
+  workflow_dispatch:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  test:
+    name: py=${{ matrix.python-version }}, np=${{ matrix.numpy-version }}, deps=${{ matrix.dependency-set }}
+
+    runs-on: gpu-runner
+    strategy:
+      matrix:
+        python-version: ['3.10', '3.11', '3.12']
+        numpy-version: ['1.24', '1.26', '2.0']
+        dependency-set: ["minimal", "optional"]
Contributor Author

@jhamman What test matrix do we want to have for GPU testing? This current config seems a bit excessive? Might be worth cutting this down to the bare minimum to keep CI costs down?

Member

jhamman Aug 23, 2024

What if we did something like this for now:

Suggested change
- python-version: ['3.10', '3.11', '3.12']
- numpy-version: ['1.24', '1.26', '2.0']
- dependency-set: ["minimal", "optional"]
+ python-version: ['3.11']
+ numpy-version: ['2.0']
+ dependency-set: ["gpu"]

The gpu dependency set would need to be defined in pyproject.toml.

Contributor Author

This would be ideal. There is currently a gpu dependency set in pyproject.toml, but I'm not sure how to get hatch to pick it up.
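
To put a number on the cost concern raised in this thread, the sketch below expands both matrices into the environment names that the workflow's `test.py<python>-<numpy>-<deps>` pattern would produce. The helper name `hatch_env_names` is made up for illustration; only the naming pattern and the matrix values come from the diff.

```python
from itertools import product


def hatch_env_names(python_versions, numpy_versions, dependency_sets):
    """Expand a CI matrix into names following the workflow's env pattern."""
    return [
        f"test.py{py}-{np_}-{deps}"
        for py, np_, deps in product(python_versions, numpy_versions, dependency_sets)
    ]


original = hatch_env_names(
    ["3.10", "3.11", "3.12"], ["1.24", "1.26", "2.0"], ["minimal", "optional"]
)
suggested = hatch_env_names(["3.11"], ["2.0"], ["gpu"])

print(len(original))  # 18 GPU jobs per run with the original matrix
print(suggested)      # ['test.py3.11-2.0-gpu']
```

The original matrix schedules 18 jobs on the GPU runner for every push or pull request, while the suggested one schedules a single job.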


+    steps:
+      - uses: actions/checkout@v4
+      - name: GPU check
+        run: |
+          nvidia-smi
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+          cache: 'pip'
+      - name: Install Hatch
+        run: |
+          python -m pip install --upgrade pip
+          pip install hatch
+      - name: Set Up Hatch Env
+        run: |
+          hatch env create test.py${{ matrix.python-version }}-${{ matrix.numpy-version }}-${{ matrix.dependency-set }}
+          hatch env run -e test.py${{ matrix.python-version }}-${{ matrix.numpy-version }}-${{ matrix.dependency-set }} list-env
+      - name: Run Tests
+        run: |
+          hatch env run --env test.py${{ matrix.python-version }}-${{ matrix.numpy-version }}-${{ matrix.dependency-set }} run-coverage-gpu
8 changes: 8 additions & 0 deletions pyproject.toml
@@ -74,6 +74,9 @@ jupyter = [
     'ipytree>=0.2.2',
     'ipywidgets>=8.0.0',
 ]
+gpu = [
+    "cupy>=13.0.0",
+]
 docs = [
     'sphinx',
     'sphinx-autobuild>=2021.3.14',
@@ -136,6 +139,7 @@ features = ["optional"]

 [tool.hatch.envs.test.scripts]
 run-coverage = "pytest --cov-config=pyproject.toml --cov=pkg --cov=tests"
+run-coverage-gpu = "pytest -m gpu --cov-config=pyproject.toml --cov=pkg --cov=tests"
 run = "run-coverage --no-cov"
 run-verbose = "run-coverage --verbose"
 run-mypy = "mypy src"
@@ -223,4 +227,8 @@ filterwarnings = [
     "error:::zarr.*",
     "ignore:PY_SSIZE_T_CLEAN will be required.*:DeprecationWarning",
     "ignore:The loop argument is deprecated since Python 3.8.*:DeprecationWarning",
+    "ignore:Creating a zarr.buffer.gpu.*:UserWarning",
 ]
+markers = [
+    "gpu: mark a test as requiring CuPy and GPU"

Member

Maybe it is worth using cuda here?

Member

You're thinking, @jakirkham, that then there could be a multiplicity of these.

+]
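
As a rough illustration of how the `gpu` marker registered above might be used, the sketch below tags a test so that `pytest -m gpu` selects it, and skips it when CuPy cannot be imported. The test name and body are hypothetical and are not taken from this PR's test suite.

```python
import importlib.util

import pytest

# True when CuPy can be imported; this says nothing about whether a GPU is present.
HAS_CUPY = importlib.util.find_spec("cupy") is not None


@pytest.mark.gpu  # selected by `pytest -m gpu`
@pytest.mark.skipif(not HAS_CUPY, reason="cupy is not installed")
def test_device_sum():
    import cupy as cp  # imported lazily so collection works without CuPy

    data = cp.arange(10)
    assert int(data.sum()) == 45
```

Registering the marker in `pyproject.toml` keeps pytest from warning about an unknown mark, and the `skipif` guard lets the same file be collected on CPU-only machines.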
3 changes: 2 additions & 1 deletion src/zarr/codecs/blosc.py
@@ -10,7 +10,8 @@

 from zarr.abc.codec import BytesBytesCodec
 from zarr.core.array_spec import ArraySpec
-from zarr.core.buffer import Buffer, as_numpy_array_wrapper
+from zarr.core.buffer import Buffer
+from zarr.core.buffer.cpu import as_numpy_array_wrapper
 from zarr.core.common import JSON, parse_enum, parse_named_configuration, to_thread
 from zarr.registry import register_codec

3 changes: 2 additions & 1 deletion src/zarr/codecs/gzip.py
@@ -7,7 +7,8 @@

 from zarr.abc.codec import BytesBytesCodec
 from zarr.core.array_spec import ArraySpec
-from zarr.core.buffer import Buffer, as_numpy_array_wrapper
+from zarr.core.buffer import Buffer
+from zarr.core.buffer.cpu import as_numpy_array_wrapper
 from zarr.core.common import JSON, parse_named_configuration, to_thread
 from zarr.registry import register_codec

3 changes: 2 additions & 1 deletion src/zarr/codecs/zstd.py
@@ -9,7 +9,8 @@

 from zarr.abc.codec import BytesBytesCodec
 from zarr.core.array_spec import ArraySpec
-from zarr.core.buffer import Buffer, as_numpy_array_wrapper
+from zarr.core.buffer import Buffer
+from zarr.core.buffer.cpu import as_numpy_array_wrapper
 from zarr.core.common import JSON, parse_named_configuration, to_thread
 from zarr.registry import register_codec

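
The three codec diffs above only move the `as_numpy_array_wrapper` import into the CPU buffer module; they do not show what the function does. One plausible reading is that it adapts a host-only compression routine to the buffer abstraction. The sketch below illustrates that idea with toy stand-ins: `ToyBuffer`, `to_host_bytes`, and `host_codec_wrapper` are assumptions made for this example and are not zarr's API.

```python
import zlib
from typing import Callable


class ToyBuffer:
    """Hypothetical stand-in for a buffer type; not zarr's Buffer."""

    def __init__(self, data: bytes) -> None:
        self._data = bytes(data)

    @classmethod
    def from_bytes(cls, data: bytes) -> "ToyBuffer":
        return cls(data)

    def to_host_bytes(self) -> bytes:
        # A GPU-backed buffer would copy device memory to the host here.
        return self._data


def host_codec_wrapper(
    func: Callable[[bytes], bytes], buf: ToyBuffer, buffer_cls: type
) -> ToyBuffer:
    """Run a host-only function on a buffer's contents and re-wrap the result."""
    return buffer_cls.from_bytes(func(buf.to_host_bytes()))


compressed = host_codec_wrapper(zlib.compress, ToyBuffer(b"zarr" * 100), ToyBuffer)
restored = host_codec_wrapper(zlib.decompress, compressed, ToyBuffer)
```

Passing the output buffer class explicitly is what would let a CPU codec hand its result back in whatever buffer type the caller asked for.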
13 changes: 11 additions & 2 deletions src/zarr/core/array.py
@@ -512,15 +512,24 @@ async def _set_selection(

         # check value shape
         if np.isscalar(value):
-            value = np.asanyarray(value, dtype=self.metadata.dtype)
+            array_like = prototype.buffer.create_zero_length().as_array_like()
+            if isinstance(array_like, np._typing._SupportsArrayFunc):
+                # TODO: need to handle array types that don't support __array_function__
+                # like PyTorch and JAX
+                array_like_ = cast(np._typing._SupportsArrayFunc, array_like)
+            value = np.asanyarray(value, dtype=self.metadata.dtype, like=array_like_)
         else:
             if not hasattr(value, "shape"):
                 value = np.asarray(value, self.metadata.dtype)
             # assert (
             #     value.shape == indexer.shape
             # ), f"shape of value doesn't match indexer shape. Expected {indexer.shape}, got {value.shape}"
             if not hasattr(value, "dtype") or value.dtype.name != self.metadata.dtype.name:
-                value = np.array(value, dtype=self.metadata.dtype, order="A")
+                if hasattr(value, "astype"):
+                    # Handle things that are already NDArrayLike more efficiently
+                    value = value.astype(dtype=self.metadata.dtype, order="A")
+                else:
+                    value = np.array(value, dtype=self.metadata.dtype, order="A")
         value = cast(NDArrayLike, value)
         # We accept any ndarray like object from the user and convert it
         # to a NDBuffer (or subclass). From this point onwards, we only pass
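
The hunk above leans on two ideas: NumPy's `like=` argument, which asks array creation to dispatch to the reference array's type, and preferring an array's own `astype` over rebuilding it with `np.array`. The standalone sketch below shows both using plain NumPy arrays; the function name `coerce_value` is hypothetical, and a NumPy array stands in for the zero-length array that the prototype would supply.

```python
import numpy as np


def coerce_value(value, dtype, like):
    """Return `value` as an array of `dtype`, created in the same array family as `like`."""
    dtype = np.dtype(dtype)
    if np.isscalar(value):
        # `like=` asks NumPy to dispatch creation to the reference array's type.
        return np.asanyarray(value, dtype=dtype, like=like)
    if not hasattr(value, "shape"):
        value = np.asarray(value, dtype)
    if value.dtype.name != dtype.name:
        # Array-likes convert themselves, which avoids a round trip through NumPy.
        if hasattr(value, "astype"):
            value = value.astype(dtype, order="A")
        else:
            value = np.array(value, dtype=dtype, order="A")
    return value


reference = np.empty(0)  # stands in for the prototype's zero-length array
scalar = coerce_value(3, "float32", reference)
widened = coerce_value(np.arange(4, dtype="int8"), "int32", reference)
```

With a CuPy array as `reference`, the scalar branch would be expected to produce a device array instead, which is the point of threading the prototype through.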
19 changes: 19 additions & 0 deletions src/zarr/core/buffer/__init__.py
@@ -0,0 +1,19 @@
+from zarr.core.buffer.core import (
+    ArrayLike,
+    Buffer,
+    BufferPrototype,
+    NDArrayLike,
+    NDBuffer,
+    default_buffer_prototype,
+)
+from zarr.core.buffer.cpu import numpy_buffer_prototype
+
+__all__ = [
+    "ArrayLike",
+    "Buffer",
+    "NDArrayLike",
+    "NDBuffer",
+    "BufferPrototype",
+    "default_buffer_prototype",
+    "numpy_buffer_prototype",
+]
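
The commit history describes a refactor to an abstract `Buffer` class with concrete CPU and GPU implementations, plus fixes for code that called abstract classmethods directly. The toy sketch below shows the general shape of that pattern using only the standard library. The class bodies, the `CpuBuffer` name, and the single-field `BufferPrototype` are simplifications made for illustration and are not the definitions in `zarr.core.buffer`.

```python
from abc import ABC, abstractmethod
from typing import NamedTuple


class Buffer(ABC):
    """Toy abstract buffer; the real class has a much larger interface."""

    @classmethod
    @abstractmethod
    def from_bytes(cls, data: bytes) -> "Buffer":
        # Calling this on the abstract class itself is a programming error.
        raise NotImplementedError("use a concrete Buffer subclass")

    @abstractmethod
    def to_bytes(self) -> bytes: ...


class CpuBuffer(Buffer):
    """Concrete host-memory implementation backed by `bytes`."""

    def __init__(self, data: bytes) -> None:
        self._data = bytes(data)

    @classmethod
    def from_bytes(cls, data: bytes) -> "CpuBuffer":
        return cls(data)

    def to_bytes(self) -> bytes:
        return self._data


class BufferPrototype(NamedTuple):
    """Bundles the concrete class that calling code should instantiate."""

    buffer: type


cpu_prototype = BufferPrototype(buffer=CpuBuffer)
buf = cpu_prototype.buffer.from_bytes(b"chunk")
```

Code that goes through `cpu_prototype.buffer` never names a concrete class, so swapping in a GPU-backed implementation would only require passing a different prototype.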