Add "multi device" support #59

betatim · 2024-09-04T08:56:50Z

Having more than one device is useful during testing to allow you to find bugs related to how arrays on different devices are handled. Closes #56

With scikit-learn we run into the frustrating situation where contributors execute tests locally and they all pass, but then see failures on the CI related to the fact that e.g. PyTorch has several devices and some things work on the CPU device but not on the CUDA/MPS device. However, if you have neither of those on your local machine you can't really test this upfront, and to debug it you need to rely on the CI.

The idea of this PR is to add support for multiple devices to array-api-strict to make testing easier. The default device continues to be the CPU device, and for arrays that use it nothing should change. However, you can now place an array on a different device with array_api_strict.Device("pony") (or some other string; each string is a new device). For arrays on a device that isn't the CPU device, calls like np.asarray(some_strict_array) will raise an error. This mirrors how PyTorch treats arrays on the CPU and MPS devices.
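As a rough illustration of the behaviour described above, here is a minimal, self-contained sketch. It does not use array-api-strict itself: `ToyDevice`, `ToyArray`, and `to_host` are hypothetical names, `to_host` stands in for the conversion that `np.asarray` would trigger, and the `RuntimeError` is an assumed exception type.

```python
class ToyDevice:
    """Hypothetical stand-in for a device object; each distinct name is a distinct device."""

    def __init__(self, name="CPU_DEVICE"):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, ToyDevice) and self.name == other.name

    def __hash__(self):
        return hash(("ToyDevice", self.name))

    def __repr__(self):
        return f"ToyDevice({self.name!r})"


CPU = ToyDevice()


class ToyArray:
    """Hypothetical array wrapper that remembers which device it lives on."""

    def __init__(self, data, device=CPU):
        self._data = list(data)
        self.device = device

    def to_host(self):
        # Stand-in for np.asarray(x): only arrays on the default device convert.
        if self.device != CPU:
            raise RuntimeError(f"cannot convert an array on {self.device!r} to a host array")
        return list(self._data)


x = ToyArray([1, 2, 3])                            # default device, behaves as before
y = ToyArray([1, 2, 3], device=ToyDevice("pony"))  # any other string is a new device

print(x.to_host())
try:
    y.to_host()
except RuntimeError as exc:
    print("error:", exc)
```

Code that only ever touches the default device never sees the error, which is the backwards-compatibility property the description relies on.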

What isn't yet implemented in this PR is raising an error if you try to operate on arrays that are not on the same device.

I wanted to open this PR now, after just a short amount of effort, to get feedback on what people think about this before putting in the time to update all the tests, etc.

lucascolley · 2024-09-04T18:20:05Z

+1! Another question is what the default should be (technically Device("pony") is more strict), but probably better if we can keep the cpu default for backwards compatibility.

betatim · 2024-09-06T08:28:45Z

I think the CPU device should be the default. That way code that exists today should keep working and the only people who notice any changes are those who use the pony device.

asmeurer · 2024-09-07T20:22:16Z

This looks good so far. We need to make sure the semantics specified at https://data-apis.org/array-api/latest/design_topics/device_support.html#semantics are followed, namely, disallowing combining arrays from different devices, and making sure that if a function creates a new array based on an existing array that it uses the same device.
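The two semantics named above (no combining across devices, and outputs inheriting the input's device) can be sketched as follows. This is a toy under stated assumptions, not the library's code: `ToyArray` and this `zeros_like` are hypothetical, devices are plain strings, and `ValueError` is used because a later commit title in this PR mentions it for different-device errors.

```python
class ToyArray:
    """Hypothetical array wrapper; devices are plain strings in this sketch."""

    def __init__(self, data, device="CPU_DEVICE"):
        self._data = list(data)
        self.device = device

    def __add__(self, other):
        if not isinstance(other, ToyArray):
            return NotImplemented
        # Rule 1: refuse to combine arrays that live on different devices.
        if self.device != other.device:
            raise ValueError(
                f"arrays are on different devices: {self.device!r} and {other.device!r}"
            )
        # Rule 2: the result is created on the same device as the inputs.
        summed = [a + b for a, b in zip(self._data, other._data)]
        return ToyArray(summed, device=self.device)


def zeros_like(x):
    # A creation function based on an existing array inherits that array's device.
    return ToyArray([0] * len(x._data), device=x.device)


a = ToyArray([1, 2], device="device1")
b = ToyArray([3, 4], device="device1")
c = ToyArray([5, 6])  # default device

print((a + b).device, zeros_like(a).device)
try:
    a + c
except ValueError as exc:
    print("error:", exc)
```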

For tests, ideally this would be tested in array-api-tests, but right now device support is not tested at all there. If you just want to add some basic tests here for now, that is fine.

Finally, there is the devices inspection API. https://data-apis.org/array-api/latest/API_specification/generated/array_api.info.devices.html#array_api.info.devices We need to think about how that will work. One option would just be to create a small but fixed number of devices. Or we could add some flags to make it configurable https://data-apis.org/array-api-strict/api.html#array_api_strict.set_array_api_strict_flags

betatim · 2024-09-27T09:26:47Z

I've rebooted my work. The long pause is because I went on holiday :D

I think we have to limit ourselves to a fixed number of devices, otherwise we can't fulfill the requirement that the info extension can provide a list of devices. So now you can use the CPU_DEVICE and the (creatively named) device1 and device2.

Slowly making progress towards the creation functions and "array combination" functions respecting the device
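To make the reasoning about a fixed device set concrete: an inspection function can only return a complete list if the set of devices is closed. The sketch below assumes hypothetical `ToyDevice` and `ToyInfo` classes; only the three device names come from this thread, and the method names follow the standard's inspection API (`default_device()` and `devices()`).

```python
class ToyDevice:
    """Hypothetical device object restricted to a fixed set of names."""

    _ALLOWED = ("CPU_DEVICE", "device1", "device2")

    def __init__(self, name="CPU_DEVICE"):
        if name not in self._ALLOWED:
            raise ValueError(f"unsupported device {name!r}; choose one of {self._ALLOWED}")
        self.name = name

    def __eq__(self, other):
        return isinstance(other, ToyDevice) and self.name == other.name

    def __hash__(self):
        return hash(("ToyDevice", self.name))

    def __repr__(self):
        return f"ToyDevice({self.name!r})"


class ToyInfo:
    """Hypothetical stand-in for the inspection namespace."""

    def default_device(self):
        return ToyDevice("CPU_DEVICE")

    def devices(self):
        # Enumerating every device is possible only because the set is closed.
        return [ToyDevice(name) for name in ToyDevice._ALLOWED]


info = ToyInfo()
print(info.devices())
```

With arbitrary strings such as "pony" allowed, `devices()` could never be exhaustive, which is the trade-off described above.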

betatim · 2024-09-27T12:05:42Z

Do you use ruff or black or something like that for formatting?

betatim · 2024-09-27T14:10:00Z

It looks like it would be quite tricky to add device testing to array-api-tests. At least at my level of knowledge (unfamiliar with hypothesis and array-api-tests). It looks like you'd add a helper, maybe all_devices, similar to all_dtypes and then use it in the @given decorator. The tricky thing is that how to specify a device depends on the library, in a new version of the standard you could use the inspection API to get all devices. So yeah, for now I might add some basic testing here.

asmeurer · 2024-09-27T21:52:40Z

> Do you use ruff or black or something like that for formatting?

There's no autoformatting on this repo.

> It looks like it would be quite tricky to add device testing to array-api-tests. At least at my level of knowledge (unfamiliar with hypothesis and array-api-tests). It looks like you'd add a helper, maybe all_devices, similar to all_dtypes and then use it in the @given decorator. The tricky thing is that how to specify a device depends on the library, in a new version of the standard you could use the inspection API to get all devices. So yeah, for now I might add some basic testing here.

I think it would have to use the devices() function in the inspection API. That would mean that the tests would only work against the newest version of the standard and it would only work against the compat library, but I think that's fine. You'd also probably want to make it optional.

It's also possible to do some basic testing using the default device, like that x.device and device= are consistent.

The annoying thing for the test suite is making sure every function everywhere is passing device through properly so that everything gets created on the same device. It would also probably require some upstream fixes to the hypothesis array-api support.

asmeurer · 2024-09-27T21:56:44Z

I think what we need here are just some big parameterized tests combining basic example arrays with different devices across all the APIs. For instance, there's an existing test that checks type promotion and the "no mixing devices" test could look very similar to that.
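One possible shape for such a "no mixing devices" check, sketched without any test framework: loop over every (function, device, device) combination and require success only when the devices match. `ToyArray`, `BINARY_FUNCS`, and `check_no_mixing` are hypothetical; the device names are the ones mentioned earlier in this thread.

```python
import itertools
import operator

DEVICES = ("CPU_DEVICE", "device1", "device2")


class ToyArray:
    """Hypothetical array wrapper that refuses cross-device operations."""

    def __init__(self, data, device):
        self._data = list(data)
        self.device = device

    def _binary(self, other, op):
        if self.device != other.device:
            raise ValueError("arrays are on different devices")
        result = [op(a, b) for a, b in zip(self._data, other._data)]
        return ToyArray(result, device=self.device)

    def __add__(self, other):
        return self._binary(other, operator.add)

    def __mul__(self, other):
        return self._binary(other, operator.mul)


# Stand-in for the existing list of two-argument functions.
BINARY_FUNCS = {"add": operator.add, "multiply": operator.mul}


def check_no_mixing(func, dev1, dev2):
    x1 = ToyArray([1, 2], device=dev1)
    x2 = ToyArray([3, 4], device=dev2)
    if dev1 == dev2:
        # Same device: must succeed and the result must stay on that device.
        assert func(x1, x2).device == dev1
        return "ok"
    try:
        func(x1, x2)
    except ValueError:
        return "raised"
    raise AssertionError(f"{dev1} and {dev2} were combined without an error")


results = {
    (name, d1, d2): check_no_mixing(func, d1, d2)
    for (name, func), d1, d2 in itertools.product(BINARY_FUNCS.items(), DEVICES, DEVICES)
}
print(len(results), "combinations checked")
```

In a pytest suite the same product would more naturally be expressed with `pytest.mark.parametrize`, so each combination is reported as its own test case.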

betatim · 2024-10-03T07:05:13Z

array_api_strict/_elementwise_functions.py



-def logaddexp(x1: Array, x2: Array) -> Array:
+def logaddexp(x1: Array, x2: Array, /) -> Array:


I think this was missing/typo.

betatim · 2024-10-03T07:09:57Z

array_api_strict/tests/test_elementwise_functions.py

@@ -19,8 +19,16 @@

 import pytest

+import array_api_strict
+
+
 def nargs(func):


I modified this so that it works with decorated/wrapped functions as well. len(getfullargspec(f).args) returns zero for functions that are decorated with the array API version decorator. From what I can tell from the Python docs this is kind of on purpose/to preserve existing behaviour. signature() does the right thing for wrapped functions, but it needs slightly more explicit work to count the arguments.

I think the intention/rule is that nargs() counts the number of positional-only arguments, which is basically the "number of arrays you need to pass to an elementwise function". I went with the very explicit way of counting the args partially as a way to make it easier for people from the future to understand what nargs is meant to do (even if it contains a bug and doesn't actually do what it is meant to do).
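The difference between the two inspection calls can be reproduced in isolation. In this sketch `requires_api_version` is a hypothetical stand-in for the array API version decorator, and this `nargs` is one way to count positional-only parameters, not necessarily the exact code in the PR.

```python
import functools
import inspect


def requires_api_version(func):
    """Hypothetical stand-in for the array API version decorator."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


@requires_api_version
def logaddexp(x1, x2, /):
    """Toy two-argument function; the body is irrelevant here."""
    return x1 + x2


def nargs(func):
    """Count positional-only parameters, looking through functools.wraps."""
    params = inspect.signature(func).parameters.values()
    return sum(1 for p in params if p.kind == inspect.Parameter.POSITIONAL_ONLY)


# getfullargspec inspects the wrapper itself, which only has *args/**kwargs.
print(len(inspect.getfullargspec(logaddexp).args))
# signature() follows __wrapped__ back to the original function.
print(nargs(logaddexp))
```

Counting only positional-only parameters also means a function that is missing its trailing `/` shows up with a count of zero, which makes signature mistakes visible.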

betatim · 2024-10-03T07:12:38Z

array_api_strict/tests/test_elementwise_functions.py

@@ -91,12 +99,57 @@ def nargs(func):
    "trunc": "real numeric",
 }

+
+def test_nargs():


A short test to make sure nargs works but also that all of the functions that we look at have "the right signature" - I found logaddexp was missing that trailing / when working on nargs. So it seems useful to have an "all functions have a reasonable number of arguments" test.

asmeurer · 2024-10-03

That's good. The array-api test suite doesn't check for positional-only, though it probably could now that it uses inspect.signature.

betatim · 2024-10-08T13:38:40Z

Does someone know more about the failure? It looks like it is not to do with the actual code but with computing the expected shape and that overflowing because the array is of dtype int8. How do we fix that?

asmeurer · 2024-10-09T10:26:25Z

Sorry, that is from a new test that I added in the test suite. I guess I didn't catch all the corner cases. You can ignore it for now.

betatim · 2024-10-09T11:54:54Z

In that case, I think, this is ready?!

I tried to modify all the functions that return an array to take into account the device of the input. For some I've added tests that check this, but I think really array-api-tests should be doing this (check that input and output device are consistent). So maybe adding tests for all functions is not needed.

asmeurer · 2024-10-14T17:41:10Z

array_api_strict/__init__.py

@@ -309,6 +309,9 @@

 __all__ += ["all", "any"]

+from ._array_object import Device
+__all__ += ["Device"]


I'd rather not add this to the __init__.py since it isn't part of the array API. If it is necessary to have some public APIs to create device objects we should make APIs that are more obviously array-api-strict specific (similar to the flags APIs).

asmeurer · 2024-10-14T17:44:38Z

array_api_strict/_array_object.py

@@ -625,19 +661,21 @@ def __getitem__(
        """
        Performs the operation __getitem__.
        """
+        # XXX Does key have to be on the same device? Is there an exception for CPU_DEVICE?


I guess this is underspecified. I would need to test what PyTorch and others do, but I would suspect that an implicit cross-device array key is not something that's intended to be supported, since that still would require an implicit device transfer.

asmeurer · 2024-10-14T17:48:12Z

array_api_strict/_dtypes.py

@@ -121,7 +121,7 @@ def __hash__(self):
    "integer": _integer_dtypes,
    "integer or boolean": _integer_or_boolean_dtypes,
    "boolean": _boolean_dtypes,
-    "real floating-point": _floating_dtypes,
+    "real floating-point": _real_floating_dtypes,


Good catch. Is this covered by one of the new tests?

betatim · 2024-10-16

Not explicitly. I found this because a test was failing, looked into it, and found the cause. But it is too long ago for me to remember what exactly it was :-/

asmeurer · 2024-10-14T17:53:24Z

It's hard to tell just from the diff if you missed anything. Here are all the places in the code that call _new without a device keyword:

$ git grep -n '_new(' | grep -v 'device'
array_api_strict/_array_object.py:278:        return Array._new(np.array(scalar, dtype=self.dtype._np_dtype))
array_api_strict/_array_object.py:310:            x1 = Array._new(x1._array[None])
array_api_strict/_array_object.py:312:            x2 = Array._new(x2._array[None])
array_api_strict/_array_object.py:496:        return self.__class__._new(res)
array_api_strict/_array_object.py:684:        return self.__class__._new(res)
array_api_strict/_array_object.py:1030:        return self.__class__._new(res)
array_api_strict/_array_object.py:1042:        return self.__class__._new(res)
array_api_strict/_array_object.py:1063:        return self.__class__._new(res)
array_api_strict/_array_object.py:1084:        return self.__class__._new(res)
array_api_strict/_array_object.py:1105:        return self.__class__._new(res)
array_api_strict/_array_object.py:1149:        return self.__class__._new(res)
array_api_strict/_array_object.py:1170:        return self.__class__._new(res)
array_api_strict/_array_object.py:1191:        return self.__class__._new(res)
array_api_strict/_array_object.py:1212:        return self.__class__._new(res)
array_api_strict/_array_object.py:1282:        return self.__class__._new(self._array.T)
array_api_strict/_creation_functions.py:82:                return Array._new(new_array)
array_api_strict/_creation_functions.py:214:    return Array._new(np.from_dlpack(x))
array_api_strict/_data_type_functions.py:57:        Array._new(array) for array in np.broadcast_arrays(*[a._array for a in arrays])
array_api_strict/_data_type_functions.py:69:    return Array._new(np.broadcast_to(x._array, shape))
array_api_strict/_linalg.py:62:        U = Array._new(L).mT
array_api_strict/_linalg.py:66:    return Array._new(L)
array_api_strict/_linalg.py:94:    return Array._new(np.cross(x1._array, x2._array, axis=axis))
array_api_strict/_linalg.py:107:    return Array._new(np.linalg.det(x._array))
array_api_strict/_linalg.py:119:    return Array._new(np.diagonal(x._array, offset=offset, axis1=-2, axis2=-1))
array_api_strict/_linalg.py:150:    return Array._new(np.linalg.eigvalsh(x._array))
array_api_strict/_linalg.py:164:    return Array._new(np.linalg.inv(x._array))
array_api_strict/_linalg.py:184:    return Array._new(np.linalg.norm(x._array, axis=(-2, -1), keepdims=keepdims, ord=ord))
array_api_strict/_linalg.py:200:    return Array._new(np.linalg.matrix_power(x._array, n))
array_api_strict/_linalg.py:223:    return Array._new(np.count_nonzero(S > tol, axis=-1))
array_api_strict/_linalg.py:243:    return Array._new(np.outer(x1._array, x2._array))
array_api_strict/_linalg.py:262:    return Array._new(np.linalg.pinv(x._array, rcond=rtol))
array_api_strict/_linalg.py:351:    return Array._new(_solve(x1._array, x2._array))
array_api_strict/_linalg.py:375:    return Array._new(np.linalg.svd(x._array, compute_uv=False))
array_api_strict/_linalg.py:400:    return Array._new(np.asarray(np.trace(x._array, offset=offset, axis1=-2, axis2=-1, dtype=dtype)))
array_api_strict/_linalg.py:440:    res = Array._new(np.linalg.norm(a, axis=_axis, ord=ord))
array_api_strict/_searching_functions.py:46:    return tuple(Array._new(i) for i in np.nonzero(x._array))

So it would be worth double checking all of those.

asmeurer · 2024-10-14T17:55:10Z

As you mentioned, we should indeed be testing most of this in the test suite. However, I'm not really sure how soon that will happen. There's quite a backlog of things to do in the test suite right now, and my current priority is implementing tests for new functions added in 2023.12 or 2024.12 versions of the standard. So some very simple tests here would not hurt. The tests already have a list of two-argument functions which could be reused.

asmeurer · 2024-10-14T17:55:46Z

Were you wanting to add support for devices that don't support certain dtypes, or is that something that we should add in a later pull request?

asmeurer · 2024-10-14T17:59:43Z

> Here are all the places in the code that call _new without a device keyword

Actually, I think we should make device a required argument to Array._new. That way all APIs in array-api-strict are required to make sure they do proper device handling.
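A small sketch of why a required argument helps: if the constructor has no default for `device`, any call site that forgets it fails immediately instead of silently creating an array on the default device. `ToyArray` is hypothetical, and making `device` keyword-only is an assumption of this sketch rather than a statement about the real `Array._new` signature.

```python
class ToyArray:
    """Hypothetical array wrapper with a private constructor."""

    @classmethod
    def _new(cls, data, /, *, device):
        # `device` has no default, so omitting it raises TypeError at the call site.
        obj = object.__new__(cls)
        obj._data = list(data)
        obj.device = device
        return obj

    def __neg__(self):
        # Every method that builds a result has to say where the result lives.
        return self.__class__._new([-v for v in self._data], device=self.device)


x = ToyArray._new([1, -2], device="device1")
print((-x).device)

try:
    ToyArray._new([1, -2])  # forgot device=
except TypeError as exc:
    print("error:", exc)
```

This turns the grep-based audit into something the interpreter enforces on every code path that is actually exercised.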

betatim · 2024-10-16T08:41:30Z

> Were you wanting to add support for devices that don't support certain dtypes, or is that something that we should add in a later pull request?

I'd do that in a separate PR. If only because this one is already quite long and hard to check by looking at the diff.

> Here are all the places in the code that call _new without a device keyword

> Actually, I think we should make device a required argument to Array._new. That way all APIs in array-api-strict are required to make sure they do proper device handling.

That is a good idea. I like it

asmeurer · 2024-10-16T18:54:07Z

Nice. I feel much better about this after the latest commit making device required.

It looks like another PR I just merged has created a small conflict here, but other than that, I am +1 to merging this.

betatim · 2024-10-18T08:31:25Z

Thanks for taking this over the finish line!

asmeurer · 2024-10-18T18:34:49Z

I opened #70 for ideas for further work here.

lucascolley · 2024-12-10T22:00:19Z

@betatim @asmeurer am I correct in thinking that the "default device" is now supposed to be CPU_DEVICE as defined in the array object file? If so, I think there is a one-line fix to gh-99, which is blocking us from removing dependency on __array__ in SciPy.

betatim · 2024-12-19T08:49:44Z

Yes that is the default device (I've not followed the scipy discussion, so I have no idea if this is a good thing?)

Add "multi device" support

600df5e

Having more than one device is useful during testing to allow you to find bugs related to how arrays on different devices are handled.

lucascolley mentioned this pull request Sep 19, 2024

TST/DEV: allow stacking of skip_xp_backends scipy/scipy#21579

Merged

betatim added 2 commits September 27, 2024 10:38

Loop device through elementwise functions

325b9d0

Define __hash__

6cc7bac

betatim added 2 commits September 27, 2024 13:55

More device pass through

bca670a

Fix meshgrid

426609f

Add testing and small typo fixes

727072f

betatim commented Oct 3, 2024

betatim added 9 commits October 3, 2024 09:50

Add a comment about atanh special casing

23f390e

Add conversion to NumPy test

405b7e7

The default device should continue to convert, but other arrays from other devices should error.

Add multi-device support to sorting functions

03e1ae7

More multi-device support

3bc8199

Formatting

032f3bb

Add multi-device test for take

724e071

Multi-device support in linear algebra functions

e0b2a64

Multi-device support for array manipulation

9323324

Add multi-device support for searching

ff37de7

betatim force-pushed the multiple-devices branch from 9245481 to ff37de7 Compare October 3, 2024 14:55

betatim added 2 commits October 3, 2024 17:05

Add multi-device support to stats and sets

bae7482

Add multi-device support for utils

cca1785

betatim force-pushed the multiple-devices branch from 760c230 to 9c5436c Compare October 7, 2024 12:37

Fix result device

0dbabcc

asmeurer reviewed Oct 14, 2024

asmeurer reviewed Oct 14, 2024

Make device= a required argument to create an Array

8e6365b

asmeurer added 3 commits October 16, 2024 13:07

Merge branch 'main' into betatim-multiple-devices

635e14d

Add device check to repeat()

78def19

Use ValueError for different device errors

33450f3

asmeurer enabled auto-merge October 16, 2024 19:15

asmeurer merged commit cc61b49 into data-apis:main Oct 16, 2024
21 checks passed

ogrisel mentioned this pull request Oct 17, 2024

TST enable non-CPU device testing via array-api-strict scikit-learn/scikit-learn#30090

Merged

betatim deleted the multiple-devices branch October 18, 2024 08:31

asmeurer mentioned this pull request Oct 18, 2024

Improvements to device support #70

Open

This was referenced Oct 21, 2024

BUG: input's device isn't inherited for newly created xp.asarrays scipy/scipy#21736

Open

Python scalars in elementwise functions data-apis/array-api#807

Closed
