Tracker: semantic differences between numpy and pytorch #5

Open · ev-br opened this issue Dec 16, 2022 · 5 comments
Labels: documentation (Improvements or additions to documentation)

Comments

ev-br (Collaborator) commented Dec 16, 2022

This issue tracks situations where identically named functions in numpy and torch have different semantics, and records decisions about how to handle them. Typically, the difference is views vs. copies.

  • flip: np.flip creates a view, torch.flip creates a copy.
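
A quick way to see the difference is to write through the result; a small sketch (expected outputs in comments):

import numpy as np
import torch

a = np.arange(4)
f = np.flip(a)           # view: shares memory with a
f[0] = 99                # writes through to a[-1]
# a is now array([ 0,  1,  2, 99])

t = torch.arange(4)
g = torch.flip(t, (0,))  # copy: independent storage
g[0] = 99
# t is unchanged: tensor([0, 1, 2, 3])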

  • slicing with a negative step is not supported in torch:

In [34]: t = torch.arange(12).reshape((3, 4))

In [35]: t[:, ::-1]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [35], in <cell line: 1>()
----> 1 t[:, ::-1]

ValueError: step must be greater than zero

  • torch strides are value strides (counted in elements), not byte strides as in numpy
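
For example (a small sketch; float32 has a 4-byte itemsize, so the numpy strides are four times larger):

import numpy as np
import torch

a = np.zeros((3, 4), dtype=np.float32)
a.strides                     # (16, 4) -- bytes to step along each axis
torch.zeros(3, 4).stride()    # (4, 1)  -- elements to step along each axis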

  • if given an out= array of the wrong size, numpy raises when the broadcast fails, while pytorch emits a warning and resizes the out array:

In [13]: torch.add(torch.ones(2), torch.ones(2), out=torch.empty(3))
<ipython-input-13-991dccd2c5b7>:1: UserWarning: An output with one or more elements was resized since it had shape [3], which does not match the required output shape [2]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1670076225403/work/aten/src/ATen/native/Resize.cpp:17.)
  torch.add(torch.ones(2), torch.ones(2), out=torch.empty(3))
Out[13]: tensor([2., 2.])

In [14]: np.add(np.ones(2), np.ones(2), out=np.empty(3))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [14], in <cell line: 1>()
----> 1 np.add(np.ones(2), np.ones(2), out=np.empty(3))

ValueError: operands could not be broadcast together with shapes (2,) (2,) (3,) 

  • squeeze on an axis of length other than one is a no-op in torch, while numpy raises:
In [86]: t = torch.as_tensor([[1],[2],[3]])

In [87]: t.squeeze(0).shape
Out[87]: torch.Size([3, 1])

In [88]: t.numpy().squeeze(0).shape
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [88], in <cell line: 1>()
----> 1 t.numpy().squeeze(0).shape

ValueError: cannot select an axis to squeeze out which has size not equal to one

  • torch treats one-element tensors of any shape as scalars, while numpy requires that ndim == 0:
In [49]: operator.index(_np.array([1]))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [49], in <cell line: 1>()
----> 1 operator.index(_np.array([1]))

TypeError: only integer scalar arrays can be converted to a scalar index

In [50]: operator.index(torch.as_tensor([1]))
Out[50]: 1

In [51]: operator.index(torch.as_tensor([[1]]))
Out[51]: 1


  • Raising integers to negative integer powers raises a ValueError in numpy but returns an integer zero in pytorch:
In [68]: torch.as_tensor(2)**torch.as_tensor(-1)
Out[68]: tensor(0)

In [69]: np.array(2)**np.array(-1)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [69], in <cell line: 1>()
----> 1 np.array(2)**np.array(-1)

ValueError: Integers to negative integer powers are not allowed.

In some cases NumPy and PyTorch raise different exception classes: numpy raises a ValueError or TypeError, while pytorch defaults to RuntimeError.

For instance (this one is straight from the numpy test suite), subtracting booleans:

In [12]: a = np.ones((), dtype=np.bool_); a - a
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [12], in <cell line: 1>()
----> 1 a = np.ones((), dtype=np.bool_); a - a

TypeError: numpy boolean subtract, the `-` operator, is not supported, use the bitwise_xor, the `^` operator, or the logical_xor function instead.

The matching pytorch call raises a RuntimeError.

For the time being, I'm adding a raises((TypeError, RuntimeError)) in the test suite, with the intention of revisiting this later and either translating exceptions everywhere or just documenting the difference better.
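
A minimal sketch of that pattern with pytest (the test name is hypothetical):

import numpy as np
import pytest
import torch

def test_bool_subtract_raises():
    # numpy raises TypeError, pytorch raises RuntimeError -- accept either for now
    a = np.ones((), dtype=np.bool_)
    with pytest.raises((TypeError, RuntimeError)):
        a - a
    t = torch.ones((), dtype=torch.bool)
    with pytest.raises((TypeError, RuntimeError)):
        t - t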

ev-br (Collaborator, Author) commented Jan 5, 2023

Here's one possibly ugly corner with scalars vs zero-dim arrays. I'm trying to have a system where tnp.float32(2) creates a zero-dim array which wraps torch.tensor(2, dtype=torch.float32). A possible issue is what tnp.int32(2) * [1, 2, 3] does:

In [353]: np.int32(2) * [1, 2, 3]          # scalar decays to a python int
Out[353]: [1, 2, 3, 1, 2, 3]

In [357]: np.asarray(2) * [1, 2, 3]     # zero-dim array is an array-like
Out[357]: array([2, 4, 6])

rgommers (Member) commented Jan 5, 2023

Hmm, maybe it's necessary to not support using these as Python scalars (e.g. by using an ndarray subclass, or a special flag to indicate that the 0-D array was actually a scalar). Or maybe it's too much of a corner case. Probably for now just document the problem, add an xfail test (sketched below), and move on?
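
For example, a hypothetical xfail along these lines (assuming the wrapper is imported as tnp):

import pytest

@pytest.mark.xfail(reason="0-D 'scalar' arrays do not decay to Python scalars")
def test_int32_times_list():
    # numpy's np.int32(2) * [1, 2, 3] repeats the list (see above);
    # a 0-D array wrapper multiplies elementwise instead
    assert tnp.int32(2) * [1, 2, 3] == [1, 2, 3, 1, 2, 3]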

ev-br (Collaborator, Author) commented Jan 5, 2023

Let's see if we can pull off the following: scalars are 0-D arrays and do not decay to python scalars unless explicitly requested. In the example above this means the Out[357] behavior, unless there is an explicit conversion to a python scalar, as in int(np.int32(2)) * [1, 2, 3]. We might even follow pytorch in allowing int(torch.Tensor([[[1]]])), which numpy raises on.
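
Under that scheme the example would read as follows (a sketch of the intended behavior, written with plain torch calls; both lines already work in pytorch today):

import torch

# explicit conversion to a Python scalar, then list repetition:
int(torch.tensor(2, dtype=torch.int32)) * [1, 2, 3]   # [1, 2, 3, 1, 2, 3]

# pytorch converts any one-element tensor, regardless of ndim:
int(torch.tensor([[[1]]]))                             # 1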

ev-br (Collaborator, Author) commented Jan 5, 2023

As long as we do not special-case scalars, let's relax the NumPy casting rule that scalars do not upcast arrays. The casting rule then becomes: everything upcasts everything else, and there is no value-based casting.
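
For reference, the legacy NumPy rule being relaxed (pre-NEP 50 behavior; under the proposed rule the scalar case would also give int64):

import numpy as np

a = np.ones(3, dtype=np.uint8)
(a + np.ones(3, dtype=np.int64)).dtype   # int64 -- an array upcasts an array
(a + np.int64(1)).dtype                  # uint8 -- a scalar does not upcast the array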

ev-br (Collaborator, Author) commented Feb 28, 2023

I am splitting this tracker into two: this one is for numpy/pytorch differences, while gh-73 tracks the differences between the original numpy and our wrapper.

ev-br added the documentation label on Aug 5, 2023