Change in wrapped ufunc handling between 1.15 -> 1.16 #12997

Open
jorisvandenbossche opened this issue Feb 20, 2019 · 3 comments

@jorisvandenbossche
Contributor

Reproducing code example:

Consider this example, where I create a pandas TimedeltaIndex (an "array-like") and call np.sum on it, which correctly sums the timedelta64[ns] data and returns a scalar:

In [1]: idx = pd.TimedeltaIndex(np.arange(10)*1e9) 

In [2]: idx
Out[2]: 
TimedeltaIndex(['00:00:00', '00:00:01', '00:00:02', '00:00:03', '00:00:04',
                '00:00:05', '00:00:06', '00:00:07', '00:00:08', '00:00:09'],
               dtype='timedelta64[ns]', freq=None)

In [3]: np.sum(idx) 
Out[3]: numpy.timedelta64(45000000000,'ns')

In [4]: pd.__version__   
Out[4]: '0.23.4'

In [5]: np.__version__
Out[5]: '1.15.4'

The above is with numpy 1.15, but starting from 1.16 (and on master as well) the same code now gives an error (using the same pandas version):

In [1]: idx = pd.TimedeltaIndex(np.arange(10)*1e9)   

In [2]: np.sum(idx)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-4137fa3a65d6> in <module>
----> 1 np.sum(idx)

~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py in sum(a, axis, dtype, out, keepdims, initial)
   2074 
   2075     return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
-> 2076                           initial=initial)
   2077 
   2078 

~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
     84                 return reduction(axis=axis, out=out, **passkwargs)
     85 
---> 86     return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
     87 
     88 

~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in __array_wrap__(self, result, context)
    658         attrs = self._get_attributes_dict()
    659         attrs = self._maybe_update_attributes(attrs)
--> 660         return Index(result, **attrs)
    661 
    662     @cache_readonly

~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    301                   (dtype is not None and is_timedelta64_dtype(dtype))):
    302                 from pandas.core.indexes.timedeltas import TimedeltaIndex
--> 303                 result = TimedeltaIndex(data, copy=copy, name=name, **kwargs)
    304                 if dtype is not None and _o_dtype == dtype:
    305                     return Index(result.to_pytimedelta(), dtype=_o_dtype)

~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/pandas/core/indexes/timedeltas.py in __new__(cls, data, unit, freq, start, end, periods, closed, dtype, copy, name, verify_integrity)
    250 
    251         # check that we are matching freqs
--> 252         if verify_integrity and len(data) > 0:
    253             if freq is not None and not freq_infer:
    254                 index = cls._simple_new(data, name=name)

TypeError: len() of unsized object

In [3]: np.__version__      
Out[3]: '1.16.1'

In [4]: pd.__version__                                     
Out[4]: '0.23.4'

The error comes from passing a 0-d array to the TimedeltaIndex constructor. But it seems that something changed in how numpy handles this error (I was using the same pandas version, so the error happens under the hood in both cases).
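For illustration, the 0-d behaviour behind that `len()` failure can be reproduced with plain numpy, without pandas. This is only a sketch: the variable names are made up for the example, and it shows the 0-d array itself, not the pandas constructor.

```python
import numpy as np

# Ten one-second steps stored as timedelta64[ns], mirroring the data in the report.
data = (np.arange(10, dtype="int64") * 10**9).astype("timedelta64[ns]")

total = np.add.reduce(data)   # full reduction of a 1-D array
zero_d = np.asarray(total)    # the same value viewed as a 0-d array

print(zero_d.ndim, zero_d.shape)

# A 0-d array has no length, which is what a len(data) check trips over.
try:
    len(zero_d)
    len_error = None
except TypeError as exc:
    len_error = str(exc)
print(len_error)

# Indexing with an empty tuple extracts the scalar from a 0-d array.
scalar = zero_d[()]
print(repr(scalar))
```

Any constructor that calls `len()` on its input will therefore raise when it is handed the result of a full reduction.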

We can rather easily work around this in pandas (by checking in __array_wrap__ whether the result is a 0-dim array or scalar, and if so not passing it to the class constructor, see pandas-dev/pandas#25329), but I am reporting this here to check whether this is an intentional change or rather an uncaught regression.
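A minimal sketch of that kind of workaround, assuming a hypothetical `ToyIndex` class standing in for a pandas index. This is not the pandas implementation or the change made in pandas-dev/pandas#25329. The extra `return_scalar` parameter is included only so the signature is also accepted by newer numpy versions.

```python
import numpy as np


class ToyIndex:
    """Hypothetical 1-D array-like, standing in for a pandas Index."""

    def __init__(self, data):
        data = np.asarray(data)
        if data.ndim != 1:
            raise TypeError("ToyIndex requires 1-D data")
        self._data = data

    def __array__(self, dtype=None, copy=None):
        return self._data if dtype is None else self._data.astype(dtype)

    def __array_wrap__(self, result, context=None, return_scalar=False):
        result = np.asarray(result)
        if result.ndim == 0:
            # Reductions produce 0-d results: hand back the scalar
            # instead of passing it to the 1-D-only constructor.
            return result[()]
        return ToyIndex(result)


idx = ToyIndex([1, 2, 3])
wrapped = idx.__array_wrap__(np.array([2, 4, 6]))  # 1-D result is re-wrapped
reduced = idx.__array_wrap__(np.array(6))          # 0-d result becomes a scalar
print(type(wrapped).__name__, reduced)
```

The check lives in `__array_wrap__` itself, so it does not depend on whether numpy calls the hook for a given reduction.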

Some more information on this specific case: TimedeltaIndex has no sum method implemented, so np.sum does not dispatch directly to such a method (in contrast to e.g. Series, which has a sum method). That means that np.sum goes through __array__ and __array_wrap__.
(note: I suppose in the future we should also fix this by adding an __array_ufunc__)
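The two paths can be sketched with two hypothetical classes, `WithSum` and `WithoutSum`, which are illustrative names and not pandas types. The sketch assumes np.sum behaves as the `_wrapreduction` frame in the traceback above suggests: it calls the object's own `sum` method if one exists, and otherwise falls back to `np.add.reduce`.

```python
import numpy as np


class WithSum:
    """Hypothetical array-like that defines its own sum method."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def sum(self, axis=None, dtype=None, out=None, **kwargs):
        # np.sum calls this method instead of reducing the data itself.
        return ("WithSum.sum", self._data.sum(axis=axis))


class WithoutSum:
    """Hypothetical array-like that only exposes __array__."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        return self._data if dtype is None else self._data.astype(dtype)


dispatched = np.sum(WithSum([1, 2, 3]))    # handled by WithSum.sum
fallback = np.sum(WithoutSum([1, 2, 3]))   # converted via __array__, then reduced
print(dispatched, fallback)
```

`WithoutSum` defines no `__array_wrap__`, so the fallback result comes back as a plain numpy scalar. A class that does define the hook, as the index in the report does, is where the 0-d result becomes a problem.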

Numpy/Python version information:

Both are in the same environment with Python 3.7:

In [9]: sys.version
Out[9]: '3.7.1 | packaged by conda-forge | (default, Feb 18 2019, 01:42:00) \n[GCC 7.3.0]'

@seberg
Member

seberg commented Feb 20, 2019

Hmmm, I did not check it too closely yet, but the operation is actually successful; it is the printing that fails. Similarly, this already gives a bad object:

res = pd.TimedeltaIndex(np.array(0))

on the other hand:

pd.TimedeltaIndex([0])[0]

seems fine, but by comparison res[()] fails. Not sure what to make of it, but it seems to be a possible regression.

@jorisvandenbossche
Contributor Author

@seberg I suppose the above is with the latest pandas release (0.24)?
I didn't really explicitly stress it, but my post above is with the previous stable release, 0.23.4. In 0.23.4, res = pd.TimedeltaIndex(np.array(0)) correctly raises an error (because it cannot create a 1-D index from a 0-D array). But this is something we broke in pandas 0.24, so it now indeed results in an invalid index that only raises when printing (instead of raising directly on creation). And it was while fixing that regression in pandas that we encountered the numpy issue described above.

@seberg
Member

seberg commented Feb 20, 2019

Ah OK, then I guess it is a bit more complicated. Yes, my testing was with 0.24.1. How bad is this for you actually? And is there any chance you can bisect it to find what broke it?
