DataArray.sum does not respect dtype keyword #1838

gerritholl · 2018-01-18T22:01:07Z

Code Sample, a copy-pastable example if possible

import numpy as np
import xarray

da = xarray.DataArray(np.arange(5, dtype="i2"))
print(da.sum(dtype="i4").dtype)  # prints int64, although int32 was requested

Problem description

The result is int64. This is a problem because I asked for int32.

Expected Output

Expected output int32.

Output of `xr.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-696.6.3.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

xarray: 0.10.0+dev12.gf882a58
pandas: 0.22.0
numpy: 1.14.0
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: None
Nio: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.1
matplotlib: 2.1.1
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 38.4.0
pip: 9.0.1
conda: 4.3.16
pytest: 3.1.2
IPython: 6.1.0
sphinx: 1.6.2


fujiisoup · 2018-01-19T02:21:39Z

I noticed that here

xarray/xarray/core/duck_array_ops.py

Lines 177 to 181 in f3deb2f

    
def f(values, axis=None, skipna=None, **kwargs):
    # ignore keyword args inserted by np.mean and other numpy aggregators
    # automatically:
    kwargs.pop('dtype', None)
    kwargs.pop('out', None)

we remove the dtype argument from the reduce methods, but I'm not sure why.
This might be a bug.

As a temporary workaround,

da.reduce(np.sum, dtype="i4")

preserves the requested dtype.

shoyer · 2018-01-19T02:50:13Z

We do this for two reasons:

1. bottleneck's aggregation functions like bottleneck.nansum() don't have a dtype argument, so passing on a dtype argument causes an error to be raised.
2. If you call a numpy function like numpy.sum() on an xarray object, it calls the appropriate method with all keyword arguments, e.g., numpy.sum(xarray_obj) -> xarray_obj.sum(axis=None, dtype=None, out=None).
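
The forwarding described in the second reason can be observed with a small stand-in object. This is only a sketch: `Recorder` is a hypothetical class, not part of xarray, and exactly which keywords NumPy forwards (in particular whether `dtype=None` is included) may differ between NumPy versions.

```python
import numpy as np


class Recorder:
    """Hypothetical stand-in for an xarray object; records what np.sum passes."""

    def __init__(self):
        self.seen = None

    def sum(self, **kwargs):
        # Store the keyword arguments NumPy forwarded to this method.
        self.seen = dict(kwargs)
        return 0


rec = Recorder()
np.sum(rec)  # NumPy delegates to rec.sum(...) because rec is not an ndarray
print(sorted(rec.seen))  # names of the forwarded keyword arguments
```

Because keywords such as `out=None` arrive even when the caller never wrote them, a reduce method cannot treat the mere presence of a keyword as a user request.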

What we should probably do here instead of ignoring dtype and out entirely is to look at their values:

- If dtype is not None, use numpy's aggregation function instead of bottleneck's.
- If out is not None, raise an error.
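
The two rules above could be sketched roughly as follows. This is a minimal illustration, not the change that was eventually merged: `nansum_sketch` is a hypothetical helper, the restriction of bottleneck to floating-point input is an assumption made here, and bottleneck is treated as optional.

```python
import numpy as np

try:
    import bottleneck as bn  # optional; only used when no dtype is requested
except ImportError:
    bn = None


def nansum_sketch(values, axis=None, dtype=None, out=None):
    """Hypothetical dispatch: inspect dtype and out instead of dropping them."""
    if out is not None:
        # Rule 2: an explicit output array is not supported.
        raise TypeError("`out` is not supported by this reduction")
    values = np.asarray(values)
    if dtype is None and bn is not None and values.dtype.kind == "f":
        # No dtype requested: the faster bottleneck routine can be used.
        return bn.nansum(values, axis=axis)
    # Rule 1: a dtype was requested (or bottleneck is unavailable), so use numpy.
    return np.nansum(values, axis=axis, dtype=dtype)


result = nansum_sketch(np.arange(5, dtype="i2"), dtype="i4")
print(np.asarray(result).dtype)
```

With this dispatch, a `dtype=None` forwarded automatically by numpy still takes the fast path, while an explicit dtype is honoured.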

fujiisoup added the bug label Jan 19, 2018

fujiisoup mentioned this issue Jan 19, 2018

Add dtype support for reduce methods. #1841

Merged

5 tasks

shoyer closed this as completed in #1841 Jan 20, 2018

fujiisoup mentioned this issue Feb 1, 2018

dtype kwargs of aggregation methods not passed to numpy function #1878

Closed
