groupby and mean on a MultiIndex level raises ValueError #7282

Closed
4 tasks done
jjpr-mit opened this issue Nov 11, 2022 · 4 comments
Comments

@jjpr-mit

What happened?

After using set_index to create a MultiIndex, calling groupby on a MultiIndex level and then mean raises an error.

What did you expect to happen?

The mean is applied to each group without raising an error.

Minimal Complete Verifiable Example

from xarray import DataArray

d = DataArray(
    data=[
        [0, 1, 2, 3, 4, 5, 6],
        [7, 8, 9, 10, 11, 12, 13],
        [14, 15, 16, 17, 18, 19, 20]
    ],
    coords={
        "greek": ("a", ['alpha', 'beta', 'gamma']),
        "colors": ("a", ['red', 'green', 'blue']),
        "compass": ("b", ['north', 'south', 'east', 'west', 'northeast', 'southeast', 'southwest']),
        "integer": ("b", [0, 1, 2, 3, 4, 5, 6]),
    },
    dims=("a", "b")
)
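# Combine the coordinates along each dimension into a MultiIndex.
d = d.set_index(a=['greek', 'colors'], b=['compass', 'integer'])
# Group by one level of the MultiIndex on "a", then reduce over all dimensions.
g = d.groupby('greek')
m = g.mean(...)  # raises ValueError on xarray 2022.11.0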
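
For reference, the result the example is expected to produce can be worked out by hand without xarray. The sketch below is only an illustration in plain Python, not how xarray computes it: each `greek` label selects one row of the data, and `mean(...)` reduces over every remaining value in that group. The names `groups` and `expected` are made up for this illustration.

```python
from collections import defaultdict
from statistics import fmean

# Same data and "greek" labels as in the example above.
data = [
    [0, 1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12, 13],
    [14, 15, 16, 17, 18, 19, 20],
]
greek = ["alpha", "beta", "gamma"]

# Collect every value that belongs to each label along dimension "a".
groups = defaultdict(list)
for label, row in zip(greek, data):
    groups[label].extend(row)

# Reduce each group to a single mean, as g.mean(...) is expected to do.
expected = {label: fmean(values) for label, values in groups.items()}
print(expected)
```

Each label occurs exactly once here, so every group mean is simply the mean of one row.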

MCVE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/site-packages/xarray/core/_aggregations.py", line 5698, in mean
    return self.reduce(
  File "/usr/local/lib/python3.10/site-packages/xarray/core/groupby.py", line 1201, in reduce
    return self.map(reduce_array, shortcut=shortcut)
  File "/usr/local/lib/python3.10/site-packages/xarray/core/groupby.py", line 1104, in map
    return self._combine(applied, shortcut=shortcut)
  File "/usr/local/lib/python3.10/site-packages/xarray/core/groupby.py", line 1136, in _combine
    index, index_vars = create_default_index_implicit(coord)
  File "/usr/local/lib/python3.10/site-packages/xarray/core/indexes.py", line 1045, in create_default_index_implicit
    index = PandasMultiIndex(array, name)
  File "/usr/local/lib/python3.10/site-packages/xarray/core/indexes.py", line 615, in __init__
    raise ValueError(
ValueError: conflicting multi-index level name 'greek' with dimension 'greek'
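
The last frames of the traceback show the index constructor rejecting a multi-index whose level name equals the name of the dimension it is built on. The snippet below is a simplified, hypothetical stand-in for that kind of check, written only to make the error message easier to read. `check_level_names` is not an xarray function, and the real validation in `xarray/core/indexes.py` may differ.

```python
def check_level_names(dim, level_names):
    """Hypothetical stand-in: reject a level that shares the dimension's name."""
    for name in level_names:
        if name == dim:
            raise ValueError(
                f"conflicting multi-index level name {name!r} with dimension {dim!r}"
            )


# Before groupby: levels "greek" and "colors" live on dimension "a", no clash.
check_level_names("a", ["greek", "colors"])

# While combining the groups, the new dimension is named "greek",
# which collides with the level of the same name.
message = None
try:
    check_level_names("greek", ["greek", "colors"])
except ValueError as err:
    message = str(err)
    print(message)
```

Read this way, the error appears when the grouped results are combined along a new `greek` dimension while the coordinate still carries a level called `greek`.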

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.10.7 (main, Sep 13 2022, 14:31:33) [GCC 10.2.1 20210110]
python-bits: 64
OS: Linux
OS-release: 5.15.49-linuxkit
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2022.11.0
pandas: 1.5.1
numpy: 1.23.4
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 63.2.0
pip: 22.2.2
conda: None
pytest: None
IPython: None
sphinx: None

jjpr-mit added the bug and needs triage labels on Nov 11, 2022
@mschrimpf

This error occurs in versions 2022.6 through 2022.11. The code worked fine in previous versions, e.g. in 2022.3.0 it executes as expected.
Related: #6836.
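
A quick way to tell whether a given xarray release falls inside the range reported above is to compare its year and month components. This is a minimal sketch that assumes calendar-style version strings such as `2022.11.0`. The helper names `release_tuple` and `in_reported_range` are made up for this illustration, and the bounds come only from this comment, not from an exhaustive test of every release.

```python
# Bounds taken from the report in this thread: 2022.6 through 2022.11.
FIRST_AFFECTED = (2022, 6)
LAST_AFFECTED = (2022, 11)


def release_tuple(version):
    """Return (year, month) from a version string like '2022.11.0'."""
    return tuple(int(part) for part in version.split(".")[:2])


def in_reported_range(version):
    """True if the version lies in the range reported as affected."""
    return FIRST_AFFECTED <= release_tuple(version) <= LAST_AFFECTED


for version in ["2022.3.0", "2022.6.0", "2022.11.0"]:
    print(version, in_reported_range(version))
```

In practice the string would come from `xarray.__version__`. It is passed in explicitly here so the sketch runs without xarray installed.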

@benbovy
Member

benbovy commented Nov 14, 2022

Thanks @jjpr-mit and @mschrimpf for the report. See #6836 (comment).

benbovy added the topic-indexing and topic-groupby labels and removed the needs triage label on Nov 14, 2022
@benbovy
Member

benbovy commented Aug 23, 2023

The example above now works (v2023.8.0) without flox:

with xr.set_options(use_flox=False):
    m = g.mean(...)
<xarray.DataArray (greek: 3)>
array([ 3., 10., 17.])
Coordinates:
  * greek    (greek) object 'alpha' 'beta' 'gamma'

But it still fails with flox:

with xr.set_options(use_flox=True):
    m = g.mean(...)
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long'

cc @dcherian

@dcherian
Contributor

This example is working with flox 0.7.2. I'm going to close but feel free to reopen if it doesn't work for you.
