Skip to content

Raise proper error for scalar array when coords is a dict #3271

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 29, 2019

Conversation

griverat
Copy link
Contributor

As explained here #3159 (comment) , when a user uses a scalar array to build a DataArray with coords given as a dictionary the error is not self explanatory.

>>> xr.DataArray(np.array(1), coords={'x': np.arange(4), 'y': 'a'}, dims=['x'])
...
KeyError: 'x'

This PR makes sure that when data is a scalar array and dims is not empty, it sets the shape to (0,) to make it fail with the proper raise message

>>> xr.DataArray(np.array(1), coords={'x': np.arange(4), 'y': 'a'}, dims=['x'])
...
ValueError: conflicting sizes for dimension 'x': length 0 on the data but length 4 on coordinate 'x'
  • Test updated
  • Passes black . && mypy . && flake8
  • Fully documented, including whats-new.rst for all changes and api.rst for new API (is this needed for a change like this?)

@shoyer
Copy link
Member

shoyer commented Aug 28, 2019

ValueError: conflicting sizes for dimension 'x': length 0 on the data but length 4 on coordinate 'x' seems like an improvement, but I'm still not sure it's the right message.

Maybe something like ValueError: different number of dimensions on data and dims: 0 vs 3 would be more precise?

I think this could be achieved by checking len(dims) == len(shape) in the case where dims was explicitly provided, i.e., around this line:

for d in dims:

For reference, here's the current traceback for this error:

In [3]: xr.DataArray(np.array(1), coords={'x': np.arange(4), 'y': 'a'}, dims=['x'])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-3-68fe094298fd> in <module>
----> 1 xr.DataArray(np.array(1), coords={'x': np.arange(4), 'y': 'a'}, dims=['x'])

~/dev/xarray/xarray/core/dataarray.py in __init__(self, data, coords, dims, name, attrs, encoding, indexes, fastpath)
    344             data = _check_data_shape(data, coords, dims)
    345             data = as_compatible_data(data)
--> 346             coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
    347             variable = Variable(dims, data, attrs, encoding, fastpath=True)
    348

~/dev/xarray/xarray/core/dataarray.py in _infer_coords_and_dims(shape, coords, dims)
    140
    141         for d, s in zip(v.dims, v.shape):
--> 142             if s != sizes[d]:
    143                 raise ValueError(
    144                     "conflicting sizes for dimension %r: "

KeyError: 'x'

@griverat
Copy link
Contributor Author

I think this could be achieved by checking len(dims) == len(shape) in the case where dims was explicitly provided, i.e., around this line:

Thanks for the suggestion. The raise now happens if len(dims) != len(shape) near the line you suggested.

@shoyer shoyer merged commit f864718 into pydata:master Aug 29, 2019
@shoyer
Copy link
Member

shoyer commented Aug 29, 2019

Looks great, thank you!

(We don't really need release notes for every minor change, especially if nothing is changing from a user perspective.)

@griverat griverat deleted the scalar-raise-msg branch August 29, 2019 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants