Fixes dimension order in xarray.Dataset.to_stacked_array
#10205
Conversation
Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
This looks great @aFarchi, thank you! I don't have that much context, but I'm merging since it looks like a nice improvement.
stacking_dims = tuple(dim for dim in self.dims if dim not in sample_dims)
# add stacking dims by order of appearance
stacking_dims_list: list[Hashable] = []
for da in self.data_vars.values():
Sorry for the late review @aFarchi. We will need to iterate over `self.coords` too. We can have coordinate variables with dimensions not present on any data_var.
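The concern can be illustrated without xarray itself. This is a minimal sketch, assuming a dataset is modelled as two plain mappings from variable name to dimension tuple; the names `dims_in_order_of_appearance`, `data_var_dims`, and `coord_dims` are hypothetical and are not part of the xarray API or of this PR.

```python
from collections.abc import Hashable, Iterable, Mapping

Dims = tuple[Hashable, ...]


def dims_in_order_of_appearance(
    variables: Iterable[Dims], sample_dims: Iterable[Hashable]
) -> list[Hashable]:
    """Collect non-sample dimensions in the order they are first seen."""
    excluded = set(sample_dims)
    seen: list[Hashable] = []
    for dims in variables:
        for dim in dims:
            if dim not in excluded and dim not in seen:
                seen.append(dim)
    return seen


# Hypothetical stand-ins for the dims of Dataset.data_vars and Dataset.coords.
data_var_dims: Mapping[str, Dims] = {"a": ("x", "y"), "b": ("x",)}
coord_dims: Mapping[str, Dims] = {"x": ("x",), "t": ("time",)}

# Looking at data variables only misses the coordinate-only dimension "time".
from_data_vars = dims_in_order_of_appearance(data_var_dims.values(), ["x"])

# Visiting the coordinates as well picks it up, after the data-variable dims.
from_all = dims_in_order_of_appearance(
    [*data_var_dims.values(), *coord_dims.values()], ["x"]
)
```

In this toy model, `from_data_vars` contains only `"y"`, while `from_all` also contains `"time"`, which is the kind of dimension the review comment is pointing at.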
@@ -5246,7 +5246,13 @@ def to_stacked_array(
"""
from xarray.structure.concat import concat

stacking_dims = tuple(dim for dim in self.dims if dim not in sample_dims)
# add stacking dims by order of appearance
stacking_dims_list: list[Hashable] = []
minor comment: This could also be a xarray.core.utils.OrderedSet
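As a rough illustration of the ordered-set idea, here is a sketch that uses a plain `dict` as a stand-in, since dict keys are unique and keep insertion order. It does not use `xarray.core.utils.OrderedSet` itself, and `variable_dims` and `sample_dims` are made-up example inputs.

```python
from collections.abc import Hashable

# Hypothetical per-variable dimension tuples, in data-variable order.
variable_dims: list[tuple[Hashable, ...]] = [("y", "x"), ("x",), ("y", "z")]
sample_dims = {"x"}

# A dict used as an ordered set: setdefault is a no-op for keys already present,
# so each dimension keeps the position of its first appearance.
stacking: dict[Hashable, None] = {}
for dims in variable_dims:
    for dim in dims:
        if dim not in sample_dims:
            stacking.setdefault(dim, None)

stacking_dims = tuple(stacking)
```

Compared with a list plus a membership check, this avoids the explicit `if dim not in ...` test on the accumulator and makes the membership lookup constant-time.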
Sorry for merging too early! @aFarchi let us know if you can take a look at those; otherwise I can revert or look at them myself.
Thanks for the feedback. Good catch! I hadn't thought about the case where coordinates have dimensions that are not on any variables.
Coming back to this issue, I had a quick look at the "old" behaviour of
which currently returns:
As far as I understand, if we have a dimension which is in a coordinate but in no variables, this dimension will necessarily be filled with NaNs in the stacked array. But I don't think it is necessary to include an array of NaNs, is it? Or am I missing something here? @dcherian do you see any cases where it would be useful to keep that extra dimension?
* main: (76 commits)
  Update how-to-add-new-backend.rst (#10240)
  Support extension array indexes (#9671)
  Switch documentation to pydata-sphinx-theme (#8708)
  Bump codecov/codecov-action from 5.4.0 to 5.4.2 in the actions group (#10239)
  Fix mypy, min-versions CI, xfail Zarr tests (#10255)
  Remove `test_dask_layers_and_dependencies` (#10242)
  Fix: Docs generation create temporary files that are not cleaned up. (#10238)
  opendap / dap4 support for pydap backend (#10182)
  Add RangeIndex (#10076)
  Fix mypy (#10232)
  Fix doctests (#10230)
  Fix broken Sphinx Roles (#10225)
  `DatasetView.map` fix `keep_attrs` (#10219)
  Add datatree repr asv (#10214)
  CI: Automatic PR labelling is back (#10201)
  Fixes dimension order in `xarray.Dataset.to_stacked_array` (#10205)
  Fix references to core classes in docs (#10207)
  Update pre-commit hooks (#10208)
  add `scipy-stubs` as extra `[types]` dependency (#10202)
  Fix sparse dask repr test (#10200)
  ...
whats-new.rst
xarray.Dataset.to_stacked_array now uses dimensions in order of appearance. This fixes the issue where using xarray.Dataset.transpose before xarray.Dataset.to_stacked_array had no effect. (Mentioned in #9921)