forked from pydata/xarray

Commit fe870e5

Merge branch 'main' into groupby-reduce

* main:
  - Add typing_extensions as a required dependency (pydata#5911)
  - pydata#5740 follow up: suppress xr.ufunc warnings in tests (pydata#5914)
  - Avoid accessing slow .data in unstack (pydata#5906)
  - Add wradlib to ecosystem in docs (pydata#5915)
  - Use .to_numpy() for quantified facetgrids (pydata#5886)
  - [test-upstream] fix pd skipna=None (pydata#5899)
  - Add var and std to weighted computations (pydata#5870)
  - Check for path-like objects rather than Path type, use os.fspath (pydata#5879)
  - Handle single `PathLike` objects in `open_mfdataset()` (pydata#5884)

2 parents 85b63b6 + bcb96ce

29 files changed: +440 −148 lines

ci/requirements/environment-windows.yml

+1
@@ -39,6 +39,7 @@ dependencies:
   - setuptools
   - sparse
   - toolz
+  - typing_extensions
   - zarr
   - pip:
     - numbagg

ci/requirements/environment.yml

+1
@@ -43,6 +43,7 @@ dependencies:
   - setuptools
   - sparse
   - toolz
+  - typing_extensions
   - zarr
   - pip:
     - numbagg

ci/requirements/py37-bare-minimum.yml

+1
@@ -13,3 +13,4 @@ dependencies:
   - numpy=1.17
   - pandas=1.0
   - setuptools=40.4
+  - typing_extensions=3.7

ci/requirements/py37-min-all-deps.yml

+1
@@ -47,6 +47,7 @@ dependencies:
   - setuptools=40.4
   - sparse=0.8
   - toolz=0.10
+  - typing_extensions=3.7
   - zarr=2.4
   - pip:
     - numbagg==0.1

ci/requirements/py38-all-but-dask.yml

+1
@@ -39,6 +39,7 @@ dependencies:
   - setuptools
   - sparse
   - toolz
+  - typing_extensions
   - zarr
   - pip:
     - numbagg

doc/api.rst

+6
@@ -779,12 +779,18 @@ Weighted objects

    core.weighted.DataArrayWeighted
    core.weighted.DataArrayWeighted.mean
+   core.weighted.DataArrayWeighted.std
    core.weighted.DataArrayWeighted.sum
+   core.weighted.DataArrayWeighted.sum_of_squares
    core.weighted.DataArrayWeighted.sum_of_weights
+   core.weighted.DataArrayWeighted.var
    core.weighted.DatasetWeighted
    core.weighted.DatasetWeighted.mean
+   core.weighted.DatasetWeighted.std
    core.weighted.DatasetWeighted.sum
+   core.weighted.DatasetWeighted.sum_of_squares
    core.weighted.DatasetWeighted.sum_of_weights
+   core.weighted.DatasetWeighted.var


 Coarsen objects

doc/ecosystem.rst

+1
@@ -37,6 +37,7 @@ Geosciences
 - `Spyfit <https://spyfit.readthedocs.io/en/master/>`_: FTIR spectroscopy of the atmosphere
 - `windspharm <https://ajdawson.github.io/windspharm/index.html>`_: Spherical
   harmonic wind analysis in Python.
+- `wradlib <https://wradlib.org/>`_: An Open Source Library for Weather Radar Data Processing.
 - `wrf-python <https://wrf-python.readthedocs.io/>`_: A collection of diagnostic and interpolation routines for use with output of the Weather Research and Forecasting (WRF-ARW) Model.
 - `xarray-simlab <https://xarray-simlab.readthedocs.io>`_: xarray extension for computer model simulations.
 - `xarray-spatial <https://makepath.github.io/xarray-spatial>`_: Numba-accelerated raster-based spatial processing tools (NDVI, curvature, zonal-statistics, proximity, hillshading, viewshed, etc.)

doc/getting-started-guide/installing.rst

+1
@@ -8,6 +8,7 @@ Required dependencies

 - Python (3.7 or later)
 - setuptools (40.4 or later)
+- ``typing_extensions`` (3.7 or later)
 - `numpy <http://www.numpy.org/>`__ (1.17 or later)
 - `pandas <http://pandas.pydata.org/>`__ (1.0 or later)
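The reason ``typing_extensions`` becomes a hard requirement on older Pythons is that typing features such as ``Protocol`` only joined the stdlib ``typing`` module in Python 3.8. A minimal sketch of the usual fallback-import pattern (the ``Reducible`` protocol name here is a hypothetical illustration, not from this commit):

```python
import sys

# Protocol joined the stdlib `typing` module in Python 3.8; projects that
# still support Python 3.7 import it from typing_extensions instead.
if sys.version_info >= (3, 8):
    from typing import Protocol
else:
    from typing_extensions import Protocol


class Reducible(Protocol):
    # hypothetical structural type: anything with a .sum() method conforms
    def sum(self) -> float:
        ...
```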

doc/user-guide/computation.rst

+17-3
@@ -263,7 +263,7 @@ Weighted array reductions

 :py:class:`DataArray` and :py:class:`Dataset` objects include :py:meth:`DataArray.weighted`
 and :py:meth:`Dataset.weighted` array reduction methods. They currently
-support weighted ``sum`` and weighted ``mean``.
+support weighted ``sum``, ``mean``, ``std`` and ``var``.

 .. ipython:: python

@@ -298,13 +298,27 @@ The weighted sum corresponds to:

     weighted_sum = (prec * weights).sum()
     weighted_sum

-and the weighted mean to:
+the weighted mean to:

 .. ipython:: python

     weighted_mean = weighted_sum / weights.sum()
     weighted_mean

+the weighted variance to:
+
+.. ipython:: python
+
+    weighted_var = weighted_prec.sum_of_squares() / weights.sum()
+    weighted_var
+
+and the weighted standard deviation to:
+
+.. ipython:: python
+
+    weighted_std = np.sqrt(weighted_var)
+    weighted_std
+
 However, the functions also take missing values in the data into account:

 .. ipython:: python

@@ -327,7 +341,7 @@ If the weights add up to to 0, ``sum`` returns 0:

     data.weighted(weights).sum()

-and ``mean`` returns ``NaN``:
+and ``mean``, ``std`` and ``var`` return ``NaN``:

 .. ipython:: python
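The relationships between these weighted reductions can be checked with plain NumPy. A small sketch with made-up numbers (`prec` and `weights` are hypothetical stand-ins for the documentation's example data; `sum_of_squares` here mirrors the weighted sum of squared anomalies that the new method computes):

```python
import numpy as np

prec = np.array([1.0, 2.0, 3.0])     # hypothetical data values
weights = np.array([0.5, 0.3, 0.2])  # hypothetical weights

weighted_sum = (prec * weights).sum()
weighted_mean = weighted_sum / weights.sum()

# sum_of_squares weights the squared anomalies about the weighted mean
sum_of_squares = (weights * (prec - weighted_mean) ** 2).sum()
weighted_var = sum_of_squares / weights.sum()
weighted_std = np.sqrt(weighted_var)

print(weighted_mean, weighted_var)  # → approximately 1.7 and 0.61
```

With weights summing to 1, the weighted mean and variance reduce to the familiar probability-weighted moments, which is an easy sanity check on the formulas above.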

doc/whats-new.rst

+11
@@ -23,6 +23,8 @@ v0.19.1 (unreleased)

 New Features
 ~~~~~~~~~~~~
+- Add :py:meth:`var`, :py:meth:`std` and :py:meth:`sum_of_squares` to :py:meth:`Dataset.weighted` and :py:meth:`DataArray.weighted`.
+  By `Christian Jauvin <https://github.com/cjauvin>`_.
 - Added a :py:func:`get_options` method to xarray's root namespace (:issue:`5698`, :pull:`5716`)
   By `Pushkar Kopparla <https://github.com/pkopparla>`_.
 - Xarray now does a better job rendering variable names that are long LaTeX sequences when plotting (:issue:`5681`, :pull:`5682`).

@@ -80,6 +82,15 @@ Bug fixes
   By `Jimmy Westling <https://github.com/illviljan>`_.
 - Numbers are properly formatted in a plot's title (:issue:`5788`, :pull:`5789`).
   By `Maxime Liquet <https://github.com/maximlt>`_.
+- Faceted plots will no longer raise a `pint.UnitStrippedWarning` when a `pint.Quantity` array is plotted,
+  and will correctly display the units of the data in the colorbar (if there is one) (:pull:`5886`).
+  By `Tom Nicholas <https://github.com/TomNicholas>`_.
+- With backends, check for path-like objects rather than ``pathlib.Path``
+  type, use ``os.fspath`` (:pull:`5879`).
+  By `Mike Taves <https://github.com/mwtoews>`_.
+- ``open_mfdataset()`` now accepts a single ``pathlib.Path`` object (:issue:`5881`).
+  By `Panos Mavrogiorgos <https://github.com/pmav99>`_.
+- Improved performance of :py:meth:`Dataset.unstack` (:pull:`5906`). By `Tom Augspurger <https://github.com/TomAugspurger>`_.

 Documentation
 ~~~~~~~~~~~~~

requirements.txt

+1-1
@@ -5,4 +5,4 @@
 numpy >= 1.17
 pandas >= 1.0
 setuptools >= 40.4
-typing-extensions >= 3.10
+typing-extensions >= 3.7

setup.cfg

+1
@@ -78,6 +78,7 @@ python_requires = >=3.7
 install_requires =
     numpy >= 1.17
     pandas >= 1.0
+    typing_extensions >= 3.7
     setuptools >= 40.4  # For pkg_resources

 [options.extras_require]

xarray/backends/api.py

+11-10
@@ -2,7 +2,6 @@
 from glob import glob
 from io import BytesIO
 from numbers import Number
-from pathlib import Path
 from typing import (
     TYPE_CHECKING,
     Callable,

@@ -808,7 +807,7 @@ def open_mfdataset(
         - "override": if indexes are of same size, rewrite indexes to be
           those of the first object with that dimension. Indexes for the same
           dimension must have the same size in all objects.
-    attrs_file : str or pathlib.Path, optional
+    attrs_file : str or path-like, optional
         Path of the file used to read global attributes from.
         By default global attributes are read from the first file provided,
         with wildcard matches sorted by filename.

@@ -865,8 +864,10 @@ def open_mfdataset(
             )
         else:
             paths = sorted(glob(_normalize_path(paths)))
+    elif isinstance(paths, os.PathLike):
+        paths = [os.fspath(paths)]
     else:
-        paths = [str(p) if isinstance(p, Path) else p for p in paths]
+        paths = [os.fspath(p) if isinstance(p, os.PathLike) else p for p in paths]

     if not paths:
         raise OSError("no files to open")

@@ -958,8 +959,8 @@ def multi_file_closer():

     # read global attributes from the attrs_file or from the first dataset
     if attrs_file is not None:
-        if isinstance(attrs_file, Path):
-            attrs_file = str(attrs_file)
+        if isinstance(attrs_file, os.PathLike):
+            attrs_file = os.fspath(attrs_file)
         combined.attrs = datasets[paths.index(attrs_file)].attrs

     return combined

@@ -992,8 +993,8 @@ def to_netcdf(

     The ``multifile`` argument is only for the private use of save_mfdataset.
     """
-    if isinstance(path_or_file, Path):
-        path_or_file = str(path_or_file)
+    if isinstance(path_or_file, os.PathLike):
+        path_or_file = os.fspath(path_or_file)

     if encoding is None:
         encoding = {}

@@ -1134,7 +1135,7 @@ def save_mfdataset(
     ----------
     datasets : list of Dataset
         List of datasets to save.
-    paths : list of str or list of Path
+    paths : list of str or list of path-like objects
         List of paths to which to save each corresponding dataset.
     mode : {"w", "a"}, optional
         Write ("w") or append ("a") mode. If mode="w", any existing file at

@@ -1302,7 +1303,7 @@ def check_dtype(var):

 def to_zarr(
     dataset: Dataset,
-    store: Union[MutableMapping, str, Path] = None,
+    store: Union[MutableMapping, str, os.PathLike] = None,
     chunk_store=None,
     mode: str = None,
     synchronizer=None,

@@ -1326,7 +1327,7 @@ def to_zarr(
         if v.size == 0:
             v.load()

-    # expand str and Path arguments
+    # expand str and path-like arguments
     store = _normalize_path(store)
     chunk_store = _normalize_path(chunk_store)

xarray/backends/common.py

+3-4
@@ -1,8 +1,7 @@
 import logging
-import os.path
+import os
 import time
 import traceback
-from pathlib import Path
 from typing import Any, Dict, Tuple, Type, Union

 import numpy as np

@@ -20,8 +19,8 @@


 def _normalize_path(path):
-    if isinstance(path, Path):
-        path = str(path)
+    if isinstance(path, os.PathLike):
+        path = os.fspath(path)

     if isinstance(path, str) and not is_remote_uri(path):
         path = os.path.abspath(os.path.expanduser(path))
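The point of switching the `isinstance` check from `pathlib.Path` to `os.PathLike` is that any object implementing `__fspath__` is now accepted, not just `pathlib.Path`. A standalone sketch (the remote-URI branch of the real `_normalize_path` is omitted, and `CustomPath` is a hypothetical user type):

```python
import os
import pathlib


def normalize_path(path):
    # os.PathLike matches pathlib.Path *and* any class defining __fspath__
    if isinstance(path, os.PathLike):
        path = os.fspath(path)
    if isinstance(path, str):
        path = os.path.abspath(os.path.expanduser(path))
    return path


class CustomPath:
    # a user-defined path-like type; the old `isinstance(path, pathlib.Path)`
    # check would have passed it through unconverted
    def __init__(self, raw):
        self.raw = raw

    def __fspath__(self):
        return self.raw
```

Both `normalize_path(pathlib.Path("data.nc"))` and `normalize_path(CustomPath("data.nc"))` now produce the same absolute string, because `os.PathLike`'s subclass hook recognizes anything with `__fspath__`.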

xarray/backends/netCDF4_.py

+1-2
@@ -1,7 +1,6 @@
 import functools
 import operator
 import os
-import pathlib
 from contextlib import suppress

 import numpy as np

@@ -346,7 +345,7 @@ def open(
         autoclose=False,
     ):

-        if isinstance(filename, pathlib.Path):
+        if isinstance(filename, os.PathLike):
             filename = os.fspath(filename)

         if not isinstance(filename, str):

xarray/backends/zarr.py

+1-2
@@ -1,5 +1,4 @@
 import os
-import pathlib
 import warnings
 from distutils.version import LooseVersion

@@ -346,7 +345,7 @@ def open_group(
     ):

         # zarr doesn't support pathlib.Path objects yet. zarr-python#601
-        if isinstance(store, pathlib.Path):
+        if isinstance(store, os.PathLike):
             store = os.fspath(store)

         open_kwargs = dict(

xarray/core/dataset.py

+33-33
@@ -7,7 +7,7 @@
 from html import escape
 from numbers import Number
 from operator import methodcaller
-from pathlib import Path
+from os import PathLike
 from typing import (
     TYPE_CHECKING,
     Any,

@@ -1832,7 +1832,7 @@ def to_netcdf(

         Parameters
         ----------
-        path : str, Path or file-like, optional
+        path : str, path-like or file-like, optional
             Path to which to save this dataset. File-like objects are only
             supported by the scipy engine. If no path is provided, this
             function returns the resulting netCDF file as bytes; in this case,

@@ -1914,8 +1914,8 @@ def to_netcdf(

     def to_zarr(
         self,
-        store: Union[MutableMapping, str, Path] = None,
-        chunk_store: Union[MutableMapping, str, Path] = None,
+        store: Union[MutableMapping, str, PathLike] = None,
+        chunk_store: Union[MutableMapping, str, PathLike] = None,
         mode: str = None,
         synchronizer=None,
         group: str = None,

@@ -1944,9 +1944,9 @@ def to_zarr(

         Parameters
         ----------
-        store : MutableMapping, str or Path, optional
+        store : MutableMapping, str or path-like, optional
             Store or path to directory in local or remote file system.
-        chunk_store : MutableMapping, str or Path, optional
+        chunk_store : MutableMapping, str or path-like, optional
             Store or path to directory in local or remote file system only for Zarr
             array chunks. Requires zarr-python v2.4.0 or later.
         mode : {"w", "w-", "a", "r+", None}, optional

@@ -4153,34 +4153,34 @@ def unstack(
         )

         result = self.copy(deep=False)
-        for dim in dims:

-            if (
-                # Dask arrays don't support assignment by index, which the fast unstack
-                # function requires.
-                # https://github.com/pydata/xarray/pull/4746#issuecomment-753282125
-                any(is_duck_dask_array(v.data) for v in self.variables.values())
-                # Sparse doesn't currently support (though we could special-case
-                # it)
-                # https://github.com/pydata/sparse/issues/422
-                or any(
-                    isinstance(v.data, sparse_array_type)
-                    for v in self.variables.values()
-                )
-                or sparse
-                # Until https://github.com/pydata/xarray/pull/4751 is resolved,
-                # we check explicitly whether it's a numpy array. Once that is
-                # resolved, explicitly exclude pint arrays.
-                # # pint doesn't implement `np.full_like` in a way that's
-                # # currently compatible.
-                # # https://github.com/pydata/xarray/pull/4746#issuecomment-753425173
-                # # or any(
-                # #     isinstance(v.data, pint_array_type) for v in self.variables.values()
-                # # )
-                or any(
-                    not isinstance(v.data, np.ndarray) for v in self.variables.values()
-                )
-            ):
+        # we want to avoid allocating an object-dtype ndarray for a MultiIndex,
+        # so we can't just access self.variables[v].data for every variable.
+        # We only check the non-index variables.
+        # https://github.com/pydata/xarray/issues/5902
+        nonindexes = [
+            self.variables[k] for k in set(self.variables) - set(self.xindexes)
+        ]
+        # Notes for each of these cases:
+        # 1. Dask arrays don't support assignment by index, which the fast unstack
+        #    function requires.
+        #    https://github.com/pydata/xarray/pull/4746#issuecomment-753282125
+        # 2. Sparse doesn't currently support (though we could special-case it)
+        #    https://github.com/pydata/sparse/issues/422
+        # 3. pint requires checking if it's a NumPy array until
+        #    https://github.com/pydata/xarray/pull/4751 is resolved,
+        #    Once that is resolved, explicitly exclude pint arrays.
+        #    pint doesn't implement `np.full_like` in a way that's
+        #    currently compatible.
+        needs_full_reindex = sparse or any(
+            is_duck_dask_array(v.data)
+            or isinstance(v.data, sparse_array_type)
+            or not isinstance(v.data, np.ndarray)
+            for v in nonindexes
+        )

+        for dim in dims:
+            if needs_full_reindex:
                 result = result._unstack_full_reindex(dim, fill_value, sparse)
             else:
                 result = result._unstack_once(dim, fill_value)
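The performance fix in pydata#5906 has two parts: the "needs full reindex" check runs once instead of once per unstacked dimension, and it skips index variables so that no object-dtype array is ever materialized from a MultiIndex. A simplified sketch of that predicate (duck-array checks for dask/sparse/pint are collapsed into a plain `np.ndarray` test, and the function name is hypothetical):

```python
import numpy as np


def needs_full_reindex(variables, index_names, force_sparse=False):
    # skip index variables: accessing the data of a MultiIndex-backed
    # variable would allocate an expensive object-dtype ndarray
    # (pydata/xarray#5902)
    nonindexes = [v for name, v in variables.items() if name not in index_names]
    # anything that is not a plain NumPy array (dask, sparse, pint, ...)
    # cannot take the fast assignment-by-index unstack path
    return force_sparse or any(not isinstance(v, np.ndarray) for v in nonindexes)
```

Computing this once and reusing it inside the `for dim in dims` loop is what turns the per-dimension scan over `self.variables` into a single pass.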
