Commit d1edd15

Merge branch 'backend-indexing' into tuples_backend_indexing
* backend-indexing:
  - Trigger CI only if code files are modified. (pydata#9006)
  - Enable explicit use of key tuples (instead of *Indexer objects) in indexing adapters and explicitly indexed arrays (pydata#8870)
  - add `.oindex` and `.vindex` to `BackendArray` (pydata#8885) (see the conceptual sketch below)
  - temporary enable CI triggers on feature branch
  - Avoid auto creation of indexes in concat (pydata#8872)
  - Fix benchmark CI (pydata#9013)
  - Avoid extra read from disk when creating Pandas Index. (pydata#8893)
  - Add a benchmark to monitor performance for large dataset indexing (pydata#9012)
  - Zarr: Optimize `region="auto"` detection (pydata#8997)
  - Trigger CI only if code files are modified. (pydata#9006)
  - Fix for ruff 0.4.3 (pydata#9007)
  - Port negative frequency fix for `pandas.date_range` to `cftime_range` (pydata#8999)
  - Bump codecov/codecov-action from 4.3.0 to 4.3.1 in the actions group (pydata#9004)
  - Speed up localize (pydata#8536)
  - Simplify fast path (pydata#9001)
  - Add argument check_dims to assert_allclose to allow transposed inputs (pydata#5733) (pydata#8991)
  - Fix syntax error in test related to cupy (pydata#9000)
2 parents ebfb715 + 18c5c70 commit d1edd15
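Among the merged changes, pydata#8885 (`.oindex` and `.vindex` on `BackendArray`) concerns the two indexing modes that the explicitly indexed array layer distinguishes. As a conceptual sketch only, written in plain NumPy rather than xarray's adapter API: orthogonal indexing takes the cross product of per-dimension indexers, while vectorized indexing broadcasts the indexers and selects point-wise.

import numpy as np

arr = np.arange(12).reshape(3, 4)
rows, cols = [0, 2], [1, 3]

# Orthogonal ("outer") indexing, the mode .oindex refers to: each dimension
# is indexed independently, yielding the cross product -> shape (2, 2).
outer = arr[np.ix_(rows, cols)]  # [[1, 3], [9, 11]]

# Vectorized ("fancy") indexing, the mode .vindex refers to: the indexers
# are broadcast against each other and select point-wise -> shape (2,).
pointwise = arr[rows, cols]  # [1, 11]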

32 files changed: +657 / -254 lines

.github/workflows/benchmarks.yml

Lines changed: 3 additions & 3 deletions
@@ -28,8 +28,11 @@ jobs:
           environment-name: xarray-tests
           cache-environment: true
           cache-environment-key: "${{runner.os}}-${{runner.arch}}-py${{env.PYTHON_VERSION}}-${{env.TODAY}}-${{hashFiles(env.CONDA_ENV_FILE)}}-benchmark"
+          # add "build" because of https://github.com/airspeed-velocity/asv/issues/1385
           create-args: >-
             asv
+            build
+            mamba

       - name: Run benchmarks
@@ -47,9 +50,6 @@ jobs:
           asv machine --yes
           echo "Baseline: ${{ github.event.pull_request.base.sha }} (${{ github.event.pull_request.base.label }})"
           echo "Contender: ${GITHUB_SHA} (${{ github.event.pull_request.head.label }})"
-          # Use mamba for env creation
-          # export CONDA_EXE=$(which mamba)
-          export CONDA_EXE=$(which conda)
           # Run benchmarks for current commit against base
           ASV_OPTIONS="--split --show-stderr --factor $ASV_FACTOR"
           asv continuous $ASV_OPTIONS ${{ github.event.pull_request.base.sha }} ${GITHUB_SHA} \

.github/workflows/ci-additional.yaml

Lines changed: 11 additions & 4 deletions
@@ -8,6 +8,13 @@ on:
     branches:
       - "main"
       - "backend-indexing"
+    paths:
+      - 'ci/**'
+      - '.github/**'
+      - '/*' # covers files such as `pyproject.toml`
+      - 'properties/**'
+      - 'xarray/**'
+      - "backend-indexing"
   workflow_dispatch: # allows you to trigger manually

 concurrency:
@@ -129,7 +136,7 @@ jobs:
           python -m mypy --install-types --non-interactive --cobertura-xml-report mypy_report xarray/

       - name: Upload mypy coverage to Codecov
-        uses: codecov/codecov-action@v4.3.0
+        uses: codecov/codecov-action@v4.3.1
         with:
           file: mypy_report/cobertura.xml
           flags: mypy
@@ -183,7 +190,7 @@ jobs:
           python -m mypy --install-types --non-interactive --cobertura-xml-report mypy_report xarray/

       - name: Upload mypy coverage to Codecov
-        uses: codecov/codecov-action@v4.3.0
+        uses: codecov/codecov-action@v4.3.1
         with:
           file: mypy_report/cobertura.xml
           flags: mypy39
@@ -244,7 +251,7 @@ jobs:
           python -m pyright xarray/

       - name: Upload pyright coverage to Codecov
-        uses: codecov/codecov-action@v4.3.0
+        uses: codecov/codecov-action@v4.3.1
         with:
           file: pyright_report/cobertura.xml
           flags: pyright
@@ -303,7 +310,7 @@ jobs:
           python -m pyright xarray/

       - name: Upload pyright coverage to Codecov
-        uses: codecov/codecov-action@v4.3.0
+        uses: codecov/codecov-action@v4.3.1
         with:
           file: pyright_report/cobertura.xml
           flags: pyright39

.github/workflows/ci.yaml

Lines changed: 7 additions & 1 deletion
@@ -8,6 +8,12 @@ on:
     branches:
       - "main"
       - "backend-indexing"
+    paths:
+      - 'ci/**'
+      - '.github/**'
+      - '/*' # covers files such as `pyproject.toml`
+      - 'properties/**'
+      - 'xarray/**'
   workflow_dispatch: # allows you to trigger manually

 concurrency:
@@ -158,7 +164,7 @@ jobs:
           path: pytest.xml

       - name: Upload code coverage to Codecov
-        uses: codecov/codecov-action@v4.3.0
+        uses: codecov/codecov-action@v4.3.1
         with:
           file: ./coverage.xml
           flags: unittests

.github/workflows/upstream-dev-ci.yaml

Lines changed: 1 addition & 1 deletion
@@ -143,7 +143,7 @@ jobs:
         run: |
           python -m mypy --install-types --non-interactive --cobertura-xml-report mypy_report
       - name: Upload mypy coverage to Codecov
-        uses: codecov/codecov-action@v4.3.0
+        uses: codecov/codecov-action@v4.3.1
         with:
           file: mypy_report/cobertura.xml
           flags: mypy

asv_bench/asv.conf.json

Lines changed: 8 additions & 4 deletions
@@ -29,7 +29,7 @@
     // If missing or the empty string, the tool will be automatically
     // determined by looking for tools on the PATH environment
     // variable.
-    "environment_type": "conda",
+    "environment_type": "mamba",
     "conda_channels": ["conda-forge"],

     // timeout in seconds for installing any dependencies in environment
@@ -41,7 +41,7 @@

     // The Pythons you'd like to test against. If not provided, defaults
     // to the current version of Python used to run `asv`.
-    "pythons": ["3.10"],
+    "pythons": ["3.11"],

     // The matrix of dependencies to test. Each key is the name of a
     // package (in PyPI) and the values are version numbers. An empty
@@ -72,8 +72,12 @@
         "sparse": [""],
         "cftime": [""]
     },
-
-
+    // fix for bad builds
+    // https://github.com/airspeed-velocity/asv/issues/1389#issuecomment-2076131185
+    "build_command": [
+        "python -m build",
+        "python -mpip wheel --no-deps --no-build-isolation --no-index -w {build_cache_dir} {build_dir}"
+    ],
     // Combinations of libraries/python versions can be excluded/included
     // from the set to test. Each entry is a dictionary containing additional
     // key-value pairs to include/exclude.

asv_bench/benchmarks/groupby.py

Lines changed: 9 additions & 8 deletions
@@ -68,6 +68,7 @@ def setup(self, *args, **kwargs):
         self.ds2d_mean = self.ds2d.groupby("b").mean().compute()


+# TODO: These don't work now because we are calling `.compute` explicitly.
 class GroupByPandasDataFrame(GroupBy):
     """Run groupby tests using pandas DataFrame."""

@@ -111,11 +112,11 @@ def setup(self, *args, **kwargs):
             {
                 "b": ("time", np.arange(365.0 * 24)),
             },
-            coords={"time": pd.date_range("2001-01-01", freq="H", periods=365 * 24)},
+            coords={"time": pd.date_range("2001-01-01", freq="h", periods=365 * 24)},
         )
         self.ds2d = self.ds1d.expand_dims(z=10)
-        self.ds1d_mean = self.ds1d.resample(time="48H").mean()
-        self.ds2d_mean = self.ds2d.resample(time="48H").mean()
+        self.ds1d_mean = self.ds1d.resample(time="48h").mean()
+        self.ds2d_mean = self.ds2d.resample(time="48h").mean()

     @parameterized(["ndim"], [(1, 2)])
     def time_init(self, ndim):
@@ -127,15 +128,15 @@ def time_init(self, ndim):
     def time_agg_small_num_groups(self, method, ndim, use_flox):
         ds = getattr(self, f"ds{ndim}d")
         with xr.set_options(use_flox=use_flox):
-            getattr(ds.resample(time="3M"), method)().compute()
+            getattr(ds.resample(time="3ME"), method)().compute()

     @parameterized(
         ["method", "ndim", "use_flox"], [("sum", "mean"), (1, 2), (True, False)]
     )
     def time_agg_large_num_groups(self, method, ndim, use_flox):
         ds = getattr(self, f"ds{ndim}d")
         with xr.set_options(use_flox=use_flox):
-            getattr(ds.resample(time="48H"), method)().compute()
+            getattr(ds.resample(time="48h"), method)().compute()


 class ResampleDask(Resample):
@@ -154,13 +155,13 @@ def setup(self, *args, **kwargs):
             },
             coords={
                 "time": xr.date_range(
-                    "2001-01-01", freq="H", periods=365 * 24, calendar="noleap"
+                    "2001-01-01", freq="h", periods=365 * 24, calendar="noleap"
                 )
             },
         )
         self.ds2d = self.ds1d.expand_dims(z=10)
-        self.ds1d_mean = self.ds1d.resample(time="48H").mean()
-        self.ds2d_mean = self.ds2d.resample(time="48H").mean()
+        self.ds1d_mean = self.ds1d.resample(time="48h").mean()
+        self.ds2d_mean = self.ds2d.resample(time="48h").mean()


 @parameterized(["use_cftime", "use_flox"], [[True, False], [True, False]])
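The frequency-string changes in this file track pandas 2.2, which deprecated the uppercase hourly alias "H" in favor of "h" and renamed the month-end alias "M" to "ME" (hence "3M" becoming "3ME"). A minimal sketch of the new spellings, assuming pandas >= 2.2:

import pandas as pd

# Hourly steps: lowercase "h" replaces the deprecated "H" alias.
hours = pd.date_range("2001-01-01", freq="h", periods=24)

# Month-end steps: "ME" replaces the deprecated "M" alias, so a
# three-month resampling window is spelled "3ME".
month_ends = pd.date_range("2001-01-31", freq="ME", periods=3)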

asv_bench/benchmarks/indexing.py

Lines changed: 11 additions & 0 deletions
@@ -12,12 +12,14 @@
 nt = 500

 basic_indexes = {
+    "1scalar": {"x": 0},
     "1slice": {"x": slice(0, 3)},
     "1slice-1scalar": {"x": 0, "y": slice(None, None, 3)},
     "2slicess-1scalar": {"x": slice(3, -3, 3), "y": 1, "t": slice(None, -3, 3)},
 }

 basic_assignment_values = {
+    "1scalar": 0,
     "1slice": xr.DataArray(randn((3, ny), frac_nan=0.1), dims=["x", "y"]),
     "1slice-1scalar": xr.DataArray(randn(int(ny / 3) + 1, frac_nan=0.1), dims=["y"]),
     "2slicess-1scalar": xr.DataArray(
@@ -74,6 +76,10 @@ def setup(self, key):
                 "x_coords": ("x", np.linspace(1.1, 2.1, nx)),
             },
         )
+        # Benchmark how indexing is slowed down by adding many scalar variables
+        # to the dataset
+        # https://github.com/pydata/xarray/pull/9003
+        self.ds_large = self.ds.merge({f"extra_var{i}": i for i in range(400)})


 class Indexing(Base):
@@ -89,6 +95,11 @@ def time_indexing_outer(self, key):
     def time_indexing_vectorized(self, key):
         self.ds.isel(**vectorized_indexes[key]).load()

+    @parameterized(["key"], [list(basic_indexes.keys())])
+    def time_indexing_basic_ds_large(self, key):
+        # https://github.com/pydata/xarray/pull/9003
+        self.ds_large.isel(**basic_indexes[key]).load()
+

 class Assignment(Base):
     @parameterized(["key"], [list(basic_indexes.keys())])

ci/min_deps_check.py

Lines changed: 1 addition & 1 deletion
@@ -133,7 +133,7 @@ def process_pkg(
     - publication date of version suggested by policy (YYYY-MM-DD)
     - status ("<", "=", "> (!)")
     """
-    print("Analyzing %s..." % pkg)
+    print(f"Analyzing {pkg}...")
     versions = query_conda(pkg)

     try:

doc/user-guide/io.rst

Lines changed: 2 additions & 2 deletions
@@ -874,7 +874,7 @@ and then calling ``to_zarr`` with ``compute=False`` to write only metadata
     # The values of this dask array are entirely irrelevant; only the dtype,
     # shape and chunks are used
     dummies = dask.array.zeros(30, chunks=10)
-    ds = xr.Dataset({"foo": ("x", dummies)})
+    ds = xr.Dataset({"foo": ("x", dummies)}, coords={"x": np.arange(30)})
     path = "path/to/directory.zarr"
     # Now we write the metadata without computing any array values
     ds.to_zarr(path, compute=False)
@@ -890,7 +890,7 @@ where the data should be written (in index space, not label space), e.g.,

     # For convenience, we'll slice a single dataset, but in the real use-case
     # we would create them separately possibly even from separate processes.
-    ds = xr.Dataset({"foo": ("x", np.arange(30))})
+    ds = xr.Dataset({"foo": ("x", np.arange(30))}, coords={"x": np.arange(30)})
     # Any of the following region specifications are valid
     ds.isel(x=slice(0, 10)).to_zarr(path, region="auto")
     ds.isel(x=slice(10, 20)).to_zarr(path, region={"x": "auto"})
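Pulled together, the two edited snippets form the following end-to-end pattern. This is a sketch assuming zarr and dask are installed and "directory.zarr" is a writable local path; the explicit "x" coordinate added in this commit is what lets region="auto" locate each slice in the store.

import dask.array
import numpy as np
import xarray as xr

path = "directory.zarr"  # hypothetical local store

# Write the metadata only; the dask array's values are never computed.
dummies = dask.array.zeros(30, chunks=10)
template = xr.Dataset({"foo": ("x", dummies)}, coords={"x": np.arange(30)})
template.to_zarr(path, compute=False)

# Fill the store region by region (in real use, from separate processes).
ds = xr.Dataset({"foo": ("x", np.arange(30))}, coords={"x": np.arange(30)})
ds.isel(x=slice(0, 10)).to_zarr(path, region="auto")
ds.isel(x=slice(10, 20)).to_zarr(path, region={"x": "auto"})
ds.isel(x=slice(20, 30)).to_zarr(path, region={"x": slice(20, 30)})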

doc/whats-new.rst

Lines changed: 12 additions & 1 deletion
@@ -29,8 +29,13 @@ New Features
   for example, will retain the object. However, one cannot do operations that are not possible on the `ExtensionArray`
   then, such as broadcasting.
   By `Ilan Gold <https://github.com/ilan-gold>`_.
+- :py:func:`testing.assert_allclose`/:py:func:`testing.assert_equal` now accept a new argument `check_dims="transpose"`, controlling whether a transposed array is considered equal. (:issue:`5733`, :pull:`8991`)
+  By `Ignacio Martinez Vazquez <https://github.com/ignamv>`_.
 - Added the option to avoid automatically creating 1D pandas indexes in :py:meth:`Dataset.expand_dims()`, by passing the new kwarg
-  `create_index=False`. (:pull:`8960`)
+  `create_index_for_new_dim=False`. (:pull:`8960`)
+  By `Tom Nicholas <https://github.com/TomNicholas>`_.
+- Avoid automatically re-creating 1D pandas indexes in :py:func:`concat()`. Also added option to avoid creating 1D indexes for
+  new dimension coordinates by passing the new kwarg `create_index_for_new_dim=False`. (:issue:`8871`, :pull:`8872`)
   By `Tom Nicholas <https://github.com/TomNicholas>`_.

 Breaking changes
@@ -58,6 +63,12 @@ Breaking changes

 Bug fixes
 ~~~~~~~~~
+- Following `an upstream bug fix
+  <https://github.com/pandas-dev/pandas/issues/56147>`_ to
+  :py:func:`pandas.date_range`, date ranges produced by
+  :py:func:`xarray.cftime_range` with negative frequencies will now fall fully
+  within the bounds of the provided start and end dates (:pull:`8999`). By
+  `Spencer Clark <https://github.com/spencerkclark>`_.


 Internal Changes
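For the check_dims="transpose" entry above, a short usage sketch; the keyword and value come straight from the changelog text, and the data here is illustrative only.

import xarray as xr

a = xr.DataArray([[1.0, 2.0], [3.0, 4.0]], dims=["x", "y"])
b = a.transpose("y", "x")

# By default, a transposed array does not compare equal because the
# dimension order differs; check_dims="transpose" allows it.
xr.testing.assert_allclose(a, b, check_dims="transpose")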
