Commit 5be5a35

Squashed commit of the following:

commit 51fb6e9
Author: Deepak Cherian <[email protected]>
Date:   Tue Nov 15 13:16:11 2022 -0700

    Some typing (#190)

    * Some typing
    * fixes.

commit 9b01c48
Author: Illviljan <[email protected]>
Date:   Sat Nov 5 09:02:07 2022 +0100

    Add windows CI (#151)

    * Add windows CI
    * Update ci.yaml
    * Update ci.yaml
    * Make arg input the same as shown in pytest
    * Add dtype check
    * [pre-commit.ci] auto fixes from pre-commit.com hooks
      for more information, see https://pre-commit.ci
    * have expected and actual results on the same side
    * use np.intp for count expected
    * [revert] minimize test
    * specify dtypes
    * more fixers
    * more.
    * Fix groupby_reduce
    * [revert] only wiindows tests
    * more fixes?
    * more fixes.
    * more fix
    * Last fix?
    * Update .github/workflows/ci.yaml
    * revert
    * Better fix
    * Revert "revert"
      This reverts commit 3b79f6e.
    * better comment.
    * clean up test
    * Revert "Revert "revert""
      This reverts commit 38438a2.
    * xfail labels dtype test
    * Revert "[revert] only wiindows tests"
      This reverts commit 232cf15.
    * Revert "[revert] minimize test"
      This reverts commit f993b31.
    * fix bad revert

    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    Co-authored-by: dcherian <[email protected]>
    Co-authored-by: Deepak Cherian <[email protected]>

commit e3ea0e7
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Tue Nov 1 08:26:30 2022 -0600

    Bump mamba-org/provision-with-micromamba from 13 to 14 (#184)

    Bumps [mamba-org/provision-with-micromamba](https://github.com/mamba-org/provision-with-micromamba) from 13 to 14.
    - [Release notes](https://github.com/mamba-org/provision-with-micromamba/releases)
    - [Commits](mamba-org/provision-with-micromamba@v13...v14)

    ---
    updated-dependencies:
    - dependency-name: mamba-org/provision-with-micromamba
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit c440148
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Tue Nov 1 08:26:17 2022 -0600

    Bump xarray-contrib/ci-trigger from 1.1 to 1.2 (#183)

    Bumps [xarray-contrib/ci-trigger](https://github.com/xarray-contrib/ci-trigger) from 1.1 to 1.2.
    - [Release notes](https://github.com/xarray-contrib/ci-trigger/releases)
    - [Commits](xarray-contrib/ci-trigger@v1.1...v1.2)

    ---
    updated-dependencies:
    - dependency-name: xarray-contrib/ci-trigger
      dependency-type: direct:production
      update-type: version-update:semver-minor
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 471aa94
Author: Deepak Cherian <[email protected]>
Date:   Wed Oct 26 19:39:44 2022 -0600

    Use blockwise to extract final result (#182)

    * Use blockwise to extract final result for method="blockwise"
    * FOr all methods
    * bugfix
    * Try return_array from _finalize_results
    * Revert "Try return_array from _finalize_results"
      This reverts commit cb25e38.
    * Fixes.

commit df0da40
Author: Deepak Cherian <[email protected]>
Date:   Tue Oct 25 16:09:04 2022 -0600

    Fix blockwise sort optimization (#181)

commit c2c4e1d
Author: Deepak Cherian <[email protected]>
Date:   Tue Oct 25 13:41:40 2022 -0600

    Support reindexing in simple_combine (#177)

    * Support reindexing in simple_combine

    For 1D combine, great improvement for cohorts-type reductions
    More memory but similar time for map-reduce.
    Note that the map-reduce intermediates are a worst case where
    there are no shared groups between the chunks being combined.
    This case is actually optimized in _group_combine where reindexing
    is skipped for reducing along a single axis.

    [ 68.75%] ···
    =========== ========= =========
    --               combine
    ----------- -------------------
       kind      grouped   combine
    =========== ========= =========
      cohorts      760M      631M
     mapreduce     981M     1.81G
    =========== ========= =========

    [ 75.00%] ···
    =========== ========== ===========
    --                combine
    ----------- ----------------------
       kind      grouped     combine
    =========== ========== ===========
      cohorts    393±10ms    137±10ms
     mapreduce   652±10ms   611±400ms
    =========== ========== ===========

    Fix bug in unique

    * Fix bug with all NaN blocks

commit 0db264a
Author: Deepak Cherian <[email protected]>
Date:   Mon Oct 24 12:27:12 2022 -0600

    Update visualize for main changes (#179)

    * Update visualize for main changes
    * Update tests/__init__.py

commit ccf578d
Author: Deepak Cherian <[email protected]>
Date:   Mon Oct 24 09:26:07 2022 -0600

    Use npg sum_of_squares (#135)

    Bump to npg >= 0.9.19

    Closes #107

commit c370b5d
Author: dcherian <[email protected]>
Date:   Thu Oct 20 10:07:36 2022 -0600

    Fix unique

commit 336e2bb
Author: dcherian <[email protected]>
Date:   Wed Oct 19 14:32:12 2022 -0600

    Remove split-reduce to reduce some test time.

commit 5431813
Author: dcherian <[email protected]>
Date:   Wed Oct 19 20:28:15 2022 -0600

    Type more utility functions

commit aa14656
Author: dcherian <[email protected]>
Date:   Wed Oct 19 20:05:56 2022 -0600

    Simplify code for blockwise

commit b1b43d9
Author: dcherian <[email protected]>
Date:   Wed Oct 19 20:01:47 2022 -0600

    Test argmax early

commit 47e0b38
Author: Deepak Cherian <[email protected]>
Date:   Wed Oct 19 17:46:54 2022 -0600

    Force reindex to be bool always (#176)

    * Force reindex to be bool always

    Closes #155

    Turns out we weren't using the more efficient simple_combine with
    map_reduce in all cases because do_simple_combine was None when
    reindex was None. Now the default for map-reduce is reindex=True
    when (expected_groups is not None) or
    (expected_groups is None and by_is_dask is False)
1 parent 32cae9d commit 5be5a35
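
The message for 47e0b38 states the new default for `reindex` under map-reduce as a boolean rule. A minimal sketch of that rule as a standalone function follows. The helper name `resolve_reindex_default` and its signature are hypothetical; only the condition itself is taken from the commit message, and this is not flox's actual code.

```python
from typing import Optional, Sequence


def resolve_reindex_default(
    reindex: Optional[bool],
    expected_groups: Optional[Sequence],
    by_is_dask: bool,
) -> bool:
    """Return a concrete bool for ``reindex`` (hypothetical helper)."""
    if reindex is not None:
        # An explicit user choice is kept, just coerced to bool.
        return bool(reindex)
    # Rule quoted from the commit message for map-reduce:
    # reindex=True when (expected_groups is not None)
    # or (expected_groups is None and by_is_dask is False).
    return expected_groups is not None or not by_is_dask


# Unknown groups on a dask `by` is the only case that defaults to False.
print(resolve_reindex_default(None, None, True))
```

The point of always returning a bool is that downstream checks never see `None`, which, per the message, is what previously kept the faster simple combine from being selected.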

16 files changed: +372 -210 lines changed

.github/workflows/benchmarks.yml

+1 -1
@@ -22,7 +22,7 @@ jobs:
           fetch-depth: 0

       - name: Set up conda environment
-        uses: mamba-org/provision-with-micromamba@v13
+        uses: mamba-org/provision-with-micromamba@v14
         with:
           environment-file: ci/environment.yml
           environment-name: flox-tests

.github/workflows/ci-additional.yaml

+3 -3
@@ -25,7 +25,7 @@ jobs:
       - uses: actions/checkout@v3
         with:
           fetch-depth: 2
-      - uses: xarray-contrib/ci-trigger@v1.1
+      - uses: xarray-contrib/ci-trigger@v1.2
         id: detect-trigger
         with:
           keyword: "[skip-ci]"
@@ -53,7 +53,7 @@ jobs:
           echo "TODAY=$(date +'%Y-%m-%d')" >> $GITHUB_ENV

       - name: Setup micromamba
-        uses: mamba-org/provision-with-micromamba@34071ca7df4983ccd272ed0d3625818b27b70dcc
+        uses: mamba-org/provision-with-micromamba@v14
         with:
           environment-file: ${{env.CONDA_ENV_FILE}}
           environment-name: flox-tests
@@ -94,7 +94,7 @@ jobs:
         run: |
           echo "TODAY=$(date +'%Y-%m-%d')" >> $GITHUB_ENV
       - name: Setup micromamba
-        uses: mamba-org/provision-with-micromamba@34071ca7df4983ccd272ed0d3625818b27b70dcc
+        uses: mamba-org/provision-with-micromamba@v14
         with:
           environment-file: ${{env.CONDA_ENV_FILE}}
           environment-name: xarray-tests

.github/workflows/ci.yaml

+6 -7
@@ -24,7 +24,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        os: ["ubuntu-latest"]
+        os: ["ubuntu-latest", "windows-latest"]
         python-version: ["3.8", "3.10"]
     steps:
       - uses: actions/checkout@v3
@@ -34,7 +34,7 @@ jobs:
         run: |
           echo "PYTHON_VERSION=${{ matrix.python-version }}" >> $GITHUB_ENV
       - name: Set up conda environment
-        uses: mamba-org/provision-with-micromamba@v13
+        uses: mamba-org/provision-with-micromamba@v14
         with:
           environment-file: ci/environment.yml
           environment-name: flox-tests
@@ -43,8 +43,7 @@ jobs:
             python="${{ matrix.python-version }}"
       - name: Install flox
         run: |
-          python -m pip install -e .
-          conda list
+          python -m pip install --no-deps -e .
       - name: Run Tests
         run: |
           pytest -n auto --cov=./ --cov-report=xml
@@ -78,7 +77,7 @@ jobs:
         with:
           fetch-depth: 0 # Fetch all history for all branches and tags.
       - name: Set up conda environment
-        uses: mamba-org/provision-with-micromamba@v13
+        uses: mamba-org/provision-with-micromamba@v14
         with:
           environment-file: ci/${{ matrix.env }}.yml
           environment-name: flox-tests
@@ -101,7 +100,7 @@ jobs:
     steps:
       - uses: actions/checkout@v3
       - name: Set up conda environment
-        uses: mamba-org/provision-with-micromamba@v13
+        uses: mamba-org/provision-with-micromamba@v14
         with:
           environment-file: ci/upstream-dev-env.yml
           environment-name: flox-tests
@@ -123,7 +122,7 @@ jobs:
           repository: 'pydata/xarray'
           fetch-depth: 0 # Fetch all history for all branches and tags.
       - name: Set up conda environment
-        uses: mamba-org/provision-with-micromamba@v13
+        uses: mamba-org/provision-with-micromamba@v14
         with:
           environment-file: ci/requirements/environment.yml
           environment-name: xarray-tests

asv_bench/benchmarks/combine.py

+19 -10
@@ -1,3 +1,5 @@
+from functools import partial
+
 import numpy as np

 import flox
@@ -7,26 +9,31 @@
 N = 1000


+def _get_combine(combine):
+    if combine == "grouped":
+        return partial(flox.core._grouped_combine, engine="numpy")
+    else:
+        return partial(flox.core._simple_combine, reindex=False)
+
+
 class Combine:
     def setup(self, *args, **kwargs):
         raise NotImplementedError

-    @parameterized("kind", ("cohorts", "mapreduce"))
-    def time_combine(self, kind):
-        flox.core._grouped_combine(
+    @parameterized(("kind", "combine"), (("reindexed", "not_reindexed"), ("grouped", "simple")))
+    def time_combine(self, kind, combine):
+        _get_combine(combine)(
             getattr(self, f"x_chunk_{kind}"),
             **self.kwargs,
             keepdims=True,
-            engine="numpy",
         )

-    @parameterized("kind", ("cohorts", "mapreduce"))
-    def peakmem_combine(self, kind):
-        flox.core._grouped_combine(
+    @parameterized(("kind", "combine"), (("reindexed", "not_reindexed"), ("grouped", "simple")))
+    def peakmem_combine(self, kind, combine):
+        _get_combine(combine)(
             getattr(self, f"x_chunk_{kind}"),
             **self.kwargs,
             keepdims=True,
-            engine="numpy",
         )


@@ -47,7 +54,7 @@ def construct_member(groups):
             }

         # motivated by
-        self.x_chunk_mapreduce = [
+        self.x_chunk_not_reindexed = [
             construct_member(groups)
             for groups in [
                 np.array((1, 2, 3, 4)),
@@ -57,5 +64,7 @@ def construct_member(groups):
             * 2
         ]

-        self.x_chunk_cohorts = [construct_member(groups) for groups in [np.array((1, 2, 3, 4))] * 4]
+        self.x_chunk_reindexed = [
+            construct_member(groups) for groups in [np.array((1, 2, 3, 4))] * 4
+        ]
         self.kwargs = {"agg": flox.aggregations.mean, "axis": (3,)}
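
The `_get_combine` helper added in this benchmark uses `functools.partial` to pre-bind the one keyword that differs between the two combine routines, so both benchmark methods can call either routine with the same arguments. The sketch below shows the same pattern in isolation. `grouped_combine` and `simple_combine` here are hypothetical stand-ins, not flox's internal functions, and their return values exist only to make the binding visible.

```python
from functools import partial


def grouped_combine(chunks, *, engine, keepdims=False):
    # Stand-in for a combine routine that requires an `engine` argument.
    return ("grouped", engine, keepdims, len(chunks))


def simple_combine(chunks, *, reindex, keepdims=False):
    # Stand-in for a combine routine that requires a `reindex` argument.
    return ("simple", reindex, keepdims, len(chunks))


def get_combine(name):
    """Bind the routine-specific keyword so callers share one call signature."""
    if name == "grouped":
        return partial(grouped_combine, engine="numpy")
    return partial(simple_combine, reindex=False)


# The caller no longer needs to know which extra keyword each routine takes.
results = [get_combine(name)([1, 2, 3], keepdims=True) for name in ("grouped", "simple")]
print(results)
```

This is why the diff can drop `engine="numpy"` from the call sites: the argument moves into the partial.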

asv_bench/benchmarks/reduce.py

+1 -1
@@ -6,7 +6,7 @@
 from . import parameterized

 N = 1000
-funcs = ["sum", "nansum", "mean", "nanmean", "max"]
+funcs = ["sum", "nansum", "mean", "nanmean", "max", "var", "nanvar"]
 engines = ["flox", "numpy"]



ci/docs.yml

+1 -1
@@ -7,7 +7,7 @@ dependencies:
   - xarray
   - numpy>=1.20
   - numpydoc
-  - numpy_groupies
+  - numpy_groupies>=0.9.19
   - toolz
   - matplotlib-base
   - myst-parser

ci/environment.yml

+1 -1
@@ -16,7 +16,7 @@ dependencies:
   - pytest-xdist
   - xarray
   - pre-commit
-  - numpy_groupies>=0.9.15
+  - numpy_groupies>=0.9.19
   - pooch
   - toolz
   - numba

ci/minimal-requirements.yml

+1 -1
@@ -9,7 +9,7 @@ dependencies:
   - pytest-cov
   - pytest-xdist
   - numpy==1.20
-  - numpy_groupies==0.9.15
+  - numpy_groupies==0.9.19
   - pandas
   - pooch
   - toolz

ci/no-dask.yml

+1 -1
@@ -13,7 +13,7 @@ dependencies:
   - xarray
   - numpydoc
   - pre-commit
-  - numpy_groupies>=0.9.15
+  - numpy_groupies>=0.9.19
   - pooch
   - toolz
   - numba

ci/no-xarray.yml

+1 -1
@@ -13,7 +13,7 @@ dependencies:
   - dask-core
   - numpydoc
   - pre-commit
-  - numpy_groupies>=0.9.15
+  - numpy_groupies>=0.9.19
   - pooch
   - toolz
   - numba

flox/aggregate_npg.py

+32 -16
@@ -9,14 +9,43 @@ def _get_aggregate(engine):


 def sum_of_squares(
-    group_idx, array, engine, *, axis=-1, func="sum", size=None, fill_value=None, dtype=None
+    group_idx,
+    array,
+    engine,
+    *,
+    axis=-1,
+    size=None,
+    fill_value=None,
+    dtype=None,
 ):

     return _get_aggregate(engine).aggregate(
         group_idx,
-        array**2,
+        array,
         axis=axis,
-        func=func,
+        func="sumofsquares",
+        size=size,
+        fill_value=fill_value,
+        dtype=dtype,
+    )
+
+
+def nansum_of_squares(
+    group_idx,
+    array,
+    engine,
+    *,
+    axis=-1,
+    size=None,
+    fill_value=None,
+    dtype=None,
+):
+
+    return _get_aggregate(engine).aggregate(
+        group_idx,
+        array,
+        axis=axis,
+        func="nansumofsquares",
         size=size,
         fill_value=fill_value,
         dtype=dtype,
@@ -55,19 +84,6 @@ def nanprod(group_idx, array, engine, *, axis=-1, size=None, fill_value=None, dt
     )


-def nansum_of_squares(group_idx, array, engine, *, axis=-1, size=None, fill_value=None, dtype=None):
-    return sum_of_squares(
-        group_idx,
-        array,
-        engine=engine,
-        func="nansum",
-        size=size,
-        fill_value=fill_value,
-        axis=axis,
-        dtype=dtype,
-    )
-
-
 def _len(group_idx, array, engine, *, func, axis=-1, size=None, fill_value=None, dtype=None):
     result = _get_aggregate(engine).aggregate(
         group_idx,
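
The change above stops squaring the array in Python (`array**2` followed by a `"sum"` or `"nansum"` aggregation) and instead passes the unmodified array with `func="sumofsquares"` or `func="nansumofsquares"`. As a reference for the quantity both variants are meant to produce, here is a pure-Python sketch. The helper `grouped_sum_of_squares` is hypothetical, does not call numpy_groupies, and ignores `axis`, `fill_value`, and `dtype`.

```python
import math


def grouped_sum_of_squares(group_idx, values, size, skipna=False):
    """Pure-Python reference: out[g] = sum of v**2 over the members of group g."""
    out = [0.0] * size
    for g, v in zip(group_idx, values):
        if skipna and math.isnan(v):
            continue  # the NaN-skipping variant ignores missing values
        out[g] += v * v
    return out


group_idx = [0, 0, 1, 1, 2]
values = [1.0, 2.0, float("nan"), 3.0, 4.0]

# Without skipping, a NaN poisons its whole group.
plain = grouped_sum_of_squares(group_idx, values, size=3)
# With skipping, group 1 reduces to 3.0**2 only.
skipping = grouped_sum_of_squares(group_idx, values, size=3, skipna=True)
print(plain, skipping)
```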

flox/aggregations.py

+1 -1
@@ -39,7 +39,7 @@ def generic_aggregate(
             method_ = getattr(aggregate_npg, func)
             method = partial(method_, engine=engine)
         except AttributeError:
-            aggregate = npg.aggregate_np if engine == "numpy" else npg.aggregate_nb
+            aggregate = aggregate_npg._get_aggregate(engine).aggregate
             method = partial(aggregate, func=func)
     else:
         raise ValueError(
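
The hunk above sits inside a lookup-with-fallback: try to find a dedicated wrapper named after `func`, and if none exists (`AttributeError`), fall back to the generic `aggregate(..., func=func)` entry point. A minimal sketch of that pattern follows. The `wrappers` namespace and both stand-in functions are hypothetical and only illustrate the control flow, not flox's or numpy_groupies' behavior.

```python
from functools import partial
from types import SimpleNamespace


def _generic_aggregate(group_idx, values, *, func):
    # Stand-in for a general-purpose aggregate(func=...) entry point.
    return ("generic", func)


def _special_sum(group_idx, values, *, engine):
    # Stand-in for a hand-written wrapper that needs to know the engine.
    return ("special", engine)


# Hypothetical namespace playing the role of a module of wrappers.
wrappers = SimpleNamespace(sum=_special_sum)


def pick_method(func, engine):
    """Prefer a dedicated wrapper; otherwise bind `func` on the generic path."""
    try:
        method = partial(getattr(wrappers, func), engine=engine)
    except AttributeError:
        method = partial(_generic_aggregate, func=func)
    return method


print(pick_method("sum", "numpy")([0], [1.0]))
print(pick_method("max", "numpy")([0], [1.0]))
```

Routing the fallback through a single engine-selection helper, as the diff does with `_get_aggregate(engine)`, keeps the engine choice in one place instead of repeating the conditional at each call site.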
