More updates

dcherian · dcherian · commit ba4078147cc8 · 2022-11-15T20:49:35.000-07:00
diff --git a/docs/source/api.rst b/docs/source/api.rst
@@ -30,7 +30,7 @@ Visualization
     :toctree: generated/
 
     visualize.draw_mesh
-    visualize.visualize_groups
+    visualize.visualize_groups_1d
     visualize.visualize_cohorts_2d
 
 Aggregation Objects
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -43,8 +43,8 @@
 ]
 
 extlinks = {
-    "issue": ("https://github.com/xarray-contrib/flox/issues/%s", "GH#"),
-    "pr": ("https://github.com/xarray-contrib/flox/pull/%s", "GH#"),
+    "issue": ("https://github.com/xarray-contrib/flox/issues/%s", "GH#%s"),
+    "pr": ("https://github.com/xarray-contrib/flox/pull/%s", "PR#%s"),
 }
 
 templates_path = ["_templates"]
@@ -174,7 +174,7 @@
     "numpy": ("https://numpy.org/doc/stable", None),
     #    "numba": ("https://numba.pydata.org/numba-doc/latest", None),
     "dask": ("https://docs.dask.org/en/latest", None),
-    "xarray": ("http://xarray.pydata.org/en/stable/", None),
+    "xarray": ("https://docs.xarray.dev/en/stable/", None),
 }
 
 autosummary_generate = True
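
The `extlinks` change above appears to follow the Sphinx 4.0+ convention in which the second tuple element is a caption format string containing `%s`, not a bare prefix. The sketch below mimics that substitution in plain Python. `render` is a hypothetical helper written for illustration and is not part of Sphinx; the PR number is the one linked from `engines.md`.

```python
# Same mapping as in docs/source/conf.py after this commit.
extlinks = {
    "issue": ("https://github.com/xarray-contrib/flox/issues/%s", "GH#%s"),
    "pr": ("https://github.com/xarray-contrib/flox/pull/%s", "PR#%s"),
}


def render(role, target):
    """Hypothetical helper: substitute the target into both templates."""
    url_template, caption_template = extlinks[role]
    return url_template % target, caption_template % target


url, caption = render("pr", "72")
print(url)      # link target
print(caption)  # link text shown in the rendered docs
```

With the old captions (`"GH#"` for both roles) issues and pull requests would have been labelled identically; the new captions distinguish them.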
diff --git a/docs/source/custom.md b/docs/source/custom.md
diff --git a/docs/source/engines.md b/docs/source/engines.md
@@ -1 +1,20 @@
+(engines)=
 # Engines
+
+`flox` provides multiple options, selected with the `engine` kwarg, for computing the core GroupBy reduction on numpy arrays or other non-dask array types.
+
+1. `engine="numpy"` wraps `numpy_groupies.aggregate_numpy`. This uses indexing tricks and functions like `np.bincount`, or the ufunc `.at` methods
+   (e.g. `np.maximum.at`) to provide reasonably performant aggregations.
+1. `engine="numba"` wraps `numpy_groupies.aggregate_numba`. This uses `numba` kernels for the core aggregation.
+1. `engine="flox"` uses the `ufunc.reduceat` method after first argsorting the array so that all group members occur sequentially. This was copied from
+    a [gist by Stephan Hoyer](https://gist.github.com/shoyer/f538ac78ae904c936844).
+
+There are some tradeoffs here. For the common case of reducing a nD array by a 1D array of group labels (e.g. `groupby("time.month")`), `engine="flox"` *can* be faster.
+The reason is that `numpy_groupies` converts all groupby problems to a 1D problem, which can involve [some overhead](https://github.com/ml31415/numpy-groupies/pull/46).
+It is possible to optimize this a bit in `flox` or `numpy_groupies` (though the latter is harder).
+The advantage of `engine="numpy"` is that it tends to work for more array types, since it appears to be more common to implement `np.bincount`, and not `np.add.reduceat`.
+
+```{tip}
+Other potential engines we could add are [`numbagg`](https://github.com/numbagg/numbagg) ([stalled PR here](https://github.com/xarray-contrib/flox/pull/72)) and [`datashader`](https://github.com/xarray-contrib/flox/issues/142).
+Both use numba for high-performance aggregations. Contributions and discussion are very welcome!
+```
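
The three techniques named in the `engines.md` text above (`np.bincount`, the ufunc `.at` methods, and `ufunc.reduceat` after an argsort) can be sketched in plain numpy as follows. This is a minimal illustration with made-up data and variable names; it is not the code used by `flox` or `numpy_groupies`, and it assumes integer labels with every group present at least once.

```python
import numpy as np

# Made-up example data: one integer group label per value.
values = np.array([1.0, 5.0, 2.0, 8.0, 3.0, 4.0])
labels = np.array([0, 1, 0, 2, 1, 2])
ngroups = 3

# bincount style: weighted counts give the grouped sum.
sum_bincount = np.bincount(labels, weights=values, minlength=ngroups)

# ufunc.at style: unbuffered in-place maximum per group.
max_at = np.full(ngroups, -np.inf)
np.maximum.at(max_at, labels, values)

# reduceat style: argsort first so members of a group are contiguous,
# then reduce each contiguous run starting at the group boundaries.
order = np.argsort(labels, kind="stable")
sorted_labels = labels[order]
sorted_values = values[order]
starts = np.flatnonzero(np.r_[True, sorted_labels[1:] != sorted_labels[:-1]])
sum_reduceat = np.add.reduceat(sorted_values, starts)

print(sum_bincount, max_at, sum_reduceat)
```

The `reduceat` variant only produces entries for labels that actually occur, which is why the sketch assumes no empty groups.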
diff --git a/docs/source/implementation.md b/docs/source/implementation.md
@@ -1,25 +1,28 @@
 (algorithms)=
 # Parallel Algorithms
 
-`flox` outsources the core GroupBy operation to the vectorized implementations in
-[numpy_groupies](https://github.com/ml31415/numpy-groupies).
-
-Running an efficient groupby reduction in parallel is hard, and strongly depends on how the
-groups are distributed amongst the blocks of an array.
+`flox` outsources the core GroupBy operation to the vectorized implementations controlled by the
+[`engine` kwarg](engines.md). Applying these implementations on a parallel array type like dask
+can be hard. Performance strongly depends on how the groups are distributed amongst the blocks of an array.
 
 `flox` implements 4 strategies for grouped reductions, each is appropriate for a particular distribution of groups
 among the blocks of a dask array. Switch between the various strategies by passing `method`
-and/or `reindex` to either {py:func}`flox.core.groupby_reduce` or `xarray_reduce`.
+and/or `reindex` to either {py:func}`flox.groupby_reduce` or {py:func}`flox.xarray.xarray_reduce`.
 
 Your options are:
 1. `method="map-reduce"` with `reindex=False`
 1. `method="map-reduce"` with `reindex=True`
-1. `method="blockwise"`
-1. `method="cohorts"`
+1. [`method="blockwise"`](method-blockwise)
+1. [`method="cohorts"`](method-cohorts)
 
 The most appropriate strategy for your problem will depend on the chunking of your dataset,
 and the distribution of group labels across those chunks.
 
+```{tip}
+Currently these strategies are implemented for dask. We would like to generalize to other parallel array types
+as appropriate (e.g. Ramba, cubed, arkouda). Please open an issue to discuss if you are interested.
+```
+
 (xarray-split)=
 ## Background: Xarray's current GroupBy strategy
 
@@ -82,9 +85,10 @@ A bigger advantagee is that this approach allows grouping by a dask array so gro
 For example, consider `groupby("time.month")` with monthly frequency data and chunksize of 4 along `time`.
 ![cohorts-schematic](/../diagrams/cohorts-month-chunk4.png)
 With `reindex=True`, each block will become 3x its original size at the blockwise step: input blocks have 4 timesteps while output block
-has a value for all 12 months. One could use `reindex=False` to control memory usage but also see [`method="cohorts"`](cohorts) below.
+has a value for all 12 months. One could use `reindex=False` to control memory usage but also see [`method="cohorts"`](method-cohorts) below.
 
 
+(method-blockwise)=
 ## `method="blockwise"`
 
 One case where `method="map-reduce"` doesn't work well is the case of "resampling" reductions. An
@@ -113,6 +117,7 @@ so that all members of a group are in a single block. Then, the groupby operatio
 1. Works better when multiple groups are already in a single block; so that the intial
    rechunking only involves a small amount of communication.
 
+(method-cohorts)=
 ## `method="cohorts"`
 
 The `map-reduce` strategy is quite effective but can involve some unnecessary communication. It can be possible to exploit
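
The `implementation.md` hunk above notes that with `reindex=True` and `groupby("time.month")` on chunks of 4 timesteps, each block's intermediate result covers all 12 months. The numpy sketch below imitates that map-then-combine pattern for a grouped sum. It is an illustration only, with made-up data and a plain Python loop standing in for dask blocks; it is not how `flox` implements `method="map-reduce"`.

```python
import numpy as np

nmonths = 12
ntime = 24       # two years of monthly data (made-up size)
chunksize = 4    # chunk size along time, as in the example above

values = np.arange(ntime, dtype=float)
month = np.arange(ntime) % nmonths  # labels 0..11 standing in for time.month

# "Map" step: reduce each block independently, always emitting one
# slot per month (the reindexed form), even for months absent from the block.
intermediates = []
for start in range(0, ntime, chunksize):
    block_values = values[start:start + chunksize]
    block_labels = month[start:start + chunksize]
    intermediates.append(
        np.bincount(block_labels, weights=block_values, minlength=nmonths)
    )

# "Reduce" step: combine the per-block intermediates.
total = np.sum(intermediates, axis=0)

# Reference result computed in one pass over the whole array.
expected = np.bincount(month, weights=values, minlength=nmonths)
print(len(intermediates[0]), total)
```

Each intermediate has 12 entries for a 4-element input block, which is the 3x growth the text describes.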
diff --git a/docs/source/index.md b/docs/source/index.md
@@ -24,14 +24,14 @@ See a presentation ([video](https://discourse.pangeo.io/t/november-17-2021-flox-
 
 ## Why flox?
 
-1. {py:func}`flox.groupby_reduce` wraps the `numpy-groupies` package for performant Groupby reductions on nD arrays.
-1. {py:func}`flox.groupby_reduce` provides [parallel-friendly strategies](algorithms) for GroupBy reductions by wrapping `numpy-groupies` for dask arrays.
-1. `flox` integrates with xarray to provide more performant Groupby and Resampling operations.
-1. {py:func}`flox.xarray.xarray_reduce` extends Xarray's GroupBy operations allowing lazy grouping by dask arrays, grouping by multiple arrays,
-   as well as combining categorical grouping and histrogram-style binning operations using multiple variables.
+1. {py:func}`flox.groupby_reduce` [wraps](engines.md) the `numpy-groupies` package for performant Groupby reductions on nD arrays.
+1. {py:func}`flox.groupby_reduce` provides [parallel-friendly strategies](implementation.md) for GroupBy reductions by wrapping `numpy-groupies` for dask arrays.
+1. `flox` [integrates with xarray](xarray.md) to provide more performant Groupby and Resampling operations.
+1. {py:func}`flox.xarray.xarray_reduce` [extends](xarray.md) Xarray's GroupBy operations allowing lazy grouping by dask arrays, grouping by multiple arrays,
+   as well as combining categorical grouping and histogram-style binning operations using multiple variables.
 1. `flox` also provides utility functions for rechunking both dask arrays and Xarray objects along a single dimension using the group labels as a guide:
-  1. To rechunk for blockwise operations: {py:func}`flox.rechunk_for_blockwise`,  {py:func}`flox.xarray.rechunk_for_blockwise`.
-  1. To rechunk so that "cohorts", or groups of labels, tend to occur in the same chunks: {py:func}`flox.rechunk_for_cohorts`,  {py:func}`flox.xarray.rechunk_for_cohorts`.
+    1. To rechunk for blockwise operations: {py:func}`flox.rechunk_for_blockwise`,  {py:func}`flox.xarray.rechunk_for_blockwise`.
+    1. To rechunk so that "cohorts", or groups of labels, tend to occur in the same chunks: {py:func}`flox.rechunk_for_cohorts`,  {py:func}`flox.xarray.rechunk_for_cohorts`.
 
 ## Installing
 
@@ -59,9 +59,10 @@ It was motivated by many discussions in the [Pangeo](https://pangeo.io) communit
 .. toctree::
    :maxdepth: 1
 
-   implementation.md
+   aggregations.md
    engines.md
-   custom.md
+   implementation.md
+   arrays.md
    xarray.md
    api.rst
    user-stories.md
diff --git a/docs/source/xarray.md b/docs/source/xarray.md
@@ -1,3 +1,4 @@
+(xarray)=
 # Xarray
 
 Xarray will use flox by default (if installed) for DataArrays containing numpy and dask arrays. The default choice is `method="cohorts"` which generalizes
