diff --git a/docs/source/user-stories/climatology.ipynb b/docs/source/user-stories/climatology.ipynb
index 30bf38b75..3cbf8b805 100644
--- a/docs/source/user-stories/climatology.ipynb
+++ b/docs/source/user-stories/climatology.ipynb
@@ -61,7 +61,9 @@
  "source": [
  "To account for Feb-29 being present in some years, we'll construct a time vector to group by as \"mmm-dd\" string.\n",
  "\n",
- "For more options, see https://strftime.org/"
+ "```{seealso}\n",
+ "For more options, see [this great website](https://strftime.org/).\n",
+ "```"
  ]
  },
  {
@@ -80,7 +82,7 @@
  "id": "6",
  "metadata": {},
  "source": [
- "## map-reduce\n",
+ "## First, `method=\"map-reduce\"`\n",
  "\n",
  "The default\n",
  "[method=\"map-reduce\"](https://flox.readthedocs.io/en/latest/implementation.html#method-map-reduce)\n",
@@ -110,7 +112,7 @@
  "id": "8",
  "metadata": {},
  "source": [
- "## Rechunking for map-reduce\n",
+ "### Rechunking for map-reduce\n",
  "\n",
  "We can split each chunk along the `lat`, `lon` dimensions to make sure the\n",
  "output chunk sizes are more reasonable\n"
@@ -139,7 +141,7 @@
  "But what if we didn't want to rechunk the dataset so drastically (note the 10x\n",
  "increase in tasks). For that let's try `method=\"cohorts\"`\n",
  "\n",
- "## method=cohorts\n",
+ "## `method=\"cohorts\"`\n",
  "\n",
  "We can take advantage of patterns in the groups here \"day of year\".\n",
  "Specifically:\n",
@@ -271,7 +273,7 @@
  "id": "21",
  "metadata": {},
  "source": [
- "And now our cohorts contain more than one group\n"
+ "And now our cohorts contain more than one group, *and* there is a substantial reduction in number of cohorts **162 -> 12**\n"
  ]
  },
  {
@@ -281,7 +283,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "preferrd_method, new_cohorts = flox.core.find_group_cohorts(\n",
+ "preferred_method, new_cohorts = flox.core.find_group_cohorts(\n",
  " labels=codes,\n",
  " chunks=(rechunked.chunksizes[\"time\"],),\n",
  ")\n",
@@ -295,13 +297,23 @@
  "id": "23",
  "metadata": {},
  "outputs": [],
+ "source": [
+ "preferred_method"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "24",
+ "metadata": {},
+ "outputs": [],
  "source": [
  "new_cohorts.values()"
  ]
  },
  {
  "cell_type": "markdown",
- "id": "24",
+ "id": "25",
  "metadata": {},
  "source": [
  "Now the groupby reduction **looks OK** in terms of number of tasks but remember\n",
@@ -311,7 +323,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "25",
+ "id": "26",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -320,7 +332,25 @@
  },
  {
  "cell_type": "markdown",
- "id": "26",
+ "id": "27",
+ "metadata": {},
+ "source": [
+ "flox's heuristics will choose `\"cohorts\"` automatically!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "28",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "flox.xarray.xarray_reduce(rechunked, day, func=\"mean\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "29",
  "metadata": {},
  "source": [
  "## How about other climatologies?\n",
@@ -331,7 +361,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "27",
+ "id": "30",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -340,7 +370,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "28",
+ "id": "31",
  "metadata": {},
  "source": [
  "This looks great. Why?\n",