Commit 9f9fc26

cloudless mosaic notebook: use Dask-Gateway (#351)
* calculation with Dask Gateway
* correct study area
* clear metadata
1 parent 03533be commit 9f9fc26

1 file changed: +70 -71 lines changed
examples/cloudless-mosaic-sentinel2.ipynb

@@ -17,7 +17,9 @@
 "\n",
 "SENTINEL-2 (https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi/overview) is a wide-swath, high-resolution, multi-spectral imaging mission, supporting Copernicus Land Monitoring studies, including the monitoring of vegetation, soil and water cover, as well as observation of inland waterways and coastal areas.\n",
 "\n",
-"## 2. Imports"
+"## 2. Environment setup\n",
+"\n",
+"The necessary libraries are listed below."
 ]
 },
 {
@@ -26,12 +28,53 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"import numpy as np\n",
+"import xarray as xr\n",
+"import datashader as ds\n",
+"from datashader import Canvas\n",
+"from datashader.transfer_functions import shade, Images\n",
+"\n",
 "import stackstac\n",
 "from satsearch import Search\n",
 "\n",
-"import xrspatial.multispectral as ms\n",
+"import xrspatial.multispectral as ms"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"from dask_gateway import GatewayCluster\n",
+"from dask_gateway import Gateway\n",
+"from distributed import Client\n",
+"from dask.distributed import PipInstall\n",
 "\n",
-"import matplotlib.pyplot as plt"
+"plugin = PipInstall(packages=[\"stackstac\"])"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Let's create a new cluster configured to use Dask-Gateway, and a new client that executes all Dask computations on that cluster. We also put the cluster in adaptive mode so that it resizes itself automatically based on the workload."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"cluster = GatewayCluster() # Creates the Dask Scheduler. Might take a minute.\n",
+"\n",
+"client = cluster.get_client()\n",
+"client.register_worker_plugin(plugin)\n",
+"\n",
+"cluster.adapt(minimum=8, maximum=100)\n",
+"\n",
+"client"
 ]
 },
 {
@@ -40,7 +83,7 @@
 "source": [
 "## 3. Load Sentinel 2 data\n",
 "\n",
-"In this example, we use data from `sentinel-s2-l2a-cogs` collection within a bounding box of `[-93.112301, 29.649001, -92.075965, 30.719868]`, and the time range considered is from `2019-07-01` to `2020-06-30`. And the collected data has less than 25% cloud coverage."
+"In this example, we use data from the `sentinel-s2-l2a-cogs` collection within the bounding box `[-97.185642, 27.569157, -95.117574, 29.500710]`, over the time range from `2019-07-01` to `2020-06-30`. Only scenes with less than 25% cloud coverage are collected."
 ]
 },
 {
@@ -51,7 +94,7 @@
 "source": [
 "items = Search(\n",
 "    url=\"https://earth-search.aws.element84.com/v0\",\n",
-"    bbox=[-93.112301, 29.649001, -92.075965, 30.719868],\n",
+"    bbox=[-97.185642, 27.569157, -95.117574, 29.500710],\n",
 "    collections=[\"sentinel-s2-l2a-cogs\"],\n",
 "    query={'eo:cloud_cover': {'lt': 25}},\n",
 "    datetime=\"2019-07-01/2020-06-30\"\n",
@@ -67,7 +110,7 @@
 "Let's combine all the above STAC items into a lazy xarray with following settings:\n",
 "- projection: epsg=32613\n",
 "- resolution: 100m\n",
-"- bands: green (B02), red (B03), blue (B04), NIR (B08), SWIR1 (B11)"
+"- bands: red (B04), green (B03), blue (B02)"
 ]
 },
 {
@@ -76,10 +119,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"bands = ['B02', 'B03', 'B04', 'B08', 'B11']\n",
-"\n",
 "stack_ds = stackstac.stack(\n",
-"    items, epsg=32613, resolution=100, assets=bands\n",
+"    items, epsg=32613, resolution=100, assets=['B04', 'B03', 'B02']\n",
 ")\n",
 "\n",
 "stack_ds"
@@ -99,20 +140,10 @@
 "outputs": [],
 "source": [
 "monthly = stack_ds.resample(time=\"MS\").median(\"time\", keep_attrs=True)\n",
+"monthly.data = monthly.data.rechunk(1024, 1024)\n",
 "monthly"
 ]
 },
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"import dask.diagnostics\n",
-"with dask.diagnostics.ProgressBar():\n",
-"    monthly = monthly.compute()"
-]
-},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -128,44 +159,30 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"median_scene = monthly.median(dim=['time'])"
+"median_scene = monthly.median(dim=['time'])\n",
+"median_scene.data = median_scene.data.rechunk(2048, 2048)\n",
+"median_scene"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"With 3 bands: red, green, blue, let's see the true color using the `true_color` function from `xrspatial.multispectral module` for each separate month and the median layer."
+"## 5. Save median layer to Azure"
 ]
 },
 {
-"cell_type": "code",
-"execution_count": null,
+"cell_type": "markdown",
 "metadata": {},
-"outputs": [],
 "source": [
-"bands_mapping = {v: i for i, v in enumerate(bands)}\n",
-"\n",
-"band_blue = bands_mapping['B02']\n",
-"band_green = bands_mapping['B03']\n",
-"band_red = bands_mapping['B04']"
+"## 6. Downsample for visualization"
 ]
 },
 {
-"cell_type": "code",
-"execution_count": null,
+"cell_type": "markdown",
 "metadata": {},
-"outputs": [],
 "source": [
-"months = 12\n",
-"imgs = []\n",
-"for month in range(months):\n",
-"    # True color\n",
-"    r = monthly[month][band_red]\n",
-"    g = monthly[month][band_green]\n",
-"    b = monthly[month][band_blue]\n",
-"    img = ms.true_color(r, g, b)\n",
-"    imgs.append(img)"
+"With the 3 bands red, green, and blue, let's visualize the cloud-free scene we just constructed using the `true_color` function from the `xrspatial.multispectral` module."
 ]
 },
 {
@@ -174,21 +191,18 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# Utility function for displaying images\n",
+"h, w = 600, 800\n",
+"canvas = Canvas(plot_height=h, plot_width=w)\n",
+"resampled_agg = canvas.raster(median_scene)\n",
 "\n",
-"def display_images(images, columns=2, width=50, height=50):\n",
-"    height = max(height, int(len(images)/columns) * height)\n",
-"    plt.figure(figsize=(width, height))\n",
-"    for i, image in enumerate(images):\n",
-"        plt.subplot(len(images) / columns + 1, columns, i + 1)\n",
-"        plt.imshow(image)"
+"resampled_agg"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Monthly data"
+"The `true_color` function takes 3 bands (red, green, blue) as inputs and returns a PIL.Image object."
 ]
 },
 {
@@ -197,15 +211,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# takes some time to run\n",
-"display_images(imgs)"
+"image = ms.true_color(resampled_agg[0], resampled_agg[1], resampled_agg[2])\n",
+"\n",
+"image"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Median layer"
+"Finally, close the client and the cluster."
 ]
 },
 {
{
@@ -214,24 +229,8 @@
214229
"metadata": {},
215230
"outputs": [],
216231
"source": [
217-
"median_scene = monthly.median(dim=['time'])\n",
218-
"\n",
219-
"median_red_agg = median_scene[band_red]\n",
220-
"median_green_agg = median_scene[band_green]\n",
221-
"median_blue_agg = median_scene[band_blue]\n",
222-
"\n",
223-
"median_img = ms.true_color(median_red_agg, median_green_agg, median_blue_agg)\n",
224-
"\n",
225-
"median_img"
226-
]
227-
},
228-
{
229-
"cell_type": "markdown",
230-
"metadata": {},
231-
"source": [
232-
"### References\n",
233-
"\n",
234-
"- https://stackstac.readthedocs.io/en/latest/basic.html"
232+
"client.close()\n",
233+
"cluster.close()"
235234
]
236235
}
237236
],
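
The notebook touched by this commit builds its cloud-free scene in two steps: a per-month median (`resample(time="MS").median("time")`) followed by a median across the monthly layers. Below is a minimal pure-Python sketch of that idea for a single pixel. It assumes made-up reflectance values in which clouds show up as bright outliers; the `observations` data and the `monthly_median` helper are illustrative only and are not part of the notebook.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical (date, reflectance) samples for one pixel; 0.90 stands in for a cloud.
observations = [
    (date(2019, 7, 3), 0.10),
    (date(2019, 7, 18), 0.90),
    (date(2019, 7, 28), 0.12),
    (date(2019, 8, 2), 0.11),
    (date(2019, 8, 12), 0.90),
    (date(2019, 8, 22), 0.13),
    (date(2019, 9, 6), 0.14),
]


def monthly_median(samples):
    """Group samples by the first day of their month and take each group's median."""
    groups = defaultdict(list)
    for day, value in samples:
        groups[day.replace(day=1)].append(value)
    return {month: median(values) for month, values in sorted(groups.items())}


# Step 1: one value per month, labelled by month start (like the "MS" frequency).
monthly = monthly_median(observations)

# Step 2: one value for the pixel, the median of the monthly medians.
median_scene = median(monthly.values())
```

In this toy data the cloudy samples never reach the output, because a single bright outlier in a month cannot become that month's median as long as clear observations outnumber it.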

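`stackstac.stack` is called with `assets=['B04', 'B03', 'B02']`, so positions along the band axis are expected to follow that order: 0 is red (B04), 1 is green (B03), 2 is blue (B02). The sketch below derives those positions from the asset list instead of hard-coding them. It assumes the usual Sentinel-2 naming of the visible bands; `COLOR_OF_BAND` and `band_index` are hypothetical names that do not appear in the notebook.

```python
# Asset order as passed to stackstac.stack in the notebook.
assets = ["B04", "B03", "B02"]

# Sentinel-2 visible bands and the colours they correspond to.
COLOR_OF_BAND = {"B02": "blue", "B03": "green", "B04": "red"}

# Position of each colour along the band axis, derived from the asset order.
band_index = {COLOR_OF_BAND[name]: i for i, name in enumerate(assets)}

# Indices to use when picking the red, green, and blue layers.
r, g, b = band_index["red"], band_index["green"], band_index["blue"]
```

With these indices, a call such as `ms.true_color(resampled_agg[r], resampled_agg[g], resampled_agg[b])` would keep passing the right layers even if the asset list were reordered later.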