This repository was archived by the owner on Sep 11, 2023. It is now read-only.

Add Topographic Data #105

Closed
jacobbieker opened this issue Sep 9, 2021 · 7 comments · Fixed by #150
Labels
data New data source or feature; or modification of existing data source enhancement New feature or request

Comments

@jacobbieker
Member

In MetNet, and originally in openclimatefix/satflow#1, I added topographic data from USGS for training SatFlow. The data comes in GeoTIFF format at quite high resolution (30 m) across the globe. I have the original files for nearly all of Europe and North Africa locally, and I think it would be a good idea to include them here, as they might help with predicting where clouds or other weather will go.

@jacobbieker jacobbieker added enhancement New feature or request data New data source or feature; or modification of existing data source labels Sep 9, 2021
@JackKelly JackKelly added this to the WP1 essential tasks milestone Sep 10, 2021
@JackKelly
Member

I agree, this would be super-useful to include!

I'm being lazy but please could you link to the code that you wrote to import the topographic data?

@jacobbieker
Member Author

https://github.com/openclimatefix/satflow/blob/d3e4a89810865fa4284d7b57e864ec6cfd51f913/satflow/examples/create_dem.py is how I created the final combined elevation map, which is here: https://github.com/openclimatefix/satflow/blob/d3e4a89810865fa4284d7b57e864ec6cfd51f913/satflow/resources/cutdown_europe_dem.npy. That step threw away all the geo information; the result is an elevation map of Europe and North Africa at roughly 3 km per pixel.
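
For readers wondering what "threw away all the geo information" costs in practice: a plain `.npy` array has no way to say where a pixel sits on the Earth. A minimal sketch of the smallest amount of metadata that would restore that, assuming a regular latitude/longitude grid, might look like the following. The `GridGeoref` class and the extent values are hypothetical and are not taken from `cutdown_europe_dem.npy` or from SatFlow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridGeoref:
    """Hypothetical minimal georeferencing for a regular lat/lon grid."""

    west: float   # longitude of the grid's left edge, in degrees
    north: float  # latitude of the grid's top edge, in degrees
    dx: float     # pixel width, in degrees of longitude
    dy: float     # pixel height, in degrees of latitude

    def pixel_center(self, row: int, col: int) -> tuple[float, float]:
        """Return (lat, lon) of the centre of the pixel at (row, col)."""
        lon = self.west + (col + 0.5) * self.dx
        lat = self.north - (row + 0.5) * self.dy  # row 0 is the northern edge
        return lat, lon


# Illustrative values only; NOT the real extent of the SatFlow elevation map.
georef = GridGeoref(west=-10.0, north=60.0, dx=0.5, dy=0.5)
lat, lon = georef.pixel_center(row=0, col=0)
print(lat, lon)
```

Formats such as GeoTIFF carry this kind of information (plus the coordinate reference system) inside the file, which is why converting to a bare array loses it.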

@JackKelly
Member

Awesome, thank you!

@peterdudfield peterdudfield removed this from the WP1 essential tasks milestone Sep 24, 2021
@jacobbieker
Member Author

Originally, I was planning on recreating the one large file that was in SatFlow for this dataset. But as we are trying to potentially go directly from the raw data to the batches without intermediate storage or losing information, it might be worth keeping all the individual files (2114 of them for the ground covered by RSS). The SRTM data comes in 1 degree tiles, and a single combined file is quite large (> 68 GB) if kept at 30 m resolution. One option could be to leave the files as 1 degree squares, which are each roughly 110 km on a side (narrower east-west at higher latitudes), and load them from the individual raw files when creating batches. Alternatively, we could downsample the files significantly and then combine them into one mosaic. Lowering the resolution to 1 km would still match or beat the resolution of our other data sources, while making the data a lot smaller and easier to load and work with.
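
The trade-off above can be put into rough numbers. This is a back-of-the-envelope sketch, not the calculation behind the figures quoted in the comment. It assumes each 1 degree tile is a 3601 x 3601 grid (the usual size for SRTM 1 arc-second tiles), that samples are stored as 16-bit integers, and that going from about 30 m to about 1 km means keeping one pixel per 33 x 33 block. The function name `estimate_bytes` is made up for illustration. It counts only the tiles themselves, so a rectangular mosaic that also covers areas with no tile, or one stored in a wider data type, would come out larger.

```python
def estimate_bytes(n_tiles: int, tile_side_px: int, bytes_per_sample: int,
                   downsample_factor: int = 1) -> int:
    """Estimate uncompressed storage for square tiles after integer downsampling."""
    side = tile_side_px // downsample_factor  # pixels per side after coarsening
    return n_tiles * side * side * bytes_per_sample


N_TILES = 2114        # tile count given in the comment above
TILE_SIDE_PX = 3601   # assumption: SRTM 1 arc-second tile size
BYTES_PER_SAMPLE = 2  # assumption: int16 elevation samples

full_res = estimate_bytes(N_TILES, TILE_SIDE_PX, BYTES_PER_SAMPLE)
approx_1km = estimate_bytes(N_TILES, TILE_SIDE_PX, BYTES_PER_SAMPLE,
                            downsample_factor=33)

print(f"30 m tiles: {full_res / 1e9:.1f} GB")
print(f"~1 km tiles: {approx_1km / 1e6:.1f} MB")
```

Under these assumptions the full-resolution tiles land in the tens of gigabytes and the 1 km version in the tens of megabytes, a reduction of roughly three orders of magnitude.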

@jacobbieker
Member Author

Just got the script working in #150. The 1 km RSS coverage is 64 MB in total, which makes it a lot simpler to just load from that one file. And it's simple enough to resample to different resolutions if we want to.
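
As an illustration of how cheap resampling becomes once the grid is small, here is a minimal sketch of coarsening a 2-D elevation array by averaging non-overlapping blocks with NumPy. It is not the code from #150, whose approach and file format are not described in this thread. The `block_mean` helper is hypothetical, and the array is synthetic rather than loaded from the real file.

```python
import numpy as np


def block_mean(elevation: np.ndarray, factor: int) -> np.ndarray:
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = elevation.shape
    # Trim trailing rows/columns so both dimensions divide evenly by the factor.
    rows_trim = rows - rows % factor
    cols_trim = cols - cols % factor
    trimmed = elevation[:rows_trim, :cols_trim]
    # Split each axis into (number of blocks, block size) and average each block.
    blocks = trimmed.reshape(rows_trim // factor, factor,
                             cols_trim // factor, factor)
    return blocks.mean(axis=(1, 3))


# Synthetic stand-in for a 1 km elevation grid (values are not real elevations).
grid_1km = np.arange(36, dtype=np.float32).reshape(6, 6)
grid_3km = block_mean(grid_1km, factor=3)
print(grid_3km.shape)
```

Block averaging only handles integer coarsening factors on an already regular grid. Reprojecting or resampling to an arbitrary resolution would need a geospatial library instead.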

@JackKelly
Member

JackKelly commented Sep 25, 2021

Yeah, reducing the resolution ahead-of-time to 1km sounds good to me!

One of the main reasons for trying to cut out the Zarr intermediates for NWPs and satellite data is that those intermediate Zarr arrays are huge (many tens of TB when we have ~10 years of data). If we can cut them out, we approximately halve the amount of disk space we need :)

In contrast, storing a 64 MB 'intermediate' file of the topographic data feels absolutely like the right thing to do :)

@JackKelly JackKelly linked a pull request Oct 1, 2021 that will close this issue
7 tasks
@JackKelly
Member

@jacobbieker now that #150 is merged, I assume we can close this issue? (If I'm wrong, please re-open this issue! :) )
