I'm trying to open a dataset whose variables are stored with different chunks. If I open it with the option chunks="auto", the result is this one:
As you can see, some coordinates are indexes while others are not (longitude and latitude in this case). The non-index ones are probably what causes the issue, but I still want to keep them in the final dataset.
With the tool we are developing, we want to set the chunks ourselves to optimise the use of such datasets (setting them ourselves seemed to improve the computation slightly). However, we were deriving the chunk sizes from the variables we retrieve, which are chunked differently from longitude and latitude. We open the Zarr files with:
xarray.open_zarr(dataset, chunks = chunks)
When we set the chunks ourselves, this warning is shown:
UserWarning: The specified chunks separate the stored chunks along dimension "y" starting at index 192. This could degrade performance. Instead, consider rechunking after loading.
dataset = xr.open_dataset(
UserWarning: The specified chunks separate the stored chunks along dimension "x" starting at index 224. This could degrade performance. Instead, consider rechunking after loading.
dataset = xr.open_dataset(
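To illustrate what the warning is about, here is a minimal pure-Python sketch of the idea: a requested chunk boundary that falls inside a stored chunk forces that stored chunk to be read by more than one task. This is only an approximation of the concept, not xarray's actual check. The helper names and the stored chunk size of 128 are hypothetical, since the real stored chunk sizes are not shown in this post.

```python
def first_split_index(requested, stored, size):
    """Return the first index at which a requested chunk boundary falls
    inside a stored chunk, or None if every boundary is aligned.

    Hypothetical helper for illustration; not part of xarray.
    """
    # Interior boundaries of the requested chunking: requested, 2*requested, ...
    for boundary in range(requested, size, requested):
        if boundary % stored != 0:
            return boundary
    return None


def align_up(requested, stored):
    """Round a requested chunk size up to the next multiple of the stored
    chunk size, so requested boundaries coincide with stored boundaries."""
    return -(-requested // stored) * stored  # ceiling division, then scale


# Assumed numbers: stored chunks of 128 along a dimension of length 1024.
print(first_split_index(192, 128, 1024))   # 192 is not a multiple of 128
print(first_split_index(256, 128, 1024))   # every boundary is aligned
print(align_up(192, 128))
```

Under these assumptions, choosing chunk sizes that are whole multiples of the stored chunk size along each dimension keeps every boundary aligned.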
I've seen issue #8795, which gives the impression that chunking after opening (calling .chunk() on the opened dataset and assigning the result back) might be better suited.
Is there a way to set different chunks for different variables with this method? Is that recommended?
What are the differences between the two approaches? Does assigning the dataset back when chunking affect performance? And is chunking afterwards recommended when the variables have different chunks?
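For reference, a minimal sketch of what rechunking after opening with different chunks per variable could look like. It uses a small in-memory dataset as a stand-in for the Zarr store. The variable name `temperature`, the dimension sizes and the chunk sizes are all assumptions made for illustration, and dask must be installed. With a real store, the dataset would presumably be opened with `chunks={}` first so that the stored chunks are used as the starting point.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the opened Zarr dataset: one data variable on
# (y, x) plus 2-D non-index coordinates latitude and longitude.
ny, nx = 8, 12
ds = xr.Dataset(
    {"temperature": (("y", "x"), np.zeros((ny, nx)))},
    coords={
        "latitude": (("y", "x"), np.zeros((ny, nx))),
        "longitude": (("y", "x"), np.zeros((ny, nx))),
    },
)

# Dataset-wide rechunk. .chunk() returns a new object backed by lazy dask
# arrays; rebinding the name `ds` does not itself load or copy the data.
ds = ds.chunk({"y": 4, "x": 6})

# Per-variable rechunk: only this data variable gets the new chunking,
# while latitude and longitude keep the dataset-wide chunks set above.
ds["temperature"] = ds["temperature"].chunk({"y": 2, "x": 6})

print(ds["temperature"].chunks)
print(ds["latitude"].chunks)
```

Once variables sharing a dimension have different chunk sizes, it is safer to inspect chunks per variable (as above) than through the dataset-wide `ds.chunks`, which, as far as I know, raises an error when chunks are inconsistent along a dimension.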
Thank you very much in advance!