Concatenating datasets with staggered grids #362

Open
@TomNicholas

Here are two toy datasets designed to represent sections of a dataset that has variables living on a staggered grid. This type of dataset is common in fluid modelling (handling staggered grids is why xGCM exists).

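import xarray as xr

# First section: 3 cell centers and the 4 cell edges that bound them
ds1 = xr.Dataset(
    data_vars={
        'a': ('x_center', [1, 2, 3]),
        'b': ('x_outer',  [0.5, 1.5, 2.5, 3.5]),
    },
)

# Second section: 3 more cell centers, but only 3 new edges
# (the edge shared with the first section lives in ds1)
ds2 = xr.Dataset(
    data_vars={
        'a': ('x_center', [4, 5, 6]),
        'b': ('x_outer',  [4.5, 5.5, 6.5]),
    },
)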

I have netCDF output files from an ocean model (UCLA-ROMS) that have this basic structure.

Combining these types of datasets seems like a bit of a pain to do with kerchunk at the moment.
To concatenate along the x direction, I actually need to concatenate variable a along x_center, and variable b along x_outer. So presumably I have to call MultiZarrToZarr once for each variable (or group of variables) that needs to be concatenated along a common dimension. My real dataset is split along multiple dimensions and has multiple staggered grid locations for each dimension, meaning I have to call MultiZarrToZarr something like 6 times.
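
For reference, this is the combined result I'm after, written by hand with plain xarray instead of kerchunk. It is only a sketch of the target for the toy datasets above; the name `combined` is just for illustration.

```python
import xarray as xr

ds1 = xr.Dataset(
    data_vars={
        'a': ('x_center', [1, 2, 3]),
        'b': ('x_outer',  [0.5, 1.5, 2.5, 3.5]),
    },
)
ds2 = xr.Dataset(
    data_vars={
        'a': ('x_center', [4, 5, 6]),
        'b': ('x_outer',  [4.5, 5.5, 6.5]),
    },
)

# Each variable is concatenated along its own staggered dimension...
a = xr.concat([ds1['a'], ds2['a']], dim='x_center')
b = xr.concat([ds1['b'], ds2['b']], dim='x_outer')

# ...and the two results are merged back into a single dataset.
combined = xr.merge([a, b])
print(combined.sizes)
```

The merged dataset has 6 points along x_center and 7 along x_outer, which is the usual N centers / N+1 edges layout.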

This problem is analogous to what happens inside xarray.combine_by_coords, which automatically groups variables into sets with common dimensions, splits datasets up into sets consisting of the same variable from each dataset, concatenates each set separately (along multiple dimensions in general), then merges the results.
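
To make the group / concatenate / merge pattern concrete, here is a rough 1-D sketch in plain xarray. It is not how xarray.combine_by_coords is actually implemented, and `concat_staggered` is a hypothetical helper name I made up for illustration. It assumes every data variable has exactly one of the candidate concat dimensions and that the datasets are already listed in the right order.

```python
from collections import defaultdict

import xarray as xr


def concat_staggered(datasets, concat_dims):
    """Hypothetical helper: concatenate each variable along whichever of
    `concat_dims` it lives on, then merge the per-dimension results."""
    # Group variable names by the concat dimension they are defined on.
    groups = defaultdict(list)
    for name, var in datasets[0].data_vars.items():
        matching = [dim for dim in concat_dims if dim in var.dims]
        if len(matching) != 1:
            raise ValueError(
                f"variable {name!r} must have exactly one of {concat_dims}"
            )
        groups[matching[0]].append(name)

    # Concatenate each group separately along its own dimension.
    pieces = [
        xr.concat([ds[names] for ds in datasets], dim=dim, data_vars="all")
        for dim, names in groups.items()
    ]

    # Merge the per-dimension results back into one dataset.
    return xr.merge(pieces)


ds1 = xr.Dataset(
    data_vars={
        'a': ('x_center', [1, 2, 3]),
        'b': ('x_outer',  [0.5, 1.5, 2.5, 3.5]),
    },
)
ds2 = xr.Dataset(
    data_vars={
        'a': ('x_center', [4, 5, 6]),
        'b': ('x_outer',  [4.5, 5.5, 6.5]),
    },
)

result = concat_staggered([ds1, ds2], concat_dims=['x_center', 'x_outer'])
print(result)
```

The kerchunk equivalent would be one MultiZarrToZarr call per group in place of each xr.concat, followed by a single merge step.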

Is that approach (call MultiZarrToZarr multiple times then call kerchunk.combine.merge_vars) the recommended way to handle this case currently? Could we imagine some improvement to the kerchunk.combine API that might make this easier?
