API for reorganizing levels #186
Comments
One method that might be useful (inspired by xarray):
class DataTree:
    def swap_levels(
        self: DataTree,
        levels_dict: Mapping[Any, str] | None = None,
        **levels_kwargs,
    ) -> DataTree:
        """
        Returns a new DataTree where all nodes have swapped levels.

        Renames components of paths to nodes in the tree.

        Parameters
        ----------
        levels_dict : dict-like
            Dictionary whose keys are current levels and whose values
            are new levels.
        **levels_kwargs : {existing_level: new_level, ...}, optional
            The keyword arguments form of ``levels_dict``.
            One of levels_dict or levels_kwargs must be provided.

        Returns
        -------
        swapped : DataTree
            DataTree where every node has swapped levels.
        """
I don't think this is enough to solve the use case in the comment above though...
dt.swap_levels({"mod*": "scen*", "scen*": "mod*"})
This would work by renaming components of paths in such a way that the levels end up reordered. The implementation would have to be careful though, possibly involving temporary names.
I kind of want something like
dt.reorder_levels("mod<->scen")
but I'm not sure what the API for that should look like, or how it should behave in cases where the path segment "mod" appears in multiple levels...
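To make the "mod<->scen" idea and its ambiguity concrete, here is a minimal sketch on plain path strings rather than a real DataTree. The function name `swap_levels_by_prefix` and the prefix-matching rule are assumptions for illustration only, not an existing xarray or datatree API. It refuses to guess when a prefix matches more than one level of a path.

```python
def swap_levels_by_prefix(paths, a, b):
    """Swap the path components starting with prefix `a` and prefix `b`.

    Hypothetical helper operating on plain path strings.
    """
    swapped = []
    for path in paths:
        parts = path.strip("/").split("/")
        idx_a = [i for i, p in enumerate(parts) if p.startswith(a)]
        idx_b = [i for i, p in enumerate(parts) if p.startswith(b)]
        # Refuse to guess when a prefix is missing or matches several levels.
        if len(idx_a) != 1 or len(idx_b) != 1:
            raise ValueError(
                f"cannot unambiguously swap {a!r} and {b!r} in {path!r}"
            )
        i, j = idx_a[0], idx_b[0]
        parts[i], parts[j] = parts[j], parts[i]
        swapped.append("/" + "/".join(parts))
    return swapped


paths = ["/mod1/scen1", "/mod2/scen1", "/mod1/scen2", "/mod2/scen2"]
result = swap_levels_by_prefix(paths, "mod", "scen")
```

A path such as "/mod1/mod2/scen1" raises a ValueError here, which is one possible answer to the "appears in multiple levels" question; other behaviours are equally defensible.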
The problem I see with the above API is that it assumes each value of a level/category actually contains a 'globbable' part, or more generally that the node name contains some meta-information about the 'kind' of level. I don't think that is the case in many real-world examples. Take for instance a (simplified) CMIP example:
/GFDL/hist
There is no common string to identify
# assume dt is ordered as "mod/scen/member"
dt.reorder_levels("member/scen/mod")  # keeping with the 'filepath'-like syntax
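A rough sketch of how such a positional, filepath-like reordering could behave, using a plain dict of path -> data instead of a DataTree. Everything here is an assumption for illustration: the standalone `reorder_levels` function, the explicit `current` argument (the proposed method would presumably know the current order already), and the extra path segments "ssp585" and "r1".

```python
def reorder_levels(tree, current, new):
    """Reorder path components of `tree` (dict of path -> data) by level name.

    Hypothetical helper; `current` and `new` are '/'-separated level names.
    """
    current_levels = current.strip("/").split("/")
    new_levels = new.strip("/").split("/")
    if len(set(current_levels)) != len(current_levels):
        raise ValueError("level names must be unique")
    if sorted(current_levels) != sorted(new_levels):
        raise ValueError("new order must be a permutation of the current levels")
    # For each level in the new order, its position in the old path.
    order = [current_levels.index(name) for name in new_levels]
    reordered = {}
    for path, data in tree.items():
        parts = path.strip("/").split("/")
        if len(parts) != len(current_levels):
            raise ValueError(f"{path!r} does not have {len(current_levels)} levels")
        reordered["/" + "/".join(parts[i] for i in order)] = data
    return reordered


# Made-up leaves, ordered as "mod/scen/member".
tree = {"/GFDL/hist/r1": "a", "/GFDL/ssp585/r1": "b"}
result = reorder_levels(tree, current="mod/scen/member", new="member/scen/mod")
```

Because levels are identified purely by position, nothing about the node names themselves ("GFDL", "hist") needs to be globbable.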
Closed in favour of pydata/xarray#9344
@jbusecke and I were discussing API for reorganizing levels of a tree.
For example, say I have 2 models, each of which ran 2 scenarios. That's 4 data-containing leaves in my tree, but there are 2 equally valid ways to organise this: either model-first or scenario-first.
The model-first tree has node paths:
/mod1/scen1
/mod2/scen1
/mod1/scen2
/mod2/scen2
whilst the scenario-first tree has node paths:
/scen1/mod1
/scen2/mod1
/scen1/mod2
/scen2/mod2
Either of these is equally valid, and one might sometimes be preferred over the other, so we should have a method that can rearrange one structure into the other.
The question is what the API to do this should look like so that it's general, intuitive, and powerful.
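To pin down what "rearranging" means for the two-level case, here is a minimal sketch using nested dicts as a stand-in for the tree. The helper `invert_two_levels` and the leaf values are hypothetical; a real DataTree method would need to handle arbitrary depth and data stored on intermediate nodes.

```python
def invert_two_levels(nested):
    """Turn {outer: {inner: leaf}} into {inner: {outer: leaf}}."""
    inverted = {}
    for outer, children in nested.items():
        for inner, leaf in children.items():
            inverted.setdefault(inner, {})[outer] = leaf
    return inverted


# Model-first layout with placeholder leaf data.
model_first = {
    "mod1": {"scen1": "data_a", "scen2": "data_b"},
    "mod2": {"scen1": "data_c", "scen2": "data_d"},
}
scenario_first = invert_two_levels(model_first)
```

Applying the function twice returns the model-first layout, which reflects that both organisations hold the same 4 leaves.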