Commit 4d075e9

loco-philippe, pre-commit-ci[bot], and mathause authored and committed
User-guide - pandas : Add alternative to xarray.Dataset.from_dataframe (#9020)
* Update pandas.rst
* Update pandas.rst
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update pandas.rst
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update ecosystem.rst
* Update doc/user-guide/pandas.rst Co-authored-by: Mathias Hauser <[email protected]>
* Update doc/user-guide/pandas.rst Co-authored-by: Mathias Hauser <[email protected]>
* Update doc/user-guide/pandas.rst Co-authored-by: Mathias Hauser <[email protected]>
* review comments
* Update doc.yml
* Update doc.yml
* Update doc.yml
* Update doc.yml
* Update doc.yml
* Update doc.yml
* remove code
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update doc/user-guide/pandas.rst Co-authored-by: Mathias Hauser <[email protected]>
* Update doc/user-guide/pandas.rst Co-authored-by: Mathias Hauser <[email protected]>
* Update ci/requirements/doc.yml Co-authored-by: Mathias Hauser <[email protected]>
* Update doc/user-guide/pandas.rst Co-authored-by: Mathias Hauser <[email protected]>
* Update doc/user-guide/pandas.rst Co-authored-by: Mathias Hauser <[email protected]>

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mathias Hauser <[email protected]>
1 parent 6772b42 commit 4d075e9

2 files changed: +21 -0 lines changed

doc/ecosystem.rst

Lines changed: 1 addition & 0 deletions
@@ -74,6 +74,7 @@ Extend xarray capabilities
 - `Collocate <https://github.com/cistools/collocate>`_: Collocate xarray trajectories in arbitrary physical dimensions
 - `eofs <https://ajdawson.github.io/eofs/>`_: EOF analysis in Python.
 - `hypothesis-gufunc <https://hypothesis-gufunc.readthedocs.io/en/latest/>`_: Extension to hypothesis. Makes it easy to write unit tests with xarray objects as input.
+- `ntv-pandas <https://github.com/loco-philippe/ntv-pandas>`_ : A tabular analyzer and a semantic, compact and reversible converter for multidimensional and tabular data
 - `nxarray <https://github.com/nxarray/nxarray>`_: NeXus input/output capability for xarray.
 - `xarray-compare <https://github.com/astropenguin/xarray-compare>`_: xarray extension for data comparison.
 - `xarray-dataclasses <https://github.com/astropenguin/xarray-dataclasses>`_: xarray extension for typed DataArray and Dataset creation.

doc/user-guide/pandas.rst

Lines changed: 20 additions & 0 deletions
@@ -110,6 +110,26 @@ work even if not the hierarchical index is not a full tensor product:
 s[::2]
 s[::2].to_xarray()
 
+Lossless and reversible conversion
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The previous ``Dataset`` example shows that the conversion is not reversible (lossy roundtrip) and
+that the size of the ``Dataset`` increases.
+
+In particular, after a roundtrip the following deviations are noted:
+
+- a non-dimension Dataset ``coordinate`` is converted into a ``variable``
+- a non-dimension DataArray ``coordinate`` is not converted
+- ``dtype`` is not always the same (e.g. "str" is converted to "object")
+- ``attrs`` metadata is not preserved
+
+To avoid these problems, the third-party `ntv-pandas <https://github.com/loco-philippe/ntv-pandas>`__ library offers lossless and reversible conversions between
+``Dataset``/``DataArray`` and pandas ``DataFrame`` objects.
+
+This solution is particularly interesting for converting any ``DataFrame`` into a ``Dataset`` (the converter finds the multidimensional structure hidden by the tabular structure).
+
+The `ntv-pandas examples <https://github.com/loco-philippe/ntv-pandas/tree/main/example>`__ show how to improve the conversion for the previous ``Dataset`` example and for more complex examples.
+
 Multi-dimensional data
 ~~~~~~~~~~~~~~~~~~~~~~
 
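
Some of the roundtrip deviations listed in the added ``pandas.rst`` section can be reproduced with a short sketch. The dataset below is made up for illustration: the names ``foo``, ``along_x`` and ``title`` are assumptions, not taken from the commit. It uses only xarray's own ``Dataset.to_dataframe`` and ``Dataset.from_dataframe``; ntv-pandas itself is not called here.

```python
import numpy as np
import xarray as xr

# Hypothetical dataset: one data variable, two dimension coordinates,
# one non-dimension coordinate ("along_x") and one attribute.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(6).reshape(2, 3))},
    coords={
        "x": [10, 20],
        "y": ["a", "b", "c"],
        "along_x": ("x", [0.1, 0.2]),  # non-dimension coordinate on x only
    },
    attrs={"title": "demo"},
)

# Roundtrip through pandas using xarray's built-in converters.
df = ds.to_dataframe()
roundtrip = xr.Dataset.from_dataframe(df)

# The non-dimension coordinate comes back as a data variable ...
print("along_x" in ds.coords, "along_x" in roundtrip.coords)
print("along_x" in roundtrip.data_vars)

# ... broadcast over every dimension, so the Dataset grows.
print(ds["along_x"].dims, roundtrip["along_x"].dims)

# Dataset-level attrs are dropped.
print(ds.attrs, roundtrip.attrs)

# The dtype of the string index may differ after the roundtrip
# (the exact result depends on the pandas/xarray versions installed).
print(ds["y"].dtype, roundtrip["y"].dtype)
```

The comparison is kept to membership and dimension checks because exact dtypes and reprs can vary between library versions.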
