Commit 88315fd

rwegener2, pre-commit-ci[bot], and TomNicholas authored
Add HDF5 Section to read/write docs page (#8012)
* add HDF5 section to read/write docs
* change wording for hdf5 vs. netcdf4
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci
* add write section for hdf5 docs
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tom Nicholas <[email protected]>
1 parent ba26410 commit 88315fd

1 file changed

+61
-0
lines changed

doc/user-guide/io.rst

@@ -559,6 +559,67 @@ and currently raises a warning unless ``invalid_netcdf=True`` is set:
559559
Note that this produces a file that is likely to be not readable by other netCDF
560560
libraries!
561561

562+
.. _io.hdf5:
563+
564+
HDF5
565+
----
566+
`HDF5`_ is both a file format and a data model for storing information. HDF5 stores
567+
data hierarchically, using groups to create a nested structure. HDF5 is a more
568+
general version of the netCDF4 data model, so the nested structure is one of many
569+
similarities between the two data formats.
570+
571+
Reading HDF5 files in xarray requires the ``h5netcdf`` engine, which can be installed
572+
with ``conda install h5netcdf``. Once installed, we can use xarray to open HDF5 files:
573+
574+
.. code:: python
575+
576+
xr.open_dataset("/path/to/my/file.h5")
577+
578+
The similarities between HDF5 and netCDF4 mean that HDF5 data can be written with the
579+
same :py:meth:`Dataset.to_netcdf` method as used for netCDF4 data:
580+
581+
.. ipython:: python
582+
583+
ds = xr.Dataset(
584+
{"foo": (("x", "y"), np.random.rand(4, 5))},
585+
coords={
586+
"x": [10, 20, 30, 40],
587+
"y": pd.date_range("2000-01-01", periods=5),
588+
"z": ("x", list("abcd")),
589+
},
590+
)
591+
592+
ds.to_netcdf("saved_on_disk.h5")
593+
594+
Groups
595+
~~~~~~
596+
597+
If you have multiple or highly nested groups, xarray by default may not read the group
598+
that you want. A particular group of an HDF5 file can be specified using the ``group``
599+
argument:
600+
601+
.. code:: python
602+
603+
xr.open_dataset("/path/to/my/file.h5", group="/my/group")
604+
605+
While xarray cannot interrogate an HDF5 file to determine which groups are available,
606+
the HDF5 Python reader `h5py`_ can be used instead.
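
As a minimal sketch of that approach, the snippet below walks a file with h5py's ``visititems`` and collects the names of its groups. The helper name ``list_groups``, the temporary file ``demo.h5``, and the group path ``my/group`` are hypothetical and exist only to keep the example self-contained; it assumes ``h5py`` and ``numpy`` are installed.

```python
import os
import tempfile

import h5py
import numpy as np


def list_groups(path):
    """Return the absolute names of all groups below the root of an HDF5 file."""
    groups = []

    def collect(name, obj):
        # visititems passes names relative to the root, without a leading slash.
        if isinstance(obj, h5py.Group):
            groups.append("/" + name)

    with h5py.File(path, "r") as f:
        f.visititems(collect)
    return sorted(groups)


# Build a small throwaway file so the sketch runs on its own (hypothetical layout).
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "demo.h5")
    with h5py.File(demo_path, "w") as f:
        # Intermediate groups "my" and "my/group" are created automatically.
        f.create_dataset("my/group/foo", data=np.arange(5))
    found = list_groups(demo_path)

print(found)
```

Any of the returned names could then be passed as the ``group`` argument of ``xr.open_dataset``.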
607+
608+
Natively the xarray data structures can only handle one level of nesting, organized as
609+
DataArrays inside of Datasets. If your HDF5 file has additional levels of hierarchy you
610+
can only access one group at a time and will need to specify group names.
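
One way to work within this limit, sketched here under assumptions, is to open each group separately and keep the resulting Datasets in a plain dictionary keyed by group name. The group names ``/run1`` and ``/run2``, the variables, and the file name ``nested.h5`` are hypothetical; the example assumes the ``h5netcdf`` engine is installed and that the group names are already known.

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Hypothetical group names and contents, used only for illustration.
datasets = {
    "/run1": xr.Dataset({"foo": ("x", np.arange(3.0))}, coords={"x": [10, 20, 30]}),
    "/run2": xr.Dataset({"bar": ("y", np.arange(2.0))}, coords={"y": [1, 2]}),
}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "nested.h5")

    # Write the first group with mode="w", then append the rest with mode="a".
    for i, (group, ds) in enumerate(datasets.items()):
        mode = "w" if i == 0 else "a"
        ds.to_netcdf(path, group=group, mode=mode, engine="h5netcdf")

    # Read the groups back one at a time; load_dataset reads the data into
    # memory and closes the file, so the temporary directory can be removed.
    loaded = {
        group: xr.load_dataset(path, group=group, engine="h5netcdf")
        for group in datasets
    }

print(sorted(loaded))
```

Each value in ``loaded`` is an ordinary Dataset, so the hierarchy is flattened into dictionary keys rather than represented by xarray itself.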
611+
612+
.. note::
613+
614+
For native handling of multiple HDF5 groups with xarray, including I/O, you might be
615+
interested in the experimental
616+
`xarray-datatree <https://github.com/xarray-contrib/datatree>`_ package.
617+
618+
619+
.. _HDF5: https://hdfgroup.github.io/hdf5/index.html
620+
.. _h5py: https://www.h5py.org/
621+
622+
562623
.. _io.zarr:
563624

564625
Zarr
