Reading netcdf file with string coordinates makes IPython kernel crash (netcdf4 engine) #8544

Closed
5 tasks done
Paul-Aime opened this issue Dec 12, 2023 · 14 comments
Labels
needs mcve https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports

Comments

@Paul-Aime

Paul-Aime commented Dec 12, 2023

What happened?

Trying to open a netCDF file that has strings as coordinates crashes the notebook kernel.

This only happens with engine="netcdf4", not with engine="h5netcdf".

The bug occurs at least in IPython, in Jupyter in the web browser, and in VS Code notebooks.

The bug can be reproduced consistently by reading the same file twice in the same cell and running that cell twice.

What did you expect to happen?

engine="netcdf4" is expected to behave the same as engine="h5netcdf", i.e. not crash the kernel.

Minimal Complete Verifiable Example

# %%
import numpy as np
import xarray as xr

# %%
fpath = "test.nc"

da = xr.DataArray(
    data=np.random.randn(3, 10),
    dims=["label", "values"],
    coords=dict(
        label=["a", "b", "c"],
    ),
)
da.to_netcdf(fpath)

# %%
# engine = "h5netcdf"
engine = "netcdf4"
xr.open_dataarray(fpath, engine=engine)
xr.open_dataarray(fpath, engine=engine)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

IPython crashes with: Segmentation fault (core dumped)

Jupyter Notebook logs:

[I 2023-12-12 15:20:00.474 ServerApp] Kernel restarted: 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c
[I 2023-12-12 15:20:00.482 ServerApp] Starting buffering for 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c:0bd5dcd6-faa7-413a-b6c5-080b1c774933
[I 2023-12-12 15:20:00.494 ServerApp] Connecting to kernel 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c.
[I 2023-12-12 15:20:00.494 ServerApp] Restoring connection for 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c:0bd5dcd6-faa7-413a-b6c5-080b1c774933
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
[IPKernelApp] WARNING | Unknown error in handling startup files:
[I 2023-12-12 15:20:09.463 ServerApp] AsyncIOLoopKernelRestarter: restarting kernel (1/5), keep random ports
[W 2023-12-12 15:20:09.463 ServerApp] kernel 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c restarted
[I 2023-12-12 15:20:09.470 ServerApp] Starting buffering for 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c:0bd5dcd6-faa7-413a-b6c5-080b1c774933
[I 2023-12-12 15:20:09.504 ServerApp] Connecting to kernel 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c.
[I 2023-12-12 15:20:09.505 ServerApp] Restoring connection for 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c:0bd5dcd6-faa7-413a-b6c5-080b1c774933
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
[IPKernelApp] WARNING | Unknown error in handling startup files:

VSCode notebook Jupyter logs:

15:23:03.501 [info] Restart requested ~/Desktop/bug_xarray_notebook/bug.ipynb
15:23:03.502 [info] Dispose Kernel process 2763594.
15:23:03.589 [info] Process Execution: ~/miniconda3/bin/python -c "import ipykernel; print(ipykernel.__version__); print("5dc3a68c-e34e-4080-9c3e-2a532b2ccb4d"); print(ipykernel.__file__)"
15:23:03.671 [info] Process Execution: ~/miniconda3/bin/python -m ipykernel_launcher --f=~/.local/share/jupyter/runtime/kernel-v2-2727807dzOm3m1LEA5V.json
    > cwd: ~/Desktop/bug_xarray_notebook
15:23:04.149 [warn] StdErr from Kernel Process [IPKernelApp] WARNING | Unknown error in handling startup files:
15:23:04.454 [info] Restarted bd04fd87-98e7-486d-a6c6-7308101edcdf
15:23:08.046 [info] Handle Execution of Cells 0 for ~/Desktop/bug_xarray_notebook/bug.ipynb
15:23:08.055 [info] Kernel acknowledged execution of cell 0 @ 1702390988054
15:23:08.412 [info] End cell 0 execution after 0.358s, completed @ 1702390988412, started @ 1702390988054
15:23:09.260 [info] Handle Execution of Cells 1 for ~/Desktop/bug_xarray_notebook/bug.ipynb
15:23:09.269 [info] Kernel acknowledged execution of cell 1 @ 1702390989268
15:23:09.305 [info] End cell 1 execution after 0.036s, completed @ 1702390989304, started @ 1702390989268
15:23:10.893 [info] Handle Execution of Cells 2 for ~/Desktop/bug_xarray_notebook/bug.ipynb
15:23:10.907 [info] Kernel acknowledged execution of cell 2 @ 1702390990907
15:23:10.971 [info] End cell 2 execution after 0.064s, completed @ 1702390990971, started @ 1702390990907
15:23:12.255 [info] Handle Execution of Cells 2 for ~/Desktop/bug_xarray_notebook/bug.ipynb
15:23:12.262 [info] Kernel acknowledged execution of cell 2 @ 1702390992262
15:23:12.504 [error] Disposing session as kernel process died ExitCode: undefined, Reason: [IPKernelApp] WARNING | Unknown error in handling startup files:

15:23:12.505 [info] Dispose Kernel process 2764104.
15:23:12.518 [info] End cell 2 execution after -1702390992.262s, completed @ undefined, started @ 1702390992262

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.0 | packaged by conda-forge | (main, Oct 3 2023, 08:43:22) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-91-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2023.12.0
pandas: 2.1.4
numpy: 1.26.2
scipy: None
netCDF4: 1.6.5
pydap: None
h5netcdf: 1.3.0
h5py: 3.10.0
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.2.2
pip: 23.3.1
conda: None
pytest: None
mypy: None
IPython: 8.18.1
sphinx: None

@Paul-Aime Paul-Aime added bug needs triage Issue that has not been reviewed by xarray team member labels Dec 12, 2023
@dcherian
Contributor

Can you read with netcdf4-python only, removing xarray from the mix?

If not, please open an issue there; unfortunately, in that case we can't do anything about it.

@Paul-Aime
Author

Paul-Aime commented Dec 12, 2023

I can, provided I close the dataset each time.

Tested with the following cell; again, it needs to run multiple times before the bug occurs:

from netCDF4 import Dataset

ds = Dataset("test.nc", "r", format="NETCDF4")
# ds.close()  # with this line uncommented, the crash does not occur

So maybe xarray does not close the dataset properly.

Also, the bug still occurs if I add .load() in the original example (xr.open_dataarray(fpath, engine=engine).load()), in which case xarray shouldn't need the file to stay open anymore.

@max-sixty
Collaborator

Does it reproduce in a normal python script, outside of a notebook?

@Paul-Aime
Author

It does for the line Dataset("test.nc", "r", format="NETCDF4"), but only if it is repeated 8 or more times. However, it does not for the line xr.open_dataarray("test.nc", engine="netcdf4"), even when repeated a thousand times.

@max-sixty
Collaborator

Probably some of the difference is that Jupyter will hold onto the outputs in its Out dict.

And if we can reproduce with Dataset("test.nc", "r", format="NETCDF4") — surely that means it's not an xarray issue?

I'll add — unfortunately it's really difficult to engage with these sorts of issues across libraries without MCVEs. So to make much progress here we do need one. That can be difficult to generate, and probably requires looking at what objects are in memory and their size, or a verifiable case of xarray not closing a file properly.

@max-sixty max-sixty added needs mcve https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports and removed bug needs triage Issue that has not been reviewed by xarray team member labels Dec 12, 2023
@Paul-Aime
Author

Yes, I do think IPython plays some role in it, since reading with xarray in a script is fine, and it takes a duplicated read in a cell to trigger the bug.

It seemed strange to me that it would still happen when explicitly using .load(), but I realized .load() does not close the file. I have now tested with load_dataarray() (instead of open_dataarray()), which does close the file, and it does not trigger the bug.

So yes, it rather seems to be a strange interaction between netCDF4-python and IPython.

Since it crashes the kernel, it was a pain to track down the source of the problem, hence this report; but using engine="h5netcdf" or load_dataarray() solves it, so I guess that's fine.

@kmuehlbauer
Contributor

@Paul-Aime That's a valid point @max-sixty has raised. If the output is not bound to a variable, it is bound to the cell somehow and might not be cleaned up.

Please check whether wrapping it in a print statement or in display changes the situation.

print(xr.open_dataarray(fpath, engine=engine))

display(xr.open_dataarray(fpath, engine=engine))

@Paul-Aime
Author

@kmuehlbauer Indeed, wrapping it inside a print statement does not trigger the bug.

@kmuehlbauer
Contributor

@Paul-Aime Is there anything we can do here? Otherwise we can close?

@Paul-Aime
Author

@kmuehlbauer Seems like it might be on netCDF4-python's side, so I think you can close. May I just ask if there is a way to make h5netcdf the default engine? As is, the default is not reliably usable in notebooks.

@kmuehlbauer
Contributor

May I just ask if there is a way to make h5netcdf the default engine? As is, the default is not reliably usable in notebooks.

It looks like there is no such option. You could tweak your installation here, moving "h5netcdf" to the first position:

STANDARD_BACKENDS_ORDER = ["netcdf4", "h5netcdf", "scipy"]

@Paul-Aime
Author

Thanks! It even looks like it works dynamically, setting xr.backends.plugins.STANDARD_BACKENDS_ORDER = ["h5netcdf", "netcdf4", "scipy"] after import.
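Note that reassigning STANDARD_BACKENDS_ORDER is an unsupported tweak that may stop working across xarray versions. A more conservative per-project alternative (a sketch; open_dataarray_h5 is a hypothetical helper name, not an xarray API) is to pin the engine with functools.partial:

```python
import functools

import xarray as xr

# Hypothetical helper that always passes engine="h5netcdf", so
# xarray's backend auto-detection is never consulted.
open_dataarray_h5 = functools.partial(xr.open_dataarray, engine="h5netcdf")

# Usage would be: da = open_dataarray_h5("test.nc")
```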

@kmuehlbauer
Contributor

@Paul-Aime I'm going to close for now. Please feel free to re-open, if there is anything to do on the xarray side of things.

@jbphyswx

jbphyswx commented Jan 4, 2024

engine="h5netcdf"

For future users: I also encountered this issue, but either load_dataset() or engine="h5netcdf" resolved it.

Also, dataset = xr.open_dataset(dataset_path).load(); dataset.close() worked, but dataset = xr.open_dataset(dataset_path); dataset.close() did not.

My datasets were assigned to variables, and I did not have string coordinates; however, I did have duplicated coordinates as described in #3731, so I think this issue can manifest more broadly.

Trying the suggested/documented solution for duplicated coordinates, i.e. renaming them via dataset.variables['offending_variable'].dims = ("repeated_dim", "repeated_dim2"), still resulted in a crash with dataset = xr.open_dataset(dataset_path).load(), but not with load_dataset() or engine="h5netcdf", so I suspect the problem is something deeper between IPython, netCDF, and xarray. I have not yet tested moving the code into a script to see whether that resolves the problem.

For reference, my code always crashed on the cell's 2nd run.
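The pattern reported to work above, as a self-contained sketch (my addition; a small synthetic file stands in for the original data, and a default netCDF backend is assumed to be installed): load the data into memory before closing.

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Synthesize a small dataset in a temp directory.
dataset_path = os.path.join(tempfile.mkdtemp(), "data.nc")
xr.Dataset({"v": (("x",), np.arange(4.0))}).to_netcdf(dataset_path)

# Calling .load() before .close() was the pattern that worked;
# closing without loading first still crashed in notebooks.
dataset = xr.open_dataset(dataset_path).load()
dataset.close()

# The data remains accessible in memory after the file is closed.
```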
