HDF store not closed on AssertionError #28430

Open
adimascio opened this issue Sep 13, 2019 · 0 comments
Labels
Bug IO HDF5 read_hdf, HDFStore

adimascio commented Sep 13, 2019

Code Sample, a copy-pastable example if possible

Code adapted from https://github.com/ToucanToco/peakina/blob/master/peakina/cache.py#L84

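    import pandas as pd

    # Hypothetical wrapper: the function name and the default key are
    # illustrative only and do not come from the original code.
    def load_metadata(filepath, key='metadata'):
        try:
            metadata = pd.read_hdf(filepath)
        except Exception:  # catch all, on purpose
            # Fall back to an empty metadata table and rewrite the file.
            metadata = pd.DataFrame(columns=['key', 'mtime', 'created_at'])
            metadata.to_hdf(filepath, key=key, mode='w')

        return metadata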

Problem description

In read_hdf, if the store.select() call raises a ValueError, a TypeError or a KeyError, the store is closed.

However, if any other exception is raised (e.g. an AssertionError when there are gaps in blocks ref_loc), the store is not closed. Client code that then tries to reopen the file can hit an error like: the file XXX is already opened. Please close it before reopening in write mode.
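
The difference can be illustrated without pandas or PyTables. The sketch below is a simplified model, not the pandas source: `FakeStore`, `read_closing_on_some_errors`, `read_always_closing` and `store_left_open` are hypothetical names introduced only for this illustration.

```python
class FakeStore:
    """Stand-in for an HDF store; it only tracks whether it is open."""

    def __init__(self, error):
        self.error = error
        self.is_open = True

    def select(self):
        # Simulate a failing read.
        raise self.error

    def close(self):
        self.is_open = False


def read_closing_on_some_errors(store):
    """Close the store only for the three exception types listed above."""
    try:
        return store.select()
    except (ValueError, TypeError, KeyError):
        store.close()
        raise


def read_always_closing(store):
    """Close the store whatever exception is raised."""
    try:
        return store.select()
    finally:
        store.close()


def store_left_open(reader, error):
    """Run `reader` on a failing store and report whether it stayed open."""
    store = FakeStore(error)
    try:
        reader(store)
    except type(error):
        pass
    return store.is_open
```

With this model, a `KeyError` leaves the store closed under both readers, whereas an `AssertionError` leaves it open under the first reader only.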

Expected Output

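No reentrant exception: the store should be closed whatever exception select() raises, so that reopening the file afterwards does not fail.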

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : None
python : 3.6.8.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.0-5-amd64
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : fr_FR.UTF-8
LOCALE : fr_FR.UTF-8

pandas : 0.25.1
numpy : 1.17.2
pytz : 2019.2
dateutil : 2.8.0
pip : 19.2.3
setuptools : 41.2.0
Cython : None
pytest : 5.1.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : 2.7.0
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.14.1
pytables : None
s3fs : 0.3.4
scipy : None
sqlalchemy : None
tables : 3.5.2
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None

adimascio pushed a commit to ToucanToco/peakina that referenced this issue Sep 13, 2019
Pandas does not always close the HDF store when the file is invalid. Since we
catch all exceptions in order to write an empty metadata file, the write crashes
because the HDF file is still open. cf. pandas-dev/pandas#28430

If we handle the file closing ourselves and pass the file object to `pd.read_hdf`,
everything's fine.
@WillAyd WillAyd added the IO HDF5 read_hdf, HDFStore label Sep 13, 2019
PrettyWood pushed a commit to ToucanToco/peakina that referenced this issue Sep 13, 2019
Pandas does not always close the HDF store when the file is invalid. Since we
catch all exceptions in order to write an empty metadata file, the write crashes
because the HDF file is still open. cf. pandas-dev/pandas#28430

If we handle the file closing ourselves and pass the file object to `pd.read_hdf`,
everything's fine.

Using `with open() as metaf` seemed like a good choice, but it appears that
pandas `read_hdf` does not actually support buffers:
"Support for generic buffers has not been implemented."
So let's instantiate HDFStore directly in order to close it.
PrettyWood pushed a commit to ToucanToco/peakina that referenced this issue Sep 13, 2019
Pandas does not always close the HDF store when the file is invalid. Since we
catch all exceptions in order to write an empty metadata file, the write crashes
because the HDF file is still open. cf. pandas-dev/pandas#28430

If we handle the file closing ourselves and pass the file object to `pd.read_hdf`,
everything should be fine, BUT unfortunately `pd.read_hdf` does not actually
support buffers: "Support for generic buffers has not been implemented."

So let's instantiate HDFStore directly to be able to close it no matter what.
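
A minimal sketch of the workaround described in this commit message, assuming the caller knows the key to read. `read_metadata` and its `open_store` parameter are hypothetical names and this is not the actual peakina code. By default it uses `pd.HDFStore`, which requires PyTables.

```python
def read_metadata(filepath, key, open_store=None):
    """Read `key` from an HDF5 file, closing the store no matter what."""
    if open_store is None:
        # Imported lazily so the helper can be exercised with a stand-in store.
        import pandas as pd
        open_store = pd.HDFStore
    store = open_store(filepath, mode="r")
    try:
        return store.select(key)
    finally:
        # Runs for every exception type, including AssertionError.
        store.close()
```

Because the store is closed in `finally`, a caller that catches the exception can safely reopen the same file in write mode afterwards.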
adimascio pushed a commit to adimascio/pandas that referenced this issue Sep 13, 2019
In `read_hdf`, if the `store.select()` call raises a `ValueError`,
a `TypeError` or a `KeyError`, the store is closed.

However, if any other exception is raised (e.g. an `AssertionError` when
there are gaps in blocks ref_loc), the store is not closed. Client code
that then tries to reopen the store can hit an error like: `the file XXX
is already opened. Please close it before reopening in write mode`.
Furthermore, the exception is re-raised in all cases.

This commit just closes the store in all cases.

Closes pandas-dev#28430
adimascio pushed a commit to adimascio/pandas that referenced this issue Sep 15, 2019
adimascio pushed a commit to adimascio/pandas that referenced this issue Oct 10, 2019
adimascio pushed a commit to adimascio/pandas that referenced this issue Dec 4, 2019
@mroeschke mroeschke added the Bug label May 16, 2020