Skip to content

ENH: Add Timedelta Support to JSON Reader with orient=table (#21140) #21827

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.24.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,7 @@ Other Enhancements
- :class:`IntervalIndex` has gained the :meth:`~IntervalIndex.set_closed` method to change the existing ``closed`` value (:issue:`21670`)
- :func:`~DataFrame.to_csv` and :func:`~DataFrame.to_json` now support ``compression='infer'`` to infer compression based on filename (:issue:`15008`)
- :func:`to_timedelta` now supports iso-formated timedelta strings (:issue:`21877`)
- :func:`read_json` now parse timedelta with `orient='table'` (:issue:`21140`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have missed this on previous comment but there's a small typo here. parse -> parses

Copy link
Contributor Author

@fjdiod fjdiod Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd I've pushed a fix, but now there is a merge conflict. How should I deal with it on github?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to deal locally and re-push. Assuming you have things set up as mentioned in the pandas contributing guide do this on your local branch

git fetch upstream
git merge upstream/master

You'll probably get a merge conflict there, so fix that up and do:

git merge --continue

Everything should be resolved then so re-push after that

-

.. _whatsnew_0240.api_breaking:
Expand Down
11 changes: 5 additions & 6 deletions pandas/io/json/table_schema.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
import warnings

import pandas._libs.json as json
from pandas import DataFrame
from pandas import DataFrame, to_timedelta
from pandas.api.types import CategoricalDtype
import pandas.core.common as com
from pandas.core.dtypes.common import (
Expand Down Expand Up @@ -163,7 +163,7 @@ def convert_json_field_to_pandas_type(field):
elif typ == 'boolean':
return 'bool'
elif typ == 'duration':
return 'timedelta64'
return 'timedelta64[ns]'
elif typ == 'datetime':
if field.get('tz'):
return 'datetime64[ns, {tz}]'.format(tz=field['tz'])
Expand Down Expand Up @@ -306,10 +306,9 @@ def parse_table_schema(json, precise_float):
raise NotImplementedError('table="orient" can not yet read timezone '
'data')

# No ISO constructor for Timedelta as of yet, so need to raise
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this not hit in tests prior to this PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have been covered by test_read_json_table_orient_raises through parametrization

if 'timedelta64' in dtypes.values():
raise NotImplementedError('table="orient" can not yet read '
'ISO-formatted Timedelta data')
for col, dtype in dtypes.items():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this block necessary? Assumed the subsequent astype call would take care of this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, support for iso-format timedeltas should be added to astype function?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No just go with what jreback said for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, are you sure you need to do this? I believe .astype(dtypes) should handle this, and if not we should fix that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now

delta = pd.Timedelta(1e9).isoformat()
pd.DataFrame([delta]).astype('timedelta64[ns]')

results in error.

if dtype == 'timedelta64[ns]':
df[col] = to_timedelta(df[col])

df = df.astype(dtypes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any conflict between this and the loop above? Wondering if we should be removing any timedelta calls from the dtypes dict since they get accounted for in the loop


Expand Down
6 changes: 3 additions & 3 deletions pandas/tests/io/json/test_json_table_schema.py
Original file line number Diff line number Diff line change
Expand Up @@ -390,7 +390,7 @@ def test_convert_pandas_type_to_json_field_categorical(self, kind,
({'type': 'integer'}, 'int64'),
({'type': 'number'}, 'float64'),
({'type': 'boolean'}, 'bool'),
({'type': 'duration'}, 'timedelta64'),
({'type': 'duration'}, 'timedelta64[ns]'),
({'type': 'datetime'}, 'datetime64[ns]'),
({'type': 'datetime', 'tz': 'US/Hawaii'}, 'datetime64[ns, US/Hawaii]'),
({'type': 'any'}, 'object'),
Expand Down Expand Up @@ -499,6 +499,7 @@ class TestTableOrientReader(object):
'level_0'])
@pytest.mark.parametrize("vals", [
{'ints': [1, 2, 3, 4]},
{'timedeltas': pd.timedelta_range('1H', periods=4, freq='T')},
{'objects': ['a', 'b', 'c', 'd']},
{'date_ranges': pd.date_range('2016-01-01', freq='d', periods=4)},
{'categoricals': pd.Series(pd.Categorical(['a', 'b', 'c', 'c']))},
Expand All @@ -516,7 +517,6 @@ def test_read_json_table_orient(self, index_nm, vals, recwarn):
@pytest.mark.parametrize("index_nm", [
None, "idx", "index"])
@pytest.mark.parametrize("vals", [
{'timedeltas': pd.timedelta_range('1H', periods=4, freq='T')},
{'timezones': pd.date_range('2016-01-01', freq='d', periods=4,
tz='US/Central')}])
def test_read_json_table_orient_raises(self, index_nm, vals, recwarn):
Expand All @@ -530,7 +530,7 @@ def test_comprehensive(self):
{'A': [1, 2, 3, 4],
'B': ['a', 'b', 'c', 'c'],
'C': pd.date_range('2016-01-01', freq='d', periods=4),
# 'D': pd.timedelta_range('1H', periods=4, freq='T'),
'D': pd.timedelta_range('1H', periods=4, freq='T'),
'E': pd.Series(pd.Categorical(['a', 'b', 'c', 'c'])),
'F': pd.Series(pd.Categorical(['a', 'b', 'c', 'c'],
ordered=True)),
Expand Down