API/BUG: partially-inferring datetimes #40111

Closed
jbrockmendel opened this issue Feb 27, 2021 · 4 comments
Labels: API - Consistency, Bug, Constructors, Dtype Conversions

Comments

@jbrockmendel
Member

From test_mixed_offsets_with_native_datetime_raises:

import pandas as pd

vals = [
    "nan",
    pd.Timestamp("1990-01-01"),
    "2015-03-14T16:15:14.123-08:00",
    "2019-03-04T21:56:32.620-07:00",
    None,
]
ser = pd.Series(vals)

As expected, this comes back as object-dtype, but it also does some casting:

>>> ser
0                                 NaT
1                 1990-01-01 00:00:00
2    2015-03-14 16:15:14.123000-08:00
3    2019-03-04 21:56:32.620000-07:00
4                                None
dtype: object

ser[0] is a str "NaT" and ser[[2, 3]] are stdlib datetime objects. ser[1] and ser[4] are unchanged.

This casting is done inside a call to to_datetime, but has the surprising effect of not being idempotent:

>>> pd.to_datetime(ser)
ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True

I'd expect ser to be constructed without doing any casting.
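The idempotency concern can be probed with a small check. This is only a sketch, not the pandas test itself: it takes the `utc=True` route that the error message points to, and `to_utc` is a hypothetical helper name introduced for illustration. With `utc=True`, offsets are converted to UTC and the naive Timestamp is treated as already being in UTC, so a second pass should leave the result unchanged.

```python
import pandas as pd

vals = [
    "nan",
    pd.Timestamp("1990-01-01"),
    "2015-03-14T16:15:14.123-08:00",
    "2019-03-04T21:56:32.620-07:00",
    None,
]


def to_utc(values):
    # Hypothetical helper: utc=True is the option named in the ValueError.
    # Offsets are converted to UTC; naive values are interpreted as UTC.
    return pd.to_datetime(values, utc=True)


once = to_utc(vals)
twice = to_utc(once)

# Idempotent here means that converting the converted values changes nothing.
print(once.equals(twice))
```

This sidesteps the mixed-offset problem rather than fixing the constructor: the original offsets are lost once everything is normalized to UTC.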

jbrockmendel added the Bug and Needs Triage labels on Feb 27, 2021
jbrockmendel added the API - Consistency and Constructors labels and removed the Bug and Needs Triage labels on Mar 24, 2021
mroeschke added the Bug and Dtype Conversions labels on Aug 19, 2021
@jbrockmendel
Member Author

This may be fixed in 2.0, when the deprecation of the string-inference behavior in maybe_infer_to_datetimelike is enforced.

@jbrockmendel
Member Author

Partially closed by #44296; the lack of idempotency still needs to be addressed.

@jorisvandenbossche
Member

With latest main, I think none of the original objects have been converted:

In [38]: ser
Out[38]: 
0                              nan
1              1990-01-01 00:00:00
2    2015-03-14T16:15:14.123-08:00
3    2019-03-04T21:56:32.620-07:00
4                             None
dtype: object

In [39]: ser.values
Out[39]: 
array(['nan', Timestamp('1990-01-01 00:00:00'),
       '2015-03-14T16:15:14.123-08:00', '2019-03-04T21:56:32.620-07:00',
       None], dtype=object)
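The same observation can be checked programmatically instead of by reading the repr. The sketch below compares the type of each input element with the type of the element stored in the Series; `changed_positions` is a hypothetical helper introduced for illustration, not a pandas API. On a build that behaves as shown above, the returned list should be empty; on older releases that still infer datetimes from strings, it would list the positions that were cast.

```python
import pandas as pd

vals = [
    "nan",
    pd.Timestamp("1990-01-01"),
    "2015-03-14T16:15:14.123-08:00",
    "2019-03-04T21:56:32.620-07:00",
    None,
]


def changed_positions(original, ser):
    """Return the positions whose element type differs from the input."""
    return [
        i
        for i, (before, after) in enumerate(zip(original, ser))
        if type(before) is not type(after)
    ]


ser = pd.Series(vals)
changed = changed_positions(vals, ser)

# An empty list means the constructor stored every object without casting.
print(ser.dtype, changed)
```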

@jbrockmendel
Member Author

This looks correct to me now, closing.
