-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: dt.timezone pydatetime parsed differently for ISO vs non-ISO dates #50025
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
FWIW I think we want any fixed offset timezone to always parse to |
thanks! looks like that'll be for pandas3.0 at least for the next one though, do you agree that |
I'd actually say it's technically correct. The Similar reason why I think this is the correct behavior too
|
Thanks - yeah I can see that argument OK so it's the ISO path that would need fixing |
@jbrockmendel do you have thoughts on this? I think the solution will need to be to combine the ISO and non-ISO paths (the only difference being how they handle string inputs), I'm just trying to figure out which parts to keep from which Between this and #50071, it looks like we need to aim for a mixture of them |
BTW @MarcoGorelli im getting a lot of pings and trying to triage. pls let me know if any are time-sensitive or blockers etc.
Do you mean moving the array_strptime logic into array_to_datetime? if so, yikes.
id be very happy to revive that. #49677 only changed fixed-offsets, so didnt require zoneinfo. i.e. doesnt need to wait until we drop py38 I lean towards agreeing with Marco that the behavior should not hinge on what implementations of FixedOffset the user or our defaults use. |
True, I think this would be an okay API change and signal that pandas is moving toward more stdlib tz implementations anyways |
Here's another inconsistency: when there's a timezone-aware string and a timezone-naive datetime. In the ISO case, the naive one is converted, whilst in the non-ISO case, we get an In [2]: pd.to_datetime(["2020-01-01 01:00:00-01:00", datetime(2020, 1, 1, 3, 0)], format='%Y-%d-%m %H:%M:%S%z')
Out[2]: Index([2020-01-01 01:00:00-01:00, 2020-01-01 03:00:00], dtype='object')
In [3]: pd.to_datetime(["2020-01-01 01:00:00-01:00", datetime(2020, 1, 1, 3, 0)], format='%Y-%m-%d %H:%M:%S%z')
Out[3]: DatetimeIndex(['2020-01-01 01:00:00-01:00', '2020-01-01 02:00:00-01:00'], dtype='datetime64[ns, pytz.FixedOffset(-60)]', freq=None) I think I prefer the first one
Thanks @jbrockmendel , and sorry for the all the pings. I'd really like to get PDEP4 in for pandas2.0, so if we could agree here on which path (ISO or non-ISO) is correct on what, then that'd help unblock it
I was thinking the opposite (calling |
Unless the user passed |
closed in #50242 |
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
This gives
Expected Behavior
Not sure they should be different. I think ideally from a user's perspective, it shouldn't matter whether a
format
is ISO8601 or notInstalled Versions
INSTALLED VERSIONS
commit : 3377a6d
python : 3.8.15.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.102.1-microsoft-standard-WSL2
Version : #1 SMP Wed Mar 2 00:30:59 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8
pandas : 2.0.0.dev0+817.g3377a6d33d
numpy : 1.23.5
pytz : 2022.6
dateutil : 2.8.2
setuptools : 65.5.1
pip : 22.3.1
Cython : 0.29.32
pytest : 7.2.0
hypothesis : 6.59.0
sphinx : 4.5.0
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : 4.9.1
html5lib : 1.1
pymysql : 1.0.2
psycopg2 : 2.9.3
jinja2 : 3.0.3
IPython : 8.7.0
pandas_datareader: 0.10.0
bs4 : 4.11.1
bottleneck : 1.3.5
brotli :
fastparquet : 2022.11.0
fsspec : 2021.11.0
gcsfs : 2021.11.0
matplotlib : 3.6.2
numba : 0.56.4
numexpr : 2.8.3
odfpy : None
openpyxl : 3.0.10
pandas_gbq : 0.17.9
pyarrow : 9.0.0
pyreadstat : 1.2.0
pyxlsb : 1.0.10
s3fs : 2021.11.0
scipy : 1.9.3
snappy :
sqlalchemy : 1.4.44
tables : 3.7.0
tabulate : 0.9.0
xarray : 2022.11.0
xlrd : 2.0.1
zstandard : 0.19.0
tzdata : None
qtpy : None
pyqt5 : None
None
The text was updated successfully, but these errors were encountered: