ENH: Support writing timestamps with timezones with to_sql #22654
@@ -4777,7 +4777,7 @@ writes ``data`` to the database in batches of 1000 rows at a time:

   data.to_sql('data_chunked', engine, chunksize=1000)

SQL data types
-++++++++++++++
+''''''''''''''

:func:`~pandas.DataFrame.to_sql` will try to map your data to an appropriate
SQL data type based on the dtype of the data. When you have columns of dtype

@@ -4809,6 +4809,30 @@ default ``Text`` type for string columns:

Because of this, reading the database table back in does **not** generate
a categorical.

Datetime data types
'''''''''''''''''''

Using SQLAlchemy, :func:`~pandas.DataFrame.to_sql` is capable of writing
datetime data that is timezone naive or timezone aware. However, the resulting
data stored in the database ultimately depends on the corresponding,
supported data type for datetime data.

Review comment: With "corresponding", do you mean depending on the database system? I would also add that the following table lists some common systems (but that it can still be different for other database dialects).

=========== ============================================= ===================
Database    SQL Datetime Types                            Timezone Support
=========== ============================================= ===================
SQLite      ``TEXT``                                      No
MySQL       ``TIMESTAMP`` or ``DATETIME``                 No
PostgreSQL  ``TIMESTAMP`` or ``TIMESTAMP WITH TIME ZONE`` Yes
=========== ============================================= ===================

When writing timezone aware data to databases that do not support timezones,
the data will be written as timezone naive timestamps that are in local time
with respect to the timezone.

:func:`~pandas.read_sql_table` is also capable of reading datetime data that is
timezone aware of naive. When reading ``TIMESTAMP WITH TIME ZONE`` types, pandas

Review comment: of -> or

will convert the data to UTC.
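
To make this concrete, here is a minimal sketch (not part of this PR's diff) using an in-memory SQLite database; the engine URL, table name, and printed output are illustrative. Because SQLite has no timezone support, the timezone aware column comes back naive, in the original local time:

    import pandas as pd
    from sqlalchemy import create_engine

    # SQLite has no timezone-aware datetime type, so tz-aware input is
    # written as naive local timestamps (see the table above).
    engine = create_engine('sqlite:///:memory:')

    df = pd.DataFrame({'A': pd.date_range('2013-01-01 09:00:00',
                                          periods=3, tz='US/Pacific')})
    df.to_sql('tz_demo', engine, index=False)

    result = pd.read_sql_table('tz_demo', engine)
    print(result['A'])
    # Expected: naive timestamps local to US/Pacific, e.g.
    # 0   2013-01-01 09:00:00
    # 1   2013-01-02 09:00:00
    # 2   2013-01-03 09:00:00
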

Reading Tables
''''''''''''''
@@ -183,6 +183,7 @@ Other Enhancements

- :class:`IntervalIndex` has gained the :meth:`~IntervalIndex.set_closed` method to change the existing ``closed`` value (:issue:`21670`)
- :func:`~DataFrame.to_csv`, :func:`~Series.to_csv`, :func:`~DataFrame.to_json`, and :func:`~Series.to_json` now support ``compression='infer'`` to infer compression based on filename extension (:issue:`15008`).
  The default compression for ``to_csv``, ``to_json``, and ``to_pickle`` methods has been updated to ``'infer'`` (:issue:`22004`).
- :meth:`DataFrame.to_sql` now supports writing ``TIMESTAMP WITH TIME ZONE`` types for supported databases. See the `io reference <https://pandas.pydata.org/pandas-docs/stable/io.html#sql-queries>`__ for implications (:issue:`9086`).

Review comment: you can use an internal sphinx

- :func:`to_timedelta` now supports iso-formated timedelta strings (:issue:`21877`)
- :class:`Series` and :class:`DataFrame` now support :class:`Iterable` in constructor (:issue:`2193`)
- :class:`DatetimeIndex` gained :attr:`DatetimeIndex.timetz` attribute. Returns local time with timezone information. (:issue:`21358`)
@@ -2312,6 +2312,13 @@ def to_sql(self, name, con, schema=None, if_exists='fail', index=True,
        --------
        pandas.read_sql : read a DataFrame from a table

        Notes
        -----
        Timezone aware datetime columns will be written as
        ``Timestamp with timezone`` type with SQLAlchemy if supported by the
        database. Otherwise, the datetimes will stored in the closest available

Review comment: stored -> be stored. I think "closest available type" is a bit unclear. I would just say that they will be stored as "timezone unaware timestamps local ..."

        type as timezone unaware timestamps local to the original timezone.

Review comment: add a versionadded / changed


        References
        ----------
        .. [1] http://docs.sqlalchemy.org
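
As a sketch of the behavior these Notes describe (not code from this PR): the target SQL type can also be forced through ``to_sql``'s existing ``dtype`` argument. ``TIMESTAMP(timezone=True)`` is standard SQLAlchemy; the connection URL and table name below are made up:

    import pandas as pd
    from sqlalchemy import create_engine, TIMESTAMP

    # Hypothetical PostgreSQL connection; replace with a real URL.
    engine = create_engine('postgresql://user:pass@localhost/demo')

    df = pd.DataFrame({'ts': pd.date_range('2018-01-01', periods=3, tz='UTC')})

    # With this change, tz-aware columns should map to a timezone-aware
    # TIMESTAMP automatically where supported; the dtype argument simply
    # makes that mapping explicit.
    df.to_sql('events', engine, index=False,
              dtype={'ts': TIMESTAMP(timezone=True)})
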
@@ -961,7 +961,8 @@ def test_sqlalchemy_type_mapping(self):
                                            utc=True)})
        db = sql.SQLDatabase(self.conn)
        table = sql.SQLTable("test_type", db, frame=df)
-       assert isinstance(table.table.c['time'].type, sqltypes.DateTime)
+       # GH 9086: TIMESTAMP is the suggested type for datetimes with timezones
+       assert isinstance(table.table.c['time'].type, sqltypes.TIMESTAMP)

    def test_database_uri_string(self):
@@ -1361,9 +1362,40 @@ def check(col):
        df = sql.read_sql_table("types_test_data", self.conn)
        check(df.DateColWithTz)

    def test_datetime_with_timezone_roundtrip(self):
        # GH 9086
        # Write datetimetz data to a db and read it back
        # For dbs that support timestamps with timezones, should get back UTC
        # otherwise naive data should be returned
        expected = DataFrame({'A': date_range(
            '2013-01-01 09:00:00', periods=3, tz='US/Pacific'
        )})
        expected.to_sql('test_datetime_tz', self.conn, index=False)

        if self.flavor == 'postgresql':
            # SQLAlchemy "timezones" (i.e. offsets) are coerced to UTC
            expected['A'] = expected['A'].dt.tz_convert('UTC')
        else:
            # Otherwise, timestamps are returned as local, naive
            expected['A'] = expected['A'].dt.tz_localize(None)

        result = sql.read_sql_table('test_datetime_tz', self.conn)
        tm.assert_frame_equal(result, expected)

        result = sql.read_sql_query(
            'SELECT * FROM test_datetime_tz', self.conn
        )
        if self.flavor == 'sqlite':
            # read_sql_query does not return datetime type like read_sql_table
            assert isinstance(result.loc[0, 'A'], string_types)
            result['A'] = to_datetime(result['A'])
        tm.assert_frame_equal(result, expected)
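
A self-contained sketch (not from the PR) of the ``read_sql_query`` behavior asserted above: on SQLite the timestamps come back as plain strings, and the documented ``parse_dates`` parameter converts them. Table and column names are illustrative:

    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine('sqlite:///:memory:')
    df = pd.DataFrame({'A': pd.date_range('2013-01-01 09:00:00',
                                          periods=3, tz='US/Pacific')})
    df.to_sql('test_datetime_tz', engine, index=False)

    # read_sql_query does no table reflection, so SQLite's TEXT storage
    # surfaces as object dtype (strings).
    raw = pd.read_sql_query('SELECT * FROM test_datetime_tz', engine)
    print(raw['A'].dtype)     # object

    # parse_dates converts the strings back to naive datetime64.
    parsed = pd.read_sql_query('SELECT * FROM test_datetime_tz', engine,
                               parse_dates=['A'])
    print(parsed['A'].dtype)  # datetime64[ns]
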
    def test_date_parsing(self):
        # No Parsing
        df = sql.read_sql_table("types_test_data", self.conn)

Review comment: Shouldn't we rather test that it is not parsed instead of removing it? (but I agree this currently looks like this is not doing much)

Review comment: Oh I see. My cursory glance thought it was a duplicate of the line below; you're right, I will try to add an assert to this result as well.

        expected_type = object if self.flavor == 'sqlite' else np.datetime64
        assert issubclass(df.DateCol.dtype.type, expected_type)

        df = sql.read_sql_table("types_test_data", self.conn,
                                parse_dates=['DateCol'])

Review comment: do the docs look ok after this?

Review comment: This is moving it one level up (since this is the only 'subsubsection' in the whole sql section, I think that is fine). @mroeschke what was your reasoning here?

Review comment: Ah. I thought this was rendering incorrectly when I built the docs, but I was looking at the wrong section. I'll revert it.