Commit 8a26345

bwanglzu authored and max-sixty committed
BUG: resolve divide by 0 error when uploading empty dataframe (#252)
* resolve divide by 0 error when uploading empty dataframe
* reformat with black
* add unit test when uploading empty dataframe
* add empty data upload system test
* remove empty df unit test
* update empty df
* add 0.10.0 release note
* update release note version number to 0.11.0
* update empty dataframe bug fix in change log
1 parent ebcbfbe commit 8a26345

3 files changed, +27 -3 lines changed

docs/source/changelog.rst

+3 -1
@@ -6,6 +6,8 @@ Changelog
 0.10.0 / TBD
 ------------

+- This fixes a bug where pandas-gbq could not upload an empty database. (:issue:`237`)
+
 Dependency updates
 ~~~~~~~~~~~~~~~~~~

@@ -235,4 +237,4 @@ Initial release of transfered code from `pandas <https://github.com/pandas-dev/p
 Includes patches since the 0.19.2 release on pandas with the following:

 - :func:`read_gbq` now allows query configuration preferences `pandas-GH#14742 <https://github.com/pandas-dev/pandas/pull/14742>`__
-- :func:`read_gbq` now stores ``INTEGER`` columns as ``dtype=object`` if they contain ``NULL`` values. Otherwise they are stored as ``int64``. This prevents precision lost for integers greather than 2**53. Furthermore ``FLOAT`` columns with values above 10**4 are no longer casted to ``int64`` which also caused precision loss `pandas-GH#14064 <https://github.com/pandas-dev/pandas/pull/14064>`__, and `pandas-GH#14305 <https://github.com/pandas-dev/pandas/pull/14305>`__
+- :func:`read_gbq` now stores ``INTEGER`` columns as ``dtype=object`` if they contain ``NULL`` values. Otherwise they are stored as ``int64``. This prevents precision lost for integers greather than 2**53. Furthermore ``FLOAT`` columns with values above 10**4 are no longer casted to ``int64`` which also caused precision loss `pandas-GH#14064 <https://github.com/pandas-dev/pandas/pull/14064>`__, and `pandas-GH#14305 <https://github.com/pandas-dev/pandas/pull/14305>`__

pandas_gbq/gbq.py

+2 -2
@@ -518,8 +518,8 @@ def load_data(
                 chunks = tqdm.tqdm(chunks)
             for remaining_rows in chunks:
                 logger.info(
-                    "\rLoad is {0}% Complete".format(
-                        ((total_rows - remaining_rows) * 100) / total_rows
+                    "\r{} out of {} rows loaded.".format(
+                        total_rows - remaining_rows, total_rows
                     )
                 )
         except self.http_error as ex:
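
The effect of this hunk can be reproduced in isolation. The following is a minimal sketch, not the library's code: `old_message` and `new_message` are hypothetical helpers that wrap the removed and added format expressions from the hunk above. With an empty DataFrame, `total_rows` is 0, so the removed percentage expression divides by zero, while the added row-count message does no division.

```python
def old_message(total_rows, remaining_rows):
    # Removed expression: a percentage, which divides by total_rows.
    return "\rLoad is {0}% Complete".format(
        ((total_rows - remaining_rows) * 100) / total_rows
    )


def new_message(total_rows, remaining_rows):
    # Added expression: plain counts, no division involved.
    return "\r{} out of {} rows loaded.".format(
        total_rows - remaining_rows, total_rows
    )


if __name__ == "__main__":
    # Non-empty upload: both messages can be formatted.
    print(repr(old_message(10, 5)))
    print(repr(new_message(10, 5)))

    # Empty upload: only the new message can be formatted.
    print(repr(new_message(0, 0)))
    try:
        old_message(0, 0)
    except ZeroDivisionError as exc:
        print("old message failed:", exc)
```

Reporting counts instead of a percentage avoids the need for a special case when `total_rows` is 0.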

tests/system/test_gbq.py

+22
@@ -924,6 +924,28 @@ def test_upload_data(self, project_id):
         )
         assert result["num_rows"][0] == test_size

+    def test_upload_empty_data(self, project_id):
+        test_id = "data_with_0_rows"
+        test_size = 0
+        df = DataFrame()
+
+        gbq.to_gbq(
+            df,
+            self.destination_table + test_id,
+            project_id,
+            credentials=self.credentials,
+        )
+
+        result = gbq.read_gbq(
+            "SELECT COUNT(*) AS num_rows FROM {0}".format(
+                self.destination_table + test_id
+            ),
+            project_id=project_id,
+            credentials=self.credentials,
+            dialect="legacy",
+        )
+        assert result["num_rows"][0] == test_size
+
     def test_upload_data_if_table_exists_fail(self, project_id):
         test_id = "2"
         test_size = 10
