@@ -38,7 +38,7 @@ object.
38
38
* :ref: `read_json<io.json_reader> `
39
39
* :ref: `read_msgpack<io.msgpack> ` (experimental)
40
40
* :ref: `read_html<io.read_html> `
41
- * :ref: `read_gbq<io.bigquery_reader > ` (experimental)
41
+ * :ref: `read_gbq<io.bigquery > ` (experimental)
42
42
* :ref: `read_stata<io.stata_reader> `
43
43
* :ref: `read_sas<io.sas_reader> `
44
44
* :ref: `read_clipboard<io.clipboard> `
@@ -53,7 +53,7 @@ The corresponding ``writer`` functions are object methods that are accessed like
53
53
* :ref: `to_json<io.json_writer> `
54
54
* :ref: `to_msgpack<io.msgpack> ` (experimental)
55
55
* :ref: `to_html<io.html> `
56
- * :ref: `to_gbq<io.bigquery_writer > ` (experimental)
56
+ * :ref: `to_gbq<io.bigquery > ` (experimental)
57
57
* :ref: `to_stata<io.stata_writer> `
58
58
* :ref: `to_clipboard<io.clipboard> `
59
59
* :ref: `to_pickle<io.pickle> `
@@ -4428,16 +4428,11 @@ DataFrame with a shape and data types derived from the source table.
4428
4428
Additionally, DataFrames can be inserted into new BigQuery tables or appended
4429
4429
to existing tables.
4430
4430
4431
- You will need to install some additional dependencies:
4432
-
4433
- - Google's `python-gflags <https://github.com/google/python-gflags/ >`__
4434
- - `httplib2 <http://pypi.python.org/pypi/httplib2 >`__
4435
- - `google-api-python-client <http://github.com/google/google-api-python-client >`__
4436
-
4437
4431
.. warning ::
4438
4432
4439
4433
To use this module, you will need a valid BigQuery account. Refer to the
4440
- `BigQuery Documentation <https://cloud.google.com/bigquery/what-is-bigquery >`__ for details on the service itself.
4434
+ `BigQuery Documentation <https://cloud.google.com/bigquery/what-is-bigquery >`__
4435
+ for details on the service itself.
4441
4436
4442
4437
The key functions are:
4443
4438
@@ -4451,22 +4446,58 @@ The key functions are:
4451
4446
4452
4447
.. currentmodule :: pandas
4453
4448
4454
- .. _io.bigquery_reader :
4449
+
4450
+ Supported Data Types
4451
+ ++++++++++++++++++++
4452
+
4453
+ pandas supports the following `BigQuery data types <https://cloud.google.com/bigquery/data-types>`__:
4454
+ ``STRING``, ``INTEGER`` (64 bit), ``FLOAT`` (64 bit), ``BOOLEAN`` and
4455
+ ``TIMESTAMP`` (microsecond precision). Data types ``BYTES`` and ``RECORD``
4456
+ are not supported.
4457
+
4458
+ Integer and boolean ``NA `` handling
4459
+ +++++++++++++++++++++++++++++++
4460
+
4461
+ .. versionadded :: 0.19
4462
+
4463
+ Since all columns in BigQuery queries are nullable, and NumPy lacks ``NA``
4464
+ support for integer and boolean types, this module will store ``INTEGER`` or
4465
+ ``BOOLEAN`` columns with at least one ``NULL`` value as ``dtype=object``.
4466
+ Otherwise those columns will be stored as ``dtype=int64`` or ``dtype=bool``
4467
+ respectively.
4468
+
4469
+ This is the opposite of the default pandas behaviour, which promotes integer
4470
+ types to float in order to store NAs. See the :ref:`gotchas<gotchas.intna>`
4471
+ for a detailed explanation.
4472
+
4473
+ While this trade-off works well in most cases, it breaks down when storing
4474
+ values greater than 2**53. Such values in BigQuery can represent identifiers,
4475
+ and unnoticed precision loss in identifiers is what we want to avoid.
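
The 2**53 limit can be checked with plain Python, whose ``float`` uses the same IEEE 754 double-precision format as NumPy's ``float64``. The following is a minimal sketch, not code from this module; ``big_id`` is a hypothetical identifier chosen for illustration.

```python
# Python's float is an IEEE 754 double, the same format as NumPy's float64.
# `big_id` is a hypothetical identifier just above the range in which every
# integer can be represented exactly as a double.
big_id = 2**53 + 1

as_float = float(big_id)       # what promotion to a float column would store
round_tripped = int(as_float)  # converting back does not recover the identifier

print(big_id)                   # 9007199254740993
print(round_tripped)            # 9007199254740992
print(big_id == round_tripped)  # False
```

Storing such a column with ``dtype=object`` keeps the original integers intact, at the cost of slower operations on that column.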
4476
+
4477
+ Dependencies
4478
+ ++++++++++++
4479
+
4480
+ This module requires these additional dependencies:
4481
+
4482
+ - `httplib2 <http://pypi.python.org/pypi/httplib2 >`__
4483
+ - `google-api-python-client <http://github.com/google/google-api-python-client >`__
4484
+ - `oauth2client <https://github.com/google/oauth2client>`__
4485
+
4455
4486
4456
4487
.. _io.bigquery_authentication :
4457
4488
4458
4489
Authentication
4459
4490
''''''''''''''
4460
4491
4461
- .. versionadded :: 0.18.0
4492
+ .. versionadded :: 0.18
4462
4493
4463
4494
Authentication to the Google ``BigQuery`` service is via ``OAuth 2.0``.
4464
4495
It is possible to authenticate with either user account credentials or service account credentials.
4465
4496
4466
4497
Authenticating with user account credentials is as simple as following the prompts in a browser window
4467
4498
which will be automatically opened for you. You will be authenticated to the specified
4468
4499
``BigQuery`` account using the product name ``pandas GBQ``. This is only possible on the local host.
4469
- The remote authentication using user account credentials is not currently supported in Pandas .
4500
+ Remote authentication using user account credentials is not currently supported in pandas.
4470
4501
Additional information on the authentication mechanism can be found
4471
4502
`here <https://developers.google.com/identity/protocols/OAuth2#clientside/ >`__.
4472
4503
@@ -4475,17 +4506,13 @@ is particularly useful when working on remote servers (eg. jupyter iPython noteb
4475
4506
Additional information on service accounts can be found
4476
4507
`here <https://developers.google.com/identity/protocols/OAuth2#serviceaccount >`__.
4477
4508
4478
- You will need to install an additional dependency: `oauth2client <https://github.com/google/oauth2client >`__.
4479
-
4480
4509
Authentication via ``application default credentials `` is also possible. This is only valid
4481
4510
if the parameter ``private_key `` is not provided. This method also requires that
4482
4511
the credentials can be fetched from the environment the code is running in.
4483
4512
Otherwise, the OAuth2 client-side authentication is used.
4484
4513
Additional information can be found in the documentation on
4485
4514
`application default credentials <https://developers.google.com/identity/protocols/application-default-credentials >`__.
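
The order in which the authentication methods described above are chosen can be sketched as follows. This only illustrates the documented behaviour and is not the module's actual code. The function ``choose_auth_method``, its argument ``default_credentials_available`` and the key path are hypothetical names used for illustration.

```python
def choose_auth_method(private_key=None, default_credentials_available=False):
    """Return the name of the authentication method that would apply.

    Hypothetical helper: it mirrors the precedence described in the text,
    it does not contact any Google service.
    """
    if private_key is not None:
        # A service account key was supplied explicitly.
        return "service account"
    if default_credentials_available:
        # No private key, but credentials can be fetched from the environment.
        return "application default credentials"
    # Fall back to the browser-based OAuth2 client-side flow.
    return "OAuth2 client-side"


print(choose_auth_method(private_key="/path/to/key.json"))     # service account
print(choose_auth_method(default_credentials_available=True))  # application default credentials
print(choose_auth_method())                                    # OAuth2 client-side
```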
4486
4515
4487
- .. versionadded :: 0.19.0
4488
-
4489
4516
.. note ::
4490
4517
4491
4518
The `'private_key' ` parameter can be set to either the file path of the service account key
@@ -4496,6 +4523,7 @@ Additional information on
4496
4523
A private key can be obtained from the Google developers console by clicking
4497
4524
`here <https://console.developers.google.com/permissions/serviceaccounts >`__. Use JSON key type.
4498
4525
4526
+ .. _io.bigquery_reader :
4499
4527
4500
4528
Querying
4501
4529
''''''''
@@ -4539,7 +4567,6 @@ destination DataFrame as well as a preferred column order as follows:
4539
4567
4540
4568
.. _io.bigquery_writer :
4541
4569
4542
-
4543
4570
Writing DataFrames
4544
4571
''''''''''''''''''
4545
4572
@@ -4629,6 +4656,8 @@ For example:
4629
4656
often as the service seems to be changing and evolving. BigQuery is best for analyzing large
4630
4657
sets of data quickly, but it is not a direct replacement for a transactional database.
4631
4658
4659
+ .. _io.bigquery_create_tables :
4660
+
4632
4661
Creating BigQuery Tables
4633
4662
''''''''''''''''''''''''
4634
4663
@@ -4658,6 +4687,7 @@ produce the dictionary representation schema of the specified pandas DataFrame.
4658
4687
the new table with a different name. Refer to
4659
4688
`Google BigQuery issue 191 <https://code.google.com/p/google-bigquery/issues/detail?id=191 >`__.
4660
4689
4690
+
4661
4691
.. _io.stata :
4662
4692
4663
4693
Stata Format