[BigQuery](https://cloud.google.com/bigquery/docs/) is a petabyte-scale analytics data warehouse that you can use to run SQL queries over vast amounts of data in near real time. This page shows you how to get started with the Google BigQuery API using the Python client library.
To use the BigQuery Python client library, start by initializing a client. The BigQuery client is used to send and receive messages from the BigQuery API.
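The examples in this guide use the BigQuery client library together with pandas. A minimal set of imports might look like the following (assuming the `google-cloud-bigquery` and `pandas` packages are installed):

```python
from google.cloud import bigquery
import pandas
```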
### Client project
The `bigquery.Client` object uses your default project. Alternatively, you can specify a project in the `Client` constructor. For more information about how the default project is determined, see the [google-auth documentation](https://google-auth.readthedocs.io/en/latest/reference/google.auth.html).
### Client location
Locations are required for certain BigQuery operations such as creating a dataset. If a location is provided to the client when it is initialized, it will be the default location for jobs, datasets, and tables.
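For example, a client could be constructed with a default location like this (a sketch; `"US"` is an illustrative choice for the `location` parameter the paragraph above refers to):

```python
# Set a default location for jobs, datasets, and tables created
# through this client.
client = bigquery.Client(location="US")
```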
Run the following to create a client with your default project:

```python
client = bigquery.Client()
print("Client created using default project: {}".format(client.project))
```
To explicitly specify a project when constructing the client, set the `project` parameter:
```python
client = bigquery.Client(project="your-project-id")
```
## Run a query on a public dataset
The following example queries the BigQuery `usa_names` public dataset to find the 10 most popular names. `usa_names` is a Social Security Administration dataset that contains all names from Social Security card applications for births that occurred in the United States after 1879.
Use the [Client.query](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.client.Client.html#google.cloud.bigquery.client.Client.query) method to run the query, and the [QueryJob.to_dataframe](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.job.QueryJob.html#google.cloud.bigquery.job.QueryJob.to_dataframe) method to return the results as a pandas [`DataFrame`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).
```python
query = """
    SELECT name, SUM(number) as total
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    GROUP BY name
    ORDER BY total DESC
    LIMIT 10
"""
query_job = client.query(query)  # API request - starts the query

df = query_job.to_dataframe()
df
```
## Run a parameterized query
BigQuery supports query parameters to help prevent [SQL injection](https://en.wikipedia.org/wiki/SQL_injection) when you construct a query with user input. Query parameters are only available with [standard SQL syntax](https://cloud.google.com/bigquery/docs/reference/standard-sql/). Query parameters can be used as substitutes for arbitrary expressions. Parameters cannot be used as substitutes for identifiers, column names, table names, or other parts of the query.
To specify a parameter, use the `@` character followed by an [identifier](https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical#identifiers), such as `@param_name`. For example, the following query finds all the words in a specific Shakespeare corpus with counts that are at least the specified value.
For more information, see [Running parameterized queries](https://cloud.google.com/bigquery/docs/parameterized-queries) in the BigQuery documentation.
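As a sketch, a named parameter can be bound through the query job configuration; the corpus name and count threshold below are illustrative:

```python
# @corpus and @min_word_count are named parameters bound via the
# job configuration.
sql = """
    SELECT word, word_count
    FROM `bigquery-public-data.samples.shakespeare`
    WHERE corpus = @corpus
    AND word_count >= @min_word_count
    ORDER BY word_count DESC;
"""
job_config = bigquery.QueryJobConfig()
job_config.query_parameters = [
    bigquery.ScalarQueryParameter("corpus", "STRING", "romeoandjuliet"),
    bigquery.ScalarQueryParameter("min_word_count", "INT64", 250),
]

query_job = client.query(sql, job_config=job_config)  # API request - starts the query
query_job.to_dataframe()
```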
## Create a new dataset
A dataset is contained within a specific [project](https://cloud.google.com/bigquery/docs/projects). Datasets are top-level containers that are used to organize and control access to your [tables](https://cloud.google.com/bigquery/docs/tables) and [views](https://cloud.google.com/bigquery/docs/views). A table or view must belong to a dataset. You need to create at least one dataset before [loading data into BigQuery](https://cloud.google.com/bigquery/loading-data-into-bigquery).
```python
# Define a name for the new dataset.
dataset_id = "your_new_dataset"

# The dataset is created in the client's default project.
dataset = client.create_dataset(dataset_id)  # API request
```
## Write query results to a destination table
For more information, see [Writing query results](https://cloud.google.com/bigquery/docs/writing-results) in the BigQuery documentation.
```python
# Set a destination table for the query results. The table is created in
# the dataset defined above; the table ID below is illustrative.
table_ref = dataset.table("your_new_table_id")
job_config = bigquery.QueryJobConfig()
job_config.destination = table_ref

query_job = client.query(query, job_config=job_config)  # API request - starts the query
query_job.result()  # Waits for the query to finish

print("Query results loaded to table {}".format(table_ref.path))
```
## Load data from a pandas DataFrame to a new table
Use the client's `load_table_from_dataframe` method to load a pandas `DataFrame` into a table:

```python
# Example data to load; the record below is illustrative.
df = pandas.DataFrame(
    [{"title": "The Meaning of Life", "release_year": 1983}]
)
table_ref = dataset.table("monty_python")
job = client.load_table_from_dataframe(df, table_ref)

job.result()  # Waits for table load to complete.
print("Loaded dataframe to {}".format(table_ref.path))
```
## Load data from a local file to a table
The following example demonstrates how to load a local CSV file into a new table. See [SourceFormat](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.job.SourceFormat.html#google.cloud.bigquery.job.SourceFormat) in the Python client library documentation for a list of available source formats. For more information, see [Loading data into BigQuery from a local data source](https://cloud.google.com/bigquery/docs/loading-data-local) in the BigQuery documentation.
```python
# Load a local CSV file; the file path is illustrative, and the table
# is created in the dataset defined above.
table_ref = dataset.table("us_states_from_local_file")

job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.CSV
job_config.skip_leading_rows = 1
job_config.autodetect = True

with open("us_states.csv", "rb") as source_file:
    job = client.load_table_from_file(source_file, table_ref, job_config=job_config)

job.result()  # Waits for table load to complete.

print('Loaded {} rows into {}:{}.'.format(
    job.output_rows, dataset_id, table_ref.path))
```
## Load data from Cloud Storage to a table
The following example demonstrates how to load a CSV file from Cloud Storage into a new table. See [SourceFormat](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.job.SourceFormat.html#google.cloud.bigquery.job.SourceFormat) in the Python client library documentation for a list of available source formats. For more information, see [Introduction to loading data from Cloud Storage](https://cloud.google.com/bigquery/docs/loading-data-cloud-storage) in the BigQuery documentation.
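As a sketch, a load job from Cloud Storage uses the client's `load_table_from_uri` method with a `gs://` URI; the bucket path and table name below are illustrative placeholders:

```python
# Load a CSV file from a Cloud Storage bucket into a new table.
uri = "gs://cloud-samples-data/bigquery/us-states/us-states.csv"
table_ref = dataset.table("us_states_from_gcs")

job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.CSV
job_config.skip_leading_rows = 1
job_config.autodetect = True

load_job = client.load_table_from_uri(uri, table_ref, job_config=job_config)
load_job.result()  # Waits for table load to complete.

print("Loaded {} rows.".format(load_job.output_rows))
```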