diff --git a/bigquery/dml/.gitignore b/bigquery/dml/.gitignore
deleted file mode 100644
index f4058b86817..00000000000
--- a/bigquery/dml/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-sample_db_export.sql
diff --git a/bigquery/dml/README.rst b/bigquery/dml/README.rst
deleted file mode 100644
index b9ce08f97f2..00000000000
--- a/bigquery/dml/README.rst
+++ /dev/null
@@ -1,150 +0,0 @@
-.. This file is automatically generated. Do not edit this file directly.
-
-Google BigQuery Python Samples
-===============================================================================
-
-This directory contains samples for Google BigQuery. `Google BigQuery`_ is Google's fully managed, petabyte scale, low cost analytics data warehouse. BigQuery is NoOps—there is no infrastructure to manage and you don't need a database administrator—so you can focus on analyzing data to find meaningful insights, use familiar SQL, and take advantage of our pay-as-you-go model.
-
-
-This sample shows how to use Data Manipulation Language with BigQuery.
-
-
-.. _Google BigQuery: https://cloud.google.com/bigquery/docs
-
-Setup
--------------------------------------------------------------------------------
-
-
-Authentication
-++++++++++++++
-
-Authentication is typically done through `Application Default Credentials`_,
-which means you do not have to change the code to authenticate as long as
-your environment has credentials. You have a few options for setting up
-authentication:
-
-#. When running locally, use the `Google Cloud SDK`_
-
-    .. code-block:: bash
-
-        gcloud auth application-default login
-
-
-#. When running on App Engine or Compute Engine, credentials are already
-   set-up. However, you may need to configure your Compute Engine instance
-   with `additional scopes`_.
-
-#. You can create a `Service Account key file`_. This file can be used to
-   authenticate to Google Cloud Platform services from any environment. To use
-   the file, set the ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable to
-   the path to the key file, for example:
-
-    .. code-block:: bash
-
-        export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json
-
-.. _Application Default Credentials: https://cloud.google.com/docs/authentication#getting_credentials_for_server-centric_flow
-.. _additional scopes: https://cloud.google.com/compute/docs/authentication#using
-.. _Service Account key file: https://developers.google.com/identity/protocols/OAuth2ServiceAccount#creatinganaccount
-
-Install Dependencies
-++++++++++++++++++++
-
-#. Install `pip`_ and `virtualenv`_ if you do not already have them.
-
-#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.
-
-    .. code-block:: bash
-
-        $ virtualenv env
-        $ source env/bin/activate
-
-#. Install the dependencies needed to run the samples.
-
-    .. code-block:: bash
-
-        $ pip install -r requirements.txt
-
-.. _pip: https://pip.pypa.io/
-.. _virtualenv: https://virtualenv.pypa.io/
-
-Samples
--------------------------------------------------------------------------------
-
-Populate sample DB
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-
-
-To run this sample:
-
-.. code-block:: bash
-
-    $ python populate_db.py
-
-    usage: populate_db.py [-h] total_users host user password db
-
-    Command-line tool to simulate user actions and write to SQL database.
-
-    positional arguments:
-      total_users  How many simulated users to create.
-      host         Host of the database to write to.
-      user         User to connect to the database.
-      password     Password for the database user.
-      db           Name of the database to write to.
-
-    optional arguments:
-      -h, --help   show this help message and exit
-
-
-Insert SQL
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-
-
-To run this sample:
-
-.. code-block:: bash
-
-    $ python insert_sql.py
-
-    usage: insert_sql.py [-h] project default_dataset sql_path
-
-    Sample that runs a file containing INSERT SQL statements in Big Query.
-
-    This could be used to run the INSERT statements in a mysqldump output such as
-
-        mysqldump --user=root --password='secret-password' --host=127.0.0.1 --no-create-info sample_db --skip-add-locks > sample_db_export.sql
-
-    To run, first create tables with the same names and columns as the sample
-    database. Then run this script.
-
-        python insert_sql.py my-project my_dataset sample_db_export.sql
-
-    positional arguments:
-      project          Google Cloud project name
-      default_dataset  Default BigQuery dataset name
-      sql_path         Path to SQL file
-
-    optional arguments:
-      -h, --help       show this help message and exit
-
-
-
-
-The client library
--------------------------------------------------------------------------------
-
-This sample uses the `Google Cloud Client Library for Python`_.
-You can read the documentation for more details on API usage and use GitHub
-to `browse the source`_ and `report issues`_.
-
-.. _Google Cloud Client Library for Python:
-   https://googlecloudplatform.github.io/google-cloud-python/
-.. _browse the source:
-   https://github.com/GoogleCloudPlatform/google-cloud-python
-.. _report issues:
-   https://github.com/GoogleCloudPlatform/google-cloud-python/issues
-
-
-.. _Google Cloud SDK: https://cloud.google.com/sdk/
\ No newline at end of file
diff --git a/bigquery/dml/README.rst.in b/bigquery/dml/README.rst.in
deleted file mode 100644
index 92fd9cd6df4..00000000000
--- a/bigquery/dml/README.rst.in
+++ /dev/null
@@ -1,29 +0,0 @@
-# This file is used to generate README.rst
-
-product:
-  name: Google BigQuery
-  short_name: BigQuery
-  url: https://cloud.google.com/bigquery/docs
-  description: >
-    `Google BigQuery`_ is Google's fully managed, petabyte scale, low cost
-    analytics data warehouse. BigQuery is NoOps—there is no infrastructure to
-    manage and you don't need a database administrator—so you can focus on
-    analyzing data to find meaningful insights, use familiar SQL, and take
-    advantage of our pay-as-you-go model.
-
-description: |
-  This sample shows how to use Data Manipulation Language with BigQuery.
-
-setup:
-- auth
-- install_deps
-
-samples:
-- name: Populate sample DB
-  file: populate_db.py
-  show_help: true
-- name: Insert SQL
-  file: insert_sql.py
-  show_help: true
-
-cloud_client_library: true
diff --git a/bigquery/dml/insert_sql.py b/bigquery/dml/insert_sql.py
deleted file mode 100644
index 2798be66248..00000000000
--- a/bigquery/dml/insert_sql.py
+++ /dev/null
@@ -1,78 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2016 Google Inc. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""Sample that runs a file containing INSERT SQL statements in Big Query.
-
-This could be used to run the INSERT statements in a mysqldump output such as
-
-    mysqldump --user=root \
-        --password='secret-password' \
-        --host=127.0.0.1 \
-        --no-create-info sample_db \
-        --skip-add-locks > sample_db_export.sql
-
-To run, first create tables with the same names and columns as the sample
-database. Then run this script.
-
-    python insert_sql.py my-project my_dataset sample_db_export.sql
-"""
-
-# [START insert_sql]
-import argparse
-
-from google.cloud import bigquery
-
-
-def insert_sql(project, default_dataset, sql_path):
-    """Run all the SQL statements in a SQL file."""
-
-    client = bigquery.Client(project=project)
-
-    with open(sql_path) as f:
-        for line in f:
-            line = line.strip()
-
-            if not line.startswith('INSERT'):
-                continue
-
-            print('Running query: {}{}'.format(
-                line[:60],
-                '...' if len(line) > 60 else ''))
-            query = client.run_sync_query(line)
-
-            # Set use_legacy_sql to False to enable standard SQL syntax.
-            # This is required to use the Data Manipulation Language features.
-            #
-            # For more information about enabling standard SQL, see:
-            # https://cloud.google.com/bigquery/sql-reference/enabling-standard-sql
-            query.use_legacy_sql = False
-            query.default_dataset = client.dataset(default_dataset)
-            query.run()
-
-
-if __name__ == "__main__":
-    parser = argparse.ArgumentParser(
-        description=__doc__,
-        formatter_class=argparse.RawDescriptionHelpFormatter)
-    parser.add_argument('project', help='Google Cloud project name')
-    parser.add_argument(
-        'default_dataset', help='Default BigQuery dataset name')
-    parser.add_argument('sql_path', help='Path to SQL file')
-
-    args = parser.parse_args()
-
-    insert_sql(args.project, args.default_dataset, args.sql_path)
-# [END insert_sql]
diff --git a/bigquery/dml/insert_sql_test.py b/bigquery/dml/insert_sql_test.py
deleted file mode 100644
index c85aeff1726..00000000000
--- a/bigquery/dml/insert_sql_test.py
+++ /dev/null
@@ -1,35 +0,0 @@
-# Copyright 2016 Google Inc. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import os
-import os.path
-
-from insert_sql import insert_sql
-
-PROJECT = os.environ['GCLOUD_PROJECT']
-
-
-def test_insert_sql(capsys):
-    sql_path = os.path.join(
-        os.path.dirname(__file__),
-        'resources',
-        'insert_sql_test.sql')
-
-    insert_sql(PROJECT, 'test_dataset', sql_path)
-
-    out, _ = capsys.readouterr()
-
-    assert (
-        'INSERT INTO `test_table` (`Name`) VALUES (\'hello world\')'
-        in out)
diff --git a/bigquery/dml/populate_db.py b/bigquery/dml/populate_db.py
deleted file mode 100755
index 8b63d7897ed..00000000000
--- a/bigquery/dml/populate_db.py
+++ /dev/null
@@ -1,182 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2016 Google Inc. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""Command-line tool to simulate user actions and write to SQL database.
-"""
-
-from __future__ import division
-
-import argparse
-import datetime
-import random
-import uuid
-
-from six.moves.urllib import parse
-import sqlalchemy
-from sqlalchemy.ext import declarative
-import sqlalchemy.orm
-
-
-SECONDS_IN_DAY = 24 * 60 * 60
-SECONDS_IN_2016 = 366 * SECONDS_IN_DAY
-
-# Unix timestamp for the beginning of 2016.
-# http://stackoverflow.com/a/19801806/101923
-TIMESTAMP_2016 = (
-    datetime.datetime(2016, 1, 1, 0, 0, 0) -
-    datetime.datetime.fromtimestamp(0)).total_seconds()
-
-
-Base = declarative.declarative_base()
-
-
-class User(Base):
-    __tablename__ = 'Users'
-
-    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True)
-    date_joined = sqlalchemy.Column(sqlalchemy.DateTime)
-
-
-class UserSession(Base):
-    __tablename__ = 'UserSessions'
-
-    id = sqlalchemy.Column(sqlalchemy.String(length=36), primary_key=True)
-    user_id = sqlalchemy.Column(
-        sqlalchemy.Integer, sqlalchemy.ForeignKey('Users.id'))
-    login_time = sqlalchemy.Column(sqlalchemy.DateTime)
-    logout_time = sqlalchemy.Column(sqlalchemy.DateTime)
-    ip_address = sqlalchemy.Column(sqlalchemy.String(length=40))
-
-
-def generate_users(session, num_users):
-    users = []
-
-    for userid in range(1, num_users + 1):
-        year_portion = random.random()
-        date_joined = datetime.datetime.fromtimestamp(
-            TIMESTAMP_2016 + SECONDS_IN_2016 * year_portion)
-        user = User(id=userid, date_joined=date_joined)
-        users.append(user)
-        session.add(user)
-
-    session.commit()
-    return users
-
-
-def random_ip():
-    """Choose a random example IP address.
-
-    Examples are chosen from the test networks described in
-    https://tools.ietf.org/html/rfc5737
-    """
-    network = random.choice([
-        '192.0.2',  # RFC-5737 TEST-NET-1
-        '198.51.100',  # RFC-5737 TEST-NET-2
-        '203.0.113',  # RFC-5737 TEST-NET-3
-    ])
-    ip_address = '{}.{}'.format(network, random.randrange(256))
-    return ip_address
-
-
-def simulate_user_session(session, user, previous_user_session=None):
-    """Simulates a single session (login to logout) of a user's history."""
-    login_time = user.date_joined
-
-    if previous_user_session is not None:
-        login_time = (
-            previous_user_session.logout_time +
-            datetime.timedelta(
-                days=1, seconds=random.randrange(SECONDS_IN_DAY)))
-
-    session_id = str(uuid.uuid4())
-    user_session = UserSession(
-        id=session_id,
-        user_id=user.id,
-        login_time=login_time,
-        ip_address=random_ip())
-    user_session.logout_time = (
-        login_time +
-        datetime.timedelta(seconds=(1 + random.randrange(59))))
-    session.commit()
-    session.add(user_session)
-    return user_session
-
-
-def simulate_user_history(session, user):
-    """Simulates the entire history of activity for a single user."""
-    total_sessions = random.randrange(10)
-    previous_user_session = None
-
-    for _ in range(total_sessions):
-        user_session = simulate_user_session(
-            session, user, previous_user_session)
-        previous_user_session = user_session
-
-
-def run_simulation(session, users):
-    """Simulates app activity for all users."""
-
-    for n, user in enumerate(users):
-        if n % 100 == 0 and n != 0:
-            print('Simulated data for {} users'.format(n))
-
-        simulate_user_history(session, user)
-
-    print('COMPLETE: Simulated data for {} users'.format(len(users)))
-
-
-def populate_db(session, total_users=3):
-    """Populate database with total_users simulated users and their actions."""
-    users = generate_users(session, total_users)
-    run_simulation(session, users)
-
-
-def create_session(engine):
-    Base.metadata.drop_all(engine)
-    Base.metadata.create_all(engine)
-    Session = sqlalchemy.orm.sessionmaker(bind=engine)
-    return Session()
-
-
-def main(total_users, host, user, password, db_name):
-    engine = sqlalchemy.create_engine(
-        'mysql+pymysql://{user}:{password}@{host}/{db_name}'.format(
-            user=user,
-            password=parse.quote_plus(password),
-            host=host,
-            db_name=db_name))
-    session = create_session(engine)
-
-    try:
-        populate_db(session, total_users)
-    finally:
-        session.close()
-
-
-if __name__ == '__main__':
-    parser = argparse.ArgumentParser(
-        description=__doc__,
-        formatter_class=argparse.RawDescriptionHelpFormatter)
-    parser.add_argument(
-        'total_users', help='How many simulated users to create.', type=int)
-    parser.add_argument('host', help='Host of the database to write to.')
-    parser.add_argument('user', help='User to connect to the database.')
-    parser.add_argument('password', help='Password for the database user.')
-    parser.add_argument('db', help='Name of the database to write to.')
-
-    args = parser.parse_args()
-
-    main(args.total_users, args.host, args.user, args.password, args.db)
diff --git a/bigquery/dml/populate_db_test.py b/bigquery/dml/populate_db_test.py
deleted file mode 100644
index 66775834c82..00000000000
--- a/bigquery/dml/populate_db_test.py
+++ /dev/null
@@ -1,34 +0,0 @@
-# Copyright 2016 Google Inc. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import sqlalchemy
-
-from populate_db import create_session, populate_db
-
-
-def test_populate_db_populates_users():
-    engine = sqlalchemy.create_engine('sqlite://')
-    session = create_session(engine)
-
-    try:
-        populate_db(session, total_users=10)
-
-        connection = session.connection().connection
-        cursor = connection.cursor()
-        cursor.execute('SELECT COUNT(*) FROM Users')
-        assert cursor.fetchone()[0] == 10
-        cursor.execute('SELECT COUNT(*) FROM UserSessions')
-        assert cursor.fetchone()[0] >= 10
-    finally:
-        session.close()
diff --git a/bigquery/dml/requirements.txt b/bigquery/dml/requirements.txt
deleted file mode 100644
index 84423deb971..00000000000
--- a/bigquery/dml/requirements.txt
+++ /dev/null
@@ -1,5 +0,0 @@
-flake8==3.3.0
-google-cloud-bigquery==0.25.0
-PyMySQL==0.7.11
-six==1.10.0
-SQLAlchemy==1.1.12
diff --git a/bigquery/dml/resources/insert_sql_test.sql b/bigquery/dml/resources/insert_sql_test.sql
deleted file mode 100644
index 42f5dabdacb..00000000000
--- a/bigquery/dml/resources/insert_sql_test.sql
+++ /dev/null
@@ -1,6 +0,0 @@
--- This file is used to test ../insert_sql.py.
--- These are comments.
--- Each query to be executed should be on a single line.
-
-/* Another ignored line. */
-INSERT INTO `test_table` (`Name`) VALUES ('hello world')
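The deleted `insert_sql.py` forwards only lines beginning with `INSERT` to BigQuery, silently skipping mysqldump comments, lock statements, and blank lines. That filtering step can be sketched standalone, without a BigQuery client; the `extract_insert_statements` helper below is illustrative and was not part of the removed sample:

```python
def extract_insert_statements(dump_lines):
    """Return the INSERT statements found in mysqldump output.

    Mirrors the filtering in the deleted insert_sql.py: each statement
    must occupy a single line, and anything that does not start with
    'INSERT' after stripping whitespace (comments such as '-- ...' or
    '/* ... */', locking statements, blank lines) is ignored.
    """
    statements = []
    for line in dump_lines:
        line = line.strip()
        if line.startswith('INSERT'):
            statements.append(line)
    return statements


if __name__ == '__main__':
    # Sample input shaped like resources/insert_sql_test.sql above.
    sample = [
        '-- These are comments.',
        '',
        '/* Another ignored line. */',
        "INSERT INTO `test_table` (`Name`) VALUES ('hello world')",
    ]
    for statement in extract_insert_statements(sample):
        print(statement)
```

Note that `run_sync_query` in the deleted sample belongs to the legacy `google-cloud-bigquery==0.25.0` API pinned in its requirements.txt; the filtering logic itself is version-independent.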