Skip to content

Stream Django SQL queries and add flag to toggle their streaming #111

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jan 8, 2019
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 8 additions & 4 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@
CHANGELOG
=========

unreleased
==========
* feature: Stream Django ORM SQL queries and add flag to toggle their streaming

2.2.0
=====
* feature: Added context managers on segment/subsegment capture. `PR97 <https://github.com/aws/aws-xray-sdk-python/pull/97>`_.
Expand Down Expand Up @@ -32,11 +36,11 @@ CHANGELOG
* **Breaking**: The original sampling modules for local defined rules are moved from `models.sampling` to `models.sampling.local`.
* **Breaking**: The default behavior of `patch_all` changed to selectively patches libraries to avoid double patching. You can use `patch_all(double_patch=True)` to force it to patch ALL supported libraries. See more details on `ISSUE63 <https://github.com/aws/aws-xray-sdk-python/issues/63>`_
* **Breaking**: The latest `botocore` that has new X-Ray service API `GetSamplingRules` and `GetSamplingTargets` are required.
* **Breaking**: Version 2.x doesn't support pynamodb and aiobotocore as it requires botocore >= 1.11.3 which isn’t currently supported by the pynamodb and aiobotocore libraries. Please continue to use version 1.x if you’re using pynamodb or aiobotocore until those haven been updated to use botocore > = 1.11.3.
* **Breaking**: Version 2.x doesn't support pynamodb and aiobotocore as it requires botocore >= 1.11.3 which isn’t currently supported by the pynamodb and aiobotocore libraries. Please continue to use version 1.x if you’re using pynamodb or aiobotocore until those haven been updated to use botocore > = 1.11.3.
* feature: Environment variable `AWS_XRAY_DAEMON_ADDRESS` now takes an additional notation in `tcp:127.0.0.1:2000 udp:127.0.0.2:2001` to set TCP and UDP destination separately. By default it assumes a X-Ray daemon listening to both UDP and TCP traffic on `127.0.0.1:2000`.
* feature: Added MongoDB python client support. `PR65 <https://github.com/aws/aws-xray-sdk-python/pull/65>`_.
* bugfix: Support binding connection in sqlalchemy as well as engine. `PR78 <https://github.com/aws/aws-xray-sdk-python/pull/78>`_.
* bugfix: Flask middleware safe request teardown. `ISSUE75 <https://github.com/aws/aws-xray-sdk-python/issues/75>`_.
* bugfix: Support binding connection in sqlalchemy as well as engine. `PR78 <https://github.com/aws/aws-xray-sdk-python/pull/78>`_.
* bugfix: Flask middleware safe request teardown. `ISSUE75 <https://github.com/aws/aws-xray-sdk-python/issues/75>`_.


1.1.2
Expand Down Expand Up @@ -68,7 +72,7 @@ CHANGELOG
* bugfix: Fixed an issue where arbitrary fields in trace header being dropped when calling downstream.
* bugfix: Fixed a compatibility issue between botocore and httplib patcher. `ISSUE48 <https://github.com/aws/aws-xray-sdk-python/issues/48>`_.
* bugfix: Fixed a typo in sqlalchemy decorators. `PR50 <https://github.com/aws/aws-xray-sdk-python/pull/50>`_.
* Updated `README` with more usage examples.
* Updated `README` with more usage examples.

0.97
====
Expand Down
20 changes: 19 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -251,6 +251,19 @@ with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
pass
```

### Trace SQL queries
By default, if no other value is provided to `.configure()`, SQL trace streaming is enabled
for all the supported DB engines. Those currently are:
- Any engine attached to the Django ORM.
- Any engine attached to SQLAlchemy.

The behaviour can be toggled by sending the appropriate `stream_sql` value, for example:
```python
from aws_xray_sdk.core import xray_recorder

xray_recorder.configure(service='fallback_name', stream_sql=True)
```

### Patch third-party libraries

```python
Expand All @@ -260,7 +273,8 @@ libs_to_patch = ('boto3', 'mysql', 'requests')
patch(libs_to_patch)
```

### Add Django middleware
### Django
#### Add middleware

In django settings.py, use the following.

Expand All @@ -275,6 +289,10 @@ MIDDLEWARE = [
# ... other middlewares
]
```
#### SQL tracing
If Django's ORM is patched - either using the `AUTO_INSTRUMENT = True` in your settings file
or explicitly calling `patch_db()` - the SQL query trace streaming can then be enabled or
disabled updating the `STREAM_SQL` variable in your settings file. It is enabled by default.

### Add Flask middleware

Expand Down
15 changes: 14 additions & 1 deletion aws_xray_sdk/core/recorder.py
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,7 @@ def __init__(self):
self._dynamic_naming = None
self._aws_metadata = copy.deepcopy(XRAY_META)
self._origin = None
self._stream_sql = True

if type(self.sampler).__name__ == 'DefaultSampler':
self.sampler.load_settings(DaemonConfig(), self.context)
Expand All @@ -81,7 +82,8 @@ def configure(self, sampling=None, plugins=None,
daemon_address=None, service=None,
context=None, emitter=None, streaming=None,
dynamic_naming=None, streaming_threshold=None,
max_trace_back=None, sampler=None):
max_trace_back=None, sampler=None,
stream_sql=True):
"""Configure global X-Ray recorder.

Configure needs to run before patching thrid party libraries
Expand Down Expand Up @@ -130,6 +132,7 @@ class to have your own implementation of the streaming process.
maximum number of subsegments within a segment.
:param int max_trace_back: The maxinum number of stack traces recorded
by auto-capture. Lower this if a single document becomes too large.
:param bool stream_sql: Whether SQL query texts should be streamed.

Environment variables AWS_XRAY_DAEMON_ADDRESS, AWS_XRAY_CONTEXT_MISSING
and AWS_XRAY_TRACING_NAME respectively overrides arguments
Expand Down Expand Up @@ -159,6 +162,8 @@ class to have your own implementation of the streaming process.
self.streaming_threshold = streaming_threshold
if max_trace_back:
self.max_trace_back = max_trace_back
if stream_sql is not None:
self.stream_sql = stream_sql

if plugins:
plugin_modules = get_plugin_modules(plugins)
Expand Down Expand Up @@ -548,3 +553,11 @@ def max_trace_back(self):
@max_trace_back.setter
def max_trace_back(self, value):
self._max_trace_back = value

@property
def stream_sql(self):
return self._stream_sql

@stream_sql.setter
def stream_sql(self, value):
self._stream_sql = value
1 change: 1 addition & 0 deletions aws_xray_sdk/ext/django/apps.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ def ready(self):
dynamic_naming=settings.DYNAMIC_NAMING,
streaming_threshold=settings.STREAMING_THRESHOLD,
max_trace_back=settings.MAX_TRACE_BACK,
stream_sql=settings.STREAM_SQL,
)

# if turned on subsegment will be generated on
Expand Down
1 change: 1 addition & 0 deletions aws_xray_sdk/ext/django/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
'DYNAMIC_NAMING': None,
'STREAMING_THRESHOLD': None,
'MAX_TRACE_BACK': None,
'STREAM_SQL': True,
}

XRAY_NAMESPACE = 'XRAY_RECORDER'
Expand Down
56 changes: 47 additions & 9 deletions aws_xray_sdk/ext/django/db.py
Original file line number Diff line number Diff line change
@@ -1,29 +1,62 @@
import copy
import logging
import importlib

from django.db import connections

from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.ext.dbapi2 import XRayTracedCursor

log = logging.getLogger(__name__)


def patch_db():

for conn in connections.all():
module = importlib.import_module(conn.__module__)
_patch_conn(getattr(module, conn.__class__.__name__))


def _patch_conn(conn):

attr = '_xray_original_cursor'
class DjangoXRayTracedCursor(XRayTracedCursor):
def execute(self, query, *args, **kwargs):
if xray_recorder.stream_sql:
_previous_meta = copy.copy(self._xray_meta)
self._xray_meta['sanitized_query'] = query
result = super(DjangoXRayTracedCursor, self).execute(query, *args, **kwargs)
if xray_recorder.stream_sql:
self._xray_meta = _previous_meta
return result

def executemany(self, query, *args, **kwargs):
if xray_recorder.stream_sql:
_previous_meta = copy.copy(self._xray_meta)
self._xray_meta['sanitized_query'] = query
result = super(DjangoXRayTracedCursor, self).executemany(query, *args, **kwargs)
if xray_recorder.stream_sql:
self._xray_meta = _previous_meta
return result

def callproc(self, proc, args):
if xray_recorder.stream_sql:
_previous_meta = copy.copy(self._xray_meta)
self._xray_meta['sanitized_query'] = proc
result = super(DjangoXRayTracedCursor, self).callproc(proc, args)
if xray_recorder.stream_sql:
self._xray_meta = _previous_meta
return result


def _patch_cursor(cursor_name, conn):
attr = '_xray_original_{}'.format(cursor_name)

if hasattr(conn, attr):
log.debug('django built-in db already patched')
log.debug('django built-in db {} already patched'.format(cursor_name))
return

if not hasattr(conn, cursor_name):
log.debug('django built-in db does not have {}'.format(cursor_name))
return

setattr(conn, attr, conn.cursor)
setattr(conn, attr, getattr(conn, cursor_name))

meta = {}

Expand All @@ -45,7 +78,12 @@ def cursor(self, *args, **kwargs):
if user:
meta['user'] = user

return XRayTracedCursor(
self._xray_original_cursor(*args, **kwargs), meta)
original_cursor = getattr(self, attr)(*args, **kwargs)
return DjangoXRayTracedCursor(original_cursor, meta)

setattr(conn, cursor_name, cursor)

conn.cursor = cursor

def _patch_conn(conn):
_patch_cursor('cursor', conn)
_patch_cursor('chunked_cursor', conn)
3 changes: 2 additions & 1 deletion aws_xray_sdk/ext/sqlalchemy/util/decorators.py
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,8 @@ def wrapper(*args, **kw):
if isinstance(arg, XRayQuery):
try:
sql = parse_bind(arg.session.bind)
sql['sanitized_query'] = str(arg)
if xray_recorder.stream_sql:
sql['sanitized_query'] = str(arg)
except Exception:
sql = None
if sql is not None:
Expand Down
87 changes: 87 additions & 0 deletions tests/ext/django/test_db.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
import django

import pytest

from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core.context import Context
from aws_xray_sdk.ext.django.db import patch_db


@pytest.fixture(scope='module', autouse=True)
def setup():
django.setup()
xray_recorder.configure(context=Context(),
context_missing='LOG_ERROR')
patch_db()


@pytest.fixture(scope='module')
def user_class(setup):
from django.db import models
from django_fake_model import models as f

class User(f.FakeModel):
name = models.CharField(max_length=255)
password = models.CharField(max_length=255)

return User


@pytest.fixture(
autouse=True,
params=[
False,
True,
]
)
@pytest.mark.django_db
def func_setup(request, user_class):
xray_recorder.stream_sql = request.param
xray_recorder.clear_trace_entities()
xray_recorder.begin_segment('name')
try:
user_class.create_table()
yield
finally:
xray_recorder.clear_trace_entities()
try:
user_class.delete_table()
finally:
xray_recorder.end_segment()


def _assert_query(sql_meta):
if xray_recorder.stream_sql:
assert 'sanitized_query' in sql_meta
assert sql_meta['sanitized_query']
assert sql_meta['sanitized_query'].startswith('SELECT')
else:
if 'sanitized_query' in sql_meta:
assert sql_meta['sanitized_query']
# Django internally executes queries for table checks, ignore those
assert not sql_meta['sanitized_query'].startswith('SELECT')


def test_all(user_class):
""" Test calling all() on get all records.
Verify we run the query and return the SQL as metadata"""
# Materialising the query executes the SQL
list(user_class.objects.all())
subsegment = xray_recorder.current_segment().subsegments[-1]
sql = subsegment.sql
assert sql['database_type'] == 'sqlite'
_assert_query(sql)


def test_filter(user_class):
""" Test calling filter() to get filtered records.
Verify we run the query and return the SQL as metadata"""
# Materialising the query executes the SQL
list(user_class.objects.filter(password='mypassword!').all())
subsegment = xray_recorder.current_segment().subsegments[-1]
sql = subsegment.sql
assert sql['database_type'] == 'sqlite'
_assert_query(sql)
if xray_recorder.stream_sql:
assert 'mypassword!' not in sql['sanitized_query']
assert '"password" = %s' in sql['sanitized_query']
13 changes: 9 additions & 4 deletions tests/ext/flask_sqlalchemy/test_query.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,10 +22,15 @@ class User(db.Model):
password = db.Column(db.String(255), nullable=False)


@pytest.fixture()
def session():
@pytest.fixture(
params=[
False,
True,
],
)
def session(request):
"""Test Fixture to Create DataBase Tables and start a trace segment"""
xray_recorder.configure(service='test', sampling=False, context=Context())
xray_recorder.configure(service='test', sampling=False, context=Context(), stream_sql=request.param)
xray_recorder.clear_trace_entities()
xray_recorder.begin_segment('SQLAlchemyTest')
db.create_all()
Expand All @@ -41,8 +46,8 @@ def test_all(capsys, session):
User.query.all()
subsegment = find_subsegment_by_annotation(xray_recorder.current_segment(), 'sqlalchemy', 'sqlalchemy.orm.query.all')
assert subsegment['annotations']['sqlalchemy'] == 'sqlalchemy.orm.query.all'
assert subsegment['sql']['sanitized_query']
assert subsegment['sql']['url']
assert bool(subsegment['sql'].get('sanitized_query', None)) is xray_recorder.stream_sql


def test_add(capsys, session):
Expand Down
1 change: 1 addition & 0 deletions tox.ini
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ deps =
future
# the sdk doesn't support earlier version of django
django >= 1.10, <2.0
django-fake-model
pynamodb >= 3.3.1
psycopg2
pg8000
Expand Down