Skip to content

patch_all should take a flag for an alternative patching strategy #63

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
xplorld opened this issue May 30, 2018 · 6 comments
Closed

patch_all should take a flag for an alternative patching strategy #63

xplorld opened this issue May 30, 2018 · 6 comments

Comments

@xplorld
Copy link
Contributor

xplorld commented May 30, 2018

I created a lambda calling itself recursively for 3 times.
The lambda is set to“Enable active tracing”, and called patch_all()
It uses boto3 to send request.

Code:

import boto3
import json
import requests
from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core import patch_all

patch_all()

client = boto3.client('lambda')

def lambda_handler(event, context):
    action = event['action']
    return globals()[action](event, context)


# input event['i']
# output {'result': j}, where j = 1^2 + 2^2 + .. + i^2
def square_sum(event, context):
    i = event['i']
    if i <= 0:
        return {'result': 0}
    if i == 1:
        return {'result': 1}
    payload_str = json.dumps({'action': 'square_sum', 'i': i-1})
    payload_b = bytes(payload_str, 'utf-8')
    tailcall = client.invoke(FunctionName='Test-Xray-Caller',
                             InvocationType='RequestResponse',
                             Payload=payload_b)
    tailcall_result = json.loads(tailcall['Payload'].read())
    result = tailcall_result['result'] + i * i
    return {'result': result}

Expected graph

image

Real graph

image

Thinkings

In my understanding, a "remote" service means X-Ray does not recognize what endpoint it is, e.g. a non-AWS service. A call to AWS Lambda, shall have no "remote" services, with only one clean edge from the caller AWS service to the callee AWS service.

This redundant edge occurs more than here: When I call from lambda to Elastic Beanstalk endpoint, it still has "remote" services.

How do I get rid of them?

@haotianw465
Copy link
Contributor

Hi, I answered this question in a stack overflow post here.

But I copy/paste my answer here regardless:

The node you see comes from this PR #19 for adding support to httplib. If you use patch_all httplib operations will also be captured. In your case the dependency tree is boto3 -> botocore -> vendored requests -> httplib. So your AWS subsegment will have children subsegments that representing operations executed by httplib.

The service graph however will render a "remote" subsegment as a node which is what you see. The recommended way is to explicit patch the library you want to capture to avoid unexpected behavior.

@xplorld
Copy link
Contributor Author

xplorld commented May 31, 2018

Ok I understand what happens.

However, I do not think it is a good design to leave patch_all() as it is now, for a user may imagine patch_all() is a ready-to-use one-line patch that just traces every call exactly once.

Maybe we can do something to avoid this. For example, whenever a httplib call is traced, the tracing code examines whether the xray_recorder examines current tracing environment.

  • context is segment, trace.
  • context is subsegment which is local, trace.
  • context is subsegment which is aws or remote, DO NOT trace.

@xplorld
Copy link
Contributor Author

xplorld commented May 31, 2018

Actually IMHO the SDK should ignore any new subsegment in a remote subsegment context.

@haotianw465
Copy link
Contributor

Thank you for the suggestions. These are very good points. There are use cases where user want to trace both boto3 and httplib as boto3 tracing doesn't have as much granularity as httplib (retries, establishing connection etc). In that case although the service graph is not in an ideal shape due to lack of a mechanism to do custom aggregation, but trace timeline has much more information on the lowest python level information.

But yes to your point the redundant nodes are confusing. I propose to have patch_all takes a flag of supporting indirect dependency or not. By default if library A and B are both supported but A depends on B, patch_all() will not patch B. But with a flag set to True, if will patch all libraries regardless of their relationships.

Please let me know your thoughts on this.

@xplorld
Copy link
Contributor Author

xplorld commented Jun 1, 2018

Extra argument on patch_all is great. I'd love to see that.

@haotianw465 haotianw465 changed the title Redundant Edge in Service Map patch_all should takes a flag for an alternative patching strategy Jun 4, 2018
@haotianw465 haotianw465 changed the title patch_all should takes a flag for an alternative patching strategy patch_all should take a flag for an alternative patching strategy Jun 4, 2018
@haotianw465
Copy link
Contributor

This is added in release 2.0.0 with default to not do double patching.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants