-
Notifications
You must be signed in to change notification settings - Fork 145
Support boto3 S3 transfer manager #52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I just spent over a day digging deep into why the segment could not be found (suspecting other components at first). The current situation has a bad failure mode, the "additive" nature of "patch_all" actually breaks the basic AWS API component, boto3 with an error that points you into a completely wrong direction (namely, that no segment is defined, when clearly, there is). My minimal reproducer: from aws_xray_sdk.core import patch_all, xray_recorder
patch_all()
xray_recorder.begin_segment("pytest")
import boto3
bucket = boto3.resource('s3').create_bucket(Bucket="test23243534534634")
obj = bucket.Object('file.txt')
obj.put(Body="body")
assert b"body" == obj.get()['Body'].read()
import tempfile
with tempfile.NamedTemporaryFile(mode="r") as f:
obj.download_file(f.name)
assert "body" == open(f.name).read() raises:
|
Hey @haotianw465 thanks for the detail, been grappling with this issue for a long time without knowing. Any suggestion on what we should do to allow tracing with S3 TransferManager in the meantime? Or is it such a headache that you'd recommend we skip tracing when using S3 TransferManager for the time being? |
Is there any news on this, or an easy workaround? I'd like to keep tracing enabled if possible. |
I was recently burned by this. Any update on a fix? |
Unfortunately, xray sdk and other popular sdks such as OpenTelemetry don't support this case yet. Instead of supporting new feature in XRay SDKs, we want to support it in OpenTelemetry first. Will cut a request in OTel community. |
I've spent ~3 hours to troubleshoot this. I would vote for giving a meaningful error message at least, to not waste anyone's hours. |
Hi @eercanayar, did you find any workaround for this? I'd like to disable X-Ray tracing only for the S3 client but I'm not entirely sure that's currently possible. |
I have done something like this to temporarily disable X-Ray, and at least in my case it was enough to get around the issue. Maybe it helps?
And then call where needed.
It's a bit fiddly, but at the time I didn't find a better workaround. |
Hi @radimsuckr, I've migrated to put-object API since my goal was to upload a single file. I'll check this issue internally to see if our SDK team can resolve this, at least not triggering an Exception for non-supported APIs. |
@SpoonOfDoom thank you, I'm trying something similar but I'm afraid I won't get anywhere because of multiple threads overwriting each other's value of the env variable. 😞 @eercanayar sadly that's not possible for me, we need the multipart download/upload. Of course the best outcome would be a fix for this issue, but acceptable workaround for now is not triggering the error as you mentioned. Thank you for checking. We set |
Hi @eercanayar, did you have by any chance time for the check you mentioned? Thanks! |
Hi all, Sorry for the frustration with this issue. I've filed #334 to suggest the short term fix that was mentioned. We hope to get to it soon. In the meantime, even though as @wangzlei mentioned above this is a hard problem that OpenTelemetry (OTel) Python has not even solved with their AWS instrumentation, we still recommend onboarding with them because OTel Python is getting a lot of support and it's compatibility with AWS X-Ray has been GA since September of 2021. It should be easy to on-board, should give you more features and easier support 🙂 |
Hi @radimsuckr, I was away for several weeks but we were discussing the most optimal short term fix, internally. Now, we're discussing on #334 to have a third option as Thanks @NathanielRN for your bias for action! 🚀 |
the patch_all method wrongly patches some of the aws sdk. In order to use xray with threads you have to pass the thread context to the new threads, let's say I have a function: def example_function(test, trace_entity=None):
if trace_entity is not None:
xray_recorder.set_trace_entity(trace_entity)
return test
# When you invoke the function:
with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
executor.submit(example_function, test, trace_entity=xray_recorder.get_trace_entity()) If you don't do this you end up with errors similar to this. This issue may not seem urgent but my challenge is this is causing a hard time finding where our code is breaking traces, since I can't troubleshoot OUR exceptions for missing context - I can't even get to them even with several code changes already trying to work around this. This is a pretty significant bug when basic AWS SDK functions don't work after calling patch_all. For now my work around will be to remove patch_all, but I also need AWS SDK timing to know if S3 one zone express solves some performance problems I'm working on :-( Hopefully someone familiar with patch_all can test my above ideas and maybe get a PR made. |
boto3 has a few high-level S3 API calls like
upload_file
,download_file
which depends on s3transfer to perform multi-threaded object puts/gets to increase through put. This will result in aSegmentNotFoundException
as the X-Ray recorder tries to capture the "real" S3 API call but it loses the context because the actual http call is in a worker thread from the thread pool.The S3 transfer manager under the hood uses
futures.ThreadPoolExecutor
per https://github.com/boto/s3transfer/blob/develop/s3transfer/futures.py#L370 but there is no proper API onboto3
client level to propagate context from user code. And requiring user code changes is not a good customer experience.The SDK should somehow monkey-patch S3 transfer manager so that it automatically propagate context to all worker threads so each http outbound call is captured and attached to its parent segment or subsegment properly.
Library
django-storages
or any storage library that supports S3 as back-end and usesboto3
certain APIs might face the same issue.More detailed technical deep dive could be found here: #4.
The text was updated successfully, but these errors were encountered: