Skip to content

Allow config for switching between thread locals and contextvars for execution_context #451

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ramshaw888 opened this issue Apr 8, 2019 · 4 comments

Comments

@ramshaw888
Copy link
Contributor

ramshaw888 commented Apr 8, 2019

Is your feature request related to a problem? Please describe.
I'm trying to get some preliminary support working for apps that use gevent.

In Python<3.7 the agent uses thread locals to back the execution_context object, while in Python>=3.7 it uses context vars. The functionality to switch between the two was introduced in #291 which was 'a partial "import" of the asyncio work done in #252"'.

I've had some success in using the agent with Python<3.7 project because gevent monkey patches thread locals to be "greenlet-safe" (see here). In >=3.7 however the agent defaults to using contextvars which are not monkey patched by gevent, resulting in race conditions where multiple flask request transactions share the same contextvars.

Describe the solution you'd like
My proposed solution is to have an optional config value which determines whether to use thread locals, or contextvars as the "backing" for the execution_context object. If the config value is not present, then it should fall back to the existing logic.

Describe alternatives you've considered
Another option is to check whether or not thread locals has been monkey patched by gevent, and then assuming use of thread locals instead of contextvars. I understand that elastic has not fully committed to gevent support which is why I think an optional config might be a better solution to begin with.

Additional context
Additionally, as suggested by @bhodorog (#16 (comment)) this would pave the way for alternative "thread" storage options like werkzeug locals.

@beniwohli
Copy link
Contributor

@ramshaw888 thanks for the excellent write up! I think introducing a setting for this makes a lot of sense, with the dotted import path of the execution context class, similar to what we do with the transport class:

transport_class = _ConfigValue("TRANSPORT_CLASS", default="elasticapm.transport.http.AsyncTransport", required=True)

self._transport = import_string(self.config.transport_class)(self._api_endpoint_url, **transport_kwargs)

What complicates this a bit is that configuration is bound to the agent instance, while this is setting should apply process-wide. This isn't the first time something like process-wide settings come up (e.g. we'd would like to make it possible to configure auto-instrumentation via setting), so it's probably a good time to come up with a strategy for it.

@bhodorog
Copy link

bhodorog commented Apr 9, 2019

@beniwohli I wonder if the (emergency) solution proposed by @ramshaw888 should be configurable. In this case the reason for using elasticapm.context.threadlocal for elasticapm.traces.execution_context is to ensure using the agent within a gevent based application will correctly create a transaction for each greenlet. There is no point in configuring a gevent based application to use elasticapm.context.contextvars.
So, I'm thinking since using gevent.mokey.patch_* is process-wide and quite definitive (you cannot unpatch the current process) we should actually check if gevent.monkey has been used when the elasticapm agent is being initialized and enforce the use of elasticapm.context.threadlocal.
What do you think?

Also, just a complementary note to @ramshaw888's excellent summary:
The reason gevent is not monkey-patching contextvars is due to it being implemented in c, while gevent requires the code to be monkey-patched to be pure python.

@ramshaw888
Copy link
Contributor Author

On second thoughts I do agree with @bhodorog's suggestion to enforce (or at least default to) using elasticapm.context.threadlocal when we know it has been monkeypatched. Just had a play around and it seems pretty trivial to do this -- #453

I originally suggested the optional config approach because it's more generic and would be helpful in other use cases other this. The 2 possible ways I see this could be achieved are:

  • Having a separate config (maybe using envvars? config file?) which would be read at the time the client app imports elasticapm.
  • Attaching the execution_context (and other process-wide objects) to the agent instance.

Without knowing the codebase super well yet, I strongly lean towards the latter option. Splitting the config sounds dirty, and I don't really see a strong reason why the execution_context can't be tied to the agent.

@beniwohli
Copy link
Contributor

@ramshaw888 I don't think attaching execution context to the agent instance is an option with the current API. Thinks like elasticapm.capture_span() don't have a reference to an agent instance and need the execution context to get a reference to the current transaction.

At some point we might need to redesign those APIs that currently work without an agent instance to require one, but that is a larger piece of work.

I like your solution in #453 as a "quick fix" to get gevent back to working (more or less).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants