
Allow overriding the global tracer_provider after tracer creation #1445


Closed
wants to merge 11 commits

Conversation

anton-ryzhov
Contributor

@anton-ryzhov anton-ryzhov commented Dec 4, 2020

Description

Allow overriding the global tracer_provider after tracer creation.

With this change, any module may do

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

and wire that tracer into its code (e.g. by applying the start_as_current_span decorator).

Such modules may then be safely imported into the main app before the real tracer provider is initialized/configured. All tracers created earlier (at import time) will use that provider.

Fixes:
#1159
#1276
https://gitter.im/open-telemetry/opentelemetry-python?at=5fc8be67f46e246609860465

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

submodule.py

from opentelemetry import trace

tracer = trace.get_tracer(__name__)


@tracer.start_as_current_span("submodule decorator created before configuration")
def echo(text):
    with tracer.start_as_current_span("submodule context manager"):
        print(text)

main.py

import submodule

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    ConsoleSpanExporter,
    SimpleExportSpanProcessor,
)

tracer = trace.get_tracer("__main__")


@tracer.start_as_current_span("another decorator created before configuration")
def main():
    with tracer.start_as_current_span("submodule context manager"):
        submodule.echo("Hello world from OpenTelemetry Python!")

def init():
    trace.set_tracer_provider(TracerProvider())
    trace.get_tracer_provider().add_span_processor(
        SimpleExportSpanProcessor(ConsoleSpanExporter())
    )

init()
main()

Does This PR Require a Contrib Repo Change?

Answer the following question based on these examples of changes that would require a Contrib Repo Change:

  • The OTel specification has changed which prompted this PR to update the method interfaces of opentelemetry-api/ or opentelemetry-sdk/

  • The method interfaces of opentelemetry-instrumentation/ have changed

  • The method interfaces of test/util have changed

  • Scripts in scripts/ that were copied over to the Contrib repo have changed

  • Configuration files that were copied over to the Contrib repo have changed (when consistency between repositories is applicable) such as in

    • pyproject.toml
    • isort.cfg
    • .flake8
  • When a new .github/CODEOWNER is added

  • Major changes to project information, such as in:

    • README.md
    • CONTRIBUTING.md
  • Yes. - Link to PR:

  • No.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@anton-ryzhov anton-ryzhov requested review from a team, owais and aabmass and removed request for a team December 4, 2020 11:54
@linux-foundation-easycla

linux-foundation-easycla bot commented Dec 4, 2020

CLA Signed

The committers are authorized under a signed CLA.

@anton-ryzhov anton-ryzhov marked this pull request as draft December 4, 2020 11:54
@anton-ryzhov anton-ryzhov force-pushed the override-global branch 3 times, most recently from 4c9d348 to 069b7e3 Compare December 4, 2020 12:17
@anton-ryzhov anton-ryzhov marked this pull request as ready for review December 4, 2020 12:29
@anton-ryzhov
Contributor Author

Any comments?

Member

@aabmass aabmass left a comment

Thanks for the PR! We briefly discussed this approach in the last SIG meeting (cc @owais).

I am worried this is a little too much magic. I think the user should have to explicitly set the providers at initialization and we should encourage doing it that way.

Another thing is get_tracer() should just be a convenience alias for get_tracer_provider().get_tracer(). Here, they do different things. I guess you would have to move the proxying to be at the provider level to fix this.
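
For reference, the convenience wrapper is roughly this today (a simplified sketch, not the exact source):

def get_tracer(instrumenting_module_name, instrumenting_library_version="", tracer_provider=None):
    # delegate to the globally registered provider unless one is passed explicitly
    if tracer_provider is None:
        tracer_provider = get_tracer_provider()
    return tracer_provider.get_tracer(
        instrumenting_module_name, instrumenting_library_version
    )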

_TRACER_PROVIDER = None


@functools.lru_cache(maxsize=128) # type: ignore
Member

I don't think we can have a maxsize here. If you call provider.get_tracer() multiple times (even with the SDK one), it will not give you back the same tracer. Here's what the spec says:

It is unspecified whether or under which conditions the same or different Tracer instances are returned from this functions.

So if the cache was full, it would start giving new tracers even though the user is not asking for a new one themselves.
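
To illustrate the concern with a toy example (not this PR's code; make_tracer is just a stand-in):

import functools

@functools.lru_cache(maxsize=2)
def make_tracer(name):
    return object()  # stand-in for "a new Tracer instance"

first = make_tracer("a")
make_tracer("b")
make_tracer("c")                   # evicts the entry for "a"
print(first is make_tracer("a"))   # False: the caller silently gets a different tracer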

Contributor Author

A linter was complaining about maxsize not being defined. But it can be set to None to make the cache unbounded, which makes sense here; I'll change it.

tracer_provider = get_tracer_provider()
return tracer_provider.get_tracer(*args, **kwargs) # type: ignore
Member

I think the user would get a stale DefaultTracer if the cache_clear() below happened while this body was executing. There were no concurrency guarantees around the globals here (because it was intended to be called once during initialization), so this might need some other additions.

Contributor Author

I think the user would get a stale DefaultTracer if the cache_clear() below happened while this body was executing.

And it will definitely get the old tracer if it is called before set_tracer_provider(). So yes, all traces produced before tracing is fully set up will be lost, exactly the same way as now.

But the main idea of this PR is that all traces after configuration will be handled. Currently there is no way to do that, so I think it's an improvement.

because it was intended to be called once in initialization

I agree. That should be configured early, during initialization, before spawning threads, etc. All I want here is to do that initialization after most imports and maybe after a few other calls.

Member

@aabmass aabmass Dec 9, 2020

I mean the ProxyTracer will be stuck proxying to a DefaultTracer because of the cache. I don't think cache_clear() is thread-safe. In that case, all spans for that tracer will be lost even after configuration.

  1. thread 1 has a proxy tracer call _get_current_tracer() for the first time (or when the cache is full). L483 executes right before a context switch
  2. thread 2 calls set_tracer_provider(), which calls cache_clear() and empties the cache so proxies will use the real provider
  3. _get_current_tracer() from step 1 above finishes and overwrites the cache with a DefaultTracer for that proxy
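
As a toy re-enactment of that interleaving (a plain dict standing in for the lru_cache and strings standing in for tracers; illustrative only, not the PR's code):

cache = {}

# step 1: thread 1 computes the value from the current (still default) provider,
# then is suspended just before writing it into the cache
stale_value = "DefaultTracer"

# step 2: thread 2 sets the real provider and clears the cache
cache.clear()

# step 3: thread 1 resumes and writes its stale result back
cache["proxy-1"] = stale_value
print(cache)  # {'proxy-1': 'DefaultTracer'} -- the proxy stays stuck on the no-op tracer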

Contributor Author

Hm, valid point.

In the pure-Python implementation, lru_cache uses a lock for the bounded cache but not for the unbounded one.

In the real world the Python version is overridden by the C implementation, which does not call Py_BEGIN_ALLOW_THREADS and relies on the GIL for synchronization.

So this could only happen if there is no C implementation, set_tracer_provider is called after threads have already been spawned, and the context switch happens exactly there… I think it's rather safe to do it this way.

Member

Nice find. We do support PyPy and I'd rather be safe than sorry regarding when threads are spawned. I believe WSGI/ASGI runners could start multiple threads early on, outside of the programmer's control?

Contributor

Do we need to use a cache for this? The fact that we need to look at implementation details of different runtimes makes me not want to use it for this purpose. I feel like we are abusing it by using it in a place where we don't really need a cache.

What if the proxy called get_tracer_provider() on every call until it received a non-default one and then used it from there on?

Something like:

class ProxyTracer:

  def __init__(self, instrumentation_name, instrumentation_version):
    self._real_tracer = None
    self._noop_tracer = NoopTracer()
    self._inst_name = instrumentation_name
    self._inst_version = instrumentation_version

  @property
  def _tracer(self):
    if self._real_tracer:
      return self._real_tracer

    provider = get_tracer_provider()
    if not isinstance(provider, DefaultTracerProvider):
        self._real_tracer = provider.get_tracer(self._inst_name, self._inst_version)
        return self._real_tracer

    return self._noop_tracer

  def start_span(self):
    return self._tracer.start_span()

  # rest of tracer api methods that proxy calls to `self._tracer`

Assuming the default/noop tracer provider's get_tracer() method would always return a ProxyTracer instance. Unless I'm missing something obvious, these two changes should solve this problem no matter whether the convenience methods are used or the OTel API is used directly.
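
The other half of that assumption could look something like this (sketch only, reusing the ProxyTracer class above; DefaultTracerProvider here stands for the API's no-op provider):

class DefaultTracerProvider:

  def get_tracer(self, instrumenting_module_name, instrumenting_library_version=None):
    # every tracer obtained through the default provider can later upgrade
    # itself once a real provider is set
    return ProxyTracer(instrumenting_module_name, instrumenting_library_version)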

Contributor Author

What if the proxy called get_tracer_provider() on every call until it received a non-default one and then used it from there on?

And what if a real provider is never set at all because tracing is disabled for this instance? Then instances with tracing enabled may work faster than ones where tracing is disabled?

Something like:

Didn't you just rewrite the cache mechanism yourself? Then why not use the standard one?

My implementation reuses the same tracer instance for all proxies; yours creates a new one for each.

I don't think it's critical to have exactly one instance, but why not, if it's so easy.

Contributor

Didn't you just rewrite the cache mechanism yourself? Then why not use the standard one?

No. It's not a (LRU) cache, just a plain wrapper that is very simple and clear to read/understand. When I see something using an LRU cache, it usually means the code has been added as a performance optimization and we can safely assume that the cache can be invalidated at any time. Our problem is not that creating tracers is an expensive operation. Our problem is that we need to be able to create a tracer implementation that can lazily upgrade itself from a noop to a real implementation. Just because a solution for another problem "works" doesn't mean we should use it.

What if the proxy called get_tracer_provider() on every call until it received a non-default one and then used it from there on?

And what if a real provider is never set at all because tracing is disabled for this instance? Then instances with tracing enabled may work faster than ones where tracing is disabled?

If tracing is disabled somehow then the tracer won't work as expected.

@anton-ryzhov
Contributor Author

Thanks for your reply.

I think the user should have to explicitly set the providers at initialization and we should encourage doing it that way.

But how can that practically be implemented in the real world?

That also depends on the definition of "user". Are they only "end users", i.e. authors of an application (the executable entry point)? Shouldn't the library also be usable by authors of other libraries?

See my example in the first post: imagine I'm writing only submodule.py. How do I use opentelemetry there?

Should I set some provider there before calling get_tracer? That's impossible: a library shouldn't know or care about it; only the importer should define/configure that.

Should I only create the tracer dynamically on each call:

def echo(text):
    with trace.get_tracer(__name__).start_as_current_span("submodule context manager"):
        print(text)

?
That forbids the rather handy decorator-style usage. It also creates a one-shot tracer for every call, which is not very efficient.

And as the author of an application, should I then configure and set the TracerProvider before importing anything else, any 3rd-party module or any of my own modules?

And what if I need to read configuration to set up opentelemetry? What if that config should be read from an external source, and the client for that source should also be instrumented? Catch-22.

Another thing is get_tracer() should just be a convenience alias for get_tracer_provider().get_tracer(). Here, they do different things. I guess you would have to move the proxying to be at the provider level to fix this.

I tried that first too, but it didn't work well. I made a ProxyTracerProvider which always returned a ProxyTracer and allowed binding a real TracerProvider to it. But to make everything work as expected, that ProxyTracerProvider must be a singleton and must wrap exactly one TracerProvider, and that provider should proxy all provider calls as well. In the end it becomes roughly the same code, but with extra complexity for no reason.

The provided implementation is more lightweight, straightforward and backward-compatible.

@aabmass
Member

aabmass commented Dec 9, 2020

That also depends on the definition of "user". Are they only "end users", i.e. authors of an application (the executable entry point)?

And as the author of an application, should I then configure and set the TracerProvider before importing anything else, any 3rd-party module or any of my own modules?

Yes, I think that is the intention of OTel. Library authors should instrument only with the opentelemetry-api, and the "users" (i.e. the person writing the executable entry point) should configure the providers before the import statements.

And what if I need to read configuration to set up opentelemetry? What if that config should be read from an external source, and the client for that source should also be instrumented? Catch-22.

This is a good use case, but I don't think this PR would solve it either, right? The spans created by that client would be lost either way. Btw, you can already do something similar to this PR by setting the global provider and then adding additional config to it later if it needs to be fetched from other instrumented modules. That way you only need to set the provider before other imports:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchExportSpanProcessor

# before other imports
trace.set_tracer_provider(TracerProvider())

# do any imports in any order you like ...
from myconfig import fetch_otel_config

config = fetch_otel_config()

trace.get_tracer_provider().add_span_processor(
    BatchExportSpanProcessor(config.create_span_exporter())
)

What do you think? Same as this PR, all spans before adding the processor will be lost.

The provided implementation is more lightweight, straightforward and backward-compatible.

If get_tracer() and get_tracer_provider().get_tracer() do different things, it's not backward compatible though? It is also confusing IMO.

This function is a convenience wrapper for
opentelemetry.trace.TracerProvider.get_tracer.

@anton-ryzhov
Contributor Author

anton-ryzhov commented Dec 10, 2020

This is a good use case, but I don't think this PR would solve it either, right? The spans created by that client would be lost either way.

I'm not trying to catch traces that happened before configuration; that's impractical. With this PR I'm allowing configuration to be postponed to a later step: after imports but before the main business logic runs.

Btw, you can already do something similar to this PR by setting the global provider and then adding additional config to it later if it needs to be fetched from other instrumented modules.

Hm… okay, I didn't know about that hack. This will make late configuration possible.

But I still think it's too hacky and limited.

Is TracerProvider the only provider one can set? If yes, why isn't it the default? Maybe the SDK should then set it on import?

But if there is an alternate SDK implementation, that won't allow deciding on it at a later step.

Anyway, the rule that some custom code must be executed at the very beginning, even before imports, to allow configuration looks like an unexpected, strange and unobvious requirement.

If get_tracer() and get_tracer_provider().get_tracer() do different things, it's not backward compatible though?

I said "more compatible", not "completely compatible". Adding a ProxyTracerProvider would change the APIs more; now they are roughly the same.

If anyone calls get_tracer_provider().get_tracer() directly, then maybe they know why they are doing so. To me it's rather expected to get a static tracer from that exact provider with get_tracer_provider().get_tracer(), even if get_tracer() returns a proxy.

It is also confusing IMO

The comment should be changed, yes.

@aabmass
Member

aabmass commented Dec 10, 2020

Anyway, the rule that some custom code must be executed at the very beginning, even before imports, to allow configuration looks like an unexpected, strange and unobvious requirement.

Thanks for the feedback, I agree it's not obvious and should be better documented. I'm gonna bring this discussion into the issue, because I think we just need to make a decision on the approach.

@anton-ryzhov
Contributor Author

Branch rebased onto current master

@lzchen
Contributor

lzchen commented Jan 21, 2021

@aabmass @lonewolf3739
Have your comments been addressed?

@srikanthccv
Member

@aabmass @lonewolf3739
Have your comments been addressed?

My comments have been addressed.

Member

@aabmass aabmass left a comment

Sorry for the slow review everyone!

@anton-ryzhov I think both the comments I made here (about differing behavior) would be resolved if we proxy all the way to the tracer provider level. Is it very nasty to implement it this way?

"""
if tracer_provider is None:
tracer_provider = get_tracer_provider()
Member

@anton-ryzhov it looks like get_tracer() is no longer a simple shortcut for get_tracer_provider().get_tracer():

  • If the user has already set the global provider, they behave the same
  • If the user has set OTEL_PYTHON_TRACER_PROVIDER, they behave the same
  • If the user has NOT set either of the above, then get_tracer() will give proxies and get_tracer_provider().get_tracer() will give no-ops.

Is that correct?

Contributor Author

get_tracer always returns a proxy. It may be a no-op or not depending on whether a tracer provider is set.

Contributor

  1. I think trace.get_tracer() and get_tracer_provider().get_tracer() should behave the same. IMO this implementation should not touch trace.get_tracer() and instead update the default tracer provider's get_tracer() method to make this happen.
  2. Ideally, it would be nice if a proxy were only returned when necessary, but this is not a huge deal.

I added another comment above proposing a simpler/dumber solution but I'm not sure if I'm missing something.

Contributor Author

Contributor

I read the comment. What exactly breaks here? Doesn't the same thing break if we patch the convenience method instead of the real one?

One might argue this doesn't break anything if the object returned satisfies the tracer interface and in some cases actually makes tracing viable. It would actually be a bug fix in that case. No one is relying on this edge case to disable tracing for the lifetime of their process :)

I strongly think this should be solved at the innermost layer so it works in every possible scenario, instead of fixing one wrapper.

Contributor Author

When should get_tracer_provider return a proxy, when _TRACER_PROVIDER, and when the result of _load_trace_provider? How should it distinguish between them?

If you know how to redo it better, maybe you can replace this PR with your own? It's hard for me to guess, and it's not reasonable to speculate and copy-paste your short samples from comments.

Contributor

OK, here is a working POC doing what I was suggesting. It solves the issues linked in this PR. The solution aligns with what was agreed upon in the SIG, and the implementation is clear enough to understand/maintain. It also does not fix a single convenience method but instead fixes the issue no matter how a user tries to fetch a tracer. It does not always return a proxy, so there are no unnecessary attribute lookups in cases where this is not a problem. It always returns the same tracer, but it can be updated to return the same tracer per unique combination of instrumentation name + library version using a simple dict or a cache like you did.

https://github.com/open-telemetry/opentelemetry-python/pull/1726/files
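
The gist of it, as a rough sketch (not the actual code in that PR): the proxy provider itself hands out one proxy tracer per (instrumentation name, version) pair using a plain dict.

from collections import namedtuple

# stand-in for the ProxyTracer sketched earlier in this thread
ProxyTracer = namedtuple("ProxyTracer", ["instrumentation_name", "instrumentation_version"])

class ProxyTracerProvider:

    def __init__(self):
        self._tracers = {}

    def get_tracer(self, instrumenting_module_name, instrumenting_library_version=""):
        key = (instrumenting_module_name, instrumenting_library_version)
        if key not in self._tracers:
            self._tracers[key] = ProxyTracer(*key)
        return self._tracers[key]

provider = ProxyTracerProvider()
assert provider.get_tracer("foo", "1.0") is provider.get_tracer("foo", "1.0")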

Comment on lines +502 to +503
instrumenting_module_name=instrumenting_module_name,
instrumenting_library_version=instrumenting_library_version,
Member

Our current implementation does not use the same tracer even if you do pass the same instrumenting_module_name and instrumenting_library_version. Idk if this is a gap in our implementation of the spec or intended, but the proxy behavior is to use the same one underneath:

from opentelemetry.sdk.trace import TracerProvider

tracer_provider = TracerProvider()
args = ("foo", "version1")
tracer_1 = tracer_provider.get_tracer(*args)
tracer_2 = tracer_provider.get_tracer(*args)

# Current impl, False
tracer_1 is tracer_2

#  --------------
from opentelemetry import trace

tracer_3 = trace.get_tracer(*args)
tracer_4 = trace.get_tracer(*args)

trace.set_tracer_provider(TracerProvider())

# Proxy impl, True
tracer_3._get_current_tracer() is tracer_4._get_current_tracer()

@open-telemetry/python-approvers – thoughts? I don't think this changes the behavior of exporter metrics, but I'm not super familiar with tracing internals.

Contributor

Based on the spec, I think returning the same or different tracers is fine here; I prefer the new behaviour of returning the same tracer.

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#get-a-tracer

Base automatically changed from master to main January 29, 2021 16:49
anton-ryzhov and others added 2 commits February 2, 2021 17:58
Co-authored-by: Aaron Abbott <[email protected]>
Co-authored-by: Aaron Abbott <[email protected]>
@anton-ryzhov
Contributor Author

@aabmass,

I think both the comments I made here (about differing behavior) would be resolved if we proxy all the way to the tracer provider level. Is it very nasty to implement it this way?

If the user has NOT set either of the above, then get_tracer() will give proxies and get_tracer_provider().get_tracer() will give no-ops.

I don't think it's even possible without breaking backward compatibility. get_tracer_provider() has a side effect of setting the tracer provider, either the default one or the one configured via OTEL_PYTHON_TRACER_PROVIDER.

There are 2 options then:

  • get_tracer_provider continues to call _load_trace_provider and never returns a ProxyTracerProvider(), in which case none of this PR's code will ever run.
  • get_tracer_provider returns a ProxyTracerProvider() until a real provider is set via set_tracer_provider, meaning users must set the provider explicitly (I'd have preferred that from the beginning). Do you want to change that behavior?

@anton-ryzhov
Contributor Author

anton-ryzhov commented Mar 5, 2021

Any updates on this issue?

@owais
Contributor

owais commented Mar 5, 2021

I think we somewhat solved this with the env var solution but I wasn't deeply involved in the discussion so don't remember exactly.

get_tracer_provider returns ProxyTracerProvider() until real provider is set via set_tracer_provider — users must set provider explicitly (I'd prefer to have that from the beginning). Do you want to change that behavior?

I think providing a proxy that returns a noop provider until a real provider is set would be a good solution. There are a couple of issues that come with it.

The first one is what to do with the spans generated before a real global provider is registered. Should the proxy tracer hold on to them and flush once the real provider is available? Should they be dropped? I think dropping with a warning is a good enough tradeoff, especially since this can be progressively improved later without breaking changes.

There were also other concerns related to consistency with how MeterProviders work that I don't remember exactly.

@anton-ryzhov it might be worth bringing this up in the next SIG call and discussing there.
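
For the "drop with a warning" option, a minimal sketch (illustrative names only, not a concrete proposal) could be:

import contextlib
import logging

logger = logging.getLogger(__name__)

class ProxyTracer:
    """Spans started before a real tracer is attached are no-ops and logged; nothing is buffered."""

    def __init__(self):
        self._real_tracer = None

    def attach(self, real_tracer):
        self._real_tracer = real_tracer

    def start_as_current_span(self, name):
        if self._real_tracer is None:
            logger.warning("Dropping span %r: no TracerProvider configured yet", name)
            return contextlib.nullcontext()
        return self._real_tracer.start_as_current_span(name)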

@anton-ryzhov
Contributor Author

I think we somewhat solved this with the env var solution but I wasn't deeply involved in the discussion so don't remember exactly.

I thought you had decided here.

@owais
Contributor

owais commented Mar 5, 2021

Oh, I guess I forgot about that :)

Co-authored-by: Diego Hurtado <[email protected]>
@lzchen
Contributor

lzchen commented Mar 10, 2021

@aabmass @lonewolf3739 @owais @ocelotl
Have your comments been addressed?

@ocelotl
Contributor

ocelotl commented Mar 10, 2021

@aabmass @lonewolf3739 @owais @ocelotl
Have your comments been addressed?

Mine were very minor, approving this PR.

@owais
Contributor

owais commented Mar 10, 2021

I've clearly forgotten about this PR, so I'll need to review it once more, but if we do get 2 approvals I don't want to block it. I'll still review and open issues if I find anything I think can be improved in that case.

CHANGELOG.md Outdated
@@ -46,6 +46,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Adding `opentelemetry-distro` package to add default configuration for
span exporter to OTLP
([#1482](https://github.com/open-telemetry/opentelemetry-python/pull/1482))
- Allow to override global `tracer_provider` after `tracer`s creation
Contributor

Is this still true? We don't allow this to happen but instead have a proxy provider until a real one is set.

Contributor

Add link to PR.

Contributor Author

Updated


If tracer_provider is ommited the current configured one is used.
If tracer_provider is omitted it returns a _ProxyTracer
which redirects calls to a current instrumentation library.
"""
if tracer_provider is None:
Contributor

I'm a bit confused by this. It looks like we are assuming the user will always pass a valid/configured tracer_provider. Why not keep the existing behavior and check whether the provider passed in to the function or fetched by get_tracer_provider() is an initialized one or a default/noop one, and then return a proxy or a real tracer accordingly?

Also, doesn't this only solve the issue when users call trace.get_tracer()? What if someone calls get_tracer_provider().get_tracer() instead? Wouldn't we end up with the same problem? Unless I'm missing something, it looks like we are only solving the problem partially.

Would it make more sense to patch API package's default tracer provider and have it always return a proxy tracer?

Contributor Author

I'm a bit confused by this. It looks like we are assuming the user will always pass a valid/configured tracer_provider. Why not keep the existing behavior and check whether the provider passed in to the function or fetched by get_tracer_provider() is an initialized one or a default/noop one, and then return a proxy or a real tracer accordingly?

get_tracer_provider returns either the default one, the one configured via OTEL_PYTHON_TRACER_PROVIDER, or the one set by set_tracer_provider. How can you distinguish them?

Also, doesn't this only solve the issue when users call trace.get_tracer()? What if someone calls get_tracer_provider().get_tracer() instead? Wouldn't we end up with the same problem? Unless I'm missing something, it looks like we are only solving the problem partially.

There was the same question above: #1445 (comment)

Would it make more sense to patch API package's default tracer provider and have it always return a proxy tracer?

Then you'll need another "real default tracer" that the "default proxy tracer" will use when no real one is provided. Which problem would that solve?

@anton-ryzhov
Contributor Author

Replaced by #1726
