Skip to content

Tracing SDK - Processors get clone of SpanData which is incorrect #2726

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
cijothomas opened this issue Feb 27, 2025 · 2 comments · May be fixed by #2895
Open

Tracing SDK - Processors get clone of SpanData which is incorrect #2726

cijothomas opened this issue Feb 27, 2025 · 2 comments · May be fixed by #2895

Comments

@cijothomas
Copy link
Member

https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-sdk/src/trace/span.rs#L232-L233

Each processor gets a clone

  1. This violates the spec as changes done by one Processor is not reflected in the next.
  2. Performance issues due to clone.

Log SDK has changed the design to pass &mut instead which is performant and is spec-compliant. Cloning can be done for any processor (like BatchProcessor), if it want to store it for batching purposes.

@cijothomas cijothomas added this to the Tracing SDK Stable milestone Feb 27, 2025
@cijothomas
Copy link
Member Author

@TommyCpp Opened this as discussed in the community call.

@paullegranddc
Copy link
Contributor

Actually to me the current API fits the spec well. The spec says Even if the passed Span may be technically writable, since it's already ended at this point, modifying it is not allowed. link. So changes done by one Processor should probably not be reflected in the next
Owned data is technically "mutable", but the mutation only affects the copy you are holding which is in the spirit of the spec IMO.

Passing an &mut to on_end creates two new issues:

  • From just the API it's not obvious that mutable should not be called. A badly written SpanProcessor could modify the data and potentially cause non-deterministic behaviours depending on the order in which SpanProcessors are called
  • Passing a reference instead of owned data means that at least one extra copy is still needed to pass the data to the exporter. In particular, the BatchExporter needs to own the SpanData because of it's asynchronicity. This extra copy currently doesn't happen if you only have a single SpanProcessor https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-sdk/src/trace/span.rs#L224

I think we could design better api could be designed because there are two very different use cases (which are not really handled by the spec, since it mostly considers languages with a GC):

  • SpanProcessor that need to read the span data in on_end (or do nothing...)
  • SpanProcessor that need to take ownership of the span data in on_end

It doesn't make sense to give a cloned value to SpanProcessor that only need to just read, and a reference that SpanProcessor that need to take ownership if no other SpanProcessor accesses the data latter...

So I'm thinking of something like this:

trait SpanProcessor {
  /// FinishedSpan is passed mutably, but the span data itself cannot mutated because the field is private
  fn on_end(&self, span: &mut FinishedSpan);
}

pub struct FinishedSpan {
  data: Option<SpanData>,
  is_last_processor: bool,
  is_data_consumed: bool,
}

impl FinishedSpan {
  /// returns an read only reference to the span data
  fn read(&self) -> &SpanData {
    if self.is_data_consumed { panic!("") }
    return &self.data.unwrap();
  }

  /// returns an owned copy of the data
  /// This function will panic if it is called twice within the same SpanProcessor::on_end function
  /// If it is called by the last span processor in the list it will not allocate new data 
  fn consume(&mut self) -> SpanData {
    if self.is_data_consumed { panic!("") }
    self.data_consumed = true;
    if self.is_last_processor {
      self.data.take()
    } else {
      self.data.unwrap().clone()
    }
  }
}

// Then there are two cases:

/// Read only on_end span processor
impl SpanProcessor for SpanEndLogger {
  fn on_end(&self, span: &mut FinishedSpan) {
    // does not allocate if there are mutliple processors
    println!("{:?}", span.read().span_context.trace_id());
  } 
}

/// Read only on_end span processor
impl SpanProcessor for SimpleProcessor {
  fn on_end(&self, span: &mut FinishedSpan) {
    // does not allocate if this is the only processor left
    self.exporter.export(span.consume());
  } 
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants