Skip to content

Incorporating TagMap into the tracer #8589

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 89 commits into
base: master
Choose a base branch
from
Open

Conversation

dougqh
Copy link
Contributor

@dougqh dougqh commented Mar 19, 2025

This change reduces the overhead from constructing spans both in terms of CPU and memory.

The biggest gains come when making use of SpanBuilders for both constructing the span and manipulating the tags on the Span. Span creation throughput with SpanBuilders improves by as much as 45%. startSpan methods commonly used in instrumentations improve by around 20%. In applications, real median response time gains are around to 5-10%.

More importantly, these changes reduce the amount of memory consumed by each Span reducing allocation / garbage collection pressure.

In a "real" application, the change is less noticeable when memory is plentiful; however, the difference becomes more pronounced when memory is limited. spring-petclinic shows a 17% throughput improvement relative to the current release when memory is constrained to 192M or 128M. At 96M, the difference is negligible 2-3% gain throughput. At 64M, this change becomes a detriment showing a -5% change in throughput.

What Does This Do

These gains are accomplished by changing how tags are stored to use a new Map (TagMap) that excels at Map-to-Map copies. To fully realize the gain, there's additional work to skip tag interceptors when possible. With these changes, the setting of the shared tags on a Span-s is nearly allocation free.

Motivation

The tracer does some Map operations regularly that regular HashMaps aren't good at.

The primary operation of concern being copying Entry-s from Map to Map where every copied Entry requires allocating a new Entry object in the destination Map.

And secondarily, Builder patterns which use defensive copying but also require in-order processing in the Tracer.

TagMap solves both those problems by using immutable Entry objects. By making the Entry objects immutable, the Entry objects can be freely shared between Map instances and between the Builder and a Map.

Additional Notes

To get the full benefit of this new TagMap, both the source Map and the destination Map need to be TagMap-s and the transfer needs to happen through putAll or the TagMap specific putEntry.

Meaning - that to get a significant gain quite a few files had to be modified

Contributor Checklist

dougqh added 5 commits March 18, 2025 08:10
The tracer does some Map operations regularly that regular HashMaps aren't good at.

The primary operation of concern being copying Entry-s from Map to Map where every copied Entry requires allocating a new Entry object.

And secondarily, Builder patterns which use defensive copying but also require in-order processing in the Tracer.

TagMap solves both those problems by using immutable Entry objects.  By making the Entry objects immutable, the Entry objects can be freely shared between Map instances and between the Builder and a Map.

By using these Maps in key places, this change significantly reduce the cost of span construction both in terms of CPU time and memory.  On an ARM 64 machine, span creation benchmarks improve by 15-45% while reducing memory consumption by 10-20%.

To get the benefit of this data structure, both the source Map and the destination Map need to be TagMaps and the transfer needs to happen through putAll or the TagMap specific putEntry.

Meaning - that to get a significant gain quite a few files had to be modified
This bug causes remove to incorrectly remove a whole BucketGroup when the BucketGroup still contains entries.  The intention was opposite only empty BucketGroups should be removed from the chain.

Incorporated tests from prototype that caught this issue.
@dougqh dougqh requested review from a team as code owners March 19, 2025 17:16
@dougqh dougqh requested review from smola, sezen-datadog, ojung and anmarchenko and removed request for a team March 19, 2025 17:16
@@ -72,7 +73,7 @@ public AbstractTestSession(
AgentSpanContext traceContext =
new TagContext(
CIConstants.CIAPP_TEST_ORIGIN,
Collections.emptyMap(),
Copy link
Contributor Author

@dougqh dougqh Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about this particular change. In effect, the constructor previously required the user to pass a mutable map. However if the provided Map was empty, the class would lazily construction a mutable Map to take place of the empty Map.

Because TagMap does not have an O(1) isEmpty, I didn't want to stick with this pattern.

What could be done instead is to pass TagMap.EMPTY and then check via a reference equality check. If others prefer that, I can adjust accordingly.

@@ -101,7 +102,7 @@ public TestImpl(
this.context = new TestContextImpl(coverageStore);

AgentSpanContext traceContext =
new TagContext(CIConstants.CIAPP_TEST_ORIGIN, Collections.emptyMap());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same mutable empty Map issue

@@ -57,7 +58,7 @@ public class DefaultExceptionDebuggerTest {
private ConfigurationUpdater configurationUpdater;
private DefaultExceptionDebugger exceptionDebugger;
private TestSnapshotListener listener;
private Map<String, Object> spanTags = new HashMap<>();
Copy link
Contributor Author

@dougqh dougqh Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the primary type of change that I've made throughout -- replacing HashMaps with TagMap-s.
To get the benefit of TagMap's quick Map-to-Map copying ability, both the source and destination Map need to be TagMap-s.

Currently, this checks fails.  I believe because of a problem with Mock based tests being overfit to the HashMap implementation.
@@ -185,11 +186,13 @@ public static CoreTracerBuilder builder() {
private final DynamicConfig<ConfigSnapshot> dynamicConfig;

/** A set of tags that are added only to the application's root span */
private final Map<String, ?> localRootSpanTags;
private final TagMap localRootSpanTags;
private final boolean localRootSpanTagsNeedIntercept;
Copy link
Contributor Author

@dougqh dougqh Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To get the full benefit of TagMap's fast copy ability, I need to be to call TagMap.putAll.

However the TagInterceptor interferes with being able to do that, so I've added a method to TagInterceptor that can check the source Map in advance. If any tag in the source Map requires interception, then "needs intercept" is "true". If no tag needs interception, then "needs intercept" is false.

When "needs intercept" is false, setAllTags can then safely bypass the interceptor and use TagMap.putAll to do the fully optimized copy.

@@ -368,12 +371,22 @@ public CoreTracerBuilder extractor(HttpCodec.Extractor extractor) {
}

public CoreTracerBuilder localRootSpanTags(Map<String, ?> localRootSpanTags) {
this.localRootSpanTags = tryMakeImmutableMap(localRootSpanTags);
this.localRootSpanTags = TagMap.fromMapImmutable(localRootSpanTags);
Copy link
Contributor Author

@dougqh dougqh Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept some of the old methods that take Map instead of TagMap for ease of testing; however, the preferred way is to use TagMap now.

@@ -522,6 +535,63 @@ public CoreTracer build() {
flushOnClose);
}
}

@Deprecated
private CoreTracer(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a few places, I kept constructors that take Map-s instead of TagMap-s. This is mostly concession to the Groovy tests where using Map is far easier.

I may eliminate these over time, but I didn't want to create a giant diff where many test files had to be updated. This diff is big enough as it is.

@@ -588,7 +658,11 @@ private CoreTracer(
spanSamplingRules = SpanSamplingRules.deserializeFile(spanSamplingRulesFile);
}

this.tagInterceptor =
Copy link
Contributor Author

@dougqh dougqh Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to do a bit of reordering to make tagInterceptor available earlier in the constructor, so I can analyze the shared tag sets.

For the curious, I was not able to move the setting of defaultSpanTags down because it has some odd interactions with other parts of the constructor.

@@ -1280,7 +1353,7 @@ public class CoreSpanBuilder implements AgentTracer.SpanBuilder {
private final CoreTracer tracer;

// Builder attributes
private Map<String, Object> tags;
private TagMap.Builder tagBuilder;
Copy link
Contributor Author

@dougqh dougqh Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TagMap.Builder serves as a stand-in for using LinkedHashMap to preserve order. Although, the semantics aren't exactly the same.

LinkedHashMap traverses entries in the original insertion order. Meaning that if you insert: tag1, tag2, tag1. Then there will be a traversal of tag1, tag2.

TagMap.Builder simply records modifications in order, so the traversal order will be tag1 first value, tag2, tag1 second value. Arguably, this is closer to the intended semantics because it makes the behavior of the builder the same as calling setTag.

}
if (value == null) {
tagMap.remove(tag);
tagBuilder.remove(tag);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tagBuilder records removals as a special removal Entry object. Removal Entry objects are never stored in the TagMap.

This design allows tagBuilder to be an in-order ledger of modifications, but also allow Entry objects to be shared between the TagMap.Builder and the TagMap that is produced in the end.

@@ -1510,7 +1584,9 @@ private DDSpanContext buildSpanContext() {
samplingPriority = PrioritySampling.UNSET;
origin = null;
coreTags = null;
coreTagsNeedsIntercept = false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could just as easily set needs intercept to true, since interception short-circuits on a null TagMap.

+ (null == coreTags ? 0 : coreTags.size())
+ (null == rootSpanTags ? 0 : rootSpanTags.size())
+ (null == contextualTags ? 0 : contextualTags.size());
final int tagsSize = 0;
Copy link
Contributor Author

@dougqh dougqh Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a further simplification, TagMaps are a fixed size (16) that's big enough to avoid a large number of collisions. But more importantly, TagMap doesn't currently implement an O(1) size method.

In short, the tagsSize calculation just isn't needed and its costly.

If no one objects to this, then I'll probably eliminate tagsSize altogether. Right now, this variable still exists because there's an argument still being passed to the DDSpanContext constructor.

@@ -1629,14 +1708,16 @@ private DDSpanContext buildSpanContext() {
final CharSequence operationName =
this.operationName != null ? this.operationName : resourceName;

final Map<String, ?> mergedTracerTags = traceConfig.mergedTracerTags;
final TagMap mergedTracerTags = traceConfig.mergedTracerTags;
boolean mergedTracerTagsNeedsIntercept = traceConfig.mergedTracerTagsNeedsIntercept;
Copy link
Contributor Author

@dougqh dougqh Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar pattern of checking if interception is needed in advance - in this case, traceConfig will be updated each time a new config is received by remote config.

if (contextualTags != null) {
context.setAllTags(contextualTags);
}
context.setAllTags(mergedTracerTags, mergedTracerTagsNeedsIntercept);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These calls have been updated to call new versions of setAllTags that work with TagMap or TagMap.Builder. For those that takes a TagMap, there's also the ability to pass along previously done calculation for "needs intercept".

Comment on lines 54 to 55
* Because the Entry objects can be shared between multiple TagMaps, the Entry objects cannot contain
* form a link list to handle collisions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to choose only one verb between contain/form

public static final TagMap EMPTY = createEmpty();

private static final TagMap createEmpty() {
return new TagMap().freeze();
Copy link
Contributor

@vandonr vandonr Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we use the parameterized ctor to avoid allocating an array of 16 entries that'll never be used ?

Suggested change
return new TagMap().freeze();
return new TagMap(null);

(not sure if null is OK, I haven't read all the code yet :p)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, yeah, not ok, but we could create an empty array.

Comment on lines 226 to 249
public final boolean getBoolean(String tag) {
Entry entry = this.getEntry(tag);
return entry == null ? false : entry.booleanValue();
}

public final int getInt(String tag) {
Entry entry = this.getEntry(tag);
return entry == null ? 0 : entry.intValue();
}

public final long getLong(String tag) {
Entry entry = this.getEntry(tag);
return entry == null ? 0L : entry.longValue();
}

public final float getFloat(String tag) {
Entry entry = this.getEntry(tag);
return entry == null ? 0F : entry.floatValue();
}

public final double getDouble(String tag) {
Entry entry = this.getEntry(tag);
return entry == null ? 0D : entry.doubleValue();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks a bit dangerous to have default values for those, then we cannot really know when we get 0 if the value is really 0 or if the key is absent 🤔
Maybe it's no issue the way it's used... And I don't really have a better proposal other than throwing

Object[] thisBuckets = this.buckets;

int hash = _hash(tag);
int bucketIndex = hash & (thisBuckets.length - 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the speed gain from doing a bitmask rather than a modulo worth the downside of discarding the upper bits from the hash ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty standard to use a bitmask to get the bucket index instead of modulo. div is not cheap on cpu.
I don't understand your comment about discarding the upper bits. modulo will give the same result.
the only constraint is the power of 2 for bucket array size.

Copy link
Contributor

@vandonr vandonr Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah if we keep the power of two constraint it doesn't change anything, but if you use modulo you can use prime numbers for the array size, which will be better at shuffling hash bits when modulo-ed.
If you do a bitmask with a size that is a power of two, you effectively only look at the lower bits of the hash
(the most significant bits are just masked away)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the hashes are already "folded in two" in the _hash() function, but that's still 16 bits, of which the upper bits will rarely see the light

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what the standard HashMap in the JDK is doing, nothing new here.

@vandonr
Copy link
Contributor

vandonr commented Apr 7, 2025

It seems it'd be relatively straightforward to keep track of the count in the map and avoid the O(n) price of computing it (or checking for emptiness), was it a deliberate choice to save to memory of the count and the operations of incrementing/decrementing it ?

@dougqh
Copy link
Contributor Author

dougqh commented Apr 9, 2025

I'll admit that the lack of count support partially is down to laziness so far. I'm happy to try implementing it and then we can measure the impact. I'm not really worried about the extra memory - mostly just the extra code to track it, but I imagine the cost is negligible.

dougqh added 13 commits April 22, 2025 13:08
Addressing review comments, using smaller array for EMPTY instance

Given that EMPTY now breaks the rules of fixed number of buckets, added an extra test to make sure putAll EMPTY works as expected
Parameterized TagMapEntryTest making sure to check negatives and positives, maxes and mins for all primitive related tests
Parameterized TagMapEntryTest making sure to check negatives and positives, maxes and mins for all primitive related tests
Adding size tracking to TagMap
- reduced visibility of checkIfEmpty and computeSize
- isEmpty and size are no longer deprecated
- updated callers of computeSize to use size instead

- adding more size checking into tests
- added additional test for putAll with non-fully overlapping maps for source and destination
- restored checkIntegrity method in TagMap
- checkIntegrity is now called by TagMapFuzzTest after each step to further verify correctness
Optimizing the common case of doing putAll into a new empty map / clone

Moved existing putAll code into putAllMerge
Added putAllIntoEmptyMap which is special case of putAll for the common empty destination case (used by clone)

putAll now picks between putAllIntoEmptyMap and putAllMerge based on the map size
Removed immutable which is a bit dangerous and not needed with both freeze and immutableCopy being available

Tweaked special constructor used by createEmpty (and previously by immutable).  Now takes a size to be a bit more explicitly about what is happening.
Introduced EntryChange part class of Entry and EntryRemoval
Renamed TagMap.Builder -> TagMap.Ledger to make the concept a bit more clear

Also renamed Builder.put methods to Ledger.set to be more consistent with TagMap

Update users of TagMap.Builder as needed
Changes TagMap to be an interface with two distinct implementations:
- One that extends HashMap
- One that implements the new map structure

Next change will make it possible to switch between the map implementations via configuration

To try and make both TagMap implementations reasonably efficient, added a more method variations:
- set methods now return void to avoid Entry creation for HashMap implementation
- getAndSet - replaces set for the add cases where the previous Entry is needed

- remove(String) - now returns a boolean to indicate that a change happened rather than returning the value or entry
- getAndRemove - returns an Entry

Many TagMapTest and TagMapLedgerTest have been parameterized to test both implementations of TagMaps

Updated calling code & other tests as needed
@dougqh dougqh requested a review from a team as a code owner May 8, 2025 20:08
dougqh added 2 commits May 8, 2025 16:09
Adding getAndSet calls to the various *Entry tests in TagMapTest
@dougqh
Copy link
Contributor Author

dougqh commented May 9, 2025

It seems it'd be relatively straightforward to keep track of the count in the map and avoid the O(n) price of computing it (or checking for emptiness), was it a deliberate choice to save to memory of the count and the operations of incrementing/decrementing it ?

I've gone ahead and added the count support.

In some places updating count is O(n) itself because of needing to traverse the collision chain. In practice, the chain shouldn't be deep and the constant factor is small, so I think it is worthwhile change.

I did rerun my benchmarks and adding count tracking had no real impact overall.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp: core Tracer core tag: performance Performance related changes type: enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants