Reduce Redundant Payload Size by Avoiding Per-Event Context #265

Open
apuig opened this issue May 29, 2025 · 0 comments

Is your feature request related to a problem? Please describe.
When migrating from analytics-java to analytics-kotlin, I noticed a significant difference in how the context is handled in transmitted messages.

  • In analytics-kotlin, the ContextPlugin populates a context field in every BaseEvent.
  • In contrast, analytics-java uses the batch-level context for properties like library and instanceId, as described in the Segment HTTP API docs.

Repeating the context in every event increases payload size, especially once custom application information is added to it. That is a problem because batch size is limited and every event is persisted before upload.

Describe the solution you'd like
I would like to see an option in analytics-kotlin to support batch-level context, similar to analytics-java, so that shared context properties do not need to be duplicated in every event within a batch.
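To make the request concrete, here is a minimal sketch of the hoisting idea, assuming a simplified model. `Event`, `Batch`, and `hoistSharedContext` are hypothetical names, not analytics-kotlin types, and the context is flattened to string pairs for brevity. It moves every context entry that is identical across all events to the batch level and leaves the rest on the individual events.

```kotlin
// Hypothetical model types for illustration only; not part of analytics-kotlin.
data class Event(val name: String, val context: Map<String, String>)
data class Batch(val context: Map<String, String>, val events: List<Event>)

// Moves context entries that are identical in every event up to the batch.
fun hoistSharedContext(events: List<Event>): Batch {
    if (events.isEmpty()) return Batch(emptyMap(), events)
    // An entry is shared only if every event carries the same key and value.
    val shared = events.first().context.filter { (key, value) ->
        events.all { it.context[key] == value }
    }
    // Strip the shared keys from each event so they are not sent twice.
    val trimmed = events.map { it.copy(context = it.context - shared.keys) }
    return Batch(shared, trimmed)
}

fun main() {
    val events = listOf(
        Event("first", mapOf("instanceId" to "abc", "screen" to "home")),
        Event("second", mapOf("instanceId" to "abc", "screen" to "cart")),
    )
    val batch = hoistSharedContext(events)
    println(batch.context)                   // shared entries, sent once
    println(batch.events.map { it.context }) // per-event remainder
}
```

In the real library the hoisting would have to happen wherever the batch is assembled for upload, which is the part that currently has no explicit entity.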

Describe alternatives you've considered
As far as I recall, there is no explicit Batch entity in the analytics-kotlin library; the concept of a batch only materializes when the first message is appended to the temporary files for persistence.

Additional context
The default context size is already non-trivial:

"context":{"library":{"name":"analytics-kotlin","version":"1.19.2"},"instanceId":"c0e42b7b-88f9-432c-a04f-d15fefd3056a"}

This grows further with custom context. Since batch size is limited and all events are persisted, supporting batch-level context could help optimize payload size and storage.
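As a rough illustration of the overhead, the sketch below compares the bytes spent on context when the default context quoted above is repeated in every event versus sent once per batch. `contextBytes` is a hypothetical helper and the batch size of 100 events is an assumed figure, not a library default.

```kotlin
// The default context quoted above, copied verbatim.
val defaultContext = """
    "context":{"library":{"name":"analytics-kotlin","version":"1.19.2"},"instanceId":"c0e42b7b-88f9-432c-a04f-d15fefd3056a"}
""".trimIndent()

// Hypothetical helper: bytes spent on context for one batch, either repeated
// in every event (perEvent = true) or sent once at the batch level.
fun contextBytes(context: String, eventsInBatch: Int, perEvent: Boolean): Int {
    val size = context.toByteArray(Charsets.UTF_8).size
    return if (perEvent) size * eventsInBatch else size
}

fun main() {
    val events = 100 // assumed batch size, for illustration only
    val repeated = contextBytes(defaultContext, events, perEvent = true)
    val once = contextBytes(defaultContext, events, perEvent = false)
    println("per-event: $repeated bytes, per-batch: $once bytes, saved: ${repeated - once} bytes")
}
```

The saving scales linearly with both the number of events in a batch and the size of any custom context added on top of the default.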

Additional request
For the same reason, it would also be beneficial to remove unnecessary empty attributes from the payload. For example:

"integrations": {},
"_metadata": {
	"bundled": [],
	"unbundled": [],
	"bundledIds": []
}

Is this something that should be handled in a Plugin using JsonUtils.updateJsonObject with a remove operation, or is there a recommended approach for cleaning up these empty fields before sending events?
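For reference, the kind of cleanup meant here can be sketched as a recursive prune. This version works on plain Kotlin maps and lists as a stand-in for the library's JSON types; `pruneEmpty` is a hypothetical function and does not use `JsonUtils.updateJsonObject`. Whether the backend accepts events without `integrations` or `_metadata` is an assumption that would need to be confirmed.

```kotlin
// Hypothetical helper, not analytics-kotlin API. Recursively drops empty maps
// and empty lists; explicit null values are dropped from maps as well.
fun pruneEmpty(value: Any?): Any? = when (value) {
    is Map<*, *> -> value.entries
        .mapNotNull { (k, v) -> pruneEmpty(v)?.let { k to it } }
        .toMap()
        .ifEmpty { null }
    is List<*> -> value.ifEmpty { null }
    else -> value
}

fun main() {
    // Made-up event for illustration, using the empty attributes shown above.
    val event = mapOf(
        "type" to "track",
        "integrations" to emptyMap<String, Any?>(),
        "_metadata" to mapOf(
            "bundled" to emptyList<String>(),
            "unbundled" to emptyList<String>(),
            "bundledIds" to emptyList<String>(),
        ),
    )
    println(pruneEmpty(event)) // {type=track}
}
```

`_metadata` disappears entirely because all of its children are empty, so the map that contained them becomes empty too.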
