Reduce Overhead of RepositoryData Cache for Large Repositories #66587

original-brownbear
Member

This caches the fact that `RepositoryData` was too large to be cached,
so we don't serialize it over and over just to find out that we can't
cache it (which is quite costly for larger `RepositoryData` instances).

relates #66042
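
A minimal sketch of the idea described above, not the code from this PR: the cache records the generation it last saw together with a "too large" flag, so a second request for the same generation returns early without serializing again. The class `GenerationCache`, the method `offer`, the `maxCachedBytes` limit, and the use of a `String` as a stand-in for `RepositoryData` are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: remembers either the serialized bytes or the fact that they were too large. */
final class GenerationCache {
    private final int maxCachedBytes;     // assumed size limit, not a real Elasticsearch setting
    private long cachedGeneration = -1L;  // generation the state below belongs to
    private byte[] cachedBytes;           // null when nothing is cached or the data was too large
    private boolean tooLarge;             // true if cachedGeneration was too large to cache
    int serializations;                   // counts how often serialization was actually paid for

    GenerationCache(int maxCachedBytes) {
        this.maxCachedBytes = maxCachedBytes;
    }

    /** Offers data for caching; returns true if bytes for this generation are held in the cache. */
    boolean offer(long generation, String data) {
        if (generation == cachedGeneration) {
            // Known generation: either already cached or already known to be too large.
            return tooLarge == false;
        }
        serializations++;
        byte[] bytes = data.getBytes(StandardCharsets.UTF_8); // stand-in for real serialization
        cachedGeneration = generation;
        if (bytes.length > maxCachedBytes) {
            // Remember the negative result instead of forgetting it.
            tooLarge = true;
            cachedBytes = null;
            return false;
        }
        tooLarge = false;
        cachedBytes = bytes;
        return true;
    }
}
```

Offering the same oversized generation twice increments `serializations` only once; without the flag, every call would serialize the data again before rejecting it.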

@elasticmachine added the Team:Distributed (Obsolete) label Dec 18, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

/**
 * Cached serialized repository data or placeholder to keep track of the fact that data for a generation was too large to be cached.
 */
private static final class CachedRepositoryData {
Member Author


I figured I'd encapsulate this a little (even though it adds a bunch of lines) because the Tuple access is quite hard to read and it's used in a bunch of spots.
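
To illustrate the readability point, here is a rough sketch of such a holder; the names `CachedEntry`, `of`, `tooLarge`, `hasData`, and `data` are hypothetical and not taken from the PR. With a plain pair type, call sites would have to check something like "second element is null" to detect the placeholder, whereas a small class can give that check a name.

```java
/** Hypothetical holder: either serialized bytes for a generation or a "too large" placeholder. */
final class CachedEntry {
    private final long generation;
    private final byte[] data; // null marks the "too large to cache" placeholder

    private CachedEntry(long generation, byte[] data) {
        this.generation = generation;
        this.data = data;
    }

    /** Entry that actually holds serialized bytes. */
    static CachedEntry of(long generation, byte[] data) {
        if (data == null) {
            throw new IllegalArgumentException("data must not be null");
        }
        return new CachedEntry(generation, data);
    }

    /** Placeholder recording that this generation was too large to cache. */
    static CachedEntry tooLarge(long generation) {
        return new CachedEntry(generation, null);
    }

    long generation() {
        return generation;
    }

    boolean hasData() {
        return data != null;
    }

    byte[] data() {
        if (data == null) {
            throw new IllegalStateException("generation " + generation + " was too large to cache");
        }
        return data;
    }
}
```

A caller can then write `entry.hasData()` rather than inspecting a tuple component, which is the kind of access this comment is about.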

Member

@tlrx tlrx left a comment


LGTM

@original-brownbear
Member Author

Thanks Tanguy!

@original-brownbear original-brownbear merged commit 502d10f into elastic:master Dec 18, 2020
@original-brownbear original-brownbear deleted the better-efficiency-repository-data-cache branch December 18, 2020 16:53
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Dec 18, 2020
…ic#66587)

original-brownbear added a commit that referenced this pull request Dec 18, 2020
… (#66609)

@original-brownbear original-brownbear restored the better-efficiency-repository-data-cache branch January 4, 2021 01:09
Labels
:Distributed Coordination/Snapshot/Restore, >non-issue, Team:Distributed (Obsolete), v7.12.0, v8.0.0-alpha1

4 participants