Avoid capturing SnapshotsInProgress$Entry in queue #88707


Conversation

DaveCTurner (Contributor)

Today, each time there are shards to snapshot we enqueue a lambda which
captures the current `SnapshotsInProgress$Entry`. This is a pretty
heavyweight object, possibly several MB in size, most of which does not
need to be captured, and with concurrent snapshots across thousands of
shards we may enqueue many hundreds of slightly different such objects.
With this commit we instead compute a more efficient representation of
the work to be done by each task in the queue.

Relates #77466
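
As a rough illustration of the change (a minimal, self-contained sketch only; `HeavyEntry`, `ShardWork`, and the other names are hypothetical stand-ins, not the actual Elasticsearch classes), the difference is between a queued lambda that keeps the whole multi-MB entry reachable and one that captures only a small, precomputed description of its work:

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Sketch of the idea: precompute a lightweight description of each task's work
// instead of letting the queued lambda capture the heavyweight in-progress entry.
public class QueueCaptureSketch {

    // Stand-in for SnapshotsInProgress$Entry: potentially several MB of state.
    record HeavyEntry(String snapshot, List<String> allShardStates) {}

    // Lightweight description of one task's work: just the shards to start now.
    record ShardWork(String snapshot, List<String> shardsToStart) {}

    static final Queue<Runnable> queue = new ArrayDeque<>();

    static void enqueueNaively(HeavyEntry entry) {
        // The lambda keeps the whole entry reachable until the task runs.
        queue.add(() -> startShards(entry.snapshot(), shardsInInitState(entry)));
    }

    static void enqueueCompact(HeavyEntry entry) {
        // Compute the compact representation eagerly; the lambda captures only `work`.
        ShardWork work = new ShardWork(entry.snapshot(), shardsInInitState(entry));
        queue.add(() -> startShards(work.snapshot(), work.shardsToStart()));
    }

    static List<String> shardsInInitState(HeavyEntry entry) {
        return entry.allShardStates().stream().filter(s -> s.startsWith("INIT:")).toList();
    }

    static void startShards(String snapshot, List<String> shards) {
        shards.forEach(s -> System.out.println(snapshot + " -> " + s));
    }
}
```

With many queued tasks this keeps only a few small records alive per task rather than hundreds of slightly different copies of the large entry.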

@elasticsearchmachine (Collaborator)

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine added the Team:Distributed (Obsolete) label on Jul 22, 2022
@elasticsearchmachine (Collaborator)

Hi @DaveCTurner, I've created a changelog YAML for you.

@pxsalehi (Member)

We've had some discussions kind of related to this for #88209.

I thought the lambda that captures the Entry is created only once per snapshot, not once per shard snapshot, and that each snapshot is just one entry covering all the indices that have shards on the node. That would mean the number of those lambdas grows with the number of concurrent snapshots in the cluster, which I'm guessing is much smaller than the number of concurrent shard snapshots.

@DaveCTurner (Contributor, Author)

You're right that the lambda is not a per-shard thing but with concurrent snapshots each snapshot could result in potentially many tasks. Each queue entry will only snapshot the shards that are in state INIT, but that could potentially be a single shard that just moved from state QUEUED because it completed in an earlier snapshot. The heap dump in the attached case shows this effect.

But also we permit up to 1000 concurrent snapshots by default, so if each task takes a few MB of heap then that's a few GB of heap unnecessarily burned on this queue.

@DaveCTurner requested a review from pxsalehi on July 22, 2022 08:19
@DaveCTurner (Contributor, Author)

I wasn't aware of #88209 though; that looks like a good move, but not something I'd want to backport to 7.17. And it suffers from the same issue of capturing the Entry rather than just the few bytes needed to describe the task. This'll conflict with that PR, but hopefully the conflicts won't be too hard to resolve - we can perhaps rework this to make them less severe too.

@pxsalehi (Member) left a comment

I understand. Thanks for the explanation.

No, the changes to `SnapshotShardsService` in #88209 are just for readability. Your PR addresses that too, so I could just undo the changes to `SnapshotShardsService` in my PR. There shouldn't be too many conflicts.

@original-brownbear (Contributor) left a comment

LGTM, nice one! I guess I should analyse a data node heap dump for many-shards snapshotting benchmarks at some point :)

Also makes me wonder if we should make the snapshots-in-progress a proper diffable given how large these objects can get ...

@DaveCTurner merged commit 7891da8 into elastic:master on Jul 22, 2022
@DaveCTurner deleted the 2022-07-22-avoid-SnapshotsInProgress-Entry-capture branch on July 22, 2022 12:37
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Jul 22, 2022
@elasticsearchmachine (Collaborator)

💔 Backport failed

| Branch | Result |
| --- | --- |
| 7.17 | Commit could not be cherry-picked due to conflicts |
| 8.3 | |

You can use sqren/backport to backport manually by running `backport --upstream elastic/elasticsearch --pr 88707`.

@DaveCTurner (Contributor, Author)

> Also makes me wonder if we should make the snapshots-in-progress a proper diffable given how large these objects can get

Yes, some more sharing would have helped here too, although the `ShardSnapshotStatus` map is big and changes a lot, so it wouldn't be shared much.
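
For what the "proper diffable" idea would amount to, here is a hedged, generic sketch (not Elasticsearch's actual Diffable machinery): publish only the entries of a large map that changed between two versions, so unchanged entries can be shared rather than copied.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Generic sketch of a map diff: record only the entries that were added or changed
// and the keys that were removed, instead of copying the whole map each time.
public class MapDiffSketch {

    record MapDiff<K, V>(Map<K, V> upserts, Set<K> deletes) {
        // Rebuild the new version from the previous one plus this diff.
        Map<K, V> apply(Map<K, V> previous) {
            Map<K, V> result = new HashMap<>(previous);
            deletes.forEach(result::remove);
            result.putAll(upserts);
            return result;
        }
    }

    static <K, V> MapDiff<K, V> diff(Map<K, V> previous, Map<K, V> current) {
        Set<K> deletes = new HashSet<>(previous.keySet());
        deletes.removeAll(current.keySet());
        Map<K, V> upserts = new HashMap<>();
        current.forEach((k, v) -> {
            if (!v.equals(previous.get(k))) {
                upserts.put(k, v); // entry is new or its value changed
            }
        });
        return new MapDiff<>(upserts, deletes);
    }
}
```

If most shard statuses change on every update, as noted above for the `ShardSnapshotStatus` map, the `upserts` map approaches the size of the full map and the diff saves little.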

DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Jul 22, 2022
Backport of elastic#88707
elasticsearchmachine pushed a commit that referenced this pull request Jul 22, 2022
Backport of #88707
elasticsearchmachine pushed a commit that referenced this pull request Jul 22, 2022
@DaveCTurner (Contributor, Author)

I opened #88732 to capture #88707 (review)

@mark-vieira added the v8.3.3 label and removed the v8.3.4 label on Jul 29, 2022
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Apr 8, 2025
In elastic#88707 we changed the behaviour here to run the shard-snapshot
initialization tasks all in sequence. Yet these tasks do nontrivial work
since they may flush to acquire the relevant index commit, so with this
commit we go back to distributing them across the `SNAPSHOT` pool again.
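
In executor terms the change is roughly the following (a minimal sketch assuming the initialization tasks are plain `Runnable`s; `SnapshotInitSketch` and `startAll` are illustrative names, not the actual `SnapshotShardsService` code):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;

class SnapshotInitSketch {
    // Fan the flush-heavy shard-snapshot initialization tasks out over an executor
    // (e.g. the dedicated SNAPSHOT thread pool) instead of running them in sequence.
    static void startAll(List<Runnable> initTasks, ExecutorService snapshotPool) {
        // Serial version: one slow flush delays every task behind it.
        // initTasks.forEach(Runnable::run);
        // Parallel version: each task runs on its own pool thread.
        initTasks.forEach(snapshotPool::execute);
    }
}
```
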
elasticsearchmachine pushed a commit that referenced this pull request Apr 8, 2025
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Apr 8, 2025
Backport of elastic#126452 to `8.x`
elasticsearchmachine pushed a commit that referenced this pull request Apr 8, 2025
Backport of #126452 to `8.x`
Labels
>bug, :Distributed Coordination/Snapshot/Restore, Team:Distributed (Obsolete), v7.17.6, v8.3.3, v8.4.0