Increasing cluster.routing.allocation.cluster_concurrent_rebalance causes redundant shard movements #87279
Pinging @elastic/es-distributed (Team:Distributed)
The precise sequence of shard movements isn't totally daft, it seems. We start here:
We try and balance
This means the shards of
This move doesn't affect how balanced
That fixes the index balance but breaks the overall balance, and it's not possible to fix that by moving shards of
Tada 🎉 the cluster is now balanced. In this one case we'd make better decisions by moving on to
@DaveCTurner since the desired balance allocator reconciles shard movements against a final goal, should we consider this issue resolved in 8.6+? Won't that prevent the redundant movements of the same shard?
No, in the situation described above the desired balance allocator would compute a goal which would require those extra shard movements to achieve.
…) (#94082) * Simulate shard moves using cluster_concurrent_rebalance=2 (#93977): BalancedShardAllocator is prone to performing unnecessary moves when cluster_concurrent_rebalance is set to high values (>2); see #87279. This allocator is used in DesiredBalanceComputer. Since we do not move actual shard data during the calculation, it is possible to artificially set this setting to 2 to avoid unnecessary moves in the desired balance. * fix merge --------- Co-authored-by: Elastic Machine <[email protected]>
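The shape of that fix can be sketched in a few lines. This is a toy Python sketch, not the Elasticsearch source (which is Java): the idea is that because the desired-balance computation only simulates shard movements, it can run the balancer with the rebalance throttle pinned to 2 regardless of what the operator configured. All names here are invented for illustration.

```python
# Hypothetical sketch of "pin the simulated rebalance throttle" (names invented).
REBALANCE_SETTING = "cluster.routing.allocation.cluster_concurrent_rebalance"
SIMULATION_CONCURRENT_REBALANCE = 2  # value chosen by the fix in #93977

def simulation_settings(user_settings: dict) -> dict:
    """Copy the operator's settings, overriding the throttle for simulation only.

    The real cluster state is untouched; only the in-memory desired-balance
    computation sees the clamped value, so high operator-set values can no
    longer push the simulated balancer into redundant moves.
    """
    settings = dict(user_settings)  # shallow copy: never mutate the live settings
    settings[REBALANCE_SETTING] = SIMULATION_CONCURRENT_REBALANCE
    return settings

user = {REBALANCE_SETTING: 16}
sim = simulation_settings(user)
assert sim[REBALANCE_SETTING] == 2    # simulation sees the pinned value
assert user[REBALANCE_SETTING] == 16  # the operator's setting is untouched
```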
Hello, I just upgraded my cluster from 8.5.1 to 8.8.1 and now all the nodes in the hot and warm tiers are unbalanced and the cluster is constantly moving shards around. Could this be related to this issue? Or are there any other issues related to the balance change made in 8.6? I had custom settings for
@leandrojmp yes, 8.6.0 introduced the desired balance allocator (see the release notes). This changed the default shard allocation calculations and is most likely the change in behavior you're seeing.
If cluster.routing.allocation.cluster_concurrent_rebalance is increased above the default of 2, then the balancer will sometimes make unnecessary rebalancing movements, which can drastically increase the time it takes for the cluster to reach a balanced state. It's likely that something similar happens if you increase cluster.routing.allocation.node_concurrent_recoveries and friends too.

For example, if you start with the following routing table ... then a call to reroute() will relocate some shards onto node-2, but will also relocate this node's sole shard onto a different node.

In more detail, the balancer simulates throttled shard movements onto the emptier node, and these movements improve both shard and index balance. But this process seems to overshoot a little, which applies pressure to move shards off the empty node, and those movements are not throttled, so they are triggered straight away.
It still achieves a balanced cluster in the end, it just takes more shard movements than it needs to get there:
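The "more movements than needed" claim is easy to quantify: compare the number of relocations actually executed with the minimum number needed to get from the initial to the final shard placement. The node names and move sequence below are hypothetical stand-ins for the kind of sequence described in this issue (shards hop onto the new empty node, then one immediately hops off again); this is an illustration, not Elasticsearch code.

```python
from collections import Counter

def minimum_moves(before: dict, after: dict) -> int:
    """Fewest single-shard relocations needed to turn `before` counts into `after`.

    Each relocation adds one shard to some node, so the minimum is simply the
    total number of shards that nodes gained.
    """
    return sum(max(after[n] - before.get(n, 0), 0) for n in after)

# Hypothetical example: node-2 joins empty, shards pile on, then one hops off.
before = {"node-0": 2, "node-1": 2, "node-2": 0}
moves = [("node-0", "node-2"), ("node-1", "node-2"), ("node-2", "node-0")]

after = Counter(before)
for src, dst in moves:
    after[src] -= 1
    after[dst] += 1

print(dict(after))  # final placement: {'node-0': 2, 'node-1': 1, 'node-2': 1}
print(len(moves), minimum_moves(before, dict(after)))  # 3 executed vs minimum 1
```

The final placement here is reachable in a single move (node-1 to node-2), so two of the three relocations were redundant work: data copied across the network only to be copied again.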
Failing test case
Workaround

Remove the following settings from the configuration, so that their default values take effect:

- cluster.routing.allocation.cluster_concurrent_rebalance
- cluster.routing.allocation.node_concurrent_incoming_recoveries
- cluster.routing.allocation.node_concurrent_outgoing_recoveries
- cluster.routing.allocation.node_concurrent_recoveries
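If these were applied dynamically via the cluster settings API (rather than in elasticsearch.yml), they can be cleared by setting them to null, which restores the defaults. The snippet below only builds the JSON request body; it assumes the settings live under "persistent" (use "transient" instead if that is where they were set), and it would be sent as the body of a PUT /_cluster/settings request.

```python
import json

# null (Python None) tells Elasticsearch to clear each setting so its
# built-in default takes effect again. Swap "persistent" for "transient"
# if that is where the settings were originally applied.
reset_body = {
    "persistent": {
        "cluster.routing.allocation.cluster_concurrent_rebalance": None,
        "cluster.routing.allocation.node_concurrent_incoming_recoveries": None,
        "cluster.routing.allocation.node_concurrent_outgoing_recoveries": None,
        "cluster.routing.allocation.node_concurrent_recoveries": None,
    }
}

print(json.dumps(reset_body, indent=2))  # body for: PUT /_cluster/settings
```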