set default value of cluster.routing.allocation.cluster_concurrent_rebalance to -1 #109210

madhava-sridhar · 2024-05-30T13:05:15Z

Default value for cluster.routing.allocation.cluster_concurrent_rebalance is set to -1, in effect disabling "cluster_concurrent_rebalance" bound . As a default behaviour throttling is controlled by
cluster.routing.allocation.node_concurrent_incoming_recoveries / cluster.routing.allocation.node_concurrent_outgoing_recoveries.

Have you signed the contributor license agreement?
✅
Have you followed the contributor guidelines?
✅
If submitting code, have you built your formula locally prior to submission with gradle check?
✅
If submitting code, is your pull request against main? Unless there is a good reason otherwise, we prefer pull requests against main and will backport as needed.
If submitting code, have you checked that your submission is for an OS and architecture that we support?
✅
If you are submitting this code for a class then read our policy for that.

Issue : #97750

…balance to -1

madhava-sridhar · 2024-05-30T13:10:30Z

...va/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceReconcilerTests.java

@@ -1238,7 +1238,7 @@ public void testRebalanceDoesNotCauseHotSpots() {
            assertThat(
                "Reconciling nodes should all have same amount (max 1 delta) of moves: " + totalOutgoingMoves,
                summary.getMax() - summary.getMin(),
-                lessThanOrEqualTo(1)
+                lessThanOrEqualTo(numberOfNodes)


number of concurrent rebalance now is controlled by cluster.routing.allocation.node_concurrent_incoming_recoveries / cluster.routing.allocation.node_concurrent_outgoing_recoveries

which has default value = number of nodes.

madhava-sridhar · 2024-05-30T13:11:27Z

server/src/test/java/org/elasticsearch/cluster/routing/allocation/AddIncrementallyTests.java

@@ -154,7 +154,8 @@ public void testMinimalRelocationsNoLimit() {
            ClusterRebalanceAllocationDecider.ClusterRebalanceType.ALWAYS.toString()
        )
            .put("cluster.routing.allocation.node_concurrent_recoveries", 100)
-            .put("cluster.routing.allocation.node_initial_primaries_recoveries", 100);
+            .put("cluster.routing.allocation.node_initial_primaries_recoveries", 100)


test case conditions assume previous default value of 2 for cluster.routing.allocation.cluster_concurrent_rebalance

planning to add new test for new behaviour as per new default .

DaveCTurner

I don't think it's so simple unfortunately - as mentioned in #97750 we want to limit rebalance activity at the node level (by default 1 per node) so that we don't use up all the concurrent-recovery slots (by default 2 per node) on rebalancing.

madhava-sridhar · 2024-05-30T14:54:15Z

default settings for cluster.routing.allocation.cluster_concurrent_rebalance before was "2" with "NodeScope". I'm confused about definition of NodeScope. Below is from Setting.java

" /**
         * Cluster-level or configuration file-level setting. Not an index setting.
         */
        NodeScope,"

When value is set to 2 with NodeScope , is it per node 2 rebalance allowed or per cluster ?

henningandersen

There is more work to do here.

henningandersen · 2024-05-31T13:16:30Z

...g/elasticsearch/cluster/routing/allocation/decider/ConcurrentRebalanceAllocationDecider.java

    public static final Setting<Integer> CLUSTER_ROUTING_ALLOCATION_CLUSTER_CONCURRENT_REBALANCE_SETTING = Setting.intSetting(
        "cluster.routing.allocation.cluster_concurrent_rebalance",
-        2,
+        DEFAULT_CLUSTER_ROUTING_ALLOCATION_CLUSTER_CONCURRENT_REBALANCE,


We should add two new separate settings controlling inbound and outbound rebalances - just like we have for recoveries.

We can assume that any existing relocation is a rebalance when figuring out whether to allow one more rebalance. However, that will only be able to determine based on source node, not target node AFAICS. I think we will need another method on AllocationDecider to help us decide whether a specific rebalance (to a specific node) is possible, otherwise you'll not be able to make the determination.

henningandersen · 2024-05-31T13:18:53Z

I'm confused about definition of NodeScope

This is only to do with the setting, whether it has "node scope" or "index scope". All node and cluster level settings are node scope. It has nothing to do with the allocation itself, only which scope the setting applies to (node/cluster or index).

set default value of cluster.routing.allocation.cluster_concurrent_re…

5abc482

…balance to -1

elasticsearchmachine added v8.15.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels May 30, 2024

madhava-sridhar requested a review from a team May 30, 2024 13:08

madhava-sridhar commented May 30, 2024

View reviewed changes

DaveCTurner requested changes May 30, 2024

View reviewed changes

henningandersen requested changes May 31, 2024

View reviewed changes

madhava-sridhar closed this by deleting the head repository Jun 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

set default value of cluster.routing.allocation.cluster_concurrent_rebalance to -1 #109210

set default value of cluster.routing.allocation.cluster_concurrent_rebalance to -1 #109210

madhava-sridhar commented May 30, 2024 •

edited

Loading

madhava-sridhar May 30, 2024

madhava-sridhar May 30, 2024

DaveCTurner left a comment

madhava-sridhar commented May 30, 2024 •

edited

Loading

henningandersen left a comment

henningandersen May 31, 2024

henningandersen commented May 31, 2024

set default value of cluster.routing.allocation.cluster_concurrent_rebalance to -1 #109210

set default value of cluster.routing.allocation.cluster_concurrent_rebalance to -1 #109210

Conversation

madhava-sridhar commented May 30, 2024 • edited Loading

madhava-sridhar May 30, 2024

Choose a reason for hiding this comment

madhava-sridhar May 30, 2024

Choose a reason for hiding this comment

DaveCTurner left a comment

Choose a reason for hiding this comment

madhava-sridhar commented May 30, 2024 • edited Loading

henningandersen left a comment

Choose a reason for hiding this comment

henningandersen May 31, 2024

Choose a reason for hiding this comment

henningandersen commented May 31, 2024

madhava-sridhar commented May 30, 2024 •

edited

Loading

madhava-sridhar commented May 30, 2024 •

edited

Loading