Short-circuit rebalancing when disabled #40966

DaveCTurner
Contributor

Today if `cluster.routing.rebalance.enable: none` then rebalancing is disabled,
but we still execute `balanceByWeights()` and perform some rather expensive
calculations before discovering that we cannot rebalance any shards. In a large
cluster this can make cluster state updates occur rather slowly. With this
change we check earlier whether rebalancing is globally disabled and, if so,
avoid the rebalancing process entirely.

Relates #40942 which was reverted because of egregiously faulty tests.
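
The idea described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `RebalanceMode`, `ShortCircuitSketch`, and the counter are hypothetical stand-ins, and only the name `balanceByWeights()` is taken from the description. The sketch assumes the global setting can be read cheaply before any weight calculation starts.

```java
// Hypothetical stand-in for the values of cluster.routing.rebalance.enable;
// this is NOT the real EnableAllocationDecider.Rebalance type.
enum RebalanceMode { NONE, PRIMARIES, REPLICAS, ALL }

final class ShortCircuitSketch {
    private final RebalanceMode mode;
    private int expensiveRuns = 0; // counts how often the costly path is taken

    ShortCircuitSketch(RebalanceMode mode) {
        this.mode = mode;
    }

    /** Runs one balancing round and reports whether the expensive step was executed. */
    boolean balance() {
        if (mode == RebalanceMode.NONE) {
            // Rebalancing is globally disabled, so return before doing any costly work.
            return false;
        }
        balanceByWeights();
        return true;
    }

    // Placeholder for the costly weight calculation named in the description.
    private void balanceByWeights() {
        expensiveRuns++;
    }

    int expensiveRuns() {
        return expensiveRuns;
    }

    public static void main(String[] args) {
        for (RebalanceMode mode : RebalanceMode.values()) {
            ShortCircuitSketch sketch = new ShortCircuitSketch(mode);
            System.out.println(mode + " -> expensive step ran: " + sketch.balance());
        }
    }
}
```

In this sketch the cheap check comes first, so a disabled cluster never pays for the weight calculation, which is the behavior the description aims for.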

@DaveCTurner added the >bug, :Distributed Coordination/Allocation (all issues relating to the decision making around placing a shard, both master logic & on the nodes), v8.0.0, v7.2.0, v6.7.2 and v7.0.1 labels Apr 8, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

    .put(CLUSTER_ROUTING_REBALANCE_ENABLE_SETTING.getKey(),
        randomFrom(EnableAllocationDecider.Rebalance.ALL,
            EnableAllocationDecider.Rebalance.PRIMARIES,
            EnableAllocationDecider.Rebalance.REPLICAS).name()),
Contributor Author

To avoid having to play spot-the-difference, the bug in #40942 was that I was picking values from EnableAllocationDecider.Allocation which are mostly, but not entirely, the same as these.
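
A small sketch of why mixing up two similar enums can break a test like this. The two enums below are stand-ins with hypothetical names, and their constant lists are an assumption written from memory of the Elasticsearch source rather than taken from this PR, so they should be checked against `EnableAllocationDecider` before being relied on. The sketch does not claim which value actually caused the failure in #40942; it only shows that a name valid in one enum need not parse in the other.

```java
// Stand-ins for EnableAllocationDecider.Allocation and EnableAllocationDecider.Rebalance.
// The constant lists are ASSUMED, not copied from this PR.
enum AllocationStandIn { NONE, NEW_PRIMARIES, PRIMARIES, ALL }

enum RebalanceStandIn { NONE, PRIMARIES, REPLICAS, ALL }

final class EnumMismatchSketch {
    /** Returns true if the given name is a constant of the rebalance stand-in enum. */
    static boolean isValidRebalanceValue(String name) {
        try {
            RebalanceStandIn.valueOf(name);
            return true;
        } catch (IllegalArgumentException e) {
            // Enum.valueOf throws when no constant has this exact name.
            return false;
        }
    }

    public static void main(String[] args) {
        // Feed every allocation-style name into the rebalance-style parser.
        for (AllocationStandIn value : AllocationStandIn.values()) {
            System.out.println(value.name() + " valid for rebalance: "
                + isValidRebalanceValue(value.name()));
        }
    }
}
```

Because most names overlap, a test that draws values from the wrong enum can look correct at a glance and still exercise a different configuration than intended.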

Contributor

@henningandersen henningandersen left a comment

LGTM.

Thanks @DaveCTurner, also for teaching me a new word!

Member

@original-brownbear original-brownbear left a comment

LGTM2 :)

@DaveCTurner DaveCTurner merged commit 2e21a11 into elastic:master Apr 9, 2019
@DaveCTurner DaveCTurner deleted the 2019-04-08-short-circuit-allocation-when-disabled-2 branch April 9, 2019 06:22
DaveCTurner added 3 commits that referenced this pull request Apr 9, 2019
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Apr 9, 2019
…forced-unsafe-publication

* elastic/master:
  Improve Watcher test framework resiliency (elastic#40658)
  Fix order of request body search parameter names in documentation (elastic#40777)
  Node repurpose tool docs (elastic#40525)
  [Docs] Delete explanation for completion suggester default analyzer choice (elastic#36720)
  Revert "Revert "Change HLRC CCR response tests to use AbstractResponseTestCase base class. (elastic#40257)"" (elastic#40971)
  Short-circuit rebalancing when disabled (elastic#40966)
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
Labels
>bug :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) v6.7.2 v7.0.1 v7.2.0 v8.0.0-alpha1

5 participants