Add option to take currently relocating shards' sizes into account #7785


Merged
1 commit merged into elastic:master from dtd-user-current-relocations on Sep 19, 2014

Conversation

dakrone
Member

@dakrone dakrone commented Sep 18, 2014

When using the DiskThresholdDecider, it's possible that shards could
already be marked as relocating to the node being evaluated. This commit
adds a new setting `cluster.routing.allocation.disk.include_relocations`
which adds the size of the shards currently being relocated to this node
to the node's used disk space.

This new option defaults to `true`. However, it's possible to
over-estimate the usage for a node if the relocation is already
partially complete. For instance:

A node with a 10gb shard that's 45% of the way through a relocation
would be treated as using 10gb + (0.45 * 10gb) = 14.5gb of disk for that
shard before examining the watermarks to see if a new shard can be
allocated.
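The arithmetic in that example can be sketched as a tiny stand-alone method (a hypothetical simplification for illustration only; `RelocationEstimate` and `expectedUsedBytes` are invented names, not the actual DiskThresholdDecider code):

```java
// Hypothetical sketch of the estimate described above; not the real
// DiskThresholdDecider implementation.
public class RelocationEstimate {

    // Bytes the decider would account for on the target node: the node's
    // actual used bytes (which already include any partially copied data)
    // plus the full size of each shard currently relocating to it.
    static long expectedUsedBytes(long actualUsedBytes, long[] incomingShardSizes) {
        long total = actualUsedBytes;
        for (long size : incomingShardSizes) {
            total += size; // full shard size, even if the copy is underway
        }
        return total;
    }

    public static void main(String[] args) {
        long gb = 1L << 30;
        // 10gb shard, 45% relocated: 4.5gb is already on disk and counted
        // in actual usage, yet the full 10gb is added on top of it.
        long alreadyCopied = 45 * 10 * gb / 100;            // 4.5gb
        long estimate = expectedUsedBytes(alreadyCopied, new long[] { 10 * gb });
        System.out.println(estimate == 29 * gb / 2);        // 14.5gb accounted
    }
}
```

This is the over-estimate the commit message describes: the 4.5gb already copied is effectively counted twice until the relocation completes.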

Fixes #7753
Relates to #6168

@gibrown
Contributor

gibrown commented Sep 18, 2014

Because running out of disk space is such a hard failure condition, I'd suggest that this should be the default behavior. Temporarily over-estimating seems safer to me.

Great improvement though. Thanks!

@grantr

grantr commented Sep 18, 2014

I also favor safety over accuracy by default. @dakrone what would the example situation that you mention ultimately resolve to? Would the same shards still be allocated, but slower than usual?
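As a rough, hypothetical illustration of the trade-off under discussion (invented numbers and method names, not the actual decider logic): counting a relocation in full can temporarily push a node over a watermark that its real usage would pass.

```java
// Hypothetical illustration of the safety-vs-accuracy trade-off discussed
// above; numbers and names are invented, not taken from Elasticsearch.
public class WatermarkSketch {

    // Would a new shard of newShardBytes fit under a watermark of
    // maxUsedFraction, given the (possibly over-estimated) used bytes?
    static boolean underWatermark(long usedBytes, long totalBytes,
                                  long newShardBytes, double maxUsedFraction) {
        return (double) (usedBytes + newShardBytes) / totalBytes <= maxUsedFraction;
    }

    public static void main(String[] args) {
        long gb = 1L << 30;
        long total = 100 * gb;
        long actuallyUsed = 78 * gb;                  // real on-disk usage
        long withRelocation = actuallyUsed + 10 * gb; // full incoming shard added
        long newShard = 8 * gb;

        // Against real usage the new shard would fit under a 90% watermark...
        System.out.println(underWatermark(actuallyUsed, total, newShard, 0.90));   // true
        // ...but with the relocation counted it is refused for now.
        System.out.println(underWatermark(withRelocation, total, newShard, 0.90)); // false
    }
}
```

Under such a scheme the allocation would presumably just be retried and succeed once the relocation finishes and the node's disk-usage information is refreshed, at the cost of a temporary refusal.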

@@ -649,6 +654,107 @@ public void freeDiskPercentageAfterShardAssignedUnitTest() {
assertThat(after, equalTo(19.0));
}

@Test
@TestLogging("cluster.routing.allocation.decider:TRACE")
Contributor


is trace logging required here?

@s1monw
Contributor

s1monw commented Sep 19, 2014

I left a minor comment. I think we should move to true as the default; I completely agree that the safer option is preferable. Other than that, LGTM.

@s1monw s1monw removed the review label Sep 19, 2014
@dakrone dakrone force-pushed the dtd-user-current-relocations branch 2 times, most recently from be53e08 to 4185566 on September 19, 2014 at 10:35
@dakrone dakrone merged commit 4185566 into elastic:master Sep 19, 2014
@dakrone dakrone deleted the dtd-user-current-relocations branch September 19, 2014 13:35
@lcawl lcawl added the :Distributed Indexing/Distributed label and removed the :Allocation label on Feb 13, 2018
Labels
:Distributed Indexing/Distributed, >enhancement, v1.5.0, v2.0.0-beta1
Development

Successfully merging this pull request may close these issues.

Disk decider can allocate more data than the node can handle
6 participants