Introduce separate shard limit for frozen shards #71392

Merged
Changes from 5 commits
38 changes: 31 additions & 7 deletions docs/reference/modules/cluster/misc.asciidoc
Original file line number Diff line number Diff line change
@@ -45,11 +45,13 @@ either the limit is increased as described below, or some indices are
<<indices-open-close,closed>> or <<indices-delete-index,deleted>> to bring the
number of shards below the limit.

The cluster shard limit defaults to 1,000 shards per data node.
Both primary and replica shards of all open indices count toward the limit,
including unassigned shards.
For example, an open index with 5 primary shards and 2 replicas counts as 15 shards.
Closed indices do not contribute to the shard count.
The cluster shard limit defaults to 1,000 shards per non-frozen data node for
normal (non-frozen) indices and 3,000 shards per frozen data node for frozen
indices.
Both primary and replica shards of all open indices count toward the limit,
including unassigned shards.
For example, an open index with 5 primary shards and 2 replicas counts as 15 shards.
Closed indices do not contribute to the shard count.

You can dynamically adjust the cluster shard limit with the following setting:

@@ -61,7 +63,7 @@ You can dynamically adjust the cluster shard limit with the following setting:
Limits the total number of primary and replica shards for the cluster. {es}
calculates the limit as follows:

`cluster.max_shards_per_node * number of data nodes`
`cluster.max_shards_per_node * number of non-frozen data nodes`

Shards for closed indices do not count toward this limit. Defaults to `1000`.
A cluster with no data nodes is unlimited.
@@ -71,7 +73,29 @@ example, a cluster with a `cluster.max_shards_per_node` setting of `100` and
three data nodes has a shard limit of 300. If the cluster already contains 296
shards, {es} rejects any request that adds five or more shards to the cluster.

NOTE: This setting does not limit shards for individual nodes. To limit the
Note that frozen shards have their own independent limit.
--

[[cluster-max-shards-per-node-frozen]]
`cluster.max_shards_per_node.frozen`::
+
--
(<<dynamic-cluster-setting,Dynamic>>)
Limits the total number of primary and replica frozen shards for the cluster.
{es} calculates the limit as follows:

`cluster.max_shards_per_node.frozen * number of frozen data nodes`

Shards for closed indices do not count toward this limit. Defaults to `3000`.
A cluster with no frozen data nodes is unlimited.

{es} rejects any request that creates more frozen shards than this limit allows.
For example, a cluster with a `cluster.max_shards_per_node.frozen` setting of
`100` and three frozen data nodes has a frozen shard limit of 300. If the
cluster already contains 296 frozen shards, {es} rejects any request that adds five or
more frozen shards to the cluster.

NOTE: These settings do not limit shards for individual nodes. To limit the
number of shards for each node, use the
<<cluster-total-shards-per-node,`cluster.routing.allocation.total_shards_per_node`>>
setting.
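
The arithmetic described in the documentation change above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the code from this PR: the class `ShardLimitSketch` and its methods are hypothetical names, and treating a group with zero data nodes as unlimited simply follows the documented wording ("A cluster with no data nodes is unlimited").

```java
/**
 * Hypothetical sketch of the documented shard-limit arithmetic.
 * Not the Elasticsearch implementation; all names are illustrative only.
 */
final class ShardLimitSketch {

    /** Limit for one group = per-node setting * number of data nodes in that group. */
    static long maxShards(int maxShardsPerNode, int dataNodesInGroup) {
        return (long) maxShardsPerNode * dataNodesInGroup;
    }

    /**
     * Returns true if adding newShards to a group would exceed that group's limit.
     * A group with no data nodes is treated as unlimited, as the docs state.
     */
    static boolean wouldExceed(int maxShardsPerNode, int dataNodesInGroup,
                               int currentShards, int newShards) {
        if (dataNodesInGroup == 0) {
            return false;
        }
        return (long) currentShards + newShards > maxShards(maxShardsPerNode, dataNodesInGroup);
    }

    public static void main(String[] args) {
        // Documented example: setting of 100, three data nodes, 296 shards already open.
        System.out.println(maxShards(100, 3));            // 300
        System.out.println(wouldExceed(100, 3, 296, 4));  // false: 300 is still within the limit
        System.out.println(wouldExceed(100, 3, 296, 5));  // true: 301 exceeds the limit
        // Documented defaults: 1,000 per non-frozen node, 3,000 per frozen node.
        System.out.println(maxShards(1000, 2) + " normal, " + maxShards(3000, 1) + " frozen");
    }
}
```

Because the normal and frozen groups are counted separately, the same check would be applied once per group with that group's own setting and node count.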
@@ -145,8 +145,8 @@ public void testIncreaseReplicasOverLimit() {
fail("shouldn't be able to increase the number of replicas");
} catch (IllegalArgumentException e) {
String expectedError = "Validation Failed: 1: this action would add [" + (dataNodes * firstShardCount)
+ "] total shards, but this cluster currently has [" + firstShardCount + "]/[" + dataNodes * shardsPerNode
+ "] maximum shards open;";
+ "] shards, but this cluster currently has [" + firstShardCount + "]/[" + dataNodes * shardsPerNode
+ "] maximum normal shards open;";
assertEquals(expectedError, e.getMessage());
}
Metadata clusterState = client().admin().cluster().prepareState().get().getState().metadata();
@@ -192,8 +192,8 @@ public void testChangingMultipleIndicesOverLimit() {
int difference = totalShardsAfter - totalShardsBefore;

String expectedError = "Validation Failed: 1: this action would add [" + difference
+ "] total shards, but this cluster currently has [" + totalShardsBefore + "]/[" + dataNodes * shardsPerNode
+ "] maximum shards open;";
+ "] shards, but this cluster currently has [" + totalShardsBefore + "]/[" + dataNodes * shardsPerNode
+ "] maximum normal shards open;";
assertEquals(expectedError, e.getMessage());
}
Metadata clusterState = client().admin().cluster().prepareState().get().getState().metadata();
@@ -352,7 +352,7 @@ private void verifyException(int dataNodes, ShardCounts counts, IllegalArgumentE
int currentShards = counts.getFirstIndexShards() * (1 + counts.getFirstIndexReplicas());
int maxShards = counts.getShardsPerNode() * dataNodes;
String expectedError = "Validation Failed: 1: this action would add [" + totalShards
+ "] total shards, but this cluster currently has [" + currentShards + "]/[" + maxShards + "] maximum shards open;";
+ "] shards, but this cluster currently has [" + currentShards + "]/[" + maxShards + "] maximum normal shards open;";
assertEquals(expectedError, e.getMessage());
}
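
The three test hunks above all assert the same reworded message, so its shape can be captured in one small helper. The sketch below is an assumption-laden illustration: `ShardLimitMessageSketch` and its `group` parameter are hypothetical, and only the group name "normal" is confirmed by the expected strings in this diff.

```java
/**
 * Hypothetical helper that rebuilds the error text asserted in the updated tests.
 * The "Validation Failed: 1: " prefix is copied from the expected strings in the diff.
 */
final class ShardLimitMessageSketch {

    static String expectedError(int shardsToAdd, int currentShards, int maxShards, String group) {
        return "Validation Failed: 1: this action would add [" + shardsToAdd
            + "] shards, but this cluster currently has [" + currentShards + "]/[" + maxShards
            + "] maximum " + group + " shards open;";
    }

    public static void main(String[] args) {
        // Values reuse the documented example: 296 open shards against a limit of 300.
        System.out.println(expectedError(5, 296, 300, "normal"));
    }
}
```

Compared with the old wording, the message drops "total" before "shards" and names the shard group before "shards open".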

@@ -22,7 +22,6 @@
import org.elasticsearch.cluster.routing.allocation.AllocationService;
import org.elasticsearch.cluster.service.ClusterService;
import org.elasticsearch.common.Priority;
import org.elasticsearch.common.ValidationException;
import org.elasticsearch.common.inject.Inject;
import org.elasticsearch.common.regex.Regex;
import org.elasticsearch.common.settings.IndexScopedSettings;
@@ -38,7 +37,6 @@
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

import static org.elasticsearch.action.support.ContextPreservingActionListener.wrapPreservingContext;
@@ -135,15 +133,7 @@ public ClusterState execute(ClusterState currentState) {
final int updatedNumberOfReplicas = IndexMetadata.INDEX_NUMBER_OF_REPLICAS_SETTING.get(openSettings);
if (preserveExisting == false) {
// Verify that this won't take us over the cluster shard limit.
int totalNewShards = Arrays.stream(request.indices())
.mapToInt(i -> getTotalNewShards(i, currentState, updatedNumberOfReplicas))
.sum();
Optional<String> error = shardLimitValidator.checkShardLimit(totalNewShards, currentState);
if (error.isPresent()) {
ValidationException ex = new ValidationException();
ex.addValidationError(error.get());
throw ex;
}
shardLimitValidator.validateShardLimitOnReplicaUpdate(currentState, request.indices(), updatedNumberOfReplicas);

/*
* We do not update the in-sync allocation IDs as they will be removed upon the first index operation which makes
@@ -269,14 +259,6 @@ public ClusterState execute(ClusterState currentState) {
});
}

private int getTotalNewShards(Index index, ClusterState currentState, int updatedNumberOfReplicas) {
IndexMetadata indexMetadata = currentState.metadata().index(index);
int shardsInIndex = indexMetadata.getNumberOfShards();
int oldNumberOfReplicas = indexMetadata.getNumberOfReplicas();
int replicaIncrease = updatedNumberOfReplicas - oldNumberOfReplicas;
return replicaIncrease * shardsInIndex;
}

/**
* Updates the cluster block only iff the setting exists in the given settings
*/
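
The helper removed in this hunk computed how many shards a replica-count change would add, and the call site summed that over all requested indices. The standalone sketch below reproduces only that arithmetic. `ReplicaUpdateSketch` and `IndexShape` are hypothetical stand-ins for the real metadata classes, and the body of `validateShardLimitOnReplicaUpdate` is not shown in this diff, so nothing here should be read as its implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the replica-update arithmetic from the removed helper. */
final class ReplicaUpdateSketch {

    /** Minimal stand-in for the two index metadata values the helper read. */
    static final class IndexShape {
        final int numberOfShards;
        final int numberOfReplicas;

        IndexShape(int numberOfShards, int numberOfReplicas) {
            this.numberOfShards = numberOfShards;
            this.numberOfReplicas = numberOfReplicas;
        }
    }

    /** Shards added for one index: (new replicas - old replicas) * primary shards. */
    static int newShardsFor(IndexShape index, int updatedNumberOfReplicas) {
        int replicaIncrease = updatedNumberOfReplicas - index.numberOfReplicas;
        return replicaIncrease * index.numberOfShards;
    }

    /** Sum over every index named in the request, as the old call site did. */
    static int totalNewShards(List<IndexShape> indices, int updatedNumberOfReplicas) {
        return indices.stream().mapToInt(i -> newShardsFor(i, updatedNumberOfReplicas)).sum();
    }

    public static void main(String[] args) {
        List<IndexShape> indices = Arrays.asList(new IndexShape(5, 1), new IndexShape(3, 0));
        // Raising both indices to 2 replicas adds 5 * 1 + 3 * 2 shards.
        System.out.println(totalNewShards(indices, 2)); // 11
    }
}
```

The result is negative when the replica count is lowered, so a request that only reduces replicas cannot push the cluster over the limit.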
@@ -476,7 +476,8 @@ public void apply(Settings value, Settings current, Settings previous) {
FsHealthService.ENABLED_SETTING,
FsHealthService.REFRESH_INTERVAL_SETTING,
FsHealthService.SLOW_PATH_LOGGING_THRESHOLD_SETTING,
IndexingPressure.MAX_INDEXING_BYTES);
IndexingPressure.MAX_INDEXING_BYTES,
ShardLimitValidator.SETTING_CLUSTER_MAX_SHARDS_PER_NODE_FROZEN);

static List<SettingUpgrader<?>> BUILT_IN_SETTING_UPGRADERS = Collections.emptyList();

@@ -33,6 +33,7 @@
import org.elasticsearch.index.store.FsDirectoryFactory;
import org.elasticsearch.index.store.Store;
import org.elasticsearch.indices.IndicesRequestCache;
import org.elasticsearch.indices.ShardLimitValidator;

import java.util.Collections;
import java.util.Map;
@@ -157,6 +158,7 @@ public final class IndexScopedSettings extends AbstractScopedSettings {
MetadataIndexStateService.VERIFIED_BEFORE_CLOSE_SETTING,
ExistingShardsAllocator.EXISTING_SHARDS_ALLOCATOR_SETTING,
DiskThresholdDecider.SETTING_IGNORE_DISK_WATERMARKS,
ShardLimitValidator.INDEX_SETTING_SHARD_LIMIT_GROUP,

// validate that built-in similarities don't get redefined
Setting.groupSetting(