Commit 12314f6

Introduce separate shard limit for frozen shards (#71392) (#71760)
Frozen indices (partial searchable snapshots) require less heap per shard, so the limit can be raised for them. We pick 3,000 frozen shards per frozen data node, since we think 2,000 is reasonable to use in production. Relates #71042 and #34021. Includes #71781 and #71777.
1 parent 2fa0dc3 commit 12314f6
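
For illustration, an operator could raise the new frozen limit dynamically once this change is in place. The sketch below uses the 7.x high-level REST client; the client setup and the chosen value are assumptions for the example, not part of this commit.

import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.cluster.settings.ClusterUpdateSettingsRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.settings.Settings;

public class RaiseFrozenShardLimit {
    public static void main(String[] args) throws Exception {
        // Connect to a local node; host/port are assumptions for this example.
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            // Persistently raise the frozen shard limit above its default of 3000 per frozen data node.
            ClusterUpdateSettingsRequest request = new ClusterUpdateSettingsRequest()
                .persistentSettings(Settings.builder()
                    .put("cluster.max_shards_per_node.frozen", 4500));
            client.cluster().putSettings(request, RequestOptions.DEFAULT);
        }
    }
}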

File tree

9 files changed: +384 -91 lines changed

docs/reference/modules/cluster/misc.asciidoc

Lines changed: 31 additions & 7 deletions
@@ -45,11 +45,13 @@ either the limit is increased as described below, or some indices are
 <<indices-open-close,closed>> or <<indices-delete-index,deleted>> to bring the
 number of shards below the limit.
 
-The cluster shard limit defaults to 1,000 shards per data node.
-Both primary and replica shards of all open indices count toward the limit,
-including unassigned shards.
-For example, an open index with 5 primary shards and 2 replicas counts as 15 shards.
-Closed indices do not contribute to the shard count.
+The cluster shard limit defaults to 1,000 shards per non-frozen data node for
+normal (non-frozen) indices and 3000 shards per frozen data node for frozen
+indices.
+Both primary and replica shards of all open indices count toward the limit,
+including unassigned shards.
+For example, an open index with 5 primary shards and 2 replicas counts as 15 shards.
+Closed indices do not contribute to the shard count.
 
 You can dynamically adjust the cluster shard limit with the following setting:
 
@@ -61,7 +63,7 @@ You can dynamically adjust the cluster shard limit with the following setting:
 Limits the total number of primary and replica shards for the cluster. {es}
 calculates the limit as follows:
 
-`cluster.max_shards_per_node * number of data nodes`
+`cluster.max_shards_per_node * number of non-frozen data nodes`
 
 Shards for closed indices do not count toward this limit. Defaults to `1000`.
 A cluster with no data nodes is unlimited.
@@ -71,7 +73,29 @@ example, a cluster with a `cluster.max_shards_per_node` setting of `100` and
 three data nodes has a shard limit of 300. If the cluster already contains 296
 shards, {es} rejects any request that adds five or more shards to the cluster.
 
-NOTE: This setting does not limit shards for individual nodes. To limit the
+Notice that frozen shards have their own independent limit.
+--
+
+[[cluster-max-shards-per-node-frozen]]
+`cluster.max_shards_per_node.frozen`::
++
+--
+(<<dynamic-cluster-setting,Dynamic>>)
+Limits the total number of primary and replica frozen shards for the cluster.
+{es} calculates the limit as follows:
+
+`cluster.max_shards_per_node.frozen * number of frozen data nodes`
+
+Shards for closed indices do not count toward this limit. Defaults to `3000`.
+A cluster with no frozen data nodes is unlimited.
+
+{es} rejects any request that creates more frozen shards than this limit allows.
+For example, a cluster with a `cluster.max_shards_per_node.frozen` setting of
+`100` and three frozen data nodes has a frozen shard limit of 300. If the
+cluster already contains 296 shards, {es} rejects any request that adds five or
+more frozen shards to the cluster.
+
+NOTE: These settings do not limit shards for individual nodes. To limit the
 number of shards for each node, use the
 <<cluster-total-shards-per-node,`cluster.routing.allocation.total_shards_per_node`>>
 setting.
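
To make the arithmetic above concrete, here is a minimal, self-contained Java sketch of the documented frozen-limit check. It illustrates the formula only; it is not the actual ShardLimitValidator implementation, and the class and method names are invented for the example.

// Illustration of the documented frozen shard limit check; not Elasticsearch source.
final class FrozenShardLimitExample {
    // cluster.max_shards_per_node.frozen (defaults to 3000 per the docs above)
    static boolean wouldExceedFrozenLimit(int maxShardsPerFrozenNode,
                                          int frozenDataNodes,
                                          int currentFrozenShards,
                                          int shardsToAdd) {
        // A cluster with no frozen data nodes is unlimited.
        if (frozenDataNodes == 0) {
            return false;
        }
        int limit = maxShardsPerFrozenNode * frozenDataNodes;
        return currentFrozenShards + shardsToAdd > limit;
    }

    public static void main(String[] args) {
        // Docs example: setting of 100 and three frozen data nodes -> limit of 300.
        // 296 existing frozen shards plus 5 new ones exceeds the limit and is rejected.
        System.out.println(wouldExceedFrozenLimit(100, 3, 296, 5)); // true
        System.out.println(wouldExceedFrozenLimit(100, 3, 296, 4)); // false
    }
}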

server/src/internalClusterTest/java/org/elasticsearch/cluster/shards/ClusterShardLimitIT.java

Lines changed: 5 additions & 5 deletions
@@ -145,8 +145,8 @@ public void testIncreaseReplicasOverLimit() {
             fail("shouldn't be able to increase the number of replicas");
         } catch (IllegalArgumentException e) {
             String expectedError = "Validation Failed: 1: this action would add [" + (dataNodes * firstShardCount)
-                + "] total shards, but this cluster currently has [" + firstShardCount + "]/[" + dataNodes * shardsPerNode
-                + "] maximum shards open;";
+                + "] shards, but this cluster currently has [" + firstShardCount + "]/[" + dataNodes * shardsPerNode
+                + "] maximum normal shards open;";
             assertEquals(expectedError, e.getMessage());
         }
         Metadata clusterState = client().admin().cluster().prepareState().get().getState().metadata();
@@ -192,8 +192,8 @@ public void testChangingMultipleIndicesOverLimit() {
             int difference = totalShardsAfter - totalShardsBefore;
 
             String expectedError = "Validation Failed: 1: this action would add [" + difference
-                + "] total shards, but this cluster currently has [" + totalShardsBefore + "]/[" + dataNodes * shardsPerNode
-                + "] maximum shards open;";
+                + "] shards, but this cluster currently has [" + totalShardsBefore + "]/[" + dataNodes * shardsPerNode
+                + "] maximum normal shards open;";
             assertEquals(expectedError, e.getMessage());
         }
         Metadata clusterState = client().admin().cluster().prepareState().get().getState().metadata();
@@ -352,7 +352,7 @@ private void verifyException(int dataNodes, ShardCounts counts, IllegalArgumentE
         int currentShards = counts.getFirstIndexShards() * (1 + counts.getFirstIndexReplicas());
         int maxShards = counts.getShardsPerNode() * dataNodes;
         String expectedError = "Validation Failed: 1: this action would add [" + totalShards
-            + "] total shards, but this cluster currently has [" + currentShards + "]/[" + maxShards + "] maximum shards open;";
+            + "] shards, but this cluster currently has [" + currentShards + "]/[" + maxShards + "] maximum normal shards open;";
         assertEquals(expectedError, e.getMessage());
     }
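
All three hunks adjust the same expected-message pattern. Purely as an illustration (this helper is not part of the commit), the shared string could be built in one place:

// Hypothetical test helper mirroring the message pattern asserted above; not part of this commit.
final class ShardLimitMessages {
    static String expectedNormalShardLimitError(int shardsToAdd, int currentShards, int maxShards) {
        return "Validation Failed: 1: this action would add [" + shardsToAdd
            + "] shards, but this cluster currently has [" + currentShards + "]/[" + maxShards
            + "] maximum normal shards open;";
    }

    public static void main(String[] args) {
        // Example: adding 10 shards to a cluster at 296 of 300 allowed normal shards.
        System.out.println(expectedNormalShardLimitError(10, 296, 300));
    }
}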

server/src/main/java/org/elasticsearch/cluster/metadata/MetadataUpdateSettingsService.java

Lines changed: 1 addition & 19 deletions
@@ -24,7 +24,6 @@
 import org.elasticsearch.cluster.routing.allocation.AllocationService;
 import org.elasticsearch.cluster.service.ClusterService;
 import org.elasticsearch.common.Priority;
-import org.elasticsearch.common.ValidationException;
 import org.elasticsearch.common.collect.Tuple;
 import org.elasticsearch.common.inject.Inject;
 import org.elasticsearch.common.regex.Regex;
@@ -42,7 +41,6 @@
 import java.util.HashSet;
 import java.util.Locale;
 import java.util.Map;
-import java.util.Optional;
 import java.util.Set;
 
 import static org.elasticsearch.action.support.ContextPreservingActionListener.wrapPreservingContext;
@@ -139,15 +137,7 @@ public ClusterState execute(ClusterState currentState) {
                 final int updatedNumberOfReplicas = IndexMetadata.INDEX_NUMBER_OF_REPLICAS_SETTING.get(openSettings);
                 if (preserveExisting == false) {
                     // Verify that this won't take us over the cluster shard limit.
-                    int totalNewShards = Arrays.stream(request.indices())
-                        .mapToInt(i -> getTotalNewShards(i, currentState, updatedNumberOfReplicas))
-                        .sum();
-                    Optional<String> error = shardLimitValidator.checkShardLimit(totalNewShards, currentState);
-                    if (error.isPresent()) {
-                        ValidationException ex = new ValidationException();
-                        ex.addValidationError(error.get());
-                        throw ex;
-                    }
+                    shardLimitValidator.validateShardLimitOnReplicaUpdate(currentState, request.indices(), updatedNumberOfReplicas);
 
                     /*
                      * We do not update the in-sync allocation IDs as they will be removed upon the first index operation which makes
@@ -273,14 +263,6 @@ public ClusterState execute(ClusterState currentState) {
             });
         }
 
-    private int getTotalNewShards(Index index, ClusterState currentState, int updatedNumberOfReplicas) {
-        IndexMetadata indexMetadata = currentState.metadata().index(index);
-        int shardsInIndex = indexMetadata.getNumberOfShards();
-        int oldNumberOfReplicas = indexMetadata.getNumberOfReplicas();
-        int replicaIncrease = updatedNumberOfReplicas - oldNumberOfReplicas;
-        return replicaIncrease * shardsInIndex;
-    }
-
     /**
      * Updates the cluster block only iff the setting exists in the given settings
     */
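
The removed getTotalNewShards helper shows what the new entry point has to compute. A rough sketch of validateShardLimitOnReplicaUpdate along those lines is below; it is illustrative only, mirrors the logic removed above rather than the actual ShardLimitValidator code (which is not part of this excerpt), and omits the attribution of shards to the normal or frozen group.

    // Illustrative sketch only; the real ShardLimitValidator also splits shards by normal/frozen group.
    public void validateShardLimitOnReplicaUpdate(ClusterState currentState, Index[] indices, int updatedNumberOfReplicas) {
        int totalNewShards = Arrays.stream(indices)
            .mapToInt(index -> {
                IndexMetadata indexMetadata = currentState.metadata().index(index);
                int replicaIncrease = updatedNumberOfReplicas - indexMetadata.getNumberOfReplicas();
                return replicaIncrease * indexMetadata.getNumberOfShards();
            })
            .sum();
        // checkShardLimit is the pre-existing validator hook used by the removed code.
        Optional<String> error = checkShardLimit(totalNewShards, currentState);
        if (error.isPresent()) {
            ValidationException ex = new ValidationException();
            ex.addValidationError(error.get());
            throw ex;
        }
    }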

server/src/main/java/org/elasticsearch/common/settings/ClusterSettings.java

Lines changed: 2 additions & 1 deletion
@@ -554,7 +554,8 @@ public void apply(Settings value, Settings current, Settings previous) {
             FsHealthService.ENABLED_SETTING,
             FsHealthService.REFRESH_INTERVAL_SETTING,
             FsHealthService.SLOW_PATH_LOGGING_THRESHOLD_SETTING,
-            IndexingPressure.MAX_INDEXING_BYTES)));
+            IndexingPressure.MAX_INDEXING_BYTES,
+            ShardLimitValidator.SETTING_CLUSTER_MAX_SHARDS_PER_NODE_FROZEN)));
 
     public static List<SettingUpgrader<?>> BUILT_IN_SETTING_UPGRADERS = Collections.unmodifiableList(Arrays.asList(
         SniffConnectionStrategy.SEARCH_REMOTE_CLUSTER_SEEDS_UPGRADER,
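
ShardLimitValidator.SETTING_CLUSTER_MAX_SHARDS_PER_NODE_FROZEN itself is defined outside this excerpt. A declaration along the following lines would match how dynamic integer cluster settings are usually defined in Elasticsearch, with the key and default taken from the docs above and the minimum value assumed:

    // Assumed shape only; the authoritative definition lives in ShardLimitValidator, which is not shown in this diff.
    public static final Setting<Integer> SETTING_CLUSTER_MAX_SHARDS_PER_NODE_FROZEN =
        Setting.intSetting("cluster.max_shards_per_node.frozen", 3000, 1,
            Setting.Property.Dynamic, Setting.Property.NodeScope);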

server/src/main/java/org/elasticsearch/common/settings/IndexScopedSettings.java

Lines changed: 2 additions & 0 deletions
@@ -33,6 +33,7 @@
 import org.elasticsearch.index.store.FsDirectoryFactory;
 import org.elasticsearch.index.store.Store;
 import org.elasticsearch.indices.IndicesRequestCache;
+import org.elasticsearch.indices.ShardLimitValidator;
 
 import java.util.Arrays;
 import java.util.Collections;
@@ -165,6 +166,7 @@ public final class IndexScopedSettings extends AbstractScopedSettings {
         MetadataIndexStateService.VERIFIED_BEFORE_CLOSE_SETTING,
         ExistingShardsAllocator.EXISTING_SHARDS_ALLOCATOR_SETTING,
         DiskThresholdDecider.SETTING_IGNORE_DISK_WATERMARKS,
+        ShardLimitValidator.INDEX_SETTING_SHARD_LIMIT_GROUP,
 
         // validate that built-in similarities don't get redefined
         Setting.groupSetting("index.similarity.", (s) -> {
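
Registering ShardLimitValidator.INDEX_SETTING_SHARD_LIMIT_GROUP here makes the shard-limit group an index-scoped setting, presumably so that each index can be attributed to either the normal or the frozen limit. Its definition is not shown in this excerpt; as a sketch (the key, default, and properties are assumptions), it could look like:

    // Assumed declaration; the real key, default, validation, and properties are defined in ShardLimitValidator.
    public static final Setting<String> INDEX_SETTING_SHARD_LIMIT_GROUP =
        Setting.simpleString("index.shard_limit.group", "normal",
            Setting.Property.IndexScope, Setting.Property.PrivateIndex);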
