Commit 9831084
ILM: Make all the shrink action steps retryable (elastic#70107)
This makes the shrink action retryable: every step can be retried. To let ILM keep pursuing a successful shrink even when the target node goes missing permanently or the shrunk index cannot recover, this change also introduces a retryable shrink cycle within the shrink action. The shrink action generates a unique name for the shrunken index and stores it in the lifecycle state. If the action waits longer than the configured `LIFECYCLE_STEP_WAIT_TIME_THRESHOLD` setting for the source shards to colocate, or for the shrunk index to recover, it moves back to clean up the attempted (and failed) shrunken index, generates a new index name, and retries shrinking the source into the newly generated name.
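The retry cycle the message describes can be sketched, loosely, as the following self-contained loop. The class and method names are illustrative (the real implementation is a chain of ILM steps, not a single loop), and `attempt` stands in for the colocate/shrink/recover step sequence:

```java
import java.util.UUID;

// Hedged sketch of the retryable shrink cycle: each attempt targets a freshly
// generated index name, and a failed attempt is cleaned up before retrying.
class ShrinkCycleSketch {

    /** One generated target name per attempt, e.g. "shrink-<uuid>-logs". */
    static String generateShrinkIndexName(String sourceIndex) {
        return "shrink-" + UUID.randomUUID() + "-" + sourceIndex;
    }

    /**
     * Runs shrink attempts until one succeeds, giving up after maxAttempts.
     * A false result models the wait-time threshold being breached, after
     * which the failed target index is deleted and a new name is generated.
     */
    static String shrinkWithRetries(String sourceIndex,
                                    java.util.function.Predicate<String> attempt,
                                    int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            String target = generateShrinkIndexName(sourceIndex);
            if (attempt.test(target)) {
                return target; // shrunk index recovered; aliases are swapped next
            }
            // threshold breached: clean up the failed target, retry with a new name
        }
        return null; // all attempts exhausted
    }
}
```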
1 parent 259ca47 commit 9831084

46 files changed, +1901 −320 lines changed

build.gradle (+2 −2)

@@ -189,9 +189,9 @@ tasks.register("verifyVersions") {
  * after the backport of the backcompat code is complete.
  */

-boolean bwc_tests_enabled = true
+boolean bwc_tests_enabled = false
 // place a PR link here when committing bwc changes:
-String bwc_tests_disabled_issue = ""
+String bwc_tests_disabled_issue = "https://github.com/elastic/elasticsearch/pull/70107"
 /*
  * FIPS 140-2 behavior was fixed in 7.11.0. Before that there is no way to run elasticsearch in a
  * JVM that is properly configured to be in fips mode with BCFIPS. For now we need to disable

docs/reference/ilm/actions/ilm-shrink.asciidoc (+40 −28)

@@ -4,41 +4,29 @@

 Phases allowed: hot, warm.

-Sets an index to <<dynamic-index-settings, read-only>>
-and shrinks it into a new index with fewer primary shards.
-The name of the new index is of the form `shrink-<original-index-name>`.
-For example, if the name of the source index is _logs_,
-the name of the shrunken index is _shrink-logs_.
+Sets a source index to <<index-blocks-read-only,read-only>> and shrinks it into
+a new index with fewer primary shards. The name of the resulting index is
+`shrink-<random-uuid>-<original-index-name>`. This action corresponds to the
+<<indices-shrink-index,shrink API>>.

-The shrink action allocates all primary shards of the index to one node so it
-can call the <<indices-shrink-index,Shrink API>> to shrink the index.
-After shrinking, it swaps aliases that point to the original index to the new shrunken index.
+After the `shrink` action, any aliases that pointed to the source index point to
+the new shrunken index. If {ilm-init} performs the `shrink` action on a backing
+index for a data stream, the shrunken index replaces the source index in the
+stream. You cannot perform the `shrink` action on a write index.

-To use the `shrink` action in the `hot` phase, the `rollover` action *must* be present.
-If no rollover action is configured, {ilm-init} will reject the policy.
+To use the `shrink` action in the `hot` phase, the `rollover` action *must* be
+present. If no rollover action is configured, {ilm-init} will reject the policy.

 [IMPORTANT]
-If the shrink action is used on a <<ccr-put-follow,follower index>>,
-policy execution waits until the leader index rolls over (or is
-<<skipping-rollover, otherwise marked complete>>),
-then converts the follower index into a regular index with the
-<<ilm-unfollow,unfollow>> action before performing the shrink operation.
-
-If the managed index is part of a <<data-streams, data stream>>,
-the shrunken index replaces the original index in the data stream.
-
-[NOTE]
-This action cannot be performed on a data stream's write index. Attempts to do
-so will fail. To shrink the index, first
-<<manually-roll-over-a-data-stream,manually roll over>> the data stream. This
-creates a new write index. Because the index is no longer the stream's write
-index, the action can resume shrinking it.
-Using a policy that makes use of the <<ilm-rollover, rollover>> action
-in the hot phase will avoid this situation and the need for a manual rollover for future
-managed indices.
+If the shrink action is used on a <<ccr-put-follow,follower index>>, policy
+execution waits until the leader index rolls over (or is <<skipping-rollover,
+otherwise marked complete>>), then converts the follower index into a regular
+index with the <<ilm-unfollow,unfollow>> action before performing the shrink
+operation.

 [[ilm-shrink-options]]
 ==== Shrink options
+
 `number_of_shards`::
 (Optional, integer)
 Number of shards to shrink to.
@@ -103,3 +91,27 @@ PUT _ilm/policy/my_policy
 }
 }
 --------------------------------------------------
+
+[[ilm-shrink-shard-allocation]]
+==== Shard allocation for shrink
+
+During a `shrink` action, {ilm-init} allocates the source index's primary shards
+to one node. After shrinking the index, {ilm-init} reallocates the shrunken
+index's shards to the appropriate nodes based on your allocation rules.
+
+These allocation steps can fail for several reasons, including:
+
+* A node is removed during the `shrink` action.
+* No node has enough disk space to host the source index's shards.
+* {es} cannot reallocate the shrunken index due to conflicting allocation rules.
+
+When one of the allocation steps fails, {ilm-init} waits for the period set in
+<<index-lifecycle-step-wait-time-threshold,`index.lifecycle.step.wait_time_threshold`>>,
+which defaults to 12 hours. This threshold period lets the cluster resolve any
+issues causing the allocation failure.
+
+If the threshold period passes and {ilm-init} has not yet shrunk the index,
+{ilm-init} attempts to allocate the source index's primary shards to another
+node. If {ilm-init} shrunk the index but could not reallocate the shrunken
+index's shards during the threshold period, {ilm-init} deletes the shrunken
+index and re-attempts the entire `shrink` action.
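The fallback behavior described in the shard-allocation documentation can be condensed into a small decision function. The enum and method names below are illustrative, not the actual ILM code:

```java
// Hedged sketch of the post-threshold fallback: if the index was never shrunk,
// try allocating the source shards to another node; if it was shrunk but could
// not reallocate, delete the shrunken index and redo the whole shrink action.
class ThresholdFallbackSketch {
    enum Fallback { PICK_ANOTHER_NODE, DELETE_AND_RETRY_SHRINK }

    /** Decides the recovery path once the wait-time threshold is breached. */
    static Fallback afterThresholdBreach(boolean indexWasShrunk) {
        return indexWasShrunk ? Fallback.DELETE_AND_RETRY_SHRINK : Fallback.PICK_ANOTHER_NODE;
    }
}
```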

docs/reference/settings/ilm-settings.asciidoc (+7 −0)

@@ -59,6 +59,13 @@ An index that was rolled over would normally match the full format,
 for example `logs-2016.10.31-000002`).
 If the index name doesn't match the pattern, index creation fails.

+[[index-lifecycle-step-wait-time-threshold]]
+`index.lifecycle.step.wait_time_threshold`::
+(<<indices-update-settings,Dynamic>>, <<time-units,time value>>)
+Time to wait for the cluster to resolve allocation issues during an {ilm-init}
+<<ilm-shrink,`shrink`>> action. Must be greater than `1h` (1 hour). Defaults to
+`12h` (12 hours). See <<ilm-shrink-shard-allocation>>.
+
 `index.lifecycle.rollover_alias`::
 (<<indices-update-settings,Dynamic>>, string)
 The index alias to update when the index rolls over. Specify when using a
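A minimal sketch of the validation the setting description implies. The class and method names here are ours (the real setting is registered through {es}'s `Setting` infrastructure), and this sketch enforces a 1h floor for the value the docs say must exceed 1h:

```java
// Hedged sketch: index.lifecycle.step.wait_time_threshold defaults to 12h
// and may not be set below 1h.
class WaitTimeThresholdSetting {
    static final long HOUR_MILLIS = 60 * 60 * 1000L;
    static final long DEFAULT_MILLIS = 12 * HOUR_MILLIS; // 12h default

    /** Returns the effective threshold in millis, rejecting values under 1h. */
    static long effectiveThresholdMillis(Long configuredMillis) {
        if (configuredMillis == null) {
            return DEFAULT_MILLIS; // setting not configured: use the default
        }
        if (configuredMillis < HOUR_MILLIS) {
            throw new IllegalArgumentException("wait_time_threshold must be at least 1h");
        }
        return configuredMillis;
    }
}
```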

x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/CheckShrinkReadyStep.java (+5 −0)

@@ -39,6 +39,11 @@ public class CheckShrinkReadyStep extends ClusterStateWaitStep {
         super(key, nextStepKey);
     }

+    @Override
+    public boolean isRetryable() {
+        return true;
+    }
+
     @Override
     public Result isConditionMet(Index index, ClusterState clusterState) {
         IndexMetadata idxMeta = clusterState.metadata().index(index);
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/CleanupShrinkIndexStep.java (new file, +83)

/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License
 * 2.0; you may not use this file except in compliance with the Elastic License
 * 2.0.
 */
package org.elasticsearch.xpack.core.ilm;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.metadata.IndexMetadata;
import org.elasticsearch.common.Strings;
import org.elasticsearch.index.IndexNotFoundException;

import static org.elasticsearch.xpack.core.ilm.LifecycleExecutionState.fromIndexMetadata;

/**
 * Deletes the index identified by the shrink index name stored in the lifecycle state of the managed index (if any was generated)
 */
public class CleanupShrinkIndexStep extends AsyncRetryDuringSnapshotActionStep {
    public static final String NAME = "cleanup-shrink-index";
    private static final Logger logger = LogManager.getLogger(CleanupShrinkIndexStep.class);

    public CleanupShrinkIndexStep(StepKey key, StepKey nextStepKey, Client client) {
        super(key, nextStepKey, client);
    }

    @Override
    public boolean isRetryable() {
        return true;
    }

    @Override
    void performDuringNoSnapshot(IndexMetadata indexMetadata, ClusterState currentClusterState, Listener listener) {
        final String shrunkenIndexSource = IndexMetadata.INDEX_RESIZE_SOURCE_NAME.get(indexMetadata.getSettings());
        if (Strings.isNullOrEmpty(shrunkenIndexSource) == false) {
            // the current managed index is a shrunk index
            if (currentClusterState.metadata().index(shrunkenIndexSource) == null) {
                // if the source index does not exist, we'll skip deleting the
                // (managed) shrunk index as that will cause data loss
                String policyName = LifecycleSettings.LIFECYCLE_NAME_SETTING.get(indexMetadata.getSettings());
                logger.warn("managed index [{}] as part of policy [{}] is a shrunk index and the source index [{}] does not exist " +
                    "anymore. will skip the [{}] step", indexMetadata.getIndex().getName(), policyName, shrunkenIndexSource, NAME);
                listener.onResponse(true);
                return;
            }
        }

        LifecycleExecutionState lifecycleState = fromIndexMetadata(indexMetadata);
        final String shrinkIndexName = lifecycleState.getShrinkIndexName();
        // if the shrink index was not generated there is nothing to delete so we move on
        if (Strings.hasText(shrinkIndexName) == false) {
            listener.onResponse(true);
            return;
        }
        getClient().admin().indices()
            .delete(new DeleteIndexRequest(shrinkIndexName).masterNodeTimeout(getMasterTimeout(currentClusterState)),
                new ActionListener<AcknowledgedResponse>() {
                    @Override
                    public void onResponse(AcknowledgedResponse acknowledgedResponse) {
                        // even if not all nodes acked the delete request yet we can consider this operation as successful as
                        // we'll generate a new index name and attempt to shrink into the newly generated name
                        listener.onResponse(true);
                    }

                    @Override
                    public void onFailure(Exception e) {
                        if (e instanceof IndexNotFoundException) {
                            // we can move on if the index was deleted in the meantime
                            listener.onResponse(true);
                        } else {
                            listener.onFailure(e);
                        }
                    }
                });
    }

}
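The branching in `performDuringNoSnapshot` above can be condensed into a pure decision function. The enum and method names below are illustrative, not part of the ILM API:

```java
// Hedged sketch of CleanupShrinkIndexStep's cleanup decision:
// - a shrunk index whose source is gone must not be deleted (data loss),
// - no generated shrink index name means there is nothing to delete,
// - otherwise the failed shrunken index is deleted before retrying.
class CleanupDecisionSketch {
    enum Action { SKIP_TO_PROTECT_DATA, NOTHING_TO_DELETE, DELETE_SHRUNK_INDEX }

    /**
     * @param resizeSourceName  value of the resize-source setting on the managed
     *                          index, or null/empty if it is not a shrunk index
     * @param sourceStillExists whether that source index is still in the cluster
     * @param shrinkIndexName   generated shrink index name from lifecycle state
     */
    static Action decide(String resizeSourceName, boolean sourceStillExists, String shrinkIndexName) {
        boolean isShrunkIndex = resizeSourceName != null && resizeSourceName.isEmpty() == false;
        if (isShrunkIndex && sourceStillExists == false) {
            // deleting the managed shrunk index would lose data
            return Action.SKIP_TO_PROTECT_DATA;
        }
        if (shrinkIndexName == null || shrinkIndexName.trim().isEmpty()) {
            // no shrink index was ever generated
            return Action.NOTHING_TO_DELETE;
        }
        return Action.DELETE_SHRUNK_INDEX;
    }
}
```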
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/ClusterStateWaitUntilThresholdStep.java (new file, +143)

/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License
 * 2.0; you may not use this file except in compliance with the Elastic License
 * 2.0.
 */

package org.elasticsearch.xpack.core.ilm;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.metadata.IndexMetadata;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.Index;
import org.elasticsearch.xpack.core.ilm.step.info.SingleMessageFieldInfo;

import java.time.Clock;
import java.util.Locale;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;

import static org.elasticsearch.xpack.core.ilm.LifecycleExecutionState.fromIndexMetadata;

/**
 * This step wraps an {@link ClusterStateWaitStep} in order to be able to manipulate what the next step will be, depending on the result of
 * the wrapped {@link ClusterStateWaitStep}.
 * <p>
 * If the action response is complete, the {@link ClusterStateWaitUntilThresholdStep}'s nextStepKey will be the nextStepKey of the
 * wrapped action. When the threshold level is surpassed, if the underlying step's condition was not met, the nextStepKey will be changed to
 * the provided {@link #nextKeyOnThresholdBreach} and this step will stop waiting.
 *
 * Failures encountered whilst executing the wrapped action will be propagated directly.
 */
public class ClusterStateWaitUntilThresholdStep extends ClusterStateWaitStep {

    private static final Logger logger = LogManager.getLogger(ClusterStateWaitUntilThresholdStep.class);

    private final ClusterStateWaitStep stepToExecute;
    private final StepKey nextKeyOnThresholdBreach;
    private final AtomicBoolean thresholdPassed = new AtomicBoolean(false);

    public ClusterStateWaitUntilThresholdStep(ClusterStateWaitStep stepToExecute, StepKey nextKeyOnThresholdBreach) {
        super(stepToExecute.getKey(), stepToExecute.getNextStepKey());
        this.stepToExecute = stepToExecute;
        this.nextKeyOnThresholdBreach = nextKeyOnThresholdBreach;
    }

    @Override
    public boolean isRetryable() {
        return true;
    }

    @Override
    public Result isConditionMet(Index index, ClusterState clusterState) {
        IndexMetadata idxMeta = clusterState.metadata().index(index);
        if (idxMeta == null) {
            // Index must have been since deleted, ignore it
            logger.debug("[{}] lifecycle action for index [{}] executed but index no longer exists",
                getKey().getAction(), index.getName());
            return new Result(false, null);
        }

        Result stepResult = stepToExecute.isConditionMet(index, clusterState);

        if (stepResult.isComplete() == false) {
            // checking the threshold after we execute the step to make sure we execute the wrapped step at least once (because time is a
            // wonderful thing)
            TimeValue retryThreshold = LifecycleSettings.LIFECYCLE_STEP_WAIT_TIME_THRESHOLD_SETTING.get(idxMeta.getSettings());
            LifecycleExecutionState lifecycleState = fromIndexMetadata(idxMeta);
            if (waitedMoreThanThresholdLevel(retryThreshold, lifecycleState, Clock.systemUTC())) {
                // we retried this step enough, next step will be the configured to {@code nextKeyOnThresholdBreach}
                thresholdPassed.set(true);

                String message = String.format(Locale.ROOT, "[%s] lifecycle step, as part of [%s] action, for index [%s] executed for" +
                        " more than [%s]. Abandoning execution and moving to the next fallback step [%s]",
                    getKey().getName(), getKey().getAction(), idxMeta.getIndex().getName(), retryThreshold,
                    nextKeyOnThresholdBreach);
                logger.debug(message);

                return new Result(true, new SingleMessageFieldInfo(message));
            }
        }

        return stepResult;
    }

    static boolean waitedMoreThanThresholdLevel(@Nullable TimeValue retryThreshold, LifecycleExecutionState lifecycleState, Clock clock) {
        assert lifecycleState.getStepTime() != null : "lifecycle state [" + lifecycleState + "] does not have the step time set";
        if (retryThreshold != null) {
            // return true if the threshold was surpassed and false otherwise
            return (lifecycleState.getStepTime() + retryThreshold.millis()) < clock.millis();
        }
        return false;
    }

    @Override
    public StepKey getNextStepKey() {
        if (thresholdPassed.get()) {
            return nextKeyOnThresholdBreach;
        } else {
            return super.getNextStepKey();
        }
    }

    /**
     * Represents the {@link ClusterStateWaitStep} that's wrapped by this branching step.
     */
    ClusterStateWaitStep getStepToExecute() {
        return stepToExecute;
    }

    /**
     * The step key to be reported as the {@link ClusterStateWaitUntilThresholdStep#getNextStepKey()} if the index configured a max wait
     * time using {@link LifecycleSettings#LIFECYCLE_STEP_WAIT_TIME_THRESHOLD_SETTING} and the threshold was passed.
     */
    StepKey getNextKeyOnThreshold() {
        return nextKeyOnThresholdBreach;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        if (super.equals(o) == false) {
            return false;
        }
        ClusterStateWaitUntilThresholdStep that = (ClusterStateWaitUntilThresholdStep) o;
        return super.equals(o)
            && Objects.equals(stepToExecute, that.stepToExecute)
            && Objects.equals(nextKeyOnThresholdBreach, that.nextKeyOnThresholdBreach);
    }

    @Override
    public int hashCode() {
        return Objects.hash(super.hashCode(), stepToExecute, nextKeyOnThresholdBreach);
    }
}
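The threshold bookkeeping in this class reduces to two pure functions, sketched below with illustrative names. The real step consults cluster state and `LifecycleExecutionState`; here the inputs are plain millisecond values and strings:

```java
// Hedged sketch of ClusterStateWaitUntilThresholdStep's two decisions:
// has the wrapped wait step run past the threshold, and which step key
// should follow once it has.
class ThresholdSketch {

    /** Mirrors waitedMoreThanThresholdLevel: a null threshold never breaches. */
    static boolean waitedMoreThanThreshold(Long thresholdMillis, long stepTimeMillis, long nowMillis) {
        if (thresholdMillis == null) {
            return false;
        }
        // true only once the step has been waiting strictly longer than the threshold
        return stepTimeMillis + thresholdMillis < nowMillis;
    }

    /** Picks the fallback key once the threshold has been breached. */
    static String nextStepKey(boolean thresholdPassed, String normalNext, String fallbackNext) {
        return thresholdPassed ? fallbackNext : normalNext;
    }
}
```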
