[7.x] ILM: Make all the shrink action steps retryable (#70107) #70573

Merged: 4 commits into elastic:7.x on Mar 18, 2021

andreidan
Contributor

This makes the shrink action retryable. Every step is now retryable,
but so that ILM can still achieve a successful shrink when the target
node goes missing permanently or the shrunk index cannot recover, this
also introduces a retryable shrink cycle within the shrink action.

The shrink action generates a unique name for the shrunk index and
stores it in the lifecycle state.

If the shrink action waits for the source shards to colocate, or for
the shrunk index to recover, for longer than the configured
`LIFECYCLE_STEP_WAIT_TIME_THRESHOLD` setting, it moves back to clean up
the attempted (and failed) shrunk index, generates a new index name,
and retries shrinking the source into the newly named index.
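
For readers skimming the thread, the cycle described above can be sketched
roughly as follows. This is an illustrative Python simulation, not the Java
code in this PR: `LifecycleState`, `generate_shrink_index_name`,
`run_shrink_cycle`, the `shrink-<random>-<source>` naming scheme, and the
`max_cycles` cap are all hypothetical names and assumptions made for the
example.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LifecycleState:
    """Hypothetical stand-in for the per-index lifecycle state."""
    shrink_index_name: Optional[str] = None
    step_started_at: float = 0.0


def generate_shrink_index_name(source: str) -> str:
    # Assumed naming scheme: a random component makes every attempt's
    # target index name unique.
    return f"shrink-{uuid.uuid4().hex[:8]}-{source}"


def run_shrink_cycle(
    source: str,
    state: LifecycleState,
    shrink_done: Callable[[str], bool],
    delete_index: Callable[[str], None],
    now: Callable[[], float],
    wait_threshold: float,
    max_cycles: int = 3,
) -> Optional[str]:
    """Retry the whole shrink cycle until it succeeds or max_cycles is hit."""
    for _ in range(max_cycles):
        # Generate a fresh target name and record it in the lifecycle state.
        state.shrink_index_name = generate_shrink_index_name(source)
        state.step_started_at = now()

        # Wait for the shrink to complete, but only up to the threshold.
        while now() - state.step_started_at <= wait_threshold:
            if shrink_done(state.shrink_index_name):
                return state.shrink_index_name

        # Threshold exceeded: clean up the failed attempt and start over.
        delete_index(state.shrink_index_name)
        state.shrink_index_name = None
    return None
```

The clock and the index operations are passed in as callables so the cycle can
be exercised without a cluster; the cap on cycles is only there to keep the
sketch from looping forever.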

(cherry picked from commit 9831084)
Signed-off-by: Andrei Dan [email protected]

Backport of #70107

andreidan added the ":Data Management/ILM+SLM", "backport", and "v7.13.0" labels on Mar 18, 2021
elasticmachine added the "Team:Data Management" label on Mar 18, 2021

@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@andreidan
Contributor Author

@elasticmachine run elasticsearch-ci/2

@andreidan
Contributor Author

@elasticmachine update branch

andreidan merged commit 9b979fb into elastic:7.x on Mar 18, 2021