Add support for retries in Reindex API #60362

Closed
DylanGriffith opened this issue Jul 29, 2020 · 2 comments
Labels
:Distributed Indexing/Reindex Issues relating to reindex that are not caused by issues further down >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Comments

DylanGriffith commented Jul 29, 2020

When performing a reindex, I have several times seen the overall operation fail for transient reasons (e.g. `Node not connected` or `No search context found for id`). When this happens, I can try to track down the failed slices and retry them using manual slicing, but this is not ideal: it requires manual effort and is error-prone. Alternatively, if many slices failed, I may end up preferring to simply retry the entire reindex.
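For illustration, here is a minimal sketch of what retrying failed slices by hand might look like. It only builds the `_reindex` request bodies, using the `slice` object (`id`, `max`) from the manual slicing feature. The function name `manual_slice_bodies` and the index names are hypothetical, and it assumes the IDs of the failed slices are already known.

```python
def manual_slice_bodies(source_index, dest_index, total_slices, failed_slice_ids):
    """Build one _reindex request body per failed slice (hypothetical helper).

    total_slices must match the slice count of the original reindex,
    otherwise the slices would partition the source differently.
    """
    bodies = []
    for slice_id in sorted(set(failed_slice_ids)):
        if not 0 <= slice_id < total_slices:
            raise ValueError(f"slice id {slice_id} is outside 0..{total_slices - 1}")
        bodies.append({
            "source": {
                "index": source_index,
                # Manual slicing: restrict this request to a single slice.
                "slice": {"id": slice_id, "max": total_slices},
            },
            "dest": {"index": dest_index},
        })
    return bodies


# Example: slices 1 and 3 of a 5-slice reindex failed.
for body in manual_slice_bodies("old-index", "new-index", 5, [3, 1]):
    print(body)
```

Each body would then be sent as its own `POST _reindex` request, which is the manual and error-prone step described above.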

It would be good if the Reindex API supported a few more options. For example, a `retries` option could specify the number of times to retry a failed slice. If it is difficult to distinguish transient errors from other errors, you could also require the user to specify the `retry_errors` that are allowed to be retried, with anything else treated as a permanent failure.
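To make the intended semantics concrete, here is a rough client-side sketch of the behaviour I have in mind. None of this exists in the Reindex API: `retries` and `retry_errors` are the proposed options, and `SliceFailure`, `run_with_retries`, and `run_slice` are hypothetical names used only for this example.

```python
class SliceFailure(Exception):
    """Hypothetical error raised when one reindex slice fails."""

    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason


def run_with_retries(run_slice, slice_id, retries, retry_errors):
    """Run one slice, retrying up to `retries` times on allowed errors.

    run_slice:    callable taking a slice id (stand-in for the real work).
    retry_errors: substrings of failure reasons that count as transient;
                  any other failure is treated as permanent.
    """
    attempts = 0
    while True:
        try:
            return run_slice(slice_id)
        except SliceFailure as exc:
            transient = any(marker in exc.reason for marker in retry_errors)
            if not transient or attempts >= retries:
                raise  # permanent failure, or retry budget exhausted
            attempts += 1


# Example: a slice that fails twice with a transient error, then succeeds.
calls = {"count": 0}

def flaky_slice(slice_id):
    calls["count"] += 1
    if calls["count"] < 3:
        raise SliceFailure("Node not connected")
    return f"slice {slice_id} done"

print(run_with_retries(flaky_slice, 4, retries=3,
                       retry_errors=["Node not connected",
                                     "No search context found for id"]))
```

A real implementation would presumably live on the server side and add a backoff between attempts; this sketch only shows how the two options would interact.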

@DylanGriffith DylanGriffith added >enhancement needs:triage Requires assignment of a team area label labels Jul 29, 2020
@gwbrown gwbrown added :Distributed Indexing/Reindex Issues relating to reindex that are not caused by issues further down and removed needs:triage Requires assignment of a team area label labels Aug 5, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Reindex)

@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Aug 5, 2020
@henningandersen
Contributor

@DylanGriffith thanks for your interest in Elasticsearch. We have an existing issue to add resiliency to reindex (#42612), which seems to be what you are looking for?

Seeing "Node not connected" would normally signal a network-level problem, which may be worth looking into to improve stability until that effort lands.

Also, "No search context found for id" could be an indication of instability (network/node) or that the default scroll timeout is too small for your use case. It might be worth trying with a larger scroll timeout to see if this helps.
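As a rough sketch of the scroll timeout suggestion, the snippet below builds a reindex request that sets the `scroll` query parameter. The helper name `reindex_request`, the index names, and the `30m` value are assumptions for illustration only; a suitable timeout depends on how long each batch takes in your cluster.

```python
from urllib.parse import urlencode


def reindex_request(source_index, dest_index, scroll="30m"):
    """Return (path, body) for a reindex call with a custom scroll timeout.

    Hypothetical helper: it only assembles the request and sends nothing.
    """
    # `scroll` controls how long the search context is kept alive
    # between batches of the underlying scrolled search.
    path = "/_reindex?" + urlencode({"scroll": scroll})
    body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    return path, body


path, body = reindex_request("old-index", "new-index", scroll="30m")
print("POST", path)
print(body)
```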

Given that we already have this problem registered in #42612, I will go ahead and close this issue.

4 participants