Skip to content

Reindex search resiliency #45497

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

henningandersen
Copy link
Contributor

Local reindex can now survive loosing data nodes that contain source
data. The original query will be restarted with a filter for
_seq_no >= last_seq_no when a failure is detected.

Part of #42612 and split out from #43187

A couple of follow-ups will be done (all indicated with todos) in addition to what is in the meta issue:

  1. Fix the problem with allowPartialSearchResult in master
  2. Build the ScrollableHitSource separately rather than in super constructor.
  3. SeqNo mapping creation or avoid the issue.

Local reindex can now survive loosing data nodes that contain source
data. The original query will be restarted with a filter for
`_seq_no >= last_seq_no` when a failure is detected.

Part of elastic#42612 and split out from elastic#43187
@henningandersen henningandersen added >enhancement v8.0.0 :Distributed Indexing/Reindex Issues relating to reindex that are not caused by issues further down labels Aug 13, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@Tim-Brooks
Copy link
Contributor

I will take a look at this tomorrow.

Copy link
Contributor

@Tim-Brooks Tim-Brooks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly looking good to me. I don't see any issues with the approach or your design. It seems likely that the test failures are related to this change. Do you want to let me know when I should look again to officially approve?

@@ -135,7 +136,9 @@
this.listener = listener;
BackoffPolicy backoffPolicy = buildBackoffPolicy();
bulkRetry = new Retry(BackoffPolicy.wrap(backoffPolicy, worker::countBulkRetry), threadPool);
scrollSource = buildScrollableResultSource(backoffPolicy);
// todo: this is trappy, since if a subclass override relies on subclass fields, they are not initialized. We should fix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. This was causing me problems in a recent PR.

No longer retry/restart the original search request, since this is not
what we used to do and it leads to long wait time to get the info back
that a search request is bad.
@henningandersen
Copy link
Contributor Author

Thanks @tbrooks8 and sorry for not fixing the test failure before now. It should be ready for another round (provided tests succeed).

No longer fail on the empty index. So far we consider this a
workaround/hack more than the solution.
Use unmapped_type to circumvent problem with newly created indices
without a mapping, since this is local to reindex. Once type removal is
complete, ensuring that the metadata fields are always available likely
becomes easier, so we will defer a solution to the search/sort problem.
@henningandersen
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/1

Copy link
Contributor

@Tim-Brooks Tim-Brooks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Reindex Issues relating to reindex that are not caused by issues further down >enhancement v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants