Store reindexing result in reindex index #45260

Tim-Brooks · 2019-08-06T20:38:50Z

Currently the result of a reindex persistent task is propagated and
stored in the cluster state. This commit changes this so that only the
ephemeral task-id, headers, and reindex state are stored in the cluster
state. Any result (exception or response) is stored in the reindex
index.

Relates to #42612.

elasticmachine · 2019-08-06T20:38:53Z

Pinging @elastic/es-distributed

Tim-Brooks · 2019-08-06T23:27:37Z

@ywelsch @henningandersen - This work has exposed a few issues. Currently BulkByScrollResponse serializes some failures/search failures using ElasticsearchException#generateThrowableXContent. When you deserialize the response, it does not preserve the exception types. The exception types are necessary for the REST layer to return an appropriate status code.

Any thoughts on how I should go about fixing this?

Tim-Brooks · 2019-08-07T00:03:28Z

I could serialize the response as a raw byte field using the internal transport serialization? Or I could add the status codes as a field in the XContent?

ywelsch · 2019-08-08T07:22:36Z

We discussed this yesterday and decided to go with adding the RestStatus to ScrollableHitSource.SearchFailure, similar to what was done for BulkItemResponse.Failure.
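
The idea of carrying the status explicitly can be illustrated with a minimal sketch. This is not the Elasticsearch implementation: `SearchFailureSketch`, `toMap`, and `fromMap` are hypothetical names, a plain `Map` stands in for XContent, and an `int` stands in for `RestStatus`. It only shows that once the status is a field of its own, it survives a round trip even though the original exception type is lost.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a search failure that is persisted as a document. */
final class SearchFailureSketch {
    private final String reason;
    private final int status; // HTTP status code, stored as its own field

    SearchFailureSketch(String reason, int status) {
        this.reason = reason;
        this.status = status;
    }

    String getReason() { return reason; }

    int getStatus() { return status; }

    /** Serializes to a generic map, standing in for an XContent document. */
    Map<String, Object> toMap() {
        Map<String, Object> map = new HashMap<>();
        map.put("reason", reason);
        map.put("status", status);
        return map;
    }

    /** Rebuilds the failure: the exception type is gone, but the status is preserved. */
    static SearchFailureSketch fromMap(Map<String, Object> map) {
        String reason = (String) map.get("reason");
        int status = ((Number) map.get("status")).intValue();
        return new SearchFailureSketch(reason, status);
    }
}
```

With this shape, the layer that builds the HTTP response can read the status directly instead of inferring it from an exception class.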

henningandersen

LGTM.

Thanks @tbrooks8. I added a few smaller comments and then a couple to handle in follow-ups.

henningandersen · 2019-08-13T11:08:02Z

modules/reindex/src/main/java/org/elasticsearch/index/reindex/ReindexJobState.java

-        assert (reindexResponse == null) || (jobException == null) : "Either response or exception must be null";
-        this.reindexResponse = reindexResponse;
-        this.jobException = jobException;
+        this.status = status;


Should we assert that status != null? Mostly to ensure that the isDone() method cannot return true on null.
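
A minimal sketch of the concern, under stated assumptions: `JobStateSketch` is a hypothetical, simplified stand-in for `ReindexJobState`, the `DONE` value and the `isDone()` semantics are assumed, and `Objects.requireNonNull` is used in place of a plain `assert` so that the check also runs when JVM assertions are disabled.

```java
import java.util.Objects;

/** Hypothetical, simplified stand-in for ReindexJobState. */
final class JobStateSketch {
    // STARTED and the FAILED_* names appear in the review; DONE is an assumed value.
    enum Status { STARTED, DONE, FAILED_TO_READ_FROM_REINDEX_INDEX, FAILED_TO_WRITE_TO_REINDEX_INDEX }

    private final Status status;

    JobStateSketch(Status status) {
        // Reject null up front; otherwise isDone() below would report true for a null status.
        this.status = Objects.requireNonNull(status, "status must not be null");
    }

    Status getStatus() { return status; }

    /** Assumed semantics: anything other than STARTED counts as done. */
    boolean isDone() { return status != Status.STARTED; }
}
```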

henningandersen · 2019-08-13T11:18:05Z

server/src/main/java/org/elasticsearch/index/reindex/ReindexIndexClient.java

+        });
+    }
+
+    public void createReindexTaskDoc(String taskId, ReindexTaskIndexState reindexState, boolean indexExists,


I think it would be nice to make this method private and instead add a method that does not have the indexExists flag. The ReindexClient should then also receive the ClusterService in its constructor, enabling it to make the check on whether the index exists itself.
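
The suggested shape could look roughly like the sketch below. Everything in it is illustrative: `ReindexIndexClientSketch` is a hypothetical class, a `Predicate<String>` stands in for the lookup that a `ClusterService` would provide, the index name is assumed, and the recorded `actions` list replaces real index requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical client that decides for itself whether the index must be created. */
final class ReindexIndexClientSketch {
    static final String REINDEX_INDEX = ".reindex"; // assumed index name

    // Stand-in for a cluster-state lookup obtained from ClusterService.
    private final Predicate<String> indexExistsInClusterState;

    // Records what would have been sent, so the sketch stays self-contained.
    final List<String> actions = new ArrayList<>();

    ReindexIndexClientSketch(Predicate<String> indexExistsInClusterState) {
        this.indexExistsInClusterState = indexExistsInClusterState;
    }

    /** Public entry point: callers no longer pass an indexExists flag. */
    public void createReindexTaskDoc(String taskId) {
        boolean indexExists = indexExistsInClusterState.test(REINDEX_INDEX);
        createReindexTaskDoc(taskId, indexExists);
    }

    /** Private variant that keeps the flag as an internal detail. */
    private void createReindexTaskDoc(String taskId, boolean indexExists) {
        if (indexExists == false) {
            actions.add("create-index");
        }
        actions.add("index-doc:" + taskId);
    }
}
```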

henningandersen · 2019-08-13T11:28:43Z

modules/reindex/src/main/java/org/elasticsearch/index/reindex/ReindexTask.java

+                updateClusterStateToFailed(shouldStoreResult, ReindexJobState.Status.FAILED_TO_WRITE_TO_REINDEX_INDEX, ex);
+            }
+        });
+


nit: superfluous newline.

henningandersen · 2019-08-13T11:33:59Z

modules/reindex/src/main/java/org/elasticsearch/index/reindex/ReindexTask.java

        TaskManager taskManager = getTaskManager();
        assert taskManager != null : "TaskManager should have been set before reindex started";

-        updatePersistentTaskState(new ReindexJobState(taskId, null, wrapException(ex)), new ActionListener<>() {
+        ReindexTaskIndexState reindexState = new ReindexTaskIndexState(reindexRequest, response, null);
+        reindexIndexClient.updateReindexTaskDoc(getPersistentTaskId(), reindexState, new ActionListener<>() {


We now store the result in two places (three if we include the .tasks index): both the index and the persistent cluster state. I think we risk storing the index state and then dying, and waking up unable to read the index. We would then have the index indicate success and the task indicate failure. We also risk someone seeing that it succeeded in the index and then afterwards seeing that it failed (I think we cannot guarantee this 100%, but can likely handle it better). I think resolving these things is definitely outside the scope of this PR, but I wanted to mention it here for awareness.

henningandersen · 2019-08-13T11:38:56Z

modules/reindex/src/main/java/org/elasticsearch/index/reindex/ReindexTask.java

    }

-    private void sendStartedNotification(boolean shouldStoreResult, Runnable listener) {
-        updatePersistentTaskState(new ReindexJobState(taskId, null, null), new ActionListener<>() {
+    private void updateClusterStateToStarted(boolean shouldStoreResult, Runnable listener) {


I wonder if this name is the best name? I think the status would already be STARTED? Or is there a null status initially? I lean towards preferring the old sendStartedNotification name.

henningandersen · 2019-08-13T12:08:55Z

...es/reindex/src/main/java/org/elasticsearch/index/reindex/TransportStartReindexJobAction.java

+                                } else {
+                                    listener.onFailure(reindexState.getException());
+                                }
+                            }


Should we then delete the persistent task, given that we responded and everything is done? Probably to be done in a follow-up rather than in this PR.

henningandersen · 2019-08-13T12:11:36Z

...es/reindex/src/main/java/org/elasticsearch/index/reindex/TransportStartReindexJobAction.java

@@ -128,11 +115,30 @@ private void waitForReindexDone(String taskId, ActionListener<StartReindexJobAct
                @Override
                public void onResponse(PersistentTasksCustomMetaData.PersistentTask<ReindexJob> task) {
                    ReindexJobState state = (ReindexJobState) task.getState();
-                    if (state.getJobException() == null) {
-                        listener.onResponse(new StartReindexJobAction.Response(taskId, state.getReindexResponse()));
+                    if (state.getStatus() == ReindexJobState.Status.FAILED_TO_READ_FROM_REINDEX_INDEX) {


Also for a follow-up: it could be legitimate that we cannot read from the reindex index, for instance in a full system restart situation. I wonder if we should add some kind of retries with backoff to handle this?
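
As an illustration of what retries with backoff might look like, here is a minimal synchronous sketch. It is not taken from the PR: `BackoffRetrySketch` and `callWithBackoff` are hypothetical names, the delay doubles after each failed attempt, and the `sleeper` callback is injected so the waiting strategy can be swapped out. Real code in this area would more likely schedule retries asynchronously instead of blocking a thread.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

/** Hypothetical helper that retries a read with exponentially growing delays. */
final class BackoffRetrySketch {

    static <T> T callWithBackoff(Supplier<T> read, int maxAttempts,
                                 long initialDelayMillis, LongConsumer sleeper) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return read.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleeper.accept(delay); // wait before the next attempt
                    delay *= 2;            // double the delay each time
                }
            }
        }
        // Every attempt failed: surface the last failure to the caller.
        throw new IllegalStateException("all " + maxAttempts + " attempts failed", last);
    }
}
```

A caller could pass a sleeper that wraps `Thread.sleep`, or one that only records the requested delays in tests.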

Tim-Brooks added 5 commits August 5, 2019 13:22

WIP

1f3eb16

Merge remote-tracking branch 'upstream/reindex_v2' into store_response_in_index

a839779

WIP

dc03c6d

Tests

d0501de

Changes

65f7965

Tim-Brooks added the >non-issue, v8.0.0, and :Distributed Indexing/Reindex labels Aug 6, 2019

Tim-Brooks requested review from ywelsch and henningandersen August 6, 2019 20:38

Add license

7ff1847

Changes

6fac306

henningandersen approved these changes Aug 13, 2019

Tim-Brooks added 6 commits August 13, 2019 09:21

Merge branch 'reindex_v2' into store_response_in_index

d4c46d5

Merge branch 'reindex_v2' into store_response_in_index

730e205

Preserve status

e4b96de

Fix timeout

3315b93

Merge branch 'reindex_v2' into store_response_in_index

2f8d3f0

Changes

08cf6dc

Tim-Brooks merged commit 9c8143f into elastic:reindex_v2 Aug 14, 2019

Tim-Brooks deleted the store_response_in_index branch December 18, 2019 14:54

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
