
Commit 3fc35aa

jimczi authored and dnhatn committed
Shard Search Scroll failures consistency (#62061)
Today some uncaught shard failures, such as RejectedExecutionException, skip the release of the shard context and let subsequent scroll requests access the same shard context again. Depending on how far the other shards have advanced, this behavior can lead to missing data, since scrolls always move forward. In order to avoid hidden data loss, this commit ensures that we always release the context of a shard search scroll request whenever a failure occurs locally. The shard search context will no longer exist for subsequent scroll requests, which leads to consistent shard failures in the responses. This change also modifies the retry tests of the reindex feature. Reindex used to retry a scroll search request that contained a shard failure and move on whenever the failure disappeared. That is not compatible with how scrolls work and can lead to missing data, as explained above. This means that reindex will now report scroll failures when search rejections happen during the operation, instead of silently skipping documents. Finally, this change removes an old TODO that was fulfilled by #61062.
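The core of the fix can be illustrated with a minimal sketch. The class and method names below are hypothetical (they are not the actual Elasticsearch classes): the point is only the invariant the commit enforces, namely that a local failure during a scroll round releases the shard's context, so every later scroll request on that shard fails consistently instead of silently resuming from a stale cursor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical sketch of the invariant this commit enforces: a local failure
// during a scroll round frees the shard context, so subsequent scroll requests
// get a consistent "no search context" failure rather than stale or missing data.
class ScrollContextRegistry {
    private final Map<Long, String> contexts = new HashMap<>(); // contextId -> cursor state
    private long nextId = 1;

    long open(String shardCursor) {
        long id = nextId++;
        contexts.put(id, shardCursor);
        return id;
    }

    // Runs one scroll round; on any failure the context is released before rethrowing.
    String scroll(long contextId, Runnable shardWork) {
        String state = contexts.get(contextId);
        if (state == null) {
            throw new IllegalStateException("No search context found for id [" + contextId + "]");
        }
        try {
            shardWork.run();
            return state;
        } catch (RuntimeException e) {
            contexts.remove(contextId); // always release on local failure (the fix)
            throw e;
        }
    }
}
```

Once the first round trips over a RejectedExecutionException, the context is gone, and the next scroll on that shard reports a deterministic failure instead of re-reading from wherever the cursor happened to be.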
1 parent 4d528e9 commit 3fc35aa

File tree

6 files changed: +89 -152 lines changed

modules/reindex/src/test/java/org/elasticsearch/index/reindex/RetryTests.java (+8 -35)
@@ -53,7 +53,7 @@
 import static org.hamcrest.Matchers.hasSize;

 /**
- * Integration test for retry behavior. Useful because retrying relies on the way that the
+ * Integration test for bulk retry behavior. Useful because retrying relies on the way that the
  * rest of Elasticsearch throws exceptions and unit tests won't verify that.
  */
 public class RetryTests extends ESIntegTestCase {
@@ -84,7 +84,7 @@ protected Collection<Class<? extends Plugin>> transportClientPlugins() {
 }

 /**
- * Lower the queue sizes to be small enough that both bulk and searches will time out and have to be retried.
+ * Lower the queue sizes to be small enough that bulk will time out and have to be retried.
  */
 @Override
 protected Settings nodeSettings(int nodeOrdinal) {
@@ -152,22 +152,15 @@ private void testCase(
         BulkIndexByScrollResponseMatcher matcher)
         throws Exception {
     /*
-     * These test cases work by stuffing the search and bulk queues of a single node and
-     * making sure that we read and write from that node. Because of some "fun" with the
-     * way that searches work, we need at least one more node to act as the coordinating
-     * node for the search request. If we didn't do this then the searches would get stuck
-     * in the queue anyway because we force queue portions of the coordinating node's
-     * actions. This is not a big deal in normal operations but a real pain when you are
-     * intentionally stuffing queues hoping for a failure.
+     * These test cases work by stuffing the bulk queue of a single node and
+     * making sure that we read and write from that node.
      */

     final Settings nodeSettings = Settings.builder()
         // use pools of size 1 so we can block them
         .put("thread_pool.write.size", 1)
-        .put("thread_pool.search.size", 1)
-        // use queues of size 1 because size 0 is broken and because search requests need the queue to function
+        // use queues of size 1 because size 0 is broken and because bulk requests need the queue to function
         .put("thread_pool.write.queue_size", 1)
-        .put("thread_pool.search.queue_size", 1)
         .put("node.attr.color", "blue")
         .build();
     final String node = internalCluster().startDataOnlyNode(nodeSettings);
@@ -193,45 +186,25 @@ private void testCase(
     assertFalse(initialBulkResponse.buildFailureMessage(), initialBulkResponse.hasFailures());
     client().admin().indices().prepareRefresh("source").get();

-    logger.info("Blocking search");
-    CyclicBarrier initialSearchBlock = blockExecutor(ThreadPool.Names.SEARCH, node);
-
     AbstractBulkByScrollRequestBuilder<?, ?> builder = request.apply(internalCluster().masterClient());
     // Make sure we use more than one batch so we have to scroll
     builder.source().setSize(DOC_COUNT / randomIntBetween(2, 10));

+    logger.info("Blocking bulk so we start to get bulk rejections");
+    CyclicBarrier bulkBlock = blockExecutor(ThreadPool.Names.WRITE, node);
+
     logger.info("Starting request");
     ActionFuture<BulkByScrollResponse> responseListener = builder.execute();

     try {
-        logger.info("Waiting for search rejections on the initial search");
-        assertBusy(() -> assertThat(taskStatus(action).getSearchRetries(), greaterThan(0L)));
-
-        logger.info("Blocking bulk and unblocking search so we start to get bulk rejections");
-        CyclicBarrier bulkBlock = blockExecutor(ThreadPool.Names.WRITE, node);
-        initialSearchBlock.await();
-
         logger.info("Waiting for bulk rejections");
         assertBusy(() -> assertThat(taskStatus(action).getBulkRetries(), greaterThan(0L)));
-
-        // Keep a copy of the current number of search rejections so we can assert that we get more when we block the scroll
-        long initialSearchRejections = taskStatus(action).getSearchRetries();
-
-        logger.info("Blocking search and unblocking bulk so we should get search rejections for the scroll");
-        CyclicBarrier scrollBlock = blockExecutor(ThreadPool.Names.SEARCH, node);
         bulkBlock.await();

-        logger.info("Waiting for search rejections for the scroll");
-        assertBusy(() -> assertThat(taskStatus(action).getSearchRetries(), greaterThan(initialSearchRejections)));
-
-        logger.info("Unblocking the scroll");
-        scrollBlock.await();
-
         logger.info("Waiting for the request to finish");
         BulkByScrollResponse response = responseListener.get();
         assertThat(response, matcher);
         assertThat(response.getBulkRetries(), greaterThan(0L));
-        assertThat(response.getSearchRetries(), greaterThan(initialSearchRejections));
     } finally {
         // Fetch the response just in case we blew up half way through. This will make sure the failure is thrown up to the top level.
         BulkByScrollResponse response = responseListener.get();
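The test's blockExecutor(...) helper relies on a standard trick that can be sketched standalone. The class below is a hypothetical demo, not the actual helper: with a pool of size 1 and a queue of size 1, one barrier task parks the only worker, a no-op task occupies the single queue slot, and every further submission is rejected, which is exactly the condition the retry test's assertBusy(...) polls for.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the "stuff a size-1 pool and size-1 queue" pattern the
// retry test uses to force RejectedExecutionException deterministically.
public class BlockedExecutorDemo {
    static boolean demonstrateRejection() throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1));
        CyclicBarrier barrier = new CyclicBarrier(2);
        pool.execute(() -> { // handed straight to the single worker thread, which parks here
            try {
                barrier.await();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        pool.execute(() -> {}); // occupies the one queue slot
        boolean rejected = false;
        try {
            pool.execute(() -> {}); // no free worker, queue full -> rejected
        } catch (RejectedExecutionException e) {
            rejected = true;
        }
        barrier.await(); // release the worker so the pool can drain
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return rejected;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("rejected=" + demonstrateRejection());
    }
}
```

The first task is handed directly to the newly created worker rather than queued, so the second submission deterministically takes the lone queue slot and the third is always rejected.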

server/src/main/java/org/elasticsearch/action/search/SearchScrollAsyncAction.java (-15)
@@ -53,21 +53,6 @@
  * run separate fetch phases etc.
  */
 abstract class SearchScrollAsyncAction<T extends SearchPhaseResult> implements Runnable {
-    /*
-     * Some random TODO:
-     * Today we still have a dedicated executing mode for scrolls while we could simplify this by implementing
-     * scroll like functionality (mainly syntactic sugar) as an ordinary search with search_after. We could even go further and
-     * make the scroll entirely stateless and encode the state per shard in the scroll ID.
-     *
-     * Today we also hold a context per shard but maybe
-     * we want the context per coordinating node such that we route the scroll to the same coordinator all the time and hold the context
-     * here? This would have the advantage that if we loose that node the entire scroll is deal not just one shard.
-     *
-     * Additionally there is the possibility to associate the scroll with a seq. id. such that we can talk to any replica as long as
-     * the shards engine hasn't advanced that seq. id yet. Such a resume is possible and best effort, it could be even a safety net since
-     * if you rely on indices being read-only things can change in-between without notification or it's hard to detect if there where any
-     * changes while scrolling. These are all options to improve the current situation which we can look into down the road
-     */
     protected final Logger logger;
     protected final ActionListener<SearchResponse> listener;
     protected final ParsedScrollId scrollId;

server/src/main/java/org/elasticsearch/index/engine/Engine.java (+2 -1)
@@ -113,7 +113,8 @@ public abstract class Engine implements Closeable {
     public static final String FORCE_MERGE_UUID_KEY = "force_merge_uuid";
     public static final String MIN_RETAINED_SEQNO = "min_retained_seq_no";
     public static final String MAX_UNSAFE_AUTO_ID_TIMESTAMP_COMMIT_ID = "max_unsafe_auto_id_timestamp";
-    public static final String CAN_MATCH_SEARCH_SOURCE = "can_match"; // TODO: Make source of search enum?
+    public static final String SEARCH_SOURCE = "search"; // TODO: Make source of search enum?
+    public static final String CAN_MATCH_SEARCH_SOURCE = "can_match";

     protected final ShardId shardId;
     protected final String allocationId;
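The hunk above keeps a "TODO: Make source of search enum?" comment next to the string constants. A hedged sketch of what that enum could look like (hypothetical, not in the actual codebase) simply maps the two string keys from the diff to typed values:

```java
// Hypothetical enum fulfilling the diff's TODO: a typed "source of search"
// replacing the SEARCH_SOURCE ("search") and CAN_MATCH_SEARCH_SOURCE
// ("can_match") string constants.
public enum SearchSource {
    SEARCH("search"),
    CAN_MATCH("can_match");

    private final String key;

    SearchSource(String key) {
        this.key = key;
    }

    public String key() {
        return key;
    }
}
```

An enum would let callers switch over the search source exhaustively instead of comparing raw strings.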

0 commit comments