Skip to content

UpdateSettingsIT#testEngineGCDeletesSetting failure #38874

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
talevy opened this issue Feb 13, 2019 · 6 comments · Fixed by #38942
Closed

UpdateSettingsIT#testEngineGCDeletesSetting failure #38874

talevy opened this issue Feb 13, 2019 · 6 comments · Fixed by #38942
Assignees
Labels
:Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. >test-failure Triaged test failures from CI v7.0.0-rc2

Comments

@talevy
Copy link
Contributor

talevy commented Feb 13, 2019

test failing:

org.elasticsearch.indices.settings.UpdateSettingsIT testEngineGCDeletesSetting

#38254 introduced a new VersionConflictEngineException assertion that seems
to be dependent on cache invalidation timed by Thread.sleep.

exception:

java.lang.AssertionError: expected a class org.elasticsearch.index.engine.VersionConflictEngineException exception to be thrown

	at __randomizedtesting.SeedInfo.seed([103812063D7BAB96:96D53E0C3D9BE721]:0)
	at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertThrows(ElasticsearchAssertions.java:596)
	at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertThrows(ElasticsearchAssertions.java:543)
	at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertThrows(ElasticsearchAssertions.java:520)
	at org.elasticsearch.indices.settings.UpdateSettingsIT.testEngineGCDeletesSetting(UpdateSettingsIT.java:452)

was tough to get a stacktrace, but looks to fail in InternalEngine#planIndexingAsPrimary

with the exception:

[type][1]: version conflict, required seqNo [3], primary term [1]. but no document was found

ci link: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.0+internalClusterTest/472/console

reproduce:

./gradlew :server:integTest \
  -Dtests.seed=EAF25E026AF61CCF \
  -Dtests.class=org.elasticsearch.indices.settings.UpdateSettingsIT \
  -Dtests.method="testEngineGCDeletesSetting" \
  -Dtests.security.manager=true \
  -Dtests.locale=is-IS \
  -Dtests.timezone=Australia/Canberra \
  -Dcompiler.java=11 \
  -Druntime.java=8

This is easy to reproduce if one removes the Thread.sleep(300). Which means that
cache hadn't changed in the CI run in time for the assertion.

@talevy talevy added :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. >test-failure Triaged test failures from CI v7.0.0 labels Feb 13, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@talevy
Copy link
Contributor Author

talevy commented Feb 13, 2019

Thread.sleep(300); // wait for cache time to change TODO: this needs to be solved better. To be discussed.

The timing issue has a TODO, so it may be worth discussion instead of simply increasing the timeout?

@matriv matriv self-assigned this Feb 14, 2019
@talevy
Copy link
Contributor Author

talevy commented Feb 14, 2019

@matriv thought about this one, this diff should resolve the problem

--- a/server/src/test/java/org/elasticsearch/indices/settings/UpdateSettingsIT.java
+++ b/server/src/test/java/org/elasticsearch/indices/settings/UpdateSettingsIT.java
@@ -435,7 +435,7 @@ public class UpdateSettingsIT extends ESIntegTestCase {
         assertThat(getSettingsResponse.getSetting("test", "index.final"), nullValue());
     }
 
-    public void testEngineGCDeletesSetting() throws InterruptedException {
+    public void testEngineGCDeletesSetting() throws Exception {
         createIndex("test");
         client().prepareIndex("test", "type", "1").setSource("f", 1).get();
         DeleteResponse response = client().prepareDelete("test", "type", "1").get();
@@ -445,12 +445,10 @@ public class UpdateSettingsIT extends ESIntegTestCase {
         client().prepareIndex("test", "type", "1").setSource("f", 2).setIfSeqNo(seqNo).setIfPrimaryTerm(primaryTerm).get();
         client().admin().indices().prepareUpdateSettings("test").setSettings(Settings.builder().put("index.gc_deletes", 0)).get();
 
-        response = client().prepareDelete("test", "type", "1").get();
-        seqNo = response.getSeqNo();
-        Thread.sleep(300); // wait for cache time to change TODO: this needs to be solved better. To be discussed.
+        DeleteResponse secondDeleteResponse = client().prepareDelete("test", "type", "1").get();
         // delete is should not be in cache
-        assertThrows(client().prepareIndex("test", "type", "1").setSource("f", 3).setIfSeqNo(seqNo).setIfPrimaryTerm(primaryTerm),
-            VersionConflictEngineException.class);
+        assertBusy(() -> assertThrows(client().prepareIndex("test", "type", "1").setSource("f", 3)
+                .setIfSeqNo(secondDeleteResponse.getSeqNo()).setIfPrimaryTerm(primaryTerm), VersionConflictEngineException.class));
 
     }

matriv added a commit to matriv/elasticsearch that referenced this issue Feb 15, 2019
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: elastic#38874

Co-authored-by: Boaz Leskes <[email protected]>
@matriv
Copy link
Contributor

matriv commented Feb 15, 2019

@talevy Thanks for that.
We had a discussion with @bleskes who helped me here (thank you!) and we came up with: #38942 .

matriv added a commit that referenced this issue Feb 21, 2019
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: #38874

Co-authored-by: Boaz Leskes <[email protected]>
matriv added a commit to matriv/elasticsearch that referenced this issue Feb 21, 2019
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: elastic#38874

Co-authored-by: Boaz Leskes <[email protected]>
matriv added a commit to matriv/elasticsearch that referenced this issue Feb 21, 2019
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: elastic#38874

Co-authored-by: Boaz Leskes <[email protected]>
matriv added a commit that referenced this issue Feb 21, 2019
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: #38874

Co-authored-by: Boaz Leskes <[email protected]>
matriv added a commit that referenced this issue Feb 21, 2019
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: #38874

Co-authored-by: Boaz Leskes <[email protected]>
weizijun pushed a commit to weizijun/elasticsearch that referenced this issue Feb 22, 2019
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: elastic#38874

Co-authored-by: Boaz Leskes <[email protected]>
weizijun pushed a commit to weizijun/elasticsearch that referenced this issue Feb 22, 2019
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: elastic#38874

Co-authored-by: Boaz Leskes <[email protected]>
@dliappis dliappis reopened this Mar 17, 2020
@dnhatn dnhatn assigned dnhatn and unassigned matriv Mar 18, 2020
dnhatn pushed a commit that referenced this issue Mar 18, 2020
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: #38874

Co-authored-by: Boaz Leskes <[email protected]>
@dnhatn
Copy link
Member

dnhatn commented Mar 18, 2020

Backported to 6.8 in 3f3033b

@dnhatn dnhatn closed this as completed Mar 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. >test-failure Triaged test failures from CI v7.0.0-rc2
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants