[CI] Data Frame IT failures due to task not terminating #42344

Closed
davidkyle opened this issue May 22, 2019 · 6 comments · Fixed by #42373 or #42644
Labels
:ml/Transform Transform >test-failure Triaged test failures from CI

Comments

@davidkyle
Member

davidkyle commented May 22, 2019

A number of tests have failed because the wait for pending tasks check in the test teardown failed. Affected tests are:

  • DataFrameGetAndGetStatsIT
  • DataFramePivotRestIT
  • DataFrameMetaDataIT
  • DataFrameAuditorIT

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=sles-12/433/console

Many tests in the suite failed the wait for pending tasks check:

java.lang.AssertionError: 2 active tasks found:
data_frame/transforms[c]       d61iDVODQD-XrdRTsClKhw:1291 cluster:5                   persistent 1558492518692 02:35:18 49.8s       127.0.0.1 node-0 data_frame_pivot_stats_2
data_frame/transforms[c]       d61iDVODQD-XrdRTsClKhw:1474 cluster:6                   persistent 1558492519579 02:35:19 48.9s       127.0.0.1 node-0 data_frame_simpleStatsPivotWithQuery
 expected:<0> but was:<2>

For some reason the pivot_stats_2 and simpleStatsPivotWithQuery transforms did not stop.
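
When comparing several failed builds it can help to pull the names of the transforms that are still running out of the assertion message. The following is only a rough sketch, not part of the Elasticsearch test framework. It assumes the ten whitespace-separated columns seen in the task listing above (action, task id, parent, type, start time in millis, timestamp, running time, ip, node, description) and that the description is the transform id prefixed with `data_frame_`. The names `ActiveTask`, `parse_task_lines` and `stuck_transform_ids` are hypothetical.

```python
from typing import List, NamedTuple


class ActiveTask(NamedTuple):
    action: str
    task_id: str
    running_time: str
    node: str
    description: str


def parse_task_lines(text: str) -> List[ActiveTask]:
    """Parse task listing lines copied from the assertion message.

    Assumes ten whitespace-separated columns per task line; any line
    that does not match (for example the AssertionError header) is skipped.
    """
    tasks = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10 or not fields[0].startswith("data_frame/transforms"):
            continue
        tasks.append(
            ActiveTask(
                action=fields[0],
                task_id=fields[1],
                running_time=fields[6],
                node=fields[8],
                description=fields[9],
            )
        )
    return tasks


def stuck_transform_ids(text: str) -> List[str]:
    """Return transform ids, assuming descriptions look like data_frame_<id>."""
    prefix = "data_frame_"
    ids = []
    for task in parse_task_lines(text):
        desc = task.description
        ids.append(desc[len(prefix):] if desc.startswith(prefix) else desc)
    return ids
```

Pasting the assertion message from a failed build into `stuck_transform_ids` gives the list of transforms that were still running at teardown, which makes it easier to see whether the same transforms show up across builds.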

integTestCluster.log

One of the reproduce lines:

./gradlew :x-pack:plugin:data-frame:qa:single-node-tests:integTestRunner --tests "org.elasticsearch.xpack.dataframe.integration.DataFrameGetAndGetStatsIT.testGetPersistedStatsWithoutTask" \
  -Dtests.seed=A71AC2C8B8F32174 \
  -Dtests.security.manager=true \
  -Dtests.locale=en-GG \
  -Dtests.timezone=Europe/Podgorica \
  -Dcompiler.java=12 \
  -Druntime.java=11
@davidkyle davidkyle added >test-failure Triaged test failures from CI :ml/Transform Transform labels May 22, 2019
@elasticmachine
Collaborator

Pinging @elastic/ml-core

@davidkyle
Member Author

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/3636/console failed with the same wait for pending tasks assertions:

data_frame/transforms[c]       U282Ia5PQ-expVFf07m4JA:977  cluster:4                   persistent 1558504830488 06:00:30 16.6s       127.0.0.1 node-0 data_frame_simplePivotWithQuery
data_frame/transforms[c]       U282Ia5PQ-expVFf07m4JA:1148 cluster:5                   persistent 1558504831354 06:00:31 15.8s       127.0.0.1 node-0 data_frame_scriptedMetricPivot
data_frame/transforms[c]       U282Ia5PQ-expVFf07m4JA:1615 cluster:7                   persistent 1558504833384 06:00:33 13.8s       127.0.0.1 node-0 data_frame_simpleDateHistogramPivotWithMaxTime
data_frame/transforms[c]       U282Ia5PQ-expVFf07m4JA:1808 cluster:8                   persistent 1558504834099 06:00:34 13s         127.0.0.1 node-0 data_frame_simpleHistogramPivot
data_frame/transforms[c]       U282Ia5PQ-expVFf07m4JA:2208 cluster:10                  persistent 1558504835555 06:00:35 11.6s       127.0.0.1 node-0 data_frame_simpleDateHistogramPivo

@alpar-t
Contributor

alpar-t commented May 22, 2019

It seems that the failures so far were on sles and opensuse.
Might be a coincidence though.

@davidkyle
Member Author

Another failure where the tasks are not stopping properly:

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/3646/console

data_frame/transforms[c]       zt-t-x_OTt-OOeyJKBJbMg:1238 cluster:5                   persistent 1558524685283 11:31:25 5.4s        127.0.0.1 node-0 data_frame_simplePivotWithQuery
data_frame/transforms[c]       zt-t-x_OTt-OOeyJKBJbMg:1399 cluster:6                   persistent 1558524685925 11:31:25 4.8s        127.0.0.1 node-0 data_frame_simplePivot
data_frame/transforms[c]       zt-t-x_OTt-OOeyJKBJbMg:1579 cluster:7                   persistent 1558524686727 11:31:26 4s          127.0.0.1 node-0 data_frame_scriptedMetricPivot
data_frame/transforms[c]       zt-t-x_OTt-OOeyJKBJbMg:1776 cluster:8                   persistent 1558524687421 11:31:27 3.3s        127.0.0.1 node-0 data_frame_simpleHistogramPivot
data_frame/transforms[c]       zt-t-x_OTt-OOeyJKBJbMg:1938 cluster:9                   persistent 1558524688025 11:31:28 2.7s        127.0.0.1 node-0 data_frame_geoCentroidPivot
data_frame/transforms[c]       zt-t-x_OTt-OOeyJKBJbMg:2122 cluster:10                  persistent 1558524688670 11:31:28 2s          127.0.0.1 node-0 data_frame_bucketScriptPivot

@davidkyle davidkyle changed the title from "[CI] DataFrameGetAndGetStatsIT Failure" to "[CI] Data Frame IT failures due to task not terminating" May 22, 2019
davidkyle added a commit that referenced this issue May 22, 2019
@davidkyle
Member Author

davidkyle commented May 22, 2019

Muted

davidkyle added a commit that referenced this issue May 22, 2019
davidkyle added a commit that referenced this issue May 23, 2019
davidkyle added a commit that referenced this issue May 25, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this issue May 27, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this issue May 27, 2019
3 participants