[CI] Failure in yaml=reference/ml/anomaly-detection/apis/close-job/line_102 #48941

Closed
cbuescher opened this issue Nov 11, 2019 · 7 comments · Fixed by #63866
Labels
:ml Machine learning >test-failure Triaged test failures from CI

Comments

@cbuescher
Member

Build scan: https://gradle-enterprise.elastic.co/s/stzxbe6to2qqa/console-log?task=:docs:integTestRunner

Didn't reproduce locally for me

./gradlew ':docs:integTestRunner' --tests "org.elasticsearch.smoketest.DocsClientYamlTestSuiteIT.test {yaml=reference/ml/anomaly-detection/apis/close-job/line_102}" -Dtests.seed=304F226F3FE7503C -Dtests.security.manager=true -Dtests.locale=es-CR -Dtests.timezone=America/Porto_Acre -Dcompiler.java=12 -Druntime.java=11

From the logs:


1> [2019-11-11T04:24:15,485][INFO ][o.e.s.DocsClientYamlTestSuiteIT] [test] Stash dump on test failure [{
--
1>   "stash" : {
1>     "body" : {
1>       "error" : {
1>         "root_cause" : [
1>           {
1>             "type" : "exception",
1>             "reason" : "Unexpected job state [failed] while waiting for job to be opened",
1>             "stack_trace" : "org.elasticsearch.ElasticsearchException: Unexpected job state [failed] while waiting for job to be opened
1> 	at org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:47)
1> 	at org.elasticsearch.xpack.ml.action.TransportOpenJobAction$JobPredicate.test(TransportOpenJobAction.java:578)
1> 	at org.elasticsearch.xpack.ml.action.TransportOpenJobAction$JobPredicate.test(TransportOpenJobAction.java:527)
1> 	at org.elasticsearch.persistent.PersistentTasksService.lambda$waitForPersistentTaskCondition$1(PersistentTasksService.java:153)
1> 	at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.clusterChanged(ClusterStateObserver.java:191)
1> 	at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateListeners$6(ClusterApplierService.java:521)
1> 	at java.base/java.util.concurrent.ConcurrentHashMap$KeySpliterator.forEachRemaining(ConcurrentHashMap.java:3566)
1> 	at java.base/java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:735)
1> 	at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
1> 	at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListeners(ClusterApplierService.java:517)
1> 	at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:492)
1> 	at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
1> 	at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:176)
1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:699)
1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
1> 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
1> 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
1> 	at java.base/java.lang.Thread.run(Thread.java:834)
1> "
1>           }
1>         ],
1>         "type" : "exception",
1>         "reason" : "Unexpected job state [failed] while waiting for job to be opened",
1>         "stack_trace" : "org.elasticsearch.ElasticsearchException: Unexpected job state [failed] while waiting for job to be opened
1> 	at org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:47)
1> 	at org.elasticsearch.xpack.ml.action.TransportOpenJobAction$JobPredicate.test(TransportOpenJobAction.java:578)
1> 	at org.elasticsearch.xpack.ml.action.TransportOpenJobAction$JobPredicate.test(TransportOpenJobAction.java:527)
1> 	at org.elasticsearch.persistent.PersistentTasksService.lambda$waitForPersistentTaskCondition$1(PersistentTasksService.java:153)
1> 	at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.clusterChanged(ClusterStateObserver.java:191)
1> 	at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateListeners$6(ClusterApplierService.java:521)
1> 	at java.base/java.util.concurrent.ConcurrentHashMap$KeySpliterator.forEachRemaining(ConcurrentHashMap.java:3566)
1> 	at java.base/java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:735)
1> 	at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
1> 	at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListeners(ClusterApplierService.java:517)
1> 	at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:492)
1> 	at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
1> 	at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:176)
1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:699)
1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
1> 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
1> 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
1> 	at java.base/java.lang.Thread.run(Thread.java:834)
1> "
1>       },
1>       "status" : 500
1>     }
1>   }
1> }]
1> [2019-11-11T04:24:15,633][INFO ][o.e.s.DocsClientYamlTestSuiteIT] [test] [yaml=reference/ml/anomaly-detection/apis/close-job/line_102] after test

And in the node logs:


»    ↓ errors and warnings from /dev/shm/elastic+elasticsearch+master+multijob+fast+part1/docs/build/testclusters/integTest-0/logs/es.stdout.log ↓
--
» WARN ][o.e.d.FileBasedSeedHostsProvider] [node-0] expected, but did not find, a dynamic hosts list at [/dev/shm/elastic+elasticsearch+master+multijob+fast+part1/docs/build/testclusters/integTest-0/config/unicast_hosts.txt]
» WARN ][o.e.c.m.TemplateUpgradeService] [node-0] Templates are still reported as out of date after the upgrade. The template upgrade will be retried.
»   ↑ repeated 104 times ↑
» ERROR][o.e.x.m.j.p.a.NativeAutodetectProcessFactory] [node-0] Failed to launch autodetect for job total-requests
» WARN ][o.e.p.AllocatedPersistentTask] [node-0] task job-total-requests failed with an exception
»  org.elasticsearch.ElasticsearchException: Failed to launch autodetect for job total-requests
»  	at org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:51) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:126) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createAutodetectProcess(NativeAutodetectProcessFactory.java:75) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager.create(AutodetectProcessManager.java:511) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager.createProcessAndSetRunning(AutodetectProcessManager.java:462) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager$3.doRun(AutodetectProcessManager.java:407) ~[?:?]
»  	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:769) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
»  	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
»  	at java.lang.Thread.run(Thread.java:834) [?:?]
»  Caused by: java.io.FileNotFoundException: /dev/shm/elastic+elasticsearch+master+multijob+fast+part1/docs/build/testclusters/integTest-0/tmp/autodetect_total-requests_log_31951 (No such file or directory)
»  	at java.io.FileInputStream.open0(Native Method) ~[?:?]
»  	at java.io.FileInputStream.open(FileInputStream.java:219) ~[?:?]
»  	at java.io.FileInputStream.<init>(FileInputStream.java:157) ~[?:?]
»  	at java.io.FileInputStream.<init>(FileInputStream.java:112) ~[?:?]
»  	at org.elasticsearch.xpack.ml.utils.NamedPipeHelper$PrivilegedInputPipeOpener.run(NamedPipeHelper.java:288) ~[?:?]
»  	at org.elasticsearch.xpack.ml.utils.NamedPipeHelper$PrivilegedInputPipeOpener.run(NamedPipeHelper.java:277) ~[?:?]
»  	at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
»  	at org.elasticsearch.xpack.ml.utils.NamedPipeHelper.openNamedPipeInputStream(NamedPipeHelper.java:130) ~[?:?]
»  	at org.elasticsearch.xpack.ml.utils.NamedPipeHelper.openNamedPipeInputStream(NamedPipeHelper.java:97) ~[?:?]
»  	at org.elasticsearch.xpack.ml.process.ProcessPipes.connectStreams(ProcessPipes.java:132) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:122) ~[?:?]
»  	... 9 more
» WARN ][o.e.p.PersistentTasksClusterService] [node-0] persistent task job-total-requests failed
»  org.elasticsearch.ElasticsearchException: Failed to launch autodetect for job total-requests
»  	at org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:51) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:126) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createAutodetectProcess(NativeAutodetectProcessFactory.java:75) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager.create(AutodetectProcessManager.java:511) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager.createProcessAndSetRunning(AutodetectProcessManager.java:462) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager$3.doRun(AutodetectProcessManager.java:407) ~[?:?]
»  	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:769) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
»  	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
»  	at java.lang.Thread.run(Thread.java:834) [?:?]
»  Caused by: java.io.FileNotFoundException: /dev/shm/elastic+elasticsearch+master+multijob+fast+part1/docs/build/testclusters/integTest-0/tmp/autodetect_total-requests_log_31951 (No such file or directory)
»  	at java.io.FileInputStream.open0(Native Method) ~[?:?]
»  	at java.io.FileInputStream.open(FileInputStream.java:219) ~[?:?]
»  	at java.io.FileInputStream.<init>(FileInputStream.java:157) ~[?:?]
»  	at java.io.FileInputStream.<init>(FileInputStream.java:112) ~[?:?]
»  	at org.elasticsearch.xpack.ml.utils.NamedPipeHelper$PrivilegedInputPipeOpener.run(NamedPipeHelper.java:288) ~[?:?]
»  	at org.elasticsearch.xpack.ml.utils.NamedPipeHelper$PrivilegedInputPipeOpener.run(NamedPipeHelper.java:277) ~[?:?]
»  	at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
»  	at org.elasticsearch.xpack.ml.utils.NamedPipeHelper.openNamedPipeInputStream(NamedPipeHelper.java:130) ~[?:?]
»  	at org.elasticsearch.xpack.ml.utils.NamedPipeHelper.openNamedPipeInputStream(NamedPipeHelper.java:97) ~[?:?]
»  	at org.elasticsearch.xpack.ml.process.ProcessPipes.connectStreams(ProcessPipes.java:132) ~[?:?]
»  	at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:122) ~[?:?]
»  	... 9 more
» WARN ][o.e.t.ThreadPool         ] [node-0] failed to run scheduled task [org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor$1$$Lambda$4932/0x00000001010d1c40@12eb8b00] on thread pool [ccr]
»  org.elasticsearch.transport.NoSuchRemoteClusterException: no such remote cluster: [remote_cluster]
»  	at org.elasticsearch.transport.RemoteClusterService.getRemoteClusterClient(RemoteClusterService.java:395) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.client.node.NodeClient.getRemoteClusterClient(NodeClient.java:130) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.remoteClient(ShardFollowTasksExecutor.java:491) ~[?:?]
»  	at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor$1.lambda$scheduleBackgroundRetentionLeaseRenewal$16(ShardFollowTasksExecutor.java:461) ~[?:?]
»  	at org.elasticsearch.threadpool.Scheduler$ReschedulingRunnable.doRun(Scheduler.java:223) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:769) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
»  	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
»  	at java.lang.Thread.run(Thread.java:834) [?:?]
»   ↑ repeated 28 times ↑
» WARN ][o.e.t.RemoteClusterService] [node-0] failed to connect to new remote cluster remote_cluster within 10s
» WARN ][o.e.c.s.ClusterApplierService] [node-0] cluster state applier task [Publication{term=1, version=1957}] took [10s] which is above the warn threshold of [5s]: [running task [Publication{term=1, version=1957}]] took [0ms], [connecting to new nodes] took [0ms], [applying settings] took [10004ms], [running applier [org.elasticsearch.indices.cluster.IndicesClusterStateService@12558b90]] took [0ms], [running applier [org.elasticsearch.script.ScriptService@3c76d02e]] took [0ms], [running applier [org.elasticsearch.xpack.ilm.IndexLifecycleService@66cdc5f8]] took [0ms], [running applier [org.elasticsearch.repositories.RepositoriesService@32b156ab]] took [0ms], [running applier [org.elasticsearch.snapshots.RestoreService@c865266]] took [0ms], [running applier [org.elasticsearch.ingest.IngestService@6fb0478]] took [0ms], [running applier [org.elasticsearch.action.ingest.IngestActionForwarder@e11f35a]] took [0ms], [running applier [org.elasticsearch.action.admin.cluster.repositories.cleanup.TransportCleanupRepositoryAction$$Lambda$3257/0x0000000100c4b040@36ade83d]] took [0ms], [running applier [org.elasticsearch.tasks.TaskManager@88699ec]] took [0ms], [running applier [org.elasticsearch.snapshots.SnapshotsService@56008440]] took [0ms], [notifying listener [org.elasticsearch.cluster.InternalClusterInfoService@603ad16a]] took [0ms], [notifying listener [org.elasticsearch.xpack.security.support.SecurityIndexManager@66818e7a]] took [0ms], [notifying listener [org.elasticsearch.xpack.security.support.SecurityIndexManager@164fd4f7]] took [0ms], [notifying listener [org.elasticsearch.xpack.security.authc.TokenService$$Lambda$2020/0x0000000100902040@77eb4a90]] took [0ms], [notifying listener [org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$$Lambda$2093/0x0000000100923c40@41ec6486]] took [0ms], [notifying listener [org.elasticsearch.xpack.watcher.support.WatcherIndexTemplateRegistry@2b74c09f]] took [0ms], [notifying listener 
[org.elasticsearch.xpack.watcher.WatcherLifeCycleService@4374416a]] took [0ms], [notifying listener [org.elasticsearch.xpack.watcher.WatcherIndexingListener@6e662c5e]] took [0ms], [notifying listener [org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager@4c8cbd40]] took [0ms], [notifying listener [org.elasticsearch.xpack.ml.datafeed.DatafeedManager$TaskRunner@2979c43e]] took [0ms], [notifying listener [org.elasticsearch.xpack.ml.MlAssignmentNotifier@2b6e288f]] took [0ms], [notifying listener [org.elasticsearch.xpack.ml.MlInitializationService@48a0a709]] took [0ms], [notifying listener [org.elasticsearch.xpack.ilm.IndexLifecycleService@66cdc5f8]] took [0ms], [notifying listener [org.elasticsearch.xpack.core.slm.history.SnapshotLifecycleTemplateRegistry@155d38c5]] took [0ms], [notifying listener [org.elasticsearch.xpack.slm.SnapshotLifecycleService@58c16d2f]] took [0ms], [notifying listener [org.elasticsearch.xpack.ccr.action.ShardFollowTaskCleaner@7e35d919]] took [0ms], [notifying listener [org.elasticsearch.xpack.transform.TransformClusterStateListener@33e0b139]] took [0ms], [notifying listener [org.elasticsearch.cluster.metadata.TemplateUpgradeService@4f872c90]] took [1ms], [notifying listener [org.elasticsearch.node.ResponseCollectorService@1cfc985a]] took [0ms], [notifying listener [org.elasticsearch.snapshots.SnapshotShardsService@4c5167cd]] took [0ms], [notifying listener [org.elasticsearch.xpack.ml.action.TransportOpenJobAction$OpenJobPersistentTasksExecutor$$Lambda$2801/0x0000000100b43040@7a8e6e07]] took [0ms], [notifying listener [org.elasticsearch.xpack.ml.action.TransportStartDataFrameAnalyticsAction$TaskExecutor$$Lambda$2805/0x0000000100b44040@c1b2e6a]] took [0ms], [notifying listener [org.elasticsearch.persistent.PersistentTasksClusterService@745d59fb]] took [0ms], [notifying listener [org.elasticsearch.cluster.routing.DelayedAllocationService@337537b2]] took [0ms], [notifying listener 
[org.elasticsearch.indices.store.IndicesStore@6b61e767]] took [0ms], [notifying listener [org.elasticsearch.gateway.DanglingIndicesState@45014577]] took [0ms], [notifying listener [org.elasticsearch.persistent.PersistentTasksNodeService@566be4de]] took [0ms], [notifying listener [org.elasticsearch.license.LicenseService@3b87d961]] took [0ms], [notifying listener [org.elasticsearch.xpack.ccr.action.AutoFollowCoordinator@4c6fc748]] took [0ms], [notifying listener [org.elasticsearch.gateway.GatewayService@6ae28122]] took [0ms], [notifying listener [org.elasticsearch.cluster.service.ClusterApplierService$LocalNodeMasterListeners@61b5c29a]] took [0ms]
» WARN ][o.e.t.SniffConnectionStrategy] [node-0] fetching nodes from external cluster [cluster_three] failed
»  org.elasticsearch.transport.ConnectTransportException: [][127.0.0.1:9302] connect_exception
»  	at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:995) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$3(ActionListener.java:162) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:42) ~[elasticsearch-core-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
»  	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
»  	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
»  	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) ~[?:?]
»  	at org.elasticsearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:57) ~[elasticsearch-core-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:68) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[?:?]
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:321) ~[?:?]
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:337) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:688) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:600) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:554) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) ~[?:?]
»  	at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050) ~[?:?]
»  	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
»  	at java.lang.Thread.run(Thread.java:834) [?:?]
»  Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:9302
»  Caused by: java.net.ConnectException: Connection refused
»  	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
»  	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
»  	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330) ~[?:?]
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334) ~[?:?]
»  	... 7 more
» WARN ][o.e.t.SniffConnectionStrategy] [node-0] fetching nodes from external cluster [cluster_two] failed
»  org.elasticsearch.transport.ConnectTransportException: [][127.0.0.1:9301] connect_exception
»  	at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:995) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$3(ActionListener.java:162) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:42) ~[elasticsearch-core-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
»  	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
»  	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
»  	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) ~[?:?]
»  	at org.elasticsearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:57) ~[elasticsearch-core-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:68) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[?:?]
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:321) ~[?:?]
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:337) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:688) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:600) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:554) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) ~[?:?]
»  	at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050) ~[?:?]
»  	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
»  	at java.lang.Thread.run(Thread.java:834) [?:?]
»  Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:9301
»  Caused by: java.net.ConnectException: Connection refused
»  	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
»  	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
»  	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330) ~[?:?]
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334) ~[?:?]
»  	... 7 more
»   ↑ repeated 2 times ↑
» WARN ][o.e.t.SniffConnectionStrategy] [node-0] fetching nodes from external cluster [cluster_one] failed
»  org.elasticsearch.transport.ConnectTransportException: [][127.0.0.1:9300] connect_exception
»  	at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:995) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$3(ActionListener.java:162) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:42) ~[elasticsearch-core-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
»  	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
»  	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
»  	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) ~[?:?]
»  	at org.elasticsearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:57) ~[elasticsearch-core-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
»  	at org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:68) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608) ~[?:?]
»  	at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[?:?]
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:321) ~[?:?]
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:337) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:688) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:600) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:554) ~[?:?]
»  	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) ~[?:?]
»  	at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050) ~[?:?]
»  	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
»  	at java.lang.Thread.run(Thread.java:834) [?:?]
»  Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:9300
»  Caused by: java.net.ConnectException: Connection refused
»  	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
»  	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
»  	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330) ~[?:?]
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334) ~[?:?]
»  	... 7 more
» ERROR][o.e.x.m.p.l.CppLogMessageHandler] [node-0] [controller/32836] [CDetachedProcessSpawner.cc@184] Child process with PID 40379 was terminated by signal 9
»   ↓ last 40 non error or warning messages from /dev/shm/elastic+elasticsearch+master+multijob+fast+part1/docs/build/testclusters/integTest-0/logs/es.stdout.log ↓
@cbuescher cbuescher added >test-failure Triaged test failures from CI :ml Machine learning labels Nov 11, 2019
@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

@droberts195
Contributor

droberts195 commented Nov 11, 2019

Caused by: java.io.FileNotFoundException: /dev/shm/elastic+elasticsearch+master+multijob+fast+part1/docs/build/testclusters/integTest-0/tmp/autodetect_total-requests_log_31951 (No such file or directory)

This is the most important line of the error messages. There are basically 3 possibilities:

  1. It's some quirk of the precise way the Debian 9 worker has been set up (as the failing build ran on Debian 9)
  2. It's some quirk of creating named pipes on RAM disks (/dev/shm is a RAM disk)
  3. It's some sort of race condition between the Java and C++ code
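
For illustration only: the third possibility would be a race in which one side tries to open a path before the other side has created it. Below is a minimal Python sketch of that failure mode and of a poll-with-timeout guard against it. It uses a regular file as a stand-in for the named pipe and a thread as a stand-in for the native process. The names `wait_for_path`, `demo` and `example_log_pipe` are hypothetical and are not taken from the Elasticsearch code, and this is not a claim about how the real code behaves.

```python
import os
import tempfile
import threading
import time


def wait_for_path(path, timeout_s=2.0, poll_interval_s=0.01):
    """Poll until `path` exists or the timeout elapses; return True if it appeared."""
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical path name; a regular file stands in for the named pipe.
        pipe_path = os.path.join(tmp, "example_log_pipe")

        # Opening immediately fails: the producer has not created the path yet.
        try:
            open(pipe_path, "rb").close()
            opened_early = True
        except FileNotFoundError:
            opened_early = False

        # Stand-in for the native process: creates the path after a short delay.
        def producer():
            time.sleep(0.1)
            with open(pipe_path, "wb") as f:
                f.write(b"ready")

        t = threading.Thread(target=producer)
        t.start()
        appeared = wait_for_path(pipe_path)
        t.join()  # make sure the producer has finished writing before reading
        with open(pipe_path, "rb") as f:
            data = f.read()
        return opened_early, appeared, data


if __name__ == "__main__":
    print(demo())
```

If the consumer gives up before the producer has created the path, the open fails with a "No such file or directory" error like the one quoted above. Whether that is what happened in this build is exactly the open question.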

@droberts195
Contributor

#48933 is exactly the same underlying problem on Debian 9, but causing a different test to fail. (It could cause any ML test that opens a job to fail.)

@droberts195
Contributor

droberts195 commented Nov 12, 2019

Soon somebody is going to notice that this test is failing in 50% of all builds. This is mainly not due to the rare problem with named pipes that originally caused this issue to be raised, but instead due to another part of the docs enabling ML upgrade mode - see #48583 (comment). I muted the test in #49000 so hopefully these failures won't cause too much noise.

@cbuescher
Member Author

@droberts195 just going through old issues that can maybe be closed. I see the test for this one is still muted, at least on master, but #49023 also seems to have changed something around docs tests. Do you think this is still an issue?

@droberts195 droberts195 self-assigned this Oct 6, 2020
@droberts195
Contributor

Thanks for the ping @cbuescher. As luck would have it I think we have recently found the cause of:

the rare problem with named pipes that originally caused this issue to be raised

I believe it's #62823, which I am working on now. I aim to do the Java/C++ changes early in 7.11. I can unmute the test once they're merged.

@droberts195
Contributor

Note: when unmuting, the setup should now be // TEST[setup:Kibana sample data], which is different from what it was when the test was originally muted.

droberts195 added a commit to droberts195/elasticsearch that referenced this issue Oct 19, 2020
The original comment mentioned issue elastic#48583, but issue elastic#48941
is specifically open for this mute.  However, this is
inappropriate, as the underlying reason the test cannot be
unmuted is the same as for all the other tests skipped with the
comment "Kibana sample data": issues elastic#51572, elastic#51576 and elastic#51678.

Closes elastic#48941
droberts195 added two more commits with the same message that referenced this issue Oct 19, 2020