[CI] DeleteExpiredDataIT testDeleteExpiredDataWithStandardThrottle fails with "all shards failed" #62699

Open
droberts195 opened this issue Sep 21, 2020 · 9 comments
Labels: medium-risk, :ml, Team:ML, >test-failure

Comments

@droberts195
Contributor

droberts195 commented Sep 21, 2020

Build scan:

https://gradle-enterprise.elastic.co/s/w4q6k52ivcyim

Repro line:

./gradlew ':x-pack:plugin:ml:qa:native-multi-node-tests:javaRestTest' --tests "org.elasticsearch.xpack.ml.integration.DeleteExpiredDataIT.testDeleteExpiredDataWithStandardThrottle" \
  -Dtests.seed=DB3A8BFFCD011A4A \
  -Dtests.security.manager=true \
  -Dtests.locale=es-CU \
  -Dtests.timezone=America/Fortaleza \
  -Druntime.java=8

Reproduces locally?:

No

Applicable branches:

7.x, 7.9

Failure history:

https://build-stats.elastic.co/app/kibana#/discover?_g=(refreshInterval:(pause:!t,value:0),time:(from:now-30d,mode:quick,to:now))&_a=(columns:!(_source),index:b646ed00-7efc-11e8-bf69-63c8ef516157,interval:auto,query:(language:lucene,query:testDeleteExpiredDataWithStandardThrottle),sort:!(process.time-start,desc))

Failure excerpt:

org.elasticsearch.xpack.ml.integration.DeleteExpiredDataIT > testDeleteExpiredDataWithStandardThrottle FAILED
    Failed to execute phase [query], all shards failed
        at __randomizedtesting.SeedInfo.seed([DB3A8BFFCD011A4A:D35ADFF9435CF5D6]:0)
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:545)
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:311)
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:579)
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:387)
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.access$100(AbstractSearchAsyncAction.java:68)
        at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:245)
        at org.elasticsearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:73)
        at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59)
        at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:408)
        at org.elasticsearch.transport.TransportService$6.handleException(TransportService.java:640)
        at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1181)
        at org.elasticsearch.transport.InboundHandler.lambda$handleException$3(InboundHandler.java:277)
        at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:224)
        at org.elasticsearch.transport.InboundHandler.handleException(InboundHandler.java:275)
        at org.elasticsearch.transport.InboundHandler.handlerResponseError(InboundHandler.java:267)
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:131)
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:89)
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:700)
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142)
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117)
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82)
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:76)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1518)
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267)
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314)
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at java.lang.Thread.run(Thread.java:748)
droberts195 added the >test-failure and :ml labels Sep 21, 2020
@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

@droberts195
Contributor Author

Also seen in 7.9 in https://gradle-enterprise.elastic.co/s/4k3rednexsp46

@danielmitterdorfer
Member

Another instance in https://gradle-enterprise.elastic.co/s/xlfgaxlyono5o.

@davidkyle
Member

davidkyle commented Oct 26, 2020

davidkyle added a commit that referenced this issue Oct 26, 2020
davidkyle added a commit to davidkyle/elasticsearch that referenced this issue Oct 26, 2020
For elastic#62699
# Conflicts:
#	x-pack/plugin/ml/qa/native-multi-node-tests/src/javaRestTest/java/org/elasticsearch/xpack/ml/integration/DeleteExpiredDataIT.java
@droberts195
Contributor Author

Assigning medium-risk due to loss of test coverage from muting.

droberts195 added the medium-risk label Oct 10, 2023
elasticsearchmachine added the Team:ML label Oct 10, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine closed this as not planned (won't fix, can't repro, duplicate, stale) Nov 5, 2024
@elasticsearchmachine
Collaborator

This issue has been closed because it has been open for too long with no activity.

Any muted tests that were associated with this issue have been unmuted.

If the tests begin failing again, a new issue will be opened, and they may be muted again.

@elasticsearchmachine
Collaborator

This issue is getting re-opened because there are still AwaitsFix mutes for the given test. It will likely be closed again in the future.
