[CI] Test failed with AssertionError/Must be on selector thread #28729
Description
The test AzureMinimumMasterNodesTests.testSimpleOnlyMasterNodeElection failed today on CI:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+multijob-unix-compatibility/os=fedora/693
The test execution log is consoleText.txt. It shows that the test itself ran correctly, but an AssertionError was raised by the transport layer:
2> feb. 19, 2018 8:28:09 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
2> WARNING: Uncaught exception in thread: Thread[elasticsearch[node_s0][management][T#2],5,TGRP-AzureMinimumMasterNodesTests]
2> java.lang.AssertionError: Must be on selector thread
2> at __randomizedtesting.SeedInfo.seed([DF4EF41A200534E4]:0)
2> at org.elasticsearch.transport.nio.SocketSelector.executeFailedListener(SocketSelector.java:160)
2> at org.elasticsearch.transport.nio.SocketSelector.queueWrite(SocketSelector.java:111)
2> at org.elasticsearch.transport.nio.channel.TcpWriteContext.sendMessage(TcpWriteContext.java:50)
2> at org.elasticsearch.transport.nio.channel.TcpNioSocketChannel.sendMessage(TcpNioSocketChannel.java:38)
2> at org.elasticsearch.transport.TcpTransport.internalSendMessage(TcpTransport.java:1127)
2> at org.elasticsearch.transport.TcpTransport.sendRequestToChannel(TcpTransport.java:1113)
2> at org.elasticsearch.transport.TcpTransport.access$1700(TcpTransport.java:122)
2> at org.elasticsearch.transport.TcpTransport$NodeChannels.sendRequest(TcpTransport.java:482)
2> at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:598)
2> at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:518)
2> at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:506)
2> at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.start(TransportNodesAction.java:197)
2> at org.elasticsearch.action.support.nodes.TransportNodesAction.doExecute(TransportNodesAction.java:89)
2> at org.elasticsearch.action.support.nodes.TransportNodesAction.doExecute(TransportNodesAction.java:52)
2> at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167)
2> at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139)
2> at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81)
2> at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83)
2> at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72)
2> at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405)
2> at org.elasticsearch.client.support.AbstractClient$ClusterAdmin.execute(AbstractClient.java:712)
2> at org.elasticsearch.client.support.AbstractClient$ClusterAdmin.nodesStats(AbstractClient.java:808)
2> at org.elasticsearch.cluster.InternalClusterInfoService.updateNodeStats(InternalClusterInfoService.java:254)
2> at org.elasticsearch.cluster.InternalClusterInfoService.refresh(InternalClusterInfoService.java:290)
2> at org.elasticsearch.cluster.InternalClusterInfoService.maybeRefresh(InternalClusterInfoService.java:275)
2> at org.elasticsearch.cluster.InternalClusterInfoService.lambda$onMaster$0(InternalClusterInfoService.java:140)
2> at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573)
2> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
2> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
2> at java.lang.Thread.run(Thread.java:748)
It seems that TcpWriteContext.sendMessage() queued the write operation via SocketSelector.queueWrite(WriteOperation) because the write was not issued on a selector thread. At that point the channel was not yet closed and was still writable.
Then queueWrite() checked whether the selector was closed. Apparently it was, so it removed the write operation from the queue and called executeFailedListener(), which contains the failing assertion.
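To make the suspected sequence concrete, here is a minimal, self-contained sketch of the race as I understand it. This is not the actual SocketSelector code: ToySelector, reproduce() and the Runnable listener are hypothetical stand-ins, and the check uses an explicit throw instead of the assert keyword so it fires even without -ea.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

public class SelectorRaceSketch {

    /** Hypothetical, heavily simplified stand-in for the selector described above. */
    static final class ToySelector {
        private final ConcurrentLinkedQueue<Runnable> queuedWrites = new ConcurrentLinkedQueue<>();
        private final AtomicBoolean closed = new AtomicBoolean(false);
        private final Thread selectorThread;

        ToySelector(Thread selectorThread) {
            this.selectorThread = selectorThread;
        }

        void close() {
            closed.set(true);
        }

        boolean isOnSelectorThread() {
            return Thread.currentThread() == selectorThread;
        }

        /** May be called from any thread, e.g. a management thread. */
        void queueWrite(Runnable failureListener) {
            queuedWrites.offer(failureListener);
            // If the selector was closed in the meantime, take the write back
            // out of the queue and fail it on the *calling* thread.
            if (closed.get() && queuedWrites.remove(failureListener)) {
                executeFailedListener(failureListener);
            }
        }

        /** Assumed to be intended for the selector thread only. */
        void executeFailedListener(Runnable failureListener) {
            if (!isOnSelectorThread()) {
                throw new AssertionError("Must be on selector thread");
            }
            failureListener.run();
        }
    }

    /** Runs the sequence from a non-selector thread and reports what happened. */
    static String reproduce() {
        // The selector thread is a separate thread that is never the caller.
        ToySelector selector = new ToySelector(new Thread(() -> {}));
        selector.close(); // selector closes before the write is queued
        try {
            selector.queueWrite(() -> {});
            return "no error";
        } catch (AssertionError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(reproduce());
    }
}
```

In this toy model, any caller that is not the selector thread trips the check as soon as the selector is already closed, which matches the management-thread frame at the top of the stack trace.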
This assertion was added only recently and I'm not sure if or how to fix this, so I'm assigning this issue to you @tbrooks8 :)