[CI] NodeTests suite timeouts on new windows workers #44256
Closed
Labels
:Core/Infra/Core (Core issues without another label), >test-failure (Triaged test failures from CI)
Description
On Windows 2016 and Windows 2012 we are getting suite timeouts in what seems to be the very first suite run in that build.
It appears that the test takes a long time between finishing the test method and being marked as "done", possibly due to cleanup tasks in @After?
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.3+multijob-windows-compatibility/os=windows-2016/1/
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.3+multijob-windows-compatibility/os=windows-2012-r2/1/
Looking at the Windows-2016 failure:
- We start :server:test at 4:43 (04:43:11 > Task :server:test)
- We fail a little over 20 minutes (1,200,000 ms) later.
05:05:42 org.elasticsearch.node.NodeTests > testCloseOnLeakedStoreReference FAILED
05:05:42 java.lang.Exception: Test abandoned because suite timeout was reached
...
java.lang.Exception: Suite timeout exceeded (>= 1200000 msec).
- The node being tested was stopped early in the 20 minutes (7:45 in the node's TZ)
1> [2019-07-12T07:45:46,034][INFO ][o.e.n.Node ] [testCloseOnLeakedStoreReference] closed
- but the "after test" log doesn't come until we're killing off the suite (8:05)
1> [2019-07-12T08:05:47,854][INFO ][o.e.n.NodeTests ] [testCloseOnLeakedStoreReference] after test
It seems that something on this new ephemeral Windows CI worker is causing a problem moving this test method from "testing complete" to "test done".
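As a rough check of that gap, here is a small Python sketch that parses the two node log lines quoted above and measures the time between "closed" and "after test". It is not part of the Elasticsearch build. The helper names (`log_time`, `gap_ms`) are made up for illustration, and the timestamp pattern is an assumption based only on the two lines shown here.

```python
import re
from datetime import datetime, timedelta

# Assumed timestamp layout, taken from the two log lines quoted in this issue:
# [2019-07-12T07:45:46,034]
LOG_TS = re.compile(r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3})\]")


def log_time(line):
    """Extract the bracketed timestamp from a node log line (hypothetical helper)."""
    match = LOG_TS.search(line)
    if match is None:
        raise ValueError("no timestamp found in: " + line)
    return datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S,%f")


def gap_ms(earlier, later):
    """Whole milliseconds elapsed between two log lines (hypothetical helper)."""
    return (log_time(later) - log_time(earlier)) // timedelta(milliseconds=1)


closed = "1> [2019-07-12T07:45:46,034][INFO ][o.e.n.Node ] [testCloseOnLeakedStoreReference] closed"
after_test = "1> [2019-07-12T08:05:47,854][INFO ][o.e.n.NodeTests ] [testCloseOnLeakedStoreReference] after test"

SUITE_TIMEOUT_MS = 1_200_000  # from "Suite timeout exceeded (>= 1200000 msec)"

gap = gap_ms(closed, after_test)
print(gap, gap >= SUITE_TIMEOUT_MS)
```

For these two lines the gap comes out at 1,201,820 ms, so the time between the node closing and the "after test" log accounts for essentially the whole 1,200,000 ms suite timeout. That fits the idea that the test body itself finished quickly and the delay sits in whatever runs afterwards.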