tests/fix randomly failing testWatcherRestart#35243
Force-pushed: cc13469 to bf08783
|
Pinging @elastic/es-core-infra |
|
@danielmitterdorfer would you be interested in reviewing this? |
|
@pgomulka sorry missed your review request earlier today. It seems the test has failed on the PR build in CI? |
fix typos and refactor to DRY up documentation for bulk, reindex and migration apis relates elastic#35345
Force-pushed: 4162c70 to 9f6ea40
|
testWatcherRestart seems to be failing on the 6.x branch because it runs against a cluster that is not in the expected state (all watchers started). This is most likely because a previous test did not properly clean up the cluster after its run. It could be either Here we can see the received state of the cluster. An additional assertion was added to wait until the cluster is in the expected state. The test fails because the watcher never recovers from the failed state. However, this test is also on master as |
|
@pgomulka fyi I muted some of the tests in #33753 (comment) |
|
@cbuescher The fix from this PR won't be enough. I will look into this again as #35271 might help here |
only muting two failing test cases in a subclass where they fail. issue referencing the problem: elastic#36598
|
retest this please |
|
@cbuescher it was usually not easy to reproduce locally, but I ran the failing jobs again on my machine and it seems the problem does not occur, at least on these seeds. |
|
I have a test failure in a 6.x PR that contains the merged commit: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request-2/4142/console I'll quickly scan the last 6.x CI failures and will revert 7f20c0b if I find more WatchBackwardsCompatibilityIT failures. |
|
There seem to be several; I'll just add the last one here: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+intake/933/console Will revert 7f20c0b |
This reverts commit 7f20c0b.
As noted on the original ticket, the test fails when it starts to perform a stop while watchers are still in the starting phase. This is reproducible on 6.x (it fails around 10% of the time).
After adding an additional assertion that waits for the watchers to reach the started phase, this seems to be fixed: no failures in 1000 runs.
closes #33753
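For readers unfamiliar with the pattern, the "wait until started before stopping" assertion described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from this PR: the class `WatcherStateWait`, the `WatcherState` enum, and the `awaitState` helper are all hypothetical names, and the real test presumably queries the watcher stats of the cluster through the test framework's own helpers rather than a plain `Supplier`.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class WatcherStateWait {

    // Hypothetical states, for illustration only.
    enum WatcherState { STARTING, STARTED, STOPPING, STOPPED }

    /**
     * Polls the supplied state until it equals the expected one or the
     * timeout elapses. Returns true if the expected state was observed.
     */
    static boolean awaitState(Supplier<WatcherState> current, WatcherState expected,
                              long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        long sleepMillis = 1;
        while (true) {
            if (current.get() == expected) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without reaching the expected state
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
            // Back off between polls, capped at 100 ms.
            sleepMillis = Math.min(sleepMillis * 2, 100);
        }
    }

    public static void main(String[] args) {
        // Fake watcher that reports STARTING twice before it is STARTED.
        AtomicInteger polls = new AtomicInteger();
        Supplier<WatcherState> fakeWatcher = () ->
            polls.incrementAndGet() < 3 ? WatcherState.STARTING : WatcherState.STARTED;

        // Only issue the stop once the watcher has actually started.
        if (awaitState(fakeWatcher, WatcherState.STARTED, 5, TimeUnit.SECONDS)) {
            System.out.println("watcher started, safe to stop");
        } else {
            throw new AssertionError("watcher did not reach STARTED in time");
        }
    }
}
```

The point of the sketch is the ordering: the stop request is only sent after the started state has been observed, which removes the window in which a stop races with a watcher that is still starting.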