systests: cp: add wait_for_ready #20912
openshift-merge-bot[bot] merged 1 commit into containers:main
Conversation
Luap99
left a comment
Sounds like a logical explanation to me, but I think you have overdone it a bit.
I half-agree. My first pass addressed only the touch/mkdir containers. After some testing, and some thinking about it, I decided I never want to look at this flake again, so I then applied it across the board.
Harmful, no, but it makes the diff here bigger than it needs to be, and it makes the tests slower: they now always call podman logs even when it is not needed.
OK. I'll repush once CI finishes.
Force-pushed: 56bde48 to 18a268f
Some of the tests were doing "podman run -d" without wait_for_ready.
This may be the cause of some of the CI flakes. Maybe even all?
It's not clear why the tests have been working reliably for years
under overlay, and only started failing under vfs, but shrug.

Thanks to Chris for making that astute observation.

Fixes: containers#20282 (I hope)

Signed-off-by: Ed Santiago <santiago@redhat.com>
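For readers unfamiliar with the flake pattern being fixed: `podman run -d` returns as soon as the container is created, so a test that immediately inspects the container's side effects can race the container's startup. The sketch below illustrates the general "wait for a sentinel in the logs" idea; `wait_for_sentinel` is a hypothetical stand-in, not the actual `wait_for_ready` helper from podman's test/system/helpers.bash.

```shell
#!/bin/sh
# Hypothetical sketch of the wait_for_ready pattern (NOT the real
# helpers.bash implementation): poll the output of a command until a
# sentinel string appears, or give up after a timeout.
wait_for_sentinel() {
    cmd=$1        # command whose output we poll, e.g. "podman logs c1"
    sentinel=$2   # string the container prints once it is ready
    timeout=${3:-5}
    t=0
    while [ "$t" -lt "$timeout" ]; do
        if eval "$cmd" 2>/dev/null | grep -q "$sentinel"; then
            return 0
        fi
        sleep 1
        t=$((t + 1))
    done
    echo "timed out waiting for '$sentinel'" >&2
    return 1
}

# In a real test, usage would look something like (assumption):
#   podman run -d --name c1 $IMAGE sh -c 'touch /tmp/file; echo READY; sleep 30'
#   wait_for_sentinel "podman logs c1" READY
#   ...now it is safe to podman cp files out of c1...
```

Without the wait, a subsequent `podman cp` can run before `touch` has executed inside the container, which is exactly the kind of timing-dependent flake that a slower storage backend (vfs) would surface more often than overlay.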
Force-pushed: 18a268f to 4d2125b
Done.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago, Luap99

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Details: Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
/lgtm
Thanks for fixing this, Ed. Hopefully it was the cause.

If it helps, and this is a total guess: my feeling is that the unpredictability of the failures comes from the storage subsystem in the cloud context. All the CI VMs are running with (presumably multi-path) fibre-channel/network-based storage. That in and of itself adds a HUGE amount of complexity, both in the kernel and hardware-wise. Worse, both bandwidth and IOPS are "provisioned" (i.e. limited) based on what you pay for. Either or both of those aspects could easily result in randomly appearing "hiccups" in user space. In other words, we should expect both the cloud "throttling" reads and/or writes, and occasional (transparent) hiccups within the hardware or the network "fabric" itself.