ignore ENOENT errors when parsing .d files#2696
Conversation
As always listing files in a dir to then read them is racy as the file might have been removed in the meantime. Thus we must ignore ENOENT errors when the file is opened. This is not just a theoretical problem, the reason I am here is because it caused a flake in the podman CI[1]: ... open /etc/containers/registries.d/podman-test-only-temporary-addition.yaml: no such file or directory [1] https://api.cirrus-ci.com/v1/artifact/task/6673405799301120/html/int-podman-fedora-40-root-host-boltdb.log.html#t--Podman-push-podman-push-to-local-registry-with-authorization--1 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
As always listing files in a dir to then read them is racy as the file might have been removed in the meantime. Thus we must ignore ENOENT errors when the file is opened. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
As always listing files in a dir to then read them is racy as the file might have been removed in the meantime. Thus we must ignore ENOENT errors when the file is opened. Now here the code already did not cause an hard error but it will cause a spurious warning in such case. There is really no need to log that as it can cause flakes for podman. Now there is the case here for .cert and .key files where both files must be present for a valid config. Ignoring ENOENT there seems wrong as it would hide a common misconfiguration where only one of the files exists. That mean the race can still cause a failure when these files are removed from the dir. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
LGTM |
mtrmac
left a comment
There was a problem hiding this comment.
Thanks!
(I’m not sure this will quite fix the Podman flake — there is still a race between creating the /etc/containers/registries.d/podman-test-only-temporary-addition.yaml file and actually writing data to it, IIRC cp is defined not to use the atomic Rename() approach — but in general, with Podman running potentially in system services, it would be hard for users to coordinate config file changes so that they reliably never interfere with a running c/image code; so I agree this is a good improvement in any case.)
Ah right, I didn't consider the possibility of the partial file there. That is certainly something that needs fixing in the podman suite long term. |
There is a expected race condition between listing files and actually opening them. Between that time the file might have been removed already. This is not really a real error so we should just ignore it as otherwise flakes will happen in the wild.
See the first commit for a real flake, the other two ones I have not observed but I thought I might as well fix it for all .d readers. Are there other files I missed?