ignition: Fix PXE by using mount namespace rather than remounting RO #103
chewi merged 1 commit into flatcar-master
Remounting /usr read-only was failing because systemd v256 holds a read-write file descriptor on systemd-executor. Inside a mount namespace there is no need to remount read-only afterwards, and no need to unmount the OEM partition either. Co-authored-by: Jeremi Piotrowski <jpiotrowski@microsoft.com> Signed-off-by: James Le Cuirot <jlecuirot@microsoft.com>
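As a rough illustration of why a mount namespace removes the need for cleanup, the sketch below mounts a tmpfs inside a private mount namespace and then checks that the mount is not visible from the original namespace. It is not the code from this PR: the function name demo_private_mount and the tmpfs stand-in (used instead of touching /usr) are assumptions made for illustration, and it needs root plus util-linux unshare to do anything.

```shell
#!/bin/sh
# Hypothetical demo, not the PR's actual change: a mount made inside a
# private mount namespace disappears when the namespace's last process exits.
demo_private_mount() {
    dir=$(mktemp -d)
    # unshare --mount runs the command in a new mount namespace.
    # The tmpfs is a stand-in for "remount /usr read-write and work on it".
    unshare --mount sh -c 'mount -t tmpfs tmpfs "$1" && touch "$1/inside"' sh "$dir"
    # Back in the original namespace, the tmpfs and its file are not visible,
    # so there is nothing to unmount or remount read-only.
    if [ -e "$dir/inside" ]; then echo "visible"; else echo "gone"; fi
    rmdir "$dir"
}

# Only attempt the demo when we are root and unshare is usable.
if [ "$(id -u)" = 0 ] && unshare --mount true 2>/dev/null; then
    demo_private_mount
fi
```

Because the changes never reach the host's mount table, a failure to remount read-only (for example due to an open read-write file descriptor) cannot occur in this arrangement.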
Hello, can you share the bug link or description that this PR fixes?
It was only discussed on Matrix. The above basically sums it up though. This went unnoticed initially because we don't run tests on Equinix Metal (EM) by default, and it's known to be flaky sometimes.
Can you please provide the context and the actual error logs / files on this matter here? A PR needs to have a link to an issue or a clear description of the issue, including error logs.
Both amd64 and arm64 CI tests totally failed. The logs were full of errors like this:
Logging in via the serial console showed that it was stuck in the emergency shell having failed to boot. This was due to ignition-setup.service failing, which showed this:
Running lsof via the mounted /sysroot/usr showed a read-write file descriptor within /usr, indicated by the
krnowak left a comment
That's a neat fix, especially since the unit does not set up any mounts that need to remain after the process has finished.
Aside from that, I'm wondering how this worked in the "normal" case, where the later trap 'retry-umount "${src}"' EXIT clobbers the earlier trap 'mount -o remount,ro /usr' EXIT. Without the clobbering, the same issue would probably arise on other platforms too.
Also, it looks like /usr was remounted read-only anyway, but at a later stage? I don't remember seeing problems with a writeable /usr.
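The clobbering described in this comment is standard shell behaviour: a second trap on EXIT replaces the first rather than adding to it. A minimal sketch, using echo as a stand-in for the real mount and retry-umount commands (the function names demo_clobbered and demo_chained are hypothetical):

```shell
#!/bin/sh
# Each function body runs in a subshell "( ... )" so that its EXIT trap
# fires when the function returns, without affecting the calling shell.

demo_clobbered() (
    trap 'echo "remount ro"' EXIT     # stand-in for: mount -o remount,ro /usr
    trap 'echo "retry-umount"' EXIT   # stand-in for: retry-umount "${src}"
    # Only the second handler is still registered at this point.
)

demo_chained() (
    # One handler that performs both cleanup steps, so neither is lost.
    cleanup() {
        echo "retry-umount"
        echo "remount ro"
    }
    trap cleanup EXIT
)

demo_clobbered   # prints: retry-umount
demo_chained     # prints: retry-umount, then: remount ro
```

In the clobbered variant the read-only remount never runs, which would explain why the failing remount went unnoticed on code paths that set the second trap.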
Very good points! I had wondered why it only affected PXE, and that totally explains it.
Jenkins is all green. I included Equinix Metal on amd64 and arm64.