bin/flatcar-install: run write_to_disk from subshell#97
bin/flatcar-install: run write_to_disk from subshell#97tormath1 merged 1 commit intoflatcar-masterfrom
Conversation
bin/flatcar-install
Outdated
| done | ||
| [ -z "$try" ] || exit 1 | ||
| udevadm settle | ||
| udevadm settle && return |
There was a problem hiding this comment.
Would this work, too, because when udevadm settle fails, the script exists due to set -e?
| udevadm settle && return | |
| udevadm settle | |
| return |
There was a problem hiding this comment.
A comment for this workaround would also be good
There was a problem hiding this comment.
Thanks for the suggestion - it works fine.
There was a problem hiding this comment.
Why do we need the trap? Can't the write the "done" in the last line instead of the return?
There was a problem hiding this comment.
Ok, I think the idea was to write the "done" as soon as the function did something, instead of the last line it should be done as first.
There was a problem hiding this comment.
The current approach with the explicit return doesn't cover the case where dd failed and thus the is_modified check wouldn't work anymore.
There was a problem hiding this comment.
I followed @krnowak approach and use a subshell + EXIT trap instead of explicit return - but I still can't explain why it works on Equinix Metal setup.
|
I remember replacing all RETURN traps to EXIT traps in subshells because of that. See it here: flatcar/scripts@eee6b50 |
Wow, that's interesting thanks for sharing. Interesting to to see that we did not get the issue with 5.1 but in 5.2... |
|
The install script is tested in the Equinix Metal installation path, strange that we didn't see the issue there? |
5074ccd to
f962af0
Compare
Thanks, I'm now even more confused. :D |
f962af0 to
a7c2128
Compare
bin/flatcar-install
Outdated
| ( | ||
| mkfifo -m 0600 "${WORKDIR}/disk_modified" | ||
| trap '(exec 2>/dev/null ; echo done > "${WORKDIR}/disk_modified") &' RETURN | ||
| trap '(exec 2>/dev/null ; echo done > "${WORKDIR}/disk_modified") &' EXIT |
There was a problem hiding this comment.
| trap '(exec 2>/dev/null ; echo done > "${WORKDIR}/disk_modified") &' EXIT | |
| (exec 2>/dev/null ; echo done > "${WORKDIR}/disk_modified") & |
I'm fine with the current approach, my idea was to simplify this and don't use traps.
This should work because we don't wait for the write somewhere in parallel but it's all sequential code and the marker can be written early.
There was a problem hiding this comment.
Removing the trap is creating a weird behavior:
++ write_to_disk
++ mkfifo -m 0600 /tmp/flatcar-install.r3mrgIyVoA/disk_modified
+ wget --tries 10 --timeout=20 --retry-connrefused --no-verbose -O - https://beta.release.flatcar-linux.net/amd64-usr/3602.1.0/flatcar_production_image.bin.bz2
+ gpg --batch --trusted-key E25D9AED0593B34A --verify /tmp/flatcar-install.r3mrgIyVoA/flatcar_production_image.bin.bz2.sig -
+ tee /dev/fd/62
++ lbzip2 -cd
+++ blockdev --getsz /dev/sda
++ exec
++ dd conv=nocreat count=1024 if=/dev/zero of=/dev/sda seek=41942016 status=none
++ dd bs=1M conv=nocreat of=/dev/sda status=none
2023-06-23 09:40:39 URL:https://beta.release.flatcar-linux.net/amd64-usr/3602.1.0/flatcar_production_image.bin.bz2 [379416813/379416813] -> "-" [1]
gpg: Signature made Mon May 29 09:40:52 2023 UTC
gpg: using RSA key 8D6DA7853CFE1B1ED346ED0DEDBDF411267EC954
gpg: issuer "buildbot@flatcar-linux.org"
gpg: key E25D9AED0593B34A marked as ultimately trusted
gpg: checking the trustdb
gpg: marginals needed: 3 completes needed: 1 trust model: pgp
gpg: depth: 0 valid: 1 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: Good signature from "Flatcar Buildbot (Official Builds) <buildbot@flatcar-linux.org>" [ultimate]
+ wait_for_disk
+ '[' -n '' ']'
+ read -rt 7200 _disk_status
+ write_cloudinit
+ [[ -n '' ]]
+ write_ignition
+ [[ -n '' ]]
+ [[ -n '' ]]
+ echo 'Success! Flatcar Container Linux beta 3602.1.0 is installed on /dev/sda'
Success! Flatcar Container Linux beta 3602.1.0 is installed on /dev/sda
+ rm -rf /tmp/flatcar-install.r3mrgIyVoA
+ trap - EXIT
It ends right after the dd (it does not go through the udevadm settle and the for loop) so it takes a few second before the changes being visible
There was a problem hiding this comment.
Ah, install_from_url doesn't wait for write_to_disk to finish. It even doesn't capture the error code because > >(write_to_disk) redirects the output to the input FD but it's not a pipe where pipefail would work. This should be fixed to properly fail on error.
There was a problem hiding this comment.
Had to touch the waiting logic anyway now to signal the failure and removed the trap.
pothos
left a comment
There was a problem hiding this comment.
There are two other RETURN traps for umount, do you want to use the subshell trick there?
Somehow, with the bash upgrade the trap RETURN signal is not handled correctly (it's ignored). We run now everything from a subshell and turn RETURN trap to an EXIT trap. Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
a7c2128 to
206f49a
Compare
|
@pothos done and tested with a merge of your PR and mine. ✔️ |
This pulls in flatcar/init#97 and flatcar/init#99 to work around a bash regression and add handling for disk write errors.
This pulls in flatcar/init#97 and flatcar/init#99 to work around a bash regression and add handling for disk write errors.
This pulls in flatcar/init#97 and flatcar/init#99 to work around a bash regression and add handling for disk write errors.
Somehow, with the bash upgrade the trap RETURN signal is not handled correctly (it's ignored).
Adding an explicit return makes it work.
The bash upgrade and the way it's done (a single huge commit) it's too complicated to make a proper bisect but I assume the issue is regarding the changes on how trap signals are handled.
Testing done