build_library: Fix depmod issues with sysext kmods #2976
chewi
left a comment
Rather than have a lot of duplicate scripts, they could just be symlinks and would work just the same.
I have a suggestion for how the hook could be much cleaner and simpler. I'd normally use bubblewrap for this. We don't have it in Flatcar, but this is basically what it would do:
KMOD_PATH=/usr/lib/modules/$(uname -r)
TMP_DIR=$(mktemp -d)
trap "rm -rv -- '${TMP_DIR}'" EXIT
mkdir "${TMP_DIR}"/{upper,work}
unshare -m bash -s -- "${@}" <<EOF
set -euo pipefail
mount -t overlay overlay -o lowerdir="${KMOD_PATH}",upperdir="${TMP_DIR}"/upper,workdir="${TMP_DIR}"/work "${KMOD_PATH}"
depmod
modprobe --ignore-install "\${@}"
EOF
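The symlink suggestion in the comment above could look roughly like the sketch below: a single helper script is installed once, and every per-module entry point is a symlink to it, with the module name taken from the name the script was invoked as. The function name `install_helper_links`, the file name `kmod-helper`, and the idea of dispatching on `$0` are assumptions for illustration, not taken from the PR.

```shell
#!/bin/bash
# Sketch only: install one helper script and one symlink per module name.
# "kmod-helper" and the dispatch on $0 are hypothetical, not from the PR.
install_helper_links() {
    local dest_dir=$1
    shift
    mkdir -p -- "${dest_dir}"
    cat > "${dest_dir}/kmod-helper" <<'EOF'
#!/bin/bash
# The module name is derived from the name this script was invoked as.
echo "module: $(basename -- "$0")"
EOF
    chmod +x "${dest_dir}/kmod-helper"
    local mod
    for mod in "$@"; do
        # Relative symlink, so the tree can be relocated as a whole.
        ln -sf kmod-helper "${dest_dir}/${mod}"
    done
}
```

Calling `install_helper_links ./usr/libexec/example zfs spl` would create two symlinks that share one script body, so a fix to the helper applies to every module at once.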
chewi
left a comment
Much better, thank you. We're still missing the symlinks, but other than that, it's looking good. Please check whether the remove hook is actually needed.
Forgot the symlinks, thanks for checking. For the remove hook, I've just checked without it, and it works (removing the ZFS module removes the SPL module even without the hook). The dependencies of already loaded kmods seem to be tracked in the kernel, so I've removed the remove hook.
@danzatt Why do we do this at runtime, instead of generating the modules.dep file for all available kmod sysexts during image build?
I guess that would only work if we didn't have multiple different versions of the same modules. Maybe they produce the same depmod results, but we cannot always assume that.
@jepio Yes, Chewi is right. That was my reasoning. We currently ship sysexts with different versions (and different useflags) of nvidia kmods. I have checked all nvidia sysexts from the latest alpha (4344) and the |
Alright, can you test this with the GPU operator? It runs modprobe from the host, and I want to make sure that still works.
@jepio Just ran the operator tests on an Azure NC instance and they are passing.
@chewi Turns out my testing was wrong. The remove hook is needed: I tested with the latest build without the remove hook, and unloading ZFS didn't unload SPL. After adding the remove hook it works as expected, so I added the hook back.
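The manual check described above (does unloading ZFS also unload SPL?) could be scripted along these lines. This is a sketch under assumptions: the function name `module_loaded` is made up for illustration, and it reads `/proc/modules`, where the first field of each line is the module name.

```shell
#!/bin/bash
# Sketch only: report whether a module appears in a /proc/modules-style list.
# The optional second argument exists so the check can be run against a
# saved copy of the list; it defaults to the live /proc/modules.
module_loaded() {
    local name=$1 list=${2:-/proc/modules}
    awk -v m="${name}" '$1 == m { found = 1 } END { exit !found }' "${list}"
}

# Example use on a test machine (requires root, hence commented out):
#   modprobe zfs
#   modprobe -r zfs
#   module_loaded spl && echo "spl is still loaded" || echo "spl was unloaded"
```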
build_library/sysext_mangle_kmod (Outdated)
EOF
done

mkdir -p ./usr/local/bin/
I just realized /usr/libexec would be a better location for these internal helpers.
There are scenarios where /usr/local needs to be writeable and that is accomplished by using a sysext that redirects /usr/local to a different path using a symlink. Using the same path would conflict with those scenarios.
@danzatt I left a comment here a while back, would you take a look?
OS-dependent sysexts that ship kernel modules usually also ship the files in
/usr/lib/modules/*-flatcar/modules.XXX
When multiple such sysexts get activated, the depmod files from just one sysext win and the other kernel modules cannot be loaded using modprobe. We get around this by removing the depmod files from every sysext with kernel modules. Instead, we set up a modprobe hook, which dynamically runs depmod in a temporary directory on every sysext kernel module activation.
Signed-off-by: Daniel Zatovic <daniel.zatovic@gmail.com>
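As a rough illustration of the hook mechanism described in the commit message, the sketch below prints modprobe.d `install` and `remove` directives for every module file found under a directory. It assumes the hooks are wired up through modprobe.d (the earlier snippet's use of `--ignore-install` hints at that); the function name `generate_modprobe_hooks`, the helper directory, and the script names `install-hook` and `remove-hook` are hypothetical and not taken from the PR.

```shell
#!/bin/bash
# Sketch only: emit modprobe.d "install"/"remove" lines for each module file.
# helper_dir and the hook script names are hypothetical placeholders.
generate_modprobe_hooks() {
    local moddir=$1 helper_dir=$2
    local ko name
    find "${moddir}" -type f -name '*.ko*' | sort | while read -r ko; do
        name=$(basename -- "${ko}")
        name=${name%%.ko*}   # strip .ko and any compression suffix (.xz, .zst)
        echo "install ${name} ${helper_dir}/install-hook ${name}"
        echo "remove ${name} ${helper_dir}/remove-hook ${name}"
    done
}
```

The output could be redirected into a file under the sysext's modprobe.d directory at build time, so that `modprobe zfs` on the host runs the helper, which in turn would run depmod in its private overlay before loading the module with `--ignore-install`.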
@jepio Sorry for missing that one. It should be fixed now.
build_library: Fix depmod issues with sysext kmods
OS-dependent sysexts that ship kernel modules usually also ship the files in
/usr/lib/modules/*-flatcar/modules.XXX
When multiple such sysexts get activated, the depmod files from just one sysext win and the other kernel modules cannot be loaded using modprobe. We get around this by removing the depmod files from every sysext with kernel modules. Instead, we set up a modprobe hook, which dynamically runs depmod in a temporary directory on every sysext kernel module activation.
Fixes: Flatcar #1576