In some scenarios, users would like to run their containers in k8s on specific kernels (with different versions, configs, modules, parameters, etc.). Here is a proposed mechanism to achieve this:
1, Imagine a special kind of Docker image, a Kernel Docker Image. As the name implies, the image contains a kernel (/boot/vmlinuz) and the corresponding modules (/lib/modules/$(uname -r)/). A 'pause' binary is also added to it, so it doubles as a pause image.
2, Designate this special image as the pod_sandbox container image. Currently the sandbox image is a node-level setting in k8s/containerd, so patches are needed to make it a per-pod configuration.
3, Traditionally, when Kata creates the pod_sandbox container, it spawns a hypervisor process with a preset kernel from the host (/usr/share/kata-containers/vmlinuz.container). Here, instead, Kata uses the kernel inside the Kernel Docker Image to launch the VM.
4, Basically, that's all. There are still some details that need to be handled, e.g. module probing, but overall they're not critical.
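
To make steps 2 and 3 a bit more concrete, here is a rough Python sketch of the selection logic only. It is not how Kata or containerd work today. The annotation key and all function names are made up for illustration, and it assumes the sandbox image has already been unpacked to a rootfs directory on the host.

```python
from pathlib import Path
from typing import Optional

# Host-side default kernel path, as quoted in step 3.
HOST_DEFAULT_KERNEL = Path("/usr/share/kata-containers/vmlinuz.container")

# Hypothetical annotation key, invented for this sketch.
# It is not an existing k8s, containerd, or Kata key.
SANDBOX_IMAGE_ANNOTATION = "example.com/sandbox-image"


def pick_sandbox_image(pod_annotations: dict, node_default: str) -> str:
    """Step 2: use a per-pod sandbox image if set, else the node-level one."""
    return pod_annotations.get(SANDBOX_IMAGE_ANNOTATION) or node_default


def pick_kernel(sandbox_rootfs: Path,
                host_default: Path = HOST_DEFAULT_KERNEL) -> Path:
    """Step 3: prefer <rootfs>/boot/vmlinuz, fall back to the host kernel."""
    candidate = sandbox_rootfs / "boot" / "vmlinuz"
    return candidate if candidate.is_file() else host_default


def find_modules_dir(sandbox_rootfs: Path) -> Optional[Path]:
    """Return <rootfs>/lib/modules/<release> if exactly one release exists."""
    base = sandbox_rootfs / "lib" / "modules"
    if not base.is_dir():
        return None
    releases = sorted(p for p in base.iterdir() if p.is_dir())
    # With zero or several releases we cannot tell which one matches the kernel.
    return releases[0] if len(releases) == 1 else None
```

The fallback to the host kernel means an ordinary pause image (without /boot/vmlinuz) would keep working unchanged, which seems like the least surprising behavior.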
In general, doing this brings pluggable kernels to k8s pods/containers.
What do you guys think? Thanks :-)