nixos/test-driver: add support for nspawn containers #478109
Pull request overview
This PR introduces support for systemd-nspawn containers as a lightweight alternative to QEMU VMs in the NixOS test framework. The implementation refactors the Python test driver to use an abstract BaseMachine class with separate QemuMachine and NspawnMachine implementations, enabling tests to run both VMs and containers in parallel with shared networking infrastructure.
Key changes:
- Refactored test driver from monolithic `Machine` class to abstract `BaseMachine` with specialized `QemuMachine` and `NspawnMachine` subclasses
- Created new `guest-networking-options.nix` module to share VLAN configuration between QEMU VMs and nspawn containers
- Added `run-nspawn` Python package for managing container lifecycle, networking, and process execution via `nsenter`
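The class split described above can be illustrated with a minimal sketch. Only the class names `BaseMachine`, `QemuMachine`, and `NspawnMachine` and the use of `nsenter` come from the review summary. The method names `wrap_command` and `describe`, the `leader_pid` parameter, and the chosen `nsenter` flags are hypothetical and are not taken from the PR's code.

```python
from abc import ABC, abstractmethod
import shlex


class BaseMachine(ABC):
    """Backend-independent interface shared by all machines (sketch only)."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def wrap_command(self, command: str) -> list[str]:
        """Return the host-side argv that would run `command` in the guest."""

    def describe(self, command: str) -> str:
        # Shared logic lives in the base class and only relies on the hook above.
        return f"{self.name}: {shlex.join(self.wrap_command(command))}"


class QemuMachine(BaseMachine):
    def wrap_command(self, command: str) -> list[str]:
        # Hypothetical placeholder: a real VM backend would send the command
        # to a shell inside the guest rather than build a host argv.
        return ["sh", "-c", command]


class NspawnMachine(BaseMachine):
    def __init__(self, name: str, leader_pid: int) -> None:
        super().__init__(name)
        # Assumption: the PID of the container's init process is known.
        self.leader_pid = leader_pid

    def wrap_command(self, command: str) -> list[str]:
        # Assumption: enter the container's mount, network and PID
        # namespaces with util-linux nsenter before running the command.
        return [
            "nsenter", "--target", str(self.leader_pid),
            "--mount", "--net", "--pid",
            "--", "sh", "-c", command,
        ]
```

With this shape, code that drives a test can treat both backends uniformly, for example by iterating over `[QemuMachine("vm1"), NspawnMachine("c1", 1234)]` and calling `describe("hostname")` on each, while `BaseMachine` itself cannot be instantiated.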
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| nixos/tests/test-containers.nix | New test demonstrating basic container startup and VLAN isolation |
| nixos/tests/test-containers-bittorrent.nix | Complex networking test with NAT/UPnP using multiple containers |
| nixos/tests/all-tests.nix | Registers new container tests |
| nixos/modules/virtualisation/qemu-vm.nix | Extracts networking options to shared module |
| nixos/modules/virtualisation/guest-networking-options.nix | New shared networking configuration for VMs and containers |
| nixos/modules/virtualisation/nspawn-container/default.nix | Container profile module with systemd-nspawn configuration |
| nixos/modules/virtualisation/nspawn-container/run-nspawn/ | Python package for container lifecycle management |
| nixos/modules/testing/test-instrumentation.nix | Disables backdoor for containers (not compatible) |
| nixos/lib/testing/nodes.nix | Adds container support alongside nodes with separate defaults |
| nixos/lib/testing/network.nix | Refactors networking to support both VMs and containers |
| nixos/lib/testing/driver.nix | Updates driver build to pass container scripts separately |
| nixos/lib/testing/run.nix | Adds uid-range requirement for container tests |
| nixos/lib/testing/testScript.nix | Minor variable rename for clarity |
| nixos/lib/testing/nixos-test-base.nix | Removes qemu-vm import (now conditional) |
| nixos/lib/test-driver/src/test_driver/machine/__init__.py | Major refactoring: BaseMachine, QemuMachine, NspawnMachine classes |
| nixos/lib/test-driver/src/test_driver/driver.py | Updates driver to handle both VM and container machines |
| nixos/lib/test-driver/src/test_driver/__init__.py | Adds CLI arguments for container support |
| nixos/lib/test-driver/default.nix | Adds systemd and util-linux dependencies |
| nixos/lib/test-script-prepend.py | Updates type hints for new machine classes |
| nixos/lib/testing-python.nix | Adds containers parameter |
@ofborg test nat.firewall networking.scripted.link installer.simpleProvided installer.separateBootFat networking.scripted.virtual printing keymap.azerty keymap.dvorak-programmer boot-stage1 installer.swraid keymap.neo nfs4.simple i3wm udisks2 networking.networkd.loopback containers-ip ecryptfs login installer.simpleLabels php.httpd zfs.installer predictable-interface-names.unpredictable predictable-interface-names.unpredictableNetworkd mutableUsers
Ma27 left a comment
Mostly low-hanging fruit. Hopefully I'll have spoons to do the rest tomorrow or next week :)
# (n-daemon)[417]: transmission.service: Failed to create destination mount point node '/run/transmission/run/host/.os-release-stage/', ignoring: Read-only file system
# (n-daemon)[417]: transmission.service: Failed to mount /run/systemd/propagate/.os-release-stage to /run/transmission/run/host/.os-release-stage/: No such file or directory
# (n-daemon)[417]: transmission.service: Failed to set up mount namespacing: /run/host/.os-release-stage/: No such file or directory
# (n-daemon)[417]: transmission.service: Failed at step NAMESPACE spawning /nix/store/zfksw9bllp95pl45d1nxmpd2lks42bkj-transmission-4.0.6/bin/transmission-daemon: No such file or directory
Have you tracked down which settings lead to that problem?
At this point I'm unsure if I consider this a potential bug in the driver, something that needs to be fixed or something that we can just accept (if so, I think it's worth leaving a more detailed rationale here).
This setting specifically:
Disabling it manually lets transmission start successfully.
Maybe one of our systemd folks has an opinion on that? I'm not sure whether we're missing something here or whether just turning the option off for that test case is OK. cc @ElvishJerricco @nikstur
Do I understand you right in that you would prefer a test that uses the upstream services.transmission module (with the option turned off if necessary) to a test that uses aria2?
It's not about the service itself, I'm just wondering if we're holding something wrong here. And finally, I wouldn't want to lift arbitrary hardening once we get to the point that this feature of the test framework is being used inside nixpkgs.
Maybe cc'ing @NixOS/systemd will reach more people who could weigh in.
Eveeifyeve left a comment
I approve of this PR being merged, with the follow-up PR for the points raised by @Mic92 on the horizon.
Co-authored-by: cinereal <cinereal@riseup.net>
CI reports job canceled?
Bisection is telling me that 23f1e63 broke manual build, e.g.
Ah, it is not a channel blocker. I forgot that we don't block on Firefox tests (anymore) but only on the builds.
Bisect says 23f1e63 broke eval of at least
Probably more test regressions can be found inside
@vcunat I will take care of these regressions tomorrow.
Another unresolved thing is that hydra.nixos.org currently doesn't have any machines with
@vcunat, I've sent in NixOS/infra#986 to update our builders with the various knobs required to run these tests.
Already testing on elated-minsky right now.
Can this be backported to 25.11 or is it too complicated for that?
Considering that this is a major change, I'd rather not. I'd argue that the point of release branches is to not have large changes with several iterations backported, but stabilized on unstable for the next release.
Motivation
Current NixOS integration tests rely heavily on QEMU, which can be slow and resource-intensive. This PR introduces `systemd-nspawn` as a lightweight container backend, significantly reducing test latency and enabling hardware passthrough scenarios that are difficult to achieve in VMs.
Key advantages of container tests
- ~25% improvement in test execution speed on an Intel(R) Core(TM) Ultra 9 285HX (benchmark of 24 machines running GNU hello)
Try it out (2 simple steps!)
systemd-nspawn:
- `nix-build -A nixosTests.nixos-test-driver.containers`: Basic startup and inter-VLAN isolation (VMs and containers in parallel).
- `nix-build -A nixosTests.test-containers-bittorrent`: Complex networking (NAT/UPnP) involving multiple containers.
Implementation details
- `QemuMachine` class (inheriting from an abstract `BaseMachine` class).
- `NspawnMachine` class replicates as much functionality of `QemuMachine` as possible using `systemd-nspawn` containers.
- `virtualisation.vlans`) for containers.
Debugging
To debug a failing container test, introduce `enableDebugHook` and `sshBackdoor` like so:
The test will then print an ssh command on startup:
Upon failure, the test will print a command to attach to it, e.g.
Run this to get a shell inside the sandbox, where you can run the `ssh` command above to enter the container.
Note: Due to the nature of `systemd-nspawn`, interactive execution of the tests requires root privileges:
`sudo $(nix -L build --print-out-paths .#nixosTests.test-containers.driverInteractive)/bin/nixos-test-driver --interactive`
Limitations of containers
- `setuid` wrappers (like `sudo`, though you can use `runuser` instead).
- `systemd` hardening options (`ProtectSystem=` etc.) used by NixOS modules such as `services.transmission` among many others.
- `/dev`, making it necessary to pass in needed paths, e.g. `--option sandbox-paths /dev/net` for VPN tests that create `/dev/net/tun`.
Credits and history
Based on the testing infrastructure from @clan-lol.
The heavy lifting for integrating this into nixpkgs was done by @jfly.