nixos/test-driver: add support for nspawn containers #478109

Merged
Ma27 merged 37 commits into NixOS:staging-nixos from applicative-systems:nixos-test-containers on Mar 18, 2026
kmein commented Jan 8, 2026

Member

Motivation

Current NixOS integration tests rely heavily on QEMU, which can be slow and resource-intensive. This PR introduces systemd-nspawn as a lightweight container backend, significantly reducing test latency and enabling hardware passthrough scenarios that are difficult to achieve in VMs.

Key advantages of container tests

  • Faster boot times and lower overhead
    ~25% improvement in test execution speed on an Intel(R) Core(TM) Ultra 9 285HX (benchmark of 24 machines running GNU hello):

    nix-build test-backends-benchmark.nix -A hello-nspawn 27.76s user 1.50s system 155% cpu 18.803 total
    nix-build test-backends-benchmark.nix -A hello-qemu 36.80s user 1.92s system 140% cpu 27.548 total
  • Container tests can be run in cheap VMs instead of bare-metal machines.
  • Containers allow direct bind-mounting of host device nodes. This enables integration testing for CUDA code within the NixOS test framework.
  • The implementation (see below) lays the groundwork for other machine backends (other container infrastructures, bare-metal etc.)
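
The two timing lines quoted in the benchmark can be compared directly. As a rough check that is not part of the PR, the sketch below parses them in the format shown and computes the relative reduction in user CPU time and in wall-clock time. The helper names `parse_time_line` and `reduction` are hypothetical.

```python
import re

# Timing lines copied from the benchmark in the PR description.
NSPAWN = "nix-build test-backends-benchmark.nix -A hello-nspawn 27.76s user 1.50s system 155% cpu 18.803 total"
QEMU = "nix-build test-backends-benchmark.nix -A hello-qemu 36.80s user 1.92s system 140% cpu 27.548 total"

TIME_RE = re.compile(r"([\d.]+)s user ([\d.]+)s system \d+% cpu ([\d.]+) total")


def parse_time_line(line: str) -> dict[str, float]:
    """Extract user, system and wall-clock seconds from one timing line."""
    match = TIME_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised timing line: {line!r}")
    user, system, total = (float(group) for group in match.groups())
    return {"user": user, "system": system, "total": total}


def reduction(before: float, after: float) -> float:
    """Relative reduction of `after` compared to `before`, in percent."""
    return (before - after) / before * 100


nspawn = parse_time_line(NSPAWN)
qemu = parse_time_line(QEMU)

user_reduction = reduction(qemu["user"], nspawn["user"])
wall_reduction = reduction(qemu["total"], nspawn["total"])
print(f"user CPU time: {user_reduction:.1f}% less")  # 24.6% less
print(f"wall clock:    {wall_reduction:.1f}% less")  # 31.7% less
```

On these numbers the quoted ~25% corresponds to user CPU time, while the wall-clock time drops by roughly 32%.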

Try it out (2 simple steps!)

  1. Configure the Nix daemon to allow running systemd-nspawn:

     nix.settings.auto-allocate-uids = true;
     nix.settings.experimental-features = [ "auto-allocate-uids" "cgroups" ];
     nix.settings.extra-system-features = [ "uid-range" ];
     nix.settings.sandbox-paths = [ "/dev/net" ]; # to make nspawn↔qemu networking work

  2. Run a container test, either:
     • nix-build -A nixosTests.nixos-test-driver.containers: Basic startup and inter-VLAN isolation (VMs and containers in parallel).
     • nix-build -A nixosTests.test-containers-bittorrent: Complex networking (NAT/UPnP) involving multiple containers.
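
Before running a container test it can help to confirm that the daemon settings from step 1 are actually in effect. The following is a minimal sketch, not something the PR ships: it assumes the settings end up as plain `key = value` lines (as in nix.conf), possibly under an `extra-` prefixed key, and it does not handle `include` directives or `path=target` style sandbox entries. All function names are hypothetical.

```python
REQUIRED_EXPERIMENTAL = {"auto-allocate-uids", "cgroups"}


def parse_nix_conf(text: str) -> dict[str, list[str]]:
    """Parse `key = value ...` lines into a mapping of key -> list of words."""
    settings: dict[str, list[str]] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.split()
    return settings


def values(settings: dict[str, list[str]], key: str) -> list[str]:
    """Combine a list setting with its `extra-` prefixed variant."""
    return settings.get(key, []) + settings.get(f"extra-{key}", [])


def missing_container_settings(settings: dict[str, list[str]]) -> list[str]:
    """Return a human-readable list of settings the container tests still need."""
    problems = []
    if settings.get("auto-allocate-uids") != ["true"]:
        problems.append("auto-allocate-uids is not true")
    missing = REQUIRED_EXPERIMENTAL - set(values(settings, "experimental-features"))
    if missing:
        problems.append("experimental-features lacks " + ", ".join(sorted(missing)))
    if "uid-range" not in values(settings, "system-features"):
        problems.append("system-features lacks uid-range")
    if "/dev/net" not in values(settings, "sandbox-paths"):
        problems.append("sandbox-paths lacks /dev/net")
    return problems


# Sample configuration text, made up for illustration.
sample = """
auto-allocate-uids = true
experimental-features = nix-command flakes auto-allocate-uids cgroups
extra-system-features = uid-range
extra-sandbox-paths = /dev/net  # needed for nspawn/qemu networking
"""

print(missing_container_settings(parse_nix_conf(sample)))  # []
```

Feeding it the text of the effective daemon configuration instead of `sample` gives a quick list of what is still missing.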

Implementation details

  • Refactor the Python test-driver to move QEMU-specific logic into a QemuMachine class (inheriting from an abstract BaseMachine class).
  • Introduce an NspawnMachine class that replicates as much functionality of QemuMachine as possible using systemd-nspawn containers.
  • Use the existing QEMU networking options (virtualisation.vlans) for containers.
  • Bridge QEMU's VLANs to the containers, enabling VM↔container networking.
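
The class split described above can be pictured roughly as follows. This is only an illustrative sketch of the abstract-base pattern: the class names come from the PR description, but every method, attribute and command line here is an assumption and does not reflect the driver's actual code or how it invokes QEMU or systemd-nspawn.

```python
from abc import ABC, abstractmethod


class BaseMachine(ABC):
    """Backend-independent part of a test machine (sketch only)."""

    def __init__(self, name: str, vlans: list[int]) -> None:
        self.name = name
        self.vlans = vlans

    @abstractmethod
    def start_command(self) -> list[str]:
        """Return the argv that would launch this machine."""

    def describe(self) -> str:
        # Shared logic lives in the base class and works for every backend.
        return f"{self.name} ({type(self).__name__}) on VLANs {self.vlans}"


class QemuMachine(BaseMachine):
    def start_command(self) -> list[str]:
        # Illustrative argv, not the driver's actual invocation.
        return ["qemu-system-x86_64", "-name", self.name]


class NspawnMachine(BaseMachine):
    def start_command(self) -> list[str]:
        # Illustrative argv, not the driver's actual invocation.
        return ["systemd-nspawn", "--boot", f"--machine={self.name}"]


# VMs and containers can be handled uniformly through the base interface.
machines: list[BaseMachine] = [QemuMachine("server", [1]), NspawnMachine("client", [1])]
for machine in machines:
    print(machine.describe(), "->", " ".join(machine.start_command()))
```

Keeping the shared behaviour in the base class is what would let further backends be added as additional subclasses.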

Debugging

To debug a failing container test, introduce enableDebugHook and sshBackdoor like so:

# nixos/tests/test-containers-backdoor.nix
{
  name = "containers-backdoor";

  containers.machine = { };

  sshBackdoor.enable = true;
  enableDebugHook = true;

  testScript = ''
    start_all()
    machine.succeed("false") # this will fail
  '';
}

The test will then print an ssh command on startup:

machine:  ssh -o User=root -o ProxyCommand="socat - UNIX-CLIENT:/run/systemd/nspawn/unix-export/machine/ssh" bash

Upon failure, the test will print a command to attach to it, e.g.

!!! Breakpoint reached, run 'sudo /nix/store/hb1v3cz5bd6qk8arhxy6bii0wcilg8wh-attach/bin/attach 6793584

Run this to get a shell inside the sandbox, where you can run the ssh command above to enter the container.

Note: Due to the nature of systemd-nspawn, interactive execution of the tests requires root privileges: sudo $(nix -L build --print-out-paths .#nixosTests.test-containers.driverInteractive)/bin/nixos-test-driver --interactive

Limitations of containers

  • You cannot test kernel-specific changes (e.g. kernel modules) without also having them active on the host.
  • Containers running in the Nix sandbox cannot run setuid wrappers such as sudo (you can use runuser instead).
  • Container tests do not support graphical applications, and therefore cannot take screenshots of them.
  • Containers running in the Nix sandbox don't allow many of the systemd hardening options (ProtectSystem= etc.) used by NixOS modules such as services.transmission, among many others.
  • Containers running in the sandbox have limited access to /dev, so any needed paths must be passed in explicitly, e.g. --option sandbox-paths /dev/net for VPN tests that create /dev/net/tun.

Credits and history

Based on the testing infrastructure from @clan-lol.
The heavy lifting for integrating this into nixpkgs was done by @jfly.

Things done

  • Built on platform:
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • Tested, as applicable:
  • Ran nixpkgs-review on this PR. See nixpkgs-review usage.
  • Tested basic functionality of all binary files, usually in ./result/bin/.
  • Nixpkgs Release Notes
    • Package update: when the change is major or breaking.
  • NixOS Release Notes
    • Module addition: when adding a new NixOS module.
    • Module update: when the change is significant.
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other READMEs.

Add a 👍 reaction to pull requests you find important.

Copilot AI review requested due to automatic review settings January 8, 2026 15:41
kmein force-pushed the nixos-test-containers branch from 1a58317 to 81245e7 on January 8, 2026 15:45
kmein changed the base branch from master to staging on January 8, 2026 15:49
nixpkgs-ci Bot closed this on Jan 8, 2026
nixpkgs-ci Bot reopened this on Jan 8, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces support for systemd-nspawn containers as a lightweight alternative to QEMU VMs in the NixOS test framework. The implementation refactors the Python test driver to use an abstract BaseMachine class with separate QemuMachine and NspawnMachine implementations, enabling tests to run both VMs and containers in parallel with shared networking infrastructure.

Key changes:

  • Refactored test driver from monolithic Machine class to abstract BaseMachine with specialized QemuMachine and NspawnMachine subclasses
  • Created new guest-networking-options.nix module to share VLAN configuration between QEMU VMs and nspawn containers
  • Added run-nspawn Python package for managing container lifecycle, networking, and process execution via nsenter

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 12 comments.

File | Description
nixos/tests/test-containers.nix | New test demonstrating basic container startup and VLAN isolation
nixos/tests/test-containers-bittorrent.nix | Complex networking test with NAT/UPnP using multiple containers
nixos/tests/all-tests.nix | Registers new container tests
nixos/modules/virtualisation/qemu-vm.nix | Extracts networking options to shared module
nixos/modules/virtualisation/guest-networking-options.nix | New shared networking configuration for VMs and containers
nixos/modules/virtualisation/nspawn-container/default.nix | Container profile module with systemd-nspawn configuration
nixos/modules/virtualisation/nspawn-container/run-nspawn/ | Python package for container lifecycle management
nixos/modules/testing/test-instrumentation.nix | Disables backdoor for containers (not compatible)
nixos/lib/testing/nodes.nix | Adds container support alongside nodes with separate defaults
nixos/lib/testing/network.nix | Refactors networking to support both VMs and containers
nixos/lib/testing/driver.nix | Updates driver build to pass container scripts separately
nixos/lib/testing/run.nix | Adds uid-range requirement for container tests
nixos/lib/testing/testScript.nix | Minor variable rename for clarity
nixos/lib/testing/nixos-test-base.nix | Removes qemu-vm import (now conditional)
nixos/lib/test-driver/src/test_driver/machine/__init__.py | Major refactoring: BaseMachine, QemuMachine, NspawnMachine classes
nixos/lib/test-driver/src/test_driver/driver.py | Updates driver to handle both VM and container machines
nixos/lib/test-driver/src/test_driver/__init__.py | Adds CLI arguments for container support
nixos/lib/test-driver/default.nix | Adds systemd and util-linux dependencies
nixos/lib/test-script-prepend.py | Updates type hints for new machine classes
nixos/lib/testing-python.nix | Adds containers parameter

Comment thread nixos/lib/test-driver/src/test_driver/driver.py
Comment thread nixos/lib/test-driver/src/test_driver/machine/__init__.py
Comment thread nixos/lib/test-driver/src/test_driver/machine/__init__.py Outdated
Comment thread nixos/lib/test-driver/src/test_driver/machine/__init__.py Outdated
Comment thread nixos/lib/test-driver/src/test_driver/machine/__init__.py Outdated
Comment thread nixos/tests/test-containers-bittorrent.nix
Comment thread nixos/lib/test-driver/src/test_driver/machine/__init__.py
Comment thread nixos/lib/test-script-prepend.py
Comment thread nixos/lib/test-driver/src/test_driver/machine/__init__.py
@nixpkgs-ci nixpkgs-ci Bot requested review from RaitoBezarius and tfc January 8, 2026 15:54
@nixpkgs-ci nixpkgs-ci Bot added 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-nixos-tests This PR causes rebuilds for all NixOS tests and should normally target the staging branches. 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 6.topic: testing Tooling for automated testing of packages and modules labels Jan 8, 2026
@tfc tfc requested a review from Ma27 January 8, 2026 16:53
tfc commented Jan 12, 2026

Contributor

@ofborg test nat.firewall networking.scripted.link installer.simpleProvided installer.separateBootFat networking.scripted.virtual printing keymap.azerty keymap.dvorak-programmer boot-stage1 installer.swraid keymap.neo nfs4.simple i3wm udisks2 networking.networkd.loopback containers-ip ecryptfs login installer.simpleLabels php.httpd zfs.installer predictable-interface-names.unpredictable predictable-interface-names.unpredictableNetworkd mutableUsers

kmein force-pushed the nixos-test-containers branch 4 times, most recently from 227fa66 to ccf61f9, on January 13, 2026 15:17
kmein force-pushed the nixos-test-containers branch from ccf61f9 to 0722a1f on January 14, 2026 08:35
@nixpkgs-ci nixpkgs-ci Bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jan 17, 2026

@Ma27 Ma27 left a comment

Member

Mostly low-hanging fruits. Hopefully I'll have spoons to do the rest tomorrow or next week :)

Comment thread nixos/tests/all-tests.nix Outdated
Comment thread nixos/lib/test-script-prepend.py Outdated
Comment thread nixos/tests/test-containers-bittorrent.nix
# (n-daemon)[417]: transmission.service: Failed to create destination mount point node '/run/transmission/run/host/.os-release-stage/', ignoring: Read-only file system
# (n-daemon)[417]: transmission.service: Failed to mount /run/systemd/propagate/.os-release-stage to /run/transmission/run/host/.os-release-stage/: No such file or directory
# (n-daemon)[417]: transmission.service: Failed to set up mount namespacing: /run/host/.os-release-stage/: No such file or directory
# (n-daemon)[417]: transmission.service: Failed at step NAMESPACE spawning /nix/store/zfksw9bllp95pl45d1nxmpd2lks42bkj-transmission-4.0.6/bin/transmission-daemon: No such file or directory

Member

Have you traced down what settings lead to that problem?
At this point I'm unsure if I consider this a potential bug in the driver, something that needs to be fixed or something that we can just accept (if so, I think it's worth leaving a more detailed rationale here).

Member Author

This setting specifically:

Disabling it manually lets transmission start successfully.

Member

Maybe one of our systemd folks has an opinion on that? I'm not sure if we're missing something here or if just turning the option for that test-case off is OK cc @ElvishJerricco @nikstur

Member Author

Do I understand you right in that you would prefer a test that uses the upstream services.transmission module (with the option turned off if necessary) to a test that uses aria2?

Member

It's not about the service itself, I'm just wondering if we're holding something wrong here. And finally, I wouldn't want to lift arbitrary hardening once we get to the point that this feature of the test framework is being used inside nixpkgs.

maybe cc @NixOS/systemd reaches more people who could weigh in.

Comment thread nixos/tests/test-containers-bittorrent.nix
Comment thread nixos/lib/test-driver/src/test_driver/vlan.py Outdated
kmein force-pushed the nixos-test-containers branch 2 times, most recently from 9f082bb to abd52f4, on January 19, 2026 08:45
kmein force-pushed the nixos-test-containers branch 2 times, most recently from 80b4cb2 to e02998f, on January 19, 2026 08:52

@Eveeifyeve Eveeifyeve left a comment

Member

I approve this pr getting merged, with the follow-up pr with the stuff mentioned from @Mic92 on the horizon.

@nixpkgs-ci nixpkgs-ci Bot added 12.approvals: 2 This PR was reviewed and approved by two persons. and removed 12.approvals: 1 This PR was reviewed and approved by one person. labels Mar 16, 2026
@nixpkgs-ci nixpkgs-ci Bot removed the 12.approvals: 2 This PR was reviewed and approved by two persons. label Mar 17, 2026
Co-authored-by: cinereal <cinereal@riseup.net>
@KiaraGrouwstra

Contributor

CI reports job canceled?

vcunat commented Mar 19, 2026

Member

Bisection is telling me that 23f1e63 broke manual build, e.g. nix build -f nixos/release.nix manual.x86_64-linux
https://github.com/NixOS/nixpkgs/actions/runs/23282868559/job/67699910130?pr=497493

kmein commented Mar 19, 2026

Member Author

@vcunat Terribly sorry for that! Neither me nor the CI caught it. Didn't build the manuals because my commits didn't touch the manuals. There is a corresponding docs PR in the works at #479968 which should fix the failures. I'll rebase it ASAP.

vcunat commented Mar 19, 2026

Member

Another channel blocker. 799cafc broke nix build -f . nixosTests.allDrivers.firefox
Hydra log: https://hydra.nixos.org/build/324392409/nixlog/1

vcunat commented Mar 20, 2026

Member

Ah, it is not a channel blocker. I forgot that we don't block on Firefox tests (anymore) but only on the builds.

trofi commented Mar 20, 2026

Contributor

Bisect says 23f1e63 broke eval of at least snipe-it.tests as:

$ nix-instantiate -A snipe-it.tests
error:
       … while evaluating the attribute 'drvPath'
         at lib/customisation.nix:445:7:
          444|     // {
          445|       drvPath =
             |       ^
          446|         assert condition;

       … while calling the 'derivationStrict' builtin
         at «nix-internal»/derivation-internal.nix:37:12:
           36|
           37|   strict = derivationStrict drvAttrs;
             |            ^
           38|

       … while evaluating the option `testScriptString':

       … while evaluating definitions from `nixos/lib/testing/testScript.nix':

       … while evaluating definitions from `nixos/tests/web-apps/snipe-it.nix':

       (stack trace truncated; use '--show-trace' to show the full, detailed trace)

       error: function 'testScript' called with unexpected argument 'containers'
       at nixos/tests/web-apps/snipe-it.nix:47:5:
           46|   testScript =
           47|     { nodes }:
             |     ^
           48|     let
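
The error above arises because this test's testScript is a function with a closed argument set (`{ nodes }:`), and, per the error message, it is now also called with a `containers` argument. In Nix, an argument set only accepts extra attributes when it ends in `...`. As a loose analogy in Python (this is not Nix semantics, and not necessarily how the follow-up fix was done), a function with a fixed keyword signature rejects an unexpected keyword, while one that also takes `**_ignored` tolerates it. Both function names are hypothetical.

```python
def strict_test_script(*, nodes):
    """Analogue of `{ nodes }: ...`: only `nodes` is accepted."""
    return f"script for {sorted(nodes)}"


def open_test_script(*, nodes, **_ignored):
    """Analogue of `{ nodes, ... }: ...`: extra arguments are ignored."""
    return f"script for {sorted(nodes)}"


# The caller now passes an additional `containers` argument.
args = {"nodes": {"machine": {}}, "containers": {}}

try:
    strict_result = strict_test_script(**args)
except TypeError as err:
    # Raised because `containers` is not part of the strict signature.
    strict_result = f"error: {err}"

print(strict_result)
print(open_test_script(**args))
```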

Mic92 commented Mar 20, 2026

Member

@trofi @vcunat #501599

vcunat commented Mar 23, 2026

Member

Probably more test regressions can be found inside
https://hydra.nixos.org/eval/1823783?filter=nixos.tests.allDrivers&compare=1823690#tabs-now-fail
e.g. https://hydra.nixos.org/build/324720946/nixlog/14/tail
but it's always a question who is supposed to be taking care of this kind of regressions.

kmein commented Mar 23, 2026

Member Author

@vcunat I will take care of these regressions tomorrow.

vcunat commented Mar 24, 2026

Member

Another unresolved thing is that hydra.nixos.org currently doesn't have any machines with uid-range, so this generates jobs that get stuck in the queue forever. (nixosTests.nixos-test-driver.containers and nixosTests.test-containers-bittorrent, currently)

jfly commented Mar 24, 2026

Contributor

@vcunat, I've sent in NixOS/infra#986 to update our builders with the various knobs required to run these tests.

@mweinelt

Member

Already testing on elated-minsky right now.

@Ericson2314

Member

Can this be backported to 25.11 or is it too complicated for that?

Ma27 commented Mar 30, 2026

Member

Considering that this is a major change, I'd rather not. I'd argue that the point of release branches is to not have large changes with several iterations backported, but stabilized on unstable for the next release.

Labels

6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: testing Tooling for automated testing of packages and modules 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-nixos-tests This PR causes rebuilds for all NixOS tests and should normally target the staging branches. 12.approvals: 3+ This PR was reviewed and approved by three or more persons.
