Description
This was discovered when writing a NixOS system test for the bees module.
The test in https://github.com/NixOS/nixpkgs/blob/923a3e4970226293e4698e44e3e5d5ccf7487603/nixos/tests/bees.nix passes consistently. It first creates files on a new filesystem, then starts the bees service. That passing test is roughly equivalent to the following shell script:
# Succeeds only if none of the given paths reports 0 bytes of shared space.
any_shared_space() {
  [[ $(btrfs fi du -s --raw "$@" | awk 'NR>1 { print $3 }' | grep -E '^0$' | wc -l) -eq 0 ]]
}
die() { echo "$*" >&2; exit 1; }
mkfs.btrfs -f -L aux /dev/vdb || die
mount /dev/vdb /home || die
dd if=/dev/urandom of=/home/dedup-me-1 bs=1M count=8 || die
cp --reflink=never /home/dedup-me-1 /home/dedup-me-2 || die
any_shared_space /home/dedup-me-* && die "ERROR: Detecting shared space before any deduplication has been done"
sync
systemctl start beesd@aux.service
sleep 10
any_shared_space /home/dedup-me-* || die "ERROR: No shared space detected even after bees is running"
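To make the check in any_shared_space easier to follow, here is a minimal sketch of the same column test run against canned input instead of a live filesystem. The helper name count_unshared and the sample byte counts are made up for illustration. The column layout (Total, Exclusive, Set shared, Filename) is assumed to match what btrfs fi du -s --raw prints.

```shell
# Hypothetical helper: read "btrfs fi du -s --raw"-style output on stdin and
# print how many rows have a "Set shared" value (field 3) of exactly 0.
count_unshared() {
  awk 'NR > 1 && $3 == "0" { n++ } END { print n + 0 }'
}

# Illustrative input only: two 8 MiB files with nothing shared yet.
before_dedup() {
  printf '%s\n' \
    '     Total   Exclusive  Set shared  Filename' \
    '   8388608     8388608           0  /home/dedup-me-1' \
    '   8388608     8388608           0  /home/dedup-me-2'
}

# Illustrative input only: the same files once all extents are shared.
after_dedup() {
  printf '%s\n' \
    '     Total   Exclusive  Set shared  Filename' \
    '   8388608           0     8388608  /home/dedup-me-1' \
    '   8388608           0     8388608  /home/dedup-me-2'
}

before_dedup | count_unshared
after_dedup | count_unshared
```

any_shared_space succeeds exactly when this count is 0, so despite its name it requires every listed file to report some shared space, not just one of them.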
By contrast, a test akin to the following consistently fails. It starts the service after the filesystem is created and mounted, but before any content has been written. bees then runs in a loop, trying to poll the status of a file descriptor that refers to a file that doesn't exist:
# Succeeds only if none of the given paths reports 0 bytes of shared space.
any_shared_space() {
  [[ $(btrfs fi du -s --raw "$@" | awk 'NR>1 { print $3 }' | grep -E '^0$' | wc -l) -eq 0 ]]
}
die() { echo "$*" >&2; exit 1; }
mkfs.btrfs -f -L aux /dev/vdb || die
mount /dev/vdb /home || die
systemctl start beesd@aux.service
sleep 1
dd if=/dev/urandom of=/home/dedup-me-1 bs=1M count=8 || die
cp --reflink=never /home/dedup-me-1 /home/dedup-me-2 || die
sync
sleep 10
any_shared_space /home/dedup-me-* || die "ERROR: No shared space detected even after bees is running"
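Both scripts rely on a fixed sleep 10 before checking for shared space. One way to rule out timing as the cause of the failure would be to poll with a much longer timeout. The following is only a sketch: wait_until is a hypothetical helper, not part of bees or the NixOS test driver, and the 60-second limit in the usage line is an arbitrary choice.

```shell
# Hypothetical helper: run a command once per second until it succeeds.
# Returns 0 on the first success, or 1 once roughly $1 seconds have passed.
wait_until() {
  _wu_timeout=$1
  shift
  _wu_elapsed=0
  until "$@"; do
    [ "$_wu_elapsed" -ge "$_wu_timeout" ] && return 1
    sleep 1
    _wu_elapsed=$((_wu_elapsed + 1))
  done
}
```

With that in place, the final check could be written as wait_until 60 any_shared_space /home/dedup-me-* || die "ERROR: No shared space detected even after bees is running". If the second script still fails with a generous timeout, the problem is not simply that bees needs more time.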
The actual failing test can be found at https://github.com/charles-dyfis-net/nixpkgs/blob/bees-test-failing/nixos/tests/bees.nix. With the relevant branch of nixpkgs checked out, it can be run from the root of that working tree with nix-build -I nixpkgs="$PWD" ./nixos/tests/bees.nix.
An strace of the loop that bees is stuck in while in that failed state can be seen at https://gist.github.com/charles-dyfis-net/34ac2e4d2bada0c3a3c8632cab98c8d9