Parallel eval: data race on lookupPathResolved crashes nix eval #412

@JonathanTroyer

Describe the bug

EvalState::lookupPathResolved is a boost::unordered_flat_map (eval.hh:503-504), which is not thread-safe, and resolveLookupPathPath (eval.cc:3187-3196) accesses it without synchronization. With parallel eval (eval-cores > 1), concurrent threads race on find/emplace, which corrupts the map and crashes during rehash.

Adjacent caches (srcToStore, fileEvalCache, importResolutionCache) were converted to boost::concurrent_flat_map but lookupPathResolved was missed.

Steps To Reproduce

Save as repro.sh and run with bash repro.sh:

#!/usr/bin/env bash
set -euo pipefail
EXPR=$(mktemp --suffix=.nix)
cat > "$EXPR" <<'NIX'
let
  nixpkgs = builtins.getFlake "nixpkgs";
  mkSystem = name: nixpkgs.lib.nixosSystem {
    system = "x86_64-linux";
    modules = [{
      networking.hostName = name;
      fileSystems."/" = { device = "/dev/sda1"; fsType = "ext4"; };
      boot.loader.grub.device = "/dev/sda";
      system.stateVersion = "25.11";
    }];
  };
in {
  a = (mkSystem "a").config.system.build.toplevel.outPath;
  b = (mkSystem "b").config.system.build.toplevel.outPath;
  c = (mkSystem "c").config.system.build.toplevel.outPath;
}
NIX
fails=0
for i in $(seq 1 "${1:-20}"); do
  if ! nix eval --json --impure --file "$EXPR" > /dev/null 2>/dev/null; then
    fails=$((fails + 1))
  fi
done
rm -f "$EXPR"
echo "$fails/${1:-20} crashed"

On an 8-core x86_64-linux machine, roughly 10-30% of runs crash. With --option eval-cores 1 added, the crash rate is 0%.

Crashes manifest as either:

  • Assertion 'num_destroyed==size()||num_destroyed==0' failed in ... unchecked_rehash (boost detects inconsistent state during map growth)
  • corrupted size vs. prev_size (glibc detects heap metadata corruption)

Expected behavior

nix eval should not crash.

Metadata

Determinate Nixd daemon version: 3.17.2
Determinate Nixd client version: 3.17.2
nix (Determinate Nix 3.17.2) 2.33.3

Additional context

Fix: convert lookupPathResolved to boost::concurrent_flat_map (same as the other caches) or wrap accesses with a mutex.
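To illustrate the mutex variant of the fix, here is a minimal standalone sketch. It is not the Nix code: `ResolvedPathCache`, `getOrResolve`, and `stressCache` are hypothetical names, the key and value types (`std::string` to `std::optional<std::string>`) are assumptions, and `std::unordered_map` stands in for the boost map so the example builds with only the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <optional>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the lookupPathResolved cache. Key and value
// types are assumptions made for this sketch, not the real Nix types.
class ResolvedPathCache {
public:
    // Returns the cached value for `key`; on a miss, calls `resolve(key)`,
    // stores the result, and returns whichever value ended up in the map.
    template <typename Resolve>
    std::optional<std::string> getOrResolve(const std::string& key, Resolve&& resolve) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = map_.find(key);
            if (it != map_.end()) return it->second;
        }
        // Resolve outside the lock so slow lookups do not serialize threads.
        std::optional<std::string> value = resolve(key);
        std::lock_guard<std::mutex> lock(mutex_);
        // try_emplace keeps the existing entry if another thread inserted first.
        return map_.try_emplace(key, std::move(value)).first->second;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.size();
    }

private:
    mutable std::mutex mutex_;  // guards every access to map_
    std::unordered_map<std::string, std::optional<std::string>> map_;
};

// Hits the cache from `threads` threads, each touching the same `keys` keys,
// and returns the final number of entries.
std::size_t stressCache(ResolvedPathCache& cache, int threads, int keys) {
    assert(threads > 0 && keys >= 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&cache, keys] {
            for (int i = 0; i < keys; ++i) {
                std::string key = "path" + std::to_string(i);
                cache.getOrResolve(key, [](const std::string& k) {
                    return std::optional<std::string>("/resolved/" + k);
                });
            }
        });
    }
    for (auto& w : workers) w.join();
    return cache.size();
}
```

Two threads can still resolve the same key at once in this sketch; the second insert is simply dropped. That is harmless only if resolution is idempotent, which is assumed here. Converting to boost::concurrent_flat_map, as was done for the adjacent caches, would avoid the hand-written locking.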

Workaround: eval-cores = 1 in nix.conf.
