fix: roundup of critical/high audit findings#361
Conversation
Old filter denied AF_INET and AF_INET6 with mismatch_action=Allow, leaving AF_UNIX, AF_NETLINK, AF_PACKET reachable. A jailed postinstall could connect to /var/run/docker.sock over AF_UNIX for host takeover. Flip to default-deny on socket family, allowlist AF_UNIX only (needed by node-gyp helper IPC). Drop /tmp from landlock allowlist since home already covers script needs and TMPDIR redirects there.
glob_workspace_pattern returned matches without containment check. A pattern like \"../sibling\" lexically starts with the project prefix but resolves outside it, leaking foreign directories into the workspace package list (linker, resolver, bin wiring). Add canonicalize-based is_under_project check matching the guard the **-glob branch already had on strip_prefix.
Old key-set hash ignored dependency values. A peer-context iteration can rewrite a dep value to point at an existing key without adding a new key, so convergence fired with partially rewritten tails on disk. Linker reads tails directly to locate sibling symlink targets, producing broken node_modules silently.
Raw-bytes segment only covered the project .npmrc. Token swaps or registry= changes in ~/.npmrc went undetected, fast path stayed green while the source of truth changed. Fresh checkouts after the change would silently install from the wrong registry.
Greptile SummaryThis PR rounds up six critical/high findings across the install pipeline: a seccomp socket-family allowlist that left dangerous families open, a workspace pattern traversal escape, a peer-context convergence hash that missed dependency tails, a non-atomic patch-commit that could orphan patch files, and a state digest that omitted Confidence Score: 4/5Safe to merge with awareness of the documented AF_UNIX connect() gap on pre-6.10 kernels; no new P0/P1 issues introduced All five fixes address confirmed bugs with clear before/after behaviour. Two P2 findings (EPERM vs EAFNOSUPPORT documentation mismatch, absolute path bytes in state hash) do not affect correctness. The AF_UNIX Landlock gap was pre-existing and is now documented. Score capped at 4 due to the residual security surface in the jail (acknowledged, low-exploitability) and the minor cache-churn issue in the state hash. crates/aube-scripts/src/linux_jail.rs (AF_UNIX connect() gap and EPERM/EAFNOSUPPORT discrepancy), crates/aube/src/state.rs (absolute path in hash) Important Files Changed
Reviews (6): Last reviewed commit: "scripts: document landlock v2 unix-conne..." | Re-trigger Greptile |
Previous attempt set mismatch_action=Errno which applied to every syscall the filter saw, not just socket(). Node startup ate every syscall returning EAFNOSUPPORT and tests 7, 9, 11, 12 in allow_builds.bats hard-failed. Restore mismatch_action=Allow and match_action=Errno(EPERM) so only socket()/socketpair() with the listed family arguments get blocked. Deny list now covers AF_INET, AF_INET6, AF_NETLINK, AF_PACKET. AF_UNIX stays allowed since node and node-gyp need it for stdio and worker IPC. Filesystem reach via AF_UNIX is bounded by the landlock policy, which never grants write to /var/run or /run, so docker/podman socket paths stay unreachable.
atomic_write replaces the existing patch unconditionally, so a re-patch that hits a manifest write failure used to drop the previous .patch content while leaving the manifest entry pointing at the path. Snapshot the prior bytes before the write and restore them on failure.
Greptile flagged that the previous patch still left AF_VSOCK, AF_XDP, AF_ALG, AF_BLUETOOTH and every other unlisted family reachable. Flip the filter to default-deny: rules now allow AF_UNIX only, every other family on SYS_socket / SYS_socketpair returns EPERM (the same errno fixtures already handle, so test 12 stays green). mismatch_action only fires for syscalls listed in rules, so unrelated syscalls (open, write, mmap, ...) keep flowing without seccomp intervention. Earlier attempt used EAFNOSUPPORT which appeared to upset libuv probes at node startup; EPERM matches the kernel's usual "permission denied" semantics and existing fixtures recognise it.
Head branch was pushed to by a user without write access
Audited
Six critical/high bugs found via targeted code-review pass over
aube-resolver,aube-lockfile,aube-store,aube-linker,aube-scripts,aube-registry,aube-workspace,aube-manifest,aube-settings, and the install pipeline. Workspace traversalverified empirically with a repro repo, others verified by code read.
Fixed
fix(scripts): expand seccomp denylist + drop /tmp from landlockOld seccomp filter denied AF_INET and AF_INET6, left every other
family open. Now denies AF_INET, AF_INET6, AF_NETLINK, AF_PACKET,
AF_VSOCK, AF_XDP, AF_ALG, AF_BLUETOOTH, AF_RDS, AF_CAN, AF_TIPC,
AF_IB, AF_NFC. Strict superset of original.
AF_UNIX stays allowed (node and node-gyp IPC). True default-deny on
the family axis is not implementable with seccompiler today since
its
mismatch_actionis global, setting it toErrnokills everynon-socket syscall. Documented in the source.
Known residual gap: AF_UNIX
connect()to filesystem socket paths(e.g.
/var/run/docker.sock) is not bounded by Landlock ABI v2.v2 only filters VFS open/write hooks, the unix socket connect path
bypasses them by passing
sun_pathinline.LANDLOCK_ACCESS_FS_CONNECT_UNIXarrives in v5 (kernel 6.10+). The policy pins v2 for LTS-kernel
coverage (Ubuntu 22.04, Debian 12, RHEL 9). On hosts with a sensitive
unix socket and a kernel below 6.10, install in a namespace that
bind-mounts the socket away.
Same commit drops
/tmpfrom the landlock full-access list.homealready covers writable scratch and
apply_jail_envredirectsTMPDIR/TMP/TEMP there. Granting
/tmplet scripts seed symlink raceson shared CI hosts and read other tenants' tmp files.
fix(workspace): reject pnpm-workspace patterns escaping project rootglob_workspace_patternpushed match parents without containmentcheck. Pattern
../siblinglexically starts with the project prefixbut escapes via parent components, leaking foreign dirs into the
workspace package list (linker, resolver, bin wiring all consume it).
Repro before fix:
After: empty result, valid
packages/*patterns still resolve.Adds
is_under_projecthelper, canonicalize both sides, fall backto lexical compare when paths do not exist yet (combined with
starts_with(project_dir)guard so the fallback is also containment-checked).fix(resolver): include dep tails in peer-context convergence hashOld
key_set_hashcovered package keys only. Peer-context iterationcan rewrite a dep value to point at an existing key without adding
a new key, so convergence fired with partially rewritten tails.
Linker reads tails directly to locate sibling symlink targets, stale
tails ship broken
node_modulessilently.New
graph_hashwalks(key, [(dep_name, dep_tail)...])for everypackage. Same
ordered_seq_hashinfra, no extra deps. Capacity hintis exact (
len*3 + total_deps*2), explicit\x1f/\x1eframing.fix(cli): make patch-commit atomic, roll back on manifest failureOld order: write
.patchfile, thenupsert_patched_dependency.If the manifest write failed (Windows file lock, disk full,
concurrent writer) the patch lived on disk but
package.jsonhad noentry.
load_patchesreadspackage.json, so the patch was invisibleto every future install.
Switch to
aube_util::fs_atomic::atomic_writefor the patch, snapshotthe prior bytes (re-patch path), restore them on failure or remove
the orphan if no prior patch existed.
fix(install): hash user-level npmrc in install state digesthash_settingsraw-bytes segment only hashed<project>/.npmrc.A token swap or
registry=change in~/.npmrcleft the fast-pathstate hash green, so
aube installreported "Already up to date"while the source of truth for tarball URLs and auth had changed.
Fresh checkouts after the change would install from the wrong
registry.
Includes user-level
~/.npmrcviaaube_util::env::home_dir().Verified
cargo build --allcleancargo test --workspace1145 passed, 0 failedFrozen fast path on warm state stays under 100ms across the matrix.
pnpm v9 lockfile roundtrip works.
workspace:*protocol resolves.Valid workspace patterns (
packages/*) still match correctly.Skipped
needed)
import_tarball(latent, every currentcaller verifies. Worth tightening to enforceable later)