fix(sandbox): ensure procps survives hardening and stale base images (#2343)#2408
Conversation
…2343) Follow-up to #2356. The base image addition alone does not guarantee procps is present in the production image — the GHCR base may not have been rebuilt yet, and apt-get autoremove in the hardening step could strip it. Add apt-mark manual + conditional install fallback in the production Dockerfile with a build-time smoke test, plus static and runtime regression tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds procps provisioning to images: Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
test/sandbox-provisioning.test.ts (1)
63-72: Tighten regex guards to reduce false positives in static checks.Line 64 and Line 71 patterns are broad and can still pass after unintended Dockerfile edits. Consider matching the exact hardening invariants (
apt-mark manual procps, fallback install, andps --versionsmoke check).Suggested test tightening
it("Dockerfile.base installs procps in the apt-get layer", () => { - expect(baseSrc).toMatch(/apt-get.*install.*procps/s); + expect(baseSrc).toMatch(/apt-get\s+update\s*&&\s*apt-get\s+install[\s\S]*\bprocps=2:4\.0\.2-3\b/); }); it("Dockerfile has a procps fallback for stale GHCR base images", () => { // The hardening step must protect procps from autoremove and install it // if the base image predates the procps addition. + expect(mainSrc).toMatch(/apt-mark\s+manual\s+procps/); expect(mainSrc).toMatch(/command -v ps/); - expect(mainSrc).toMatch(/install.*procps/); + expect(mainSrc).toMatch(/apt-get\s+install\s+-y\s+--no-install-recommends\s+procps=2:4\.0\.2-3/); + expect(mainSrc).toMatch(/ps --version/); });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sandbox-provisioning.test.ts` around lines 63 - 72, The tests use broad regexes that yield false positives; tighten the assertions in the "Dockerfile.base installs procps in the apt-get layer" and "Dockerfile has a procps fallback for stale GHCR base images" tests to match the exact hardening invariants: assert baseSrc contains the apt-mark invocation (e.g., match /apt-mark\s+manual\s+procps/), assert mainSrc contains the explicit ps smoke check (e.g., /ps\s+--version/ or /command\s+-v\s+ps/ depending on the script) and the fallback install of procps (e.g., /\bapt-get\b.*\binstall\b.*\bprocps\b/ or the exact flags you use like /apt-get\s+install.*--no-install-recommends.*procps/); update the expect calls that reference baseSrc and mainSrc accordingly so they verify these precise invariants.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e-test.sh`:
- Around line 419-431: The script currently only smoke-tests execution for ps;
update the existence loop (the for loop iterating over cmd and using pass/fail)
to also attempt a safe, non-interactive execution of each found tool (using a
tried sequence such as --version, -V, or --help alternatives until one succeeds)
and call pass "<cmd> executes successfully" or fail "<cmd> found but failed to
execute" via the same pass/fail helpers; locate the loop that declares cmd and
the subsequent ps-only check and replace the separate ps execution check with
per-cmd execution attempts that use the cmd variable and reuse the pass/fail
functions (ensuring no interactive flags are used).
---
Nitpick comments:
In `@test/sandbox-provisioning.test.ts`:
- Around line 63-72: The tests use broad regexes that yield false positives;
tighten the assertions in the "Dockerfile.base installs procps in the apt-get
layer" and "Dockerfile has a procps fallback for stale GHCR base images" tests
to match the exact hardening invariants: assert baseSrc contains the apt-mark
invocation (e.g., match /apt-mark\s+manual\s+procps/), assert mainSrc contains
the explicit ps smoke check (e.g., /ps\s+--version/ or /command\s+-v\s+ps/
depending on the script) and the fallback install of procps (e.g.,
/\bapt-get\b.*\binstall\b.*\bprocps\b/ or the exact flags you use like
/apt-get\s+install.*--no-install-recommends.*procps/); update the expect calls
that reference baseSrc and mainSrc accordingly so they verify these precise
invariants.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8d37a51f-caad-4522-b257-83988f21c48f
📒 Files selected for processing (4)
DockerfileDockerfile.basetest/e2e-test.shtest/sandbox-provisioning.test.ts
| for cmd in ps top free uptime vmstat; do | ||
| if command -v "$cmd" >/dev/null 2>&1; then | ||
| pass "$cmd is available at $(command -v "$cmd")" | ||
| else | ||
| fail "$cmd not found — procps package missing from sandbox image" | ||
| fi | ||
| done | ||
| # Smoke-test: ps must actually execute, not just resolve | ||
| if ps --version >/dev/null 2>&1; then | ||
| pass "ps executes successfully" | ||
| else | ||
| fail "ps found but failed to execute" | ||
| fi |
There was a problem hiding this comment.
Validate execution for all five tools, not only ps.
Line 419 currently proves PATH presence, but Lines 427-431 only smoke-test ps. To match the stated runtime guarantee, execute each tool with a safe non-interactive invocation.
Proposed hardening diff
-for cmd in ps top free uptime vmstat; do
- if command -v "$cmd" >/dev/null 2>&1; then
- pass "$cmd is available at $(command -v "$cmd")"
- else
- fail "$cmd not found — procps package missing from sandbox image"
- fi
-done
-# Smoke-test: ps must actually execute, not just resolve
-if ps --version >/dev/null 2>&1; then
- pass "ps executes successfully"
-else
- fail "ps found but failed to execute"
-fi
+for cmd in ps top free uptime vmstat; do
+ if ! command -v "$cmd" >/dev/null 2>&1; then
+ fail "$cmd not found — procps package missing from sandbox image"
+ fi
+ case "$cmd" in
+ ps) "$cmd" --version >/dev/null 2>&1 ;;
+ top) "$cmd" -b -n 1 >/dev/null 2>&1 ;;
+ free) "$cmd" -b >/dev/null 2>&1 ;;
+ uptime) "$cmd" >/dev/null 2>&1 ;;
+ vmstat) "$cmd" 1 1 >/dev/null 2>&1 ;;
+ esac || fail "$cmd found but failed to execute"
+ pass "$cmd is available and executable"
+done📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for cmd in ps top free uptime vmstat; do | |
| if command -v "$cmd" >/dev/null 2>&1; then | |
| pass "$cmd is available at $(command -v "$cmd")" | |
| else | |
| fail "$cmd not found — procps package missing from sandbox image" | |
| fi | |
| done | |
| # Smoke-test: ps must actually execute, not just resolve | |
| if ps --version >/dev/null 2>&1; then | |
| pass "ps executes successfully" | |
| else | |
| fail "ps found but failed to execute" | |
| fi | |
| for cmd in ps top free uptime vmstat; do | |
| if ! command -v "$cmd" >/dev/null 2>&1; then | |
| fail "$cmd not found — procps package missing from sandbox image" | |
| fi | |
| case "$cmd" in | |
| ps) "$cmd" --version >/dev/null 2>&1 ;; | |
| top) "$cmd" -b -n 1 >/dev/null 2>&1 ;; | |
| free) "$cmd" -b >/dev/null 2>&1 ;; | |
| uptime) "$cmd" >/dev/null 2>&1 ;; | |
| vmstat) "$cmd" 1 1 >/dev/null 2>&1 ;; | |
| esac || fail "$cmd found but failed to execute" | |
| pass "$cmd is available and executable" | |
| done |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e-test.sh` around lines 419 - 431, The script currently only
smoke-tests execution for ps; update the existence loop (the for loop iterating
over cmd and using pass/fail) to also attempt a safe, non-interactive execution
of each found tool (using a tried sequence such as --version, -V, or --help
alternatives until one succeeds) and call pass "<cmd> executes successfully" or
fail "<cmd> found but failed to execute" via the same pass/fail helpers; locate
the loop that declares cmd and the subsequent ps-only check and replace the
separate ps execution check with per-cmd execution attempts that use the cmd
variable and reuse the pass/fail functions (ensuring no interactive flags are
used).
|
✨ Thanks for submitting this pull request that proposes a way to improve the compatibility of NemoClaw by ensuring procps survives hardening and stale base images. This identifies a bug that causes procps tools to be missing inside the sandbox runtime and proposes a three-layer defense to prevent it. The addition of regression tests and a build-time smoke test will help ensure the issue does not recur. Related open PRs: PR #2356. Related open issues: Issue #2343. Related open PRs: Related open issues: |
jyaunches
left a comment
There was a problem hiding this comment.
LGTM — clean procps hardening fix with three-layer defense (apt-mark manual, conditional install fallback, build-time smoke test). Tests are well-layered: static Dockerfile guards + E2E runtime verification. All CI green except wsl-e2e still running. Added hadolint DL3001 ignore for the intentional ps --version assertion.
…VIDIA#2343) (NVIDIA#2408) ## Summary Follow-up to NVIDIA#2356. The adversarial PR review found that `procps` tools (`ps`, `top`, `free`, `uptime`, `vmstat`) were still missing inside a real sandbox runtime even after NVIDIA#2356 added `procps` to `Dockerfile.base`. Root cause: the GHCR base image may not have been rebuilt yet, and the `Dockerfile` hardening step (`apt-get autoremove --purge`) could strip it. This PR adds a three-layer defense in the production `Dockerfile`: - **`apt-mark manual procps`** before the autoremove step, so apt never considers it auto-removable - **Conditional `apt-get install procps`** if `ps` is still missing after hardening (handles stale GHCR base images that predate NVIDIA#2356) - **Build-time `ps --version`** smoke test that fails the Docker build if procps didn't make it through Plus regression tests: - Static guards in `sandbox-provisioning.test.ts` verifying both Dockerfiles contain procps provisions - Runtime E2E test in `e2e-test.sh` verifying all five procps tools are executable inside the sandbox container ## Test plan - [ ] `hadolint Dockerfile` passes - [ ] `hadolint Dockerfile.base` passes - [ ] `vitest run test/sandbox-provisioning.test.ts` passes - [ ] CI `test-e2e-sandbox` job passes (runs `e2e-test.sh` inside container, including new test 11) - [ ] Docker build succeeds with both fresh base image and stale GHCR base 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Tests** * Added end-to-end checks that verify common system debug tools (ps, top, free, uptime, vmstat) run in the sandbox. * Added regression tests ensuring Docker provisioning includes the debug tooling. * **Chores** * Improved Docker provisioning to ensure procps/debug tools are present at runtime, with a fallback install path for older base images. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Julie Yaunches <jyaunches@nvidia.com>
…VIDIA#2343) (NVIDIA#2408) ## Summary Follow-up to NVIDIA#2356. The adversarial PR review found that `procps` tools (`ps`, `top`, `free`, `uptime`, `vmstat`) were still missing inside a real sandbox runtime even after NVIDIA#2356 added `procps` to `Dockerfile.base`. Root cause: the GHCR base image may not have been rebuilt yet, and the `Dockerfile` hardening step (`apt-get autoremove --purge`) could strip it. This PR adds a three-layer defense in the production `Dockerfile`: - **`apt-mark manual procps`** before the autoremove step, so apt never considers it auto-removable - **Conditional `apt-get install procps`** if `ps` is still missing after hardening (handles stale GHCR base images that predate NVIDIA#2356) - **Build-time `ps --version`** smoke test that fails the Docker build if procps didn't make it through Plus regression tests: - Static guards in `sandbox-provisioning.test.ts` verifying both Dockerfiles contain procps provisions - Runtime E2E test in `e2e-test.sh` verifying all five procps tools are executable inside the sandbox container ## Test plan - [ ] `hadolint Dockerfile` passes - [ ] `hadolint Dockerfile.base` passes - [ ] `vitest run test/sandbox-provisioning.test.ts` passes - [ ] CI `test-e2e-sandbox` job passes (runs `e2e-test.sh` inside container, including new test 11) - [ ] Docker build succeeds with both fresh base image and stale GHCR base 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Tests** * Added end-to-end checks that verify common system debug tools (ps, top, free, uptime, vmstat) run in the sandbox. * Added regression tests ensuring Docker provisioning includes the debug tooling. * **Chores** * Improved Docker provisioning to ensure procps/debug tools are present at runtime, with a fallback install path for older base images. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Julie Yaunches <jyaunches@nvidia.com>
Summary
Follow-up to #2356. The adversarial PR review found that
procpstools (ps,top,free,uptime,vmstat) were still missing inside a real sandbox runtime even after #2356 addedprocpstoDockerfile.base. Root cause: the GHCR base image may not have been rebuilt yet, and theDockerfilehardening step (apt-get autoremove --purge) could strip it.This PR adds a three-layer defense in the production
Dockerfile:apt-mark manual procpsbefore the autoremove step, so apt never considers it auto-removableapt-get install procpsifpsis still missing after hardening (handles stale GHCR base images that predate feat(sandbox): add procps for ps/top/kill debug tools (#2343) #2356)ps --versionsmoke test that fails the Docker build if procps didn't make it throughPlus regression tests:
sandbox-provisioning.test.tsverifying both Dockerfiles contain procps provisionse2e-test.shverifying all five procps tools are executable inside the sandbox containerTest plan
hadolint Dockerfilepasseshadolint Dockerfile.basepassesvitest run test/sandbox-provisioning.test.tspassestest-e2e-sandboxjob passes (runse2e-test.shinside container, including new test 11)🤖 Generated with Claude Code
Summary by CodeRabbit
Tests
Chores