feat(sandbox): add procps for ps/top/kill debug tools (#2343) #2356
Sandbox users reported that basic process-inspection commands were
missing ("ps: command not found"), making it hard to observe what's
running inside the container. procps also provides top, kill, free,
uptime, and vmstat, which cover the common debug needs without
pulling in network probes or build tools that the final image
explicitly strips.
Pinned to the Debian bookworm version (2:4.0.2-3) to match the rest
of the base-image apt list.
Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
ericksoa
left a comment
Single-line addition, correct placement in Dockerfile.base, version pinned to bookworm. Verified survives the final-image hardening step per the test plan. LGTM.
One note: several merge commits are authored by Test User <test@example.com> — make sure your local git config is set correctly for future PRs. Not blocking since squash-merge will use the correct identity from the feature commit.
…2343) (#2408)

## Summary

Follow-up to #2356. The adversarial PR review found that `procps` tools (`ps`, `top`, `free`, `uptime`, `vmstat`) were still missing inside a real sandbox runtime even after #2356 added `procps` to `Dockerfile.base`.

Root cause: the GHCR base image may not have been rebuilt yet, and the `Dockerfile` hardening step (`apt-get autoremove --purge`) could strip it.

This PR adds a three-layer defense in the production `Dockerfile`:

- **`apt-mark manual procps`** before the autoremove step, so apt never considers it auto-removable
- **Conditional `apt-get install procps`** if `ps` is still missing after hardening (handles stale GHCR base images that predate #2356)
- **Build-time `ps --version`** smoke test that fails the Docker build if procps didn't make it through

Plus regression tests:

- Static guards in `sandbox-provisioning.test.ts` verifying both Dockerfiles contain procps provisions
- Runtime E2E test in `e2e-test.sh` verifying all five procps tools are executable inside the sandbox container

## Test plan

- [ ] `hadolint Dockerfile` passes
- [ ] `hadolint Dockerfile.base` passes
- [ ] `vitest run test/sandbox-provisioning.test.ts` passes
- [ ] CI `test-e2e-sandbox` job passes (runs `e2e-test.sh` inside container, including new test 11)
- [ ] Docker build succeeds with both fresh base image and stale GHCR base

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary by CodeRabbit

* **Tests**
  * Added end-to-end checks that verify common system debug tools (ps, top, free, uptime, vmstat) run in the sandbox.
  * Added regression tests ensuring Docker provisioning includes the debug tooling.
* **Chores**
  * Improved Docker provisioning to ensure procps/debug tools are present at runtime, with a fallback install path for older base images.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Julie Yaunches <jyaunches@nvidia.com>
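The three layers described in the follow-up can be sketched as a small shell script. This is an illustration under stated assumptions, not the contents of the actual `Dockerfile`: the function names, the `PROCPS_APPLY` guard variable, the `|| true` after `apt-mark`, and the unpinned fallback install are all hypothetical choices made for the example.

```shell
#!/bin/sh
# Sketch only: mirrors the three layers described above.
# It is not copied from the project's Dockerfile.

# Print every named tool that is NOT resolvable on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Needs root and a Debian-based image; intended for an image build step.
harden_keeping_procps() {
  # Layer 1: mark procps as manually installed so autoremove skips it.
  # The "|| true" is an assumption, meant for stale bases without procps.
  apt-mark manual procps || true

  # The hardening step named in the PR description.
  apt-get autoremove --purge -y || return 1

  # Layer 2: fallback install when ps is still missing afterwards.
  # Whether the real fallback pins a version is not stated in the source.
  if [ -n "$(missing_tools ps)" ]; then
    apt-get update && apt-get install -y --no-install-recommends procps || return 1
  fi

  # Layer 3: smoke test; a non-zero status here should fail the build.
  ps --version || return 1
}

# Opt-in guard (hypothetical variable) so sourcing this file has no side effects.
if [ "${PROCPS_APPLY:-0}" = "1" ]; then
  harden_keeping_procps
fi
```

The same `missing_tools` helper could back a runtime check similar in spirit to the E2E test: `missing_tools ps top free uptime vmstat` should print nothing inside a correctly provisioned sandbox.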
…VIDIA#2356)

## Summary

Add `procps` to the sandbox base image so users can run `ps`, `top`, `kill`, `free`, `uptime`, and `vmstat` for debugging.

Fixes NVIDIA#2343.

## Why

Sandbox users reported that basic process-inspection commands were missing:

```
sandbox@xxx-nemoclaw-assistant:~$ ps
bash: ps: command not found
```

Without `ps` it is very hard to observe what is running inside the container. `procps` is the standard Debian package that delivers the full set of essential process/resource commands (`ps`, `top`, `kill`, `free`, `uptime`, `vmstat`) in a single ~1 MB package.

## Scope & Choices

- **Only one package added.** Kept the change minimal: the reporter asked for `ps`, and `procps` covers the common debug needs without dragging in additional surfaces (e.g. `psmisc`, `net-tools`). Follow-up packages can be requested in a separate issue.
- **Added to `Dockerfile.base`, not `Dockerfile`.** This is a cacheable, rarely-changing package, matching the policy documented at the top of `Dockerfile.base`.
- **Version pinned to `2:4.0.2-3`** (Debian bookworm candidate, confirmed via `apt-cache policy procps` on `node:22-slim`), consistent with every other apt package in the list.
- **Survives the final-image hardening step.** `Dockerfile` runs `apt-get remove --purge ... && apt-get autoremove --purge` to strip build tools and network probes. I simulated this end-to-end and `procps` is preserved, because it is explicitly installed and not a transitive dep of anything removed.

## Test plan

- [x] `hadolint Dockerfile.base` passes
- [x] `hadolint Dockerfile` passes
- [x] `docker build -f Dockerfile.base -t nemoclaw-base-test:2343 .` builds successfully
- [x] `docker run --rm nemoclaw-base-test:2343 bash -c 'ps aux'` lists processes
- [x] `which ps top kill free uptime vmstat` all resolve to `/usr/bin/`
- [x] Simulated the `Dockerfile` hardening step (`apt-get remove ... && autoremove --purge`) against the built base; `ps` still works afterwards
- [x] `prek` / commit-lint / DCO / gitleaks pre-commit hooks all pass
- [ ] CI `base-image.yaml` rebuild on merge

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary by CodeRabbit

* **Chores**
  * Updated Docker base image with an additional system utility package to support improved system monitoring and process management capabilities.

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
Co-authored-by: Test User <test@example.com>
Co-authored-by: Aaron Erickson 🦞 <aerickson@nvidia.com>
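The version pin above was confirmed with `apt-cache policy procps`. A minimal sketch of how that check could be scripted follows; the helper names `candidate_version` and `pin_matches` are hypothetical, and the parsing assumes the usual `Candidate:` line in `apt-cache policy` output.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of the repository.

# Read `apt-cache policy <pkg>` output on stdin and print the Candidate version.
candidate_version() {
  awk '$1 == "Candidate:" { print $2; exit }'
}

# Succeed only when the package's current candidate equals the expected pin.
# $1 = package name, $2 = expected version string
pin_matches() {
  actual="$(apt-cache policy "$1" | candidate_version)"
  [ "$actual" = "$2" ]
}
```

Inside a `node:22-slim` container, after `apt-get update`, something like `pin_matches procps 2:4.0.2-3 || echo "pin no longer matches the candidate"` would flag a pin that has drifted from the archive.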