feat(sandbox): add procps for ps/top/kill debug tools (#2343) #2356

Merged
ericksoa merged 7 commits into main from fix/2343-sandbox-linux-tools
Apr 24, 2026
Conversation

@chengjiew

@chengjiew chengjiew commented Apr 23, 2026

Contributor

Summary

Add procps to the sandbox base image so users can run ps, top, kill, free, uptime, and vmstat for debugging.

Fixes #2343.

Why

Sandbox users reported that basic process-inspection commands were missing:

sandbox@xxx-nemoclaw-assistant:~$ ps
bash: ps: command not found

Without ps it is very hard to observe what is running inside the container. procps is the standard Debian package that delivers the full set of essential process/resource commands (ps, top, kill, free, uptime, vmstat) in a single ~1 MB package.

Scope & Choices

  • Only one package added. Kept the change minimal — the reporter asked for ps, and procps covers the common debug needs without dragging in additional surfaces (e.g. psmisc, net-tools). Follow-up packages can be requested in a separate issue.
  • Added to Dockerfile.base, not Dockerfile. This is a cacheable, rarely-changing package, matching the policy documented at the top of Dockerfile.base.
  • Version pinned to 2:4.0.2-3 (Debian bookworm candidate, confirmed via apt-cache policy procps on node:22-slim), consistent with every other apt package in the list.
  • Survives the final-image hardening step. Dockerfile runs apt-get remove --purge ... && apt-get autoremove --purge to strip build tools and network probes. I simulated this end-to-end and procps is preserved, because it is explicitly installed and not a transitive dep of anything removed.
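
One way to check the "explicitly installed" claim in the last bullet is to look for procps in apt's list of manually installed packages. The sketch below is an illustration, not part of this PR: listed is a hypothetical helper name, and the usage lines assume a Debian-based image where apt-mark and apt-get are available.

```shell
# listed: hypothetical helper. Succeeds if package name $1 appears as a
# whole line on stdin (fixed-string, exact-line match).
listed() {
  grep -qxF -- "$1"
}

# Usage inside the image (needs apt, so shown as comments only):
#   apt-mark showmanual | listed procps && echo "procps is marked manual"
#   apt-get -s autoremove --purge   # -s only simulates the removal
```

The exact-line match matters here: a substring match would also accept unrelated package names that merely contain "procps".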

Test plan

  • [x] hadolint Dockerfile.base passes
  • [x] hadolint Dockerfile passes
  • [x] docker build -f Dockerfile.base -t nemoclaw-base-test:2343 . builds successfully
  • [x] docker run --rm nemoclaw-base-test:2343 bash -c 'ps aux' lists processes
  • [x] which ps top kill free uptime vmstat all resolve to /usr/bin/
  • [x] Simulated the Dockerfile hardening step (apt-get remove ... && autoremove --purge) against the built base; ps still works afterwards
  • [x] prek / commit-lint / DCO / gitleaks pre-commit hooks all pass
  • [ ] CI base-image.yaml rebuild on merge
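
The which check in the test plan can be approximated with a small POSIX shell helper. This is a sketch under assumptions: find_on_path and missing_tools are hypothetical names, not scripts from this repository. It searches PATH directly instead of using command -v, because kill is also a shell builtin and command -v kill can succeed even when no kill binary is installed.

```shell
# find_on_path: print the first executable file named $1 found on PATH.
# Shell builtins and functions are ignored on purpose.
find_on_path() {
  name=$1
  old_ifs=$IFS
  IFS=:
  for dir in $PATH; do
    IFS=$old_ifs
    candidate=${dir:-.}/$name
    if [ -f "$candidate" ] && [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  IFS=$old_ifs
  return 1
}

# missing_tools: print each named tool that has no executable on PATH;
# return non-zero if any are missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! find_on_path "$tool" >/dev/null; then
      printf '%s\n' "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Usage inside the built image:
#   missing_tools ps top kill free uptime vmstat || echo "procps tools missing"
```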

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Chores
    • Updated Docker base image with an additional system utility package to support improved system monitoring and process management capabilities.

Sandbox users reported that basic process-inspection commands were
missing ("ps: command not found"), making it hard to observe what's
running inside the container. procps also provides top, kill, free,
uptime, and vmstat, which cover the common debug needs without
pulling in network probes or build tools that the final image
explicitly strips.

Pinned to the Debian bookworm version (2:4.0.2-3) to match the rest
of the base-image apt list.

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
@coderabbitai

coderabbitai Bot commented Apr 23, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 047f4632-eb89-47fe-b70c-182f7fa76384

📥 Commits

Reviewing files that changed from the base of the PR and between 2f436db and a779557.

📒 Files selected for processing (1)
  • Dockerfile.base

📝 Walkthrough

The Dockerfile.base has been modified to include the procps system package (version 2:4.0.2-3) in the apt-get install command. This enables process management utilities, including the ps command, within the sandbox environment.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| System Package Addition (Dockerfile.base) | Added procps package to the Docker base image's package dependencies to enable process inspection tools like ps command in the sandbox. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

🐰 A tiny change, yet oh so grand,
The ps command now at hand!
With procps installed in the sand,
Debugging tools across the land! 🔧✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding procps package to provide debug tools (ps, top, kill, etc.) to the sandbox environment. |
| Linked Issues check | ✅ Passed | The PR directly addresses issue #2343 by adding the procps package to provide essential Linux debugging tools (ps, top, kill, free, uptime, vmstat) in the sandbox image as requested. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to adding a single package (procps) to Dockerfile.base to address the linked issue; no unrelated modifications are present. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@copy-pr-bot

copy-pr-bot Bot commented Apr 23, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@chengjiew chengjiew requested a review from ericksoa April 23, 2026 16:35
@wscurran

Contributor

✨ Thanks for submitting this pull request that proposes a way to improve the testing infrastructure of NemoClaw by adding procps for ps/top/kill debug tools.

@wscurran wscurran removed the feat label Apr 23, 2026

@ericksoa ericksoa left a comment

Contributor

Single-line addition, correct placement in Dockerfile.base, version pinned to bookworm. Verified survives the final-image hardening step per the test plan. LGTM.

One note: several merge commits are authored by Test User <test@example.com> — make sure your local git config is set correctly for future PRs. Not blocking since squash-merge will use the correct identity from the feature commit.

@ericksoa ericksoa merged commit b026ab6 into main Apr 24, 2026
8 checks passed
jyaunches added a commit that referenced this pull request Apr 24, 2026
…2343) (#2408)

## Summary

Follow-up to #2356. The adversarial PR review found that `procps` tools
(`ps`, `top`, `free`, `uptime`, `vmstat`) were still missing inside a
real sandbox runtime even after #2356 added `procps` to
`Dockerfile.base`. Root cause: the GHCR base image may not have been
rebuilt yet, and the `Dockerfile` hardening step (`apt-get autoremove
--purge`) could strip it.

This PR adds a three-layer defense in the production `Dockerfile`:
- **`apt-mark manual procps`** before the autoremove step, so apt never
considers it auto-removable
- **Conditional `apt-get install procps`** if `ps` is still missing
after hardening (handles stale GHCR base images that predate #2356)
- **Build-time `ps --version`** smoke test that fails the Docker build
if procps didn't make it through
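
The three layers above can be sketched as a shell sequence. This is a minimal illustration under assumptions, not the contents of the project's `Dockerfile`: `run` and `ensure_procps` are hypothetical names, and the `DRY_RUN` switch exists only so the sequence can be inspected without root or apt.

```shell
# Hypothetical sketch of the three layers listed above; not the project's
# actual Dockerfile contents. Intended to run as root in a Debian-based image.

# run: illustrative helper. Executes a command, or only prints it when
# DRY_RUN is set, so the sequence can be inspected without touching apt.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

ensure_procps() {
  # Layer 1: mark procps as manually installed so autoremove skips it.
  # Tolerate failure in case the base image predates procps (assumption).
  run apt-mark manual procps || true

  # The hardening step named in the PR text.
  run apt-get autoremove --purge -y || return 1

  # Layer 2: fallback install if ps is still missing afterwards.
  if ! command -v ps >/dev/null 2>&1; then
    run apt-get update || return 1
    run apt-get install -y --no-install-recommends procps || return 1
  fi

  # Layer 3: smoke test; a non-zero exit here would fail a Docker RUN step.
  run ps --version
}
```

In a Dockerfile this sequence would typically sit in a single `RUN` step, so a failed smoke test aborts the build. Setting `DRY_RUN=1` before calling `ensure_procps` prints the commands instead of executing them.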

Plus regression tests:
- Static guards in `sandbox-provisioning.test.ts` verifying both
Dockerfiles contain procps provisions
- Runtime E2E test in `e2e-test.sh` verifying all five procps tools are
executable inside the sandbox container
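
The static guards themselves live in `sandbox-provisioning.test.ts` and are not shown in this thread. As a rough shell equivalent (a swapped-in illustration, with `dockerfile_mentions` as a hypothetical helper name), a guard can assert that a Dockerfile mentions `procps` outside of comment lines:

```shell
# dockerfile_mentions: hypothetical helper. Succeeds if file $1 contains
# word $2 on at least one line that is not a '#' comment.
dockerfile_mentions() {
  file=$1
  word=$2
  grep -v '^[[:space:]]*#' "$file" | grep -qw -- "$word"
}

# Usage from the repository root:
#   dockerfile_mentions Dockerfile.base procps && dockerfile_mentions Dockerfile procps
```

Skipping comment lines keeps the guard from passing on a Dockerfile that only talks about `procps` without installing it.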

## Test plan

- [ ] `hadolint Dockerfile` passes
- [ ] `hadolint Dockerfile.base` passes
- [ ] `vitest run test/sandbox-provisioning.test.ts` passes
- [ ] CI `test-e2e-sandbox` job passes (runs `e2e-test.sh` inside
container, including new test 11)
- [ ] Docker build succeeds with both fresh base image and stale GHCR
base

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
  * Added end-to-end checks that verify common system debug tools (ps,
    top, free, uptime, vmstat) run in the sandbox.
  * Added regression tests ensuring Docker provisioning includes the debug
    tooling.

* **Chores**
  * Improved Docker provisioning to ensure procps/debug tools are present
    at runtime, with a fallback install path for older base images.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Julie Yaunches <jyaunches@nvidia.com>
@cv cv added the v0.0.25 label Apr 24, 2026
DemianHeyGen pushed a commit to DemianHeyGen/NemoClaw that referenced this pull request Apr 30, 2026
…VIDIA#2356)

DemianHeyGen pushed a commit to DemianHeyGen/NemoClaw that referenced this pull request Apr 30, 2026
…VIDIA#2343) (NVIDIA#2408)

ksapru pushed a commit to ksapru/NemoClaw that referenced this pull request May 12, 2026
…VIDIA#2343) (NVIDIA#2408)

@wscurran wscurran added area: e2e End-to-end tests, nightly failures, or validation infrastructure feature PR adds or expands user-visible functionality and removed enhancement: testing labels Jun 3, 2026


Development

Successfully merging this pull request may close these issues.

Support for necessary linux tools inside the sandbox

4 participants