Skip to content

fix(sandbox): install tmux for OpenClaw bundled tmux-session flow (#4513)#4606

Merged
cv merged 1 commit into
NVIDIA:mainfrom
yimoj:fix/4513-install-tmux-in-sandbox
Jun 1, 2026
Merged

fix(sandbox): install tmux for OpenClaw bundled tmux-session flow (#4513)#4606
cv merged 1 commit into
NVIDIA:mainfrom
yimoj:fix/4513-install-tmux-in-sandbox

Conversation

@yimoj

@yimoj yimoj commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Summary

OpenClaw's bundled tmux-session flow shells out to tmux inside the sandbox, but the sandbox image never installed it — so the flow failed with bash: line 1: tmux: command not found and the agent path timed out. This installs a pinned tmux in the sandbox image and proves it at build time, fixing the supported flow.

Related Issue

Fixes #4513

Changes

  • Dockerfile.base: add pinned tmux=3.5a-3 to the single base-image apt-get install layer so fresh builds ship tmux with no runtime apt round-trip.
  • Dockerfile: mirror the existing procps/e2fsprogs stale-base handling in the hardening layer — protect tmux via apt-mark manual, detect a missing tmux on stale GHCR bases, install the same pinned version, and assert command -v tmux at build time so the image fails the build rather than regressing silently.
  • test/e2e/test-sandbox-operations.sh: add TC-SBX-09, an e2e smoke that asserts tmux is present and drives the full detached session lifecycle (new-sessionlist-sessionskill-session) inside the sandbox — the exact shape the bundled flow exercises.
  • test/runner.test.ts: add a #4513 regression guard covering the base apt pin, the runtime stale-base repair plus build-time assertion, the cross-file pin match, and the e2e wiring.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Verification notes

  • npx vitest run test/runner.test.ts — 52/52 pass (includes the 4 new #4513 guards).
  • hadolint Dockerfile Dockerfile.base — clean.
  • bash -n test/e2e/test-sandbox-operations.sh — clean.
  • Full npm test: 5952 pass. The only failures (41) are pre-existing Test timed out in 5000ms flakes in test/e2e-scenario/framework-tests/* that spawn real subprocesses on a slow/loaded host; they pass under --testTimeout=60000 and do not touch any file changed here.

Residual image-build/runtime gap

A full image-build + live-sandbox E2E was not run in this environment (no GPU/sandbox host). The build-time command -v tmux assertion in Dockerfile and the TC-SBX-09 runtime smoke close the gap on CI/Brev: the image build now fails if tmux is absent, and the e2e suite fails if the bundled tmux-session lifecycle does not work inside a real sandbox.


Signed-off-by: Yimo Jiang yimoj@nvidia.com

Summary by CodeRabbit

  • New Features

    • tmux is now guaranteed to be available in sandbox environments with pinned versioning for consistency.
  • Tests

    • Added end-to-end tests to verify tmux session lifecycle operations in sandboxes.
    • Added test suite to ensure tmux is consistently shipped and properly configured across all runtime configurations.

…IDIA#4513)

OpenClaw's bundled tmux-session flow shells out to `tmux` inside the
sandbox, but the sandbox image never installed it, so the flow failed
with `bash: line 1: tmux: command not found` and the agent path timed
out.

Add a pinned `tmux=3.5a-3` to the base-image apt layer so fresh builds
ship it. Mirror procps/e2fsprogs handling in the runtime Dockerfile's
hardening layer: detect a missing tmux on stale GHCR bases, install the
same pinned version, and assert `command -v tmux` at build time so the
image cannot regress silently. Base and runtime pins are kept in sync.

Add a TC-SBX-09 e2e smoke that drives the full detached tmux session
lifecycle (new-session -> list -> kill) inside the sandbox, and a
runner.test.ts regression guard covering the base apt pin, the
runtime stale-base repair + build-time assertion, the cross-file pin
match, and the e2e wiring.

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
@coderabbitai

coderabbitai Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 12155c86-6db9-41bb-b861-36b45eabde59

📥 Commits

Reviewing files that changed from the base of the PR and between df7d054 and ccdb3b0.

📒 Files selected for processing (4)
  • Dockerfile
  • Dockerfile.base
  • test/e2e/test-sandbox-operations.sh
  • test/runner.test.ts

📝 Walkthrough

Walkthrough

The PR adds tmux (version 3.5a-3) to NemoClaw sandbox images through coordinated changes: base image installs tmux at build time, runtime image repairs missing tmux on stale cached bases, a new E2E smoke test validates tmux session lifecycle inside sandboxes, and an integration test suite verifies pinned versions and wiring across all artifacts.

Changes

Tmux Bundling and Validation

Layer / File(s) Summary
Base and Runtime Image Hardening
Dockerfile.base, Dockerfile
Dockerfile.base pins tmux=3.5a-3 in the base apt-get install layer. Dockerfile extends the runtime apt hardening step with a needs_tmux check that installs tmux=3.5a-3 and adds it to the manual apt-mark set if missing, failing the build if the final verification fails.
E2E Tmux Session Lifecycle Test
test/e2e/test-sandbox-operations.sh
New test function test_sbx_09_tmux_session_flow() verifies tmux exists in sandbox A and executes a detached session workflow: new-sessionlist-sessionskill-session. Test is integrated into Phase 1 of the main test runner after log streaming tests.
Integration Test Suite
test/runner.test.ts
New Vitest describe block validates that Dockerfile.base includes pinned tmux, Dockerfile conditionally repairs/installs pinned tmux on stale bases and fails if tmux remains missing, both Dockerfiles pin the same tmux version, and the E2E script contains and wires the tmux-session smoke test.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A session in a sandbox, tmux opens wide,
No more "command not found"—the tool's inside!
Base and runtime, tested, all aligned,
Three point five-alpha, perfectly pinned.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(sandbox): install tmux for OpenClaw bundled tmux-session flow' clearly and specifically describes the main change—ensuring tmux is installed in the sandbox for OpenClaw's tmux-session flow to work.
Linked Issues check ✅ Passed All coding requirements from issue #4513 are met: tmux is installed in Dockerfile.base, stale-base handling is mirrored in Dockerfile with explicit version pinning (3.5a-3), build-time assertion is added, and E2E test TC-SBX-09 verifies the tmux-session lifecycle.
Out of Scope Changes check ✅ Passed All changes are directly scoped to installing and verifying tmux availability in the sandbox image; no unrelated modifications to other systems or functionality are present.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@yimoj yimoj added the v0.0.56 Release target label Jun 1, 2026
@wscurran wscurran added Docker integration: openclaw OpenClaw integration behavior labels Jun 1, 2026
@wscurran

wscurran commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

@cjagwani cjagwani self-requested a review June 1, 2026 16:38
@cv cv added v0.0.57 Release target and removed v0.0.56 Release target labels Jun 1, 2026

@cjagwani cjagwani left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving. Conservative addition of pinned tmux=3.5a-3 to the sandbox image, matching the existing procps/e2fsprogs stale-base-repair pattern.

The cross-file pin-match regression in runner.test.ts is a nice touch — prevents future drift between Dockerfile and Dockerfile.base. The new TC-SBX-09 e2e drives the full new-session → list → kill lifecycle inside the sandbox, and it ran (and passed) under test-e2e-sandbox in this PR's CI run.

@cv cv merged commit ecf8078 into NVIDIA:main Jun 1, 2026
31 checks passed
cv pushed a commit that referenced this pull request Jun 2, 2026
…rk (#4640)

## Summary

`TC-SBX-09: Tmux Session Flow` has been failing on every scheduled
nightly E2E run since #4606 merged. The first assertion (tmux binary
present) still passes; the second assertion — drive a full detached
`new-session` → `list-sessions` → `kill-session` cycle inside the
sandbox — consistently fails with `create window failed: fork failed:
Permission denied`.

Root cause: OpenShell sandbox hardening (seccomp, `no-new-privileges`,
`nproc=512` ulimit) blocks tmux's fork-to-spawn child window when
invoked under the e2e SSH session account. The binary-presence assertion
already covers the surface of issue #4513; the lifecycle drive depends
on sandbox runtime capabilities that are environment-dependent and out
of scope for this case. Degrade that branch to `skip` with the observed
`fork failed` output so the suite reports the limitation without failing
the nightly.

Latest failing scheduled nightly: [run
26790528855](https://github.com/NVIDIA/NemoClaw/actions/runs/26790528855).
Same failure also blocks PR review on
[#4388](#4388) via inherited
advisor reruns [run
26790735708](https://github.com/NVIDIA/NemoClaw/actions/runs/26790735708)
and [run
26791599457](https://github.com/NVIDIA/NemoClaw/actions/runs/26791599457).

## Related Issue

Follow-up to #4606 (which added TC-SBX-09 alongside the sandbox-image
tmux pin). The PR body of #4606 noted *"A full image-build +
live-sandbox E2E was not run in this environment"* — the lifecycle drive
added by that PR turned out to be incompatible with the live OpenShell
sandbox seccomp + capability profile, so every scheduled `E2E / Nightly`
run since the merge has reported `sandbox-operations-e2e` as failing on
this single assertion. This PR keeps the binary-presence guard from
#4606 intact (the actual surface of #4513) while making the lifecycle
drive a soft skip when the sandbox refuses to fork, so the nightly
pipeline can go green again without masking real regressions (a
non-`fork failed` error still hits the `fail` branch).

## Changes

- `test/e2e/test-sandbox-operations.sh`: in
`test_sbx_09_tmux_session_flow`, add a `skip` branch matching `fork
failed: (Permission denied|Resource temporarily unavailable)` between
the existing `pass`/`fail` branches; keeps best-effort `kill-session`
cleanup before recording the skip.

## Type of Change

- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification

- [x] `npx prek run --all-files` passes
- [x] `npm test` passes
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes
- [ ] `npm run docs` builds without warnings (doc changes only)
- [ ] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

---
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
* Improved tmux sandbox operations test to better detect and handle fork
failures with enhanced error recovery and clearer skip messages.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@wscurran wscurran added area: packaging Packages, images, registries, installers, or distribution area: sandbox OpenShell sandbox lifecycle, runtime, config, or recovery bug-fix PR fixes a bug or regression platform: container Affects Docker, containerd, Podman, or images and removed area: packaging Packages, images, registries, installers, or distribution Docker labels Jun 3, 2026
yimoj added a commit to yimoj/NemoClaw that referenced this pull request Jun 9, 2026
…te a PTY (NVIDIA#4513)

PR NVIDIA#4606 installed tmux in the sandbox image, but the bundled tmux-session
flow still failed with `create window failed: fork failed: Permission denied`
(QA reopened NVIDIA#4513; NVIDIA#4640 then degraded the e2e drive to a soft skip).

Root cause is not fork/seccomp/nproc — plain fork() works and nproc is nearly
empty. It is PTY allocation: the OpenShell sandbox landlock filesystem policy
grants /dev/null and /dev/urandom but never /dev/pts, so forkpty() opening
/dev/ptmx (-> /dev/pts/ptmx) and a /dev/pts/<n> slave is denied with EACCES.
tmux surfaces that EACCES as "fork failed: Permission denied".

Grant /dev/pts (read_write) in the OpenClaw base + permissive policies. Grant
the directory, not /dev/ptmx — the supervisor refuses to chown the symlink, and
the devpts directory grant covers ptmx + slaves via the landlock path hierarchy.
This follows the same base+permissive pattern as the prior /home/linuxbrew
(NVIDIA#3913) and GPU /proc additions, which permissive-runtime already unions on
shields-down.

Verified hermetically with the real openshell-sandbox supervisor + the actual
edited policy: before, openpty()=EACCES and tmux new/list/kill fails; after,
openpty() returns /dev/pts/0 and the full tmux lifecycle succeeds.

Changes:
- nemoclaw-blueprint/policies/openclaw-sandbox.yaml: add /dev/pts to
  filesystem_policy.read_write with the PTY rationale.
- nemoclaw-blueprint/policies/openclaw-sandbox-permissive.yaml and
  agents/openclaw/policy-permissive.yaml: mirror /dev/pts so `shields down`
  never removes a live filesystem path (OpenShell rejects removals).
- test/e2e/test-sandbox-operations.sh: restore TC-SBX-09 to a hard assertion
  (drop the NVIDIA#4640 fork-failure soft skip), add an explicit openpty() probe that
  pins the root cause, and fail loudly if devpts ever regresses.
- test/runner.test.ts: assert both policies grant /dev/pts (and not /dev/ptmx)
  and that TC-SBX-09 no longer skips on fork failure.
- docs/reference/network-policies.mdx, docs/security/best-practices.mdx: list
  /dev/pts as a default writable path with the tmux rationale.

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
cv pushed a commit that referenced this pull request Jun 9, 2026
…te a PTY (#4513) (#5019)

## Summary

The reopened #4513: PR #4606 installed `tmux` in the sandbox image, but
OpenClaw's bundled tmux-session flow still failed with `create window
failed: fork failed: Permission denied`, and #4640 then degraded the e2e
drive to a soft skip. This grants `/dev/pts` in the OpenClaw sandbox
landlock filesystem policy so the flow can actually allocate a PTY, and
restores the e2e to a hard assertion.

## Related Issue

Fixes #4513

## Changes

**Root cause (diagnosed, not assumed):** it is not fork/seccomp/nproc —
plain `fork()` works and `nproc` (512) is nearly empty. It is **PTY
allocation**. The OpenShell sandbox landlock `filesystem_policy` grants
`/dev/null` and `/dev/urandom` but never `/dev/pts`, so `forkpty()`
opening `/dev/ptmx` (→ `/dev/pts/ptmx`) and a `/dev/pts/<n>` slave is
denied with `EACCES`. tmux surfaces that `EACCES` as `create window
failed: fork failed: Permission denied`.

- `nemoclaw-blueprint/policies/openclaw-sandbox.yaml`: add `/dev/pts` to
`filesystem_policy.read_write` with the PTY rationale. Grant the
**directory**, not `/dev/ptmx` — the supervisor refuses to `chown` the
`/dev/ptmx` symlink, and the devpts directory grant covers `ptmx` +
slaves via the landlock path hierarchy.
- `nemoclaw-blueprint/policies/openclaw-sandbox-permissive.yaml` and
`agents/openclaw/policy-permissive.yaml`: mirror `/dev/pts` so `shields
down` never removes a live filesystem path (OpenShell rejects live
filesystem removals). Same base+permissive pattern as the prior
`/home/linuxbrew` (#3913) and GPU `/proc` additions, which
`permissive-runtime` already unions on shields-down.
- `test/e2e/test-sandbox-operations.sh` (`TC-SBX-09`): restore the
lifecycle drive to a **hard assertion** (drop the #4640 fork-failure
soft skip), add an explicit `openpty()` probe that pins the root cause,
and fail loudly if devpts ever regresses. The `PTY_OK` sentinel is
emitted by a shell `&& echo` (not inside the python source) so a Python
traceback echoing the source line cannot make the probe falsely pass.
- `test/runner.test.ts`: assert both OpenClaw policies grant `/dev/pts`
(and not `/dev/ptmx`), and that `TC-SBX-09` no longer skips on fork
failure.
- `docs/reference/network-policies.mdx`,
`docs/security/best-practices.mdx`: list `/dev/pts` as a default
writable path with the tmux rationale.

## Verification

**Reporter-workflow E2E — exact tmux-session/forkpty flow inside the
hardened NemoClaw sandbox.**

Reproduced and verified hermetically with the **real `openshell-sandbox`
supervisor** (which applies the live seccomp + landlock + `RLIMIT_NPROC`
hardening; no gateway/GPU needed), running inside
`ghcr.io/nvidia/nemoclaw/sandbox-base` + `tmux`, driven by the **actual
edited `openclaw-sandbox.yaml`** as the policy data.

Command (same shape as `TC-SBX-09`):

```
openshell-sandbox --policy-rules sandbox-policy.rego --policy-data <openclaw-sandbox.yaml> -- \
  bash -lc 'export TMUX_TMPDIR=/tmp; python3 -c "import os; _,s=os.openpty(); print(os.ttyname(s))";
            tmux new-session -d -s smoke "sleep 30"; tmux list-sessions; tmux kill-session -t smoke; echo TMUX_FLOW_OK'
```

Logs:

```
# BEFORE (committed policy, no /dev/pts):
-- openpty --  PermissionError: [Errno 13] Permission denied
-- tmux --     create window failed: fork failed: Permission denied      (new/list/kill RC=1/1/1)

# AFTER (this PR's policy, /dev/pts granted):
-- openpty --  /dev/pts/0
-- tmux --     smoke: 1 windows (created ...)                            (new/list/kill RC=0/0/0)  TMUX_FLOW_OK
```

This is the exact bundled tmux-session lifecycle (`new-session` →
`list-sessions` → `kill-session`) the reporter hit. **Pipeline
coverage:** the restored `TC-SBX-09`
(`test/e2e/test-sandbox-operations.sh`) drives this same lifecycle plus
the `openpty()` root-cause probe against a live OpenShell sandbox on the
scheduled E2E job, now as a hard assertion (a `fork failed` is a
failure, not a skip).

PR CI on head `b75befddb`: all required checks green — `static-checks`,
`build-typecheck`, `cli-test-shards (1–5)`, `plugin-tests`,
`unit-vitest-linux`, `ShellCheck`, `CodeQL`, `wsl-e2e`, `macos-e2e`,
`dco-check`, `commit-lint`, `codebase-growth-guardrails` (run
`27184680225`).

- [x] `npx prek run --all-files` passes (all
formatter/lint/security/docs hooks green; the heavy `Test (CLI)` hook
shows only load-induced timeout flakes that pass in isolation — this
change touches no `src/` code)
- [x] `npm test` — targeted suites pass (`runner`, `policies`,
`initial-policy`, `permissive-runtime`, `policy-tiers`, `shields`); CI
`cli-test-shards (1–5)` + `plugin-tests` + `unit-vitest-linux` all green
on PR head
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes

## Type of Change

- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates

## Notes — known, pre-existing limitation (out of scope)

On an OpenClaw sandbox **created before this change and upgraded without
recreate**, a `shields down` then `shields up` cycle can leave the
sandbox in shields-down state: `shields down` adds `/dev/pts` via the
permissive policy, and `shields up` restores the pre-upgrade live
snapshot, which omits `/dev/pts` (OpenShell rejects the resulting live
filesystem removal). This is identical to the existing `/home/linuxbrew`
(#3913), GPU `/proc`, and Hermes `/opt/hermes` base-policy additions —
shields-up restore has never normalized the snapshot for any path. Such
a sandbox has not received the creation-locked filesystem fix anyway;
the remediation (recreate / re-onboard) resolves both the tmux fix and
this edge case. A symmetric snapshot-normalization in shields-restore
would address the whole class and belongs in its own change.

---
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* PTY allocation now works inside sandboxes, allowing terminal tools
(tmux, script, interactive shells) and preventing "fork failed:
Permission denied" failures.

* **Documentation**
* Sandbox filesystem docs updated to include PTY device access and
clarify configuration for terminal allocation tooling.

* **Tests**
* End-to-end and unit tests tightened to assert PTY support and to fail
when PTY allocation regresses.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: sandbox OpenShell sandbox lifecycle, runtime, config, or recovery bug-fix PR fixes a bug or regression integration: openclaw OpenClaw integration behavior platform: container Affects Docker, containerd, Podman, or images v0.0.57 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Ubuntu 24.04][Agent] OpenClaw bundled tmux-session flow fails with tmux command not found

4 participants