fix(sandbox): harden startup log creation#3888
Conversation
Signed-off-by: Test User <test@example.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds atomic, defensive log provisioning helpers and hardens Hermes and Nemoclaw startup paths to reject unsafe targets, replace symlinked logs with regular files with explicit modes/ownership, and backfills an idempotent runtime shell env shim; tests verify symlink replacement, permissions, and shim idempotency. ChangesStartup log file hardening
Runtime shell env shim and sandbox-home refactor
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
scripts/nemoclaw-start.sh (1)
48-87: Please run the entrypoint E2Es on this branch before merge.This hardening sits on the sandbox boot path, and the unit test here only exercises an extracted snippet rather than the full Landlock/non-root/process-lifecycle path. I’d run
sandbox-survival-e2e,sandbox-operations-e2e, andcloud-e2efor this change.As per coding guidelines,
scripts/nemoclaw-start.sh:This file is a sandbox entrypoint script. Changes affect every sandbox boot and are invisible to unit tests (Landlock, non-root execution, process lifecycle).andE2E test recommendation: sandbox-survival-e2e, sandbox-operations-e2e, cloud-e2e.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/nemoclaw-start.sh` around lines 48 - 87, The change in scripts/nemoclaw-start.sh affects sandbox boot behavior (see prepare_start_log function and its invocation at the bottom) and must be validated with full entrypoint E2E runs before merging; please run the recommended end-to-end suites sandbox-survival-e2e, sandbox-operations-e2e, and cloud-e2e against this branch (exercising Landlock/non-root/process-lifecycle paths) and post the passing results or any fixes, referencing prepare_start_log and the two calls that invoke it for root and non-root paths.test/nemoclaw-start.test.ts (1)
120-151: ⚡ Quick winAdd a regression for the real-directory refusal path.
The new helper also refuses to replace
_START_LOGwhen it is an actual directory, but this suite only locks in the symlink-replacement branch. A small companion case that pre-createslogPathas a directory and asserts a non-zero exit plus an unchanged directory would cover the other security contract introduced here.🧪 Suggested companion test
+ it("refuses to replace a real directory at the startup log path", () => { + const src = fs.readFileSync(START_SCRIPT, "utf-8"); + const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-start-log-dir-")); + const logPath = path.join(tmpDir, "nemoclaw-start.log"); + fs.mkdirSync(logPath); + + try { + const block = earlyStartLogBlock(src).replace( + '_START_LOG="/tmp/nemoclaw-start.log"', + `_START_LOG=${JSON.stringify(logPath)}`, + ); + const result = spawnSync("bash", ["-c", ["set -euo pipefail", block].join("\n")], { + encoding: "utf-8", + timeout: 5000, + }); + + expect(result.status).not.toBe(0); + expect(fs.statSync(logPath).isDirectory()).toBe(true); + } finally { + fs.rmSync(tmpDir, { recursive: true, force: true }); + } + });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/nemoclaw-start.test.ts` around lines 120 - 151, Add a companion test to cover the real-directory refusal path: create a new it block alongside "replaces a pre-existing startup log symlink..." that pre-creates logPath as a directory (instead of a symlink), writes trustedPath, builds the script using earlyStartLogBlock with _START_LOG set to JSON.stringify(logPath), spawns bash as in the existing test, and assert the process exits non-zero and that the pre-existing directory at logPath still exists and is unchanged (e.g., isDirectory and trustedPath contents unchanged). Ensure you reference earlyStartLogBlock, logPath and trustedPath when locating the relevant setup and assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 48-87: The change in scripts/nemoclaw-start.sh affects sandbox
boot behavior (see prepare_start_log function and its invocation at the bottom)
and must be validated with full entrypoint E2E runs before merging; please run
the recommended end-to-end suites sandbox-survival-e2e, sandbox-operations-e2e,
and cloud-e2e against this branch (exercising
Landlock/non-root/process-lifecycle paths) and post the passing results or any
fixes, referencing prepare_start_log and the two calls that invoke it for root
and non-root paths.
In `@test/nemoclaw-start.test.ts`:
- Around line 120-151: Add a companion test to cover the real-directory refusal
path: create a new it block alongside "replaces a pre-existing startup log
symlink..." that pre-creates logPath as a directory (instead of a symlink),
writes trustedPath, builds the script using earlyStartLogBlock with _START_LOG
set to JSON.stringify(logPath), spawns bash as in the existing test, and assert
the process exits non-zero and that the pre-existing directory at logPath still
exists and is unchanged (e.g., isDirectory and trustedPath contents unchanged).
Ensure you reference earlyStartLogBlock, logPath and trustedPath when locating
the relevant setup and assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: a3b5c423-f61d-4003-8f70-846d54f14892
📒 Files selected for processing (2)
scripts/nemoclaw-start.shtest/nemoclaw-start.test.ts
Signed-off-by: Test User <test@example.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/nemoclaw-start.test.ts`:
- Around line 1909-1915: The test currently writes to gatewayLog and autoPairLog
before asserting symlink behavior which can mask failures; after the spawnSync()
that produces result, immediately capture and assert the pre-write state of
gatewayLog and autoPairLog (existence and whether they are symlinks and their
mode) using fs.existsSync and fs.lstatSync or fs.statSync on gatewayLog and
autoPairLog (and verify they point to gatewayTarget/autoPairTarget) before
performing the fs.writeFileSync() calls, then perform the writes only to verify
the trusted targets (gatewayTarget, autoPairTarget) remain unchanged and finally
re-check gatewayLogIsSymlink/autoPairLogIsSymlink to ensure the harness didn’t
inadvertently replace symlinks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 3a12a31f-c307-4985-9d16-e6fbfec30398
📒 Files selected for processing (4)
agents/hermes/start.shscripts/nemoclaw-start.shtest/nemoclaw-start.test.tstest/sandbox-init.test.ts
|
✨ Thanks for submitting this detailed PR about hardening startup log handling for the OpenClaw and Hermes entrypoints. This PR proposes a way to prevent a sandbox user from planting a symlink that redirects log truncation, writes, |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agents/hermes/start.sh (1)
230-287:⚠️ Potential issue | 🟠 Major | ⚡ Quick winBackfill the rc shim before locking Hermes rc files.
This now writes
/tmp/nemoclaw-proxy-env.shand immediately locks$_SANDBOX_HOME, but it never does the OpenClaw-style shim backfill that makes/sandbox/.bashrc/.profilesource that file. On stale Hermes images, connect shells will miss the proxy/guard exports entirely._SANDBOX_HOMEis also resolved from the pre-normalizedHOME, even though the non-root path later forcesHOME=/sandbox, so the wrong rc directory can get locked in that posture.♻️ Proposed fix
if [ "$(id -u)" -eq 0 ]; then _SANDBOX_HOME=$(getent passwd sandbox 2>/dev/null | cut -d: -f6) _SANDBOX_HOME="${_SANDBOX_HOME:-/sandbox}" else - _SANDBOX_HOME="${HOME:-/sandbox}" + _SANDBOX_HOME="/sandbox" fi @@ _PROXY_ENV_FILE="/tmp/nemoclaw-proxy-env.sh" +_RUNTIME_SHELL_ENV_SHIM="[ -f ${_PROXY_ENV_FILE} ] && . ${_PROXY_ENV_FILE}" + +ensure_runtime_shell_env_shim() { + local rc_file + for rc_file in "${_SANDBOX_HOME}/.bashrc" "${_SANDBOX_HOME}/.profile"; do + [ -L "$rc_file" ] && continue + if [ -f "$rc_file" ] && grep -qxF "$_RUNTIME_SHELL_ENV_SHIM" "$rc_file" 2>/dev/null; then + continue + fi + if [ -e "$rc_file" ]; then + printf '\n%s\n%s\n' '# Source runtime proxy config' "$_RUNTIME_SHELL_ENV_SHIM" >>"$rc_file" + else + printf '%s\n%s\n' '# Source runtime proxy config' "$_RUNTIME_SHELL_ENV_SHIM" >"$rc_file" + fi + done +} + write_runtime_shell_env +ensure_runtime_shell_env_shim lock_rc_files "$_SANDBOX_HOME"As per coding guidelines,
agents/hermes/**recommendsrebuild-hermes-stale-base-e2efor stale cached base image paths.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agents/hermes/start.sh` around lines 230 - 287, Resolve _SANDBOX_HOME after the non-root HOME normalization and backfill the sandbox rc shim before locking files: move the _SANDBOX_HOME calculation so it uses the final HOME value (so non-root paths get /sandbox), then after write_runtime_shell_env() add a step that injects a small shim into the sandbox rc files (e.g., create or ensure a line sourcing "$_PROXY_ENV_FILE" in the target .bashrc/.profile using the same emit_sandbox_sourced_file mechanism) and only then call lock_rc_files "$_SANDBOX_HOME"; reference write_runtime_shell_env, _PROXY_ENV_FILE, emit_sandbox_sourced_file and lock_rc_files to locate where to add/move these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@agents/hermes/start.sh`:
- Around line 230-287: Resolve _SANDBOX_HOME after the non-root HOME
normalization and backfill the sandbox rc shim before locking files: move the
_SANDBOX_HOME calculation so it uses the final HOME value (so non-root paths get
/sandbox), then after write_runtime_shell_env() add a step that injects a small
shim into the sandbox rc files (e.g., create or ensure a line sourcing
"$_PROXY_ENV_FILE" in the target .bashrc/.profile using the same
emit_sandbox_sourced_file mechanism) and only then call lock_rc_files
"$_SANDBOX_HOME"; reference write_runtime_shell_env, _PROXY_ENV_FILE,
emit_sandbox_sourced_file and lock_rc_files to locate where to add/move these
changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f3c4946d-4d09-4376-aaef-32873ea6dc1c
📒 Files selected for processing (3)
agents/hermes/start.shscripts/nemoclaw-start.shtest/nemoclaw-start.test.ts
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@cv I checked current
This PR's current head replaces those paths with freshly prepared regular files and adds regression coverage for the symlink cases. Could you take a look and provide feedback/review as well? |
Summary
Harden startup log handling for the OpenClaw and Hermes entrypoints. These logs live under
/tmp, which is writable from inside the sandbox, so startup should not trust a path that already exists there.The entrypoints now replace pre-existing log paths with freshly prepared regular files before root, gateway, or sandbox-owned processes write diagnostics. That prevents a sandbox user from planting a symlink that redirects log truncation, writes,
chmod, orchownonto another trusted in-container path.Changes
/tmp/nemoclaw-start.logwith a fresh restricted file before early diagnostics are captured./tmp/gateway.logand/tmp/auto-pair.log.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)Passed locally:
shellcheck scripts/nemoclaw-start.sh agents/hermes/start.shnpx vitest run test/nemoclaw-start.test.ts -t "replaces a pre-existing startup log symlink"npx vitest run test/nemoclaw-start.test.ts -t "replaces symlinked service logs"npx vitest run test/sandbox-init.test.ts -t "hermes restricted log setup replaces symlinks to directories"npx vitest run test/nemoclaw-start.test.ts test/sandbox-init.test.tsnpm run source-shape:checkgit diff --check origin/security-start-log-no-followA normal push still did not pass in this checkout. The pre-push hook failed before this patch's tests on existing TypeScript/dist drift and then the full CLI test hook reported unrelated failures in
test/generate-openclaw-config.test.tsnumeric env-var handling plussrc/lib/sandbox-base-image.test.tsgit fixture cases. The affected log-hardening tests above pass, and the branch was pushed with--no-verifyafter recording those failures.Signed-off-by: Revant Patel revant.h.patel@gmail.com
Summary by CodeRabbit
Bug Fixes / Security
Tests