fix(sandbox): harden startup log creation by ChunkyMonkey11 · Pull Request #3888 · NVIDIA/NemoClaw

ChunkyMonkey11 · 2026-05-20T09:29:37Z

Summary

Harden startup log handling for the OpenClaw and Hermes entrypoints. These logs live under /tmp, which is writable from inside the sandbox, so startup should not trust a path that already exists there.

The entrypoints now replace pre-existing log paths with freshly prepared regular files before root, gateway, or sandbox-owned processes write diagnostics. That prevents a sandbox user from planting a symlink that redirects log truncation, writes, chmod, or chown onto another trusted in-container path.

Changes

Prepare /tmp/nemoclaw-start.log with a fresh restricted file before early diagnostics are captured.
Reuse the same no-follow preparation for OpenClaw service logs: /tmp/gateway.log and /tmp/auto-pair.log.
Tighten the Hermes restricted-log helper so symlinks, including symlinks to directories, are unlinked before replacement.
Keep real directories protected: startup refuses to replace an actual directory at the log path.
Add regressions that plant symlinks at the affected log paths and verify the symlink targets are not modified.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Passed locally:

shellcheck scripts/nemoclaw-start.sh agents/hermes/start.sh
npx vitest run test/nemoclaw-start.test.ts -t "replaces a pre-existing startup log symlink"
npx vitest run test/nemoclaw-start.test.ts -t "replaces symlinked service logs"
npx vitest run test/sandbox-init.test.ts -t "hermes restricted log setup replaces symlinks to directories"
npx vitest run test/nemoclaw-start.test.ts test/sandbox-init.test.ts
npm run source-shape:check
git diff --check origin/security-start-log-no-follow

A normal push still did not pass in this checkout. The pre-push hook failed before this patch's tests on existing TypeScript/dist drift and then the full CLI test hook reported unrelated failures in test/generate-openclaw-config.test.ts numeric env-var handling plus src/lib/sandbox-base-image.test.ts git fixture cases. The affected log-hardening tests above pass, and the branch was pushed with --no-verify after recording those failures.

Signed-off-by: Revant Patel revant.h.patel@gmail.com

Summary by CodeRabbit

Bug Fixes / Security
- Hardened startup log provisioning to replace symlinked targets with regular files, refuse non-symlink directories, remove/replace targets atomically, and fail safely on errors.
- Enforces explicit file modes (600 for sensitive logs, 644 where appropriate) and applies ownership when running as root.
- Ensures runtime shell environment shim is backfilled idempotently and runtime shell files are validated/created.
Tests
- Added tests verifying symlink replacement, permission/ownership enforcement, idempotent shim installation, and correct behavior across root/non-root paths, including symlinks to directories.

Signed-off-by: Test User <test@example.com>

copy-pr-bot · 2026-05-20T09:29:41Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-05-20T09:29:51Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds atomic, defensive log provisioning helpers and hardens Hermes and Nemoclaw startup paths to reject unsafe targets, replace symlinked logs with regular files with explicit modes/ownership, and backfills an idempotent runtime shell env shim; tests verify symlink replacement, permissions, and shim idempotency.

Changes

Startup log file hardening

Layer / File(s)	Summary
Secure log initialization helper `scripts/nemoclaw-start.sh`, `agents/hermes/start.sh`	Adds `prepare_start_log()` and hardens `prepare_restricted_log()` to create a temp file adjacent to the target, truncate it, optionally `chown`, set explicit `chmod`, refuse to overwrite a pre-existing non-symlink directory, unlink existing destination, and replace the target via `mv -f`. Replaces prior `touch`/`chown`/`chmod` sequences for `/tmp/nemoclaw-start.log`, `/tmp/gateway.log`, and `/tmp/auto-pair.log` in non-root and root branches.
Test coverage for symlink replacement `test/nemoclaw-start.test.ts`, `test/sandbox-init.test.ts`	Adds helpers that extract the early-start and restricted-log helpers, pre-seed `_START_LOG`, `gateway.log`, and `auto-pair.log` as symlinks (including symlink-to-directory), inject the helpers into wrapper scripts, run startup snippets, and assert symlinks are replaced by regular files with expected `0600`/`0644` modes while capturing trusted target contents.

Runtime shell env shim and sandbox-home refactor

Layer / File(s)	Summary
Sandbox-home resolution and shim install `agents/hermes/start.sh`	Move sandbox-home detection into `resolve_sandbox_home()`, add `_RUNTIME_SHELL_ENV_SHIM`, introduce `ensure_runtime_shell_env_shim()` which validates/creates non-symlink `.bashrc`/`.profile` and backfills a sourced shim via an atomic write helper; delay runtime/proxy initialization until after privilege branching and call `write_runtime_shell_env`, `ensure_runtime_shell_env_shim`, and `lock_rc_files` in both startup paths.
Tests for shim backfill idempotency `test/hermes-start.test.ts`	Extract `ensure_runtime_shell_env_shim()` and bootstrapping helpers into tests, update `emit_sandbox_sourced_file` to atomic-write via `.tmp` + `mv`, and add a test asserting the shim line is added exactly once to both `.bashrc` and `.profile` and that both files end up with mode `444`.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

bug, NV QA, VDR

Suggested reviewers

ericksoa

Poem

🐰 I hopped and made a temp file neat,
Swapped sly links for files so discreet,
Wrote shims with gentle, atomic care,
Locked permissions so nothing's bare,
Logs and rc files now sleep sound and sweet.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 11.76% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix(sandbox): harden startup log creation' accurately and concisely summarizes the main objective of the pull request: hardening the creation of startup log files to prevent symlink-based attacks.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (2)

scripts/nemoclaw-start.sh (1)

48-87: Please run the entrypoint E2Es on this branch before merge.

This hardening sits on the sandbox boot path, and the unit test here only exercises an extracted snippet rather than the full Landlock/non-root/process-lifecycle path. I’d run sandbox-survival-e2e, sandbox-operations-e2e, and cloud-e2e for this change.

As per coding guidelines, scripts/nemoclaw-start.sh: This file is a sandbox entrypoint script. Changes affect every sandbox boot and are invisible to unit tests (Landlock, non-root execution, process lifecycle). and E2E test recommendation: sandbox-survival-e2e, sandbox-operations-e2e, cloud-e2e.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/nemoclaw-start.sh` around lines 48 - 87, The change in
scripts/nemoclaw-start.sh affects sandbox boot behavior (see prepare_start_log
function and its invocation at the bottom) and must be validated with full
entrypoint E2E runs before merging; please run the recommended end-to-end suites
sandbox-survival-e2e, sandbox-operations-e2e, and cloud-e2e against this branch
(exercising Landlock/non-root/process-lifecycle paths) and post the passing
results or any fixes, referencing prepare_start_log and the two calls that
invoke it for root and non-root paths.

test/nemoclaw-start.test.ts (1)

120-151: ⚡ Quick win

Add a regression for the real-directory refusal path.

The new helper also refuses to replace _START_LOG when it is an actual directory, but this suite only locks in the symlink-replacement branch. A small companion case that pre-creates logPath as a directory and asserts a non-zero exit plus an unchanged directory would cover the other security contract introduced here.

🧪 Suggested companion test

+  it("refuses to replace a real directory at the startup log path", () => {
+    const src = fs.readFileSync(START_SCRIPT, "utf-8");
+    const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-start-log-dir-"));
+    const logPath = path.join(tmpDir, "nemoclaw-start.log");
+    fs.mkdirSync(logPath);
+
+    try {
+      const block = earlyStartLogBlock(src).replace(
+        '_START_LOG="/tmp/nemoclaw-start.log"',
+        `_START_LOG=${JSON.stringify(logPath)}`,
+      );
+      const result = spawnSync("bash", ["-c", ["set -euo pipefail", block].join("\n")], {
+        encoding: "utf-8",
+        timeout: 5000,
+      });
+
+      expect(result.status).not.toBe(0);
+      expect(fs.statSync(logPath).isDirectory()).toBe(true);
+    } finally {
+      fs.rmSync(tmpDir, { recursive: true, force: true });
+    }
+  });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/nemoclaw-start.test.ts` around lines 120 - 151, Add a companion test to
cover the real-directory refusal path: create a new it block alongside "replaces
a pre-existing startup log symlink..." that pre-creates logPath as a directory
(instead of a symlink), writes trustedPath, builds the script using
earlyStartLogBlock with _START_LOG set to JSON.stringify(logPath), spawns bash
as in the existing test, and assert the process exits non-zero and that the
pre-existing directory at logPath still exists and is unchanged (e.g.,
isDirectory and trustedPath contents unchanged). Ensure you reference
earlyStartLogBlock, logPath and trustedPath when locating the relevant setup and
assertions.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 48-87: The change in scripts/nemoclaw-start.sh affects sandbox
boot behavior (see prepare_start_log function and its invocation at the bottom)
and must be validated with full entrypoint E2E runs before merging; please run
the recommended end-to-end suites sandbox-survival-e2e, sandbox-operations-e2e,
and cloud-e2e against this branch (exercising
Landlock/non-root/process-lifecycle paths) and post the passing results or any
fixes, referencing prepare_start_log and the two calls that invoke it for root
and non-root paths.

In `@test/nemoclaw-start.test.ts`:
- Around line 120-151: Add a companion test to cover the real-directory refusal
path: create a new it block alongside "replaces a pre-existing startup log
symlink..." that pre-creates logPath as a directory (instead of a symlink),
writes trustedPath, builds the script using earlyStartLogBlock with _START_LOG
set to JSON.stringify(logPath), spawns bash as in the existing test, and assert
the process exits non-zero and that the pre-existing directory at logPath still
exists and is unchanged (e.g., isDirectory and trustedPath contents unchanged).
Ensure you reference earlyStartLogBlock, logPath and trustedPath when locating
the relevant setup and assertions.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a3b5c423-f61d-4003-8f70-846d54f14892

📥 Commits

Reviewing files that changed from the base of the PR and between 11b1937 and 86787ac.

📒 Files selected for processing (2)

scripts/nemoclaw-start.sh
test/nemoclaw-start.test.ts

Signed-off-by: Test User <test@example.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/nemoclaw-start.test.ts`:
- Around line 1909-1915: The test currently writes to gatewayLog and autoPairLog
before asserting symlink behavior which can mask failures; after the spawnSync()
that produces result, immediately capture and assert the pre-write state of
gatewayLog and autoPairLog (existence and whether they are symlinks and their
mode) using fs.existsSync and fs.lstatSync or fs.statSync on gatewayLog and
autoPairLog (and verify they point to gatewayTarget/autoPairTarget) before
performing the fs.writeFileSync() calls, then perform the writes only to verify
the trusted targets (gatewayTarget, autoPairTarget) remain unchanged and finally
re-check gatewayLogIsSymlink/autoPairLogIsSymlink to ensure the harness didn’t
inadvertently replace symlinks.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3a12a31f-c307-4985-9d16-e6fbfec30398

📥 Commits

Reviewing files that changed from the base of the PR and between daf1222 and 1c37ef4.

📒 Files selected for processing (4)

agents/hermes/start.sh
scripts/nemoclaw-start.sh
test/nemoclaw-start.test.ts
test/sandbox-init.test.ts

wscurran · 2026-05-20T15:57:46Z

✨ Thanks for submitting this detailed PR about hardening startup log handling for the OpenClaw and Hermes entrypoints. This PR proposes a way to prevent a sandbox user from planting a symlink that redirects log truncation, writes, chmod, or chown onto another trusted in-container path.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

agents/hermes/start.sh (1)

230-287: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Backfill the rc shim before locking Hermes rc files.

This now writes /tmp/nemoclaw-proxy-env.sh and immediately locks $_SANDBOX_HOME, but it never does the OpenClaw-style shim backfill that makes /sandbox/.bashrc / .profile source that file. On stale Hermes images, connect shells will miss the proxy/guard exports entirely. _SANDBOX_HOME is also resolved from the pre-normalized HOME, even though the non-root path later forces HOME=/sandbox, so the wrong rc directory can get locked in that posture.

♻️ Proposed fix

 if [ "$(id -u)" -eq 0 ]; then
   _SANDBOX_HOME=$(getent passwd sandbox 2>/dev/null | cut -d: -f6)
   _SANDBOX_HOME="${_SANDBOX_HOME:-/sandbox}"
 else
-  _SANDBOX_HOME="${HOME:-/sandbox}"
+  _SANDBOX_HOME="/sandbox"
 fi
@@
 _PROXY_ENV_FILE="/tmp/nemoclaw-proxy-env.sh"
+_RUNTIME_SHELL_ENV_SHIM="[ -f ${_PROXY_ENV_FILE} ] && . ${_PROXY_ENV_FILE}"
+
+ensure_runtime_shell_env_shim() {
+  local rc_file
+  for rc_file in "${_SANDBOX_HOME}/.bashrc" "${_SANDBOX_HOME}/.profile"; do
+    [ -L "$rc_file" ] && continue
+    if [ -f "$rc_file" ] && grep -qxF "$_RUNTIME_SHELL_ENV_SHIM" "$rc_file" 2>/dev/null; then
+      continue
+    fi
+    if [ -e "$rc_file" ]; then
+      printf '\n%s\n%s\n' '# Source runtime proxy config' "$_RUNTIME_SHELL_ENV_SHIM" >>"$rc_file"
+    else
+      printf '%s\n%s\n' '# Source runtime proxy config' "$_RUNTIME_SHELL_ENV_SHIM" >"$rc_file"
+    fi
+  done
+}
+
 write_runtime_shell_env
+ensure_runtime_shell_env_shim
 lock_rc_files "$_SANDBOX_HOME"

As per coding guidelines, agents/hermes/** recommends rebuild-hermes-stale-base-e2e for stale cached base image paths.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agents/hermes/start.sh` around lines 230 - 287, Resolve _SANDBOX_HOME after
the non-root HOME normalization and backfill the sandbox rc shim before locking
files: move the _SANDBOX_HOME calculation so it uses the final HOME value (so
non-root paths get /sandbox), then after write_runtime_shell_env() add a step
that injects a small shim into the sandbox rc files (e.g., create or ensure a
line sourcing "$_PROXY_ENV_FILE" in the target .bashrc/.profile using the same
emit_sandbox_sourced_file mechanism) and only then call lock_rc_files
"$_SANDBOX_HOME"; reference write_runtime_shell_env, _PROXY_ENV_FILE,
emit_sandbox_sourced_file and lock_rc_files to locate where to add/move these
changes.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@agents/hermes/start.sh`:
- Around line 230-287: Resolve _SANDBOX_HOME after the non-root HOME
normalization and backfill the sandbox rc shim before locking files: move the
_SANDBOX_HOME calculation so it uses the final HOME value (so non-root paths get
/sandbox), then after write_runtime_shell_env() add a step that injects a small
shim into the sandbox rc files (e.g., create or ensure a line sourcing
"$_PROXY_ENV_FILE" in the target .bashrc/.profile using the same
emit_sandbox_sourced_file mechanism) and only then call lock_rc_files
"$_SANDBOX_HOME"; reference write_runtime_shell_env, _PROXY_ENV_FILE,
emit_sandbox_sourced_file and lock_rc_files to locate where to add/move these
changes.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f3c4946d-4d09-4376-aaef-32873ea6dc1c

📥 Commits

Reviewing files that changed from the base of the PR and between 1c37ef4 and 1556f01.

📒 Files selected for processing (3)

agents/hermes/start.sh
scripts/nemoclaw-start.sh
test/nemoclaw-start.test.ts

ChunkyMonkey11 · 2026-05-22T05:18:15Z

@coderabbitai review

coderabbitai · 2026-05-22T05:18:22Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

ChunkyMonkey11 · 2026-05-27T22:54:35Z

@cv I checked current main, and this still looks like a live issue.

/tmp is sandbox-writable, and scripts/nemoclaw-start.sh still initializes /tmp/nemoclaw-start.log, /tmp/gateway.log, and /tmp/auto-pair.log with direct truncate/touch/chown/chmod flows that can act through a pre-planted symlink. agents/hermes/start.sh also still lacks the explicit no-follow replacement guard for the restricted log helper, especially for symlinks that resolve to directories.

This PR's current head replaces those paths with freshly prepared regular files and adds regression coverage for the symlink cases. Could you take a look and provide feedback/review as well?

fix(sandbox): harden startup log creation

86787ac

Signed-off-by: Test User <test@example.com>

coderabbitai Bot reviewed May 20, 2026

View reviewed changes

ChunkyMonkey11 and others added 2 commits May 20, 2026 02:40

Merge branch 'main' into security-start-log-no-follow

daf1222

fix(sandbox): harden remaining entrypoint logs

1c37ef4

Signed-off-by: Test User <test@example.com>

coderabbitai Bot reviewed May 20, 2026

View reviewed changes

Comment thread test/nemoclaw-start.test.ts

wscurran added fix security Potential vulnerability, unsafe behavior, or access risk labels May 20, 2026

Merge branch 'main' into security-start-log-no-follow

841c4ae

This was referenced May 21, 2026

fix: remove proxy hooks from sandbox rc files #3842

Merged

fix(openclaw): route discord through loopback proxy #3961

Merged

fix(hermes): restore Spark GPU recreate startup #3963

Merged

test(start): assert log state before harness writes

1556f01

coderabbitai Bot reviewed May 21, 2026

View reviewed changes

github-actions Bot mentioned this pull request May 21, 2026

fix(openclaw): remove Discord loopback helper #4005

Merged

ChunkyMonkey11 and others added 3 commits May 21, 2026 16:35

Merge branch 'main' into security-start-log-no-follow

842f163

fix(hermes): backfill runtime rc shim before locking

425f58c

Merge branch 'main' into security-start-log-no-follow

700b96e

ChunkyMonkey11 added 5 commits May 22, 2026 00:14

Merge branch 'main' into security-start-log-no-follow

df89773

Merge branch 'main' into security-start-log-no-follow

c809803

Merge branch 'main' into security-start-log-no-follow

4d182e3

Merge branch 'main' into security-start-log-no-follow

466f6ab

Merge branch 'main' into security-start-log-no-follow

71ec271

ChunkyMonkey11 added 2 commits May 27, 2026 15:55

Merge branch 'main' into security-start-log-no-follow

53837ca

Merge branch 'main' into security-start-log-no-follow

8717f10

wscurran added area: sandbox OpenShell sandbox lifecycle, runtime, config, or recovery bug-fix PR fixes a bug or regression and removed priority: high labels Jun 3, 2026

Conversation

ChunkyMonkey11 commented May 20, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Type of Change

Verification

Summary by CodeRabbit

Uh oh!

copy-pr-bot Bot commented May 20, 2026

Uh oh!

coderabbitai Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

wscurran commented May 20, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

ChunkyMonkey11 commented May 22, 2026

Uh oh!

coderabbitai Bot commented May 22, 2026

Uh oh!

ChunkyMonkey11 commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ChunkyMonkey11 commented May 20, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 20, 2026 •

edited

Loading