Skip to content

fix: fall back to non-root mode when gosu is blocked by no-new-privileges#846

Merged
ericksoa merged 5 commits into
mainfrom
fix/gosu-no-new-privileges
Mar 25, 2026
Merged

fix: fall back to non-root mode when gosu is blocked by no-new-privileges#846
ericksoa merged 5 commits into
mainfrom
fix/gosu-no-new-privileges

Conversation

@ericksoa

@ericksoa ericksoa commented Mar 25, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Detect non-root at entrypoint startup and skip gosu privilege separation
  • Fixes "failed switching to 'sandbox': operation not permitted" when OpenShell runs containers with --security-opt=no-new-privileges
  • Gateway isolation (separate gateway user) still works when running as root (standalone Docker, Brev direct)

Regression from #721.

Test plan

  • nemoclaw onboard succeeds on Brev one-click deployment
  • nemoclaw onboard succeeds on local Docker (root entrypoint path)
  • Gateway isolation e2e tests pass (root path)

Summary by CodeRabbit

  • Bug Fixes
    • Startup now reliably handles both root and non-root environments to avoid permission issues.
    • Gateway can start and run in the background without elevated privileges; startup waits for the gateway and exits with its status.
    • Configuration integrity checks are tolerant in non-root mode, and dashboard URLs are consistently printed on startup.

…eges

OpenShell runs sandbox containers with --security-opt=no-new-privileges,
which blocks gosu's setuid syscall. Detect non-root at startup and skip
privilege separation, running gateway and agent as the current user.

Full gateway isolation (separate gateway user) still works when the
container runs as root (standalone Docker, Brev direct).
@coderabbitai

coderabbitai Bot commented Mar 25, 2026

Copy link
Copy Markdown
Contributor

Caution

Review failed

Pull request was closed or merged during review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • ✅ Review completed - (🔄 Check again to review again)
📝 Walkthrough

Walkthrough

Adds an early non-root branch to scripts/nemoclaw-start.sh: when UID != 0 it does best-effort verify_config_integrity || true, runs write_auth_profile, then either execs the provided command or starts "$OPENCLAW" gateway run in background, invokes start_auto_pair without gosu, prints URLs, waits, and exits. Root path unchanged.

Changes

Cohort / File(s) Summary
Startup script (root vs non-root)
scripts/nemoclaw-start.sh
Add UID-based early branching: non-root path logs mode, runs `verify_config_integrity

Sequence Diagram(s)

sequenceDiagram
    participant Entrypoint as Entrypoint Script
    participant Verifier as verify_config_integrity
    participant Auth as write_auth_profile
    participant Gateway as "$OPENCLAW" gateway
    participant Watcher as start_auto_pair
    participant Gosu as gosu sandbox

    Entrypoint->>Entrypoint: check UID
    alt UID != 0 (non-root)
        Entrypoint->>Verifier: verify_config_integrity || true
        Entrypoint->>Auth: write_auth_profile
        Entrypoint->>Gateway: start gateway (background)
        Entrypoint->>Watcher: run python3 watcher (direct)
        Entrypoint->>Entrypoint: print URLs, wait for gateway PID, exit
    else UID == 0 (root)
        Entrypoint->>Verifier: verify_config_integrity (full)
        Entrypoint->>Auth: write_auth_profile
        Entrypoint->>Gosu: prefix watcher with gosu sandbox
        Gosu->>Watcher: run python3 watcher (via gosu)
        Entrypoint->>Gateway: start gateway (with gosu as needed)
        Entrypoint->>Entrypoint: print URLs, wait for gateway PID, exit
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped into the start with a curious grin,
Split the path for root and user spin,
I check gently when I cannot reign,
Launch gateway, run watcher — no gosu chain,
A nimble hop, the sandbox hums again.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: detecting non-root execution and falling back to skip gosu privilege separation when blocked by no-new-privileges, which directly addresses the core issue in the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/gosu-no-new-privileges

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
scripts/nemoclaw-start.sh (1)

187-190: Gateway output not redirected to log file unlike root path.

In the root path (line 237), gateway output is redirected to /tmp/gateway.log. Here it goes to stdout, which may clutter the terminal or be lost. Consider redirecting for consistency and debuggability:

♻️ Proposed consistency fix
   # Start gateway in background, auto-pair, then wait
-  "$OPENCLAW" gateway run &
+  "$OPENCLAW" gateway run >>/tmp/gateway.log 2>&1 &

Note: You may need to ensure /tmp/gateway.log is writable by the current user (likely fine since /tmp is world-writable).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/nemoclaw-start.sh` around lines 187 - 190, The gateway process is
launched with "$OPENCLAW" gateway run & and its PID stored in GATEWAY_PID but
its stdout/stderr are not redirected (unlike the later invocation that writes to
/tmp/gateway.log); update the launch of the gateway in the background to
redirect stdout and stderr to /tmp/gateway.log (or the same log path used
elsewhere) so logs are captured consistently and then set GATEWAY_PID=$! and
echo as before; ensure the chosen log file is writable by the current user.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/nemoclaw-start.sh`:
- Line 180: The line currently silences verify_config_integrity failures with
"|| true", which hides tampered configs; change it to check the command's exit
status and emit a clear, visible warning when it fails (for example, use
verify_config_integrity and if it returns non-zero, print a loudly formatted
warning to stderr or syslog) while preserving the strict root-mode behavior
elsewhere; reference the verify_config_integrity call and mirror the strict
root-path enforcement branch by logging a prominent warning in non-root mode
instead of ignoring the failure.
- Line 191: The start_auto_pair function invokes "gosu sandbox python3 -" which
fails under no-new-privileges/non-root execution; modify start_auto_pair to
detect whether gosu is usable (e.g., test for root or check if /sbin/gosu exists
and can drop to sandbox) and when it isn’t, run the auto-pair subprocess without
gosu (invoke "python3 -" directly or use a non-root variant) so pairing still
works; update the function that currently calls gosu sandbox python3 - to
conditionally choose the gosu path or the direct python3 path and ensure any
environment/user setup done by gosu is replicated for the non-root fallback.
- Line 181: In the non-root fallback branch where write_auth_profile is invoked
(check the id -u != 0 path), explicitly set HOME before calling
write_auth_profile so os.path.expanduser resolves to /sandbox; add export
HOME=/sandbox immediately before the write_auth_profile call (and ensure any
subsequent commands in that branch run with that HOME set). This mirrors the
root-mode behavior (gosu sandbox) and prevents credentials from being written to
an unexpected user home.

---

Nitpick comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 187-190: The gateway process is launched with "$OPENCLAW" gateway
run & and its PID stored in GATEWAY_PID but its stdout/stderr are not redirected
(unlike the later invocation that writes to /tmp/gateway.log); update the launch
of the gateway in the background to redirect stdout and stderr to
/tmp/gateway.log (or the same log path used elsewhere) so logs are captured
consistently and then set GATEWAY_PID=$! and echo as before; ensure the chosen
log file is writable by the current user.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9ae433bd-4d30-45b3-8629-4eb8c2e9f8c5

📥 Commits

Reviewing files that changed from the base of the PR and between 93a475a and c44d79d.

📒 Files selected for processing (1)
  • scripts/nemoclaw-start.sh

Comment thread scripts/nemoclaw-start.sh Outdated
Comment thread scripts/nemoclaw-start.sh
Comment thread scripts/nemoclaw-start.sh

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
scripts/nemoclaw-start.sh (1)

183-197: ⚠️ Potential issue | 🟠 Major

Pin HOME before starting the non-root flow.

Line 60 writes the auth profile under ~/.openclaw/..., while Line 81 and Lines 39-47 still read /sandbox/.openclaw explicitly. In this branch, gateway run and start_auto_pair inherit the same current HOME, so a non-root launch with HOME unset or pointing elsewhere can read and write a different state tree than the one you verified and printed URLs from. The root path stays aligned because every sandbox-side subprocess runs under gosu sandbox.

🛠️ Proposed fix
 if [ "$(id -u)" -ne 0 ]; then
   echo "[gateway] Running as non-root (uid=$(id -u)) — privilege separation disabled"
+  export HOME=/sandbox
   verify_config_integrity || true
   write_auth_profile

Please verify this path with a non-root launch where HOME is unset or not /sandbox.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/nemoclaw-start.sh` around lines 183 - 197, The non-root branch can
read/write a different $HOME than the sandbox state you validated; pin HOME to
the sandbox before any actions in that branch so write_auth_profile, gateway
run, start_auto_pair and print_dashboard_urls all operate on the same directory.
Update the non-root block (the branch that checks id -u and uses NEMOCLAW_CMD,
OPENCLAW, write_auth_profile, verify_config_integrity, gateway run,
start_auto_pair, print_dashboard_urls) to export HOME=/sandbox (or the canonical
sandbox path used elsewhere) immediately after entering the branch and before
calling verify_config_integrity/write_auth_profile or launching processes so
subsequent commands inherit the correct HOME. Ensure this change is only applied
in the non-root branch and does not alter the root/gosu sandbox behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 183-197: The non-root branch can read/write a different $HOME than
the sandbox state you validated; pin HOME to the sandbox before any actions in
that branch so write_auth_profile, gateway run, start_auto_pair and
print_dashboard_urls all operate on the same directory. Update the non-root
block (the branch that checks id -u and uses NEMOCLAW_CMD, OPENCLAW,
write_auth_profile, verify_config_integrity, gateway run, start_auto_pair,
print_dashboard_urls) to export HOME=/sandbox (or the canonical sandbox path
used elsewhere) immediately after entering the branch and before calling
verify_config_integrity/write_auth_profile or launching processes so subsequent
commands inherit the correct HOME. Ensure this change is only applied in the
non-root branch and does not alter the root/gosu sandbox behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bcf1d40b-bddd-489f-a869-b67b8f6c04d8

📥 Commits

Reviewing files that changed from the base of the PR and between 2b20205 and fc7f154.

📒 Files selected for processing (1)
  • scripts/nemoclaw-start.sh

Convert run_prefix from string to bash array so it expands correctly
when empty (no-op) and when set (gosu sandbox), without triggering
SC2086 word-splitting warning.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
scripts/nemoclaw-start.sh (2)

186-186: ⚠️ Potential issue | 🟠 Major

Missing HOME export before write_auth_profile.

The write_auth_profile function uses os.path.expanduser('~/.openclaw/...'), which depends on the HOME environment variable. In non-root mode with arbitrary UIDs (e.g., Kubernetes mapped users), HOME may be unset or incorrect, causing credentials to be written to the wrong location and authentication to fail.

The root path handles this via gosu sandbox which inherits HOME from /etc/passwd. The non-root path should explicitly set it:

🔧 Proposed fix
   echo "[gateway] Running as non-root (uid=$(id -u)) — privilege separation disabled"
-  verify_config_integrity || true
+  if ! verify_config_integrity; then
+    echo "[SECURITY WARNING] Config integrity check failed — proceeding anyway (non-root mode)"
+  fi
+  export HOME=/sandbox
   write_auth_profile
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/nemoclaw-start.sh` at line 186, The non-root startup path calls
write_auth_profile but may run with HOME unset or incorrect (so
os.path.expanduser('~/.openclaw/...') writes to the wrong place); before
invoking write_auth_profile ensure HOME is explicitly exported to the target
user's home directory (same approach as the root path uses via gosu sandbox) by
determining the intended home (from /etc/passwd or an env value) and exporting
HOME in the non-root branch prior to calling write_auth_profile so credentials
are written to the correct location.

185-185: ⚠️ Potential issue | 🟠 Major

Config integrity failures are silently ignored.

The || true suppresses verification failures without any warning. While operational flexibility may be intentional, this weakens security guarantees compared to the root path (line 205). At minimum, emit a visible warning when verification fails.

🛡️ Proposed fix: warn on failure
-  verify_config_integrity || true
+  if ! verify_config_integrity; then
+    echo "[SECURITY WARNING] Config integrity check failed — proceeding anyway (non-root mode)"
+  fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/nemoclaw-start.sh` at line 185, The script currently silences
failures from verify_config_integrity by appending "|| true", which hides
verification errors; change this to surface a warning instead of swallowing the
failure: replace the "verify_config_integrity || true" usage so that when
verify_config_integrity exits non‑zero you emit a visible warning to stderr
(e.g., via echo >&2 or logger) but continue execution, or alternately remove the
"|| true" if you want to fail fast; ensure the change targets the call to
verify_config_integrity so verification failures are at least logged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@scripts/nemoclaw-start.sh`:
- Line 186: The non-root startup path calls write_auth_profile but may run with
HOME unset or incorrect (so os.path.expanduser('~/.openclaw/...') writes to the
wrong place); before invoking write_auth_profile ensure HOME is explicitly
exported to the target user's home directory (same approach as the root path
uses via gosu sandbox) by determining the intended home (from /etc/passwd or an
env value) and exporting HOME in the non-root branch prior to calling
write_auth_profile so credentials are written to the correct location.
- Line 185: The script currently silences failures from verify_config_integrity
by appending "|| true", which hides verification errors; change this to surface
a warning instead of swallowing the failure: replace the
"verify_config_integrity || true" usage so that when verify_config_integrity
exits non‑zero you emit a visible warning to stderr (e.g., via echo >&2 or
logger) but continue execution, or alternately remove the "|| true" if you want
to fail fast; ensure the change targets the call to verify_config_integrity so
verification failures are at least logged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8fa83b25-9aff-4295-b955-8f55a5a45e46

📥 Commits

Reviewing files that changed from the base of the PR and between fc7f154 and 2317f9d.

📒 Files selected for processing (1)
  • scripts/nemoclaw-start.sh

@ericksoa ericksoa self-assigned this Mar 25, 2026
…oot mode

- Export HOME=/sandbox before write_auth_profile so credentials land in
  the correct directory when running with arbitrary UIDs (e.g. Kubernetes)
- Replace silent `|| true` with visible security warning when config
  integrity check fails in non-root mode

@liveaverage liveaverage left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ericksoa ericksoa merged commit 4cfc3b3 into main Mar 25, 2026
7 of 8 checks passed
temrjan pushed a commit to temrjan/NemoClaw that referenced this pull request Mar 25, 2026
…eges (NVIDIA#846)

* fix: fall back to non-root mode when gosu is blocked by no-new-privileges

OpenShell runs sandbox containers with --security-opt=no-new-privileges,
which blocks gosu's setuid syscall. Detect non-root at startup and skip
privilege separation, running gateway and agent as the current user.

Full gateway isolation (separate gateway user) still works when the
container runs as root (standalone Docker, Brev direct).

* fix: guard all gosu calls behind root check

* chore: apply shfmt with -ci flag

* fix: use array for run_prefix to satisfy shellcheck SC2086

Convert run_prefix from string to bash array so it expands correctly
when empty (no-op) and when set (gosu sandbox), without triggering
SC2086 word-splitting warning.

* fix(security): set HOME and warn on config integrity failure in non-root mode

- Export HOME=/sandbox before write_auth_profile so credentials land in
  the correct directory when running with arbitrary UIDs (e.g. Kubernetes)
- Replace silent `|| true` with visible security warning when config
  integrity check fails in non-root mode
lakamsani pushed a commit to lakamsani/NemoClaw that referenced this pull request Apr 4, 2026
…eges (NVIDIA#846)

* fix: fall back to non-root mode when gosu is blocked by no-new-privileges

OpenShell runs sandbox containers with --security-opt=no-new-privileges,
which blocks gosu's setuid syscall. Detect non-root at startup and skip
privilege separation, running gateway and agent as the current user.

Full gateway isolation (separate gateway user) still works when the
container runs as root (standalone Docker, Brev direct).

* fix: guard all gosu calls behind root check

* chore: apply shfmt with -ci flag

* fix: use array for run_prefix to satisfy shellcheck SC2086

Convert run_prefix from string to bash array so it expands correctly
when empty (no-op) and when set (gosu sandbox), without triggering
SC2086 word-splitting warning.

* fix(security): set HOME and warn on config integrity failure in non-root mode

- Export HOME=/sandbox before write_auth_profile so credentials land in
  the correct directory when running with arbitrary UIDs (e.g. Kubernetes)
- Replace silent `|| true` with visible security warning when config
  integrity check fails in non-root mode
jacobtomlinson pushed a commit to jacobtomlinson/NemoClaw that referenced this pull request Apr 30, 2026
…eges (NVIDIA#846)

* fix: fall back to non-root mode when gosu is blocked by no-new-privileges

OpenShell runs sandbox containers with --security-opt=no-new-privileges,
which blocks gosu's setuid syscall. Detect non-root at startup and skip
privilege separation, running gateway and agent as the current user.

Full gateway isolation (separate gateway user) still works when the
container runs as root (standalone Docker, Brev direct).

* fix: guard all gosu calls behind root check

* chore: apply shfmt with -ci flag

* fix: use array for run_prefix to satisfy shellcheck SC2086

Convert run_prefix from string to bash array so it expands correctly
when empty (no-op) and when set (gosu sandbox), without triggering
SC2086 word-splitting warning.

* fix(security): set HOME and warn on config integrity failure in non-root mode

- Export HOME=/sandbox before write_auth_profile so credentials land in
  the correct directory when running with arbitrary UIDs (e.g. Kubernetes)
- Replace silent `|| true` with visible security warning when config
  integrity check fails in non-root mode
@wscurran wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants