Skip to content

fix(security): enforce owner-only permissions on ~/.nemoclaw directory and config files#4054

Merged
cv merged 2 commits into
NVIDIA:mainfrom
kagura-agent:fix/nemoclaw-dir-permissions
May 25, 2026
Merged

fix(security): enforce owner-only permissions on ~/.nemoclaw directory and config files#4054
cv merged 2 commits into
NVIDIA:mainfrom
kagura-agent:fix/nemoclaw-dir-permissions

Conversation

@kagura-agent

@kagura-agent kagura-agent commented May 22, 2026

Copy link
Copy Markdown
Contributor

Summary

Three code paths created ~/.nemoclaw/ or files within it without specifying restrictive permissions, falling back to umask-derived defaults (typically 1755 for directories, 644 for files). This leaves credentials and config world-readable.

Closes #4009

Changes

File Fix
config-sync.ts Sandbox sync script: mkdir -p -m 700, chmod 700 dir, chmod 600 config.json
local-adapter-lifecycle.ts ensureLocalAdapterStateDir(): add mode: 0o700 + retroactive chmod for existing lax directories
onboard.ts startModelRouter(): add mode: 0o700 to state dir creation

The retroactive chmod in ensureLocalAdapterStateDir() follows the same pattern already used by config-io.ts ensureConfigDir() (line 153).

Testing

  • Added 2 new vitest tests in local-adapter-lifecycle.test.ts: verifies new directory gets 0o700 and existing 0o755 directory gets tightened to 0o700
  • Added 3 new assertions in config-sync.test.ts: verifies shell script contains mkdir -p -m 700, chmod 700, and chmod 600
  • All 8 tests pass locally (npx vitest run)

Signed-off-by: kagura-agent kagura-agent@users.noreply.github.com

Summary by CodeRabbit

  • Bug Fixes

    • Improved security posture by enforcing owner-only file permissions on local adapter state directories and sandbox configuration files. Automatic verification ensures restrictive permissions persist across operations, preventing unauthorized access to sensitive configuration data.
  • Tests

    • Added comprehensive tests verifying that configuration directories and files are properly created with and maintain restrictive owner-only file permissions throughout their lifecycle.

Review Change Stack

@copy-pr-bot

copy-pr-bot Bot commented May 22, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented May 22, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR hardens the security of NemoClaw configuration and adapter state directories by enforcing owner-only permissions (0o700 for directories, 0o600 for config files) across two separate initialization paths: adapter lifecycle and sandbox config sync. Implementations use best-effort tightening for existing directories and explicit permission modes for new creations, with comprehensive test coverage for both paths.

Changes

Permission Security Hardening

Layer / File(s) Summary
Local adapter state directory with permission enforcement
src/lib/inference/local-adapter-lifecycle.ts, src/lib/inference/local-adapter-lifecycle.test.ts
ensureLocalAdapterStateDir creates adapter state directories with explicit mode 0o700, then performs a best-effort post-check by stat'ing and calling chmodSync(..., 0o700) if group/other write bits are detected; errors are caught and ignored. Tests verify directory creation and tightening from 0o755 to 0o700 on non-Windows platforms.
Config sync shell script with restrictive permissions
src/lib/onboard/config-sync.ts, src/lib/onboard/config-sync.test.ts
Generated sandbox config sync script now creates ~/.nemoclaw with mkdir -p -m 700, reapplies chmod 700 ~/.nemoclaw, writes the config file, and enforces chmod 600 ~/.nemoclaw/config.json. Tests assert these four shell operations appear in the generated script output.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

Sandbox

Suggested reviewers

  • ericksoa
  • jyaunches

Poem

🐰 A rabbit's cheer for security tight,
Dir perms now 700, config 600 right,
No world reads our secrets with care,
Owner-only access, safe and fair!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: enforcing owner-only permissions on ~/.nemoclaw directory and config files, which is the core objective.
Linked Issues check ✅ Passed The PR fully addresses all coding objectives from #4009: enforces 700 permissions on ~/.nemoclaw directory, 600 on config files, applies retroactive chmod to tighten lax directories, and includes comprehensive tests.
Out of Scope Changes check ✅ Passed All changes are directly scoped to addressing the security issue: modifying local-adapter-lifecycle.ts, config-sync.ts, onboard.ts, and adding corresponding test assertions.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Line 997: The current fs.mkdirSync(stateDir, { recursive: true, mode: 0o700 })
call only sets permissions for newly created directories; if stateDir already
exists its mode won't be tightened. After the mkdirSync call, check the existing
directory (stateDir) and explicitly set its mode to 0o700 (e.g., via
fs.chmodSync or fs.promises.chmod) so permissions are enforced for both
newly-created and pre-existing ~/.nemoclaw/state; ensure errors are handled or
propagated consistent with the surrounding function.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 74808ab8-feb9-461a-9231-edf673131fff

📥 Commits

Reviewing files that changed from the base of the PR and between c84b6f1 and b76a1ef.

📒 Files selected for processing (5)
  • src/lib/inference/local-adapter-lifecycle.test.ts
  • src/lib/inference/local-adapter-lifecycle.ts
  • src/lib/onboard.ts
  • src/lib/onboard/config-sync.test.ts
  • src/lib/onboard/config-sync.ts

Comment thread src/lib/onboard.ts Outdated
@kagura-agent kagura-agent force-pushed the fix/nemoclaw-dir-permissions branch from b76a1ef to cfb5302 Compare May 22, 2026 05:28

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/lib/inference/local-adapter-lifecycle.test.ts (1)

134-153: Consider running the recommended E2E tests for onboard.ts changes.

Since this PR modifies onboard.ts (which contains core onboarding logic), the coding guidelines recommend running a suite of E2E tests to verify the full sandbox creation and configuration flow with the permission changes. As per coding guidelines, the recommended tests include: cloud-e2e, sandbox-operations-e2e, rebuild-openclaw-e2e, and others listed in the guidelines.

To run selectively:

gh workflow run nightly-e2e.yaml --ref fix/nemoclaw-dir-permissions \
  -f jobs=cloud-e2e,sandbox-operations-e2e,rebuild-openclaw-e2e,channels-stop-start-e2e,messaging-compatible-endpoint-e2e,bedrock-runtime-compatible-anthropic-e2e,hermes-discord-e2e,hermes-slack-e2e,openshell-gateway-upgrade-e2e

This would provide additional confidence that the permission enforcement doesn't break existing onboarding flows or sandbox operations.

As per coding guidelines, changes to src/lib/onboard.ts should be validated with E2E tests including cloud-e2e, sandbox-operations-e2e, rebuild-openclaw-e2e, channels-stop-start-e2e, messaging-compatible-endpoint-e2e, bedrock-runtime-compatible-anthropic-e2e, hermes-discord-e2e, hermes-slack-e2e, and openshell-gateway-upgrade-e2e.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/inference/local-adapter-lifecycle.test.ts` around lines 134 - 153,
Run the recommended E2E suites to validate the onboarding changes in
src/lib/onboard.ts: trigger the nightly-e2e.yaml workflow against your branch
(e.g., ref fix/nemoclaw-dir-permissions) with the jobs cloud-e2e,
sandbox-operations-e2e, rebuild-openclaw-e2e, channels-stop-start-e2e,
messaging-compatible-endpoint-e2e, bedrock-runtime-compatible-anthropic-e2e,
hermes-discord-e2e, hermes-slack-e2e, and openshell-gateway-upgrade-e2e (for
example via gh workflow run nightly-e2e.yaml -f jobs=... --ref <branch>),
inspect failures related to directory permission enforcement introduced around
onboard.ts, and fix any failing flows (sandbox creation, configuration, or
permission tightening) until all listed E2E jobs pass.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 997-1006: The added permission-tightening block increases file
size; extract it into a small helper like tightenDirPermissions(stateDir:
string) (or ensureSecureDir) in a new module under src/lib/onboard/ and replace
the inline block in the current file with a single call to that helper; the
helper should perform the mkdirSync(stateDir, { recursive: true, mode: 0o700 })
and the try { const stat = fs.statSync(stateDir); if ((stat.mode & 0o077) !== 0)
fs.chmodSync(stateDir, 0o700); } catch {} logic so the original onboard
entrypoint no longer grows in lines while preserving behavior (call sites: where
mkdirSync/statSync/chmodSync were used around stateDir).

---

Nitpick comments:
In `@src/lib/inference/local-adapter-lifecycle.test.ts`:
- Around line 134-153: Run the recommended E2E suites to validate the onboarding
changes in src/lib/onboard.ts: trigger the nightly-e2e.yaml workflow against
your branch (e.g., ref fix/nemoclaw-dir-permissions) with the jobs cloud-e2e,
sandbox-operations-e2e, rebuild-openclaw-e2e, channels-stop-start-e2e,
messaging-compatible-endpoint-e2e, bedrock-runtime-compatible-anthropic-e2e,
hermes-discord-e2e, hermes-slack-e2e, and openshell-gateway-upgrade-e2e (for
example via gh workflow run nightly-e2e.yaml -f jobs=... --ref <branch>),
inspect failures related to directory permission enforcement introduced around
onboard.ts, and fix any failing flows (sandbox creation, configuration, or
permission tightening) until all listed E2E jobs pass.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6eb90d3e-2d65-4ebc-95ce-1299881c21ea

📥 Commits

Reviewing files that changed from the base of the PR and between b76a1ef and cfb5302.

📒 Files selected for processing (5)
  • src/lib/inference/local-adapter-lifecycle.test.ts
  • src/lib/inference/local-adapter-lifecycle.ts
  • src/lib/onboard.ts
  • src/lib/onboard/config-sync.test.ts
  • src/lib/onboard/config-sync.ts
✅ Files skipped from review due to trivial changes (1)
  • src/lib/onboard/config-sync.test.ts

Comment thread src/lib/onboard.ts Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/lib/onboard.ts (1)

36-36: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Onboard entrypoint budget is still blocking merge.

CI is failing on onboard-entrypoint-budget due to net +1 line in src/lib/onboard.ts. Please make this change budget-neutral (same behavior, -1 net line) so the PR can merge.

✂️ Minimal budget-neutral alternative
-const { ensureLocalAdapterStateDir }: typeof import("./inference/local-adapter-lifecycle") = require("./inference/local-adapter-lifecycle");
...
-  ensureLocalAdapterStateDir(stateDir);
+  require("./inference/local-adapter-lifecycle").ensureLocalAdapterStateDir(stateDir);

Also applies to: 998-998

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` at line 36, The extra type-annotation form of the require
added a line; replace the typed require with a plain destructuring require to
keep identical runtime behavior but remove one line: change the current line
that declares ensureLocalAdapterStateDir (the const { ensureLocalAdapterStateDir
}: typeof import("./inference/local-adapter-lifecycle") = require(...)) to a
single-line plain require (const { ensureLocalAdapterStateDir } =
require("./inference/local-adapter-lifecycle");) so behavior is unchanged and
the file is one line shorter.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/lib/onboard.ts`:
- Line 36: The extra type-annotation form of the require added a line; replace
the typed require with a plain destructuring require to keep identical runtime
behavior but remove one line: change the current line that declares
ensureLocalAdapterStateDir (the const { ensureLocalAdapterStateDir }: typeof
import("./inference/local-adapter-lifecycle") = require(...)) to a single-line
plain require (const { ensureLocalAdapterStateDir } =
require("./inference/local-adapter-lifecycle");) so behavior is unchanged and
the file is one line shorter.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 02d07edc-4523-46c5-9963-4cb13f1b710f

📥 Commits

Reviewing files that changed from the base of the PR and between cfb5302 and 1a42d73.

📒 Files selected for processing (1)
  • src/lib/onboard.ts

@wscurran wscurran added fix security Potential vulnerability, unsafe behavior, or access risk labels May 22, 2026
@wscurran

Copy link
Copy Markdown
Contributor

✨ Thanks for submitting this detailed PR about enforcing owner-only permissions on ~/.nemoclaw directory and config files. This proposes a way to improve security by avoiding world-readable credentials and config, and adds test cases to verify the new permissions.


Related open issues:

…y and config files

Three code paths created ~/.nemoclaw/ or files within it without
specifying restrictive permissions, falling back to umask-derived
defaults (typically 755 for directories, 644 for files):

1. config-sync.ts — sandbox sync script used bare mkdir -p and cat
   without chmod
2. local-adapter-lifecycle.ts — ensureLocalAdapterStateDir() called
   mkdirSync without mode
3. onboard.ts — startModelRouter() created state dir without mode

Fix: add mode 0o700 to all directory-creation calls and chmod 600 to
config.json in the sandbox sync script.  Also tighten existing
directories retroactively in ensureLocalAdapterStateDir() (matching
the pattern already used by config-io.ts ensureConfigDir()).

Closes NVIDIA#4009

Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>
@kagura-agent kagura-agent force-pushed the fix/nemoclaw-dir-permissions branch from df6c395 to 6719542 Compare May 23, 2026 12:14
@cv cv added the v0.0.51 Release target label May 25, 2026
@cv cv merged commit f1044ad into NVIDIA:main May 25, 2026
17 checks passed
cv pushed a commit that referenced this pull request May 26, 2026
## Summary

- keep `~/.nemoclaw` directory hardening for user-owned config dirs
- skip directory `chmod` when the config dir is owned by another UID,
which preserves OpenClaw's root-owned `/sandbox/.nemoclaw` layout
- keep `config.json` restricted to `0600`
- add shell-level regression coverage for owned and non-owned config
directories

## Root Cause

PR #4054 changed the step-7 sandbox config sync script from writing
`~/.nemoclaw/config.json` to unconditionally running `chmod 700
~/.nemoclaw` first. In the OpenClaw sandbox image, `/sandbox/.nemoclaw`
is intentionally root-owned with mode `1755`, while
`/sandbox/.nemoclaw/config.json` is sandbox-owned and writable. The
directory chmod fails under `set -euo pipefail`, so `openshell sandbox
connect <sandbox>` exits 1 during:

`[7/8] Setting up OpenClaw inside sandbox`

This restores compatibility with that image contract without reverting
the file-permission hardening from #4054.

## Evidence

- Last known good public install: run `26413471038`, installed ref
`8ed9a2c04c80627deed519a7c5ffe8f95ebb5ddd`, `cloud-onboard-e2e` reached
`✓ OpenClaw gateway launched inside sandbox`.
- First observed failing public install: run `26414889786`, installed
ref `94fefa6fd08e33d48eede651e33770a0712da854`, `cloud-onboard-e2e`
failed at `openshell sandbox connect e2e-cloud-onboard`.
- The commit window contains #4054, #3980, and #3675; #4054 is the only
PR that changed the config-sync script executed at the failing boundary.

## Validation

- `npm ci --include=dev --ignore-scripts`
- `npx vitest run src/lib/onboard/config-sync.test.ts
--reporter=verbose`
- `npx @biomejs/biome check src/lib/onboard/config-sync.ts
src/lib/onboard/config-sync.test.ts`
- `git diff --check`
- `npm run build:cli`
- `npm run typecheck:cli`


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Refactor**
* Improved NemoClaw configuration directory permission handling to be
more robust in different user ownership scenarios.
* Enhanced sandbox configuration script to conditionally apply
permissions based on directory ownership.

* **Tests**
* Added comprehensive integration tests for configuration directory
permission management and ownership verification.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/4199?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@wscurran wscurran added bug-fix PR fixes a bug or regression and removed fix labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression security Potential vulnerability, unsafe behavior, or access risk v0.0.51 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[NemoClaw][All Platforms][Security] NemoClaw creates ~/.nemoclaw and config.json with world-readable permissions after onboard (should be 700/600)

3 participants