fix(security): surface immutable symlink hardening status #1
Closed
13ernkastel wants to merge 3 commits into
The symlink validation loop in nemoclaw-start.sh verifies that all symlinks in /sandbox/.openclaw/ point to their expected /sandbox/.openclaw-data/ targets, but this check runs only once at boot. After validation, the symlinks could theoretically be swapped before the gateway starts on the next line (TOCTOU).

While DAC already prevents the sandbox user from modifying the root-owned /sandbox/.openclaw directory, this adds defense-in-depth by setting the immutable flag (chattr +i) on both the directory and its symlinks after validation passes. The immutable flag cannot be removed by the sandbox user, closing the TOCTOU window even if DAC or Landlock are bypassed.

The fix degrades gracefully: if chattr is not available or the filesystem does not support immutable flags, the existing DAC protections remain in effect.

Closes NVIDIA#1019

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
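For readers who want to see the shape of the validate-then-harden flow described above, here is a minimal sketch. It is not the code from nemoclaw-start.sh: the function names `validate_symlinks` and `harden_immutable` are hypothetical, and the sketch assumes a `readlink` that supports `-f`. Failures from `chattr` are ignored on purpose, to mirror the graceful degradation the commit message describes.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; function names are assumptions, not the real script.

# Return 0 only if every symlink directly inside $1 resolves to a path under $2.
validate_symlinks() {
  local config_dir="$1" data_dir entry target
  data_dir="$(readlink -f "$2")" || return 1
  for entry in "$config_dir"/*; do
    [ -L "$entry" ] || continue            # skip anything that is not a symlink
    target="$(readlink -f "$entry")" || return 1
    case "$target" in
      "$data_dir"/*) ;;                    # target is inside the expected data dir
      *)
        echo "unexpected symlink target: $entry -> $target" >&2
        return 1
        ;;
    esac
  done
  return 0
}

# Best-effort hardening: try to set the immutable flag on the directory and
# on each symlink in it. Missing chattr or unsupported filesystems are not
# treated as errors, so the existing protections simply stay as they are.
harden_immutable() {
  local config_dir="$1" entry
  command -v chattr >/dev/null 2>&1 || return 0
  chattr +i "$config_dir" 2>/dev/null || true
  for entry in "$config_dir"/*; do
    [ -L "$entry" ] && chattr +i "$entry" 2>/dev/null || true
  done
}
```

A caller would run something like `validate_symlinks /sandbox/.openclaw /sandbox/.openclaw-data && harden_immutable /sandbox/.openclaw` immediately before starting the gateway, so that the window between the check and the use is as small as possible.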
Refactor symlink validation into reusable helpers, log when immutable hardening is unavailable or partial, and fail the gateway-isolation E2E when chattr is missing from the image.
3 tasks
99ef53b to
5ab4386
Compare
@13ernkastel nice follow-up! Mind posting it to NVIDIA/NemoClaw? Happy to merge it there.
Author
Reposted upstream as NVIDIA#1467: That PR carries the same follow-up hardening-observability changes against current

Looks like NVIDIA#1467 was closed shortly after, though?
Author
@cv upstream is live now at NVIDIA#1499:
cv added a commit to NVIDIA/NemoClaw that referenced this pull request on Apr 6, 2026
## Summary

This follow-up builds on #1137 and improves the observability around immutable symlink hardening without changing the underlying defense-in-depth approach.

## What Changed

- factors `.openclaw` symlink validation into a reusable helper so both startup paths use the same validation logic
- adds explicit security logging when immutable hardening succeeds, is partial, or is skipped because `chattr` is unavailable
- extends the gateway-isolation E2E to fail if `chattr` is missing from the image, so the mitigation cannot silently disappear

## Why

The original immutable-hardening fix is directionally strong, but the `chattr` path is intentionally best-effort and currently silent. That makes the mitigation harder to trust and harder to debug because:

- a missing `chattr` binary looks the same as successful hardening
- partial `chattr +i` failures are suppressed with no visibility
- the image can regress and stop shipping `chattr` without CI catching it

These changes make the mitigation easier to audit while staying compatible with the current layered hardening model.

## Validation

- `bash -n scripts/nemoclaw-start.sh`
- `bash -n test/e2e-gateway-isolation.sh`
- `git diff --check`
- not run: `test/e2e-gateway-isolation.sh` (`docker` is not installed in this environment)

## Relationship To #1137

This is a repost of the follow-up originally opened as `latenighthackathon#1`, now targeted at `NVIDIA/NemoClaw` as requested.

## Note

This replaces `#1467`, which GitHub auto-closed because the repository's contributor open-PR limit was hit at the time.

Signed-off-by: 13ernkastel <LennonCMJ@live.com>

## Summary by CodeRabbit

* **Chores**
  * Enhanced startup process validation to ensure system integrity and correct configuration
  * Improved security hardening mechanisms with comprehensive logging and graceful fallback handling when system features are unavailable
* **Tests**
  * Updated end-to-end integration tests to verify system hardening capabilities and feature availability

Co-authored-by: Carlos Villela <cvillela@nvidia.com>
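The three outcomes named in the commit message (hardening succeeds, is partial, or is skipped) could be surfaced roughly as follows. This is a hedged sketch, not the logging added in the actual change: the function name `report_hardening_status`, the `[SECURITY]` prefix, and the `CHATTR_BIN` override are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; names and log format are assumptions.
# CHATTR_BIN lets a caller substitute the binary, which also makes the
# three branches easy to exercise without touching real file attributes.
report_hardening_status() {
  local chattr_bin="${CHATTR_BIN:-chattr}" failed=0 total=0 path

  # Case 1: the tool is missing, so hardening is skipped and we say so.
  if ! command -v "$chattr_bin" >/dev/null 2>&1; then
    echo "[SECURITY] immutable hardening skipped: chattr not found"
    return 0
  fi

  # Try every path and count failures instead of silently discarding them.
  for path in "$@"; do
    total=$((total + 1))
    "$chattr_bin" +i "$path" 2>/dev/null || failed=$((failed + 1))
  done

  # Case 2: everything succeeded. Case 3: some or all paths failed.
  if [ "$failed" -eq 0 ]; then
    echo "[SECURITY] immutable hardening applied to $total path(s)"
  else
    echo "[SECURITY] immutable hardening partial: $failed of $total path(s) failed"
  fi
}
```

The function always returns 0, which keeps the hardening best-effort as the commit message intends. The complementary CI guard can then be as small as failing the E2E script when `command -v chattr` returns non-zero inside the image.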
tranzmatt pushed a commit to tranzmatt/NemoClaw that referenced this pull request on Apr 6, 2026
latenighthackathon pushed a commit that referenced this pull request on Apr 8, 2026
…IA#1114) (NVIDIA#1305)

## Summary

Fixes the four issues reported in NVIDIA#1114: EACCES permission errors and missing gateway token when running inside the NemoClaw sandbox.

### Issue mapping

| # | Reported error | Fix |
|---|----------------|-----|
| 1 | `EACCES: open '/sandbox/.openclaw/openclaw.json.*.tmp'` | `install_configure_guard`: intercepts `openclaw configure` with a clear error and directs users to `nemoclaw onboard --resume` on the host |
| 2 | Same as #1 (different PID) | Same fix |
| 3 | `EACCES: mkdir '/sandbox/.openclaw/credentials'` | Already resolved on main via NVIDIA#1519 (credentials symlink to `.openclaw-data/`) |
| 4 | No WhatsApp QR code | Consequence of #3, also resolved by NVIDIA#1519 |

### Root cause (issues 1 & 2)

OpenClaw's `configure` command performs atomic writes: it creates a temp file (`openclaw.json.PID.UUID.tmp`) in the same directory as the config. Since `/sandbox/.openclaw/` is Landlock read-only at the kernel level, file creation is rejected with EACCES.

This is by design: the sandbox config is intentionally immutable at runtime. Rather than weakening Landlock (security regression), we intercept the command in the sandbox shell and guide users to the correct host-side workflow.

### Changes

**1. `install_configure_guard()`**: Writes a shell function wrapper to `.bashrc`/`.profile` that intercepts `openclaw configure` and prints:

```
Error: 'openclaw configure' cannot modify config inside the sandbox.
The sandbox config is read-only (Landlock enforced) for security.
To change your configuration, exit the sandbox and run:
nemoclaw onboard --resume
This rebuilds the sandbox with your updated settings.
```

All other `openclaw` subcommands pass through to the real binary.

**2. `export_gateway_token()`**: Reads `gateway.auth.token` from `openclaw.json` and exports it as `OPENCLAW_GATEWAY_TOKEN`, so interactive sessions (`openshell sandbox connect`) can authenticate with the gateway. Persists to `.bashrc`/`.profile` using idempotent marker blocks and cleans stale tokens on revocation.

**3. `_read_gateway_token()` helper**: Shared Python snippet used by both `export_gateway_token` and `print_dashboard_urls` (deduplication, uses `with open()` context manager).

All three are called in both root and non-root startup paths.

## Security properties preserved

- `/sandbox/.openclaw` remains root-owned, Landlock read-only
- `openclaw.json` remains chmod 444 (immutable)
- No new attack surface: token is read-only from existing config
- `command openclaw` bypass preserves all non-configure functionality

Fixes NVIDIA#1114

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
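The guard described in this commit is a shell function that shadows the `openclaw` binary. A minimal sketch of that idea follows; it is an assumption about the shape of the wrapper, not the text that `install_configure_guard()` actually writes, and the error message is abbreviated from the one quoted above.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real wrapper is written into .bashrc/.profile
# by install_configure_guard() and may differ in detail.
openclaw() {
  if [ "${1:-}" = "configure" ]; then
    # Refuse the one subcommand that needs to write next to the read-only config.
    echo "Error: 'openclaw configure' cannot modify config inside the sandbox." >&2
    echo "To change your configuration, exit the sandbox and run: nemoclaw onboard --resume" >&2
    return 1
  fi
  # 'command' skips function lookup, so this reaches the real binary on PATH.
  command openclaw "$@"
}
```

Using `command openclaw` inside the function is what prevents infinite recursion and lets every subcommand other than `configure` pass through unchanged.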
gemini2026 pushed a commit to gemini2026/NemoClaw that referenced this pull request on Apr 14, 2026
latenighthackathon pushed a commit that referenced this pull request on Apr 29, 2026
… (NVIDIA#2404)

## Summary

NemoClaw's sandbox create stream only recognized the legacy Docker builder format, so BuildKit output would not be treated as active build progress once OpenShell emits it.

This adds BuildKit progress markers to the same parser path as the existing legacy builder output. It keeps the current legacy behavior and makes `#1 [internal] ...`, `#2 CACHED`, and `#3 DONE ...` visible as build progress.

## Changes

- `src/lib/sandbox-create-stream.ts`: recognize BuildKit step and completion lines while tracking the build phase.
- `src/lib/sandbox-create-stream.test.ts`: cover BuildKit progress output and verify it is streamed to the user.

## Testing

- `npm run build:cli` passed
- `npm run typecheck:cli` passed
- `npm test -- src/lib/sandbox-create-stream.test.ts` passed
- `npm test` was also attempted. The full suite is not green on current main in this environment; failures are in existing installer/onboard/legacy-guard tests outside this change.

## Evidence it works

The new focused test feeds BuildKit-style output into `streamSandboxCreate` and verifies that the lines are logged, collected in output, and mark sandbox creation as having seen progress.

Fixes NVIDIA#2311

Signed-off-by: Deepak Jain <deepujain@gmail.com>

## Summary by CodeRabbit

* **Bug Fixes**
  * Improved detection and display of BuildKit and upload progress so progress markers and completion states are recognized reliably.
* **Refactor**
  * Centralized progress-detection logic for more consistent handling of build and upload output.
* **Tests**
  * Added a test ensuring BuildKit-formatted progress lines are captured, included in output, and reported to the log callback.

Signed-off-by: Deepak Jain <deepujain@gmail.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
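The actual change lives in TypeScript (`src/lib/sandbox-create-stream.ts`), and its patterns are not shown on this page. Purely to illustrate the kind of line matching the summary describes, here is a shell sketch that accepts both the legacy `Step N/M` lines and the BuildKit `#N ...` lines. The function name and the regular expression are assumptions.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real parser is TypeScript and may match differently.
# Succeeds when the given line looks like builder progress output:
#   legacy builder:  "Step 1/5 : FROM ..."
#   BuildKit:        "#1 [internal] ...", "#2 CACHED", "#3 DONE 0.1s"
is_build_progress_line() {
  printf '%s\n' "$1" | grep -Eq '^(Step [0-9]+/[0-9]+ |#[0-9]+ (\[|CACHED|DONE))'
}
```

Anchoring the pattern at the start of the line keeps ordinary log text that merely mentions `#1` from being counted as progress.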
latenighthackathon pushed a commit that referenced this pull request on May 15, 2026
…3456) (NVIDIA#3520)

> **Draft for visibility.** Issue-autopilot Stages 4-5 of NVIDIA#3456. Will mark ready once batch self-review + CI complete.

## Summary

Closes the two remaining output threads in NVIDIA#3456 after the core dead-loop fix already landed on `main` (via NVIDIA#3459, NVIDIA#3434, NVIDIA#3483). Full sub-bug mapping in the NVIDIA#3456 status comment.

- **Sub-bug #3**: `nemoclaw <name> destroy --yes` recovery hint replaced with a registry-aware helper.
- **Sub-bug #4**: `Destroyed gateway 'nemoclaw' skipped` self-contradictory wording replaced with `Gateway 'nemoclaw' already removed or unreachable`.

## Acceptance criteria mapping

| Sub-bug | Resolution | Evidence |
|---|---|---|
| #1 dead loop | Already fixed on main (NVIDIA#3459) | out of scope |
| #2 firewall diagnostic | Already fixed on main (NVIDIA#3459) | out of scope |
| **#3** literal `<name>` placeholder | **This PR** | `src/lib/onboard/gpu-recovery.ts` + `onboard.ts:10387-10405` |
| **#4** misleading "skipped" wording | **This PR** | `src/lib/actions/uninstall/run-plan.ts:210-228, 407-414` |
| #5 uninstall residuals | Already fixed on main (NVIDIA#3483) | out of scope |

## Behavior matrix

`gpuPassthroughRecoveryLines(names)`:

| Input | Suggestion |
|---|---|
| `null` / `[]` | `nemoclaw uninstall && nemoclaw onboard --gpu` |
| one sandbox | `nemoclaw <name> destroy --yes --cleanup-gateway && nemoclaw onboard --gpu` |
| many sandboxes | each `destroy --yes`, only the last gets `--cleanup-gateway` |

## Test plan

```
npm run typecheck:cli
npx vitest run src/lib/onboard/gpu-recovery.test.ts src/lib/actions/uninstall/run-plan.test.ts
```

22 tests pass (6 new + 16 existing).

## Notes for reviewers

- This is the work NVIDIA#3464 attempted; that PR was closed without merging after CodeRabbit asked for the `<name>` placeholder to be forbidden in tests via negative assertion. This PR adopts that refinement.
- `runOptional` extension is backwards-compatible: existing callers without `onSkip` get the original wording.

Closes NVIDIA#3456 once merged.

---------

Signed-off-by: Charan Jagwani <charjags100@gmail.com>
Co-authored-by: Charan Jagwani <charjags100@gmail.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Summary
This PR builds on NVIDIA#1137 and tightens the immutable symlink hardening path without changing its core approach.

What Changed
- refactors symlink validation into reusable helpers
- logs when immutable hardening is unavailable or partial
- fails the gateway-isolation E2E when `chattr` is missing from the image, so the mitigation cannot silently disappear

Why
The original fix is directionally strong, but today the `chattr` path is intentionally best-effort and silent. That makes review and operations harder because:
- a missing `chattr` binary looks the same as successful hardening
- partial `chattr +i` failures are suppressed with no visibility
- the image can regress and stop shipping `chattr` without CI catching it

These changes make the mitigation easier to trust and easier to debug while staying compatible with the current defense-in-depth model.

Validation
- `bash -n scripts/nemoclaw-start.sh`
- `bash -n test/e2e-gateway-isolation.sh`

Relationship To NVIDIA#1137
This is a follow-up hardening PR intended to sit on top of NVIDIA#1137 rather than replace it.