fix(onboard): skip OpenClaw base resolution for agents by yimoj · Pull Request #3858 · NVIDIA/NemoClaw

yimoj · 2026-05-20T04:52:09Z

Summary

Hermes onboarding already stages and resolves its own agent sandbox base image, so the shared sandbox creation path should not also resolve the OpenClaw sandbox base. This PR skips OpenClaw base-image resolution for agent-staged Dockerfiles while preserving default OpenClaw pinning behavior.

Related Issue

Fixes #3645

Changes

Tracks when createSandbox is using an agent-staged Dockerfile and skips the OpenClaw pullAndResolveBaseImageDigest() path for that case.
Adds a Hermes regression proving agent onboarding does not invoke the OpenClaw sandbox-base resolver or emit OpenClaw base warnings.
Adds a paired default-path regression proving OpenClaw sandbox-base resolution and pinning logs still run for the standard Dockerfile path.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Focused checks run:

npm run build:cli passes
npm run typecheck:cli passes
npx vitest run test/onboard.test.ts -t "skips OpenClaw sandbox-base resolution|keeps resolving the OpenClaw sandbox base image" passes
npx vitest run test/onboard.test.ts passes
cd nemoclaw && npm install && npm run build passes
npx vitest run --project plugin nemoclaw/src/commands/migration-state.test.ts passes
npx vitest run --project cli test/ssrf-parity.test.ts passes

Broad checks attempted:

npm test failed on unrelated setup/broad-suite issues, including missing plugin install/build artifacts before cd nemoclaw && npm install && npm run build, test/fetch-guard-patch-regression.test.ts expecting status 0 but receiving 1, and several long-running CLI/e2e test timeouts.
npx prek run --all-files failed only in test-cli; earlier hook stages, plugin tests, source-shape budget, and skills YAML passed. The failing test-cli cases were unrelated to this change: test/cli.test.ts timeout cases, test/fetch-guard-patch-regression.test.ts, and test/sandbox-connect-inference.test.ts status mismatch.

Signed-off-by: Yimo Jiang yimoj@nvidia.com

Summary by CodeRabbit

Bug Fixes
- Agent-generated sandbox builds now skip base-image digest pinning and the associated local-registry fallback warnings, avoiding unnecessary resolution and warnings while preserving prior behavior for non-agent builds.
Tests
- Added integration tests verifying that agent-staged builds bypass base-image resolution and that default (non-agent) builds continue to resolve and pin base images with corresponding log output.

coderabbitai · 2026-05-20T04:52:21Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 95f77279-a61b-4423-9186-d1fe147df8c6

📥 Commits

Reviewing files that changed from the base of the PR and between 6422e21 and 1a84ce0.

📒 Files selected for processing (2)

src/lib/onboard.ts
test/onboard.test.ts

📝 Walkthrough

Walkthrough

createSandbox now skips base-image digest resolution and local-registry availability warnings when the Dockerfile was generated by an agent; the default build-context path retains resolution/pinning logic. Two tests verify both agent and non-agent behaviors.

Changes

Agent Dockerfile Base Image Handling

Layer / File(s)	Summary
Conditional base-image resolution `src/lib/onboard.ts`	When building from an agent-generated Dockerfile, `createSandbox` sets `resolved = null` to skip digest pinning and bypasses the local-registry availability/warning checks; non-agent (default) builds keep the previous resolve-and-pin behavior.
Tests for agent and default paths `test/onboard.test.ts`	Adds two tests: one asserting the agent-staged Dockerfile path does not call `resolveSandboxBaseImage` and emits no base-image logs/warnings; another asserting the default path calls the resolver once and logs base-image pinning.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

fix, Integration: Hermes, NemoClaw CLI, Sandbox, v0.0.46

Suggested reviewers

ericksoa
cv

Poem

🐰 I nibble code where Dockerfiles dwell,

Agent lanes now skip the digest bell,
Default roads still pin with care,
Two tests hop in to make it fair,
Hooray — the builds behave as they were meant to tell!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix(onboard): skip OpenClaw base resolution for agents' directly and clearly summarizes the main change - skipping OpenClaw base resolution when using agent-staged Dockerfiles.
Linked Issues check	✅ Passed	The PR successfully addresses issue `#3645` by detecting agent-staged Dockerfile usage and skipping OpenClaw base-image resolution, preventing spurious warnings and failed builds during Hermes agent onboarding.
Out of Scope Changes check	✅ Passed	All changes are directly related to the core objective: modifications to base-image resolution logic in createSandbox and corresponding regression tests validating agent vs. standard paths.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/lib/onboard.ts (1)

5171-5451: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reduce this hunk to satisfy the onboard entrypoint growth gate.

CI is currently failing because src/lib/onboard.ts grew by net +6 lines. Please trim this change (behavior can stay the same) so the budget check passes.

Possible line-neutral rewrite (same behavior)

-  let usingAgentDockerfile = false;
   if (fromDockerfile) {
@@
   } else if (agent) {
     const agentBuild = agentOnboard.createAgentSandbox(agent);
     buildCtx = agentBuild.buildCtx;
     stagedDockerfile = agentBuild.stagedDockerfile;
-    usingAgentDockerfile = true;
   } else {
     ({ buildCtx, stagedDockerfile } = stageOptimizedSandboxBuildContext(ROOT));
   }
@@
-  const resolved = usingAgentDockerfile
+  const resolved = agent && !fromDockerfile
     ? null
     : pullAndResolveBaseImageDigest({
         requireOpenshellSandboxAbi: isLinuxDockerDriverGatewayEnabled(),
       });
-  if (!usingAgentDockerfile) {
-    if (resolved?.digest) {
-      console.log(`  Pinning base image to ${resolved.digest.slice(0, 19)}...`);
-    } else if (resolved) {
-      console.log(`  Using sandbox base image ${resolved.ref}`);
-    } else {
+  if (resolved?.digest) {
+    console.log(`  Pinning base image to ${resolved.digest.slice(0, 19)}...`);
+  } else if (resolved) {
+    console.log(`  Using sandbox base image ${resolved.ref}`);
+  } else if (!(agent && !fromDockerfile)) {
       // Check if the image exists locally before falling back to unpinned :latest.
       // On a first-time install behind a firewall with no cached image, warn early
       // so the user knows the build will likely fail.
       const localCheck = dockerImageInspect(`${SANDBOX_BASE_IMAGE}:${SANDBOX_BASE_TAG}`, {
         ignoreError: true,
         suppressOutput: true,
       });
       if (localCheck.status === 0) {
         console.warn("  Warning: could not pull base image from registry; using cached :latest.");
       } else {
         console.warn(
           `  Warning: base image ${SANDBOX_BASE_IMAGE}:${SANDBOX_BASE_TAG} is not available locally.`,
         );
         console.warn("  The build will fail unless Docker can pull the image during build.");
         console.warn("  If offline, pull the image manually first:");
         console.warn(`    docker pull ${SANDBOX_BASE_IMAGE}:${SANDBOX_BASE_TAG}`);
       }
-    }
   }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 5171 - 5451, The file grew by a few lines in
the custom-Dockerfile and cleanup logic — shrink the hunk without changing
behavior by collapsing multi-line console logs and small helpers and removing
redundant whitespace/comments: inside the fromDockerfile branch compress the
three separate console.error/console.warn calls into single concise messages
where safe (e.g. the permission/blocking messages and the "move Dockerfile"
suggestions), inline the small cleanupCustomBuildCtx arrow into the catch
handler (making it a one-liner try/finally usage), and remove extra blank lines
around the usingAgentDockerfile assignment and the resolved/pull fallback block;
target symbols to change: the fromDockerfile branch
(fromResolved/buildContextDir/stagedDockerfile), cleanupCustomBuildCtx,
cleanupBuildCtx, usingAgentDockerfile, and the resolved = usingAgentDockerfile ?
null : pullAndResolveBaseImageDigest(...) section. Ensure behavior and exit
codes remain identical.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/lib/onboard.ts`:
- Around line 5171-5451: The file grew by a few lines in the custom-Dockerfile
and cleanup logic — shrink the hunk without changing behavior by collapsing
multi-line console logs and small helpers and removing redundant
whitespace/comments: inside the fromDockerfile branch compress the three
separate console.error/console.warn calls into single concise messages where
safe (e.g. the permission/blocking messages and the "move Dockerfile"
suggestions), inline the small cleanupCustomBuildCtx arrow into the catch
handler (making it a one-liner try/finally usage), and remove extra blank lines
around the usingAgentDockerfile assignment and the resolved/pull fallback block;
target symbols to change: the fromDockerfile branch
(fromResolved/buildContextDir/stagedDockerfile), cleanupCustomBuildCtx,
cleanupBuildCtx, usingAgentDockerfile, and the resolved = usingAgentDockerfile ?
null : pullAndResolveBaseImageDigest(...) section. Ensure behavior and exit
codes remain identical.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9c2d5632-f728-4d0f-9a94-c27597141cdc

📥 Commits

Reviewing files that changed from the base of the PR and between 11b1937 and db19946.

📒 Files selected for processing (2)

src/lib/onboard.ts
test/onboard.test.ts

coderabbitai

🧹 Nitpick comments (1)

test/onboard.test.ts (1)

2249-2255: ⚡ Quick win

Make these spawned tests hermetic by stripping messaging env vars.

Both new tests pass ...process.env into the child process, which can leak DISCORD_*/TELEGRAM_* and activate unrelated channels during createSandbox(). That can make these assertions flaky.

Proposed patch

-      env: {
-        ...process.env,
+      env: {
+        ...stripMessagingEnv(process.env),
         HOME: tmpDir,
         NEMOCLAW_HOME: path.join(tmpDir, ".nemoclaw"),
         PATH: `${fakeBin}:${process.env.PATH || ""}`,
         NEMOCLAW_NON_INTERACTIVE: "1",
       },
@@
-      env: {
-        ...process.env,
+      env: {
+        ...stripMessagingEnv(process.env),
         HOME: tmpDir,
         NEMOCLAW_HOME: path.join(tmpDir, ".nemoclaw"),
         PATH: `${fakeBin}:${process.env.PATH || ""}`,
         NEMOCLAW_NON_INTERACTIVE: "1",
       },

Based on learnings: “ensure spawned helper script does not inherit CI/local messaging credentials/config… delete/remove DISCORD_* and TELEGRAM_* before calling createSandbox().”

Also applies to: 2411-2417

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/onboard.test.ts` around lines 2249 - 2255, The spawned test environments
leak messaging credentials because the tests spread ...process.env into the
child env; before constructing the env object passed to the child (the env
literal that includes HOME, NEMOCLAW_HOME, PATH, NEMOCLAW_NON_INTERACTIVE),
remove any DISCORD_* and TELEGRAM_* variables so the helper script and
createSandbox() cannot pick up CI/local messaging creds; update both the block
around the shown env and the similar block used at the other test location (the
other spawn/env at the nearby tests) to filter out keys starting with "DISCORD_"
and "TELEGRAM_" from process.env when building the child env.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@test/onboard.test.ts`:
- Around line 2249-2255: The spawned test environments leak messaging
credentials because the tests spread ...process.env into the child env; before
constructing the env object passed to the child (the env literal that includes
HOME, NEMOCLAW_HOME, PATH, NEMOCLAW_NON_INTERACTIVE), remove any DISCORD_* and
TELEGRAM_* variables so the helper script and createSandbox() cannot pick up
CI/local messaging creds; update both the block around the shown env and the
similar block used at the other test location (the other spawn/env at the nearby
tests) to filter out keys starting with "DISCORD_" and "TELEGRAM_" from
process.env when building the child env.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 233028e2-0d7d-4ffc-b163-7cd97339bbef

📥 Commits

Reviewing files that changed from the base of the PR and between db19946 and 6422e21.

📒 Files selected for processing (2)

src/lib/onboard.ts
test/onboard.test.ts

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

wscurran · 2026-05-20T16:08:11Z

✨ Related open issues:

#3645 [hermes][onboard] NemoClaw loses memory mid-way and tries to create OpenClaw sandbox

ericksoa

Approved. Independent adversarial review found no blockers; regression risk is low and the change is scoped to skipping OpenClaw base resolution for agent-staged Dockerfiles while preserving default/custom paths. Validation passed in the review worktree: npm run build:cli, npm run typecheck:cli, focused onboard/base-image Vitest, and git diff --check. Live checks are green on head 1a84ce0.

yimoj self-assigned this May 20, 2026

coderabbitai Bot reviewed May 20, 2026

View reviewed changes

yimoj force-pushed the fix/3645-hermes-openclaw-base-resolution branch from db19946 to 6422e21 Compare May 20, 2026 04:57

coderabbitai Bot reviewed May 20, 2026

View reviewed changes

fix(onboard): skip OpenClaw base resolution for agents

1a84ce0

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

yimoj force-pushed the fix/3645-hermes-openclaw-base-resolution branch from 6422e21 to 1a84ce0 Compare May 20, 2026 05:28

yimoj added the v0.0.48 Release target label May 20, 2026

wscurran added fix integration: hermes Hermes integration behavior labels May 20, 2026

cv added v0.0.49 Release target v0.0.50 Release target and removed v0.0.48 Release target v0.0.49 Release target labels May 21, 2026

ericksoa approved these changes May 22, 2026

View reviewed changes

ericksoa merged commit 712ced7 into NVIDIA:main May 22, 2026
20 checks passed

coderabbitai Bot mentioned this pull request May 22, 2026

refactor(cli): extract onboard sandbox handler #3872

Merged

12 tasks

wscurran added bug-fix PR fixes a bug or regression and removed fix labels Jun 3, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(onboard): skip OpenClaw base resolution for agents#3858

fix(onboard): skip OpenClaw base resolution for agents#3858
ericksoa merged 1 commit into
NVIDIA:mainfrom
yimoj:fix/3645-hermes-openclaw-base-resolution

yimoj commented May 20, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 20, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

wscurran commented May 20, 2026

Uh oh!

ericksoa left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

yimoj commented May 20, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Related Issue

Changes

Type of Change

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

wscurran commented May 20, 2026

Uh oh!

ericksoa left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

yimoj commented May 20, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 20, 2026 •

edited

Loading