chore: upgrade agent runtime dependencies by ericksoa · Pull Request #3832 · NVIDIA/NemoClaw

ericksoa · 2026-05-19T21:40:53Z

Summary

- Upgrade OpenClaw pins and metadata to stable 2026.5.18.
- Upgrade OpenShell stable install/blueprint pins to v0.0.44.
- Upgrade Hermes Agent to v2026.5.16 / Hermes Agent v0.14.0 with the verified tarball SHA-256.
- Include the repo-obvious fourth runtime pin, @tencent-weixin/openclaw-weixin, bumped from 2.4.2 to 2.4.3; the host iLink client version is documented as being in lockstep with that sandbox plugin.

Validation

npm run build:cli
npx vitest run test/validate-blueprint.test.ts test/generate-openclaw-config.test.ts test/install-openshell-version-check.test.ts test/onboard-gateway-runtime.test.ts test/onboard-openshell-version.test.ts src/lib/onboard/docker-driver-gateway-runtime-marker.test.ts src/lib/sandbox/version.test.ts src/lib/verify-deployment.test.ts src/ext/wechat/qr.test.ts src/ext/wechat/login.test.ts test/seed-wechat-accounts.test.ts test/generate-hermes-config.test.ts test/hermes-plugin-handlers.test.ts test/hermes-provider-foundation.test.ts test/hermes-sandbox-workflow.test.ts test/hermes-share-mount-deps.test.ts test/hermes-start.test.ts test/hermes-tool-gateway-broker.test.ts test/nemohermes-alias.test.ts src/lib/hermes-provider-auth.test.ts --testTimeout 60000
cd nemoclaw && npm test -- src/package-metadata.test.ts
npm run validate:configs
npm run source-shape:check
npm run checks
npx tsx scripts/e2e/check-parity-map.ts --strict
npx tsx scripts/e2e/lint-conventions.ts
git diff --check
bash -n scripts/install-openshell.sh scripts/brev-launchable-ci-cpu.sh test/e2e/test-openshell-version-pin.sh test/e2e/test-openshell-gateway-upgrade.sh
shellcheck scripts/install-openshell.sh scripts/brev-launchable-ci-cpu.sh test/e2e/test-openshell-version-pin.sh test/e2e/test-openshell-gateway-upgrade.sh

Summary by CodeRabbit

New Features
- Detect native tool-search catalogs; improved WeChat account seeding using local plugin metadata; smarter Kimi inference command splitting; unified OpenClaw JSON output parsing helper.
Bug Fixes
- Hardened WebSocket pre-auth handshake timeout handling; more precise policy host extraction; stronger gateway readiness and crash-loop recovery checks.
Chores
- Bumped OpenClaw, Hermes, OpenShell and WeChat plugin versions; added runtime flag for preinstalled WeChat plugin; broad test and CI updates.

coderabbitai · 2026-05-19T21:41:06Z

Walkthrough

Bumps OpenClaw/Hermes/OpenShell/WeChat plugin pins; adds WeChat plugin metadata discovery and multi-channel seeding; centralizes OpenClaw JSON payload extraction; refactors Kimi exec splitting and tool-catalog handling; updates Dockerfile patching, Slack proof, tests, and docs.

Changes

Upstream Version Coordination and Plugin Overhaul

Layer / File(s)	Summary
Build args, manifests, and package pins `Dockerfile.base`, `agents/hermes/Dockerfile.base`, `agents/hermes/manifest.yaml`, `agents/openclaw/manifest.yaml`, `nemoclaw-blueprint/blueprint.yaml`, `nemoclaw/package.json`, `src/ext/wechat/qr.ts`, `docs/reference/commands.mdx`, `.agents/skills/nemoclaw-user-reference/references/commands.md`	Bump OpenClaw to `2026.5.18`, Hermes to `v2026.5.16`, and the `@tencent-weixin/openclaw-weixin` plugin to `2.4.3`; update manifests, package.json compat/build values, and documentation examples.
OpenShell pinning and installation defaults `scripts/brev-launchable-ci-cpu.sh`, `scripts/install-openshell.sh`, `src/lib/onboard/...`, `test/e2e/*`, `test/install-openshell-version-check.test.ts`	Raise OpenShell min/max/dev fallback pins from `0.0.39` → `0.0.44`; update CI comments, installer scripts, fallback supervisor/gateway refs, and E2E/test fixtures to use `0.0.44`/`0.0.45` scenarios.
WeChat plugin discovery and multi-channel seeding `scripts/generate-openclaw-config.py`, `scripts/seed-wechat-accounts.py`, `Dockerfile`, `test/generate-openclaw-config.test.ts`, `test/seed-wechat-accounts.test.ts`	Detect preinstalled plugin metadata or env signal before seeding; derive channel ids from installed plugin metadata; patch `openclaw.json` for all derived/legacy channel ids; bump plugin spec to `@tencent-weixin/openclaw-weixin@2.4.3`.
E2E JSON payload helper & migration `test/e2e/lib/openclaw-agent-json.py`, multiple `test/e2e/*.sh`, `test/openclaw-agent-json.test.ts`	Add helper to normalize/extract assistant `payload.text` from OpenClaw JSON envelopes; migrate many E2E scripts from inline Python parsers to this shared helper and add unit tests for it.
Kimi exec-splitting and tool-catalog detection `nemoclaw-blueprint/openclaw-plugins/kimi-inference-compat/index.js`, `scripts/patch-openclaw-tool-catalog.js`, `test/openclaw-tool-catalog-patch.test.ts`	Refactor safe combined-`exec` splitting for messages and deltas with encoding helpers; detect native tool-search selection files and treat them as unpatchable.
Slack proof, Dockerfile patches, gateway readiness, and regression tests `test/e2e/lib/slack-api-proof.sh`, `Dockerfile`, `test/fetch-guard-patch-regression.test.ts`, `test/e2e/test-issue-2478-crash-loop-recovery.sh`, `src/lib/policy/index.ts`	Add hermetic fallback for Slack proof when OpenClaw helpers are unavailable; generalize Dockerfile OpenClaw patching and add regression test; enhance gateway readiness checks and tighten host parsing in policy extraction.
Unit tests, parity docs, and misc updates `src/lib/sandbox/version.test.ts`, `src/lib/verify-deployment.test.ts`, numerous test fixtures and parity docs	Update tests/fixtures/parity inventories to expect the new pinned versions, tweak test utilities, and refine comments/diagnostics across test scripts.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

NVIDIA/NemoClaw#3808: Related tool-catalog patcher changes and tests.
NVIDIA/NemoClaw#3478: Related OpenShell version-pin E2E guard updates.
NVIDIA/NemoClaw#3839: Overlaps in WeChat seeding and seed-wechat-accounts.py logic.

Suggested labels

NemoClaw CLI, Integration: WeChat, v0.0.44

Suggested reviewers

cv
jyaunches

"I hopped through versions, pins in tow,
Found WeChat channels where local data grow,
Tests now share one little Python light,
Kimi splits commands tidy and right,
A rabbit cheers the infra’s new glow." 🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 22.39% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title 'chore: upgrade agent runtime dependencies' directly summarizes the main change in the changeset—upgrading OpenClaw, OpenShell, Hermes Agent, and WeChat plugin versions.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

github-actions · 2026-05-19T21:41:22Z

🌿 Preview your docs: https://nvidia-preview-pr-3832.docs.buildwithfern.com/nemoclaw

github-actions · 2026-05-19T21:43:08Z

E2E Advisor Recommendation

Required E2E: cloud-e2e, cloud-onboard-e2e, test-e2e-sandbox, test-e2e-gateway-isolation, test-non-root-sandbox-smoke, openshell-version-pin-e2e, openshell-gateway-upgrade-e2e, gateway-health-honest-e2e, gateway-drift-preflight-e2e, onboard-inference-smoke-e2e, openclaw-plugin-runtime-exdev-e2e, kimi-inference-compat-e2e, messaging-providers-e2e, messaging-compatible-endpoint-e2e, channels-stop-start-e2e, openclaw-inference-switch-e2e, model-router-provider-routed-inference-e2e, bedrock-runtime-compatible-anthropic-e2e, network-policy-e2e, sandbox-operations-e2e, hermes-e2e, launchable-smoke-e2e
Optional E2E: cloud-inference-e2e, brave-search-e2e, hermes-inference-switch-e2e, rebuild-openclaw-e2e, upgrade-stale-sandbox-e2e, issue-2478-crash-loop-recovery-e2e

Dispatch hint: cloud-e2e,cloud-onboard-e2e,kimi-inference-compat-e2e,messaging-providers-e2e,messaging-compatible-endpoint-e2e,channels-stop-start-e2e,openclaw-inference-switch-e2e,bedrock-runtime-compatible-anthropic-e2e,network-policy-e2e,sandbox-operations-e2e,hermes-e2e,openshell-gateway-upgrade-e2e,launchable-smoke-e2e

Auto-dispatched E2E: cloud-e2e, cloud-onboard-e2e, openshell-gateway-upgrade-e2e, kimi-inference-compat-e2e, messaging-providers-e2e, messaging-compatible-endpoint-e2e, channels-stop-start-e2e, openclaw-inference-switch-e2e, bedrock-runtime-compatible-anthropic-e2e, network-policy-e2e, sandbox-operations-e2e, hermes-e2e, launchable-smoke-e2e via nightly-e2e.yaml at cd41b2b4f5ebf6758a62f4c3036c6817f244db11 — nightly run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

cloud-e2e (high): Validates the complete source install → onboard → sandbox creation → policy setup → live NVIDIA inference → OpenClaw agent user flow after the OpenShell/OpenClaw version bumps and Docker/runtime changes.
cloud-onboard-e2e (high): Focused onboarding validation for installer, Landlock/read-only behavior, credential leak checks, and inference.local after changes to install-openshell, onboard, Dockerfile, blueprint pins, and config generation.
test-e2e-sandbox (medium): Builds the modified sandbox image and runs in-container blueprint/OpenClaw/plugin smoke assertions, directly covering Dockerfile, Dockerfile.base, blueprint, package metadata, and OpenClaw patch changes.
test-e2e-gateway-isolation (medium): Required because Dockerfile runtime changes and gateway token/config handling can affect isolation guarantees and gateway process boundaries.
test-non-root-sandbox-smoke (low): Covers non-root sandbox startup after image/base image and OpenClaw plugin/runtime-dependency changes, including permissions-sensitive WeChat/plugin state paths.
openshell-version-pin-e2e (low): Directly covers the OpenShell max/min pin bump and scripts/install-openshell.sh behavior when a too-new sticky OpenShell is already on PATH.
openshell-gateway-upgrade-e2e (high): OpenShell pin and installer behavior changed; this validates upgrading an older working install to the current supported OpenShell while preserving gateway/sandbox state.
gateway-health-honest-e2e (low): Onboard/gateway runtime marker code is in the change set; this regression ensures onboarding does not report a dead Docker-driver gateway as healthy.
gateway-drift-preflight-e2e (low): The OpenShell version bump can change gateway RPC/state behavior; this verifies NemoClaw fails closed on stale or incompatible gateway state before trusting sandbox status.
onboard-inference-smoke-e2e (low): Onboard/config-generation changes can falsely report success before the route works; this regression verifies onboard requires a real configured provider/model completion.
openclaw-plugin-runtime-exdev-e2e (medium): OpenClaw version and plugin-runtime behavior changed, including plugin preinstall/load paths; this catches runtime dependency staging failures such as EXDEV cross-device renames.
kimi-inference-compat-e2e (medium): The Kimi compatibility plugin changed tool-call splitting and streaming delta rewriting; this E2E is the direct coverage for safe exec splitting through OpenClaw trajectories.
messaging-providers-e2e (high): WeChat plugin preinstall/seed changes, Slack proof helper changes, and messaging credential-provider paths require validation of token isolation, placeholder rewriting, and channel config inside a real sandbox.
messaging-compatible-endpoint-e2e (medium): Validates the Telegram plus OpenAI-compatible endpoint route through inference.local after config-generation and OpenClaw version changes.
channels-stop-start-e2e (high): WeChat/Slack/messaging config and credential handling changed; this validates stop/start/remove channel lifecycle across rebuilds for OpenClaw and Hermes messaging channels.
openclaw-inference-switch-e2e (high): Covers NemoClaw inference set/switch behavior, OpenShell route state, openclaw.json patching, hashes, and live requests after changes to provider config generation and OpenClaw runtime version.
model-router-provider-routed-inference-e2e (medium): Blueprint and config-generation changes can affect routed provider setup; this ensures model-router onboard produces a route that serves through inference.local.
bedrock-runtime-compatible-anthropic-e2e (medium): The provider/plugin disablement and config-generation changes can affect compatible Anthropic/Bedrock routing and hidden credential boundaries; the touched E2E validates that path hermetically.
network-policy-e2e (high): src/lib/policy and blueprint changes can affect policy application and network boundaries; this validates deny-by-default, presets, live policy changes, inference exemption, and SSRF validation.
sandbox-operations-e2e (high): Sandbox lifecycle and version/status behavior are in scope, and test-sandbox-operations.sh was touched; this validates real sandbox operations under the new OpenShell/OpenClaw pins.
hermes-e2e (high): Hermes base image, expected version, and config changed; this validates Hermes onboarding, health, gateway, and live inference for the multi-agent runtime.
launchable-smoke-e2e (medium): scripts/brev-launchable-ci-cpu.sh and launchable smoke test changed, including the OpenShell default version. This validates the community launchable/bootstrap path.

Optional E2E

cloud-inference-e2e (medium): Useful focused coverage for live inference.local and skill filesystem behavior, but cloud-e2e already exercises the primary live OpenClaw inference path.
brave-search-e2e (medium): Policy/config changes touch web-search-related code paths; this is useful confidence for Brave credential rewrite and policy preset behavior, but requires a real Brave key and is adjacent rather than central to the version bump.
hermes-inference-switch-e2e (high): Good additional coverage for Hermes route/config updates under the new Hermes version, but hermes-e2e is the merge-blocking smoke for the Hermes bump.
rebuild-openclaw-e2e (high): Useful for validating stale OpenClaw rebuild/upgrade behavior after the OpenClaw pin bump, but the core install/onboard and version-pin risks are already covered by required full, upgrade, and sandbox E2Es.
upgrade-stale-sandbox-e2e (high): Additional confidence that an existing stale sandbox upgrades from an older OpenClaw to the new pin, but not required if rebuild-openclaw or the main upgrade path is deferred for cost.
issue-2478-crash-loop-recovery-e2e (medium): Useful adjacent coverage because OpenClaw/gateway startup behavior changed, but it targets a specific crash-loop recovery regression rather than the primary changed paths.

New E2E recommendations

wechat-plugin-preinstall (high): Existing messaging-provider checks appear to allow WeChat seeded account assertions to skip when plugin/account state is missing. This PR makes WeChat plugin preinstall and seeding a first-class runtime contract, so CI should hard-fail when openclaw-weixin metadata, channel registration, and placeholder account files are absent after onboard.
- Suggested test: Add a dedicated WeChat preinstalled-plugin E2E that onboards with fake WECHAT_* env, asserts openclaw-weixin is installed/enabled, channels.openclaw-weixin has the configured account enabled, and the per-account file contains only openshell:resolve:env placeholders.
openclaw-version-bump-patch-contract (medium): The Dockerfile patch logic depends on compiled OpenClaw dist symbol names and constants. The sandbox build catches missing grep patterns, but there is no focused E2E artifact that records which patches applied against the new OpenClaw version and proves the patched runtime still serves a minimal agent turn before plugin install.
- Suggested test: Add an OpenClaw patch-contract E2E that builds the sandbox image, extracts patch markers for fetch-guard, symlink install paths, handshake timeout, and tool catalog wrapper, then runs a minimal in-sandbox OpenClaw doctor/agent smoke.
hermes-version-upgrade-contract (medium): Hermes expected_version and tarball pin changed, but existing Hermes E2E validates fresh onboard rather than upgrade/rebuild preservation across a Hermes version bump.
- Suggested test: Add a Hermes stale-version rebuild E2E that creates or simulates a sandbox registered with an older Hermes version, runs rebuild, and asserts expected_version, persisted state files, and API health after upgrade.

Dispatch hint

Workflow: .github/workflows/nightly-e2e.yaml
jobs input: cloud-e2e,cloud-onboard-e2e,kimi-inference-compat-e2e,messaging-providers-e2e,messaging-compatible-endpoint-e2e,channels-stop-start-e2e,openclaw-inference-switch-e2e,bedrock-runtime-compatible-anthropic-e2e,network-policy-e2e,sandbox-operations-e2e,hermes-e2e,openshell-gateway-upgrade-e2e,launchable-smoke-e2e

github-actions · 2026-05-19T21:53:54Z

Selective E2E Results — ❌ Some jobs failed

Run: 26127019955
Target ref: upgrade/all-deps-2026-05-19
Workflow ref: upgrade/all-deps-2026-05-19
Requested jobs: all (no filter)
Summary: 8 passed, 35 failed, 2 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	❌ failure
brave-search-e2e	✅ success
channels-stop-start-e2e	❌ failure
cloud-e2e	❌ failure
cloud-inference-e2e	❌ failure
cloud-onboard-e2e	❌ failure
credential-migration-e2e	❌ failure
credential-sanitization-e2e	❌ failure
device-auth-health-e2e	❌ failure
diagnostics-e2e	❌ failure
docs-validation-e2e	❌ failure
double-onboard-e2e	❌ failure
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
hermes-slack-e2e	✅ success
inference-routing-e2e	❌ failure
issue-2478-crash-loop-recovery-e2e	❌ failure
kimi-inference-compat-e2e	❌ failure
launchable-smoke-e2e	❌ failure
messaging-compatible-endpoint-e2e	❌ failure
messaging-providers-e2e	❌ failure
network-policy-e2e	❌ failure
onboard-repair-e2e	❌ failure
onboard-resume-e2e	❌ failure
openclaw-inference-switch-e2e	❌ failure
openclaw-slack-pairing-e2e	❌ failure
openshell-gateway-upgrade-e2e	❌ failure
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	✅ success
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	❌ failure
runtime-overrides-e2e	❌ failure
sandbox-operations-e2e	❌ failure
sandbox-survival-e2e	❌ failure
shields-config-e2e	❌ failure
skill-agent-e2e	❌ failure
snapshot-commands-e2e	❌ failure
state-backup-restore-e2e	❌ failure
telegram-injection-e2e	❌ failure
token-rotation-e2e	❌ failure
tunnel-lifecycle-e2e	❌ failure
upgrade-stale-sandbox-e2e	❌ failure

Failed jobs: bedrock-runtime-compatible-anthropic-e2e, channels-stop-start-e2e, cloud-e2e, cloud-inference-e2e, cloud-onboard-e2e, credential-migration-e2e, credential-sanitization-e2e, device-auth-health-e2e, diagnostics-e2e, docs-validation-e2e, double-onboard-e2e, inference-routing-e2e, issue-2478-crash-loop-recovery-e2e, kimi-inference-compat-e2e, launchable-smoke-e2e, messaging-compatible-endpoint-e2e, messaging-providers-e2e, network-policy-e2e, onboard-repair-e2e, onboard-resume-e2e, openclaw-inference-switch-e2e, openclaw-slack-pairing-e2e, openshell-gateway-upgrade-e2e, rebuild-openclaw-e2e, runtime-overrides-e2e, sandbox-operations-e2e, sandbox-survival-e2e, shields-config-e2e, skill-agent-e2e, snapshot-commands-e2e, state-backup-restore-e2e, telegram-injection-e2e, token-rotation-e2e, tunnel-lifecycle-e2e, upgrade-stale-sandbox-e2e. Check run artifacts for logs.

github-actions · 2026-05-19T21:58:38Z

Selective E2E Results — ❌ Some jobs failed

Run: 26127114600
Target ref: 68b19c87d2f1b90414b3ff731759f3a3a4659126
Workflow ref: main
Requested jobs: openshell-gateway-upgrade-e2e,cloud-onboard-e2e,rebuild-openclaw-e2e,hermes-e2e,launchable-smoke-e2e
Summary: 2 passed, 3 failed, 0 skipped

Job	Result
cloud-onboard-e2e	✅ success
hermes-e2e	✅ success
launchable-smoke-e2e	❌ failure
openshell-gateway-upgrade-e2e	❌ failure
rebuild-openclaw-e2e	❌ failure

Failed jobs: launchable-smoke-e2e, openshell-gateway-upgrade-e2e, rebuild-openclaw-e2e. Check run artifacts for logs.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

Dockerfile (1)
205-223: ⚠️ Potential issue | 🔴 Critical

E2E test failures must be resolved before merge.

The recommended E2E workflows have been triggered on upgrade/all-deps-2026-05-19. However, the previous run (started 2026-05-19T21:41:03Z) completed with critical failures in the exact tests specified for Dockerfile validation:

sandbox-survival-e2e: FAILED (validates gateway restart recovery with persistent connections)

rebuild-openclaw-e2e: FAILED (validates workspace state survives rebuild)

cloud-e2e: FAILED (validates full onboard + cloud inference)

hermes-e2e: Passed

A newer run started at 2026-05-19T22:01:45Z is currently in progress with all recommended jobs queued. Given that the WebSocket timeout patch directly affects OpenClaw's handshake behavior under load, and sandbox-survival-e2e tests gateway restart recovery (triggering reconnections), these failures are critical to resolve. Wait for the in-progress run to complete and confirm all recommended tests pass before merging.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile` around lines 205 - 223, The Dockerfile patch changes
DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS (and thus affects
OPENCLAW_HANDSHAKE_TIMEOUT_MS / OPENCLAW_CONNECT_CHALLENGE_TIMEOUT_MS) and has
triggered critical E2E failures; do not merge until the in-progress CI run for
upgrade/all-deps-2026-05-19 completes and all recommended workflows (especially
sandbox-survival-e2e, rebuild-openclaw-e2e, cloud-e2e) pass; if they fail again,
revert or adjust the change around DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS (the
hto_files sed/grep patch in the Patch 5 block) and iterate until the full E2E
matrix is green.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0fd110ed-60e5-46b8-a551-f60c4b2dfefb

📥 Commits

Reviewing files that changed from the base of the PR and between 68b19c8 and 68b852f.

📒 Files selected for processing (2)

Dockerfile
test/fetch-guard-patch-regression.test.ts

github-actions · 2026-05-19T22:19:45Z

Selective E2E Results — ❌ Some jobs failed

Run: 26128083363
Target ref: 68b852f98210251945355b46d0ecda8a353030c8
Workflow ref: main
Requested jobs: openshell-gateway-upgrade-e2e,cloud-onboard-e2e,cloud-inference-e2e,messaging-providers-e2e,network-policy-e2e,hermes-e2e,rebuild-openclaw-e2e,upgrade-stale-sandbox-e2e,rebuild-hermes-e2e,launchable-smoke-e2e
Summary: 7 passed, 3 failed, 0 skipped

Job	Result
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
hermes-e2e	✅ success
launchable-smoke-e2e	❌ failure
messaging-providers-e2e	❌ failure
network-policy-e2e	✅ success
openshell-gateway-upgrade-e2e	❌ failure
rebuild-hermes-e2e	✅ success
rebuild-openclaw-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success

Failed jobs: launchable-smoke-e2e, messaging-providers-e2e, openshell-gateway-upgrade-e2e. Check run artifacts for logs.

github-actions · 2026-05-19T22:35:44Z

Selective E2E Results — ❌ Some jobs failed

Run: 26127986697
Target ref: upgrade/all-deps-2026-05-19
Workflow ref: upgrade/all-deps-2026-05-19
Requested jobs: all (no filter)
Summary: 33 passed, 10 failed, 2 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	❌ failure
brave-search-e2e	✅ success
channels-stop-start-e2e	❌ failure
cloud-e2e	❌ failure
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-migration-e2e	✅ success
credential-sanitization-e2e	✅ success
device-auth-health-e2e	✅ success
diagnostics-e2e	✅ success
docs-validation-e2e	✅ success
double-onboard-e2e	✅ success
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
hermes-slack-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	❌ failure
kimi-inference-compat-e2e	❌ failure
launchable-smoke-e2e	❌ failure
messaging-compatible-endpoint-e2e	❌ failure
messaging-providers-e2e	❌ failure
network-policy-e2e	✅ success
onboard-repair-e2e	✅ success
onboard-resume-e2e	✅ success
openclaw-inference-switch-e2e	❌ failure
openclaw-slack-pairing-e2e	✅ success
openshell-gateway-upgrade-e2e	✅ success
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	✅ success
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	✅ success
runtime-overrides-e2e	✅ success
sandbox-operations-e2e	❌ failure
sandbox-survival-e2e	✅ success
shields-config-e2e	✅ success
skill-agent-e2e	✅ success
snapshot-commands-e2e	✅ success
state-backup-restore-e2e	✅ success
telegram-injection-e2e	✅ success
token-rotation-e2e	✅ success
tunnel-lifecycle-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success

Failed jobs: bedrock-runtime-compatible-anthropic-e2e, channels-stop-start-e2e, cloud-e2e, issue-2478-crash-loop-recovery-e2e, kimi-inference-compat-e2e, launchable-smoke-e2e, messaging-compatible-endpoint-e2e, messaging-providers-e2e, openclaw-inference-switch-e2e, sandbox-operations-e2e. Check run artifacts for logs.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (1)

scripts/seed-wechat-accounts.py (1)

145-155: ⚡ Quick win

Make metadata discovery order deterministic.

os.walk() does not guarantee directory or file order, and _dedupe() preserves first-seen order. That means the patched channel-id order in openclaw.json and the "registered ..." log can vary across environments once multiple metadata files are present.

Suggested fix

     matches: list[pathlib.Path] = []
     for root, dirs, files in os.walk(extensions_dir):
-        dirs[:] = [
+        dirs[:] = sorted(
+            [
             item
             for item in dirs
             if item not in {"node_modules", "plugin-runtime-deps", ".git"}
-        ]
+            ]
+        )
         root_path = pathlib.Path(root)
-        for filename in files:
+        for filename in sorted(files):
             if filename in {"openclaw.plugin.json", "package.json"}:
                 matches.append(root_path / filename)
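
The suggested change can be read as the following self-contained sketch. The function name `find_metadata_files` and the two module-level constants are assumptions made for illustration; only the skipped directory names and the metadata filenames come from the diff above. It relies on the documented `os.walk` behavior that, in top-down mode, assigning to `dirs[:]` in place controls which subdirectories are visited and in what order.

```python
import os
import pathlib
import tempfile

# Names taken from the suggested diff; the constants themselves are illustrative.
SKIPPED_DIRS = {"node_modules", "plugin-runtime-deps", ".git"}
METADATA_NAMES = {"openclaw.plugin.json", "package.json"}


def find_metadata_files(extensions_dir):
    """Collect plugin metadata files in a stable, sorted traversal order."""
    matches = []
    for root, dirs, files in os.walk(extensions_dir):
        # In-place assignment both prunes and orders the directories that
        # os.walk will descend into next (top-down traversal is the default).
        dirs[:] = sorted(d for d in dirs if d not in SKIPPED_DIRS)
        root_path = pathlib.Path(root)
        for filename in sorted(files):
            if filename in METADATA_NAMES:
                matches.append(root_path / filename)
    return matches


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    for rel in [
        "b/package.json",
        "a/package.json",
        "a/openclaw.plugin.json",
        "node_modules/x/package.json",
    ]:
        path = base / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("{}")
    found = [p.relative_to(base).as_posix() for p in find_metadata_files(base)]

print(found)
```

With both the directory list and the file list sorted, the result no longer depends on the order in which the filesystem returns entries, so a first-seen dedupe downstream sees the same sequence on every host.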

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/seed-wechat-accounts.py` around lines 145 - 155, os.walk traversal is
non-deterministic so the discovered metadata order (matches list) can vary; make
discovery deterministic by sorting directory names and file names during the
os.walk loop and then sort (or otherwise order) the resulting matches before
returning. Specifically, inside the loop that iterates over
os.walk(extensions_dir) where variables extensions_dir, dirs, files, root_path,
filename, and matches are used, replace the implicit iteration with sorted(dirs)
and sorted(files) (or sort dirs in-place with dirs.sort()) so exploration order
is stable, and finally apply a deterministic ordering to matches (e.g., sort
matches by path.name or full path) before return so downstream consumers and
logs are stable.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@nemoclaw-blueprint/openclaw-plugins/kimi-inference-compat/index.js`:
- Around line 286-300: The bug is reusing the original deltaSplit.contentIndex
after applySafeExecSplitToMessage may have expanded/shifted combined exec
blocks; instead of trusting deltaSplit.contentIndex, compute the targetIndex by
locating the actual target command in the (possibly-rewritten)
deltaSplit.commands (e.g., use deltaSplit.commands.findIndex with a
deep-equality match to the intended command) and fall back to the clamped
contentIndex only if the find fails; then use that computed targetIndex when
selecting targetCommand, calling encodeToolCallArgumentsLike(event.delta,
targetCommand) and when assigning event.toolCall via
buildSplitToolCalls(deltaSplit.toolCall, deltaSplit.commands)[targetIndex];
ensure this change touches the block that calls
applySafeExecSplitToMessage(event.partial|event.message), uses deltaSplit,
targetIndex, targetCommand, encodeToolCallArgumentsLike, and
buildSplitToolCalls.
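
The index-resolution step described above can be sketched in Python as follows. This is an analog of the idea only, not the plugin's JavaScript: the function name `resolve_target_index` and the shape of the command objects are hypothetical, and Python's `==` on dicts and lists stands in for the deep-equality match the comment asks for.

```python
def resolve_target_index(commands, intended_command, content_index):
    """Locate intended_command in commands; fall back to a clamped index.

    Returns None when there are no commands to choose from.
    """
    for index, command in enumerate(commands):
        # == on dicts/lists compares structurally, i.e. deep equality.
        if command == intended_command:
            return index
    if not commands:
        return None
    # Fallback only: clamp the original index into the valid range.
    return max(0, min(content_index, len(commands) - 1))


# Hypothetical command objects after a combined exec was split in three.
split_commands = [{"command": "ls"}, {"command": "pwd"}, {"command": "whoami"}]
print(resolve_target_index(split_commands, {"command": "whoami"}, 0))
```

The point of searching first is that a stale index can still be in range after a split and silently select the wrong command, so clamping alone is not enough.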

In `@test/e2e/test-issue-2478-crash-loop-recovery.sh`:
- Around line 275-277: The current health check captures
status_output="$(timeout 20 nemoclaw "$SANDBOX_NAME" status 2>&1)" || true and
greps it for 'healthy|ready|running', which also accepts unhealthy output such
as "unhealthy" or "not running". Match whole words only (e.g., grep -Eiq
'\b(healthy|ready|running)\b' or grep -Eiqw 'healthy|ready|running') so that
substrings like "unhealthy" are rejected, and additionally reject negated
phrases such as "not running", which a word-bounded match alone still accepts
because "running" is a standalone word there.
- Line 111: The current ps|awk pipeline can pick the first "openclaw" process (a
launcher) instead of the gateway; change the pipeline so it deterministically
selects the gateway PID by first filtering processes whose command/args contain
"gateway" and choosing the lowest PID, with an explicit fallback to a plain
"openclaw" process only if no gateway is found. Update the existing pipeline
(the ps -eo ... | awk '\$2 == "openclaw" || \$0 ~ /openclaw[ -]gateway/ { print
\$1; exit }' | tr -d '[:space:]' line) to: 1) prefer matches of /gateway/ (e.g.
awk '\$0 ~ /[ -]gateway/ { print \$1 }' piped to sort -n | head -n1), and 2)
only if that yields nothing, run the plain "openclaw" selection; ensure the
final result is deterministic (sorted by PID) and still trimmed with tr -d
'[:space:]'.
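
Both selection rules for this script can be sketched in Python, assuming `ps` output lines of the form `PID COMMAND ARGS...` and free-form status text. The function names and sample lines are hypothetical. One caveat on the status check: a word-bounded pattern rejects substrings such as "unhealthy" or "already", but "not running" still contains the standalone word "running", so the sketch also rejects negated phrases explicitly.

```python
import re

GATEWAY_RE = re.compile(r"openclaw[ -]gateway")
STATUS_RE = re.compile(r"\b(healthy|ready|running)\b", re.IGNORECASE)
NEGATED_RE = re.compile(r"\b(not|never)\s+(healthy|ready|running)\b", re.IGNORECASE)


def select_gateway_pid(ps_lines):
    """Prefer the lowest gateway PID; fall back to the lowest plain openclaw PID."""
    gateway_pids = []
    openclaw_pids = []
    for line in ps_lines:
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # header row or malformed line
        pid = int(fields[0])
        if GATEWAY_RE.search(line):
            gateway_pids.append(pid)
        elif fields[1] == "openclaw":
            openclaw_pids.append(pid)
    candidates = gateway_pids or openclaw_pids
    return min(candidates) if candidates else None


def looks_healthy(status_output):
    """Accept only standalone, non-negated healthy/ready/running words."""
    if NEGATED_RE.search(status_output):
        return False
    return STATUS_RE.search(status_output) is not None


sample = [
    "  PID COMMAND ARGS",
    "   40 openclaw openclaw launcher",
    "  120 node node openclaw-gateway --port 1",
    "   95 openclaw openclaw gateway run",
]
print(select_gateway_pid(sample), looks_healthy("Gateway: not running"))
```

Taking the minimum PID within the preferred group makes the choice independent of the order in which `ps` lists processes.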

In `@test/policies.test.ts`:
- Line 407: The test is incorrectly checking array membership for a literal
backtick (expect(hosts).not.toContain("`")) instead of validating each host
string for embedded backticks; update the assertion to iterate or otherwise
inspect each host string returned by hosts (e.g., in the test that defines/uses
the hosts variable and the related test block) and assert that no individual
host contains a backtick (for example, use hosts.forEach or a join+regex check)
so the test fails if any host string includes the ` character.
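
The difference between the two assertions can be shown with a small Python analog (the test itself is TypeScript; the host values and the helper name `hosts_with_backticks` here are made up for illustration). List membership only asks whether some element equals the backtick string exactly, so it passes even when a host embeds one.

```python
def hosts_with_backticks(hosts):
    """Return every host string that contains a backtick anywhere in it."""
    return [host for host in hosts if "`" in host]


# Hypothetical extraction result with one malformed host.
hosts = ["api.example.com", "bad`host.example.com"]

# Membership compares whole elements, so this is True despite the bad host.
membership_check_passes = "`" not in hosts

# The per-host check is the one that actually detects the problem.
offenders = hosts_with_backticks(hosts)

print(membership_check_passes, offenders)
```

Asserting that the offenders list is empty gives the stricter check the comment asks for, and a failure message that names the offending host.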

---

Nitpick comments:
In `@scripts/seed-wechat-accounts.py`:
- Around line 145-155: os.walk traversal is non-deterministic so the discovered
metadata order (matches list) can vary; make discovery deterministic by sorting
directory names and file names during the os.walk loop and then sort (or
otherwise order) the resulting matches before returning. Specifically, inside
the loop that iterates over os.walk(extensions_dir) where variables
extensions_dir, dirs, files, root_path, filename, and matches are used, replace
the implicit iteration with sorted(dirs) and sorted(files) (or sort dirs
in-place with dirs.sort()) so exploration order is stable, and finally apply a
deterministic ordering to matches (e.g., sort matches by path.name or full path)
before return so downstream consumers and logs are stable.
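
A minimal sketch of the deterministic traversal the nitpick describes, assuming the discovery loop only needs to collect files with a given name. The function name `find_metadata` and the default file name `package.json` are assumptions for illustration, not taken from scripts/seed-wechat-accounts.py.

```python
import os
from pathlib import Path
from typing import List


def find_metadata(extensions_dir: str, filename: str = "package.json") -> List[Path]:
    """Collect matching files under extensions_dir in a stable order."""
    matches: List[Path] = []
    for root_path, dirs, files in os.walk(extensions_dir):
        # Sorting in place changes the order in which os.walk descends
        # (this relies on the default topdown=True).
        dirs.sort()
        for name in sorted(files):
            if name == filename:
                matches.append(Path(root_path) / name)
    # Final sort so callers and logs see the same order on every run.
    return sorted(matches)
```

Sorting `dirs` in place matters because `os.walk` reads that same list to decide where to go next; assigning a new sorted list to the name would not affect the traversal.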

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a4ae8b62-f2a0-4a9b-96c6-93325c74d834

📥 Commits

Reviewing files that changed from the base of the PR and between 68b852f and 23f5dec.

📒 Files selected for processing (19)

nemoclaw-blueprint/openclaw-plugins/kimi-inference-compat/index.js
scripts/generate-openclaw-config.py
scripts/seed-wechat-accounts.py
src/lib/policy/index.ts
test/e2e/lib/openclaw-agent-json.py
test/e2e/lib/slack-api-proof.sh
test/e2e/test-bedrock-runtime-compatible-anthropic.sh
test/e2e/test-brave-search-e2e.sh
test/e2e/test-full-e2e.sh
test/e2e/test-issue-2478-crash-loop-recovery.sh
test/e2e/test-launchable-smoke.sh
test/e2e/test-messaging-compatible-endpoint.sh
test/e2e/test-openclaw-inference-switch.sh
test/e2e/test-sandbox-operations.sh
test/generate-openclaw-config.test.ts
test/kimi-inference-compat-plugin.test.ts
test/openclaw-agent-json.test.ts
test/policies.test.ts
test/seed-wechat-accounts.test.ts

✅ Files skipped from review due to trivial changes (1)

test/openclaw-agent-json.test.ts

github-actions · 2026-05-19T23:33:28Z

Selective E2E Results — ❌ Some jobs failed

Run: 26131407000
Target ref: upgrade/all-deps-2026-05-19
Requested jobs: all (no filter)
Summary: 27 passed, 1 failed, 2 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	✅ success
brave-search-e2e	✅ success
channels-stop-start-e2e	⚠️ cancelled
cloud-e2e	✅ success
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-migration-e2e	⚠️ cancelled
credential-sanitization-e2e	✅ success
device-auth-health-e2e	✅ success
diagnostics-e2e	⚠️ cancelled
docs-validation-e2e	✅ success
double-onboard-e2e	⚠️ cancelled
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
hermes-slack-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	⚠️ cancelled
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
messaging-providers-e2e	⚠️ cancelled
network-policy-e2e	⚠️ cancelled
onboard-repair-e2e	⚠️ cancelled
onboard-resume-e2e	✅ success
openclaw-inference-switch-e2e	❌ failure
openclaw-slack-pairing-e2e	✅ success
openshell-gateway-upgrade-e2e	⚠️ cancelled
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	✅ success
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	⚠️ cancelled
runtime-overrides-e2e	⚠️ cancelled
sandbox-operations-e2e	⚠️ cancelled
sandbox-survival-e2e	✅ success
shields-config-e2e	✅ success
skill-agent-e2e	✅ success
snapshot-commands-e2e	✅ success
state-backup-restore-e2e	⚠️ cancelled
telegram-injection-e2e	✅ success
token-rotation-e2e	⚠️ cancelled
tunnel-lifecycle-e2e	✅ success
upgrade-stale-sandbox-e2e	⚠️ cancelled

Failed jobs: openclaw-inference-switch-e2e. Check run artifacts for logs.

github-actions · 2026-05-19T23:35:28Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26131484758
Target ref: 23f5decae9fe1f0adb48e5bfd72ae479794e63ec
Workflow ref: main
Requested jobs: cloud-onboard-e2e,cloud-e2e,sandbox-operations-e2e,openshell-gateway-upgrade-e2e,issue-2478-crash-loop-recovery-e2e,kimi-inference-compat-e2e,openclaw-inference-switch-e2e,inference-routing-e2e,messaging-compatible-endpoint-e2e,channels-stop-start-e2e,network-policy-e2e,hermes-e2e,hermes-inference-switch-e2e,launchable-smoke-e2e,rebuild-openclaw-e2e,upgrade-stale-sandbox-e2e
Summary: 9 passed, 0 failed, 0 skipped

Job	Result
channels-stop-start-e2e	⚠️ cancelled
cloud-e2e	✅ success
cloud-onboard-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	⚠️ cancelled
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
network-policy-e2e	⚠️ cancelled
openclaw-inference-switch-e2e	✅ success
openshell-gateway-upgrade-e2e	⚠️ cancelled
rebuild-openclaw-e2e	⚠️ cancelled
sandbox-operations-e2e	⚠️ cancelled
upgrade-stale-sandbox-e2e	⚠️ cancelled

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/test-sandbox-operations.sh`:
- Around line 356-360: Update the TC-SBX-02 rationale comment block so it no
longer implies stderr is dropped: change the numbered points to state that
merged stdout/stderr is preserved for failure diagnostics and that assertions
target the JSON envelope payload text (not the merged stream), and keep the note
about relying on generated `thinkingDefault: off` for first-turn timing;
reference the existing rationale block around "Asserts on payload text..." and
the mention of "thinkingDefault: off" so the comment matches the new SSH capture
path that preserves merged output.
- Around line 380-381: The test currently swallows the SSH command exit status
with "|| true" which can yield false positives; modify the block that runs the
openclaw agent command (the line invoking "openclaw agent --agent main --json
--session-id '${session_id}' -m ...") to capture its exit code into a variable
(e.g., rc=$?), then assert success by requiring rc == 0 in the overall success
condition in addition to checking the output contains "42"; remove the "|| true"
suppression and make the success gate require both rc == 0 and the parsed output
match to prevent non-zero SSH exits from being treated as passes.
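
The success gate from the second comment can be sketched in Python (the test itself is a shell script). This assumes the command's merged stdout/stderr is kept for diagnostics and that a plain substring check stands in for the parsed JSON payload match; `run_and_capture` and `turn_passed` are hypothetical names.

```python
import subprocess
import sys
from typing import List, Tuple


def run_and_capture(cmd: List[str]) -> Tuple[int, str]:
    """Run cmd and return (exit code, merged stdout and stderr)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so failures stay diagnosable
        text=True,
    )
    return proc.returncode, proc.stdout


def turn_passed(rc: int, output: str, expected: str = "42") -> bool:
    """Pass only when the command exited 0 AND the expected text is present."""
    return rc == 0 and expected in output
```

For example, `turn_passed(*run_and_capture([sys.executable, "-c", "print(42); raise SystemExit(3)"]))` evaluates to False: the output contains "42", but the non-zero exit is no longer swallowed.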

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f3881ea7-6db1-46ad-b92f-14722324448b

📥 Commits

Reviewing files that changed from the base of the PR and between 23f5dec and 6cb5c32.

📒 Files selected for processing (2)

test/e2e/test-openclaw-inference-switch.sh
test/e2e/test-sandbox-operations.sh

github-actions · 2026-05-19T23:42:50Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26131699232
Target ref: upgrade/all-deps-2026-05-19
Requested jobs: openclaw-inference-switch-e2e,sandbox-operations-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
openclaw-inference-switch-e2e	✅ success
sandbox-operations-e2e	✅ success

github-actions · 2026-05-19T23:51:06Z

Selective E2E Results — ❌ Some jobs failed

Run: 26131773800
Target ref: 6cb5c324de2954d847b9e40afbda347d68f0acb6
Workflow ref: main
Requested jobs: cloud-e2e,cloud-onboard-e2e,openshell-gateway-upgrade-e2e,kimi-inference-compat-e2e,messaging-compatible-endpoint-e2e,network-policy-e2e,sandbox-operations-e2e,openclaw-inference-switch-e2e,hermes-e2e,launchable-smoke-e2e
Summary: 9 passed, 1 failed, 0 skipped

Job	Result
cloud-e2e	✅ success
cloud-onboard-e2e	✅ success
hermes-e2e	✅ success
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
network-policy-e2e	✅ success
openclaw-inference-switch-e2e	✅ success
openshell-gateway-upgrade-e2e	❌ failure
sandbox-operations-e2e	✅ success

Failed jobs: openshell-gateway-upgrade-e2e. Check run artifacts for logs.

github-actions · 2026-05-19T23:52:16Z

Selective E2E Results — ❌ Some jobs failed

Run: 26132116276
Target ref: upgrade/all-deps-2026-05-19
Requested jobs: all (no filter)
Summary: 30 passed, 1 failed, 2 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	✅ success
brave-search-e2e	✅ success
channels-stop-start-e2e	⚠️ cancelled
cloud-e2e	✅ success
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-migration-e2e	✅ success
credential-sanitization-e2e	✅ success
device-auth-health-e2e	✅ success
diagnostics-e2e	⚠️ cancelled
docs-validation-e2e	✅ success
double-onboard-e2e	⚠️ cancelled
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	❌ failure
hermes-slack-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	⚠️ cancelled
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
messaging-providers-e2e	⚠️ cancelled
network-policy-e2e	✅ success
onboard-repair-e2e	⚠️ cancelled
onboard-resume-e2e	✅ success
openclaw-inference-switch-e2e	✅ success
openclaw-slack-pairing-e2e	✅ success
openshell-gateway-upgrade-e2e	⚠️ cancelled
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	✅ success
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	⚠️ cancelled
runtime-overrides-e2e	✅ success
sandbox-operations-e2e	⚠️ cancelled
sandbox-survival-e2e	✅ success
shields-config-e2e	✅ success
skill-agent-e2e	✅ success
snapshot-commands-e2e	✅ success
state-backup-restore-e2e	⚠️ cancelled
telegram-injection-e2e	✅ success
token-rotation-e2e	⚠️ cancelled
tunnel-lifecycle-e2e	✅ success
upgrade-stale-sandbox-e2e	⚠️ cancelled

Failed jobs: hermes-inference-switch-e2e. Check run artifacts for logs.

github-actions · 2026-05-19T23:56:49Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26132424944
Target ref: upgrade/all-deps-2026-05-19
Requested jobs: hermes-inference-switch-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job	Result
hermes-inference-switch-e2e	✅ success

github-actions · 2026-05-20T00:14:11Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26132979480
Target ref: 6cb5c324de2954d847b9e40afbda347d68f0acb6
Workflow ref: upgrade/all-deps-2026-05-19
Requested jobs: rebuild-hermes-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job	Result
rebuild-hermes-e2e	✅ success

github-actions · 2026-05-20T00:32:07Z

Selective E2E Results — ❌ Some jobs failed

Run: 26132662112
Target ref: 6cb5c324de2954d847b9e40afbda347d68f0acb6
Workflow ref: upgrade/all-deps-2026-05-19
Requested jobs: all (no filter)
Summary: 41 passed, 2 failed, 2 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	✅ success
brave-search-e2e	✅ success
channels-stop-start-e2e	❌ failure
cloud-e2e	✅ success
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-migration-e2e	✅ success
credential-sanitization-e2e	✅ success
device-auth-health-e2e	✅ success
diagnostics-e2e	✅ success
docs-validation-e2e	✅ success
double-onboard-e2e	✅ success
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
hermes-slack-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	✅ success
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
onboard-repair-e2e	✅ success
onboard-resume-e2e	✅ success
openclaw-inference-switch-e2e	✅ success
openclaw-slack-pairing-e2e	✅ success
openshell-gateway-upgrade-e2e	✅ success
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	❌ failure
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	✅ success
runtime-overrides-e2e	✅ success
sandbox-operations-e2e	✅ success
sandbox-survival-e2e	✅ success
shields-config-e2e	✅ success
skill-agent-e2e	✅ success
snapshot-commands-e2e	✅ success
state-backup-restore-e2e	✅ success
telegram-injection-e2e	✅ success
token-rotation-e2e	✅ success
tunnel-lifecycle-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success

Failed jobs: channels-stop-start-e2e, rebuild-hermes-e2e. Check run artifacts for logs.

github-actions · 2026-05-20T00:48:45Z

Selective E2E Results — ❌ Some jobs failed

Run: 26134337857
Target ref: b4d043b39
Workflow ref: upgrade/all-deps-2026-05-19
Requested jobs: channels-stop-start-e2e,rebuild-hermes-e2e
Summary: 0 passed, 2 failed, 0 skipped

Job	Result
channels-stop-start-e2e	❌ failure
rebuild-hermes-e2e	❌ failure

Failed jobs: channels-stop-start-e2e, rebuild-hermes-e2e. Check run artifacts for logs.

github-actions · 2026-05-20T01:25:22Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26134434964
Target ref: b4d043b393832408208c9c3268765b21da56bfc3
Workflow ref: upgrade/all-deps-2026-05-19
Requested jobs: channels-stop-start-e2e,rebuild-hermes-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
channels-stop-start-e2e	✅ success
rebuild-hermes-e2e	✅ success

github-actions · 2026-05-20T01:59:56Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26135563540
Target ref: b4d043b393832408208c9c3268765b21da56bfc3
Workflow ref: upgrade/all-deps-2026-05-19
Requested jobs: all (no filter)
Summary: 43 passed, 0 failed, 2 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	✅ success
brave-search-e2e	✅ success
channels-stop-start-e2e	✅ success
cloud-e2e	✅ success
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-migration-e2e	✅ success
credential-sanitization-e2e	✅ success
device-auth-health-e2e	✅ success
diagnostics-e2e	✅ success
docs-validation-e2e	✅ success
double-onboard-e2e	✅ success
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
hermes-slack-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	✅ success
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
onboard-repair-e2e	✅ success
onboard-resume-e2e	✅ success
openclaw-inference-switch-e2e	✅ success
openclaw-slack-pairing-e2e	✅ success
openshell-gateway-upgrade-e2e	✅ success
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	✅ success
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	✅ success
runtime-overrides-e2e	✅ success
sandbox-operations-e2e	✅ success
sandbox-survival-e2e	✅ success
shields-config-e2e	✅ success
skill-agent-e2e	✅ success
snapshot-commands-e2e	✅ success
state-backup-restore-e2e	✅ success
telegram-injection-e2e	✅ success
token-rotation-e2e	✅ success
tunnel-lifecycle-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success

+
+  if (deltaSplit) {
+    if (!partialChanged) changed = applySafeExecSplitAtContentIndex(event.partial, deltaSplit) || changed;
+    if (!messageChanged) changed = applySafeExecSplitAtContentIndex(event.message, deltaSplit) || changed;


github-actions · 2026-05-20T03:31:04Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26138952562
Target ref: 2740a01224181afc40619c1995db94655c3a9397
Workflow ref: main
Requested jobs: cloud-e2e,openshell-gateway-upgrade-e2e,sandbox-operations-e2e,network-policy-e2e,kimi-inference-compat-e2e,messaging-providers-e2e,hermes-e2e,rebuild-openclaw-e2e,upgrade-stale-sandbox-e2e,launchable-smoke-e2e
Summary: 9 passed, 0 failed, 0 skipped

Job	Result
cloud-e2e	✅ success
hermes-e2e	✅ success
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
openshell-gateway-upgrade-e2e	⚠️ cancelled
rebuild-openclaw-e2e	✅ success
sandbox-operations-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success

github-actions · 2026-05-20T03:35:16Z

Selective E2E Results — ❌ Some jobs failed

Run: 26139099322
Target ref: 2740a01224181afc40619c1995db94655c3a9397
Workflow ref: upgrade/all-deps-2026-05-19
Requested jobs: all (no filter)
Summary: 41 passed, 1 failed, 2 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	✅ success
brave-search-e2e	✅ success
channels-stop-start-e2e	⚠️ cancelled
cloud-e2e	✅ success
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-migration-e2e	✅ success
credential-sanitization-e2e	✅ success
device-auth-health-e2e	✅ success
diagnostics-e2e	✅ success
docs-validation-e2e	✅ success
double-onboard-e2e	✅ success
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
hermes-slack-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	✅ success
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	❌ failure
messaging-compatible-endpoint-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
onboard-negative-paths-e2e	✅ success
onboard-repair-e2e	✅ success
onboard-resume-e2e	✅ success
openclaw-inference-switch-e2e	✅ success
openclaw-slack-pairing-e2e	✅ success
openshell-gateway-upgrade-e2e	✅ success
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	✅ success
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	✅ success
runtime-overrides-e2e	✅ success
sandbox-operations-e2e	✅ success
sandbox-survival-e2e	✅ success
shields-config-e2e	✅ success
skill-agent-e2e	✅ success
snapshot-commands-e2e	✅ success
state-backup-restore-e2e	✅ success
telegram-injection-e2e	✅ success
token-rotation-e2e	⚠️ cancelled
tunnel-lifecycle-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success

Failed jobs: launchable-smoke-e2e. Check run artifacts for logs.

github-actions · 2026-05-20T03:47:32Z

Selective E2E Results — ❌ Some jobs failed

Run: 26139491162
Target ref: 3236e08c1d7a91397b2f98cea5b918dec7625fb6
Workflow ref: main
Requested jobs: cloud-e2e,cloud-onboard-e2e,openshell-gateway-upgrade-e2e,launchable-smoke-e2e,sandbox-operations-e2e,issue-2478-crash-loop-recovery-e2e,kimi-inference-compat-e2e,messaging-compatible-endpoint-e2e,messaging-providers-e2e,credential-sanitization-e2e,network-policy-e2e,brave-search-e2e,openclaw-inference-switch-e2e,inference-routing-e2e,rebuild-openclaw-e2e,upgrade-stale-sandbox-e2e,hermes-e2e,hermes-inference-switch-e2e
Summary: 15 passed, 3 failed, 0 skipped

Job	Result
brave-search-e2e	✅ success
cloud-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-sanitization-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	❌ failure
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	✅ success
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
openclaw-inference-switch-e2e	❌ failure
openshell-gateway-upgrade-e2e	❌ failure
rebuild-openclaw-e2e	✅ success
sandbox-operations-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success

Failed jobs: hermes-inference-switch-e2e, openclaw-inference-switch-e2e, openshell-gateway-upgrade-e2e. Check run artifacts for logs.

github-actions · 2026-05-20T04:10:54Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26139623838
Target ref: 3236e08c1d7a91397b2f98cea5b918dec7625fb6
Workflow ref: upgrade/all-deps-2026-05-19
Requested jobs: all (no filter)
Summary: 44 passed, 0 failed, 2 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	✅ success
brave-search-e2e	✅ success
channels-stop-start-e2e	✅ success
cloud-e2e	✅ success
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-migration-e2e	✅ success
credential-sanitization-e2e	✅ success
device-auth-health-e2e	✅ success
diagnostics-e2e	✅ success
docs-validation-e2e	✅ success
double-onboard-e2e	✅ success
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
hermes-slack-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	✅ success
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
onboard-negative-paths-e2e	✅ success
onboard-repair-e2e	✅ success
onboard-resume-e2e	✅ success
openclaw-inference-switch-e2e	✅ success
openclaw-slack-pairing-e2e	✅ success
openshell-gateway-upgrade-e2e	✅ success
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	✅ success
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	✅ success
runtime-overrides-e2e	✅ success
sandbox-operations-e2e	✅ success
sandbox-survival-e2e	✅ success
shields-config-e2e	✅ success
skill-agent-e2e	✅ success
snapshot-commands-e2e	✅ success
state-backup-restore-e2e	✅ success
telegram-injection-e2e	✅ success
token-rotation-e2e	✅ success
tunnel-lifecycle-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success

ericksoa · 2026-05-20T04:18:27Z

Expanded Issue Sweep: Likely Resolved / Retest Candidates From This Upgrade

I did a second pass across all 296 open NemoClaw issues, this time starting from the capability deltas in the upgraded runtimes rather than only searching package names. The relevant new surface area is larger than the pins themselves:

OpenClaw 2026.4.24 -> 2026.5.18: includes the proxy/FormData fix, broader plugin install/runtime-deps repair, startup/readiness latency work, leaner prompt/tool metadata, model/reasoning/streaming fixes across OpenAI-compatible, Gemini, DeepSeek/GLM/Kimi-style providers, token/usage reporting improvements, and runtime parity/token-efficiency QA gates. Release refs: https://github.com/openclaw/openclaw/releases/tag/v2026.5.2 and https://github.com/openclaw/openclaw/releases/tag/v2026.5.18.
OpenShell 0.0.39 -> 0.0.44: includes cp-style sandbox download plus workspace-boundary checks, local-domain routing, VM/container/runtime recovery work, JSON/YAML sandbox listing, nftables-based policy enforcement, inference-stream truncation regression coverage, and provider credential refresh foundation. Release refs: https://github.com/NVIDIA/OpenShell/releases/tag/v0.0.41, https://github.com/NVIDIA/OpenShell/releases/tag/v0.0.43, and https://github.com/NVIDIA/OpenShell/releases/tag/v0.0.44.
Hermes v2026.4.23 -> v2026.5.16: includes the 0.14 Foundation release: materially lighter installs, about 19s off hermes cold start, cache-first model/provider setup, lazy heavy adapters, 180x faster browser CDP calls, more provider/model routing options, OpenAI-compatible local proxy for OAuth providers, Claude prompt caching, and messaging reliability improvements. Release ref: https://github.com/NousResearch/hermes-agent/releases/tag/v2026.5.16.

High-confidence likely fixed by this PR / close after a quick confirming retest

Bump bundled OpenClaw to ≥2026.5.7 — multipart uploads from sandbox break with [object FormData] body #3246 — asks specifically for bundled OpenClaw >=2026.5.7 for the upstream proxy/FormData multipart fix; this PR ships OpenClaw 2026.5.18.
EXDEV: cross-device link not permitted — plugin installer fails when /tmp is overlay and /sandbox is ext4 #3127 and [Ubuntu 22.04][Agent] openclaw CLI fails to start in sandbox — plugin runtime install errors #3513 — both are the OpenClaw 2026.4.24 plugin runtime-deps EXDEV failure family. This PR moves to 2026.5.18 and the green full nightly proves plugin install/runtime paths on the PR head; still worth one exact VirtualBox/Ubuntu reporter-env retest before closing.
[Ubuntu 24.04][Sandbox] openshell sandbox download produces a directory instead of file content + does not reject path-traversal targets #3345 — OpenShell 0.0.41 explicitly fixed cp-style openshell sandbox download behavior and workspace-boundary checks; this PR pins OpenShell 0.0.44.
[Bug] OpenShell egress proxy "failed to resolve peer binary" — blocks all new CONNECT tunnels including Telegram #1471 and [All Platforms] see lots of error in openclaw log #1642 — both are already labeled fixed-on-latest; this PR moves NemoClaw to the current validated OpenShell/OpenClaw line and full nightly messaging/provider/gateway jobs are green.
Loop detection in openclaw not working with MCP tools — upstream already fixed (openclaw#34574) #2601 — issue text says the MCP loop-detection bug was already fixed upstream in OpenClaw; this PR moves from the affected old OpenClaw line to 2026.5.18.
[nightly-e2e] WeChat channel registration lost after OpenClaw config mutations #3844 — this PR branch contains the WeChat config mutation fix from fix(openclaw): reapply WeChat channel seed after config mutations #3839 / ce03b01, keeps the OpenClaw WeChat plugin at 2.4.3, and the fresh full nightly has channels-stop-start-e2e green.

Strong retest candidates, but I would not close without targeted repro evidence

OpenClaw discards Ollama reasoning field — model output silently lost #247, NEMOCLAW_REASONING not configurable for Option 3 providers; reasoning-only models fail silently #3279, and Cannot use thinking models like Qwen 3.6 27B - Nemoclaw validation request token length too small. #3341 — reasoning/thinking response-shape issues for Ollama, OpenAI-compatible reasoning-only models, and Qwen. The upgraded OpenClaw line has a lot of reasoning/thinking/model-compat work, but NemoClaw may still need explicit provider/model setup in some paths.
Gemini flash 3 preview not supported? Consistent run error: 400 status code (no body) and getting ? for token usage. #1752, OpenClaw TUI shows no output; streamed NVIDIA inference fails with "error decoding response body" #736, [Jentson Orin][aarch64] OpenClaw TUI HTTP 503 "inference service unavailable" #1908, [NemoClaw][macOS][Agent&Skills] Streamed agent response truncated mid-sentence with no error event #2619, and [Station][Inference] TUI token counter shows ? after successful ollama-local inference instead of actual token usage #2747 — Gemini/OpenClaw streaming, timeout, truncated-response, and token-usage symptoms. These map to OpenClaw/OpenShell streaming/model/usage fixes and the PR head passed cloud-inference-e2e, openclaw-inference-switch-e2e, inference-routing-e2e, and both Bedrock-compatible jobs, but the exact user platforms should be rerun.
[DGX Spark][Agent&Skills] Trivial "hello" agent turn takes ~10s P50 / 17s max on local Ollama (nemotron-3-nano:30b) #2598 and Model performance / capability audit across supported agents #3123 — model performance/capability audit surface. This branch includes the runtime upgrades plus the mainline tool-catalog latency work from perf(nemotron): reduce sandbox tool-catalog latency #3808, so it should materially improve the baseline, but it is not a substitute for the audit matrix.
[Brev][Agent&Skills] Telegram provider initialization fails with Network request errors on v0.0.38 — deleteWebhook and setMyCommands never succeed #3339 — Telegram startup/init network errors on OpenClaw 2026.4.24; the OpenClaw release window includes Telegram startup/networking fixes and the PR nightly passed Telegram coverage, but the Brev setup should be retried.
[Nemoclaw] [All Platforms] onboard with hermes agent fails #3359 and [Nemoclaw] [All Platforms]Hermes onboarding: sandbox build succeeds but sandbox never reaches Ready and is deleted after 180s timeout #3764 — Hermes onboard/readiness failures. Hermes v2026.5.16 plus OpenShell 0.0.44 is a much better runtime target and the full nightly passed hermes-e2e, hermes-inference-switch-e2e, rebuild-hermes-e2e, and rebuild-hermes-stale-base-e2e.
[Ubuntu 24.04][Agent&Skills] NemoHermes Slack socket-mode idle reconnect silently drops inbound @mention messages #3582 — Hermes Slack idle reconnect/event delivery. Hermes 0.14 has gateway/platform resilience work and the nightly passed hermes-slack-e2e, but the exact idle-reconnect repro is narrower than current E2E coverage.
feat(messaging): add WeChat as a first-class provider for NemoClaw agents #3006 — this PR advances WeChat materially via plugin 2.4.3, channel seeding, config mutation survival, and nightly coverage. I would call it partial progress, not a full close of the first-class WeChat provider request unless the remaining CLI/docs/diagnostics acceptance criteria are already covered elsewhere.

Explicit non-claims from this sweep

I did not mark broader Slack issues (#1569, #2024, #2031, #3014, #3708, #3753) as resolved from the dependency upgrade alone. The PR has stronger Slack proof coverage and green messaging jobs, but those issues need live Slack token/socket/event-path repros before closure.

I also did not claim #3779, #3707, #3645, or #3850: #3779 points to a separate GLM compat PR, #3707 is primarily hardware/model throughput under a heavy local Qwen agent shape, #3645 already reports Hermes 2026.5.16, and #3850 is a NemoClaw onboarding feature request rather than something the runtime upgrade itself provides.

Validation reference: full nightly https://github.com/NVIDIA/NemoClaw/actions/runs/26139623838 passed on PR head 3236e08c1d7a91397b2f98cea5b918dec7625fb6.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-test.sh`:
- Around line 258-261: The test is asserting a non-existent field
restored.wizard.lastRunVersion; either make the snapshot restoration produce
that field in the OpenClaw config (ensure scripts/generate-openclaw-config.py
populates wizard.lastRunVersion before snapshot) or, more simply, change the
assertion to check an actual generated field such as (restored.meta ||
{}).lastTouchedVersion === '2026.3.11' (or another stable field created by the
generator) so the test validates a real restored value instead of undefined.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f0ad02e7-367f-4ae8-b4d9-44c038c22c59

📥 Commits

Reviewing files that changed from the base of the PR and between 6cb5c32 and 3236e08.

📒 Files selected for processing (20)

.agents/skills/nemoclaw-user-reference/references/commands.md
Dockerfile
docs/reference/commands.mdx
nemoclaw-blueprint/openclaw-plugins/kimi-inference-compat/index.js
scripts/generate-openclaw-config.py
scripts/patch-openclaw-tool-catalog.js
scripts/seed-wechat-accounts.py
src/lib/onboard.ts
test/e2e-test.sh
test/e2e/docs/parity-map.yaml
test/e2e/lib/openclaw-agent-json.py
test/e2e/test-issue-2478-crash-loop-recovery.sh
test/e2e/test-sandbox-operations.sh
test/fetch-guard-patch-regression.test.ts
test/generate-openclaw-config.test.ts
test/kimi-inference-compat-plugin.test.ts
test/openclaw-agent-json.test.ts
test/openclaw-tool-catalog-patch.test.ts
test/policies.test.ts
test/seed-wechat-accounts.test.ts

💤 Files with no reviewable changes (3)

test/seed-wechat-accounts.test.ts
test/openclaw-tool-catalog-patch.test.ts
test/policies.test.ts

✅ Files skipped from review due to trivial changes (1)

docs/reference/commands.mdx

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

github-actions · 2026-05-20T05:20:37Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26142871395
Target ref: d4df1e4443ff24202673c32c9d3646bc957294e7
Workflow ref: main
Requested jobs: cloud-e2e,openshell-gateway-upgrade-e2e,kimi-inference-compat-e2e,openclaw-inference-switch-e2e,messaging-compatible-endpoint-e2e,channels-stop-start-e2e,hermes-e2e,network-policy-e2e,sandbox-operations-e2e,rebuild-openclaw-e2e,launchable-smoke-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job	Result
channels-stop-start-e2e	⚠️ cancelled
cloud-e2e	⚠️ cancelled
hermes-e2e	⚠️ cancelled
kimi-inference-compat-e2e	⚠️ cancelled
launchable-smoke-e2e	⚠️ cancelled
messaging-compatible-endpoint-e2e	⚠️ cancelled
network-policy-e2e	⚠️ cancelled
openclaw-inference-switch-e2e	⚠️ cancelled
openshell-gateway-upgrade-e2e	⚠️ cancelled
rebuild-openclaw-e2e	⚠️ cancelled
sandbox-operations-e2e	⚠️ cancelled

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

github-actions · 2026-05-20T05:46:03Z

Selective E2E Results — ❌ Some jobs failed

Run: 26143011902
Target ref: d4df1e4443ff24202673c32c9d3646bc957294e7
Workflow ref: main
Requested jobs: all (no filter)
Summary: 42 passed, 1 failed, 2 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	✅ success
brave-search-e2e	✅ success
channels-stop-start-e2e	⚠️ cancelled
cloud-e2e	✅ success
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-migration-e2e	✅ success
credential-sanitization-e2e	✅ success
device-auth-health-e2e	✅ success
diagnostics-e2e	✅ success
docs-validation-e2e	✅ success
double-onboard-e2e	✅ success
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
hermes-slack-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	✅ success
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
onboard-negative-paths-e2e	✅ success
onboard-repair-e2e	✅ success
onboard-resume-e2e	✅ success
openclaw-inference-switch-e2e	✅ success
openclaw-slack-pairing-e2e	✅ success
openshell-gateway-upgrade-e2e	❌ failure
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	✅ success
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	✅ success
runtime-overrides-e2e	✅ success
sandbox-operations-e2e	✅ success
sandbox-survival-e2e	✅ success
shields-config-e2e	✅ success
skill-agent-e2e	✅ success
snapshot-commands-e2e	✅ success
state-backup-restore-e2e	✅ success
telegram-injection-e2e	✅ success
token-rotation-e2e	✅ success
tunnel-lifecycle-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success

Failed jobs: openshell-gateway-upgrade-e2e. Check run artifacts for logs.

coderabbitai

🧹 Nitpick comments (1)

test/e2e/test-openshell-gateway-upgrade.sh (1)

148-148: ⚡ Quick win

Consider using the CURRENT_OPENSHELL_VERSION variable in mock version strings.

The hardcoded version strings in the fake openshell binaries could become inconsistent if the default CURRENT_OPENSHELL_VERSION is updated but these mocks are not. Using the variable would ensure they stay synchronized and prevent misleading test output.

♻️ Suggested refactor to use variable expansion

For the first mock (around line 143):

-  cat >"$fake_bin/openshell" <<'EOF'
+  cat >"$fake_bin/openshell" <<EOF
 #!/usr/bin/env bash
 # request-body-credential-rewrite
 # websocket-credential-rewrite
-if [ "${1:-}" = "--version" ]; then
-  printf 'openshell 0.0.44\n'
+if [ "\${1:-}" = "--version" ]; then
+  printf 'openshell ${CURRENT_OPENSHELL_VERSION}\n'
   exit 0
 fi
 exit 99

Apply the same pattern to the second mock (around line 233):

-  cat >"$fake_bin/openshell" <<'EOF'
+  cat >"$fake_bin/openshell" <<EOF
 #!/usr/bin/env bash
 # request-body-credential-rewrite
 # websocket-credential-rewrite
-if [ "${1:-}" = "--version" ]; then
-  printf 'openshell 0.0.44\n'
+if [ "\${1:-}" = "--version" ]; then
+  printf 'openshell ${CURRENT_OPENSHELL_VERSION}\n'
   exit 0
 fi
 exit 99

Note: Change heredoc delimiter from <<'EOF' to <<EOF to enable variable expansion, and escape literal $ in the generated script with \$.
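As a rough illustration of the note above, the following self-contained sketch shows the unquoted-heredoc pattern on its own. The `write_fake_openshell` helper and the `0.0.44` fallback are assumptions made for this example; the real test script defines its own mock generators and its own `CURRENT_OPENSHELL_VERSION` default.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed fallback for illustration only; the real script sets its own default.
CURRENT_OPENSHELL_VERSION="${CURRENT_OPENSHELL_VERSION:-0.0.44}"

# Hypothetical helper: write a fake openshell binary into the given directory.
write_fake_openshell() {
  local fake_bin="$1"
  mkdir -p "$fake_bin"
  # Unquoted EOF: ${CURRENT_OPENSHELL_VERSION} is expanded while the file is
  # written, whereas \${1:-} is emitted as a literal ${1:-} so the generated
  # mock evaluates its own first argument at run time.
  cat >"$fake_bin/openshell" <<EOF
#!/usr/bin/env bash
if [ "\${1:-}" = "--version" ]; then
  printf 'openshell ${CURRENT_OPENSHELL_VERSION}\n'
  exit 0
fi
exit 99
EOF
  chmod +x "$fake_bin/openshell"
}
```

With a quoted delimiter (`<<'EOF'`) nothing in the body is expanded, which is why the original mocks had to hardcode the version string.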

Also applies to: 238-238

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/test-openshell-gateway-upgrade.sh` at line 148, Replace hardcoded
mock version strings with the CURRENT_OPENSHELL_VERSION variable: update the
printf call(s) that currently print 'openshell 0.0.44' to use printf "openshell
$CURRENT_OPENSHELL_VERSION\n" (or equivalent double-quoted expansion) and change
any surrounding heredoc delimiters from <<'EOF' to <<EOF so variable expansion
occurs; where the mock generates scripts that must contain literal dollar signs,
escape them as \$ to avoid unintended expansion. Target the mock generator
blocks that produce the fake openshell binaries and the printf lines (search for
printf 'openshell 0.0.44' and the heredoc blocks around them) and make these
substitutions to keep mocks in sync with CURRENT_OPENSHELL_VERSION.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4999ea9e-c2a9-4d7d-ad5e-d33341c4e5b3

📥 Commits

Reviewing files that changed from the base of the PR and between d4df1e4 and cd41b2b.

📒 Files selected for processing (1)

test/e2e/test-openshell-gateway-upgrade.sh

github-actions · 2026-05-20T05:48:57Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26143909397
Target ref: cd41b2b4f5ebf6758a62f4c3036c6817f244db11
Workflow ref: main
Requested jobs: cloud-e2e,cloud-onboard-e2e,openshell-gateway-upgrade-e2e,kimi-inference-compat-e2e,messaging-providers-e2e,messaging-compatible-endpoint-e2e,channels-stop-start-e2e,openclaw-inference-switch-e2e,bedrock-runtime-compatible-anthropic-e2e,network-policy-e2e,sandbox-operations-e2e,hermes-e2e,launchable-smoke-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	⚠️ cancelled
channels-stop-start-e2e	⚠️ cancelled
cloud-e2e	⚠️ cancelled
cloud-onboard-e2e	⚠️ cancelled
hermes-e2e	⚠️ cancelled
kimi-inference-compat-e2e	⚠️ cancelled
launchable-smoke-e2e	⚠️ cancelled
messaging-compatible-endpoint-e2e	⚠️ cancelled
messaging-providers-e2e	⚠️ cancelled
network-policy-e2e	⚠️ cancelled
openclaw-inference-switch-e2e	⚠️ cancelled
openshell-gateway-upgrade-e2e	⚠️ cancelled
sandbox-operations-e2e	⚠️ cancelled

github-actions · 2026-05-20T06:24:22Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26144040521
Target ref: cd41b2b4f5ebf6758a62f4c3036c6817f244db11
Workflow ref: main
Requested jobs: all (no filter)
Summary: 44 passed, 0 failed, 2 skipped

Job	Result
bedrock-runtime-compatible-anthropic-e2e	✅ success
brave-search-e2e	✅ success
channels-stop-start-e2e	✅ success
cloud-e2e	✅ success
cloud-inference-e2e	✅ success
cloud-onboard-e2e	✅ success
credential-migration-e2e	✅ success
credential-sanitization-e2e	✅ success
device-auth-health-e2e	✅ success
diagnostics-e2e	✅ success
docs-validation-e2e	✅ success
double-onboard-e2e	✅ success
gpu-double-onboard-e2e	⏭️ skipped
gpu-e2e	⏭️ skipped
hermes-discord-e2e	✅ success
hermes-e2e	✅ success
hermes-inference-switch-e2e	✅ success
hermes-slack-e2e	✅ success
inference-routing-e2e	✅ success
issue-2478-crash-loop-recovery-e2e	✅ success
kimi-inference-compat-e2e	✅ success
launchable-smoke-e2e	✅ success
messaging-compatible-endpoint-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
onboard-negative-paths-e2e	✅ success
onboard-repair-e2e	✅ success
onboard-resume-e2e	✅ success
openclaw-inference-switch-e2e	✅ success
openclaw-slack-pairing-e2e	✅ success
openshell-gateway-upgrade-e2e	✅ success
overlayfs-autofix-e2e	✅ success
rebuild-hermes-e2e	✅ success
rebuild-hermes-stale-base-e2e	✅ success
rebuild-openclaw-e2e	✅ success
runtime-overrides-e2e	✅ success
sandbox-operations-e2e	✅ success
sandbox-survival-e2e	✅ success
shields-config-e2e	✅ success
skill-agent-e2e	✅ success
snapshot-commands-e2e	✅ success
state-backup-restore-e2e	✅ success
telegram-injection-e2e	✅ success
token-rotation-e2e	✅ success
tunnel-lifecycle-e2e	✅ success
upgrade-stale-sandbox-e2e	✅ success

## Summary

- Reverts the squash commit from PR #3832 exactly: b7deb55
- Restores dependency/runtime versions and OpenClaw remediation files to the pre-#3832 state while preserving the later main commit fix(snapshot): use gateway metadata for VM-driver health checks (#3784)

## Verification

- git revert --signoff --no-edit b7deb55 applied cleanly from current origin/main
- git diff --check HEAD^ HEAD

Note: This PR intentionally undoes the merged dependency upgrade. It has not been merged.

## Summary by CodeRabbit

## Release Notes

* **Chores**
  * Updated OpenClaw to version 2026.4.24, OpenShell to 0.0.39, and Hermes to 2026.4.23.
  * Updated WeChat plugin dependency from 2.4.3 to 2.4.2.
  * Streamlined WeChat account configuration logic and refined tool-call handling in Kimi inference compatibility.
  * Updated internal test suites and validation scripts.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

chore: upgrade agent runtime dependencies

68b19c8

fix: update OpenClaw source patch for 2026.5.18

68b852f

coderabbitai Bot reviewed May 19, 2026

fix: remediate OpenClaw 2026.5.18 nightly drift

23f5dec

coderabbitai Bot reviewed May 19, 2026

Comment thread nemoclaw-blueprint/openclaw-plugins/kimi-inference-compat/index.js Outdated

Comment thread test/e2e/test-issue-2478-crash-loop-recovery.sh Outdated

Comment thread test/e2e/test-issue-2478-crash-loop-recovery.sh

Comment thread test/policies.test.ts Outdated

fix: avoid transient OpenClaw thinking flag in e2e

6cb5c32

coderabbitai Bot reviewed May 19, 2026

Comment thread test/e2e/test-sandbox-operations.sh

Comment thread test/e2e/test-sandbox-operations.sh Outdated

fix: seed preinstalled OpenClaw WeChat channel

b4d043b

ericksoa self-assigned this May 20, 2026

fix: resolve main merge and review feedback

4155bd2

github-advanced-security AI found potential problems May 20, 2026

fix: parse streamed OpenClaw agent JSON

3236e08

coderabbitai Bot reviewed May 20, 2026

Comment thread test/e2e-test.sh Outdated

cv approved these changes May 20, 2026

fix: validate restored openclaw snapshot content

d4df1e4

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

fix: use checked-out ref in gateway upgrade e2e

cd41b2b

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

coderabbitai Bot reviewed May 20, 2026

jyaunches merged commit b7deb55 into main May 20, 2026
30 checks passed

jyaunches deleted the upgrade/all-deps-2026-05-19 branch May 20, 2026 20:17

ericksoa mentioned this pull request May 20, 2026

Revert "chore: upgrade agent runtime dependencies (#3832)" #3924

Merged

ericksoa mentioned this pull request May 20, 2026

chore: upgrade agent runtime dependencies #3925

Merged

laitingsheng mentioned this pull request May 28, 2026

[Ubuntu 24.04][Sandbox] openshell sandbox download produces a directory instead of file content + does not reject path-traversal targets #3345

Closed

This was referenced Jun 11, 2026

Loop detection in openclaw not working with MCP tools — upstream already fixed (openclaw#34574) #2601

Closed

Bump bundled OpenClaw to ≥2026.5.7 — multipart uploads from sandbox break with [object FormData] body #3246

Closed
