test: release blocker validation run by ericksoa · Pull Request #4119 · NVIDIA/NemoClaw

ericksoa · 2026-05-23T02:39:09Z

Summary

temporary validation PR to run selective E2E coverage for release-blocker close calls
adds validation-only workflow wiring for OpenClaw TUI/chat correlation and for the optional GPU proof check tied to #3600 ([DGX Station][Onboard] cuInit(0) SIGSEGV in verifyDirectSandboxGpu — onboard aborts (v0.0.43 GB300 aarch64))
hardens the TUI/chat validation wrapper so PR runners use repo Vitest instead of transient npx installs
updates the TUI/chat live repro to current OpenClaw gateway protocol v4
hardens token-rotation output assertions after PR evidence showed a pipefail/grep harness false negative
not intended to merge into main as-is

Validation runs

Slack/main-head selective run: https://github.com/NVIDIA/NemoClaw/actions/runs/26320935520
PR-head Slack-only selective run: https://github.com/NVIDIA/NemoClaw/actions/runs/26321312107
PR-head full proof run: https://github.com/NVIDIA/NemoClaw/actions/runs/26321372633
PR-head corrected TUI/chat rerun v2: https://github.com/NVIDIA/NemoClaw/actions/runs/26321747082
PR-head corrected token-rotation rerun: https://github.com/NVIDIA/NemoClaw/actions/runs/26321830179
PR-head corrected TUI/chat rerun v3: https://github.com/NVIDIA/NemoClaw/actions/runs/26321864900

Current evidence

[DGX Station][Onboard] cuInit(0) SIGSEGV in verifyDirectSandboxGpu — onboard aborts (v0.0.43 GB300 aarch64) #3600: code-level optional GPU proof guard passed; platform hardware proof still required.
[Linux][Agent&Skills] TUI chat previous message disappears from UI after reconnect or scroll #2603/[All Platforms][TUI chat] TUI chat — sequential message response is not in order and runs twice #3145: TUI/chat v3 reached live assertions on OpenClaw 2026.5.18 and failed with empty final events for submitted runs; do not close.
[Sandbox][Windows ARM WSL] Slack onboarding has 2 policy gaps #2758: messaging providers, Slack pairing, network-policy, channels-stop-start, and corrected token-rotation all passed on PR-run evidence; close/sync candidate.

Issue gates

[Sandbox][Windows ARM WSL] Slack onboarding has 2 policy gaps #2758: network-policy-e2e, token-rotation-e2e, channels-stop-start-e2e, messaging-providers-e2e, openclaw-slack-pairing-e2e
[All Platforms][TUI chat] TUI chat — sequential message response is not in order and runs twice #3145/[Linux][Agent&Skills] TUI chat previous message disappears from UI after reconnect or scroll #2603: openclaw-tui-chat-correlation-e2e
[DGX Station][Onboard] cuInit(0) SIGSEGV in verifyDirectSandboxGpu — onboard aborts (v0.0.43 GB300 aarch64) #3600: issue-3600-gpu-proof-optional-e2e

This PR is for test evidence only.

Summary by CodeRabbit

New Features
- Added two conditional nightly E2E jobs (TUI/chat correlation and optional GPU-proof check).
- Included a build-time runtime patch step and added a runtime patch script to images.
CI / Chores
- Updated workflow dispatch inputs and wired new jobs into failure/reporting pipelines.
- Pinned Node setup action to a specific commit.
Tests
- Added E2E harness and unit tests for the patch and expanded chat-correlation regression checks.
Config
- Updated E2E path instructions to include the new job and related files.

copy-pr-bot · 2026-05-23T02:39:13Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

coderabbitai · 2026-05-23T02:39:15Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9f90a60d-cb9e-4e95-9312-c8ad455b5fa3

📥 Commits

Reviewing files that changed from the base of the PR and between bf94b84 and 1c6a7af.

📒 Files selected for processing (1)

.coderabbit.yaml

📝 Walkthrough

Adds a CLI patch that adjusts OpenClaw runtime JS for chat/send run-id correlation and idempotency, integrates the script into Docker and sandbox staging, provides unit and E2E tests (including a cloud-backed TUI correlation harness), extends correlation trace analysis, and wires two selective-dispatch nightly CI jobs into aggregators.

Changes

OpenClaw Chat Correlation Compatibility Layer

Layer / File(s)	Summary
Patch Script Core Implementation `scripts/patch-openclaw-chat-send.js`	Implements a CLI that scans a `dist` dir, finds exactly one `chat.send`, `get-reply`, and followup-runner runtime file each, injects run-id correlation, `idempotencyKey`, empty-final-event suppression, and queue-mode adjustments, and verifies injected markers.
Patch Script Testing `test/openclaw-chat-send-patch.test.ts`	Vitest suite writes runtime fixtures, runs the patch, asserts expected marker injections and idempotency, and includes a negative “fails closed” case for mismatched runtime shapes.
Patch Integration in Docker Build and Staging `Dockerfile`, `src/lib/sandbox/build-context.ts`, `test/sandbox-build-context.test.ts`, `.coderabbit.yaml`	Dockerfile copies and chmods `patch-openclaw-chat-send.js` and runs it at build time against OpenClaw `dist`; optimized sandbox staging now includes the script and tests/assertions/path_instructions are updated.
Live Correlation Trace Analysis Enhancement `test/openclaw-tui-chat-correlation.test.ts`	Extends `Issue2603Analysis` with `missingReplies`, `duplicateReplies`, `missingUserTurns`; reworks `analyzeIssue2603Trace` to compute these via visible/final-count maps; updates failure summaries, raises live repro websocket protocol to 4, and adds a delta+final idempotency unit test.
E2E Harness and CI Integration `test/e2e/test-openclaw-tui-chat-correlation.sh`, `.github/workflows/nightly-e2e.yaml`, `.coderabbit.yaml`	Adds a shell E2E harness that provisions a cloud sandbox, verifies OpenClaw `2026.5.18`, ensures dev deps, runs the live Vitest correlation test, and performs optional cleanup. Adds `openclaw-tui-chat-correlation-e2e` and `issue-3600-gpu-proof-optional-e2e` nightly jobs and wires them into `notify-on-failure`, `report-to-pr`, and `scorecard`; pins one `actions/setup-node` step to a commit SHA.
Test Refactoring and Regression Alignment `test/e2e/test-token-rotation.sh`, `test/fetch-guard-patch-regression.test.ts`	Replaces echo-pipe grep pipelines with here-string greps across token-rotation assertions and updates the Dockerfile patch extraction end-marker to the new chat.send patch block.
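
The "exactly one file each" rule and the "fails closed" test case in the table above can be illustrated with a small sketch. The names `selectSingleTarget`, `injectMarker`, and `patchDist`, the marker string, and the in-memory file map are all hypothetical and are not taken from `scripts/patch-openclaw-chat-send.js`. The sketch only shows the general technique of refusing to patch when the runtime shape is ambiguous, plus a marker check that keeps repeated runs idempotent.

```typescript
// Hypothetical sketch: pick exactly one candidate file per patch target and
// fail closed (throw) when zero or several files match.
type DistFiles = Record<string, string>; // file name -> file contents (in memory)

function selectSingleTarget(files: DistFiles, needle: string): string {
  const matches = Object.entries(files)
    .filter(([, text]) => text.includes(needle))
    .map(([name]) => name);
  if (matches.length !== 1) {
    // Refuse to guess: an unexpected runtime shape aborts the patch.
    throw new Error(`expected exactly one file containing "${needle}", found ${matches.length}`);
  }
  return matches[0];
}

// Inject a marker once; a second run leaves the source unchanged (idempotent).
function injectMarker(source: string, marker: string): string {
  return source.includes(marker) ? source : `${marker}\n${source}`;
}

function patchDist(files: DistFiles, needle: string, marker: string): DistFiles {
  const target = selectSingleTarget(files, needle);
  return { ...files, [target]: injectMarker(files[target], marker) };
}
```

Running `patchDist` twice on the same map yields the same result as running it once, and a map with two matching files throws instead of patching either one.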

Sequence Diagram

sequenceDiagram
  participant Dev as Patch Script
  participant Docker as Docker Build
  participant Dist as OpenClaw dist
  participant Sandbox as Provisioned Sandbox
  participant Vitest as Vitest Live Test
  participant Analysis as Correlation Analysis
  Dev->>Docker: COPY patch-openclaw-chat-send.js into image
  Docker->>Dist: RUN patch script against compiled dist
  Docker->>Sandbox: produce image with patched runtime
  Sandbox->>Vitest: run TUI correlation harness against 2026.5.18
  Vitest->>Sandbox: send prompts via websocket
  Sandbox->>Vitest: stream delta and final chat events
  Vitest->>Analysis: collect and analyze trace events
  Analysis->>Analysis: compute missing/duplicate/correlation metrics
  Vitest->>Vitest: assert metrics empty
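
The last three steps of the diagram (collect events, compute missing/duplicate metrics, assert that they are empty) can be sketched with a simple count map. The field names `missingReplies` and `duplicateReplies` come from the walkthrough; the event shape and the function name `analyzeTrace` are assumptions for illustration and do not reproduce `analyzeIssue2603Trace`.

```typescript
// Hypothetical event shape; the real trace format is not shown in this thread.
interface TraceEvent {
  runId: string;
  kind: "submitted" | "delta" | "final";
}

interface TraceAnalysis {
  missingReplies: string[];   // submitted runs that never received a final event
  duplicateReplies: string[]; // submitted runs that received more than one final event
}

function analyzeTrace(events: TraceEvent[]): TraceAnalysis {
  const submitted: string[] = [];
  const finalCounts = new Map<string, number>();
  for (const event of events) {
    if (event.kind === "submitted") submitted.push(event.runId);
    if (event.kind === "final") {
      finalCounts.set(event.runId, (finalCounts.get(event.runId) ?? 0) + 1);
    }
  }
  return {
    missingReplies: submitted.filter((id) => (finalCounts.get(id) ?? 0) === 0),
    duplicateReplies: submitted.filter((id) => (finalCounts.get(id) ?? 0) > 1),
  };
}
```

A passing run in this model is one where both arrays are empty; delta events are ignored because only final events decide whether a reply is missing or duplicated.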

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

NVIDIA/NemoClaw#3869: Dockerfile fetch-guard patch extraction and regression test adjustments align with changes to Dockerfile patch sections introduced in that PR.

Suggested labels

enhancement: testing, E2E, Integration: OpenClaw, Docker, NV QA

Suggested reviewers

jyaunches
cv

Poem

🐰 I hopped a tiny patch into the Claw tonight,
Run-ids snug and idempotent, replies aligned right,
Sandboxes spin up, Vitest taps the keys,
Traces match prompts and calm the test-day breeze,
Hooray — correlation sleeps tight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'test: release blocker validation run' accurately describes the main purpose of this PR, which is to add validation-only E2E coverage and testing infrastructure for release-blocker checks.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

github-actions · 2026-05-23T02:39:40Z

E2E Advisor Recommendation

Required E2E: openclaw-tui-chat-correlation-e2e, cloud-e2e, sandbox-survival-e2e, rebuild-openclaw-e2e, hermes-e2e
Optional E2E: token-rotation-e2e, issue-3600-gpu-proof-optional-e2e, openshell-gateway-upgrade-e2e

Dispatch hint: openclaw-tui-chat-correlation-e2e,cloud-e2e,sandbox-survival-e2e,rebuild-openclaw-e2e,hermes-e2e

Auto-dispatched E2E: cloud-e2e, sandbox-survival-e2e, rebuild-openclaw-e2e, hermes-e2e via nightly-e2e.yaml at 1c6a7afb195be5424fb497c0bee6227b46829169 — nightly run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

openclaw-tui-chat-correlation-e2e (high): Primary required coverage for the new OpenClaw chat.send patch. It builds/onboards a fresh OpenClaw sandbox and runs the live gateway/WebChat correlation harness against rapid sequential sends, validating that the patched behavior is actually baked into the sandbox image.
cloud-e2e (high): Dockerfile and optimized build-context changes affect the sandbox container image used by normal OpenClaw onboarding. Run the full onboard plus cloud inference path to catch image build, startup, and basic live inference regressions.
sandbox-survival-e2e (high): The container image change can affect sandbox boot and gateway/runtime resilience. This verifies the sandbox survives gateway restart and continues to serve workspace and inference flows.
rebuild-openclaw-e2e (high): The PR changes files copied into the sandbox build context and the Dockerfile patch sequence. Rebuild coverage is needed to verify OpenClaw workspace state survives image rebuilds with the new patch asset included.
hermes-e2e (high): Although the new shim targets OpenClaw, the Dockerfile is a shared sandbox image construction path. Run Hermes onboard and inference to ensure the image build changes do not regress the multi-agent runtime path.

Optional E2E

token-rotation-e2e (high): Only the E2E script assertions were refactored from pipe grep to here-string grep; no credential runtime logic changed. Useful to validate the edited harness, but not merge-blocking for product behavior.
issue-3600-gpu-proof-optional-e2e (low): The workflow adds this new selective-dispatch job. There is no corresponding onboard source change in this PR, so it is mainly useful to prove the new workflow job is wired correctly.
openshell-gateway-upgrade-e2e (high): The workflow changes the setup-node action pin for this job only. Optional smoke of the edited job definition; the PR does not change gateway upgrade runtime code.

New E2E recommendations

None.

Dispatch hint

Workflow: nightly-e2e.yaml
jobs input: openclaw-tui-chat-correlation-e2e,cloud-e2e,sandbox-survival-e2e,rebuild-openclaw-e2e,hermes-e2e
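
A minimal sketch of how a comma-separated `jobs` input like the one above might gate a single job. The functions `parseJobsInput` and `shouldRunJob` are hypothetical; the real condition lives in `nightly-e2e.yaml` as a workflow expression and is not reproduced in this thread. The sketch assumes that non-dispatch events (such as the nightly schedule) run every job, and, as a further assumption, that an empty `jobs` input on a manual dispatch also selects every job.

```typescript
// Split "a, b,c" into ["a", "b", "c"], dropping empty entries.
function parseJobsInput(jobsInput: string): string[] {
  return jobsInput
    .split(",")
    .map((job) => job.trim())
    .filter((job) => job.length > 0);
}

function shouldRunJob(eventName: string, jobsInput: string, jobId: string): boolean {
  // Any event other than a manual dispatch (for example the schedule) runs all jobs.
  if (eventName !== "workflow_dispatch") return true;
  const requested = parseJobsInput(jobsInput);
  // Assumption: an empty list on manual dispatch means "run everything".
  return requested.length === 0 || requested.includes(jobId);
}
```

Under these assumptions, any job added with this kind of predicate also runs on every scheduled nightly, which matters for jobs that are meant to be validation-only.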

github-actions · 2026-05-23T02:39:41Z

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

None. No scenario workflow, scenario metadata, scenario runtime, or validation-suite files changed.

Optional scenario E2E

None.

Relevant changed files

None.

github-actions · 2026-05-23T02:40:12Z

PR Review Advisor

Findings: 3 needs attention, 1 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 4 still apply, 0 new items found

Review findings

🛠️ Needs attention

Secret-bearing workflow runs code from target_ref (.github/workflows/nightly-e2e.yaml:407): The OpenClaw TUI/chat job checks out `${{ inputs.target_ref || github.ref }}`, installs dependencies, and runs scripts from that checked-out tree while exposing `NVIDIA_API_KEY` and `GITHUB_TOKEN`. The workflow documents `target_ref` as a way for the trusted main workflow to test a PR head SHA; if that SHA contains untrusted changes, the checked-out `package.json`, npm lifecycle scripts, bash harness, or Vitest tests can read or exfiltrate secrets or abuse checkout credentials. This expands the trusted-code boundary in a high-risk E2E path. This finding was present in the previous advisor review and still applies.
- Recommendation: Keep workflow and harness code trusted: run the E2E driver from `origin/main` or another reviewed ref, check out the candidate target into a separate code-under-test directory, disable persisted checkout credentials where they are not needed, and do not expose repository secrets to scripts sourced from PR-controlled refs. If PR-head execution is required, gate it behind maintainer-only approval and a secret-free mode.
- Evidence: The job uses `actions/checkout` with `ref: ${{ inputs.target_ref || github.ref }}`; later steps run `npm ci --include=dev` and `bash test/e2e/test-openclaw-tui-chat-correlation.sh` from that checkout with `NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}` and `GITHUB_TOKEN: ${{ github.token }}`.
[DGX Station][Onboard] cuInit(0) SIGSEGV in verifyDirectSandboxGpu — onboard aborts (v0.0.43 GB300 aarch64) #3600 validation is only a source-order check (.github/workflows/nightly-e2e.yaml:466): The linked [DGX Station][Onboard] cuInit(0) SIGSEGV in verifyDirectSandboxGpu — onboard aborts (v0.0.43 GB300 aarch64) #3600 issue expects onboard to complete on the affected GPU path after an optional direct GPU proof failure. The added `issue-3600-gpu-proof-optional-e2e` job does not run onboard, does not exercise `verifyDirectSandboxGpu`, and does not simulate the optional proof failure path; it only scans source text for one string before another string. That is not sufficient evidence for the issue gate described by the issue and closure comment. This finding was present in the previous advisor review and still applies.
- Recommendation: Replace or supplement this with a behavioral test that invokes the relevant onboard/GPU-proof path with an optional proof failure and asserts onboard continues without the fatal throw. If GB300 hardware is unavailable, use a focused unit/integration harness around `verifyDirectSandboxGpu` or its caller rather than a string-position test.
- Evidence: The job reads `src/lib/onboard.ts`, slices from `function verifyDirectSandboxGpu`, and checks that `if (proof.optional === true) return;` appears before ``throw new Error(`GPU proof failed:``; it never executes onboard or a verifier command.
Validation-only changes would become scheduled nightly behavior (.github/workflows/nightly-e2e.yaml:405): The PR description states this is test-evidence-only and not intended to merge into main as-is, but the diff adds production workflow jobs and a Dockerfile OpenClaw runtime patch that would run or ship from main if merged. The new workflow predicates run on scheduled events whenever the job list is empty because the condition allows all non-`workflow_dispatch` events. This finding was present in the previous advisor review and still applies.
- Recommendation: Before merging to main, either remove the validation-only workflow/Dockerfile wiring or convert it into permanent, reviewed coverage with the trusted-code boundary and acceptance gaps fixed. If the PR is only for evidence collection, keep it out of main and avoid landing the scheduled-job additions.
- Evidence: The PR body says `not intended to merge into main as-is` and `This PR is for test evidence only.` The diff adds `openclaw-tui-chat-correlation-e2e` and `issue-3600-gpu-proof-optional-e2e` under the nightly workflow with predicates that run for non-dispatch events, and Dockerfile now executes `node /usr/local/lib/nemoclaw/patch-openclaw-chat-send.js` during sandbox image builds.
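
The source-order check described in the #3600 finding can be approximated as follows, which also shows why it is weak evidence. The two search strings are quoted from the Evidence line; the function name `guardPrecedesThrow` and the sample source are hypothetical, and the actual job script is not shown in this thread.

```typescript
// Returns true when the optional-proof guard appears textually before the fatal
// throw inside the slice that starts at verifyDirectSandboxGpu.
function guardPrecedesThrow(source: string): boolean {
  const start = source.indexOf("function verifyDirectSandboxGpu");
  if (start === -1) return false;
  const body = source.slice(start);
  const guard = body.indexOf("if (proof.optional === true) return;");
  const fatal = body.indexOf("throw new Error(`GPU proof failed:");
  return guard !== -1 && fatal !== -1 && guard < fatal;
}

// Hypothetical source in which the guard is unreachable at run time.
// The textual check still passes, because it never executes the function.
const unreachableGuard = [
  "function verifyDirectSandboxGpu(proof) {",
  "  if (false) {",
  "    if (proof.optional === true) return;",
  "  }",
  "  throw new Error(`GPU proof failed: ${proof.reason}`);",
  "}",
].join("\n");

const passesTextualCheck = guardPrecedesThrow(unreachableGuard);
```

Here `passesTextualCheck` is `true` even though the sample function would always throw, which is the gap a behavioral test around `verifyDirectSandboxGpu` or its caller would close.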

🔎 Worth checking

Nightly E2E hard-codes a specific OpenClaw version (test/e2e/test-openclaw-tui-chat-correlation.sh:47): The new nightly wrapper requires the sandbox OpenClaw version output to contain `2026.5.18`. If this lands as a scheduled nightly job, normal OpenClaw updates can break the validation for reasons unrelated to TUI/chat correlation, and the test may fail before reaching the regression assertions. This finding was present in the previous advisor review and still applies.
- Recommendation: Avoid hard-coding a single release version in a scheduled nightly test, or make the expected version an explicit workflow input/environment variable for one-off validation. For permanent coverage, assert required capabilities or protocol behavior instead of an exact version string.
- Evidence: The script runs `openshell sandbox exec --name "$SANDBOX_NAME" -- openclaw --version` and exits unless `grep -q "2026.5.18"` succeeds.
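
One way to follow the recommendation is to make the expected version an optional parameter rather than a constant. This is a sketch only: `versionMatches` is a hypothetical helper, and how the expected value would reach it (a workflow input, an environment variable) is left open.

```typescript
// Hypothetical helper: compare `openclaw --version` output against an optional
// expected version supplied by the caller.
function versionMatches(versionOutput: string, expected?: string): boolean {
  // No pin configured: accept any version so routine upgrades do not fail the run.
  if (expected === undefined || expected.trim() === "") return true;
  // Literal substring match, so "." is not treated as a wildcard.
  return versionOutput.includes(expected.trim());
}
```

The literal match is deliberate. In `grep -q "2026.5.18"` each `.` is a regular-expression wildcard that matches any character; `grep -F` would give the literal behavior in the shell harness.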

🌱 Nice ideas

None.

This is an automated advisory review. A human maintainer must make the final merge decision.

github-actions · 2026-05-23T02:43:16Z

Selective E2E Results — ❌ Some jobs failed

Run: 26321351519
Target ref: 3f5db4fe66a851e251d73b2568846de37cba0206
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: network-policy-e2e,token-rotation-e2e,channels-stop-start-e2e,messaging-providers-e2e,openclaw-slack-pairing-e2e,openclaw-tui-chat-correlation-e2e,issue-3600-gpu-proof-optional-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job	Result
channels-stop-start-e2e	⚠️ cancelled
messaging-providers-e2e	⚠️ cancelled
network-policy-e2e	⚠️ cancelled
openclaw-slack-pairing-e2e	⚠️ cancelled
token-rotation-e2e	⚠️ cancelled
openclaw-tui-chat-correlation-e2e	❓ not reported
issue-3600-gpu-proof-optional-e2e	❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e, issue-3600-gpu-proof-optional-e2e. The reporting workflow needs to include these jobs.
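
The report above shows a failed headline next to "0 passed, 0 failed, 0 skipped". A sketch of a summarizer makes the reason visible: cancelled and unreported jobs fall outside all three counters. The types and the `summarize` function are hypothetical; the reporting workflow's own code is not shown in this thread.

```typescript
type JobResult = "success" | "failure" | "skipped" | "cancelled";

interface Report {
  passed: number;
  failed: number;
  skipped: number;
  cancelled: string[];
  missing: string[]; // requested jobs with no reported result
  allPassed: boolean;
}

function summarize(requested: string[], results: Partial<Record<string, JobResult>>): Report {
  const report: Report = {
    passed: 0, failed: 0, skipped: 0, cancelled: [], missing: [], allPassed: false,
  };
  for (const job of requested) {
    const result = results[job];
    if (result === undefined) report.missing.push(job);
    else if (result === "success") report.passed += 1;
    else if (result === "failure") report.failed += 1;
    else if (result === "skipped") report.skipped += 1;
    else report.cancelled.push(job);
  }
  // Only a success for every requested job counts as an overall pass.
  report.allPassed = report.passed === requested.length;
  return report;
}
```

Fed the seven requested jobs and five cancelled results from this run, the sketch yields zero for all three counters, five cancelled jobs, two missing jobs, and `allPassed === false`.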

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

github-actions · 2026-05-23T03:01:56Z

Selective E2E Results — ❌ Some jobs failed

Run: 26321747082
Target ref: 8a1e0a4ac133eea5dab8ae88b9937ba1a1ba4346
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job	Result
openclaw-tui-chat-correlation-e2e	❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

github-actions · 2026-05-23T03:07:41Z

Selective E2E Results — ❌ Some jobs failed

Run: 26321864900
Target ref: 8781bcb18689abf68c1a4ff4766890dc5acea336
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job	Result
openclaw-tui-chat-correlation-e2e	❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

github-actions · 2026-05-23T03:12:38Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26321312107
Target ref: 3f579fe8267c1696b5280b8cf0016d97e4de9a31
Workflow ref: main
Requested jobs: network-policy-e2e,token-rotation-e2e,channels-stop-start-e2e,messaging-providers-e2e,openclaw-slack-pairing-e2e
Summary: 5 passed, 0 failed, 0 skipped

Job	Result
channels-stop-start-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
openclaw-slack-pairing-e2e	✅ success
token-rotation-e2e	✅ success

github-actions · 2026-05-23T03:18:26Z

Selective E2E Results — ❌ Some jobs failed

Run: 26321372633
Target ref: d724cfead66fc78f9962fbbe3bae9b163babbfb3
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: network-policy-e2e,token-rotation-e2e,channels-stop-start-e2e,messaging-providers-e2e,openclaw-slack-pairing-e2e,openclaw-tui-chat-correlation-e2e,issue-3600-gpu-proof-optional-e2e
Summary: 4 passed, 1 failed, 0 skipped

Job	Result
channels-stop-start-e2e	✅ success
messaging-providers-e2e	✅ success
network-policy-e2e	✅ success
openclaw-slack-pairing-e2e	✅ success
token-rotation-e2e	❌ failure
openclaw-tui-chat-correlation-e2e	❓ not reported
issue-3600-gpu-proof-optional-e2e	❓ not reported

Failed jobs: token-rotation-e2e. Check run artifacts for logs.
Missing requested jobs: openclaw-tui-chat-correlation-e2e, issue-3600-gpu-proof-optional-e2e. The reporting workflow needs to include these jobs.

github-actions · 2026-05-23T03:26:22Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26321830179
Target ref: 42ac24a7a32e350b9be486d140fbe2af397eaab4
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: token-rotation-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job	Result
token-rotation-e2e	✅ success

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

github-actions · 2026-05-23T04:04:54Z

Selective E2E Results — ❌ Some jobs failed

Run: 26322997840
Target ref: 36aa2cee58bbc00f6eea2794d5e55fa71213e723
Workflow ref: main
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job	Result
openclaw-tui-chat-correlation-e2e	❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

github-actions · 2026-05-23T04:05:57Z

Selective E2E Results — ❌ Some jobs failed

Run: 26323019724
Target ref: 36aa2cee58bbc00f6eea2794d5e55fa71213e723
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job	Result
openclaw-tui-chat-correlation-e2e	❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

github-actions · 2026-05-23T04:24:42Z

Selective E2E Results — ❌ Some jobs failed

Run: 26323387646
Target ref: 2622726d36017a967eacf2e737c56ca921bd1da2
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job	Result
openclaw-tui-chat-correlation-e2e	❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

github-actions · 2026-05-23T04:41:04Z

Selective E2E Results — ❌ Some jobs failed

Run: 26323703975
Target ref: efe4ca78fb93bdbcca32a2392f0d13487e7298b9
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job	Result
openclaw-tui-chat-correlation-e2e	❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

github-actions · 2026-05-23T05:08:00Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26324116054
Target ref: efe4ca78fb93bdbcca32a2392f0d13487e7298b9
Workflow ref: main
Requested jobs: cloud-onboard-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job	Result
cloud-onboard-e2e	✅ success

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

.github/workflows/nightly-e2e.yaml (2)
82-109: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add the new issue-3600-gpu-proof-optional-e2e job ID to workflow_dispatch.inputs.jobs valid list.

The selective-dispatch jobs contract is now incomplete: one newly added dispatchable job is missing from the documented valid list.

As per coding guidelines, "Keep the selective-dispatch “jobs” contract consistent: add/rename E2E job IDs in workflow_dispatch.inputs.jobs."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/nightly-e2e.yaml around lines 82 - 109, The
workflow_dispatch.inputs.jobs valid list is missing the new job ID; update the
list (the "Valid:" array used for workflow_dispatch.inputs.jobs) to include the
string "issue-3600-gpu-proof-optional-e2e" so the selective-dispatch contract
remains consistent with the added dispatchable job; ensure the new ID is added
alongside the other comma-separated job IDs in the same formatting style.
2439-2490: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include both new E2E jobs in aggregate needs lists (notify-on-failure, report-to-pr, scorecard).

Right now, failures from openclaw-tui-chat-correlation-e2e and issue-3600-gpu-proof-optional-e2e are omitted from nightly notifications, PR summary comments, and scorecard aggregation.

Also applies to: 2539-2590, 2696-2747
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/nightly-e2e.yaml around lines 2439 - 2490, The nightly
workflow's aggregate "needs" arrays are missing the two new E2E jobs so their
failures are not included in notifications, PR report, or scorecard; update the
needs lists used by the notify/report/scorecard jobs (the arrays currently
listing many e2e jobs) to include both "openclaw-tui-chat-correlation-e2e" and
"issue-3600-gpu-proof-optional-e2e", and apply the same addition in the other
two mirrored blocks noted (the other arrays at the same sections referenced) so
all three aggregate job definitions ("notify-on-failure", "report-to-pr",
"scorecard") include these two job names.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Line 417: Replace the mutable tag "actions/setup-node@v6" with a pinned commit
SHA (or digest) wherever it appears in the workflow (the three occurrences of
the uses entry "actions/setup-node@v6"); update each "uses:" line to reference
the full commit SHA string for actions/setup-node to prevent mutable-tag drift
and ensure reproducible runs.

---

Outside diff comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 82-109: The workflow_dispatch.inputs.jobs valid list is missing
the new job ID; update the list (the "Valid:" array used for
workflow_dispatch.inputs.jobs) to include the string
"issue-3600-gpu-proof-optional-e2e" so the selective-dispatch contract remains
consistent with the added dispatchable job; ensure the new ID is added alongside
the other comma-separated job IDs in the same formatting style.
- Around line 2439-2490: The nightly workflow's aggregate "needs" arrays are
missing the two new E2E jobs so their failures are not included in
notifications, PR report, or scorecard; update the needs lists used by the
notify/report/scorecard jobs (the arrays currently listing many e2e jobs) to
include both "openclaw-tui-chat-correlation-e2e" and
"issue-3600-gpu-proof-optional-e2e", and apply the same addition in the other
two mirrored blocks noted (the other arrays at the same sections referenced) so
all three aggregate job definitions ("notify-on-failure", "report-to-pr",
"scorecard") include these two job names.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: aac65153-9c15-45fc-b0a3-8745fbbed8c2

📥 Commits

Reviewing files that changed from the base of the PR and between c70c62c and efe4ca7.

📒 Files selected for processing (10)

.github/workflows/nightly-e2e.yaml
Dockerfile
scripts/patch-openclaw-chat-send.js
src/lib/sandbox/build-context.ts
test/e2e/test-openclaw-tui-chat-correlation.sh
test/e2e/test-token-rotation.sh
test/fetch-guard-patch-regression.test.ts
test/openclaw-chat-send-patch.test.ts
test/openclaw-tui-chat-correlation.test.ts
test/sandbox-build-context.test.ts

…-validation-20260522

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

coderabbitai

🧹 Nitpick comments (1)

.github/workflows/nightly-e2e.yaml (1)
470-478: ⚡ Quick win

Avoid binding this validation to the next function name.

This slice ends at the literal function upsertMessagingProviders, so an unrelated rename/reorder/formatting change in src/lib/onboard.ts will fail the workflow even if verifyDirectSandboxGpu still keeps the optional guard before the throw. Bound the check to verifyDirectSandboxGpu itself instead of the neighboring symbol.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/nightly-e2e.yaml around lines 470 - 478, The check
currently slices source up to the literal "function upsertMessagingProviders",
which couples the test to a neighboring symbol; instead locate the "function
verifyDirectSandboxGpu" start (already found in variable start) and compute the
function end by finding its matching closing brace (e.g., by scanning forward
and tracking brace depth) so you set end based on the matched '}' for
verifyDirectSandboxGpu and then slice into fn and run the same optional/throw
checks; update the logic that computes end (and the variables end/fn) so the
validation only inspects verifyDirectSandboxGpu itself rather than the next
function name.
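
The brace-depth scan described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code that landed in the workflow: `sliceFunction` is a hypothetical helper, the sample source is invented, and the scan does not understand braces inside strings, comments, template literals, or parameter type annotations.

```typescript
// Return the source text of `function <name>` from its declaration through
// its matching closing brace, or null if it cannot be found.
// Limitations: braces inside strings, comments, template literals, or
// parameter type annotations are counted like any other brace, and the name
// is matched as a plain prefix.
function sliceFunction(source: string, name: string): string | null {
  const start = source.indexOf(`function ${name}`);
  if (start === -1) return null;
  const open = source.indexOf("{", start);
  if (open === -1) return null;
  let depth = 0;
  for (let i = open; i < source.length; i++) {
    if (source[i] === "{") {
      depth++;
    } else if (source[i] === "}" && --depth === 0) {
      return source.slice(start, i + 1);
    }
  }
  return null; // unbalanced braces
}

// Invented sample standing in for src/lib/onboard.ts.
const sampleSource = `
function verifyDirectSandboxGpu(sandbox) {
  if (optional) { return; }
  throw new Error("gpu proof failed");
}
function upsertMessagingProviders() { return {}; }
`;

const fnText = sliceFunction(sampleSource, "verifyDirectSandboxGpu");
console.log(fnText);
```

Because the slice ends at the function's own closing brace, renaming or reordering the neighboring function no longer affects the check.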

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c5138a20-8977-44c9-967b-e13c8b9c2b44

📥 Commits

Reviewing files that changed from the base of the PR and between efe4ca7 and 93bedf2.

📒 Files selected for processing (1)

.github/workflows/nightly-e2e.yaml

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

github-actions · 2026-05-23T05:29:38Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26324533661
Target ref: 93bedf29a16a919fc3716f91accb87ff8ef15c36
Workflow ref: main
Requested jobs: cloud-onboard-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job	Result
cloud-onboard-e2e	✅ success

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.github/workflows/nightly-e2e.yaml (1)
403-496: ⚠️ Potential issue | 🟠 Major

Update .coderabbit.yaml path_instructions for the new nightly E2E jobs

.coderabbit.yaml has a path: "src/lib/onboard.ts" entry, but it doesn’t mention/route issue-3600-gpu-proof-optional-e2e.

There are no path_instructions matches for openclaw-tui-chat-correlation-e2e or its executed script test/e2e/test-openclaw-tui-chat-correlation.sh, so selective review routing for changes in that area will miss these jobs.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/nightly-e2e.yaml around lines 403 - 496, The
.coderabbit.yaml path_instructions have no entries for the new nightly E2E
jobs, so changes to the relevant files won't be routed. Update .coderabbit.yaml
to add path_instructions that match the new job targets: include an entry
mapping "openclaw-tui-chat-correlation-e2e" to the test script path
test/e2e/test-openclaw-tui-chat-correlation.sh (and any related sandbox names),
and add an entry mapping "issue-3600-gpu-proof-optional-e2e" to
src/lib/onboard.ts so changes to the verifyDirectSandboxGpu function are routed.
Ensure the match patterns include both the job names and the specific paths
(test/e2e/... and src/lib/onboard.ts) so selective review triggers correctly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7abe4b6e-1d63-48fd-8b51-a6fd7338b893

📥 Commits

Reviewing files that changed from the base of the PR and between 93bedf2 and bf94b84.

📒 Files selected for processing (2)

.github/workflows/nightly-e2e.yaml
scripts/patch-openclaw-chat-send.js

💤 Files with no reviewable changes (1)

scripts/patch-openclaw-chat-send.js

github-actions · 2026-05-23T05:37:12Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26324678236
Target ref: bf94b84fa17b94dba51294362431ca9f25737209
Workflow ref: main
Requested jobs: cloud-onboard-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job	Result
cloud-onboard-e2e	✅ success

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

github-actions · 2026-05-23T05:55:24Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26324884099
Target ref: 1c6a7afb195be5424fb497c0bee6227b46829169
Workflow ref: main
Requested jobs: cloud-e2e,sandbox-survival-e2e,rebuild-openclaw-e2e,hermes-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job	Result
cloud-e2e	✅ success
hermes-e2e	✅ success
rebuild-openclaw-e2e	✅ success
sandbox-survival-e2e	✅ success

test: trigger release blocker validation

3f579fe

ericksoa added 2 commits May 22, 2026 19:41

test: add release blocker validation e2es

3f5db4f

test: fix validation workflow setup-node action

d724cfe

test: harden tui chat validation harness

8a1e0a4

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

ericksoa added 2 commits May 22, 2026 20:05

test: harden token rotation output assertions

42ac24a

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

test: update tui chat repro protocol

8781bcb

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

This was referenced May 23, 2026

[Sandbox][Windows ARM WSL] Slack onboarding has 2 policy gaps #2758

Closed

[DGX Station][Onboard] cuInit(0) SIGSEGV in verifyDirectSandboxGpu — onboard aborts (v0.0.43 GB300 aarch64) #3600

Closed

fix(openclaw): shim chat send empty finals

36aa2ce

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

fix(openclaw): preserve chat send run ids in queue

2622726

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

fix(openclaw): queue chat send turns separately

efe4ca7

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

ericksoa added the v0.0.51 Release target label May 23, 2026

ericksoa marked this pull request as ready for review May 23, 2026 05:01

coderabbitai Bot reviewed May 23, 2026

Comment thread .github/workflows/nightly-e2e.yaml Outdated

ericksoa added 2 commits May 22, 2026 22:21

Merge remote-tracking branch 'origin/main' into codex/release-blocker…

5f2b1be

…-validation-20260522

ci: include new nightly jobs in aggregates

93bedf2

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

coderabbitai Bot reviewed May 23, 2026

fix: mark OpenClaw chat send patch executable

bc95dec

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

test: decouple gpu proof check from neighbor function

bf94b84

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

coderabbitai Bot reviewed May 23, 2026

ericksoa requested a review from cv May 23, 2026 05:38

chore: route new e2e review hints

1c6a7af

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

cv enabled auto-merge (squash) May 23, 2026 19:59

cv approved these changes May 23, 2026

cv merged commit 8d07c62 into main May 23, 2026
29 checks passed

This was referenced May 24, 2026

ci(e2e): reuse nightly script runner workflow #4151

Merged

fix(e2e): install workflow target ref in public onboard tests #4214

Merged

jyaunches mentioned this pull request May 26, 2026

chore: upgrade OpenClaw to 2026.5.18 #3825

Closed

wscurran added the chore Build, CI, dependency, or tooling maintenance label Jun 8, 2026


github-actions Bot commented May 23, 2026

Selective E2E Results — ❌ Some jobs failed

github-actions Bot commented May 23, 2026

Selective E2E Results — ❌ Some jobs failed

github-actions Bot commented May 23, 2026

Selective E2E Results — ❌ Some jobs failed

github-actions Bot commented May 23, 2026

Selective E2E Results — ✅ All requested jobs passed

github-actions Bot commented May 23, 2026

Selective E2E Results — ❌ Some jobs failed

github-actions Bot commented May 23, 2026

Selective E2E Results — ✅ All requested jobs passed

github-actions Bot commented May 23, 2026

Selective E2E Results — ❌ Some jobs failed

github-actions Bot commented May 23, 2026

Selective E2E Results — ❌ Some jobs failed

github-actions Bot commented May 23, 2026

Selective E2E Results — ❌ Some jobs failed

github-actions Bot commented May 23, 2026

Selective E2E Results — ❌ Some jobs failed

github-actions Bot commented May 23, 2026

Selective E2E Results — ✅ All requested jobs passed
