test: release blocker validation run #4119

Merged
cv merged 14 commits into main from codex/release-blocker-validation-20260522
May 23, 2026

Conversation

@ericksoa

@ericksoa ericksoa commented May 23, 2026

Contributor

Summary

  • temporary validation PR to run selective E2E coverage for release-blocker close calls
  • adds validation-only workflow wiring for OpenClaw TUI/chat correlation and the optional GPU proof for issue #3600 ("[DGX Station][Onboard] cuInit(0) SIGSEGV in verifyDirectSandboxGpu - onboard aborts (v0.0.43 GB300 aarch64)")
  • hardens the TUI/chat validation wrapper so PR runners use repo Vitest instead of transient npx installs
  • updates the TUI/chat live repro to current OpenClaw gateway protocol v4
  • hardens token-rotation output assertions after PR evidence showed a pipefail/grep harness false negative
  • not intended to merge into main as-is

Validation runs

Current evidence

Issue gates

This PR is for test evidence only.

Summary by CodeRabbit

  • New Features

    • Added two conditional nightly E2E jobs (TUI/chat correlation and optional GPU-proof check).
    • Included a build-time runtime patch step and added a runtime patch script to images.
  • CI / Chores

    • Updated workflow dispatch inputs and wired new jobs into failure/reporting pipelines.
    • Pinned Node setup action to a specific commit.
  • Tests

    • Added E2E harness and unit tests for the patch and expanded chat-correlation regression checks.
  • Config

    • Updated E2E path instructions to include the new job and related files.

@copy-pr-bot

copy-pr-bot Bot commented May 23, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented May 23, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9f90a60d-cb9e-4e95-9312-c8ad455b5fa3

📥 Commits

Reviewing files that changed from the base of the PR and between bf94b84 and 1c6a7af.

📒 Files selected for processing (1)
  • .coderabbit.yaml

📝 Walkthrough

Walkthrough

Adds a CLI patch that adjusts OpenClaw runtime JS for chat/send run-id correlation and idempotency, integrates the script into Docker and sandbox staging, provides unit and E2E tests (including a cloud-backed TUI correlation harness), extends correlation trace analysis, and wires two selective-dispatch nightly CI jobs into aggregators.

Changes

OpenClaw Chat Correlation Compatibility Layer

| Layer | File(s) | Summary |
|---|---|---|
| Patch Script Core Implementation | scripts/patch-openclaw-chat-send.js | Implements a CLI that scans a dist dir, finds exactly one chat.send, get-reply, and followup-runner runtime file each, injects run-id correlation, idempotencyKey, empty-final-event suppression, and queue-mode adjustments, and verifies injected markers. |
| Patch Script Testing | test/openclaw-chat-send-patch.test.ts | Vitest suite writes runtime fixtures, runs the patch, asserts expected marker injections and idempotency, and includes a negative “fails closed” case for mismatched runtime shapes. |
| Patch Integration in Docker Build and Staging | Dockerfile, src/lib/sandbox/build-context.ts, test/sandbox-build-context.test.ts, .coderabbit.yaml | Dockerfile copies and chmods patch-openclaw-chat-send.js and runs it at build time against OpenClaw dist; optimized sandbox staging now includes the script and tests/assertions/path_instructions are updated. |
| Live Correlation Trace Analysis Enhancement | test/openclaw-tui-chat-correlation.test.ts | Extends Issue2603Analysis with missingReplies, duplicateReplies, missingUserTurns; reworks analyzeIssue2603Trace to compute these via visible/final-count maps; updates failure summaries, raises live repro websocket protocol to 4, and adds a delta+final idempotency unit test. |
| E2E Harness and CI Integration | test/e2e/test-openclaw-tui-chat-correlation.sh, .github/workflows/nightly-e2e.yaml, .coderabbit.yaml | Adds a shell E2E harness that provisions a cloud sandbox, verifies OpenClaw 2026.5.18, ensures dev deps, runs the live Vitest correlation test, and performs optional cleanup. Adds openclaw-tui-chat-correlation-e2e and issue-3600-gpu-proof-optional-e2e nightly jobs and wires them into notify-on-failure, report-to-pr, and scorecard; pins one actions/setup-node step to a commit SHA. |
| Test Refactoring and Regression Alignment | test/e2e/test-token-rotation.sh, test/fetch-guard-patch-regression.test.ts | Replaces echo-pipe grep pipelines with here-string greps across token-rotation assertions and updates the Dockerfile patch extraction end-marker to the new chat.send patch block. |
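
The "exactly one match", "verifies injected markers", and "fails closed" behavior summarized for the patch script can be illustrated with a minimal sketch. This is not the code in scripts/patch-openclaw-chat-send.js: the `PatchStep` shape, the function names, and the marker convention are all assumptions made for illustration.

```typescript
// Hypothetical helper types and names; not taken from the actual patch script.
interface PatchStep {
  label: string;     // human-readable name used in error messages
  anchor: string;    // text that must occur exactly once in the unpatched source
  injection: string; // code inserted immediately after the anchor
  marker: string;    // token embedded in the injection to detect earlier runs
}

// Counts non-overlapping occurrences of needle in haystack.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  let count = 0;
  let index = haystack.indexOf(needle);
  while (index !== -1) {
    count += 1;
    index = haystack.indexOf(needle, index + needle.length);
  }
  return count;
}

function applyPatchStep(source: string, step: PatchStep): string {
  // Idempotency: a source that already carries the marker is returned unchanged.
  if (source.includes(step.marker)) return source;

  // Fail closed: refuse to patch when the runtime shape is not the expected one.
  const matches = countOccurrences(source, step.anchor);
  if (matches !== 1) {
    throw new Error(`${step.label}: expected exactly 1 anchor, found ${matches}`);
  }

  const at = source.indexOf(step.anchor) + step.anchor.length;
  const patched = source.slice(0, at) + step.injection + source.slice(at);

  // Verify that the marker actually landed before reporting success.
  if (!patched.includes(step.marker)) {
    throw new Error(`${step.label}: marker missing after patch`);
  }
  return patched;
}
```

Running `applyPatchStep` twice on the same input yields the same output as running it once, which is the property the Vitest suite is described as asserting.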

Sequence Diagram

sequenceDiagram
  participant Dev as Patch Script
  participant Docker as Docker Build
  participant Dist as OpenClaw dist
  participant Sandbox as Provisioned Sandbox
  participant Vitest as Vitest Live Test
  participant Analysis as Correlation Analysis
  Dev->>Docker: COPY patch-openclaw-chat-send.js into image
  Docker->>Dist: RUN patch script against compiled dist
  Docker->>Sandbox: produce image with patched runtime
  Sandbox->>Vitest: run TUI correlation harness against 2026.5.18
  Vitest->>Sandbox: send prompts via websocket
  Sandbox->>Vitest: stream delta and final chat events
  Vitest->>Analysis: collect and analyze trace events
  Analysis->>Analysis: compute missing/duplicate/correlation metrics
  Vitest->>Vitest: assert metrics empty
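
The "compute missing/duplicate/correlation metrics" step in the diagram can be sketched roughly as follows. The event shape and function name here are assumptions for illustration; the real `analyzeIssue2603Trace` in test/openclaw-tui-chat-correlation.test.ts tracks more than this (for example `missingUserTurns`) and may be structured differently.

```typescript
// Hypothetical event shape; the real trace format may differ.
interface ChatEvent {
  runId: string;
  state: "delta" | "final";
}

interface CorrelationReport {
  missingReplies: string[];   // run ids that never received a final event
  duplicateReplies: string[]; // run ids that received more than one final event
}

function analyzeFinals(sentRunIds: string[], events: ChatEvent[]): CorrelationReport {
  // Count final events per run id; delta events are streaming updates and are skipped.
  const finalCounts = new Map<string, number>();
  for (const event of events) {
    if (event.state !== "final") continue;
    finalCounts.set(event.runId, (finalCounts.get(event.runId) ?? 0) + 1);
  }

  const missingReplies: string[] = [];
  const duplicateReplies: string[] = [];
  for (const runId of sentRunIds) {
    const count = finalCounts.get(runId) ?? 0;
    if (count === 0) missingReplies.push(runId);
    if (count > 1) duplicateReplies.push(runId);
  }
  return { missingReplies, duplicateReplies };
}
```

Under this sketch, "assert metrics empty" corresponds to checking that both arrays have length zero after all prompts have been sent.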

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#3869: Dockerfile fetch-guard patch extraction and regression test adjustments align with changes to Dockerfile patch sections introduced in that PR.

Suggested labels

enhancement: testing, E2E, Integration: OpenClaw, Docker, NV QA

Suggested reviewers

  • jyaunches
  • cv

Poem

🐰 I hopped a tiny patch into the Claw tonight,
Run-ids snug and idempotent, replies aligned right,
Sandboxes spin up, Vitest taps the keys,
Traces match prompts and calm the test-day breeze,
Hooray — correlation sleeps tight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'test: release blocker validation run' accurately describes the main purpose of this PR, which is to add validation-only E2E coverage and testing infrastructure for release-blocker checks. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@github-actions

github-actions Bot commented May 23, 2026

Contributor

E2E Advisor Recommendation

Required E2E: openclaw-tui-chat-correlation-e2e, cloud-e2e, sandbox-survival-e2e, rebuild-openclaw-e2e, hermes-e2e
Optional E2E: token-rotation-e2e, issue-3600-gpu-proof-optional-e2e, openshell-gateway-upgrade-e2e

Dispatch hint: openclaw-tui-chat-correlation-e2e,cloud-e2e,sandbox-survival-e2e,rebuild-openclaw-e2e,hermes-e2e

Auto-dispatched E2E: cloud-e2e, sandbox-survival-e2e, rebuild-openclaw-e2e, hermes-e2e via nightly-e2e.yaml at 1c6a7afb195be5424fb497c0bee6227b46829169 (nightly run)

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • openclaw-tui-chat-correlation-e2e (high): Primary required coverage for the new OpenClaw chat.send patch. It builds/onboards a fresh OpenClaw sandbox and runs the live gateway/WebChat correlation harness against rapid sequential sends, validating that the patched behavior is actually baked into the sandbox image.
  • cloud-e2e (high): Dockerfile and optimized build-context changes affect the sandbox container image used by normal OpenClaw onboarding. Run the full onboard plus cloud inference path to catch image build, startup, and basic live inference regressions.
  • sandbox-survival-e2e (high): The container image change can affect sandbox boot and gateway/runtime resilience. This verifies the sandbox survives gateway restart and continues to serve workspace and inference flows.
  • rebuild-openclaw-e2e (high): The PR changes files copied into the sandbox build context and the Dockerfile patch sequence. Rebuild coverage is needed to verify OpenClaw workspace state survives image rebuilds with the new patch asset included.
  • hermes-e2e (high): Although the new shim targets OpenClaw, the Dockerfile is a shared sandbox image construction path. Run Hermes onboard and inference to ensure the image build changes do not regress the multi-agent runtime path.

Optional E2E

  • token-rotation-e2e (high): Only the E2E script assertions were refactored from pipe grep to here-string grep; no credential runtime logic changed. Useful to validate the edited harness, but not merge-blocking for product behavior.
  • issue-3600-gpu-proof-optional-e2e (low): The workflow adds this new selective-dispatch job. There is no corresponding onboard source change in this PR, so it is mainly useful to prove the new workflow job is wired correctly.
  • openshell-gateway-upgrade-e2e (high): The workflow changes the setup-node action pin for this job only. Optional smoke of the edited job definition; the PR does not change gateway upgrade runtime code.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: openclaw-tui-chat-correlation-e2e,cloud-e2e,sandbox-survival-e2e,rebuild-openclaw-e2e,hermes-e2e

@github-actions

github-actions Bot commented May 23, 2026

Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • None. No scenario workflow, scenario metadata, scenario runtime, or validation-suite files changed.

Optional scenario E2E

  • None.

Relevant changed files

  • None.

@github-actions

github-actions Bot commented May 23, 2026

Contributor

PR Review Advisor

Findings: 3 needs attention, 1 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 4 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • Secret-bearing workflow runs code from target_ref (.github/workflows/nightly-e2e.yaml:407): The OpenClaw TUI/chat job checks out `${{ inputs.target_ref || github.ref }}`, installs dependencies, and runs scripts from that checked-out tree while exposing `NVIDIA_API_KEY` and `GITHUB_TOKEN`. The workflow documents `target_ref` as a way for the trusted main workflow to test a PR head SHA; if that SHA contains untrusted changes, the checked-out `package.json`, npm lifecycle scripts, bash harness, or Vitest tests can read or exfiltrate secrets or abuse checkout credentials. This expands the trusted-code boundary in a high-risk E2E path. This finding was present in the previous advisor review and still applies.
    • Recommendation: Keep workflow and harness code trusted: run the E2E driver from `origin/main` or another reviewed ref, checkout the candidate target into a separate code-under-test directory, disable persisted checkout credentials where not needed, and do not expose repository secrets to scripts sourced from PR-controlled refs. If PR-head execution is required, gate it behind maintainer-only approval and a secret-free mode.
    • Evidence: The job uses `actions/checkout` with `ref: ${{ inputs.target_ref || github.ref }}`; later steps run `npm ci --include=dev` and `bash test/e2e/test-openclaw-tui-chat-correlation.sh` from that checkout with `NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}` and `GITHUB_TOKEN: ${{ github.token }}`.
  • Issue #3600 validation is only a source-order check (.github/workflows/nightly-e2e.yaml:466): The linked issue #3600 ("[DGX Station][Onboard] cuInit(0) SIGSEGV in verifyDirectSandboxGpu - onboard aborts (v0.0.43 GB300 aarch64)") expects onboard to complete on the affected GPU path after an optional direct GPU proof failure. The added `issue-3600-gpu-proof-optional-e2e` job does not run onboard, does not exercise `verifyDirectSandboxGpu`, and does not simulate the optional proof failure path; it only scans source text for one string before another string. That is not sufficient evidence for the issue gate described by the issue and closure comment. This finding was present in the previous advisor review and still applies.
    • Recommendation: Replace or supplement this with a behavioral test that invokes the relevant onboard/GPU-proof path with an optional proof failure and asserts onboard continues without the fatal throw. If GB300 hardware is unavailable, use a focused unit/integration harness around `verifyDirectSandboxGpu` or its caller rather than a string-position test.
    • Evidence: The job reads `src/lib/onboard.ts`, slices from `function verifyDirectSandboxGpu`, and checks that `if (proof.optional === true) return;` appears before ``throw new Error(`GPU proof failed:``; it never executes onboard or a verifier command.
  • Validation-only changes would become scheduled nightly behavior (.github/workflows/nightly-e2e.yaml:405): The PR description states this is test-evidence-only and not intended to merge into main as-is, but the diff adds production workflow jobs and a Dockerfile OpenClaw runtime patch that would run or ship from main if merged. The new workflow predicates run on scheduled events whenever the job list is empty because the condition allows all non-`workflow_dispatch` events. This finding was present in the previous advisor review and still applies.
    • Recommendation: Before merging to main, either remove the validation-only workflow/Dockerfile wiring or convert it into permanent, reviewed coverage with the trusted-code boundary and acceptance gaps fixed. If the PR is only for evidence collection, keep it out of main and avoid landing the scheduled-job additions.
    • Evidence: The PR body says `not intended to merge into main as-is` and `This PR is for test evidence only.` The diff adds `openclaw-tui-chat-correlation-e2e` and `issue-3600-gpu-proof-optional-e2e` under the nightly workflow with predicates that run for non-dispatch events, and Dockerfile now executes `node /usr/local/lib/nemoclaw/patch-openclaw-chat-send.js` during sandbox image builds.
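
To make the third finding concrete, the difference between a predicate that lets every non-dispatch event through and a dispatch-only predicate can be modeled as below. This is a sketch under assumptions: the actual `if:` expression in nightly-e2e.yaml is not reproduced in this review, and both function names are hypothetical.

```typescript
// Splits the comma-separated "jobs" input into trimmed, non-empty job ids.
function parseJobs(jobsInput: string): string[] {
  return jobsInput
    .split(",")
    .map((job) => job.trim())
    .filter((job) => job.length > 0);
}

// Assumed model of the behavior described in the finding: any event other
// than workflow_dispatch (for example "schedule") runs the job unconditionally.
function runsAsDescribed(eventName: string, jobsInput: string, jobId: string): boolean {
  if (eventName !== "workflow_dispatch") return true;
  const requested = parseJobs(jobsInput);
  return requested.length === 0 || requested.includes(jobId);
}

// Dispatch-only alternative: the job runs only when explicitly requested.
function runsDispatchOnly(eventName: string, jobsInput: string, jobId: string): boolean {
  return eventName === "workflow_dispatch" && parseJobs(jobsInput).includes(jobId);
}
```

With the first predicate a scheduled nightly run picks up the validation-only jobs automatically; with the second they stay dormant unless a maintainer names them in the `jobs` input.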

🔎 Worth checking

  • Nightly E2E hard-codes a specific OpenClaw version (test/e2e/test-openclaw-tui-chat-correlation.sh:47): The new nightly wrapper requires the sandbox OpenClaw version output to contain `2026.5.18`. If this lands as a scheduled nightly job, normal OpenClaw updates can break the validation for reasons unrelated to TUI/chat correlation, and the test may fail before reaching the regression assertions. This finding was present in the previous advisor review and still applies.
    • Recommendation: Avoid hard-coding a single release version in a scheduled nightly test, or make the expected version an explicit workflow input/environment variable for one-off validation. For permanent coverage, assert required capabilities or protocol behavior instead of an exact version string.
    • Evidence: The script runs `openshell sandbox exec --name "$SANDBOX_NAME" -- openclaw --version` and exits unless `grep -q "2026.5.18"` succeeds.
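
One way to follow the recommendation is to take the expected version from configuration and skip the gate when none is set. The sketch below expresses that logic in TypeScript rather than in the shell harness itself; `EXPECTED_OPENCLAW_VERSION` is a hypothetical variable name, not an existing workflow input.

```typescript
interface GateResult {
  ok: boolean;
  reason: string;
}

// env is passed in explicitly so the function stays pure and easy to test.
// EXPECTED_OPENCLAW_VERSION is a hypothetical name chosen for this sketch.
function versionGate(versionOutput: string, env: Record<string, string | undefined>): GateResult {
  const expected = (env["EXPECTED_OPENCLAW_VERSION"] ?? "").trim();
  if (expected === "") {
    return { ok: true, reason: "no expected version set; skipping version gate" };
  }
  // Plain substring match, so "." is literal here (unlike an unescaped grep pattern).
  const ok = versionOutput.includes(expected);
  return {
    ok,
    reason: ok ? `found ${expected}` : `expected ${expected} in: ${versionOutput.trim()}`,
  };
}
```

A one-off validation run would set the variable to `2026.5.18`; a permanent nightly job would leave it unset and rely on the behavioral assertions instead.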

🌱 Nice ideas

  • None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26321351519
Target ref: 3f5db4fe66a851e251d73b2568846de37cba0206
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: network-policy-e2e,token-rotation-e2e,channels-stop-start-e2e,messaging-providers-e2e,openclaw-slack-pairing-e2e,openclaw-tui-chat-correlation-e2e,issue-3600-gpu-proof-optional-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
channels-stop-start-e2e ⚠️ cancelled
messaging-providers-e2e ⚠️ cancelled
network-policy-e2e ⚠️ cancelled
openclaw-slack-pairing-e2e ⚠️ cancelled
token-rotation-e2e ⚠️ cancelled
openclaw-tui-chat-correlation-e2e ❓ not reported
issue-3600-gpu-proof-optional-e2e ❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e, issue-3600-gpu-proof-optional-e2e. The reporting workflow needs to include these jobs.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26321747082
Target ref: 8a1e0a4ac133eea5dab8ae88b9937ba1a1ba4346
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
openclaw-tui-chat-correlation-e2e ❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

ericksoa added 2 commits May 22, 2026 20:05
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26321864900
Target ref: 8781bcb18689abf68c1a4ff4766890dc5acea336
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
openclaw-tui-chat-correlation-e2e ❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26321312107
Target ref: 3f579fe8267c1696b5280b8cf0016d97e4de9a31
Workflow ref: main
Requested jobs: network-policy-e2e,token-rotation-e2e,channels-stop-start-e2e,messaging-providers-e2e,openclaw-slack-pairing-e2e
Summary: 5 passed, 0 failed, 0 skipped

Job Result
channels-stop-start-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ✅ success
openclaw-slack-pairing-e2e ✅ success
token-rotation-e2e ✅ success

@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26321372633
Target ref: d724cfead66fc78f9962fbbe3bae9b163babbfb3
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: network-policy-e2e,token-rotation-e2e,channels-stop-start-e2e,messaging-providers-e2e,openclaw-slack-pairing-e2e,openclaw-tui-chat-correlation-e2e,issue-3600-gpu-proof-optional-e2e
Summary: 4 passed, 1 failed, 0 skipped

Job Result
channels-stop-start-e2e ✅ success
messaging-providers-e2e ✅ success
network-policy-e2e ✅ success
openclaw-slack-pairing-e2e ✅ success
token-rotation-e2e ❌ failure
openclaw-tui-chat-correlation-e2e ❓ not reported
issue-3600-gpu-proof-optional-e2e ❓ not reported

Failed jobs: token-rotation-e2e. Check run artifacts for logs.
Missing requested jobs: openclaw-tui-chat-correlation-e2e, issue-3600-gpu-proof-optional-e2e. The reporting workflow needs to include these jobs.

@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26321830179
Target ref: 42ac24a7a32e350b9be486d140fbe2af397eaab4
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: token-rotation-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
token-rotation-e2e ✅ success

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26322997840
Target ref: 36aa2cee58bbc00f6eea2794d5e55fa71213e723
Workflow ref: main
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
openclaw-tui-chat-correlation-e2e ❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26323019724
Target ref: 36aa2cee58bbc00f6eea2794d5e55fa71213e723
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
openclaw-tui-chat-correlation-e2e ❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26323387646
Target ref: 2622726d36017a967eacf2e737c56ca921bd1da2
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
openclaw-tui-chat-correlation-e2e ❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26323703975
Target ref: efe4ca78fb93bdbcca32a2392f0d13487e7298b9
Workflow ref: codex/release-blocker-validation-20260522
Requested jobs: openclaw-tui-chat-correlation-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
openclaw-tui-chat-correlation-e2e ❓ not reported

Missing requested jobs: openclaw-tui-chat-correlation-e2e. The reporting workflow needs to include these jobs.

@ericksoa ericksoa added the v0.0.51 Release target label May 23, 2026
@ericksoa ericksoa marked this pull request as ready for review May 23, 2026 05:01
@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26324116054
Target ref: efe4ca78fb93bdbcca32a2392f0d13487e7298b9
Workflow ref: main
Requested jobs: cloud-onboard-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
cloud-onboard-e2e ✅ success

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.github/workflows/nightly-e2e.yaml (2)

82-109: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add the new issue-3600-gpu-proof-optional-e2e job ID to workflow_dispatch.inputs.jobs valid list.

The selective-dispatch jobs contract is now incomplete: one newly added dispatchable job is missing from the documented valid list.

As per coding guidelines, "Keep the selective-dispatch “jobs” contract consistent: add/rename E2E job IDs in workflow_dispatch.inputs.jobs."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/nightly-e2e.yaml around lines 82 - 109, The
workflow_dispatch.inputs.jobs valid list is missing the new job ID; update the
list (the "Valid:" array used for workflow_dispatch.inputs.jobs) to include the
string "issue-3600-gpu-proof-optional-e2e" so the selective-dispatch contract
remains consistent with the added dispatchable job; ensure the new ID is added
alongside the other comma-separated job IDs in the same formatting style.

2439-2490: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include both new E2E jobs in aggregate needs lists (notify-on-failure, report-to-pr, scorecard).

Right now, failures from openclaw-tui-chat-correlation-e2e and issue-3600-gpu-proof-optional-e2e are omitted from nightly notifications, PR summary comments, and scorecard aggregation.

Also applies to: 2539-2590, 2696-2747

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/nightly-e2e.yaml around lines 2439 - 2490, The nightly
workflow's aggregate "needs" arrays are missing the two new E2E jobs so their
failures are not included in notifications, PR report, or scorecard; update the
needs lists used by the notify/report/scorecard jobs (the arrays currently
listing many e2e jobs) to include both "openclaw-tui-chat-correlation-e2e" and
"issue-3600-gpu-proof-optional-e2e", and apply the same addition in the other
two mirrored blocks noted (the other arrays at the same sections referenced) so
all three aggregate job definitions ("notify-on-failure", "report-to-pr",
"scorecard") include these two job names.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Line 417: Replace the mutable tag "actions/setup-node@v6" with a pinned commit
SHA (or digest) wherever it appears in the workflow (the three occurrences of
the uses entry "actions/setup-node@v6"); update each "uses:" line to reference
the full commit SHA string for actions/setup-node to prevent mutable-tag drift
and ensure reproducible runs.

---

Outside diff comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 82-109: The workflow_dispatch.inputs.jobs valid list is missing
the new job ID; update the list (the "Valid:" array used for
workflow_dispatch.inputs.jobs) to include the string
"issue-3600-gpu-proof-optional-e2e" so the selective-dispatch contract remains
consistent with the added dispatchable job; ensure the new ID is added alongside
the other comma-separated job IDs in the same formatting style.
- Around line 2439-2490: The nightly workflow's aggregate "needs" arrays are
missing the two new E2E jobs so their failures are not included in
notifications, PR report, or scorecard; update the needs lists used by the
notify/report/scorecard jobs (the arrays currently listing many e2e jobs) to
include both "openclaw-tui-chat-correlation-e2e" and
"issue-3600-gpu-proof-optional-e2e", and apply the same addition in the other
two mirrored blocks noted (the other arrays at the same sections referenced) so
all three aggregate job definitions ("notify-on-failure", "report-to-pr",
"scorecard") include these two job names.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: aac65153-9c15-45fc-b0a3-8745fbbed8c2

📥 Commits

Reviewing files that changed from the base of the PR and between c70c62c and efe4ca7.

📒 Files selected for processing (10)
  • .github/workflows/nightly-e2e.yaml
  • Dockerfile
  • scripts/patch-openclaw-chat-send.js
  • src/lib/sandbox/build-context.ts
  • test/e2e/test-openclaw-tui-chat-correlation.sh
  • test/e2e/test-token-rotation.sh
  • test/fetch-guard-patch-regression.test.ts
  • test/openclaw-chat-send-patch.test.ts
  • test/openclaw-tui-chat-correlation.test.ts
  • test/sandbox-build-context.test.ts

Comment thread .github/workflows/nightly-e2e.yaml Outdated
ericksoa added 2 commits May 22, 2026 22:21
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
.github/workflows/nightly-e2e.yaml (1)

470-478: ⚡ Quick win

Avoid binding this validation to the next function name.

This slice ends at the literal function upsertMessagingProviders, so an unrelated rename/reorder/formatting change in src/lib/onboard.ts will fail the workflow even if verifyDirectSandboxGpu still keeps the optional guard before the throw. Bound the check to verifyDirectSandboxGpu itself instead of the neighboring symbol.
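
A brace-depth scan along the lines suggested here might look like the sketch below. It is an assumption-laden illustration: `extractFunctionSource` is a hypothetical helper, the sample source is invented and is not the real `src/lib/onboard.ts`, and the scan does not account for braces inside strings, template literals, comments, regular expressions, or object-typed return annotations.

```javascript
// Hypothetical helper: return the source text of `function <name>(...) {...}`
// by matching braces, so the slice does not depend on the next function.
function extractFunctionSource(source, name) {
  const start = source.indexOf(`function ${name}(`);
  if (start === -1) return null;

  // Skip the parameter list by matching parentheses, so braces in default
  // values or destructured parameters are not mistaken for the body.
  let i = source.indexOf("(", start);
  let parenDepth = 0;
  for (; i < source.length; i++) {
    if (source[i] === "(") parenDepth++;
    else if (source[i] === ")" && --parenDepth === 0) break;
  }

  // The first "{" after the parameter list is treated as the body opener.
  const bodyStart = source.indexOf("{", i);
  if (bodyStart === -1) return null;

  let braceDepth = 0;
  for (let j = bodyStart; j < source.length; j++) {
    if (source[j] === "{") braceDepth++;
    else if (source[j] === "}" && --braceDepth === 0) {
      return source.slice(start, j + 1);
    }
  }
  return null; // unbalanced braces
}

// Invented sample source for illustration only.
const sampleSource = [
  "function verifyDirectSandboxGpu(opts = {}) {",
  "  if (opts.optional) { return false; }",
  "  throw new Error('gpu proof failed');",
  "}",
  "function upsertMessagingProviders() { return {}; }",
].join("\n");

const fnSource = extractFunctionSource(sampleSource, "verifyDirectSandboxGpu");
// The optional guard should appear before the throw inside the same function.
const guardBeforeThrow =
  fnSource !== null &&
  fnSource.indexOf("optional") !== -1 &&
  fnSource.indexOf("optional") < fnSource.indexOf("throw");
console.log(guardBeforeThrow);
```

With this shape, renaming or reordering `upsertMessagingProviders` would not affect the check, because the slice ends at the closing brace of `verifyDirectSandboxGpu`.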

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/nightly-e2e.yaml around lines 470 - 478, The check
currently slices source up to the literal "function upsertMessagingProviders",
which couples the test to a neighboring symbol; instead locate the "function
verifyDirectSandboxGpu" start (already found in variable start) and compute the
function end by finding its matching closing brace (e.g., by scanning forward
and tracking brace depth) so you set end based on the matched '}' for
verifyDirectSandboxGpu and then slice into fn and run the same optional/throw
checks; update the logic that computes end (and the variables end/fn) so the
validation only inspects verifyDirectSandboxGpu itself rather than the next
function name.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c5138a20-8977-44c9-967b-e13c8b9c2b44

📥 Commits

Reviewing files that changed from the base of the PR and between efe4ca7 and 93bedf2.

📒 Files selected for processing (1)
  • .github/workflows/nightly-e2e.yaml

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 26324533661
Target ref: 93bedf29a16a919fc3716f91accb87ff8ef15c36
Workflow ref: main
Requested jobs: cloud-onboard-e2e
Summary: 1 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| cloud-onboard-e2e | ✅ success |

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/nightly-e2e.yaml (1)

403-496: ⚠️ Potential issue | 🟠 Major

Update .coderabbit.yaml path_instructions for the new nightly E2E jobs

  • .coderabbit.yaml has a path: "src/lib/onboard.ts" entry, but it doesn’t mention/route issue-3600-gpu-proof-optional-e2e.
  • There are no path_instructions matches for openclaw-tui-chat-correlation-e2e or its executed script test/e2e/test-openclaw-tui-chat-correlation.sh, so selective review routing for changes in that area will miss these jobs.
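
The routing gap in the two bullets above could be detected with a check like the following sketch. It assumes `.coderabbit.yaml` has been parsed into an object with entries of the form `{ path, instructions }` under `reviews.path_instructions`. The helper name `findUnroutedJobs`, the sample instruction text, and the exact-path comparison are assumptions for illustration; real entries may use glob patterns, which this sketch does not evaluate.

```javascript
// Hypothetical check: return the job IDs that no path_instructions entry
// mentions for the expected path. Assumes an already-parsed config object.
function findUnroutedJobs(config, expectations) {
  const entries = config.reviews?.path_instructions ?? [];
  return expectations
    .filter(
      ({ path, job }) =>
        !entries.some(
          (entry) =>
            entry.path === path &&
            String(entry.instructions ?? "").includes(job),
        ),
    )
    .map(({ job }) => job);
}

// Invented config content; only the paths and job IDs come from the review.
const sampleConfig = {
  reviews: {
    path_instructions: [
      {
        path: "src/lib/onboard.ts",
        instructions: "Onboarding changes need E2E coverage.",
      },
    ],
  },
};

const routingExpectations = [
  { path: "src/lib/onboard.ts", job: "issue-3600-gpu-proof-optional-e2e" },
  {
    path: "test/e2e/test-openclaw-tui-chat-correlation.sh",
    job: "openclaw-tui-chat-correlation-e2e",
  },
];

const unroutedJobs = findUnroutedJobs(sampleConfig, routingExpectations);
console.log(unroutedJobs);
```

Both jobs are reported for this sample: the `src/lib/onboard.ts` entry exists but does not mention the GPU-proof job, and the TUI/chat script has no entry at all.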
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/nightly-e2e.yaml around lines 403 - 496, The
.coderabbit.yaml path_instructions are missing entries for the new nightly E2E
jobs, so changes to the relevant files won't be routed; update .coderabbit.yaml
to add path_instructions that match the new job targets: include an entry mapping
"openclaw-tui-chat-correlation-e2e" to the test script path
test/e2e/test-openclaw-tui-chat-correlation.sh (and any related sandbox names),
and add an entry mapping "issue-3600-gpu-proof-optional-e2e" to
src/lib/onboard.ts so changes to the verifyDirectSandboxGpu function are routed;
ensure the match patterns include both the job names and the specific paths
(test/e2e/... and src/lib/onboard.ts) so selective review triggers correctly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7abe4b6e-1d63-48fd-8b51-a6fd7338b893

📥 Commits

Reviewing files that changed from the base of the PR and between 93bedf2 and bf94b84.

📒 Files selected for processing (2)
  • .github/workflows/nightly-e2e.yaml
  • scripts/patch-openclaw-chat-send.js
💤 Files with no reviewable changes (1)
  • scripts/patch-openclaw-chat-send.js

@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 26324678236
Target ref: bf94b84fa17b94dba51294362431ca9f25737209
Workflow ref: main
Requested jobs: cloud-onboard-e2e
Summary: 1 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| cloud-onboard-e2e | ✅ success |

@ericksoa ericksoa requested a review from cv May 23, 2026 05:38
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 26324884099
Target ref: 1c6a7afb195be5424fb497c0bee6227b46829169
Workflow ref: main
Requested jobs: cloud-e2e,sandbox-survival-e2e,rebuild-openclaw-e2e,hermes-e2e
Summary: 4 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| cloud-e2e | ✅ success |
| hermes-e2e | ✅ success |
| rebuild-openclaw-e2e | ✅ success |
| sandbox-survival-e2e | ✅ success |

@cv cv enabled auto-merge (squash) May 23, 2026 19:59

Labels

chore (Build, CI, dependency, or tooling maintenance), v0.0.51 (Release target)

3 participants