test(e2e): add OpenClaw TUI chat-correlation coverage slice#5150
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a live Vitest regression test that provisions a cloud sandbox, executes an in-sandbox websocket repro for issues ChangesOpenClaw TUI Chat Correlation Live E2E Regression Test
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes The PR introduces substantial live E2E regression logic with trace analysis, websocket driver generation, sandbox execution orchestration, nondeterministic retry handling, and contract violation detection across 603 lines of high-complexity code in the test file, plus 71 lines of medium-complexity CI workflow setup. The test file spans multiple independent functional checkpoints (trace types, analysis, driver generation, execution, orchestration) that interact intricately; the analyzer logic is dense with edge-case detection for reply correlation, missing finals, and user-turn ordering. The CI job integrates Docker auth retry logic and artifact configuration. Mixed file complexity and interdependent control flow across checkpoints require careful review. Possibly related issues
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
E2E Advisor RecommendationRequired E2E: None Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
|
Vitest E2E Scenario RecommendationRequired Vitest E2E scenarios: Dispatch required Vitest E2E scenarios:
Full Vitest E2E advisor summaryVitest E2E Scenario AdvisorBase: Required Vitest E2E scenarios
Optional Vitest E2E scenarios
Relevant changed files
|
PR Review AdvisorFindings: 0 needs attention, 6 worth checking, 0 nice ideas Review findings🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Consider writing more tests for
Since last review detailsCurrent findings:
This is an automated advisory review. A human maintainer must make the final merge decision. |
|
Thanks for writing this up, but this is going in the opposite direction from the post-#5106 / #4941 decision. This PR only edits Concrete asks:
A planning-only inventory flip is not useful in the new direction. A focused migration PR for this coverage would be welcome if it follows the same shape as #5131/#5132/#5133/#5137/#5138/#5139/#5141/#5142/#5143/#5144/#5145/#5146/#5149. |
The openclaw-tui-chat-correlation migration (#5098 epic, #4347 owner) needs to upload a websocket repro driver into a cloud OpenClaw sandbox and run a multi-line shell pipeline that reads the gateway auth token and dispatches the driver. The legacy bash test does this via raw `openshell sandbox upload` and `openshell sandbox exec --name X -- sh -lc "..."` calls. Adds two typed wrappers on SandboxClient: - upload(name, localPath, remotePath) — wraps `openshell sandbox upload` - execShell(name, script, options) — wraps `openshell sandbox exec X -- sh -lc <script>` Both share the existing validateSandboxName guard that rejects flag-shaped names before command construction. Six new unit cases in e2e-clients.test.ts cover argv shape, options propagation, and sandbox-name validation for both helpers. Refs: #4347 #5098
…+ workflow job Ports the live block of test/openclaw-tui-chat-correlation.test.ts into the typed scenario framework as a free-standing live test (#5049 pattern), plus a discrete openclaw-tui-chat-correlation-vitest job in e2e-vitest-scenarios.yaml modeled on #5107's openshell-version-pin-vitest. Coverage: - Onboards a fresh cloud OpenClaw sandbox via OnboardingPhaseFixture.from with the cloud-openclaw profile (already in SUPPORTED_ONBOARDING). - Asserts the pinned 2026.5.27 OpenClaw version (host-side probe via openshell sandbox exec ... openclaw --version). Override via E2E_OPENCLAW_TUI_CORRELATION_PINNED_VERSION when bumping. - Idempotent in-sandbox gateway start on port 18789 via SandboxClient.execShell. - Uploads the websocket repro driver via SandboxClient.upload, runs it via execShell with the gateway auth token read from /sandbox/.openclaw/openclaw.json. - Asserts the #2603 + #3145 contract: no empty finals for submitted runs, no uncorrelated replies, no missing/duplicate replies, no missing/duplicate user turns. Preserves the looksLikeEventCaptureFailure retry-once guard (OpenClaw-side observability race; remove when the runtime exposes a deterministic chat readiness ack). Inventory: bridge-probe entry for test/e2e/test-openclaw-tui-chat-correlation.sh now references this file as its bridge probe (satisfies the inventory invariant that bridge-probe entries have non-empty bridgeProbes pointing at real paths). Bash workflow (nightly-e2e.yaml openclaw-tui-chat-correlation-e2e job) remains untouched per the migration freeze policy; deletion is a follow-up PR with #4357 approval after typed scenario soaks. Refs: #4347 #2603 #3145 #5098 #5049 #5107
First-pass dispatch (run 27284249981) failed at the version-pin assertion
with `spawn openshell ENOENT`. Cloud onboarding (`OnboardingPhaseFixture.from`)
succeeded, but `sandbox.exec(instance.sandboxName, ["openclaw", "--version"])`
spawned with an empty env because no env was passed: `ShellProbe.run`
defaults to `env: { ...(options.env ?? {}) }` (no inherit). Without PATH
the runner can't resolve openshell at ~/.local/bin.
Pass `env: buildAvailabilityProbeEnv()` to every sandbox.* call (exec,
execShell, upload). Mirrors the convention used by every phase fixture
(state-validation.ts, runtime.ts, lifecycle.ts). Documents the convention
inline so the next free-standing live test author follows it.
Refs: #4347 #2603 #3145
…tional
Pass 2 dispatch (run 27285642504) failed at the openclaw-version-pinned
assertion with exit 127 from `openshell sandbox exec <name> -- openclaw --version`.
The matrix `live-scenarios (ubuntu-repo-cloud-openclaw)` job ran 1 of 24
tests (mostly skipped) and the one that ran did not exercise sandbox.exec,
so this latent bug was not visible until a free-standing live test invoked
sandbox.exec against a real openshell binary.
Production code uniformly uses `-n <name>`:
- src/lib/agent/onboard.ts:276,335
- src/lib/onboard/initial-policy.ts:112,118,124
- src/lib/onboard/web-search-verify.ts:77,104
- src/lib/onboard/sandbox-verification-exec.ts:24
- src/lib/onboard/compatible-endpoint-smoke.ts:144
- src/lib/onboard/docker-gpu-supervisor-reconnect.ts:126
- src/lib/status-command-deps.ts:50
Legacy bash tests (test-rebuild-openclaw.sh, test-openclaw-tui-chat-correlation.sh)
and the existing test/openclaw-tui-chat-correlation.test.ts all use the
equivalent `--name <name>` long form. Positional name is rejected by the
real openshell binary as a mis-parsed argument and surfaces as exit 127
("command not found" semantic).
Switch SandboxClient.exec and execShell to `-n` form. SandboxClient.upload
stays positional — production code and the existing live test agree on
positional for `sandbox upload`. Updates the unit-test argv assertions
in e2e-clients, e2e-phase-runtime, and e2e-phase-state-validation; shifts
the chat-completion data-raw payload index from args[11] to args[12].
Refs: #4347 #2603 #3145 #5098
Two consecutive PASS dispatches on the openclaw-tui-chat-correlation-vitest job confirm convergence: - Pass 3: run 27286491062 \xe2\x86\x92 SUCCESS (369.41s) - Pass 4: run 27287442227 \xe2\x86\x92 SUCCESS (358.64s, verification dispatch) Both runs exercised the full 11-step assertion chain end-to-end against a real cloud OpenClaw sandbox: cloud onboarding \xe2\x86\x92 openclaw-version-pinned \xe2\x86\x92 sandbox-gateway-reachable \xe2\x86\x92 live-repro-emitted-result \xe2\x86\x92 6 chat- correlation assertions (#2603 + #3145 contract). Inventory transitions: - status: bridge-probe \xe2\x86\x92 covered - targetVitestScenarios: [test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts] - bridgeProbes: cleared - deletionReady: false (legacy bash wrapper retained until soak; deletion is a follow-up PR with #4357 approval per the migration freeze policy) Refs: #4347 #2603 #3145 #5098
|
#5126 is now green, so this draft should rebase/reshape before it moves toward review. It still targets |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Around line 215-286: The job openclaw-tui-chat-correlation-vitest currently
always runs even when a manual dispatch filtered by inputs.scenarios should skip
it; add a job-level conditional to respect the scenarios selector by adding an
if that allows the job when not a manual dispatch OR when
github.event.inputs.scenarios contains "openclaw-tui-chat-correlation" (e.g. if:
${{ github.event_name != 'workflow_dispatch' ||
contains(github.event.inputs.scenarios, 'openclaw-tui-chat-correlation') }}), so
the job only executes for the intended manual scenario or non-dispatch events.
In `@test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts`:
- Line 256: The test hardcodes port 18789 in the WebSocket URL when creating the
client (new WebSocket(...)), which will break if SANDBOX_GATEWAY_PORT changes;
update the WebSocket construction in openclaw-tui-chat-correlation.test.ts to
use the SANDBOX_GATEWAY_PORT constant instead of 18789 (e.g., interpolate
SANDBOX_GATEWAY_PORT into the "ws://127.0.0.1:<port>/ws" URL or pass it into the
script builder so new WebSocket uses the constant), ensuring the WebSocket
instantiation references SANDBOX_GATEWAY_PORT.
In `@test/e2e-scenario/migration/legacy-inventory.json`:
- Around line 581-588: The row was prematurely marked as covered—revert the
"status" field from "covered" back to its prior state (e.g., "migrating" or the
original value) and remove or clear the "targetVitestScenarios" entry until the
focused migration PR lands; also restore or update the
"notes"/"retiredReason"/"deletionReady" fields to reflect that the bash-backed
nightly lane is still retained and that migration evidence remains in PR `#5150`,
making sure to change the JSON keys "status", "targetVitestScenarios", and
"notes" in this object rather than leaving it marked covered.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8fbf1250-a326-4cdd-b5b8-89c0d09e5973
📒 Files selected for processing (7)
.github/workflows/e2e-vitest-scenarios.yamltest/e2e-scenario/framework-tests/e2e-clients.test.tstest/e2e-scenario/framework-tests/e2e-phase-runtime.test.tstest/e2e-scenario/framework-tests/e2e-phase-state-validation.test.tstest/e2e-scenario/framework/clients/sandbox.tstest/e2e-scenario/live/openclaw-tui-chat-correlation.test.tstest/e2e-scenario/migration/legacy-inventory.json
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts (1)
525-530: ⚡ Quick winPersist the full repro payload in
issue2603-trace.json.This artifact currently drops the actual events/history/order data that explains a failure, so post-mortems still depend on console output or a rerun. For a 75-minute live probe, saving
reproand the derivedanalysisis a much better debugging default.♻️ Proposed change
- await artifacts.writeJson("issue2603-trace.json", { - sentRuns: repro.sentRuns, - eventCount: repro.events?.length ?? 0, - attempts: attempts.length, - error: repro.error, - }); + await artifacts.writeJson("issue2603-trace.json", { + repro, + attempts, + analysis: repro.error ? undefined : analyzeIssue2603Trace(repro), + });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts` around lines 525 - 530, The artifact write currently only writes selected fields (sentRuns, eventCount, attempts, error); change the call to persist the full repro payload and derived analysis by writing the entire repro object and the analysis object to "issue2603-trace.json" (e.g., include repro and analysis alongside attempts and any metadata) so the file contains full events/history/order data; update the artifacts.writeJson invocation that creates "issue2603-trace.json" to serialize repro and analysis (and keep attempts/errors if desired).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts`:
- Around line 525-530: The artifact write currently only writes selected fields
(sentRuns, eventCount, attempts, error); change the call to persist the full
repro payload and derived analysis by writing the entire repro object and the
analysis object to "issue2603-trace.json" (e.g., include repro and analysis
alongside attempts and any metadata) so the file contains full
events/history/order data; update the artifacts.writeJson invocation that
creates "issue2603-trace.json" to serialize repro and analysis (and keep
attempts/errors if desired).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 88343285-11b4-4374-a113-ef05f31db27d
📒 Files selected for processing (1)
test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts
|
Update after #5126 moved the foundation to the fixtures namespace and picked up the sandbox helper prerequisite: this remains a useful salvage candidate, but the reshaped PR should target #5126, drop |
|
Migration-principle review for #5098: this PR has useful coverage, but it is drifting larger than “simple Vitest test, not new framework.” Reference from Concerns in this PR:
Recommended path:
Other acceptable options:
Suggested immediate changes:
|
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts (2)
363-364:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFail fast when
chat.sendomitsrunId.Line 364 falls back to
idempotencyKey, which turns a missing accepted-run-id contract break into a misleading correlation failure later. Treat a missing/emptyresponse.runIdas the repro failure instead of fabricating the run ID.Proposed fix
- const response = await request("chat.send", { sessionKey, message, deliver: false, timeoutMs: 90_000, idempotencyKey }); - sentRuns.push({ promptToken, replyToken, message, runId: response.runId ?? idempotencyKey }); + const response = await request("chat.send", { + sessionKey, + message, + deliver: false, + timeoutMs: 90_000, + idempotencyKey, + }); + if (typeof response.runId !== "string" || response.runId.length === 0) { + throw new Error(`chat.send did not return a runId for ${promptToken}`); + } + sentRuns.push({ promptToken, replyToken, message, runId: response.runId });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts` around lines 363 - 364, The test currently masks a contract breach by using idempotencyKey when response.runId is missing; update the chat.send call handling to fail fast: after awaiting request("chat.send") inspect response.runId and if it is missing/empty throw or assert (include response in the error for debugging) instead of falling back to idempotencyKey, and only push into sentRuns with the real runId when present; reference the local variables response, request("chat.send"), sentRuns, idempotencyKey and runId in your change.
541-546:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMake the pinned-version check exact.
toContain(EXPECTED_OPENCLAW_VERSION)still passes for outputs like2026.5.270or other strings that merely include the pinned value. Since this version gate defines whether the rest of the test is meaningful, compare the parsed version token exactly.Proposed fix
- expect( - versionResult.stdout, - `expected fresh sandbox to run OpenClaw ${EXPECTED_OPENCLAW_VERSION}; ` + - `update E2E_OPENCLAW_TUI_CORRELATION_PINNED_VERSION when bumping. ` + - `actual: ${versionResult.stdout}`, - ).toContain(EXPECTED_OPENCLAW_VERSION); + const actualVersion = versionResult.stdout.trim().split(/\s+/).at(-1); + expect( + actualVersion, + `expected fresh sandbox to run OpenClaw ${EXPECTED_OPENCLAW_VERSION}; ` + + `update E2E_OPENCLAW_TUI_CORRELATION_PINNED_VERSION when bumping. ` + + `actual: ${versionResult.stdout}`, + ).toBe(EXPECTED_OPENCLAW_VERSION);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts` around lines 541 - 546, The current assertion uses toContain and can match substrings; change it to extract the actual version token from versionResult.stdout (e.g., via a regex that captures the version number token) and assert that the parsed token equals EXPECTED_OPENCLAW_VERSION (use toBe/toEqual or an exact string comparison). Locate the assertion referencing versionResult and EXPECTED_OPENCLAW_VERSION in openclaw-tui-chat-correlation.test.ts, replace the toContain check with parsing logic that finds the version token (e.g. /\b\d+\.\d+\.\d+\b/ or similar) and compare that token exactly to EXPECTED_OPENCLAW_VERSION, failing the test if no token is found.
♻️ Duplicate comments (1)
.github/workflows/e2e-vitest-scenarios.yaml (1)
214-214:⚠️ Potential issue | 🟠 Major | ⚡ Quick winIncomplete scenario selector prevents explicit dispatch of this test.
The condition only runs this job when
scenariosis empty, but doesn't handle the case when a user explicitly dispatchesscenarios='openclaw-tui-chat-correlation'. A past review comment suggested:if: ${{ inputs.scenarios == '' || contains(format(',{0},', inputs.scenarios), ',openclaw-tui-chat-correlation,') }}The current implementation prevents users from running only this scenario via filtered dispatch.
🔧 Suggested fix
- if: ${{ inputs.scenarios == '' }} + if: ${{ inputs.scenarios == '' || contains(format(',{0},', inputs.scenarios), ',openclaw-tui-chat-correlation,') }}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/e2e-vitest-scenarios.yaml at line 214, The job's if condition only runs when inputs.scenarios is empty, preventing explicit dispatch of the "openclaw-tui-chat-correlation" scenario; update the conditional on the job (the existing if using inputs.scenarios) to also allow when inputs.scenarios explicitly includes "openclaw-tui-chat-correlation" by using a contains-style check (e.g., normalize with commas and test for ',openclaw-tui-chat-correlation,') so the job runs either when inputs.scenarios == '' or when that scenario is present in inputs.scenarios.
🧹 Nitpick comments (2)
.github/workflows/e2e-vitest-scenarios.yaml (1)
269-269: 💤 Low valueClean up command formatting for readability.
The excessive spaces between arguments are unusual and reduce readability. Consider either keeping it on one line or using proper YAML multiline formatting.
♻️ Suggested formatting
- npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts --silent=false --reporter=default + npx vitest run --project e2e-scenarios-live \ + test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts \ + --silent=false --reporter=default🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/e2e-vitest-scenarios.yaml at line 269, The command line in the workflow step is spaced unusually and hurts readability; replace the heavily spaced single-line invocation (the line containing "npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts --silent=false --reporter=default") with a clean single-line command (collapse extra spaces) or convert it to a YAML multiline scalar (|) with proper indentation so each argument is readable; update the step that runs "npx vitest run" to use one of these formats to remove the extraneous spacing.test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts (1)
555-560: ⚡ Quick winPersist the actual trace payload in
issue2603-trace.json.Right now the uploaded artifact only keeps counts, so a correlation/order/history failure from this 75-minute live job is not reconstructable from the artifact itself. Persist
events,historyMessages, and the per-attempt payloads (or the computedanalysis) here.Based on PR objectives, this workflow is supposed to upload live trace artifacts for this regression slice.
Possible shape
- await artifacts.writeJson("issue2603-trace.json", { - sentRuns: repro.sentRuns, - eventCount: repro.events?.length ?? 0, - attempts: attempts.length, - error: repro.error, - }); + await artifacts.writeJson("issue2603-trace.json", { + repro, + attempts, + });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts` around lines 555 - 560, The artifact currently written by artifacts.writeJson only contains counts and error; instead include the full trace payload so the failure can be reconstructed: expand the object passed to artifacts.writeJson (the call near repro.sentRuns/repro.events) to persist repro.events, repro.historyMessages (if present), and a per-attempt payload (e.g., attempts map or computed analysis) alongside sentRuns, eventCount, attempts.length and error; ensure you reference and serialize repro.events, repro.historyMessages and attempts (or a derived analysis object) so the uploaded issue2603-trace.json contains the complete trace, history and attempt-level data needed to replay the regression.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts`:
- Around line 363-364: The test currently masks a contract breach by using
idempotencyKey when response.runId is missing; update the chat.send call
handling to fail fast: after awaiting request("chat.send") inspect
response.runId and if it is missing/empty throw or assert (include response in
the error for debugging) instead of falling back to idempotencyKey, and only
push into sentRuns with the real runId when present; reference the local
variables response, request("chat.send"), sentRuns, idempotencyKey and runId in
your change.
- Around line 541-546: The current assertion uses toContain and can match
substrings; change it to extract the actual version token from
versionResult.stdout (e.g., via a regex that captures the version number token)
and assert that the parsed token equals EXPECTED_OPENCLAW_VERSION (use
toBe/toEqual or an exact string comparison). Locate the assertion referencing
versionResult and EXPECTED_OPENCLAW_VERSION in
openclaw-tui-chat-correlation.test.ts, replace the toContain check with parsing
logic that finds the version token (e.g. /\b\d+\.\d+\.\d+\b/ or similar) and
compare that token exactly to EXPECTED_OPENCLAW_VERSION, failing the test if no
token is found.
---
Duplicate comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Line 214: The job's if condition only runs when inputs.scenarios is empty,
preventing explicit dispatch of the "openclaw-tui-chat-correlation" scenario;
update the conditional on the job (the existing if using inputs.scenarios) to
also allow when inputs.scenarios explicitly includes
"openclaw-tui-chat-correlation" by using a contains-style check (e.g., normalize
with commas and test for ',openclaw-tui-chat-correlation,') so the job runs
either when inputs.scenarios == '' or when that scenario is present in
inputs.scenarios.
---
Nitpick comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Line 269: The command line in the workflow step is spaced unusually and hurts
readability; replace the heavily spaced single-line invocation (the line
containing "npx vitest run --project e2e-scenarios-live
test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts --silent=false
--reporter=default") with a clean single-line command (collapse extra spaces) or
convert it to a YAML multiline scalar (|) with proper indentation so each
argument is readable; update the step that runs "npx vitest run" to use one of
these formats to remove the extraneous spacing.
In `@test/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts`:
- Around line 555-560: The artifact currently written by artifacts.writeJson
only contains counts and error; instead include the full trace payload so the
failure can be reconstructed: expand the object passed to artifacts.writeJson
(the call near repro.sentRuns/repro.events) to persist repro.events,
repro.historyMessages (if present), and a per-attempt payload (e.g., attempts
map or computed analysis) alongside sentRuns, eventCount, attempts.length and
error; ensure you reference and serialize repro.events, repro.historyMessages
and attempts (or a derived analysis object) so the uploaded issue2603-trace.json
contains the complete trace, history and attempt-level data needed to replay the
regression.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ff42bb0a-cafb-48ac-a08b-d0a818b3a65a
📒 Files selected for processing (2)
.github/workflows/e2e-vitest-scenarios.yamltest/e2e-scenario/live/openclaw-tui-chat-correlation.test.ts
…claw-tui-chat-correlation
prekshivyas
left a comment
There was a problem hiding this comment.
Reviewed (code + 9-cat security). Migrates the openclaw TUI chat-correlation E2E (#2603/#3145) to a free-standing live Vitest test, adds a manual CI job + execShell/upload helpers, and fixes exec to pass the -n name flag.
✅ Approve — coverage is a strict superset (all 6 legacy correlation checks + 2 new ordering checks), and the load-bearing -n fix is verified correct: production uniformly uses -n/--name (agent/onboard.ts, initial-policy.ts, web-search-verify.ts), so the old positional name was a latent bug; the 4 unit-test argv updates + the args[11]→args[12] index shift are consistent.
Security: all 9 pass — secrets scoped per-step and machine-enforced by the new workflow-boundary.mts validator (no job-level secret env), actions SHA-pinned, persist-credentials:false, loopback-only websocket, sandbox names validated (rejects flag-shaped names), argv arrays (no shell injection), --ignore-scripts on npm ci. Nits: analyzer types are duplicated rather than imported (documented lockstep cost); the event-capture retry is tightly gated so it can't mask a real correlation/ordering failure.
Selective E2E Results — ✅ All requested jobs passedRun: 27306425374
|
OpenClaw TUI chat-correlation coverage slice
Refs: #4347, #2603, #3145, #5098, #5126
This PR adds a focused free-standing Vitest live coverage slice for the OpenClaw TUI chat-correlation websocket regression guard.
Current scope
This is not a full legacy-migration closeout. It keeps
test/e2e/test-openclaw-tui-chat-correlation.shand the retained nightly lane in place.Adds:
test/e2e-scenario/live/openclaw-tui-chat-correlation.test.tsopenclaw-tui-chat-correlation-vitestjob in.github/workflows/e2e-vitest-scenarios.yamlThe live test exercises a real cloud OpenClaw sandbox via Vitest fixture support and asserts the protocol/history subset of #2603/#3145:
Out of scope:
Simplicity check
Validation
Local:
npx tsc --noEmit --allowImportingTsExtensions --moduleResolution bundler --module esnext --target es2022 --types node,vitest/globals --skipLibCheck test/e2e-scenario/live/openclaw-tui-chat-correlation.test.tsnpx prettier --check .github/workflows/e2e-vitest-scenarios.yaml test/e2e-scenario/live/openclaw-tui-chat-correlation.test.tsnpx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts --reporter=dotLive local run was not completed because Docker is not running locally; CI/manual dispatch should provide live sandbox evidence.
Summary by CodeRabbit