test(onboard): add helper-level Vitest coverage for docker-unreachable gateway-start abort (#4355)#5109
Conversation
…teway-start (#4355) Ahead of retiring test/e2e/test-docker-unreachable-gateway-start.sh, land the test plan as a stub vitest file and transition the migration inventory entry to bridge-probe so the intent is reviewable before the mocks land. The legacy script is structurally a Node-process unit test of startGateway() with a PATH-shimmed openshell binary, not a sandbox-lifecycle e2e. The unique regression target it protects is the startGatewayWithOptions() call-site contract: docker-unreachable classification short-circuits via pRetry.AbortError before any health-poll loop or post-start status/gateway-info probes. That contract is more faithfully covered by an in-process Vitest integration test next to the existing gateway-start-failure family than by anything the e2e-scenario probe model can express. Stage 2 (follow-up commits on this branch): implement the test bodies, flip the inventory entry to status=retired with deletionReady=true, delete the legacy script, and remove the docker-unreachable-gateway- start-e2e job from regression-e2e.yaml. Refs: #2347, #4355
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
📝 WalkthroughWalkthroughThis PR adds a Vitest integration test for Docker-unreachable gateway-start failure handling, covering platform-specific recovery messaging, exit behavior, and diagnostic suppression, and documents the test strategy in the migration inventory. ChangesDocker-unreachable Gateway Start Test Implementation
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related issues
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
E2E Advisor RecommendationRequired E2E: None Dispatch hint: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
E2E Scenario Advisor RecommendationRequired scenario E2E: None Full scenario advisor summaryE2E Scenario AdvisorBase: Required scenario E2E
Optional scenario E2E
Relevant changed files
|
PR Review AdvisorFindings: 0 needs attention, 2 worth checking, 0 nice ideas Review findings🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Consider writing more tests for
Since last review detailsCurrent findings:
This is an automated advisory review. A human maintainer must make the final merge decision. |
Replace the stage-1 it.todo() placeholders with the full coverage layer
described in the migration plan. 13 tests organized in five groups:
1. printDockerDaemonRecovery platform branches (darwin/linux/other)
— covers legacy assertions 2-3 (recovery message + colima hint).
2. handleFinalGatewayStartFailure({dockerUnreachable: true}) skips
diagnostics + cleanup and exits 1 — covers legacy assertions 1
and 7 (no openshell doctor logs collection).
3. Negative control — handleFinalGatewayStartFailure with
dockerUnreachable=false DOES collect diagnostics and clean up.
Guards against a future refactor that short-circuits the wrong
branch.
4. Composition tests — runs the same helper sequence the call site
uses (classify -> handleFinal -> exitProcess(1)) for both the
macOS Colima and Linux dockerd signatures, plus a negative
control on unrelated start failures.
5. Structural assertions on onboard.ts source proving the call-site
wiring is in place: docker_unreachable classification, co-located
pRetry.AbortError throw (which is what bypasses the retry loop
and prevents any health-poll), and dockerUnreachable forwarded
to handleFinalGatewayStartFailure. Catches refactors that drop
any link in the chain.
Layers 4 and 5 together cover legacy assertions 4 and 5 (no health-poll
log; no status/gateway-info probes after gateway-start failure):
the AbortError throw is the structural guarantee that the for-loop
containing those probes is never reached, and layer 5 ensures the
throw stays in place across refactors.
Imports the symbol via the compiled dist output (../../../dist/lib/
onboard) — same pattern as preflight.test.ts. handleFinalGatewayStart
Failure is exposed only via module.exports at the bottom of
onboard.ts, not as a TS export, and the source-level CommonJS
require() chain in onboard.ts -> onboard/gateway.ts -> onboard/env.ts
does not resolve cleanly under Vitest's TS interpreter without the
compiled output.
CI runs npm run build:cli before vitest in the CLI coverage shards, so
the dist import is satisfied at test time. Verified locally: 13/13
green.
Stage 2b (next commit on this branch): flip inventory entry to
status=retired with deletionReady=true, delete legacy script, remove
the docker-unreachable-gateway-start-e2e job from regression-e2e.yaml.
Refs: #2347, #4355
… flip The 'bridge-probe' status carries an invariant in test/e2e-scenario/framework-tests/e2e-migration-inventory.test.ts:130: `record.bridgeProbes.length > 0`. The Stage 1 commit set status to 'bridge-probe' but kept bridgeProbes=[] because this migration is a full retirement, not a partial bridge — there are no bash assertions remaining as bridge probes — so the invariant tripped on cli-test-shards (2). Revert status to 'not-migrated'. The retirement plan, the new test path, and the in-flight branch reference are preserved in the entry's notes field. Status will flip directly 'not-migrated' -> 'retired' in Stage 2b atomically with the legacy script delete and the regression-e2e.yaml job removal, satisfying the deletion-gate test's 'retired' branch (retiredReason non-empty). Verified locally: e2e-migration-inventory.test.ts 8/8 green; gateway-start-failure-integration.test.ts 13/13 still green. Refs: #4355
Stage 2b: complete the retirement now that the integration test is
green in CI. Atomic four-part change:
1. Inventory entry transitions not-migrated -> retired with
retiredReason filled, deletionReady=true, and
deletionApprovalIssue=#4357 (matching the gate in
e2e-migration-inventory.test.ts).
2. test/e2e/test-docker-unreachable-gateway-start.sh is deleted.
Its assertions are fully covered by
src/lib/onboard/gateway-start-failure-integration.test.ts
(landed in the previous commit on this branch).
3. The docker-unreachable-gateway-start-e2e job is removed from
.github/workflows/regression-e2e.yaml in three places: the
'Valid:' allowlist in the workflow_dispatch input description,
the select_regression_jobs outputs map, the includes_job guard
block, and the job definition itself.
4. The migration-inventory invariant test
(test/e2e-scenario/framework-tests/e2e-migration-inventory.test.ts)
is extended for the first-ever 'retired' entry. Two narrow
relaxations, both gated on status === 'retired':
- Skip repoPathExists(legacyScript) so a retired entry can keep
its legacyScript path for the audit trail after the file is
deleted.
- Filter retired entries out of the inventory<->filesystem
shell-script equality check, since their script no longer
exists on disk.
Non-retired entries still go through the full check; the
relaxation is opt-in by status.
Verified locally: 21/21 tests green across the integration test and
the inventory test suite. CI on the previous commit was 28/31 green
(0 failures, 2 SKIPPED, 1 null artifact); this commit only adds
deletion + invariant relaxations and is expected to remain green.
Marking PR ready for review in the same flow.
Refs: #2347, #4355
A local-only symlink to the main checkout's node_modules slipped into the previous commit via 'git add -A'. node_modules is in .gitignore; this restores that intent. No source change.
…4355) Address PR Review Advisor finding 1 (source-shape budget violation in spirit) and finding 2 (caller-level coverage gap): - Remove the entire 'call-site wiring (structural)' describe block: the three regex assertions reading onboard.ts as text, the beforeEach/afterEach managing the source string, and the extractFunctionBody helper. ci/source-shape-test-budget.json sets maxSourceShapeCases to 0, and the advisor flagged this as source-shape testing in spirit even though source-shape:check's detector did not catch the pattern. - Replace with an explicit 'call-site contracts (caller-level coverage gap)' describe block containing three it.todo entries for the call-site contracts the legacy bash script directly proved (no health-poll on docker-unreachable, no status/info probes after gateway-start failure, dockerUnreachable forwarded to handleFinal). The gap is now visible in test output instead of papered over with structural regexes. - Update the file-level header to a two-layer coverage strategy (unit + composition), document the gap, and explain the two options for closing it: ~10 vi.mock calls or extracting the inner pRetry async body of startGatewayWithOptions into an exported helper that takes streamGatewayStart as a DI parameter. Both are out of scope for this retirement PR; a follow-up issue will track option (b) so the it.todo entries can convert into real assertions. Result: 10 passing tests + 3 documented todo placeholders, no source-shape regexes, file shrinks 442 -> 369 lines. Verified locally: gateway-start-failure-integration.test.ts 10/10 green + 3 todo; e2e-migration-inventory.test.ts 8/8 green. source-shape:check still reports 0 cases. Refs: #2347, #4355
Address PR Review Advisor's new finding 'Inventory retiredReason
claims structural coverage that no longer exists' and the related
'TODO: link once filed' staleness.
- retiredReason previously claimed the replacement test included
'structural assertions on onboard.ts source proving the call-site
wiring'. That section was removed in commit 2f4c473 (advisor
finding 1, source-shape budget). Updated retiredReason to describe
the actual two-layer strategy (helper unit tests + composition
tests) and explicitly call out the three caller-level contracts
that remain as it.todo placeholders.
- Inventory notes field similarly tightened.
- Test file's '(TODO: link once filed)' replaced with concrete
reference to follow-up issue #5113 (Extract attemptGatewayStart
helper from startGatewayWithOptions for caller-level
testability).
This does not close the caller-level coverage gap itself \u2014 the
advisor's larger 'Needs attention' findings about the missing
caller-level proof remain. See PR description's 'Coverage gap
(acknowledged)' section.
Refs: #2347, #4355, #5113
…t.sh per advisor PR Review Advisor's escalated finding (3rd cycle, 'Needs attention' items still apply): caller-level coverage of the #2347 contracts (no health-poll loop entry, no openshell status/gateway info probes after docker-unreachable gateway-start failure, dockerUnreachable forwarded to handleFinalGatewayStartFailure) is not proven by the helper-level tests in this PR. The advisor explicitly recommended either landing that coverage in this PR or not marking the script fully retired until it exists. Choosing the latter: defer the full retirement to a follow-up PR after issue #5113 (extract attemptGatewayStart helper) lands, so the caller-level contracts can be exercised through a behavior-level seam. This commit reverts Stage 2b's deletions while keeping the helper-level Vitest coverage from earlier commits: - Restore test/e2e/test-docker-unreachable-gateway-start.sh (the legacy regression script). - Restore the docker-unreachable-gateway-start-e2e job in .github/workflows/regression-e2e.yaml (Valid: allowlist, select_regression_jobs outputs map, includes_job guard, full job definition). - Restore .../e2e-migration-inventory.test.ts to its origin/main shape; the status==='retired' relaxations are not needed yet and will land with the future deletion PR (their first user). - Inventory entry: status not-migrated -> not-migrated (no change), retiredReason cleared, deletionReady false, notes updated to describe the helper coverage that DID land plus the deferred-retirement plan and the #5113 follow-up. What this PR still delivers: - src/lib/onboard/gateway-start-failure-integration.test.ts: helper-level coverage of printDockerDaemonRecovery platform branches (darwin/linux/win32) + handleFinalGatewayStartFailure skip-diagnostics-and-cleanup behavior on dockerUnreachable=true + composition tests of the call-site helper sequence + classifier pinning for both signatures + 3 it.todo placeholders documenting the caller-level gap. - Inventory notes documenting the partial coverage and the follow-up plan. Refs: #2347, #4355, #5113
Add helper-level Vitest coverage for docker-unreachable gateway-start abort
Refs: #2347 (original regression), #4355 (owning migration issue), #5113 (follow-up: caller-level seam refactor)
Scope (re-scoped after PR Review Advisor feedback)
This PR adds a Vitest integration test next to the existing
gateway-start-failurefamily that covers the helper-level surface of the docker-unreachable abort path thattest/e2e/test-docker-unreachable-gateway-start.shhas historically guarded. The legacy script and itsregression-e2e.yamljob remain in place; full retirement is deferred to a follow-up PR after #5113 lands.Final diff vs
mainsrc/lib/onboard/gateway-start-failure-integration.test.tsit.todo)test/e2e-scenario/migration/legacy-inventory.jsonnotesfield updated to reference this PR + the partial coverage + the follow-upThat's it. No production-code changes, no workflow changes, no script deletions.
What the new test covers
Two layers around the helpers already exported from
src/lib/onboard/:Unit tests of
printDockerDaemonRecovery(darwin/linux/win32 branches) andhandleFinalGatewayStartFailure({dockerUnreachable: true})proving:exitProcess(1)is calledcollectDiagnostics(which would invokeopenshell doctor logs) is never calledcleanupGateway(which would invokedestroyGateway) is never calledprintErrorhandleFinalGatewayStartFailure({dockerUnreachable: false})DOES callcollectDiagnostics+cleanupGateway(guards against a future refactor that short-circuits the wrong branch)Composition tests running the same helper sequence the call site uses —
reportLegacyGatewayStartResultFailure→ checkfailure.kind === "docker_unreachable"→handleFinalGatewayStartFailure({dockerUnreachable: true, ...})→exitProcess(1)— for both the macOS Colima signature and the Linux dockerd signature, plus a negative control on unrelated start-failure output.Plus a classifier pinning suite for the exact byte sequences the legacy bash script generated, so deletion of the script doesn't lose those reference strings.
Caller-level coverage already exists; one narrow gap remains
[Updated after PR Review Advisor cycle 4 caught a missed prior-art reference.] The #2347 caller-level contracts the legacy bash script proved are already covered by an existing test in
test/onboard.test.ts:That test spawns a child Node process with a PATH-shimmed openshell binary returning the Docker socket-not-found signature, mocks
p-retry, loads the compiledstartGateway()fromdist/lib/onboard.js, and asserts:"Docker daemon is not running"in stderr"colima start"macOS hint in stderr"Waiting for gateway health"in stdout (no health-poll loop entry)HEALTH POLL REACHEDin stdout (nostatus/gateway infoprobes)Combined with the helper + composition tests in this PR, four of the five legacy assertions are already proven at the process level today. The only narrow remaining gap is direct evidence that
handleFinalGatewayStartFailurewas called withdockerUnreachable=truespecifically (vs some other exit-1 path) — distinguishable via the absence of the"Cleaning up failed gateway state..."line that the alternative branch prints.test/onboard.test.tsdoes not currently grep for that absence.That narrow gap is the one remaining
it.todoin this file. Follow-up issue #5113 tracks closing it; its scope is now much smaller than originally framed (likely just an additional assertion intest/onboard.test.ts, or a small extension ifattemptGatewayStartextraction proves cleaner).Why land this independently
handleFinalGatewayStartFailureskip-diagnostics-on-dockerUnreachable behavioronboard.tsto land; the seam refactor in Extract attemptGatewayStart helper from startGatewayWithOptions for caller-level testability #5113 is a separate concernSkill contract notes
This PR was driven by
/skill:nemoclaw-e2e-legacy-migrate. The skill's standard outputs assume a full migration; deviations on this entry:startGateway()with a PATH-shimmed openshell. Retirement to a Vitest test undersrc/lib/onboard/was the right architectural call.regression-e2e.yaml, notnightly-e2e.yaml. The skill's "GREEN in last 3 nightlies" baseline contract was inapplicable.Triage report:
.pi/reports/e2e-legacy-migrate/test-docker-unreachable-gateway-start-triage.md(not committed; user-local).🤖 Drafted via the
nemoclaw-e2e-legacy-migrateskill, partial-coverage path.