test(cli): subprocess integration test harness + regression suite for opencode run by kitlangton · Pull Request #28230 · anomalyco/opencode

kitlangton · 2026-05-18T20:07:31Z

Closes the integration-test gap for the opencode run command. Today every cli/run/*.test.ts is a unit test of an extracted helper — nothing exercises the full handler end-to-end. Bugs that span argv → server boot → SDK call → event consumption → exit (like the original /event race or #27371's invalid-model hang) were invisible to in-process tests.

What's in this PR

Harness (`test/lib/run-process.ts`, `test/lib/test-provider.ts`)

withRunFixture(fn) — provisions a TestLLMServer running in-process at a random port, an isolated tmpdir for HOME/XDG_*, and a typed CLI invoker.
runIt.live(name, fixture => effect) — test-runner wrapper that's it.live + withRunFixture in one. Saves one nesting level at every call site.
OpencodeCli object exposed by the fixture:
- opencode.run(message, opts?) — typed builder for opencode run invocations.
- opencode.spawn(argv, opts?) — escape hatch for arbitrary CLI args.
- opencode.expectExit(result, code) — assertion helper that dumps captured stderr/stdout on mismatch.
- opencode.parseJsonEvents(stdout) — parses --format json line-delimited output.
testProviderConfig(url) — shared between the new harness and the existing httpapi-sdk.test.ts (extracted from a near-duplicate).

Configuration flows through opencode's built-in test affordances — OPENCODE_CONFIG_CONTENT (inline JSON), OPENCODE_TEST_HOME, OPENCODE_DISABLE_PROJECT_CONFIG, OPENCODE_PURE, plus the OPENCODE_DISABLE_* flags that suppress auto-update / auto-compact / models-fetch noise. No config files written.
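As a rough illustration of how such an environment could be assembled (a sketch, not the code in `test/lib/run-process.ts`): the function name `buildIsolatedEnv`, the XDG subdirectory layout, the `"1"` flag values, and the example config are assumptions. Only the four `OPENCODE_*` variable names come from the description above, and the unnamed `OPENCODE_DISABLE_*` flags are left out rather than guessed.

```typescript
// Hypothetical helper: builds the env block handed to the CLI subprocess.
// Everything points inside one throwaway directory, so no real config is read.
function buildIsolatedEnv(home: string, config: unknown) {
  return {
    HOME: home,
    // Assumed layout: the standard XDG defaults, rooted in the tmpdir.
    XDG_CONFIG_HOME: `${home}/.config`,
    XDG_DATA_HOME: `${home}/.local/share`,
    XDG_CACHE_HOME: `${home}/.cache`,
    XDG_STATE_HOME: `${home}/.local/state`,
    OPENCODE_TEST_HOME: home,
    // Inline JSON config, so no config file has to be written to disk.
    OPENCODE_CONFIG_CONTENT: JSON.stringify(config),
    // Assumption: "1" is accepted as a truthy flag value.
    OPENCODE_DISABLE_PROJECT_CONFIG: "1",
    OPENCODE_PURE: "1",
  }
}

// Usage with a made-up config object:
const exampleEnv = buildIsolatedEnv("/tmp/opencode-test-abc", { model: "test/test-model" })
```

Keeping the config inline means each test's environment is fully described by one plain object, which is easy to log when a run fails.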

Tests (`test/cli/run/run-process.test.ts`)

Happy path — prompt completes, output reaches stdout, exit 0.
Unknown-model regression for #27371 (fix(run): restore non-interactive exit behavior): the run must exit nonzero and finish in under 15s of wall-clock time. A re-introduced hang would expire the inner timeout, fail the durationMs assertion, and report distinctly from a real SDK error.
Mid-stream LLM error contract lock-in — when llm.fail(...) errors the SSE stream after the prompt was accepted, opencode currently exits 0. Captures the contract so a future cleanup (flipping session errors to nonzero exit) is explicit.
--format json shape — emits one JSON object per line on stdout. Each event has type and sessionID. At least one text event with the LLM's response. Locks in the wire shape for CI scripts and tooling.

All 4 pass locally in ~10s serial.
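The hang-versus-error distinction in the unknown-model test can be sketched as follows. This is an illustration under assumptions, not the harness code: `TimedResult`, `runWithDeadline`, and `classify` are hypothetical names, and it uses Node's synchronous `spawnSync` instead of the Effect-based spawn the harness uses.

```typescript
import { spawnSync } from "node:child_process"

// Hypothetical result shape; the real harness has its own RunResult type.
type TimedResult = {
  exitCode: number | null // null when the child was killed by a signal
  signal: string | null
  durationMs: number
  stdout: string
  stderr: string
}

// Run a command to completion; spawnSync kills the child once timeoutMs elapses.
function runWithDeadline(cmd: string, args: string[], timeoutMs: number): TimedResult {
  const started = Date.now()
  const child = spawnSync(cmd, args, { encoding: "utf8", timeout: timeoutMs })
  return {
    exitCode: child.status,
    signal: child.signal,
    durationMs: Date.now() - started,
    stdout: child.stdout ?? "",
    stderr: child.stderr ?? "",
  }
}

type Verdict = "ok" | "failed-fast" | "hang"

// A signal-killed or over-budget run is a hang; a quick nonzero exit is a real error.
function classify(result: TimedResult, budgetMs: number): Verdict {
  if (result.signal !== null || result.durationMs >= budgetMs) return "hang"
  return result.exitCode === 0 ? "ok" : "failed-fast"
}
```

With a 15s budget and a longer inner timeout, a regression of #27371 would surface as `"hang"`, while the fixed behavior surfaces as `"failed-fast"`.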

Verified

bun run test test/cli/run/run-process.test.ts — 4/4 green
bun typecheck — clean
bun run test test/server/httpapi-sdk.test.ts -t "streams sync-backed" — still green after the testProviderConfig extraction
A Windows CI failure on the latest run is a pre-existing scrollback.surface.test.ts flake unrelated to this PR — my new test passes on Windows in 4.2s.

Why subprocess and not in-process

In-process tests are faster but miss exactly the bugs we're worried about: argv parsing, signal handling, exit codes from the OS perspective, server auto-start, the SDK consuming a real SSE stream over a socket. Subprocess testing is the right tier for "integration" — it's what would have caught the /event race had it existed.

Per-test cost is ~3-5 seconds (opencode startup). Acceptable for a small focused suite. If CI cares later, a shared warm server via --attach is the natural next step.

Follow-ups (not in this PR)

Generalize the harness to withCliFixture so other commands (serve, acp, auth) can reuse the same pattern.
Smoke tests across all CLI commands as a behavioral fingerprint — useful safety net for the eventual Effect CLI migration.
Apply the run-exit cleanup that flips session errors to nonzero exit — test 3 captures the current contract, so the change becomes explicit rather than invisible.

Phase 1 of a broader effort to close the integration test gap for the run command. Today every test under cli/run/*.test.ts is a unit test of an extracted helper; nothing exercises the RunCommand handler end-to-end. Bugs that span argv → server boot → SDK call → event consumption → exit (like #27371 or the /event race) are invisible to in-process tests.

This commit adds:

- `test/lib/run-process.ts`: a `withRunFixture` helper that provisions a TestLLMServer running in-process, an isolated tmpdir for HOME/XDG, and a `runOpencode(args)` spawn function. The CLI subprocess talks to the fake LLM over real HTTP at a random port. Configuration flows through `OPENCODE_CONFIG_CONTENT` (inline JSON env var), bypassing file-search complexity. Background work (auto-update, auto-compact, models fetch, external plugins) is disabled via opencode's built-in test env vars.
- `test/cli/run/run-process.test.ts`: one smoke test that proves the harness wires up correctly: spawn `opencode run "say hi"` against a TestLLMServer queued with a single text response, assert exit 0 and the response appears on stdout.

The smoke test runs in ~4s. With this harness in place, Phase 2 will add the regression test suite (invalid-model hang, JSON format, midstream errors, --command path).

Two ergonomic changes after the first pass:

1. Replace the freestanding `runOpencode(["run", "--model", modelID, msg])` with a typed builder on the fixture: `opencode.run(msg, opts?)`. The fixture defaults the model so tests don't repeat it, and flags like `format`, `agent`, `command`, `printLogs` are typed instead of stringly. `opencode.spawn(argv)` stays as the escape hatch for arbitrary args.
2. Introduce `runIt.live(name, fixture => effect)` that wraps `it.live(name, () => withRunFixture(fixture))`. Saves one nesting level + the arrow-to-fixture closure at every call site.

`expectExit` and `RunResult` move under the `opencode` namespace returned by the fixture (no separate top-level export to chase).

Before:

    it.live("happy path", () =>
      withRunFixture(({ llm, modelID, runOpencode }) =>
        Effect.gen(function*() {
          yield* llm.text("hello")
          const result = yield* runOpencode(
            ["run", "--model", modelID, "say hi"],
            { timeoutMs: 30_000 },
          )
          expectExit(result, 0, "happy path")
          expect(result.stdout).toContain("hello")
        }),
      ),
    )

After:

    runIt.live("happy path", ({ llm, opencode }) =>
      Effect.gen(function*() {
        yield* llm.text("hello")
        const result = yield* opencode.run("say hi")
        opencode.expectExit(result, 0)
        expect(result.stdout).toContain("hello")
      }),
    )
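A typed builder of this kind might translate options into argv roughly as below. This is a sketch under assumptions: `buildRunArgv`, `RunOptions`, and the default model string are hypothetical, and the `--agent` and `--print-logs` flag spellings are guesses derived from the option names; only `--model`, `--format json`, and `--command` appear in this PR's text.

```typescript
// Hypothetical option type mirroring the flags named in the commit message.
type RunOptions = {
  model?: string
  format?: "json"
  agent?: string
  command?: string
  printLogs?: boolean
}

// Assumed placeholder; the real fixture supplies its own default model ID.
const DEFAULT_MODEL = "test/test-model"

function buildRunArgv(message: string, opts: RunOptions = {}): string[] {
  const argv = ["run", "--model", opts.model ?? DEFAULT_MODEL]
  if (opts.format) argv.push("--format", opts.format)
  if (opts.agent) argv.push("--agent", opts.agent) // flag spelling assumed
  if (opts.command) argv.push("--command", opts.command)
  if (opts.printLogs) argv.push("--print-logs") // flag spelling assumed
  argv.push(message) // the prompt goes last, as in the "Before" example
  return argv
}
```

Centralizing the mapping means a misspelled option becomes a type error at the call site instead of a silently ignored string.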

Two small wins from a simplify-pass review:

1. The fake-LLM provider config was duplicated between test/lib/run-process.ts (new) and test/server/httpapi-sdk.test.ts (existing). Same shape, modulo whitespace and parameter name. Extracted to test/lib/test-provider.ts and re-used from both.
2. Added a comment on `runIt` explaining why only `.live` is exposed: subprocess tests must use the real clock; TestClock can't drive a child process. Future readers won't have to wonder if `.only`/`.skip` were oversights.

Other simplify-pass findings reviewed and skipped:

- Using `tmpdirScoped` would add a ChildProcessSpawner layer dependency for code paths that don't need git; current 4-line mkdir+cleanup is simpler.
- Collapsing the argv conditionals via `.concat([])` is a style preference, not clearer.
- `OPENCODE_AUTH_CONTENT` is conceptually part of "test env isolation", not a separate concern; staying in `isolatedEnv`.
- crypto.randomUUID() for tmpdir naming: same Math.random pattern as the existing tmpdirScoped helper; collisions are theoretical given bun:test's default serial execution.

Three new tests using the harness from earlier commits:

1. Unknown-model regression for #27371: used to hang forever waiting on a session.status === idle event that never arrived. Asserts both nonzero exit and wall-clock under 15s (a hang would expire timeout and produce a different signal-killed failure).
2. Mid-stream LLM error contract lock-in: when llm.fail(...) errors the SSE response after the prompt was accepted, opencode currently exits 0. Captures that as the contract so a future cleanup (e.g. flipping session.error events to nonzero exit) is explicit.
3. --format json shape: emits one JSON object per line on stdout. Each event has `type` and `sessionID`. At least one `text` event with the LLM response. Locks in the wire shape for CI scripts and tooling.

Total: 4 tests, 10.8s in serial.

The `--format json` regression test was parsing stdout into events with six inline lines (split, trim, filter, parse, then validating). All three simplify-pass reviewers flagged this as a reusable helper.

Move it to OpencodeCli as `opencode.parseJsonEvents(stdout)`. The test collapses to a single call. Any future --format json test gets the same parsing for free, including the "throws loudly on malformed line" check.

Other simplify-pass findings reviewed and skipped:

- expectDurationUnder helper: only one test would use it; premature.
- Tighter outer test timeouts: the per-test durationMs assertion already detects hangs at the subprocess level; outer timeout is pure safety net.
- Streaming JSON parser: "O(n²)" claim doesn't apply at our scale (<100 events per test).
- Parallel test execution: TestLLMServer-style singletons in the existing test pattern make cross-test pollution likely; not worth the risk for marginal CI speedup.
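The split, trim, filter, parse, validate sequence described above could look something like this. It is a sketch of the idea rather than the actual `opencode.parseJsonEvents`: the `JsonEvent` type and the error messages are assumptions, and only the `type` and `sessionID` fields are taken from the PR text.

```typescript
// Assumed minimal event shape: every line carries `type` and `sessionID`.
type JsonEvent = { type: string; sessionID: string; [key: string]: unknown }

function parseJsonEvents(stdout: string): JsonEvent[] {
  return stdout
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // ignore blank lines between events
    .map((line, index) => {
      let parsed: unknown
      try {
        parsed = JSON.parse(line)
      } catch {
        // Fail loudly so stray non-JSON output is never silently skipped.
        throw new Error(`event ${index + 1} is not valid JSON: ${line}`)
      }
      if (typeof parsed !== "object" || parsed === null) {
        throw new Error(`event ${index + 1} is not a JSON object: ${line}`)
      }
      const { type, sessionID } = parsed as { type?: unknown; sessionID?: unknown }
      if (typeof type !== "string" || typeof sessionID !== "string") {
        throw new Error(`event ${index + 1} lacks string type/sessionID: ${line}`)
      }
      return { ...(parsed as Record<string, unknown>), type, sessionID }
    })
}
```

A test can then assert on the parsed events directly, for example that at least one has `type === "text"`, without repeating the parsing steps.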


github-actions Bot added the contributor label May 18, 2026

kitlangton added 2 commits May 18, 2026 16:35

kitlangton marked this pull request as ready for review May 18, 2026 21:21

kitlangton added 2 commits May 18, 2026 17:24

kitlangton changed the title ~~test(cli): subprocess integration test harness for opencode run [phase 1]~~ test(cli): subprocess integration test harness + regression suite for opencode run May 18, 2026

kitlangton merged commit 0f3d168 into dev May 18, 2026
10 of 12 checks passed

kitlangton deleted the worktree-run-integration-harness branch May 18, 2026 22:32

kitlangton mentioned this pull request May 18, 2026

refactor(test/lib): generalize run-process harness into cli-process #28253

Merged

AIALRA-0 pushed a commit to AIALRA-0/opencode-turn-engine that referenced this pull request Jun 10, 2026

test(cli): subprocess integration test harness + regression suite for…

096e440

… opencode run (anomalyco#28230)

AIALRA-0 pushed a commit to AIALRA-0/opencode-turn-engine that referenced this pull request Jun 10, 2026

test(cli): subprocess integration test harness + regression suite for…

0d17e29

… opencode run (anomalyco#28230)

avion23 pushed a commit to avion23/opencode that referenced this pull request Jun 10, 2026

test(cli): subprocess integration test harness + regression suite for…

513fa10

… opencode run (anomalyco#28230)

kitlangton merged 5 commits into dev from worktree-run-integration-harness
