test(auto-reply): remove runPreparedReply shield + add real-chain regression test (#2468)#2473
Merged
alexey-pelykh merged 2 commits intomainfrom Apr 22, 2026
Merged
Conversation
…ression test — Area 8 of #2336 (#2468) Removes the global vi.mock("./get-reply-run.js", () => ({ runPreparedReply: vi.fn(async () => undefined) })) from get-reply.test-mocks.ts — one of the five shields that hid the #2334 crash (TypeError: modelState.resolveDefaultThinkingLevel is not a function) from CI. Adds get-reply.real-chain.test.ts, a regression test that drives getReplyFromConfig → resolveReplyDirectives → handleInlineActions → runPreparedReply through the real code path on a first-turn webchat message. Only the CLI boundary (runReplyAgent), outbound delivery (routeReply), session-state init (initSessionState), and process-global dependencies are mocked. Neither resolveReplyDirectives nor handleInlineActions nor runPreparedReply is stubbed. Any future caller in runPreparedReply that assumes a field/method the real resolveReplyDirectives doesn't produce will surface at test time instead of at runtime — the exact regression mode of #2334. Two ancillary changes to the common test mocks so removing the shield doesn't break consumers of registerGetReplyCommonMocks: - resolveCommandAuthorization stub now returns ownerList: [] and senderIsOwner: false in addition to isAuthorizedSender: true (required by buildCommandContext). - ./directive-handling.js is now partial-mocked via vi.importActual so real parseInlineDirectives / isDirectiveOnly remain callable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lates session-run-registry (#2468) The new get-reply.real-chain.test.ts adds `vi.mock("../../agents/session-run-registry.js", ...)` to isolate the `resolveReplyDirectives → runPreparedReply` chain from actual session-run side effects. This is the **isolation** category per CONTRIBUTING.md § Fork-boundary mocks: mocking a dependency with side effects to unit-test logic that would otherwise require full integration setup. Not a stub-placeholder — session-run-registry is live.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Area 8 of #2336 (umbrella) / follow-up to #2463. Closes #2468.
vi.mock("./get-reply-run.js", () => ({ runPreparedReply: vi.fn(async () => undefined) }))fromsrc/auto-reply/reply/get-reply.test-mocks.ts— one of the five shields that hid gut(auto-reply): remove dead thinking-level and model-selection plumbing — TypeError on first-turn messages #2334 (TypeError: modelState.resolveDefaultThinkingLevel is not a function) from CI.src/auto-reply/reply/get-reply.real-chain.test.ts: drivesgetReplyFromConfig → resolveReplyDirectives → handleInlineActions → runPreparedReplythrough the real code path on a first-turn webchat message. Only the CLI boundary (runReplyAgent), outbound delivery (routeReply), and session/filesystem boundaries are mocked. No fabricated shapes for theresolveReplyDirectives→runPreparedReplyintegration boundary.registerGetReplyCommonMocksconsumer green:resolveCommandAuthorizationstub now returnsownerList: []andsenderIsOwner: falsealongsideisAuthorizedSender: true(required bybuildCommandContext)../directive-handling.jsis partial-mocked viavi.importActual, so realparseInlineDirectives/isDirectiveOnlyremain callable.Why this test has teeth
The #2334-class regression mode was: a caller in
runPreparedReplyassumed a method existed on a threaded object thatresolveReplyDirectivesdidn't provide. Every mock in the test harness paved over the gap. The new test does not mockresolveReplyDirectives,handleInlineActions, orrunPreparedReply— so any future shape drift between those three stages throws at test time. The assertion isrunReplyAgentwas called with the expectedcommandBody: proof the chain executed end-to-end.Scope
Two files:
src/auto-reply/reply/get-reply.test-mocks.ts— shield block deleted (replaced with doc comment), two mocks adjusted.src/auto-reply/reply/get-reply.real-chain.test.ts— new test.Explicitly out of scope for this PR (per #2468 body)
as never/as anyin test arrangement code" item — left for a follow-up.resolveReplyDirectivesmock inreset-hooks-fallback.test.ts— it is intentional for fallback-specific testing, not a global shield.Test plan
pnpm check(format + typecheck + lint + fork-integrity gates) → PASSget-reply.real-chain.test.ts→ 1/1 passget-reply.reset-hooks-fallback.test.ts→ 2/2 pass (no regression from common-mock updates)get-reply-run.media-only.test.ts→ 10/10 pass (no regression from shield removal)pnpm testunder project shards (to be verified by the auto-merge gate)Pre-existing failures in
src/auto-reply/reply/session.test.tsandsrc/auto-reply/reply/agent-runner.media-paths.test.tswere confirmed to reproduce on baselineorigin/main— not caused by this PR.Related
🤖 Generated with Claude Code