test(auto-reply): remove runPreparedReply shield + add real-chain regression test (#2468) by alexey-pelykh · Pull Request #2473 · remoteclaw/remoteclaw

alexey-pelykh · 2026-04-22T15:03:47Z

Summary

Area 8 of #2336 (umbrella) / follow-up to #2463. Closes #2468.

Remove the global vi.mock("./get-reply-run.js", () => ({ runPreparedReply: vi.fn(async () => undefined) })) from src/auto-reply/reply/get-reply.test-mocks.ts — one of the five shields that hid gut(auto-reply): remove dead thinking-level and model-selection plumbing — TypeError on first-turn messages #2334 (TypeError: modelState.resolveDefaultThinkingLevel is not a function) from CI.
Add src/auto-reply/reply/get-reply.real-chain.test.ts: drives getReplyFromConfig → resolveReplyDirectives → handleInlineActions → runPreparedReply through the real code path on a first-turn webchat message. Only the CLI boundary (runReplyAgent), outbound delivery (routeReply), and session/filesystem boundaries are mocked. No fabricated shapes for the resolveReplyDirectives → runPreparedReply integration boundary.
Two ancillary mock updates to keep the existing registerGetReplyCommonMocks consumer green:
- resolveCommandAuthorization stub now returns ownerList: [] and senderIsOwner: false alongside isAuthorizedSender: true (required by buildCommandContext).
- ./directive-handling.js is partial-mocked via vi.importActual, so real parseInlineDirectives / isDirectiveOnly remain callable.

Why this test has teeth

The #2334-class regression mode was: a caller in runPreparedReply assumed a method existed on a threaded object that resolveReplyDirectives didn't provide. Every mock in the test harness paved over the gap. The new test does not mock resolveReplyDirectives, handleInlineActions, or runPreparedReply — so any future shape drift between those three stages throws at test time. The assertion is runReplyAgent was called with the expected commandBody: proof the chain executed end-to-end.

Scope

Two files:

src/auto-reply/reply/get-reply.test-mocks.ts — shield block deleted (replaced with doc comment), two mocks adjusted.
src/auto-reply/reply/get-reply.real-chain.test.ts — new test.

Explicitly out of scope for this PR (per #2468 body)

The "consider a lint rule for as never / as any in test arrangement code" item — left for a follow-up.
Other gut(auto-reply): finish thinking-level and model-selection sweep — follow-up to #2335 #2336 areas (3, 5, 6, 7) — directive removal, thinking.ts helpers, session state, directive handling internals, docs, display audit.
The test-local resolveReplyDirectives mock in reset-hooks-fallback.test.ts — it is intentional for fallback-specific testing, not a global shield.

Test plan

pnpm check (format + typecheck + lint + fork-integrity gates) → PASS
get-reply.real-chain.test.ts → 1/1 pass
get-reply.reset-hooks-fallback.test.ts → 2/2 pass (no regression from common-mock updates)
get-reply-run.media-only.test.ts → 10/10 pass (no regression from shield removal)
CI: full pnpm test under project shards (to be verified by the auto-merge gate)
CI: build + docs jobs

Pre-existing failures in src/auto-reply/reply/session.test.ts and src/auto-reply/reply/agent-runner.media-paths.test.ts were confirmed to reproduce on baseline origin/main — not caused by this PR.

Parent: gut(auto-reply): finish thinking-level and model-selection sweep — follow-up to #2335 #2336 (umbrella)
Bug this shield originally hid: gut(auto-reply): remove dead thinking-level and model-selection plumbing — TypeError on first-turn messages #2334
Prior partial sweep: gut(auto-reply): partial thinking-level + model-selection sweep — Areas 1+2+4 of #2336 #2463

🤖 Generated with Claude Code

…ression test — Area 8 of #2336 (#2468) Removes the global vi.mock("./get-reply-run.js", () => ({ runPreparedReply: vi.fn(async () => undefined) })) from get-reply.test-mocks.ts — one of the five shields that hid the #2334 crash (TypeError: modelState.resolveDefaultThinkingLevel is not a function) from CI. Adds get-reply.real-chain.test.ts, a regression test that drives getReplyFromConfig → resolveReplyDirectives → handleInlineActions → runPreparedReply through the real code path on a first-turn webchat message. Only the CLI boundary (runReplyAgent), outbound delivery (routeReply), session-state init (initSessionState), and process-global dependencies are mocked. Neither resolveReplyDirectives nor handleInlineActions nor runPreparedReply is stubbed. Any future caller in runPreparedReply that assumes a field/method the real resolveReplyDirectives doesn't produce will surface at test time instead of at runtime — the exact regression mode of #2334. Two ancillary changes to the common test mocks so removing the shield doesn't break consumers of registerGetReplyCommonMocks: - resolveCommandAuthorization stub now returns ownerList: [] and senderIsOwner: false in addition to isAuthorizedSender: true (required by buildCommandContext). - ./directive-handling.js is now partial-mocked via vi.importActual so real parseInlineDirectives / isDirectiveOnly remain callable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…lates session-run-registry (#2468) The new get-reply.real-chain.test.ts adds `vi.mock("../../agents/session-run-registry.js", ...)` to isolate the `resolveReplyDirectives → runPreparedReply` chain from actual session-run side effects. This is the **isolation** category per CONTRIBUTING.md § Fork-boundary mocks: mocking a dependency with side effects to unit-test logic that would otherwise require full integration setup. Not a stub-placeholder — session-run-registry is live.

alexey-pelykh enabled auto-merge (squash) April 22, 2026 15:03

alexey-pelykh merged commit d45deb1 into main Apr 22, 2026
15 checks passed

alexey-pelykh deleted the test/2468-runreply-shield-removal branch April 22, 2026 15:14

alexey-pelykh mentioned this pull request Apr 22, 2026

gut(session): drop SessionEntry.thinkingLevel field + gateway schema — Area 3 of #2336 #2476

Merged

12 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(auto-reply): remove runPreparedReply shield + add real-chain regression test (#2468)#2473

test(auto-reply): remove runPreparedReply shield + add real-chain regression test (#2468)#2473
alexey-pelykh merged 2 commits intomainfrom
test/2468-runreply-shield-removal

alexey-pelykh commented Apr 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

alexey-pelykh commented Apr 22, 2026

Summary

Why this test has teeth

Scope

Explicitly out of scope for this PR (per #2468 body)

Test plan

Related

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant