Skip to content

gut(auto-reply): remove dead thinking-level and model-selection plumbing — TypeError on first-turn messages #2334

@alexey-pelykh

Description

@alexey-pelykh

Context

Slack inbound debounce flushes crash on first-turn messages without an explicit /think directive:

08:00:33 [slack] inbound debounce flush failed: TypeError: modelState.resolveDefaultThinkingLevel is not a function

The error surfaces at src/slack/monitor/message-handler.ts:172 but is not Slack-specific — it fires in runPreparedReply (src/auto-reply/reply/get-reply-run.ts:390), which is the shared reply runner for every channel. Slack just has the loudest error wrapper. Feishu has the same wrapper at extensions/feishu/src/monitor.account.ts:375, and every other channel reaches this code path via get-reply.ts → runPreparedReply.

The crash only fires when resolvedThinkLevel is falsy — i.e. messages with no /think directive and no session-remembered thinking level. That's most first-turn messages on any channel. The debounce swallows the error and the user sees a silent no-reply.

Root Cause

resolveDefaultThinkingLevel was removed from createModelSelectionState during prior gut sweeps, but the caller was not swept. The caller still does:

// src/auto-reply/reply/get-reply-run.ts:390
if (!resolvedThinkLevel) {
  resolvedThinkLevel = await modelState.resolveDefaultThinkingLevel();
}

…on a modelState object that no longer has that method. The sibling call site at get-reply.ts:338-340 already defends against this with ?? (() => undefined), but the guard was never applied to get-reply-run.ts:390.

Why TypeScript didn't catch it

Two createModelSelectionState declarations coexist with divergent shapes:

File Shape Role
src/auto-reply/reply/get-reply-directives.ts:19-26 Concrete object — no resolveDefaultThinkingLevel What resolveReplyDirectives actually builds
src/auto-reply/reply/model-selection.ts:4 (..._args) => ({}) as any Type source imported by get-reply-run.ts:42 and get-reply-inline-actions.ts:29
src/agents/model-selection.ts:155 (..._args) => ({}) as any Orphan stub, unused

get-reply-run.ts:155 declares modelState: Awaited<ReturnType<typeof createModelSelectionState>> — but the typeof points at the any-returning stub, so modelState.resolveDefaultThinkingLevel() type-checks as any.anything(). TypeScript is blind to the divergence.

Why tests didn't catch it

Five independent shields failed simultaneously:

  1. The branch is never exercised. get-reply-run.media-only.test.ts:129 hardcodes resolvedThinkLevel: "high", so the if (!resolvedThinkLevel) guard at line 389 is never entered. Line 390 has no test coverage.
  2. Every test fabricates modelState instead of using createModelSelectionState. Five files (get-reply-run.media-only.test.ts:138, get-reply.reset-hooks-fallback.test.ts:91, get-reply-inline-actions.skip-when-config-empty.test.ts:77, commands.test.ts:301, commands.test-harness.ts:47) hand in object literals with resolveDefaultThinkingLevel: async () => …, often cast as never. Tests validate a fiction.
  3. Every get-reply.ts test mocks out runPreparedReply globally. get-reply.test-mocks.ts:39-40 replaces it with vi.fn(async () => undefined). The integration between resolveReplyDirectives's real output and runPreparedReply's real input is never exercised.
  4. get-reply.reset-hooks-fallback.test.ts:6 also mocks resolveReplyDirectives. Both ends of the integration are mocked. There's nothing left to integrate.
  5. No channel-inbound → reply integration test exists. src/slack/monitor/message-handler/prepare.test.ts covers prepareSlackMessage only; dispatchPreparedSlackMessage → getReplyFromConfig → runPreparedReply is unit-test orphaned. LIVE=1 pnpm test:live is gated off CI.

The cumulative effect is that four gut sweeps (#2108, #2134, #2151, #2290) touched thinking/model code without any CI signal, because the test harnesses had drifted to match the new shape while the production caller hadn't.

Scope

src/auto-reply/reply/ — core plumbing (delete)

  • model-selection.ts — delete entirely (resolveStoredModelOverride + createModelSelectionState stub). Re-point any lingering imports to get-reply-directives.ts or delete the import.
  • get-reply-directives.ts — delete createModelSelectionState (lines 19-26), delete modelState from ReplyDirectiveContinuation (line 57), delete resolveDefaultThinkingLevel? (line 71), delete the createModelSelectionState({…}) call site (line 301).
  • get-reply.ts — delete the modelState threading (lines 277, 388), delete the defensive resolveDefaultThinkingLevel passthrough (lines 338-340).
  • get-reply-run.ts — delete the modelState field from RunPreparedReplyParams (line 155), delete the import (line 42), delete the crashing call block (lines 389-391), delete the XHigh gating block (lines 392-410) — CLI runtimes own XHigh support, we shouldn't gate it.
  • get-reply-inline-actions.ts — delete the modelState import (line 29), delete resolveDefaultThinkingLevel from the params shape (lines 110-112, 149), delete the two passthrough call sites (lines 322, 359).
  • get-reply-directives-apply.ts — delete the modelState field (line 54), the import (line 14), and the modelState: {…} construction at line 197.
  • directive-handling.ts — delete resolveDefaultThinkingLevel from the shape (line 20).
  • directive-handling.params.ts — delete the modelState field (line 38).
  • directive-handling.fast-lane.ts — delete the _modelState destructure (line 29).
  • commands-types.ts — delete resolveDefaultThinkingLevel from HandleCommandsParams (line 52).
  • commands-status.ts — delete resolveDefaultThinkingLevel field (line 50).
  • directive-handling.parse.ts — delete /think directive parsing.
  • thinking.ts — delete ThinkLevel, supportsXHighThinking, formatXHighModelHint, or reduce to a display-only enum if apps/* echoes CLI-reported levels.

src/agents/

  • model-selection.ts:155 — delete orphan createModelSelectionState stub.

src/plugin-sdk/

  • mattermost.ts:23 — delete the resolveStoredModelOverride re-export from the extension SDK.

Consumers of the stubbed resolveStoredModelOverride

  • extensions/mattermost/src/mattermost/model-picker.ts:6,232 — delete the import and the branch that calls it. The picker either passes the selected model name directly to the CLI or is deleted entirely if it has no other purpose.
  • src/discord/monitor/native-command.ts:40,485 — same treatment.

Session state

  • src/config/sessions/types.ts — delete thinkingLevel from SessionEntry.
  • src/gateway/protocol/schema/sessions.ts — delete the gateway schema field.
  • get-reply-run.ts:401-410 — the "downgrade xhigh → high" session mutation block disappears with the XHigh gating block above.

Session/queue/status plumbing (audit and remove)

  • resolvedThinkLevel threading through session.ts, followup-runner.ts, queue/types.ts, response-prefix-template.ts.

Tests — delete the fabrications

  • get-reply-run.media-only.test.ts:137-139 — delete the fabricated modelState.
  • get-reply.reset-hooks-fallback.test.ts:90-92 — delete the fabricated modelState.
  • get-reply-inline-actions.skip-when-config-empty.test.ts:77 — delete fabricated hook.
  • commands.test.ts:301 — delete fabricated hook.
  • commands.test-harness.ts:47 — delete fabricated hook.
  • get-reply.test-mocks.ts:39-40 — reconsider the global runPreparedReply mock; if the call target is reduced enough to be testable directly, unmock it so the integration runs.
  • Add at least one regression test that constructs modelState/continuation via the real resolveReplyDirectives path and drives runPreparedReply with resolvedThinkLevel: undefined. If the sweep deletes the entire call chain, the regression case becomes "no-op on first-turn message without /think" and should pass without fabrications.

Docs

  • docs/tools/slash-commands.md — remove /think entry.
  • docs/concepts/context.md, docs/web/tui.md, docs/channels/group-messages.md — remove /think references.

Display-only layers (audit, keep if pass-through)

  • apps/android/app/src/main/java/org/remoteclaw/**, apps/shared/RemoteClawKit/**, ui/src/ui/** — these reference thinkingLevel in chat UI. Keep if they're displaying what the CLI reports back. Remove if they drive state we no longer own. Audit per file.

Acceptance Criteria

  • remoteclaw responds to first-turn Slack messages without /think directive (no TypeError, no silent no-reply).
  • src/auto-reply/reply/model-selection.ts deleted.
  • createModelSelectionState symbol exists in zero source files (search the tree).
  • resolveDefaultThinkingLevel symbol exists in zero source files.
  • resolveStoredModelOverride symbol exists in zero source files.
  • modelState parameter/field exists in zero src/auto-reply/reply/ source files.
  • /think directive parsing deleted; related docs updated.
  • sessionEntry.thinkingLevel deleted from session types and gateway schema.
  • XHigh gating block at get-reply-run.ts:392-410 deleted.
  • All five test-file fabrications of resolveDefaultThinkingLevel deleted.
  • At least one regression test exercises runPreparedReply with resolvedThinkLevel: undefined without fabricating modelState.
  • pnpm check green.
  • pnpm test green.
  • pnpm canvas:a2ui:bundle && pnpm test green end-to-end (per project convention).
  • LIVE=1 pnpm test:live green on at least one channel smoke test that exercises the first-turn path (per CLAUDE.md PR submission workflow for middleware changes).
  • Display-only layers in apps/* and ui/* audited — each thinkingLevel reference either justified as pass-through or removed.
  • No as never/as any casts added in test arrangement code to keep things green (doing so would re-introduce the same blind spot).

Delivery

Two PRs, or a single PR with two commits:

  1. Hotfix commit: guard get-reply-run.ts:390 with modelState.resolveDefaultThinkingLevel?.(). One line, stops the bleed on any in-flight deploy.
  2. Sweep commit: everything above. Larger, deletion-heavy, low risk because the code being deleted is already dead — the runtime crash is the evidence.

Whoever takes this can also split by surface: one PR for the core plumbing sweep, a follow-up for the extension consumers (mattermost, discord) and display-layer audit. Either works; the hotfix should ship in the same release cycle as the first PR regardless.

Related

Metadata

Metadata

Assignees

Labels

gutRemoving dead upstream subsystems

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions