Summary
In AUTO approval mode, the two-stage LLM classifier (packages/core/src/permissions/classifier.ts) fails closed on any timeout — a timed-out judge call returns shouldBlock=true, unavailable=true and the action is blocked as an "infrastructure failure". The current stage timeouts are quite aggressive and, in practice, are easy to trip on slow networks or long transcripts, causing legitimate actions to be spuriously blocked.
Current behavior
packages/core/src/permissions/classifier.ts:37-39
export const STAGE1_TIMEOUT_MS = 3_000; // fast stage
export const STAGE2_TIMEOUT_MS = 10_000; // thinking/review stage
Both stages wrap the side query with AbortSignal.timeout(...). On timeout the request aborts and failClosed() / the stage-2 catch path returns an unavailable block. There is:
- no per-fetch timeout separate from the overall stage budget,
- no stall watchdog / retry headroom beyond runSideQuery's maxAttempts: 2,
- a single fixed budget regardless of transcript size.
3s / 10s is tight when the classifier call includes a large transcript or the network is slow; the user then sees the action blocked with "classifier unavailable" even though nothing is actually wrong with the action.
Request 1 — loosen the stage timeouts
Please consider raising the stage budgets (and/or adding a separate per-fetch timeout + a non-aborting stall log) so that a slow-but-healthy classifier call is not treated as a hard block. Comparable auto-mode classifiers in the same design lineage use far more generous budgets (tens of seconds for the fast stage, ~2 min for the review stage, plus a per-fetch timeout and retries). The exact numbers can be tuned, but 3s/10s appears to be the source of avoidable false blocks.
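To make the "per-fetch timeout + non-aborting stall log" idea concrete, here is a rough sketch. It is not taken from classifier.ts or runSideQuery: callWithBudget, BudgetOptions and every option name are hypothetical, and the retry policy is simplified (it retries on any rejection). It assumes the wrapped call honors the AbortSignal it receives.

```typescript
// Hypothetical helper, for illustration only; not the project's API.
interface BudgetOptions {
  stageBudgetMs: number; // overall deadline for the whole stage
  perAttemptMs: number;  // deadline for a single request
  stallLogMs: number;    // report (do not abort) once an attempt runs this long
  maxAttempts: number;
  onStall?: (attempt: number) => void;
}

// The wrapped call must reject when the signal aborts, otherwise an attempt never ends.
type Call<T> = (signal: AbortSignal) => Promise<T>;

async function callWithBudget<T>(call: Call<T>, opts: BudgetOptions): Promise<T> {
  const deadline = Date.now() + opts.stageBudgetMs;
  let lastError: unknown = new Error('stage budget exhausted before any attempt');

  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    const remaining = deadline - Date.now();
    if (remaining <= 0) break;

    // Each attempt gets its own abort timer, capped by what is left of the stage budget.
    const controller = new AbortController();
    const abortTimer = setTimeout(
      () => controller.abort(),
      Math.min(opts.perAttemptMs, remaining),
    );
    // The stall timer only reports; it never aborts the request.
    const stallTimer = setTimeout(() => opts.onStall?.(attempt), opts.stallLogMs);

    try {
      return await call(controller.signal);
    } catch (err) {
      // Simplification: retry on any rejection. A real implementation would
      // probably retry only timeouts and transient network errors.
      lastError = err;
    } finally {
      clearTimeout(abortTimer);
      clearTimeout(stallTimer);
    }
  }
  // The caller still decides what a final failure means (for example, fail closed).
  throw lastError;
}
```

With this shape, one slow attempt costs at most perAttemptMs rather than the whole stage budget, and the stage only fails closed once every attempt has failed or the overall budget is spent.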
Request 2 — disable thinking in ALL stages
packages/core/src/permissions/classifier.ts:222 currently enables thoughts in stage 2:
config: {
  temperature: 0,
  maxOutputTokens: 4096,
  thinkingConfig: { includeThoughts: true }, // stage 2
},
Stage 1 already sets includeThoughts: false. For a latency-sensitive permission gate, thinking should be disabled in every stage — enabling it on stage 2 makes the review path slower and more expensive, which directly worsens the timeout problem above. The model can still write its reasoning into the thinking field of the structured output without a reasoning budget being allocated.
Suggested direction
- Raise STAGE1_TIMEOUT_MS / STAGE2_TIMEOUT_MS to more forgiving values, and optionally add an independent per-request fetch timeout + retry/stall handling so a transient slow call doesn't fail closed.
- Set thinkingConfig: { includeThoughts: false } in stage 2 as well, so thinking is off across the board.
- (Optional) Make the timeouts configurable or scale them with transcript size, since the budget needs differ a lot between a tiny tool call and a large transcript.
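For the optional scaling idea, a minimal sketch of a size-aware budget might look like the following. TimeoutPolicy and scaledTimeoutMs are hypothetical names, transcript size is measured in characters purely for simplicity, and the numbers in the usage example are placeholders rather than measured or recommended values.

```typescript
// Hypothetical policy shape, for illustration only.
interface TimeoutPolicy {
  baseMs: number;        // budget for a tiny request
  perKiloCharMs: number; // extra budget per started 1,000 transcript characters
  maxMs: number;         // hard ceiling so the gate cannot wait indefinitely
}

// Linear scaling with a ceiling: base + (size in kilo-chars, rounded up) * rate.
function scaledTimeoutMs(transcriptChars: number, policy: TimeoutPolicy): number {
  const kiloChars = Math.ceil(Math.max(0, transcriptChars) / 1000);
  const extraMs = kiloChars * policy.perKiloCharMs;
  return Math.min(policy.maxMs, policy.baseMs + extraMs);
}

// Placeholder numbers, chosen only to show the shape of the curve.
const examplePolicy: TimeoutPolicy = { baseMs: 10_000, perKiloCharMs: 100, maxMs: 60_000 };
const smallCallMs = scaledTimeoutMs(1_500, examplePolicy);     // 10_200
const largeCallMs = scaledTimeoutMs(200_000, examplePolicy);   // 30_000
const cappedCallMs = scaledTimeoutMs(5_000_000, examplePolicy); // 60_000 (ceiling)
```

A linear rule with a cap is only one option; the same signature could read its policy from user configuration, which would also cover the "make the timeouts configurable" part of the suggestion.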
Impact
Reduces spurious "Auto mode classifier unavailable; action blocked for safety" blocks that interrupt otherwise-valid AUTO-mode sessions, and makes the judge cheaper/faster by not allocating a thinking budget on the review stage.