Problem
The auto-escalation threshold from flash -> pro is hard-coded to 3 in src/loop/turn-failure-tracker.ts:3. The TurnFailureTracker counts repair signals per turn (scavenged, truncated, repeat-loop), and the third one crosses into pro for the rest of the turn.
3 is reasonable for the average case but routinely fires "too early" on long subagent runs where flash naturally hits 2-3 minor tool-call argument truncations on big edit blocks without actually being stuck. Once it crosses to pro, the rest of the turn costs ~3x more, so the false-positive rate matters for cost.
There's no way to tune this short of editing the file. A user who's comfortable paying for one extra repair before escalating should be able to set it.
Proposed change
Make the threshold configurable, with the current value as the default. Two surfaces (mirrors the --budget / config.budgetUsd pattern):
- Config key: escalation.failureThreshold?: number in ReasonixConfig (default 3).
- Optional CLI flag on chat / code: --escalate-after <n> for one-shot overrides.
Plumb the resolved value into TurnFailureTracker's constructor (currently parameterless) and let the loop pass it through.
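A minimal sketch of what the constructor change could look like. The method names (record, shouldEscalate, reset), the RepairSignal type, and the constant name DEFAULT_FAILURE_THRESHOLD are assumptions for illustration; the real class in src/loop/turn-failure-tracker.ts may track more state and expose a different surface.

```typescript
// Hypothetical sketch, not the actual file contents.
// The default is exported so the CLI and config layers can share it.
export const DEFAULT_FAILURE_THRESHOLD = 3;

// Assumed names for the three repair-signal kinds mentioned in this issue.
type RepairSignal = "scavenged" | "truncated" | "repeat-loop";

class TurnFailureTracker {
  private readonly threshold: number;
  private count = 0;

  // Previously parameterless; the default keeps existing call sites working.
  constructor(threshold: number = DEFAULT_FAILURE_THRESHOLD) {
    this.threshold = threshold;
  }

  // Records one repair signal and reports whether the turn should now escalate.
  record(_signal: RepairSignal): boolean {
    this.count += 1;
    return this.shouldEscalate();
  }

  // True once the number of signals in this turn has reached the threshold.
  shouldEscalate(): boolean {
    return this.count >= this.threshold;
  }

  // Clears the count at the start of a new turn.
  reset(): void {
    this.count = 0;
  }
}
```

With the default argument, new TurnFailureTracker() behaves as today, and the loop only needs to pass a value when one was configured.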
If the value is out of a sane range (e.g. < 1 or > 20), warn and fall back to the default — same lenient parsing as parseBudgetFlag.
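The lenient parse described above might look roughly like this. The function name parseEscalateAfter, the injectable warn callback, and the integer-only requirement are assumptions; the 1 to 20 bounds are the example range from this issue, and the real parseBudgetFlag may be shaped differently.

```typescript
// Hypothetical sketch of a lenient parser for --escalate-after / the config key.
const DEFAULT_FAILURE_THRESHOLD = 3;
const MIN_FAILURE_THRESHOLD = 1; // example lower bound from the issue
const MAX_FAILURE_THRESHOLD = 20; // example upper bound from the issue

function parseEscalateAfter(
  raw: string | number | undefined,
  warn: (message: string) => void = console.warn,
): number {
  // Nothing supplied: use the default silently.
  if (raw === undefined) return DEFAULT_FAILURE_THRESHOLD;

  const value = typeof raw === "number" ? raw : Number(raw.trim());

  // Assumption: only whole numbers inside the range are accepted.
  const valid =
    Number.isInteger(value) &&
    value >= MIN_FAILURE_THRESHOLD &&
    value <= MAX_FAILURE_THRESHOLD;

  if (!valid) {
    warn(
      `Ignoring escalation threshold "${raw}": expected an integer between ` +
        `${MIN_FAILURE_THRESHOLD} and ${MAX_FAILURE_THRESHOLD}; ` +
        `using ${DEFAULT_FAILURE_THRESHOLD}.`,
    );
    return DEFAULT_FAILURE_THRESHOLD;
  }
  return value;
}
```

Passing warn as a parameter is only there to make the fallback path easy to assert on in a test; a direct console.warn call would work just as well.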
Scope
- src/loop/turn-failure-tracker.ts: accept the threshold via the constructor; export the default constant.
- src/loop.ts: read config.escalation?.failureThreshold once at construction and pass it into the tracker.
- src/config.ts: add the optional config shape.
- src/cli/index.ts: --escalate-after flag, lenient parse, resolution chain (flag > config > default).
- tests/: three cases: threshold=5 takes 5 signals before crossing; threshold=3 (default) preserves current behavior; an invalid value warns and falls back.
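As a rough illustration of the config shape and the resolution chain (flag > config > default) listed above: the helper name resolveFailureThreshold is hypothetical, and ReasonixConfig is reduced here to the one field this issue adds.

```typescript
// Hypothetical sketch; the real ReasonixConfig has other fields not shown here.
const DEFAULT_FAILURE_THRESHOLD = 3;

interface ReasonixConfig {
  escalation?: {
    // Number of repair signals in a turn before escalating from flash to pro.
    failureThreshold?: number;
  };
}

// Resolution order: CLI flag, then config key, then the built-in default.
// Both inputs are assumed to have been range-checked already.
function resolveFailureThreshold(
  flagValue: number | undefined,
  config: ReasonixConfig,
): number {
  return (
    flagValue ??
    config.escalation?.failureThreshold ??
    DEFAULT_FAILURE_THRESHOLD
  );
}
```

Using ?? rather than || means only a missing value falls through to the next source, so the order of precedence stays explicit.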
Out of scope:
- Per-failure-kind weighting (e.g. "truncated counts 0.5, repeat-loop counts 2").
- Slash-command runtime toggle.
- Telemetry on actual escalation rate.
Good first issue
Mechanical plumbing across the four source files listed under Scope (config and CLI -> loop -> tracker) with a clear test surface. The existing --budget / parseBudgetFlag pattern is the template. Comment-policy rules in CLAUDE.md apply.