Bound hard-tier rescue retries with hardRescueFailureCount counter #4346

@LaZzyMan

Description

Scope: follow-up to #4345 (auto-compaction three-tier ladder).

The hard-rescue mechanism in sendMessageStream fires when effectiveTokens >= hard, forcing compaction before the API would reject the prompt. Without a retry bound, a session whose history can't shrink (the model consistently produces unusable summaries, the network is broken, the compressible slice is too small to split, etc.) fires hard-rescue on every send, indefinitely, and each send burns one compression side-query.

Reactive overflow at the API layer catches the actual failure, so this is an operational cost (~2-5s latency per failed rescue, 1 side-query per send), not a correctness bug. But that cost is bounded by reactive overflow's own retries, not by the rescue itself.

Proposed fix

Add a hardRescueFailureCount field to GeminiChat, updated with a pessimistic increment pattern:

// pre-call
this.consecutiveFailures = 0;        // rescue overrides the cheap-gate breaker
this.hardRescueFailureCount += 1;    // pessimistic strike

compressionInfo = await this.tryCompress(prompt_id, model, true, ...);

// post-call: refund NOOP (history-too-small ≠ broken mechanism)
if (compressionInfo.compressionStatus === CompressionStatus.NOOP) {
  this.hardRescueFailureCount = Math.max(0, this.hardRescueFailureCount - 1);
}
// COMPRESSED success resets the counter via tryCompress's success branch

Bound the rescue trigger with hardRescueFailureCount < MAX_CONSECUTIVE_FAILURES (3). After 3 strikes, the rescue stops firing and reactive overflow becomes the sole defence layer.
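
To make the bound concrete, here is a minimal, synchronous sketch of the gated trigger. It is not the real GeminiChat: the class name HardRescueSketch, the sideQueries counter, the injected compress callback, and the FAILED status name are all assumptions made for illustration, and the real tryCompress call is async and takes more arguments.

```typescript
// Hypothetical stand-in for the real CompressionStatus; FAILED is an assumed
// name covering any failure status.
const CompressionStatus = {
  COMPRESSED: "COMPRESSED",
  NOOP: "NOOP",
  FAILED: "FAILED",
} as const;
type CompressionStatus = (typeof CompressionStatus)[keyof typeof CompressionStatus];

const MAX_CONSECUTIVE_FAILURES = 3;

// Stand-in for tryCompress: returns a status or throws (e.g. provider 5xx).
type Compressor = () => CompressionStatus;

class HardRescueSketch {
  hardRescueFailureCount = 0;
  sideQueries = 0; // illustration only: counts compression attempts

  private readonly hardThreshold: number;
  private readonly compress: Compressor;

  constructor(hardThreshold: number, compress: Compressor) {
    this.hardThreshold = hardThreshold;
    this.compress = compress;
  }

  // Returns true if a hard-rescue was attempted for this send.
  maybeRescue(effectiveTokens: number): boolean {
    const shouldRescue =
      effectiveTokens >= this.hardThreshold &&
      this.hardRescueFailureCount < MAX_CONSECUTIVE_FAILURES;
    if (!shouldRescue) return false;

    this.hardRescueFailureCount += 1; // pessimistic strike
    this.sideQueries += 1;

    let status: CompressionStatus | undefined;
    try {
      status = this.compress();
    } catch {
      status = undefined; // strike already recorded; nothing to refund
    }

    if (status === CompressionStatus.COMPRESSED) {
      this.hardRescueFailureCount = 0; // success resets the counter
    } else if (status === CompressionStatus.NOOP) {
      // refund: history too small is not a broken mechanism
      this.hardRescueFailureCount = Math.max(0, this.hardRescueFailureCount - 1);
    }
    return true;
  }
}
```

With a compressor that always reports failure, ten sends above the threshold trigger only three side-queries under this sketch; every later send skips the rescue and falls through to reactive overflow.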

Why pessimistic, not post-call increment

Incrementing only after the call, and only on a failure status, silently misses two failure shapes:

  • throws (provider 5xx / abort) → post-handler unreachable → strike not recorded
  • NOOP (curated history empty / MIN_COMPRESSION_FRACTION undercut) → neither success nor failure-status branch matches → strike not recorded

The pessimistic increment guarantees the strike sticks for every non-COMPRESSED outcome; NOOP is then the only outcome that gets refunded (because NOOP isn't a mechanism failure).
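
The two counting strategies can be compared side by side in a small sketch. The function names pessimisticStrike and postCallStrike, the "THREW" marker, and the FAILED status name are hypothetical; each function takes the current counter and one rescue attempt and returns the counter afterwards.

```typescript
// Assumed outcome names; FAILED stands for any failure status.
type Outcome = "COMPRESSED" | "NOOP" | "FAILED";
type Attempt = () => Outcome; // may throw (provider 5xx / abort)

// Runs one attempt and folds a throw into a marker value.
function run(attempt: Attempt): Outcome | "THREW" {
  try {
    return attempt();
  } catch {
    return "THREW";
  }
}

// Pessimistic: record the strike before the call, then adjust afterwards.
function pessimisticStrike(count: number, attempt: Attempt): number {
  const struck = count + 1;
  const outcome = run(attempt);
  if (outcome === "COMPRESSED") return 0;            // success resets
  if (outcome === "NOOP") return Math.max(0, struck - 1); // refund
  return struck;                                      // FAILED or THREW: strike sticks
}

// Post-call: increment only when a failure status comes back.
function postCallStrike(count: number, attempt: Attempt): number {
  const outcome = run(attempt);
  if (outcome === "COMPRESSED") return 0;
  if (outcome === "FAILED") return count + 1;
  // THREW: the post-call handler is never reached in the real flow.
  // NOOP: neither the success nor the failure branch matches.
  return count;
}

const boom: Attempt = () => {
  throw new Error("provider 5xx");
};

console.log(pessimisticStrike(0, boom)); // 1: strike recorded despite the throw
console.log(postCallStrike(0, boom));    // 0: strike lost
```

Under these assumptions the strategies diverge on the throwing case. NOOP leaves the counter unchanged in both, but under the pessimistic scheme that is the result of an explicit refund rather than an unmatched branch.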

Related
