estimatePromptTokens: include previous turn's candidatesTokenCount in steady-state estimate #4349

@LaZzyMan

Description

Scope: follow-up to #4345 (auto-compaction three-tier ladder).

estimatePromptTokens's steady-state branch:

return lastPromptTokenCount + estimateContentTokens([userMessage])

This covers the input sent on the previous turn plus the new user message, but it misses the previous turn's model response, which is appended to history after the API response is handled and before the next turn's prompt size is estimated. The under-count is typically 500–5000 tokens.

Why it matters now (didn't matter before)

Before #4345, the 70% threshold was far enough from the window edge that this under-count was inconsequential. The new hard tier sits only HARD_BUFFER (≈3K) from the window edge, which is well within the size of one model response. When the real prompt has crossed the hard tier but the estimate has not, hard-rescue doesn't fire and the API call overflows. Reactive recovery catches it (no data loss), but the user first pays for a doomed API round-trip (~2-5s latency).
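
To make the failure mode concrete, here is a small worked example. Every number is an illustrative assumption (a 128,000-token window, HARD_BUFFER of exactly 3,000, a 4,000-token previous response), and the `>=` comparison is a guess at how the hard tier is checked, not a statement about the actual code.

```typescript
// Illustrative numbers only; none of these values come from the codebase.
const WINDOW_TOKENS = 128_000; // assumed context window size
const HARD_BUFFER = 3_000; // the issue says roughly 3K
const hardThreshold = WINDOW_TOKENS - HARD_BUFFER; // 125_000

const lastPromptTokenCount = 124_500; // prompt size the API reported last turn
const lastCandidatesTokenCount = 4_000; // previous model response, now in history
const userMessageTokens = 200; // local estimate for the new user message

// Current steady-state estimate: omits the previous response.
const currentEstimate = lastPromptTokenCount + userMessageTokens; // 124_700

// What is actually sent: previous prompt + previous response + new message.
const actualPrompt =
  lastPromptTokenCount + lastCandidatesTokenCount + userMessageTokens; // 128_700

// The estimate stays below the hard tier, so hard-rescue does not fire...
const hardRescueFires = currentEstimate >= hardThreshold; // false

// ...even though the real prompt is already past the window edge.
const exceedsWindow = actualPrompt > WINDOW_TOKENS; // true
```

With these numbers the estimate is 300 tokens short of the hard tier while the real prompt is 700 tokens past the window, which is the doomed round-trip described above.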

Proposed fix

Plumb lastCandidatesTokenCount (the API's candidatesTokenCount from the previous turn's usage metadata) alongside lastPromptTokenCount:

  • New private field on GeminiChat: lastCandidatesTokenCount
  • Capture from usageMetadata.candidatesTokenCount in the streaming response handler alongside promptTokenCount
  • Reset to 0 in:
    • external setLastPromptTokenCount seeder (inherited history has no anchor)
    • post-COMPRESSED branch (history rewritten → response absorbed into snapshot envelope, already counted in info.newTokenCount)
  • estimatePromptTokens takes optional lastCandidatesTokenCount: number = 0; steady-state branch adds it. Cold-start branch (lastPromptTokenCount === 0) unchanged

Single production caller (sendMessageStream hard-rescue pre-call) passes this.lastCandidatesTokenCount.
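
The steps above could look roughly like the following sketch. It is not the real GeminiChat code: `ChatTokenState`, `recordUsage`, `onCompressed`, and `estimateNext` are hypothetical names, the `Content` and `UsageMetadata` types are trimmed stand-ins, the 4-characters-per-token estimator is a placeholder, and the cold-start behaviour (estimating history plus the new message locally) is an assumption.

```typescript
// Simplified stand-ins for the real types; only the fields used here.
interface Part { text?: string; }
interface Content { role: string; parts: Part[]; }
interface UsageMetadata { promptTokenCount?: number; candidatesTokenCount?: number; }

// Placeholder estimator (assumption): roughly 4 characters per token.
function estimateContentTokens(contents: Content[]): number {
  let chars = 0;
  for (const content of contents) {
    for (const part of content.parts) {
      chars += (part.text ?? '').length;
    }
  }
  return Math.ceil(chars / 4);
}

function estimatePromptTokens(
  history: Content[],
  userMessage: Content,
  lastPromptTokenCount: number,
  lastCandidatesTokenCount: number = 0,
): number {
  if (lastPromptTokenCount === 0) {
    // Cold start: no API anchor yet (assumed to estimate everything locally).
    return estimateContentTokens([...history, userMessage]);
  }
  // Steady state: previous prompt + previous response + new user message.
  return (
    lastPromptTokenCount +
    lastCandidatesTokenCount +
    estimateContentTokens([userMessage])
  );
}

// Hypothetical holder for the two counters the issue proposes to track.
class ChatTokenState {
  private lastPromptTokenCount = 0;
  private lastCandidatesTokenCount = 0;

  // Called from the streaming response handler with the API's usage metadata.
  recordUsage(usage: UsageMetadata): void {
    if (usage.promptTokenCount !== undefined) {
      this.lastPromptTokenCount = usage.promptTokenCount;
    }
    if (usage.candidatesTokenCount !== undefined) {
      this.lastCandidatesTokenCount = usage.candidatesTokenCount;
    }
  }

  // External seeder: inherited history has no response anchor, so reset to 0.
  setLastPromptTokenCount(count: number): void {
    this.lastPromptTokenCount = count;
    this.lastCandidatesTokenCount = 0;
  }

  // After compression the response is already counted in newTokenCount.
  onCompressed(newTokenCount: number): void {
    this.lastPromptTokenCount = newTokenCount;
    this.lastCandidatesTokenCount = 0;
  }

  // Pre-call estimate, as the hard-rescue check would request it.
  estimateNext(history: Content[], userMessage: Content): number {
    return estimatePromptTokens(
      history,
      userMessage,
      this.lastPromptTokenCount,
      this.lastCandidatesTokenCount,
    );
  }
}
```

Defaulting the new parameter to 0 keeps any existing call that omits it on the old behaviour, so only the hard-rescue caller changes its result.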

Related

R10.1 of PR #4168 (archived at tag pr-4168-archive-pre-revert). The change can be cherry-picked cleanly from that commit.
