estimatePromptTokens: include previous turn's candidatesTokenCount in steady-state estimate

**Scope**: follow-up to #4345 (auto-compaction three-tier ladder).

The steady-state branch of `estimatePromptTokens` currently returns:

```
return lastPromptTokenCount + estimateContentTokens([userMessage])
```

That covers the input sent on the **previous** turn plus the new user message, but misses the **model response from the previous turn**, which is appended to `history` between the API response handler and the next turn's prompt-size estimate. The resulting under-count is typically 500–5000 tokens.
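To make the gap concrete, here is a small arithmetic sketch. All three numbers are hypothetical and chosen only for illustration; the response size stands in for what the API would report as `candidatesTokenCount`, which only approximates how the response is counted once it is part of the next prompt.

```typescript
// Hypothetical numbers, for illustration only.
const lastPromptTokenCount = 90_000; // prompt size the API reported for the previous turn
const previousResponseTokens = 2_500; // model response appended to history afterwards
const newUserMessageTokens = 200; // estimate for the incoming user message

// Current steady-state estimate: previous prompt + new user message.
const currentEstimate = lastPromptTokenCount + newUserMessageTokens;

// What the next request roughly carries: the previous response is in history too.
const approximateActualPrompt =
  lastPromptTokenCount + previousResponseTokens + newUserMessageTokens;

// The under-count equals the size of the previous model response.
const underCount = approximateActualPrompt - currentEstimate;

console.log({ currentEstimate, approximateActualPrompt, underCount });
```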

## Why it matters now (didn't matter before)

Before #4345, the 70% threshold was far enough from the window edge that this under-count was inconsequential. The new hard tier sits only `HARD_BUFFER` (≈3K tokens) from the window edge, a margin that a single model response can easily exceed. When the real prompt has crossed `hard` but the estimate has not, hard-rescue does not fire and the API call overflows. Reactive recovery catches the overflow (no data loss), but the user first pays for a doomed API round-trip (~2–5s of latency).
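The failure mode can be sketched as follows. The window size, the token counts, the `>=` comparison, and the helper name `shouldHardRescue` are all assumptions made for this example; only the ≈3K `HARD_BUFFER` figure comes from the description above.

```typescript
// Hypothetical context window; HARD_BUFFER uses the approximate value quoted above.
const CONTEXT_WINDOW = 128_000;
const HARD_BUFFER = 3_000;
const HARD_THRESHOLD = CONTEXT_WINDOW - HARD_BUFFER;

// Hypothetical stand-in for the hard-rescue pre-call check.
function shouldHardRescue(estimatedPromptTokens: number): boolean {
  return estimatedPromptTokens >= HARD_THRESHOLD;
}

// Hypothetical turn: the previous prompt was just under the hard tier.
const lastPrompt = 124_000;
const previousResponse = 4_500;
const userMessage = 200;

const estimateWithoutResponse = lastPrompt + userMessage;
const realPrompt = lastPrompt + previousResponse + userMessage;

// The under-counted estimate stays below `hard`, so rescue is skipped...
const rescueFires = shouldHardRescue(estimateWithoutResponse);
// ...while the real prompt is already past the window edge.
const overflows = realPrompt > CONTEXT_WINDOW;
// With the previous response included, the check would have fired.
const rescueFiresWithFix = shouldHardRescue(realPrompt);

console.log({ rescueFires, overflows, rescueFiresWithFix });
```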

## Proposed fix

Plumb `lastCandidatesTokenCount` (the API's `candidatesTokenCount` from the previous turn's usage metadata) alongside `lastPromptTokenCount`:

- New private field on `GeminiChat`: `lastCandidatesTokenCount`
- Capture from `usageMetadata.candidatesTokenCount` in the streaming response handler alongside `promptTokenCount`
- Reset to 0 in:
  - external `setLastPromptTokenCount` seeder (inherited history has no anchor)
  - post-COMPRESSED branch (history rewritten → response absorbed into snapshot envelope, already counted in `info.newTokenCount`)
- `estimatePromptTokens` takes an optional `lastCandidatesTokenCount: number = 0` parameter; the steady-state branch adds it. The cold-start branch (`lastPromptTokenCount === 0`) is unchanged.

The single production caller (the `sendMessageStream` hard-rescue pre-call) passes `this.lastCandidatesTokenCount`.
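A minimal sketch of the bookkeeping described above, not the actual `GeminiChat` code. The class name `ChatTokenStateSketch` and the methods `recordUsage`, `onCompressed`, and `estimate` are hypothetical; setting `lastPromptTokenCount` to the post-compression count and taking the user-message estimate as a plain number are also assumptions. The cold-start branch is left out because it does not change.

```typescript
// Subset of the usage metadata fields referenced in the proposal.
interface UsageMetadataSketch {
  promptTokenCount?: number;
  candidatesTokenCount?: number;
}

// Hypothetical stand-in for the token state kept on GeminiChat.
class ChatTokenStateSketch {
  private lastPromptTokenCount = 0;
  private lastCandidatesTokenCount = 0;

  // Streaming response handler: capture both counts from the same usage metadata.
  recordUsage(usage: UsageMetadataSketch): void {
    if (usage.promptTokenCount !== undefined) {
      this.lastPromptTokenCount = usage.promptTokenCount;
      this.lastCandidatesTokenCount = usage.candidatesTokenCount ?? 0;
    }
  }

  // External seeder: inherited history has no response anchor, so reset to 0.
  setLastPromptTokenCount(count: number): void {
    this.lastPromptTokenCount = count;
    this.lastCandidatesTokenCount = 0;
  }

  // Post-COMPRESSED: the response is already counted in the new token count
  // (assumed here to become the new prompt anchor), so reset to 0.
  onCompressed(newTokenCount: number): void {
    this.lastPromptTokenCount = newTokenCount;
    this.lastCandidatesTokenCount = 0;
  }

  // Steady-state estimate: previous prompt + previous response + new user message.
  estimate(userMessageTokens: number): number {
    return (
      this.lastPromptTokenCount +
      this.lastCandidatesTokenCount +
      userMessageTokens
    );
  }
}

const state = new ChatTokenStateSketch();
state.recordUsage({ promptTokenCount: 90_000, candidatesTokenCount: 2_500 });
const steadyStateEstimate = state.estimate(200);
console.log(steadyStateEstimate);
```

Resetting in both places keeps a stale response count from being added on top of a prompt count that no longer corresponds to it.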

## Related

R10.1 of PR #4168 (archived at tag `pr-4168-archive-pre-revert`). The change can be cherry-picked cleanly from that commit.

estimatePromptTokens: include previous turn's candidatesTokenCount in steady-state estimate #4349
