[bug] Chat fails with 413 "prompt is too long" (>200k tokens) instead of trimming context to the model window

## Summary
Chat hard-fails when the assembled prompt exceeds the model's context window. The conversation context is not trimmed or summarized to fit, so long chats become unusable instead of degrading gracefully.

```
AI error in chat: Error: 413 ... {"type":"error","error":{"type":"invalid_request_error","message":"prompt is too long: 206134 tokens > 200000 maximum"}} ... code "413"
```

Reported on app 2.5.4 (macOS).

## Expected
When the assembled context would exceed the model window, trim or summarize older turns (and cap injected history and retrieved context) so the request fits, rather than returning a hard 413 to the user.
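
For illustration, the trimming behavior described above could look roughly like the sketch below. The `ChatTurn` shape, `estimateTokens`, and `trimTurns` are hypothetical names, not taken from the app's code, and the 4-characters-per-token estimate is only a stand-in for a real tokenizer.

```typescript
// Hypothetical message shape; the app's real types are not shown in this issue.
interface ChatTurn {
  role: "user" | "assistant";
  content: string;
}

// Rough estimate (about 4 characters per token). This is an assumption for
// illustration; a real guard should use the provider's tokenizer instead.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the newest turns that fit within `budget` tokens, dropping the oldest first.
function trimTurns(turns: ChatTurn[], budget: number): ChatTurn[] {
  const kept: ChatTurn[] = [];
  let used = 0;
  for (let i = turns.length - 1; i >= 0; i--) {
    const cost = estimateTokens(turns[i].content);
    if (used + cost > budget) break; // everything older than this is dropped
    kept.unshift(turns[i]); // preserve chronological order
    used += cost;
  }
  return kept;
}
```

If even the newest turn exceeds the budget, this sketch returns an empty list, so a real implementation would also need to truncate or summarize that turn rather than drop it.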

## Note
The recent change to always inject full conversation history (#3636) increases prompt size and may make this failure more likely. A token-budget guard on the final assembled prompt would cover both the existing overflow and the extra growth from #3636.
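
A minimal sketch of such a guard follows, assuming the prompt is assembled from a system prompt, conversation history, and retrieved context. `PromptParts`, `fitToWindow`, the 30% share for retrieved context, and the character-based token estimate are all assumptions for illustration, not the app's actual design.

```typescript
// Hypothetical shape of the prompt parts; names are illustrative only.
interface PromptParts {
  system: string;
  history: string[];   // oldest first
  retrieved: string[]; // most relevant first
}

// Rough estimate (about 4 characters per token); swap in a real tokenizer.
const roughTokenCount = (text: string): number => Math.ceil(text.length / 4);

// Take items in order until the next one would exceed the cap.
function takeWithinCap(items: string[], cap: number): string[] {
  const out: string[] = [];
  let used = 0;
  for (const item of items) {
    const cost = roughTokenCount(item);
    if (used + cost > cap) break;
    out.push(item);
    used += cost;
  }
  return out;
}

// Fit the assembled prompt into the context window, leaving room for the reply.
function fitToWindow(parts: PromptParts, contextWindow: number, reservedForReply: number): PromptParts {
  const available = contextWindow - reservedForReply - roughTokenCount(parts.system);
  if (available <= 0) throw new Error("system prompt alone exceeds the token budget");
  // Assumed split: retrieved context gets at most 30% of what is left.
  const retrieved = takeWithinCap(parts.retrieved, Math.floor(available * 0.3));
  const retrievedUsed = retrieved.reduce((n, r) => n + roughTokenCount(r), 0);
  // Walk history newest-first so the oldest turns are dropped, then restore order.
  const history = takeWithinCap([...parts.history].reverse(), available - retrievedUsed).reverse();
  return { system: parts.system, history, retrieved };
}
```

With the limit from the error above, a call might look like `fitToWindow(parts, 200000, 4096)`, where the 4096 tokens reserved for the reply is an arbitrary placeholder.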

Possibly related: closed #2570 (chat overflow errors).

_Source: in-app feedback, Jun 5 2026._


_Issue #3852._