[bug] Chat fails with 413 "prompt is too long" (>200k tokens) instead of trimming context to the model window #3852

@louis030195

Description

Summary

Chat hard-fails when the assembled prompt exceeds the model's context window. The conversation context is not trimmed or summarized to fit, so long chats become unusable instead of degrading gracefully.

AI error in chat: Error: 413 ... {"type":"error","error":{"type":"invalid_request_error","message":"prompt is too long: 206134 tokens > 200000 maximum"}} ... code "413"

Reported on app 2.5.4 (macOS).

Expected

When the assembled context would exceed the model window, trim or summarize older turns (and cap injected history and retrieved context) so the request fits, rather than returning a hard 413 to the user.
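A minimal sketch of the "trim older turns" part of this, not the app's actual code. `Turn`, `estimate_tokens`, and `trim_turns` are hypothetical names, and the four-characters-per-token estimate is an assumption standing in for the model's real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str
    content: str


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token). This is an assumption,
    # not a real tokenizer; a production guard should count tokens properly.
    return max(1, (len(text) + 3) // 4)


def trim_turns(turns: list[Turn], budget: int) -> list[Turn]:
    """Keep the newest turns whose combined estimate fits within budget."""
    kept: list[Turn] = []
    used = 0
    # Walk from newest to oldest so the most recent context survives.
    for turn in reversed(turns):
        cost = estimate_tokens(turn.content)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping whole turns from the oldest end keeps the logic simple; summarizing the dropped turns instead would preserve more context at the cost of an extra model call.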

Note

The recent change to always inject full conversation history (#3636) increases prompt size and may make this more likely. A token-budget guard on the final assembled prompt would cover both the existing overflow and any extra growth from #3636.
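One way such a guard could look, as a sketch under stated assumptions rather than a proposed patch. The 200,000 window comes from the error message above; `fit_prompt`, `take_within`, the 8,000-token reply reserve, the even split between retrieved context and history, and the character-based token estimate are all hypothetical.

```python
def estimate_tokens(text: str) -> int:
    # Assumed heuristic (about 4 characters per token), not a real tokenizer.
    return max(1, (len(text) + 3) // 4)


def take_within(items: list[str], budget: int) -> tuple[list[str], int]:
    """Take items in order until the next one would exceed budget."""
    kept: list[str] = []
    used = 0
    for item in items:
        cost = estimate_tokens(item)
        if used + cost > budget:
            break
        kept.append(item)
        used += cost
    return kept, used


def fit_prompt(system: str, user_message: str, history: list[str],
               retrieved: list[str], window: int = 200_000,
               reserve: int = 8_000) -> dict:
    """Cap retrieved context and history so the assembled prompt fits."""
    budget = window - reserve  # leave headroom for the reply (assumed value)
    fixed = estimate_tokens(system) + estimate_tokens(user_message)
    if fixed > budget:
        raise ValueError("system prompt and current message alone exceed the budget")
    remaining = budget - fixed
    # Retrieved context (assumed ranked best-first) gets at most half of the rest.
    kept_retrieved, used_r = take_within(retrieved, remaining // 2)
    # History gets whatever is left, filled from the newest message backwards.
    newest_first, used_h = take_within(list(reversed(history)), remaining - used_r)
    total = fixed + used_r + used_h
    assert total <= budget  # the final check on the assembled prompt
    return {
        "history": list(reversed(newest_first)),
        "retrieved": kept_retrieved,
        "estimated_tokens": total,
    }
```

Because the estimate is approximate, a real guard would likely want either an exact token count or a safety margin below the window; the failing request here was only about 3% over the limit.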

Possibly related: closed #2570 (chat overflow errors).

Source: in-app feedback, Jun 5 2026.

Metadata

Labels: bug
