Raise chat.history text cap to avoid premature truncation#53243
Raise chat.history text cap to avoid premature truncation#53243navendugoyal19 wants to merge 2 commits into
Conversation
Greptile SummaryThis PR raises the per-field text truncation cap (
Confidence Score: 5/5
Reviews (1): Last reviewed commit: "Raise chat.history text cap to avoid pre..." | Re-trigger Greptile |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6a974aab85
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| }; | ||
|
|
||
| const CHAT_HISTORY_TEXT_MAX_CHARS = 12_000; | ||
| const CHAT_HISTORY_TEXT_MAX_CHARS = 64_000; |
There was a problem hiding this comment.
Preserve non-ASCII history text before oversized fallback
Increasing CHAT_HISTORY_TEXT_MAX_CHARS to 64k allows long UTF-8-heavy text (for example CJK or emoji) to bypass truncateChatHistoryText, but those same messages can then exceed the 128 KiB per-message cap and be replaced entirely by [chat.history omitted: message too large] in the later oversized pass. That means chat.history can now drop readable long assistant replies for non-ASCII content that previously remained visible in truncated form; this affects messages roughly in the 43k–64k character range depending on byte density.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Updated this PR to address that exact failure mode. The branch now keeps the higher 64k text cap for long ASCII replies and adds a byte-aware fallback inside truncateChatHistoryText() so UTF-8-heavy replies are truncated to a readable prefix before the oversized-message placeholder path. I also added a regression test for an emoji-heavy assistant reply to verify we no longer collapse straight to [chat.history omitted: message too large].
Address the non-ASCII chat.history regression by adding a byte-aware fallback to text truncation. Keep long UTF-8 replies readable instead of replacing them with the oversized placeholder, and cover the behavior with a regression test for emoji-heavy assistant text.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: df04b818de
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (Buffer.byteLength(next, "utf8") > CHAT_HISTORY_TEXT_MAX_BYTES) { | ||
| const suffixBytes = Buffer.byteLength(CHAT_HISTORY_TRUNCATED_SUFFIX, "utf8"); | ||
| const maxPrefixBytes = Math.max(0, CHAT_HISTORY_TEXT_MAX_BYTES - suffixBytes); |
There was a problem hiding this comment.
Measure history text budget with JSON-encoded bytes
The new truncation logic budgets text with Buffer.byteLength(..., "utf8"), but the later oversize gate uses full JSON.stringify bytes, so escape-heavy text can still be replaced by [chat.history omitted: message too large] even when it could have been preserved by truncation. For example, a long assistant message containing many ", \\, or control characters can stay under CHAT_HISTORY_TEXT_MAX_BYTES here but exceed CHAT_HISTORY_MAX_SINGLE_MESSAGE_BYTES after JSON escaping, causing a full placeholder fallback (a regression from the previous much lower 12k cap).
Useful? React with 👍 / 👎.
|
This pull request has been automatically marked as stale due to inactivity. |
|
Codex review: needs real behavior proof before merge. Latest ClawSweeper review: 2026-05-22 14:16 UTC / May 22, 2026, 10:16 AM ET. Workflow note: Future ClawSweeper reviews update this same comment in place. How this review workflow works
Summary Reproducibility: yes. Source inspection and byte math show that current main can truncate a 50k ASCII assistant message before the 128 KiB serialized single-message cap, and the Control UI path can force an even lower 4,000-character cap. PR rating Rank-up moves:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. Real behavior proof Mantis proof suggestion Risk before merge
Maintainer options:
Next step before merge Security Review findings
Review detailsBest possible solution: Rebase or replace the branch on current main by moving serialized-byte-aware truncation into src/gateway/chat-display-projection.ts, aligning Control UI maxChars/config/default policy, and proving the Gateway plus WebChat history behavior. Do we have a high-confidence way to reproduce the issue? Yes. Source inspection and byte math show that current main can truncate a 50k ASCII assistant message before the 128 KiB serialized single-message cap, and the Control UI path can force an even lower 4,000-character cap. Is this the best way to solve the issue? No. The PR is not the best complete fix because it targets an older helper, budgets raw text bytes instead of serialized message bytes, and does not settle the Control UI/config cap policy. Label justifications:
Full review comments:
Overall correctness: patch is incorrect What I checked:
Likely related people:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 29118a0f0f0b. |
|
Codex review: found issues before merge. What this changes: This PR raises the old Maintainer follow-up before merge: An open contributor PR already exists and is paired with #53242, but it needs maintainer review/rebase because current main moved the relevant behavior into the shared projection helper and the branch still has a serialized-byte budgeting defect. Review findings:
Review detailsBest possible solution: Implement the behavior in Full review comments:
Overall correctness: patch is incorrect Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against e46dccb35374. |
|
Thanks for pointing this here. If I'm reading the proposed 64k cap correctly, it may look excessive at first glance, but the current practical behavior is much worse: Control UI/WebChat currently sends That is effectively message loss in WebChat, not just cosmetic clipping: #P2 #impact:message-loss. It also makes WebChat much harder to use for the exact case where people choose it over Telegram/mobile channels: reading larger answers, reports, summaries, logs, and debugging output on a bigger screen. @clawsweeper could you please take a look at #83151 and move any relevant labels/priority here if this PR is the canonical fix path? This bug is very painful in day-to-day WebChat usage. |
|
ClawSweeper PR egg 🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat. Where did the egg go?
|
|
Thanks @navendugoyal19. Closing this as superseded by the current WebChat history direction on The branch tries to raise the default Relevant current-main commits:
Proof on current node scripts/run-vitest.mjs src/gateway/server.chat.gateway-server-chat-b.test.ts ui/src/ui/app-sidebar-full-message.test.ts ui/src/ui/chat/grouped-render.test.ts ui/src/ui/controllers/chat.test.tsResult: passed 3 Vitest shards, 156 tests total. Appreciate the clear report and tests. The old default-cap increase is not planned now that WebChat has the bounded-history plus full-message-reader path. |
Summary
Raise the
chat.historytext cap from12_000to64_000and make truncation UTF-8 byte-aware so long assistant replies stay readable instead of being replaced by the oversized placeholder.What changed
CHAT_HISTORY_TEXT_MAX_CHARSinsrc/gateway/server-methods/chat.tsfrom12_000to64_000truncateChatHistoryText()so UTF-8-heavy text is truncated to a safe prefix before the128 * 1024single-message byte guard runs[chat.history omitted: message too large]Why
Before this change,
chat.historytruncatedcontent[].text,partialJson,arguments, andthinkingat 12k characters before the larger byte-budget logic ran.Raising the char cap alone fixes premature truncation for long ASCII replies, but it leaves a second failure mode: long UTF-8-heavy text can bypass the char cap and then be replaced entirely by the oversized placeholder during the later byte-budget pass.
This update fixes both paths:
Test
pnpm -C /tmp/openclaw-pr-53243 exec vitest run --config vitest.gateway.config.ts src/gateway/server.chat.gateway-server-chat-b.test.tsFixes #53242