fix(compressor): use text char sum for multimodal token estimation in _find_tail_cut_by_tokens by briandevans · Pull Request #16113 · NousResearch/hermes-agent

briandevans · 2026-04-26T15:38:35Z

Problem

_find_tail_cut_by_tokens calls len(content) to estimate message tokens. When content is a list of blocks (multimodal: text + image_url), len() returns the block count (e.g. 2) rather than character count.

A message with 500 chars of text + 1 image block was counted as ~10 tokens instead of ~135. This caused the backward walk to exhaust the entire message list before reaching the budget ceiling. The head_end safeguard then forced cut = n - min_tail, shrinking the protected tail to the bare 3-message minimum and preventing effective compression of multimodal conversations.

Closes #16087.

Root cause

# Before — wrong for multimodal content:
content = msg.get("content") or ""
msg_tokens = len(content) // _CHARS_PER_TOKEN + 10
# len([text_block, image_block]) == 2, not ~500

Fix

Mirror the existing pattern already used in _prune_old_tool_results (line 487):

raw_content = msg.get("content") or ""
content_len = (
    sum(len(p.get("text", "")) for p in raw_content)
    if isinstance(raw_content, list)
    else len(raw_content)
)
msg_tokens = content_len // _CHARS_PER_TOKEN + 10

Tests

Three new tests added to TestTokenBudgetTailProtection:

Test	Purpose
`test_multimodal_message_accumulates_text_chars_not_block_count`	Regression guard: fails with old code (tail has 3 msgs), passes with fix (4 msgs)
`test_plain_string_content_unchanged`	Plain string estimation unchanged after fix
`test_image_only_block_contributes_zero_text_chars`	Image-only blocks (no `text` key) contribute 0 chars + base overhead; no crash

All 55 tests in tests/agent/test_context_compressor.py pass.

… _find_tail_cut_by_tokens _find_tail_cut_by_tokens called len(content) to estimate message tokens. When content is a list of blocks (multimodal: text + image_url), len() returns block count (e.g. 2) rather than character count, so a message with 500 chars of text was counted as ~10 tokens instead of ~135. This caused the backward walk to exhaust all messages before hitting the budget ceiling; the head_end safeguard then forced cut = n - min_tail, shrinking the protected tail to the bare minimum and preventing effective compression of long multimodal conversations. Fix mirrors the existing pattern in _prune_old_tool_results (line 487): sum(len(p.get("text", "")) for p in raw_content) if isinstance(raw_content, list) else len(raw_content) Tests: 3 new cases in TestTokenBudgetTailProtection — regression guard (confirms the test fails with the bug), plain-string regression guard, and image-only block edge case. Fixes NousResearch#16087. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Fixes multimodal token estimation in the context compressor’s tail-protection logic so that multimodal content lists contribute token estimates based on summed text character length (instead of list length), preventing overly-large protected tails and enabling effective compression for vision/multimodal chats.

Changes:

Update _find_tail_cut_by_tokens to estimate tokens from summed text chars for multimodal content lists.
Add regression tests covering multimodal content, plain-string parity, and image-only blocks.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`agent/context_compressor.py`	Corrects tail token estimation for multimodal messages by summing text block lengths.
`tests/agent/test_context_compressor.py`	Adds regression tests to ensure multimodal tail protection behaves as intended.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-26T15:41:49Z

-            msg_tokens = len(content) // _CHARS_PER_TOKEN + 10  # +10 for role/metadata
+            raw_content = msg.get("content") or ""
+            content_len = (
+                sum(len(p.get("text", "")) for p in raw_content)


raw_content can be a list that includes non-dict parts (e.g., bare strings are supported/normalized elsewhere). Iterating with p.get(...) will raise AttributeError for string items and would crash context compression. Consider making the length calculation resilient by checking part types (dict vs str) and treating unknown parts as 0 or len(str(part)).

Suggested change

sum(len(p.get("text", "")) for p in raw_content)

sum(

len(p.get("text", ""))

if isinstance(p, dict)

else len(p)

if isinstance(p, str)

else len(str(p))

for p in raw_content

)

…t list raw_content from message["content"] can be a list that contains bare strings, not only dicts. The previous `p.get("text", "")` call raised AttributeError on string items, crashing context compression for any session that had a message with mixed content. Guard with isinstance checks: dict → .get("text"), str → len(p), fallback → len(str(p)). Adds a regression test covering the bare-string case that would have AttributeError'd on the pre-fix code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

briandevans · 2026-04-26T16:09:16Z

Thanks @copilot — valid catch. Addressed in 80ae2621.

Bare-string items in multimodal content list (line 1087): You're right that raw_content can be a list with non-dict items. The old p.get("text", "") call raises AttributeError on any bare string item in the list. Fixed by guarding with isinstance checks:

dict → .get("text", "")
str → len(p) directly
anything else → len(str(p))

Regression guard: running the old code against ["Hello, world!", {"type": "text", "text": "extra"}] raises AttributeError: 'str' object has no attribute 'get'; new code returns 23 (correct char sum). Added test_mixed_list_with_bare_strings_does_not_crash covering this case.

briandevans · 2026-04-26T17:38:01Z

Note re @alt-glitch's cross-reference to #16117:

	This PR (#16113)	#16117 (SimbaKingjoe)
Multimodal dict items ()	✅ `.get("text", "")`	✅ same
Bare string items in list (valid OpenAI wire format)	✅ `len(p)`	❌ excluded by `isinstance(p, dict)` → contributes 0
Non-dict, non-string items	✅ `len(str(p))` fallback	❌ excluded → contributes 0
Crash risk	none	none
Tests	3 targeted tests + 1 regression guard (bare-string case)	1 test (dict blocks only)

The bare-string item case is real: the Anthropic wire format allows a list with a single bare string alongside dict blocks, and the OpenAI format allows a list of strings as content. #16117's fix handles the common case but silently under-counts non-dict list items. Our fix mirrors the exact idiom in _prune_old_tool_results and extends it with the bare-string guard that Copilot caught.

Happy to defer to whichever fix the maintainers prefer.

The bare-string isinstance guard added in 80ae262 covered _find_tail_cut_by_tokens (line 1084) but missed the identical pattern in _calculate_protect_tail_boundary (line 487, the protect-tail scan loop). Both loops call .get("text", "") on every list item in message["content"]; both crash with AttributeError when that list contains a bare string. Apply the same dict/str/fallback isinstance guard to the protect-tail path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

briandevans · 2026-04-26T18:21:37Z

@copilot Valid catch — the bare-string guard was applied to _find_tail_cut_by_tokens in 80ae262 but missed the identical pattern in the protect-tail boundary scan loop (line 487). Fixed in 05cd158: same isinstance(p, dict) / isinstance(p, str) / len(str(p)) guard applied to both loops. All 56 tests/agent/test_context_compressor.py tests pass.

teknium1 · 2026-04-27T04:48:16Z

Salvaged via PR #16369 — your commits were cherry-picked onto current main with authorship preserved via rebase-merge. Thanks @briandevans!

Copilot AI review requested due to automatic review settings April 26, 2026 15:38

Copilot started reviewing on behalf of briandevans April 26, 2026 15:39 View session

Copilot AI reviewed Apr 26, 2026

View reviewed changes

alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder labels Apr 26, 2026

alt-glitch mentioned this pull request Apr 26, 2026

Fix/underestimates token count #16117

Closed

2 tasks

teknium1 mentioned this pull request Apr 27, 2026

fix(compressor): use text char sum for multimodal token estimation in _find_tail_cut_by_tokens #16369

Merged

teknium1 closed this Apr 27, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(compressor): use text char sum for multimodal token estimation in _find_tail_cut_by_tokens#16113

fix(compressor): use text char sum for multimodal token estimation in _find_tail_cut_by_tokens#16113
briandevans wants to merge 3 commits into
NousResearch:mainfrom
briandevans:fix/multimodal-tail-token-estimation-16087

briandevans commented Apr 26, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 26, 2026

Uh oh!

briandevans commented Apr 26, 2026

Uh oh!

briandevans commented Apr 26, 2026

Uh oh!

briandevans commented Apr 26, 2026

Uh oh!

teknium1 commented Apr 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

-                sum(len(p.get("text", "")) for p in raw_content)
+                sum(
+                    len(p.get("text", ""))
+                    if isinstance(p, dict)
+                    else len(p)
+                    if isinstance(p, str)
+                    else len(str(p))
+                    for p in raw_content
+                )

Conversation

briandevans commented Apr 26, 2026

Problem

Root cause

Fix

Tests

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

briandevans commented Apr 26, 2026

Uh oh!

briandevans commented Apr 26, 2026

Uh oh!

briandevans commented Apr 26, 2026

Uh oh!

teknium1 commented Apr 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants