fix(agent): prevent GLM stop-to-length heuristic false positives (#14572) by aj-nt · Pull Request #15463 · NousResearch/hermes-agent

aj-nt · 2026-04-25T01:28:14Z

Summary

Fixes #14572: the Ollama/GLM stop-to-length heuristic misclassified as truncated every response that ended with an emoji sign-off (💛, ✨, 🙌), a Markdown link, or conversational text lacking terminal punctuation. Each false positive wasted up to 3 continuation API calls per turn.

Root cause: `_has_natural_response_ending()` only recognized ASCII and CJK punctuation as natural endings. Any response ending with an emoji, symbol, or bare word was classified as truncated.

Three-pronged fix

1. Expand `_has_natural_response_ending()` — emoji and symbol recognition

- Strip trailing variation selectors (U+FE0F) and zero-width joiners (U+200D) before checking the last character
- Treat `unicodedata.category()` values `So` (Other_Symbol), `Sk` (Modifier_Symbol), and `Sm` (Math_Symbol) as natural endings
- Also accept code points in U+1F000–U+1FAFF, a heuristic approximation of the Extended_Pictographic range
- Removed the dead hardcoded string "✓✔✗✘♠♣♥♦♪♫☀☁☂★☆": every character was already caught by the `So` category check (the comment incorrectly claimed they were "outside those ranges")
- Added the `Sm` category to cover arrows (→ ←) and math symbols (∞ ≈) that commonly end structured responses
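
The checks listed above can be sketched roughly as follows. This is a minimal standalone illustration, not the code from the PR: the function is written as a free function rather than a method, and the `TERMINAL_PUNCTUATION` set is an assumed stand-in for the ASCII/CJK punctuation the existing heuristic already recognizes.

```python
import unicodedata

# Assumption: a stand-in for the ASCII/CJK punctuation the heuristic already accepted.
TERMINAL_PUNCTUATION = ".!?:;)]}\"'…。！？"

# Invisible code points that can trail an emoji: variation selector-16 and ZWJ.
TRAILING_INVISIBLES = "\ufe0f\u200d"

# Unicode general categories treated as natural endings.
SYMBOL_CATEGORIES = {"So", "Sk", "Sm"}


def has_natural_response_ending(text: str) -> bool:
    """Return True if the text appears to end on purpose rather than mid-thought."""
    # Drop trailing whitespace first, then any trailing invisible code points.
    stripped = text.rstrip().rstrip(TRAILING_INVISIBLES)
    if not stripped:
        return False
    last = stripped[-1]
    if last in TERMINAL_PUNCTUATION:
        return True
    if unicodedata.category(last) in SYMBOL_CATEGORIES:
        return True
    # Heuristic pictographic range from the PR description.
    return 0x1F000 <= ord(last) <= 0x1FAFF


if __name__ == "__main__":
    for sample in ["Thanks! 💛", "Love it ❤\ufe0f", "Next step →", "and then the"]:
        print(repr(sample), has_natural_response_ending(sample))
```

Stripping the invisible code points before inspecting the last character matters because an emoji such as ❤ followed by U+FE0F would otherwise be judged by the selector, whose category is `Mn`, rather than by the symbol itself.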

2. 500-char minimum-length gate in `_should_treat_stop_as_truncated()`

- Responses under 500 visible chars are almost certainly complete, since they are too short to have hit a meaningful token limit
- Retains the original 20-char/no-whitespace short-junk detection
- Eliminates the vast majority of false positives from conversational replies
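
A rough sketch of how such a length gate might sit in front of the ending check is shown below. The function signature, the `MIN_VISIBLE_CHARS` name, and the `ends_naturally` parameter are assumptions made for illustration. The 20-char/no-whitespace short-junk check is deliberately left out because its exact behavior is not described here.

```python
# Assumption: the threshold name is hypothetical; the value comes from the PR description.
MIN_VISIBLE_CHARS = 500


def should_treat_stop_as_truncated(visible_text: str, ends_naturally: bool) -> bool:
    """Decide whether a 'stop' finish should be handled like a length cutoff.

    `ends_naturally` is the result of the natural-ending check, computed elsewhere.
    """
    # Gate: short responses are assumed complete and never trigger a continuation.
    if len(visible_text.strip()) < MIN_VISIBLE_CHARS:
        return False
    # Long responses are suspicious only when they lack a natural ending.
    return not ends_naturally


if __name__ == "__main__":
    short_reply = "Sounds good, talk soon"
    long_reply = "word " * 200  # about 1000 characters
    print(should_treat_stop_as_truncated(short_reply, ends_naturally=False))
    print(should_treat_stop_as_truncated(long_reply, ends_naturally=False))
```

Placing the gate first means a short conversational reply is accepted even when the ending check would have rejected it, which is the case the PR describes as the main source of false positives.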

3. Config opt-out — `agent.glm_truncation_heuristic`

- New flag in `cli-config.yaml.example`, defaults to `true` (heuristic enabled)
- Set to `false` to disable the heuristic entirely
- Read with `getattr(self, "_glm_truncation_heuristic_enabled", True)` for backwards compatibility
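
The backwards-compatibility point can be illustrated with a small sketch. The two classes and the config dictionary layout below are hypothetical stand-ins. Only the attribute name, the `agent.glm_truncation_heuristic` key, and the `getattr` default come from the PR description.

```python
class LegacyAgent:
    """Hypothetical agent object built by code that predates the flag."""


class ConfiguredAgent:
    """Hypothetical agent that copies the flag from a parsed config mapping."""

    def __init__(self, config: dict):
        agent_cfg = config.get("agent", {})
        # A missing key falls back to True, so the heuristic stays enabled by default.
        self._glm_truncation_heuristic_enabled = bool(
            agent_cfg.get("glm_truncation_heuristic", True)
        )


def heuristic_enabled(agent: object) -> bool:
    # getattr with a default avoids AttributeError on objects that never set the flag.
    return getattr(agent, "_glm_truncation_heuristic_enabled", True)


if __name__ == "__main__":
    print(heuristic_enabled(LegacyAgent()))
    print(heuristic_enabled(ConfiguredAgent({"agent": {"glm_truncation_heuristic": False}})))
```

Reading the attribute through `getattr` means any code path or test fixture that constructs the agent without setting the attribute keeps the pre-existing behavior.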

Refactoring pass (commit 2)

- Moved `import unicodedata` from inline in the method body to module top-level
- Added `Sm` (Math_Symbol) to the category check for arrows and math symbols
- Removed the dead hardcoded string (all chars were `So`, already covered)
- Renamed the misleading test `test_emoji_sign_off_with_100_chars` → `test_short_response_with_emoji_does_not_trigger` (it tests the 500-char gate, not emoji recognition)
- Added 8 parametrized tests for `Sm`/`So` characters, including arrows and codepoints from the removed list
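
The claim that the hardcoded string was dead can be checked with the standard library alone. The snippet below is an illustrative check, not one of the PR's tests. The constant and helper names are made up for the example, and the characters are copied from the description above.

```python
import unicodedata

# Characters from the removed hardcoded string.
REMOVED_HARDCODED = "✓✔✗✘♠♣♥♦♪♫☀☁☂★☆"

# Arrows and math symbols cited as the motivation for adding Sm.
ARROWS_AND_MATH = "→←∞≈"


def categories(chars: str) -> dict[str, str]:
    """Map each character to its Unicode general category."""
    return {ch: unicodedata.category(ch) for ch in chars}


if __name__ == "__main__":
    for label, chars in [("removed list", REMOVED_HARDCODED), ("arrows/math", ARROWS_AND_MATH)]:
        for ch, cat in categories(chars).items():
            print(f"{label}: U+{ord(ch):04X} {ch} -> {cat}")
```

If every entry in the first group reports `So` and every entry in the second reports `Sm`, the category check alone covers both sets and the hardcoded list adds nothing.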

Test plan

- 65 tests total (57 original + 8 new `Sm`/`So` parametrized tests)
- 2 existing integration tests pass with updated mock content
- All categories covered: `So`, `Sk`, `Sm`, Extended Pictographic range, variation selectors
- 500-char gate verified for both short (pass) and long (block) responses
- Config opt-out verified

alt-glitch · 2026-04-25T01:47:01Z

Likely duplicate of #14574 — both fix #14572 (emoji sign-off false positives in GLM stop-to-length heuristic). This PR is more comprehensive (adds 500-char min gate + config opt-out) but targets the same root cause in _has_natural_response_ending(). Also overlaps with #15363. Maintainer should pick one.

…sResearch#14572)

Three-pronged fix for the Ollama/GLM stop-to-length heuristic that was triggering continuation loops on any response not ending with ASCII/CJK punctuation, including emoji sign-offs (💛, ✨, 🙌), Markdown links, and conversational text lacking terminal punctuation.

1. Expand _has_natural_response_ending() with emoji recognition:
   - Strip trailing Unicode combining marks (variation selectors, ZWJ) before checking the base character.
   - Detect Unicode categories So (Other_Symbol) and Sk (Modifier_Symbol).
   - Check Extended_Pictographic heuristic range U+1F000–U+1FAFF.
   - Hardcode common sign-off codepoints outside those ranges.

2. Add 500-char minimum-length gate in _should_treat_stop_as_truncated():
   - Responses under 500 visible chars are almost certainly complete, since they couldn't have hit a meaningful token limit.
   - Retains the original 20-char/no-whitespace short-junk detection.
   - Eliminates the vast majority of false positives from conversational replies and emoji sign-offs.

3. Config opt-out via agent.glm_truncation_heuristic (default true):
   - Set to false to disable the heuristic entirely.
   - Read with getattr(self, "_glm_truncation_heuristic_enabled", True) for backwards compatibility.

57 new tests in test_glm_stop_heuristic.py. Updated existing integration test mock content to >=500 chars so it still triggers the heuristic.

…search#14572)

- Move import unicodedata to module top-level (was inline in method body)
- Add Sm (Math_Symbol) category to natural-ending recognition for arrows and math symbols that commonly end structured responses
- Remove dead hardcoded string - every character in it is category So, already caught by the category check. The comment claimed they were outside those ranges but they weren't.
- Rename test_emoji_sign_off_with_100_chars to test_short_response_with_emoji_does_not_trigger to accurately describe what it tests (the 500-char gate, not emoji recognition).
- Add 8 parametrized tests for Sm/So characters including arrows, math symbols, and codepoints from the removed hardcoded list.

alt-glitch added the type/bug, P2, comp/agent, provider/ollama, and area/config labels Apr 25, 2026

AJ added 2 commits April 24, 2026 23:56

aj-nt force-pushed the fix/glm-stop-heuristic-14572 branch from b2e9bfc to 31595e2 Compare April 25, 2026 03:57

aj-nt closed this Apr 25, 2026

aj-nt deleted the fix/glm-stop-heuristic-14572 branch April 25, 2026 13:34

aj-nt mentioned this pull request Apr 26, 2026

GLM stop-to-length truncation heuristic (3-pronged fix) aj-nt/hermes-agent#11

Open

alt-glitch mentioned this pull request May 14, 2026

fix(agent): prevent GLM stop-to-length heuristic false positives (#14572) #25991

Closed

aj-nt wants to merge 2 commits into NousResearch:main from aj-nt:fix/glm-stop-heuristic-14572
