fix(gateway): strip [CONTEXT COMPACTION] echoes from gateway responses #6272

Open: giwaov wants to merge 1 commit into NousResearch:main from giwaov:fix/6212-strip-compaction-echo

giwaov (Contributor) commented Apr 8, 2026

Summary

When a Telegram session has prior compressed context, the model can parrot the stored `[CONTEXT COMPACTION]` handoff summary back as its response to a simple greeting like `/start` or `Hello?`. This makes the bot dump an opaque internal handoff blob instead of greeting normally.

Root Cause

The context compressor stores a structured handoff summary prefixed with `[CONTEXT COMPACTION] Earlier turns in this conversation were compacted...` in the conversation history. When a user re-enters the conversation, the model sees this in its context and may echo it verbatim as the response.

Changes

gateway/run.py - response post-processing (after `agent_result.get('final_response')`)

  • Detects `[CONTEXT COMPACTION]` in the response text
  • Strips the compaction prefix and boilerplate (up to `avoid repeating work:`)
  • If stripping empties the response, sets it to empty string so downstream processing doesn't send a blank message
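
The post-processing described above could look roughly like the sketch below. This is not the diff from this PR: the function name `strip_compaction_echo` and the constant names are hypothetical, and the terminator string `avoid repeating work:` is assumed from the description in the list above.

```python
COMPACTION_MARKER = "[CONTEXT COMPACTION]"
# Assumption: the handoff boilerplate ends with this phrase.
BOILERPLATE_END = "avoid repeating work:"


def strip_compaction_echo(response: str) -> str:
    """Remove an echoed compaction handoff prefix from a model response.

    Returns the response unchanged when the marker is absent, and an
    empty string when nothing user-facing remains after stripping.
    """
    if not response or COMPACTION_MARKER not in response:
        return response

    start = response.index(COMPACTION_MARKER)
    end = response.find(BOILERPLATE_END, start)
    if end == -1:
        # No terminator found: drop only the marker itself.
        cut = start + len(COMPACTION_MARKER)
    else:
        cut = end + len(BOILERPLATE_END)

    # Keep any text before the marker and everything after the boilerplate.
    return (response[:start] + response[cut:]).strip()
```

The caller is expected to treat an empty return value as "nothing to send" rather than delivering a blank message.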

Observed Behavior

Before:

[CONTEXT COMPACTION] Earlier turns in this conversation were compacted to save context space. The summary below describes work that was already completed...

After:
The compaction boilerplate is stripped, leaving only any user-facing content the model generated after the echo.

Closes #6212

When the model parrots the stored compaction handoff summary back to
the user (e.g. in response to a simple 'Hello?' after a session with
prior compressed context), the raw [CONTEXT COMPACTION] blob leaks
into the visible response.

Strip the compaction prefix and boilerplate from the response before
delivery, so the user sees a clean greeting instead of an opaque
internal handoff dump.

Closes NousResearch#6212
alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/gateway (Gateway runner, session dispatch, delivery) labels Apr 30, 2026
ilyasst added a commit to ilyasst/hermes-agent that referenced this pull request May 13, 2026
Symptom: local Qwen3.5 via llama-server with --jinja occasionally generates
"\nuser\n[<user_name>] ..." past the assistant stop boundary. Hermes formats
user messages as "[<user_name>] ..." (gateway/run.py:5692), so the model
learned that pattern as the user-turn delimiter; when sampling drifts past
the assistant stop, it continues into a plausible user turn. The gateway
delivers that verbatim to Telegram — the bot appears to literally echo
"[ilyass] nudge" back at the user.

Belt + suspenders:

- run_agent.py: inject stop sequences ["\nuser\n[", "\n\nuser\n"] into
  chat_completions api_kwargs after _build_api_kwargs(). Cuts generation
  at the leak boundary, saving tokens too. Disable via
  HERMES_DISABLE_USER_PREFIX_STOP=1.

- gateway/run.py: post-process final_response with a regex that detects
  and strips "\nuser\n[<name>] ..." patterns. Safety net for any leak
  that escapes the stop sequence (e.g., on streaming providers where stop
  detection differs). Logs a warning when it fires so we can monitor.
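
A minimal sketch of the two layers described above, for illustration only. The helper names `add_user_prefix_stops` and `strip_user_turn_leak`, the regex, and the assumption that `api_kwargs` is a plain dict with an optional `stop` entry are all hypothetical; only the stop strings and the HERMES_DISABLE_USER_PREFIX_STOP variable come from this commit message.

```python
import logging
import os
import re

logger = logging.getLogger(__name__)

# Stop sequences named in the commit message.
USER_PREFIX_STOPS = ["\nuser\n[", "\n\nuser\n"]


def add_user_prefix_stops(api_kwargs: dict) -> dict:
    """Return a copy of api_kwargs with the user-prefix stop sequences added."""
    if os.environ.get("HERMES_DISABLE_USER_PREFIX_STOP") == "1":
        return api_kwargs
    existing = api_kwargs.get("stop") or []
    if isinstance(existing, str):
        existing = [existing]
    merged = list(existing) + [s for s in USER_PREFIX_STOPS if s not in existing]
    return {**api_kwargs, "stop": merged}


# Hypothetical pattern: a "user" role line followed by "[<name>] ...",
# matched through to the end of the text.
_USER_LEAK_RE = re.compile(r"\n+user\n\[[^\]\n]+\].*\Z", re.DOTALL)


def strip_user_turn_leak(text: str) -> str:
    """Cut a leaked user-turn continuation off the end of a response."""
    cleaned, count = _USER_LEAK_RE.subn("", text)
    if count:
        logger.warning("Stripped leaked user-turn continuation from response")
        return cleaned.rstrip()
    return text
```

The stop sequences act first and save tokens; the regex only matters when a provider ignores or mishandles `stop`.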

Related upstream (all open, none merged): NousResearch#6272 (strip [CONTEXT COMPACTION]
echoes), NousResearch#19887 (strip leaked template markers from gemma4 tool args),
NousResearch#15369 (strip MEDIA directives), NousResearch#18529 (title leaks <thinking>).

Development

Successfully merging this pull request may close these issues.

[Bug]: Telegram fresh hello can dump prior [CONTEXT COMPACTION] handoff instead of greeting normally
