Skip to content

Context compression summary leaks into user-visible chat output #33256

@Panntyng

Description

@Panntyng

Environment:

  • OS: Windows 10
  • Hermes version: latest (May 2026)
  • Mode: API Server (Gateway)
  • Providers affected: google-gemini-cli, OpenAI Codex (occurs on BOTH)
  • Context compression: enabled (default)

Steps to reproduce:

  1. Start a conversation with Hermes
  2. Continue long enough for context compression to trigger
  3. Observe: the assistant's response includes the compressed conversation summary

Expected:
Context compression should be invisible to the user — it should only be injected into the model's context window, never displayed in chat output.

Actual:
After compression, the user sees:
[new response] ← at top
[CONTEXT COMPACTION - REFERENCE ONLY]
User: old question 1
Assistant: old answer 1
... ← all old history below

Workaround: hermes config set compression.enabled false then /reset

Suspected root cause:
The SUMMARY_PREFIX in agent/context_compressor.py lines 37-51 is injected as a regular message. Either a role alternation violation or the model is outputting the compressed summary text. Since it affects multiple providers, it's not model-specific hallucination.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1High — major feature broken, no workaroundcomp/agentCore agent loop, run_agent.py, prompt buildercomp/gatewayGateway runner, session dispatch, deliverytype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions