Skip to content

Context compression summaries injected as regular assistant messages, polluting visible conversation #38392

@redfireblade

Description

@redfireblade

Description

Context compression summaries are injected as ordinary assistant messages at the end of the visible conversation. When compression triggers, users see a wall of compressed historical summaries directly after the latest assistant reply, making the conversation confusing and hard to follow.

Root cause: In agent/context_compressor.py, the compressed summary is prepended with [CONTEXT COMPACTION — REFERENCE ONLY] prefix and inserted as a regular assistant message into the message list. There is no metadata flag, role distinction, or message-level marker that would allow a consumer (CLI, Desktop, API client) to distinguish a compression summary from an actual assistant response.

The predecessor reference (Claude Code) handles this correctly by keeping compression entirely invisible to the user — it is a backend-only mechanism that never enters the visible conversation stream.

Expected Behavior

Compression summaries should not be emitted as a visible assistant message. They are an internal memory-management mechanism.

Options for the agent to handle this:

Option A: Keep the summary in the context window but do not emit it as a new message — it should be a silent context manipulation, not part of the visible message sequence.

Option B: Add a metadata flag (e.g. role: "compression_summary" or is_compressed_summary: true) so frontends can filter them out. This is the minimal fix.

Option C: Deliver the summary via the system prompt or a hidden context field rather than as an assistant message.

Actual Behavior

After a long conversation, the agent begins emitting messages like:

[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted into the summary below...

These appear as the latest messages in the chat, interspersed with the user's current conversation, creating a broken reading order.

Environment

  • Hermes Agent version: Latest main
  • Config: Default compression settings (threshold: 0.5, target_ratio: 0.2, protect_last_n: 20)
  • Platform: Windows (also reproducible on other platforms)

Additional Note

This is a backend issue, not a frontend issue. The agent should not emit compression artifacts as visible assistant messages, regardless of what frontend renders them.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1High — major feature broken, no workaroundcomp/agentCore agent loop, run_agent.py, prompt builderduplicateThis issue or pull request already existstool/memoryMemory tool and memory providerstype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions