Thanks for reporting a bug! Please fill out the sections below so we can reproduce and fix it quickly.
Before submitting, please:
Bug Description
When Hermes uses a model with a small effective context window (e.g., MiniMax-M2.7), after auto context compression is triggered, the model becomes incoherent and gives completely irrelevant answers.
What happened: After context compression, the model ignores the user's current message and responds about something completely unrelated from earlier context.
Steps to Reproduce
- Have a conversation that grows large enough to trigger auto context compression
- After compression, send a new, clear, unambiguous user message (e.g., 'lan control,把我电视声音调低')
- The model responds about something unrelated (e.g., an audio file path from earlier context)
Expected Behavior
Model should acknowledge and act on the current user message.
Actual Behavior
Model responds about something unrelated. Example:
- User: 'lan control,把我电视声音调低' (lower TV volume)
- Model response: Talks about an audio file 'yt_30s_doc.mp3' — completely unrelated
Affected Component
Operating System
Ubuntu 24.04 (Linux 6.17.0)
Python Version
3.11.15
Hermes Version
v0.8.0
Root Cause Analysis (suspected)
After compression, the model doesn't re-read or properly acknowledge the user's last message before responding. It relies on residual context fragments and generates a plausible but wrong answer.
Proposed Fix
After context compression, the model should explicitly confirm the user's last message before generating a response, or the compression logic should preserve a summary of the most recent user intent.
Thanks for reporting a bug! Please fill out the sections below so we can reproduce and fix it quickly.
Before submitting, please:
Bug Description
When Hermes uses a model with a small effective context window (e.g., MiniMax-M2.7), after auto context compression is triggered, the model becomes incoherent and gives completely irrelevant answers.
What happened: After context compression, the model ignores the user's current message and responds about something completely unrelated from earlier context.
Steps to Reproduce
Expected Behavior
Model should acknowledge and act on the current user message.
Actual Behavior
Model responds about something unrelated. Example:
Affected Component
Operating System
Ubuntu 24.04 (Linux 6.17.0)
Python Version
3.11.15
Hermes Version
v0.8.0
Root Cause Analysis (suspected)
After compression, the model doesn't re-read or properly acknowledge the user's last message before responding. It relies on residual context fragments and generates a plausible but wrong answer.
Proposed Fix
After context compression, the model should explicitly confirm the user's last message before generating a response, or the compression logic should preserve a summary of the most recent user intent.