This is the beginning of the attached compress_context_feature_request.md file that Kimi generated for me. Full patches are included in the second attached file.
compress_context_feature_request.md
compress_context.patch
Summary
Add a compress_context tool that lets the agent proactively trigger context compression mid-conversation, rather than waiting for automatic threshold-based compression. This enables autonomous research modes where the agent can manage its own context budget and reset the turn counter before hitting the hard max_iterations limit.
Motivation
Hermes already has excellent automatic context compression (via context_compressor.py), but it only triggers when the token threshold is reached. For long-running autonomous tasks (e.g., research sweeps, multi-step experiments), the agent needs to be able to:
- Reset the turn counter before hitting
max_iterations — critical for modes with a 200-turn budget where the agent plans its own work in chunks
- Preserve specific topics across compressions using the
focus_topic parameter
- Force compression after a failed automatic attempt without waiting for a cooldown
Currently, if the agent approaches max_iterations and automatic compression hasn't triggered (e.g., because tokens are under threshold but turns are high), the conversation hard-stops with no graceful recovery.
This is the beginning of the attached compress_context_feature_request.md file that Kimi generated for me. Full patches are included in the second attached file.
compress_context_feature_request.md
compress_context.patch
Summary
Add a
compress_contexttool that lets the agent proactively trigger context compression mid-conversation, rather than waiting for automatic threshold-based compression. This enables autonomous research modes where the agent can manage its own context budget and reset the turn counter before hitting the hardmax_iterationslimit.Motivation
Hermes already has excellent automatic context compression (via
context_compressor.py), but it only triggers when the token threshold is reached. For long-running autonomous tasks (e.g., research sweeps, multi-step experiments), the agent needs to be able to:max_iterations— critical for modes with a 200-turn budget where the agent plans its own work in chunksfocus_topicparameterCurrently, if the agent approaches
max_iterationsand automatic compression hasn't triggered (e.g., because tokens are under threshold but turns are high), the conversation hard-stops with no graceful recovery.