feat(tool-context): compact large tool results before replay #30980

Open
energypantry wants to merge 1 commit into NousResearch:main from energypantry:fix/tool-context-compaction

Conversation

@energypantry

Fixes #415
Refs #513

What does this PR do?

This PR adds insertion-time tool result compaction so large tool outputs are reduced before they are appended back into the model conversation. The goal is to keep recent tool calls useful without letting raw HTML, long terminal output, or large JSON/file payloads dominate the next prompt before normal context compression has a chance to run.

The approach is deterministic and cache-friendly: it compacts only the tool-result message being appended, leaves TUI/log callbacks on the raw tool result, and avoids LLM summaries or external vector storage.
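
As a rough illustration of what "deterministic" compaction can look like, here is a minimal sketch of a head-and-tail truncation for a generic text result. The function name `compact_text`, the `max_chars` parameter, and the 60/40 split are assumptions made for this example; they are not taken from `agent/tool_context_policy.py`.

```python
def compact_text(raw: str, max_chars: int = 2000) -> str:
    """Shrink `raw` by keeping its head and tail (hypothetical helper).

    No randomness, clock, or model call is involved, so the same input
    always produces the same output, which keeps replayed prompts stable.
    """
    if len(raw) <= max_chars:
        return raw  # small results pass through unchanged

    # Integer arithmetic keeps the split exact: 60% head, 40% tail.
    head_len = (max_chars * 3) // 5
    tail_len = max_chars - head_len
    omitted = len(raw) - head_len - tail_len

    marker = f"\n[... {omitted} characters omitted ...]\n"
    return raw[:head_len] + marker + raw[len(raw) - tail_len:]
```

Keeping both ends is a common choice for terminal-style output, where the command echo sits at the top and the error or exit status sits at the bottom.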

Related Issue

Fixes #415
Refs #513

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • Add agent/tool_context_policy.py with deterministic compaction for terminal, file, web/HTML, JSON, and generic tool results.
  • Route both sequential and concurrent tool execution through the same context-only compaction path before make_tool_result_message().
  • Keep display/log callbacks on the raw tool result so user-visible output is not reduced by this policy.
  • Add tool_context defaults to hermes_cli/config.py and cli-config.yaml.example.
  • Document tool context compaction in website/docs/user-guide/configuration.md.
  • Add focused tests in tests/agent/test_tool_context_policy.py.
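
The split between raw display output and compacted context output can be sketched roughly as follows. This is not the code from `agent/tool_executor.py`: `record_tool_result`, `compact_for_context`, and `build_tool_message` are hypothetical names, and `build_tool_message` only stands in for `make_tool_result_message()`, whose real signature is not shown in this PR description.

```python
from typing import Callable, Dict, List


def compact_for_context(raw: str, max_chars: int = 200) -> str:
    # Placeholder policy (assumption): a hard cap plus a note on what was cut.
    if len(raw) <= max_chars:
        return raw
    return raw[:max_chars] + f"\n[truncated {len(raw) - max_chars} characters]"


def build_tool_message(tool_call_id: str, content: str) -> Dict[str, str]:
    # Hypothetical stand-in for make_tool_result_message().
    return {"role": "tool", "tool_call_id": tool_call_id, "content": content}


def record_tool_result(
    messages: List[Dict[str, str]],
    tool_call_id: str,
    raw: str,
    on_display: Callable[[str], None],
) -> None:
    # The display/log callback always receives the raw, unreduced result.
    on_display(raw)
    # Only the message replayed into model context is compacted.
    messages.append(build_tool_message(tool_call_id, compact_for_context(raw)))


shown: List[str] = []
history: List[Dict[str, str]] = []
record_tool_result(history, "call_1", "x" * 1000, shown.append)
```

Sending both the sequential and the concurrent execution paths through one function like this is what keeps the two from drifting apart.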

How to Test

  1. Run the focused regression suite:
    /Users/zhi/.hermes/hermes-agent/venv/bin/python -m pytest -q \
      tests/agent/test_tool_context_policy.py \
      tests/tools/test_tool_result_storage.py \
      tests/tools/test_budget_config.py \
      tests/agent/test_context_compressor.py
  2. Compile the touched modules:
    /Users/zhi/.hermes/hermes-agent/venv/bin/python -m py_compile \
      agent/tool_context_policy.py \
      agent/tool_executor.py \
      hermes_cli/config.py
  3. Local smoke test on macOS with a Chinese-language Hermes session and local llama.cpp backend: a real 57,701-character HTML/tool result compacted to an 823-character model-context result while preserving title, description, headings, and text preview.
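
For reviewers who want a feel for the HTML case in the smoke test, the following is a minimal sketch of extracting a title, description, headings, and a text preview with the standard library's `html.parser`. It is an assumption about the general technique, not the PR's implementation; `compact_html`, `_PageSummary`, and the output layout are invented for this example.

```python
from html.parser import HTMLParser


class _PageSummary(HTMLParser):
    """Collects title, meta description, headings, and visible text."""

    HEADINGS = {"h1", "h2", "h3"}
    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.title = ""
        self.description = ""
        self.headings: list = []
        self.text_parts: list = []
        self._stack: list = []  # open tags that change how text is handled

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()
        elif tag == "title" or tag in self.HEADINGS or tag in self.SKIP:
            self._stack.append(tag)
            if tag in self.HEADINGS:
                self.headings.append("")  # start a new heading entry

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if not text:
            return
        current = self._stack[-1] if self._stack else None
        if current in self.SKIP:
            return  # drop script and style bodies entirely
        if current == "title":
            self.title = (self.title + " " + text).strip()
        elif current in self.HEADINGS:
            self.headings[-1] = (self.headings[-1] + " " + text).strip()
        else:
            self.text_parts.append(text)


def compact_html(raw_html: str, preview_chars: int = 300, max_headings: int = 10) -> str:
    parser = _PageSummary()
    parser.feed(raw_html)
    parser.close()
    headings = [h for h in parser.headings[:max_headings] if h]
    return "\n".join([
        f"title: {parser.title}",
        f"description: {parser.description}",
        "headings: " + " | ".join(headings),
        "text preview: " + " ".join(parser.text_parts)[:preview_chars],
        f"[compacted from {len(raw_html)} characters]",
    ])
```

Dropping `script` and `style` bodies and capping the preview is where most of the reduction would come from on a typical page; the exact fields and limits in the real policy may differ.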

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS, local llama.cpp backend, Chinese-language Hermes session

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

For New Skills

N/A

Screenshots / Logs

Focused tests:

158 passed in 17.16s

@energypantry energypantry marked this pull request as ready for review May 23, 2026 14:25
@alt-glitch added the type/feature, P3, comp/agent, and comp/tools labels May 23, 2026
@alt-glitch

Collaborator

Competing implementations: #29454 (opt-in compaction, 3-PR series) and #28098 (plugin approach) target the same feature area. Please review existing PRs to consolidate.

@energypantry

Author

Thanks for the pointer. I reviewed #29454 and #28098.

I agree this overlaps with the same tool-result/context-pressure problem space. My intent with this PR was to propose a smaller insertion-time, context-only compaction path that keeps user-visible/log output raw while reducing only the message replayed into model context.

If maintainers prefer the #29454 opt-in raw-storage direction or the #28098 plugin direction, I’m happy to adapt this PR, close it, or extract any useful tests/config/docs into whichever implementation you want to consolidate around.

Development

Successfully merging this pull request may close these issues.

Feature: Insertion-Time Tool Result Trimming — Cache-Friendly Context Management
