fix(agent): gate memory + context engine tool injection on enabled_toolsets (#5544)#30177

Merged
teknium1 merged 2 commits into main from hermes/hermes-4b73533f
May 22, 2026

Conversation

@teknium1

Closes #5544. Salvages #5788.

Summary

platform_toolsets: telegram: [] now actually means "no tools on Telegram."
Previously, memory-provider tools (fact_store etc.) and context-engine tools
(lcm_grep etc.) leaked into the tool surface regardless of the platform's
enabled_toolsets configuration — bypassing the toolset filter that gates every
other tool category.

Root cause

Two blind-injection sites in agent/agent_init.py appended tool schemas to
agent.tools unconditionally after get_tool_definitions(enabled_toolsets=...)
ran:

  1. agent._memory_manager.get_all_tool_schemas() (~line 1128)
  2. agent.context_compressor.get_tool_schemas() (~line 1448)

Both ignored agent.enabled_toolsets entirely.

Impact (from @thundercat49's benchmarks on Qwen3-30B-A3B Q4_K_M / RTX 3090)

| Scenario | Prompt tokens | Time | tok/s |
|---|---|---|---|
| No tools | 358 | 291 ms | 1,230 |
| 1 tool | 482 | 497 ms | 970 |
| 8 tools (Hermes default) | 3,033 | 22,570 ms | 134 |

A simple "hello" on Telegram with telegram: [] took ~42 s instead of ~1.7 s.
Small local models also entered tool-call loops when memory tools were the only
tools present.

Cloud APIs masked this — prompt processing is near-instant regardless of tool
count, so the bug was invisible to most users.
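
The tok/s column follows directly from the other two columns of the benchmark table. A small sketch that recomputes it; the figures are copied from the table, while the names `BENCHMARKS`, `tokens_per_second`, and `throughput_table` are illustrative and not part of the codebase:

```python
# Rows copied from the benchmark table: (scenario, prompt tokens, time in ms).
BENCHMARKS = [
    ("No tools", 358, 291),
    ("1 tool", 482, 497),
    ("8 tools (Hermes default)", 3033, 22570),
]


def tokens_per_second(prompt_tokens: int, elapsed_ms: int) -> float:
    """Prompt-processing throughput in tokens per second."""
    return prompt_tokens / (elapsed_ms / 1000)


def throughput_table(rows=BENCHMARKS) -> dict:
    """Map each scenario name to its throughput, rounded to a whole tok/s."""
    return {name: round(tokens_per_second(tokens, ms)) for name, tokens, ms in rows}


if __name__ == "__main__":
    for name, rate in throughput_table().items():
        print(f"{name}: {rate} tok/s")
```

The recomputed values match the reported 1,230, 970, and 134 tok/s, so throughput with the default tool set is roughly 9x lower than with no tools.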

Fix

Gate both injection sites on agent.enabled_toolsets (matches the natural
interpretation of the existing toolset filter):

| enabled_toolsets | Behavior |
|---|---|
| None | No filter, inject (backward compat) |
| Contains 'memory' | Inject memory provider tools |
| Contains 'context_engine' | Inject context engine tools |
| Otherwise (incl. []) | Skip injection |
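
The gate table can be read as a single predicate applied once per injection site. A minimal sketch, assuming a hypothetical helper named `should_inject_toolset`; the actual gates in agent/agent_init.py are not reproduced here and may be written inline or differently:

```python
from typing import Optional, Sequence


def should_inject_toolset(
    enabled_toolsets: Optional[Sequence[str]], toolset_name: str
) -> bool:
    """Hypothetical helper mirroring the gate table.

    None means "no filter configured", so injection proceeds (backward compat).
    Any explicit list, including the empty list, must name the toolset.
    """
    if enabled_toolsets is None:
        return True
    return toolset_name in enabled_toolsets


if __name__ == "__main__":
    # The distinction between None and [] is what the fix depends on.
    print(should_inject_toolset(None, "memory"))            # no filter
    print(should_inject_toolset([], "memory"))              # explicit "no tools"
    print(should_inject_toolset(["memory"], "memory"))      # opted in
    print(should_inject_toolset(["terminal"], "context_engine"))
```

Checking `is None` rather than truthiness matters here: a truthiness test would treat `[]` the same as `None` and reintroduce the leak.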

Changes

| File | What |
|---|---|
| agent/agent_init.py | Two gates, ~13 LOC + comments |
| tests/agent/test_memory_provider.py | +89 lines, 14 new tests across two classes |
| scripts/release.py | AUTHOR_MAP entry for @Lempkey |

Validation

Unit tests:

  • tests/agent/test_memory_provider.py: 60 → 74 tests, all pass

E2E with real AIAgent + a real MemoryProvider plugin:

| enabled_toolsets | Before | After |
|---|---|---|
| None | fact_store injected | fact_store injected ✓ |
| [] | fact_store leaked (34 tools) | 0 tools, fact_store blocked ✓ |
| ["terminal"] | fact_store leaked | only terminal+process, fact_store blocked ✓ |
| ["memory"] | fact_store injected | fact_store injected ✓ |

Notes

  • Salvages @Lempkey's original fix from PR #5788 (5,000+ commits stale; it no
    longer cherry-picks cleanly because the file moved from run_agent.py to
    agent/agent_init.py). First commit preserves their authorship via --author=.
  • Second commit (mine) widens the same gate to the sibling context-engine
    injection site — same bug class, same one-line gate.
  • Tests use the simulator pattern matching the rest of test_memory_provider.py.
    E2E validation against the live AIAgent was done out-of-band.

Co-authored by @Lempkey.
Reported by @thundercat49 with full latency benchmarks.

Lempkey and others added 2 commits May 21, 2026 19:22
MemoryManager.get_all_tool_schemas() output was appended to AIAgent.tools
unconditionally — bypassing the enabled_toolsets / platform_toolsets filter.
Setting `platform_toolsets: telegram: []` had no effect: fact_store and other
memory provider tools still leaked into the tool surface on every session.

Impact on local models (per @thundercat49's benchmarks on Qwen3-30B-A3B Q4_K_M /
RTX 3090): tool-formatted prompts process at 134 tok/s vs 1,230 tok/s for plain
text. With 8 memory tool schemas injected, a simple 'hello' on Telegram took
~42s instead of ~1.7s. Small models also entered tool-call loops when memory
tools were the only tools present.

Gate condition (matches the natural meaning of enabled_toolsets):
  None                       → no filter, inject (backward compat)
  contains 'memory'          → user opted in, inject
  otherwise (including [])   → skip injection

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>

The memory-provider gate added in the prior commit closes one of two
blind-injection sites in agent_init.py. The context engine block (around line
1445) follows the identical pattern: agent.context_compressor.get_tool_schemas()
(lcm_grep, lcm_describe, lcm_expand) was appended to agent.tools unconditionally,
ignoring enabled_toolsets.

Same bug class, same local-model latency penalty, same one-line gate — using
'context_engine' as the toolset name (matches the existing plugin-system
convention in plugins.py, plugins_cmd.py, etc.).

Also adds Lempkey to scripts/release.py AUTHOR_MAP for the prior commit's
authorship.
@github-actions

🔎 Lint report: hermes/hermes-4b73533f vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9011 on HEAD, 9011 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4762 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@daimon-nous (bot) added labels on May 22, 2026: type/bug (Something isn't working), comp/agent (Core agent loop, run_agent.py, prompt builder), tool/memory (Memory tool and memory providers), P2 (Medium: degraded but workaround exists)
@teknium1 merged commit e77f1ed into main on May 22, 2026
19 of 20 checks passed
@teknium1 deleted the hermes/hermes-4b73533f branch on May 22, 2026 at 06:18
Development

Successfully merging this pull request may close these issues.

bug + feature request: Memory provider tools auto-injected regardless of platform_toolsets config — 10x latency penalty on local models

2 participants