fix(tools): isolate get_tool_definitions quiet_mode cache + dedup LCM injection (#17335) by Sanjays2402 · Pull Request #17337 · NousResearch/hermes-agent

Sanjays2402 · 2026-04-29T07:50:51Z

Problem

Long-lived Gateway processes (Feishu, etc.) were sending duplicate tool names to providers that enforce uniqueness:

DeepSeek: Tool names must be unique.
Xiaomi MiMo: tools contains duplicate names: lcm_expand
Moonshot/Kimi: function name lcm_grep is duplicated

The TUI was unaffected because it runs with quiet_mode=False and skips the cache entirely.

Root Cause (two layered bugs)

1. model_tools.get_tool_definitions(quiet_mode=True) aliased the cached object on the first call.
The cache-hit path (line 278) already returned list(cached) — safe. But the first uncached call stored and returned the same object. run_agent.py then mutates self.tools in-place (appending memory + LCM context-engine schemas), so the very first agent init in a Gateway process poisoned the cache, and every subsequent init appended LCM schemas again on top of the already-polluted list.

2. run_agent.py's context-engine injection had no dedup.
The memory-tools injection (lines 1728–1748) already skips names that are present in self.tools. The LCM injection below it (lines 1986–1993) did not. So even after fixing the cache, plugin paths that register schemas via ctx.register_tool() could still produce duplicates.

Fix (defense in depth, exactly as the issue suggested)

model_tools.py — on the uncached branch, cache the result but return list(result) to the caller, mirroring the cache-hit path:

result = _compute_tool_definitions(...)
if quiet_mode:
    _tool_defs_cache[cache_key] = result
    return list(result)
return result

run_agent.py — build _existing_tool_names from self.tools and skip already-present schemas, mirroring the memory-tools block above:

_existing_tool_names = {t.get("function", {}).get("name") for t in self.tools if isinstance(t, dict)}
for _schema in self.context_compressor.get_tool_schemas():
    _tname = _schema.get("name", "")
    if _tname and _tname in _existing_tool_names:
        continue
    ...
    _existing_tool_names.add(_tname)
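The dedup loop can be exercised on its own. This is a sketch that assumes the elided step wraps each schema as `{"type": "function", "function": schema}` before appending; that wrapping and the helper name `inject_context_engine_tools` are assumptions for illustration, not taken from the PR.

```python
def inject_context_engine_tools(tools, schemas):
    """Append schemas to tools, skipping names that are already present."""
    existing = {
        t.get("function", {}).get("name") for t in tools if isinstance(t, dict)
    }
    for schema in schemas:
        name = schema.get("name", "")
        if name and name in existing:
            continue  # already injected by the cache, a plugin, or an earlier pass
        # Assumed wrapping; the PR elides this step.
        tools.append({"type": "function", "function": schema})
        existing.add(name)
    return tools


tools = [
    {"type": "function", "function": {"name": "read_file"}},
    {"type": "function", "function": {"name": "lcm_grep"}},  # pre-existing copy
]
schemas = [{"name": "lcm_grep"}, {"name": "lcm_expand"}, {"name": "lcm_expand"}]

inject_context_engine_tools(tools, schemas)
inject_context_engine_tools(tools, schemas)  # a second pass adds nothing
names = [t["function"]["name"] for t in tools]
print(names)
```

Tracking names in a set as they are added makes the injection idempotent, so it stays correct even if the same schemas arrive from more than one registration path.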

Tests

New file tests/test_get_tool_definitions_cache_isolation.py:

test_first_uncached_call_returns_fresh_list: pins the fix; without it, the first-call alias reproduces the entire bug.
test_cache_hit_returns_fresh_list: confirms the pre-existing behavior from #17098 (perf(tools): memoize get_tool_definitions + TTL-cache check_fn results) stays.
test_caller_mutation_does_not_poison_cache: simulates run_agent appending lcm_grep / lcm_expand to the returned list and asserts the next call doesn't see them.
test_repeated_caller_mutation_does_not_accumulate: reproduces the long-lived Gateway accumulation across 5 agent inits.
test_non_quiet_mode_does_not_use_cache: sanity check that also explains why the TUI was unaffected.

$ python -m pytest tests/test_get_tool_definitions_cache_isolation.py tests/test_model_tools.py -q
............................                                            [100%]
28 passed in 0.78s

5/5 new tests pass; 23/23 existing tests/test_model_tools.py still pass.

… injection (NousResearch#17335)

Long-lived Gateway processes were sending duplicate tool names to providers that enforce uniqueness:

- DeepSeek: 'Tool names must be unique.'
- Xiaomi MiMo: 'tools contains duplicate names: lcm_expand'
- Moonshot/Kimi: 'function name lcm_grep is duplicated'

TUI was unaffected because TUI runs with quiet_mode=False and skips the cache entirely.

Root cause (two layered bugs)

- model_tools.get_tool_definitions(quiet_mode=True) memoizes its result in _tool_defs_cache. The cache-hit path returned list(cached) (safe), but the FIRST uncached call stored and returned the SAME object. run_agent.py mutates self.tools (memory + LCM context-engine schemas) in-place, so the very first agent init in a Gateway process poisoned the cache, and every subsequent init appended LCM schemas again on top of the already-polluted list.
- run_agent.py's context-engine injection (lcm_grep / lcm_describe / lcm_expand) had no dedup, unlike the memory-tools injection right above it, which already skips names that are present.

Fix (defense in depth, per the issue's suggested fix)

- model_tools.get_tool_definitions: on the uncached branch, cache the computed list but return list(result) to the caller. Same pattern as the cache-hit path.
- run_agent.py: build _existing_tool_names from self.tools and skip schemas whose names are already present, mirroring the memory-tools block. This also defends against plugin paths that may register the same schemas via ctx.register_tool().

Tests (tests/test_get_tool_definitions_cache_isolation.py)

- test_first_uncached_call_returns_fresh_list: pins the fix; without it, first-call alias caused all the symptoms.
- test_cache_hit_returns_fresh_list: pre-existing behavior stays.
- test_caller_mutation_does_not_poison_cache: simulates run_agent appending lcm_grep / lcm_expand to the returned list and asserts the next call doesn't see them.
- test_repeated_caller_mutation_does_not_accumulate: reproduces the long-lived Gateway accumulation pattern across 5 agent inits.
- test_non_quiet_mode_does_not_use_cache: sanity, explains why TUI was fine.

5/5 pass on the new file; 23/23 still pass on tests/test_model_tools.py.

teknium1 · 2026-04-30T11:34:56Z

Salvaged onto current main via #17889 (merge commit e0fa2cf972). Your authorship is preserved on the commit. Thanks!

alt-glitch added the type/bug (Something isn't working), P1 (High: major feature broken, no workaround), comp/tools (Tool registry, model_tools, toolsets), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels on Apr 29, 2026.

alt-glitch mentioned this pull request on Apr 30, 2026, in #17736, "fix: dedup context engine tools + fix kimi aux deadcode model name" (Open, 23 tasks).

teknium1 mentioned this pull request on Apr 30, 2026, in #17889, "fix(tools): isolate get_tool_definitions quiet_mode cache + dedup LCM injection (#17335)" (Merged).

teknium1 closed this Apr 30, 2026

Sanjays2402 wants to merge 1 commit into NousResearch:main from Sanjays2402:fix/17335-quiet-mode-cache-pollution
