fix(gateway): strip internal fields from tool_calls on session reload to preserve KV cache by ygd58 · Pull Request #4563 · NousResearch/hermes-agent

ygd58 · 2026-04-02T09:01:36Z

Problem

KV cache was fully invalidated on every new user message because session reload produced different tokens than the in-memory agentic loop. Three differences were identified.

Fix 1: Strip internal tool_call fields

call_id, response_item_id, finish_reason are Hermes-internal fields not part of OpenAI API spec. Stripped on session reload so tool_calls are byte-identical to agentic loop.

Fix 2: Normalize content whitespace

Assistant content trailing whitespace stripped consistently in both tool message path and simple message path.

Result

Messages sent to API are now consistent between agentic loop iteration and session reload, allowing local backends (llama.cpp, lemonade) to reuse KV cache across turns.

… to preserve KV cache

lsunay · 2026-04-21T10:27:40Z

Thanks for this fix! 👏

I wanted to mention a related but different issue: #13442

Your PR #4563:

Fixes KV cache invalidation on gateway session reload
Location: gateway/run.py
When: Gateway → CLI handoff

Issue #13442:

Fixes KV cache invalidation on every LLM request within run_conversation()
Location: run_agent.py (no global conversation state)
When: Agentic loop (multiple LLM requests per user message)

They're complementary! Both needed for full optimization:

Your fix: Gateway reload cache invalidation ✅
Our issue: Within-CLI agentic loop cache invalidation ❓

Would love your thoughts on #13442 if you have time!

teknium1 · 2026-06-10T07:36:37Z

Automated hermes-sweeper review: this looks implemented on current main through the shared API-bound message sanitization path rather than by applying this exact gateway/run.py diff.

Evidence:

agent/conversation_loop.py:628-646 builds the API copy, removes top-level finish_reason, and calls _sanitize_tool_calls_for_strict_api() before sending.
run_agent.py:4878-4912 strips call_id and response_item_id from tool_calls on that outgoing copy while preserving internal history state.
agent/conversation_loop.py:708-716 normalizes string message content on the API copy specifically for prefix/KV-cache matching on local inference servers.
gateway/run.py:488-523 and tests/gateway/test_replay_entry_fields.py cover the gateway replay parity side for reasoning/reasoning_content/codex fields and finish_reason.

I also read the linked #4555 discussion and the #13442 comment here; #13442 is a related within-loop optimization, but this PR's gateway session-reload request is already covered on main.

fix(gateway): strip internal fields from tool_calls on session reload…

b8f31d6

… to preserve KV cache

ygd58 mentioned this pull request Apr 2, 2026

[Bug]: KV cache invalidation on new user message due to message format differences between agentic loop and session reload #4555

Closed

lsunay mentioned this pull request Apr 21, 2026

[Performance] Missing global conversation history state causes 314x slowdown on llama.cpp with Qwen3.5-27B #13442

Open

alt-glitch added type/perf Performance improvement or optimization P2 Medium — degraded but workaround exists comp/gateway Gateway runner, session dispatch, delivery labels May 1, 2026

teknium1 closed this Jun 10, 2026

teknium1 added the sweeper:implemented-on-main Sweeper: behavior already present on current main label Jun 10, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(gateway): strip internal fields from tool_calls on session reload to preserve KV cache#4563

fix(gateway): strip internal fields from tool_calls on session reload to preserve KV cache#4563
ygd58 wants to merge 1 commit into
NousResearch:mainfrom
ygd58:fix/kv-cache-invalidation-session-reload

ygd58 commented Apr 2, 2026

Uh oh!

lsunay commented Apr 21, 2026

Uh oh!

teknium1 commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

ygd58 commented Apr 2, 2026

Problem

Fix 1: Strip internal tool_call fields

Fix 2: Normalize content whitespace

Result

Uh oh!

lsunay commented Apr 21, 2026

Uh oh!

teknium1 commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants