Bug Description
Issue #14894 correctly identifies that post_llm_call overrides are applied after _persist_session, causing user-facing final_response to diverge from persisted history. This bug also affects the transform_llm_output hook, which has the same ordering problem: the transformed response is never written back into the assistant message before persistence.
Combined impact:
transform_llm_output transforms the text (e.g. persona/style rendering), user sees the transformed response
post_llm_call may further override the response or perform side effects
- But
_persist_session runs before EITHER hook, so:
messages[-1]["content"] = raw model output
- SQLite session DB = raw model output
- JSON session log = raw model output
- Next-turn conversation replay = raw model output
- Only the returned
final_response reflects the transformed/overridden text
This is the same root cause as #14894 but broader in scope. The fix should cover both hooks in one pass.
Steps to Reproduce
- Register a
transform_llm_output hook:
async def on_transform_llm_output(response_text, **kwargs):
return f"[RENDERED] {response_text}"
- Run a turn. The user sees
[RENDERED] ....
- Inspect
result["messages"][-1]["content"] — still the raw text.
- Resume the session — the next turn's context contains the raw text, not the rendered version.
Expected Behavior
Both transform_llm_output and post_llm_call (including its override_response path) should produce durable changes. After finalization:
result["final_response"] ✓ (already works)
result["messages"][-1]["content"] should match
- Persisted session (JSON log + SQLite) should match
- Memory sync (Hindsight, etc.) should receive the final rendered text
- Next-turn context should replay the rendered text
Actual Behavior
_persist_session runs before hooks → persisted history captures raw model output. The rendered/overridden final_response is only visible to the user in the current turn, then lost on resume.
Proposed Fix
Two changes in agent/turn_finalizer.py:
- Reorder: move
_persist_session AFTER both transform_llm_output and post_llm_call hooks.
- Sync: after hooks produce the final
final_response, write it back into the last assistant message via a helper like _sync_final_response_to_last_assistant(messages, final_response) before calling _persist_session.
This is minimally invasive — no new hook invocations, no API surface changes. A reference implementation exists at:
https://github.com/M1p0/Project-Rin/blob/master/patch/hermes_turn_finalizer_render_persistence.patch
Related
Affected Component
Agent Core (conversation loop, session persistence, plugin hooks)
Are you willing to submit a PR for this?
Bug Description
Issue #14894 correctly identifies that
post_llm_calloverrides are applied after_persist_session, causing user-facingfinal_responseto diverge from persisted history. This bug also affects thetransform_llm_outputhook, which has the same ordering problem: the transformed response is never written back into the assistant message before persistence.Combined impact:
transform_llm_outputtransforms the text (e.g. persona/style rendering), user sees the transformed responsepost_llm_callmay further override the response or perform side effects_persist_sessionruns before EITHER hook, so:messages[-1]["content"]= raw model outputfinal_responsereflects the transformed/overridden textThis is the same root cause as #14894 but broader in scope. The fix should cover both hooks in one pass.
Steps to Reproduce
transform_llm_outputhook:[RENDERED] ....result["messages"][-1]["content"]— still the raw text.Expected Behavior
Both
transform_llm_outputandpost_llm_call(including its override_response path) should produce durable changes. After finalization:result["final_response"]✓ (already works)result["messages"][-1]["content"]should matchActual Behavior
_persist_sessionruns before hooks → persisted history captures raw model output. The rendered/overridden final_response is only visible to the user in the current turn, then lost on resume.Proposed Fix
Two changes in
agent/turn_finalizer.py:_persist_sessionAFTER bothtransform_llm_outputandpost_llm_callhooks.final_response, write it back into the last assistant message via a helper like_sync_final_response_to_last_assistant(messages, final_response)before calling_persist_session.This is minimally invasive — no new hook invocations, no API surface changes. A reference implementation exists at:
https://github.com/M1p0/Project-Rin/blob/master/patch/hermes_turn_finalizer_render_persistence.patch
Related
post_llm_calloverride path)Affected Component
Agent Core (conversation loop, session persistence, plugin hooks)
Are you willing to submit a PR for this?