Skip to content

[Bug]: transform_llm_output and post_llm_call hooks both produce final_response/history mismatch — same root cause as #14894 #44239

@M1p0

Description

@M1p0

Bug Description

Issue #14894 correctly identifies that post_llm_call overrides are applied after _persist_session, causing user-facing final_response to diverge from persisted history. This bug also affects the transform_llm_output hook, which has the same ordering problem: the transformed response is never written back into the assistant message before persistence.

Combined impact:

  • transform_llm_output transforms the text (e.g. persona/style rendering), user sees the transformed response
  • post_llm_call may further override the response or perform side effects
  • But _persist_session runs before EITHER hook, so:
    • messages[-1]["content"] = raw model output
    • SQLite session DB = raw model output
    • JSON session log = raw model output
    • Next-turn conversation replay = raw model output
    • Only the returned final_response reflects the transformed/overridden text

This is the same root cause as #14894 but broader in scope. The fix should cover both hooks in one pass.

Steps to Reproduce

  1. Register a transform_llm_output hook:
async def on_transform_llm_output(response_text, **kwargs):
    return f"[RENDERED] {response_text}"
  1. Run a turn. The user sees [RENDERED] ....
  2. Inspect result["messages"][-1]["content"] — still the raw text.
  3. Resume the session — the next turn's context contains the raw text, not the rendered version.

Expected Behavior

Both transform_llm_output and post_llm_call (including its override_response path) should produce durable changes. After finalization:

  • result["final_response"] ✓ (already works)
  • result["messages"][-1]["content"] should match
  • Persisted session (JSON log + SQLite) should match
  • Memory sync (Hindsight, etc.) should receive the final rendered text
  • Next-turn context should replay the rendered text

Actual Behavior

_persist_session runs before hooks → persisted history captures raw model output. The rendered/overridden final_response is only visible to the user in the current turn, then lost on resume.

Proposed Fix

Two changes in agent/turn_finalizer.py:

  1. Reorder: move _persist_session AFTER both transform_llm_output and post_llm_call hooks.
  2. Sync: after hooks produce the final final_response, write it back into the last assistant message via a helper like _sync_final_response_to_last_assistant(messages, final_response) before calling _persist_session.

This is minimally invasive — no new hook invocations, no API surface changes. A reference implementation exists at:

https://github.com/M1p0/Project-Rin/blob/master/patch/hermes_turn_finalizer_render_persistence.patch

Related

Affected Component

Agent Core (conversation loop, session persistence, plugin hooks)

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

Metadata

Assignees

No one assigned

    Labels

    P3Low — cosmetic, nice to havecomp/agentCore agent loop, run_agent.py, prompt buildercomp/pluginsPlugin system and bundled pluginstype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions