[Bug]: transform_llm_output and post_llm_call hooks both produce final_response/history mismatch — same root cause as #14894

### Bug Description

Issue #14894 correctly identifies that `post_llm_call` overrides are applied after `_persist_session`, causing user-facing `final_response` to diverge from persisted history. This bug also affects the `transform_llm_output` hook, which has the same ordering problem: the transformed response is never written back into the assistant message before persistence.

Combined impact:

- `transform_llm_output` transforms the text (e.g. persona/style rendering), user sees the transformed response
- `post_llm_call` may further override the response or perform side effects
- But `_persist_session` runs before EITHER hook, so:
  - `messages[-1]["content"]` = raw model output
  - SQLite session DB = raw model output
  - JSON session log = raw model output
  - Next-turn conversation replay = raw model output
  - Only the returned `final_response` reflects the transformed/overridden text

This is the same root cause as #14894 but broader in scope. The fix should cover both hooks in one pass.

### Steps to Reproduce

1. Register a `transform_llm_output` hook:

```python
async def on_transform_llm_output(response_text, **kwargs):
    return f"[RENDERED] {response_text}"
```

2. Run a turn. The user sees `[RENDERED] ...`.
3. Inspect `result["messages"][-1]["content"]` — still the raw text.
4. Resume the session — the next turn's context contains the raw text, not the rendered version.

### Expected Behavior

Both `transform_llm_output` and `post_llm_call` (including its override_response path) should produce durable changes. After finalization:

- `result["final_response"]` ✓ (already works)
- `result["messages"][-1]["content"]` should match
- Persisted session (JSON log + SQLite) should match
- Memory sync (Hindsight, etc.) should receive the final rendered text
- Next-turn context should replay the rendered text

### Actual Behavior

`_persist_session` runs before hooks → persisted history captures raw model output. The rendered/overridden final_response is only visible to the user in the current turn, then lost on resume.

### Proposed Fix

Two changes in `agent/turn_finalizer.py`:

1. **Reorder**: move `_persist_session` AFTER both `transform_llm_output` and `post_llm_call` hooks.
2. **Sync**: after hooks produce the final `final_response`, write it back into the last assistant message via a helper like `_sync_final_response_to_last_assistant(messages, final_response)` before calling `_persist_session`.

This is minimally invasive — no new hook invocations, no API surface changes. A reference implementation exists at:

https://github.com/M1p0/Project-Rin/blob/master/patch/hermes_turn_finalizer_render_persistence.patch

### Related

- #14894 (same root cause for `post_llm_call` override path)

### Affected Component

Agent Core (conversation loop, session persistence, plugin hooks)

### Are you willing to submit a PR for this?

- [ ] I'd like to fix this myself and submit a PR

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: transform_llm_output and post_llm_call hooks both produce final_response/history mismatch — same root cause as #14894 #44239

Bug Description

Steps to Reproduce

Expected Behavior

Actual Behavior

Proposed Fix

Related

Affected Component

Are you willing to submit a PR for this?

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[Bug]: transform_llm_output and post_llm_call hooks both produce final_response/history mismatch — same root cause as #14894 #44239

Description

Bug Description

Steps to Reproduce

Expected Behavior

Actual Behavior

Proposed Fix

Related

Affected Component

Are you willing to submit a PR for this?

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions