[Bug]: post_llm_call response overrides are applied after persistence, causing final_response/history mismatch

### Bug Description

`post_llm_call` hooks can return a replacement response, but the override is currently applied after session persistence.

This means the user-facing `final_response` can differ from the assistant message stored in `result["messages"]`, the SQLite session DB, and the JSON session log. On the next turn or after resuming the session, Hermes replays the original model response instead of the hook-modified response.

This creates inconsistent behavior for plugins that use `post_llm_call` for response post-processing, rendering, policy transforms, or persona/style shaping.

### Steps to Reproduce

1. Register a `post_llm_call` hook that returns a replacement response:

```python
def on_post_llm_call(**kwargs):
    return {"response": "patched response"}

ctx.register_hook("post_llm_call", on_post_llm_call)
```

2. Run a normal conversation turn where the model produces a response, for example:

```text
original response
```

3. Observe the returned/displayed `final_response`.

4. Inspect any of the following:

- `result["messages"]`
- the persisted session transcript
- the SQLite session DB
- the next turn's replayed conversation context
- a resumed session

### Expected Behavior

If `post_llm_call` supports response overrides, the overridden response should be applied consistently to the completed assistant turn.

The following should all agree:

- returned `result["final_response"]`
- last assistant message in `result["messages"]`
- persisted session DB / JSON session log
- next-turn conversation replay
- resumed session transcript

Alternatively, if `post_llm_call` overrides are intended to be display-only, this should be documented explicitly to avoid plugin authors assuming durable response mutation.

### Actual Behavior

The hook override affects the returned user-facing `final_response`, but does not update the already-persisted assistant message.

Current order is effectively:

```python
self._persist_session(messages, conversation_history)

_post_results = invoke_hook("post_llm_call", ...)
for r in _post_results:
    final_response = ...
```

As a result:

- the user may see `"patched response"`
- `result["messages"]` still contains `"original response"`
- persisted history still contains `"original response"`
- the next turn uses `"original response"` as prior assistant context

### Affected Component

CLI (interactive chat), Agent Core (conversation loop, context compression, memory)

### Messaging Platform (if gateway-related)

_No response_

### Debug Report

```shell
Report     https://paste.rs/pK9p0
  agent.log  https://paste.rs/N7B6x
```

### Operating System

ubuntu 24.04

### Python Version

_No response_

### Hermes Version

_No response_

### Additional Logs / Traceback (optional)

```shell

```

### Root Cause Analysis (optional)

_No response_

### Proposed Fix (optional)

_No response_

### Are you willing to submit a PR for this?

- [ ] I'd like to fix this myself and submit a PR

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: post_llm_call response overrides are applied after persistence, causing final_response/history mismatch #14894

Bug Description

Steps to Reproduce

Expected Behavior

Actual Behavior

Affected Component

Messaging Platform (if gateway-related)

Debug Report

Operating System

Python Version

Hermes Version

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

Proposed Fix (optional)

Are you willing to submit a PR for this?

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[Bug]: post_llm_call response overrides are applied after persistence, causing final_response/history mismatch #14894

Description

Bug Description

Steps to Reproduce

Expected Behavior

Actual Behavior

Affected Component

Messaging Platform (if gateway-related)

Debug Report

Operating System

Python Version

Hermes Version

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

Proposed Fix (optional)

Are you willing to submit a PR for this?

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions