Skip to content

fix(agent): persist repaired-turn responses#46071

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-82ce03d4
Jun 14, 2026
Merged

fix(agent): persist repaired-turn responses#46071
teknium1 merged 1 commit into
mainfrom
hermes/hermes-82ce03d4

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

SessionDB flushing now persists newly appended assistant responses even when message-sequence repair has shortened the live messages list below conversation_history length.

Changes

  • run_agent.py: replace positional flush slicing with per-session message identity tracking, so history dicts are skipped by identity and newly appended dicts are written once.
  • tests/run_agent/test_identity_flush.py: cover repair-shrunk history, stale cached-agent cursor, same-turn dedup, and cached-agent turn reset.
  • tests/run_agent/test_compression_persistence.py: flip the stale-history regression from documenting the drop to asserting persistence.

Validation

Check Result
python3 -m py_compile run_agent.py tests/run_agent/test_identity_flush.py tests/run_agent/test_compression_persistence.py pass
scripts/run_tests.sh tests/run_agent/test_identity_flush.py tests/run_agent/test_compression_persistence.py tests/run_agent/test_860_dedup.py 18 passed

Fixes #46053.

Infographic

Repaired Turns Now Persist

@github-actions

Copy link
Copy Markdown
Contributor

🔎 Lint report: hermes/hermes-82ce03d4 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 10895 on HEAD, 10893 on base (🆕 +2)

🆕 New issues (2):

Rule Count
unresolved-attribute 2
First entries
tests/run_agent/test_credits_notices_toggle.py:76: [unresolved-attribute] unresolved-attribute: Unresolved attribute `_credits_session_start_micros` on type `AIAgent`
run_agent.py:2920: [unresolved-attribute] unresolved-attribute: Object of type `Self@get_credits_spent_micros` has no attribute `_credits_session_start_micros`

✅ Fixed issues (1):

Rule Count
invalid-assignment 1
First entries
tests/run_agent/test_credits_notices_toggle.py:76: [invalid-assignment] invalid-assignment: Object of type `None` is not assignable to attribute `_credits_session_start_micros` of type `int`

Unchanged: 5724 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@liuhao1024

Copy link
Copy Markdown
Contributor

Verification: LGTM — solid identity-based flush fix

Reviewed the full diff. The positional-slice approach (messages[flush_from:]) was fragile when repair_message_sequence merged/shrunk the history — len(conversation_history) > len(messages) made the slice empty, silently dropping assistant responses (#46053).

The replacement using id(msg) identity tracking is correct:

Test coverage is thorough — 4 new tests in test_identity_flush.py covering the exact regression scenario (stale cursor, repeated flush, cursor reset on new turn). The existing test_flush_with_stale_history_loses_messages is updated from "verifies the bug exists" to "verifies the fix works" — clean transition.

One note: _last_flushed_db_idx is still set to len(messages) at the end (line ~1633), but it's no longer used for slicing — only for the session-id reset gate. This is harmless but could be cleaned up in a follow-up if desired.

@teknium1

Copy link
Copy Markdown
Contributor Author

Live testing added after CI because this touches the core persistence path:

Scenario Result
Direct real SessionDB + real AIAgent._flush_messages_to_session_db with len(conversation_history)=8, len(messages)=3 Old positional flush_from=8 would be empty; PR persisted the new user + assistant rows and repeat flush kept row count stable at 11
Headless real Hermes CLI from this worktree with isolated HERMES_HOME Provider returned LIVE_PERSISTENCE_OK; state.db contained both user and assistant rows, assistant finish_reason=stop
Real AIAgent.run_conversation() with poisoned DB-style history of 7 consecutive persisted user rows repair_message_sequence merged the history path, model returned LIVE_REPAIR_FLOW_OK, state.db persisted the assistant row; repeat _persist_session kept row count stable at 9

The repair-flow test is the sensitive case from #46053: persisted conversation_history is longer than the repaired working messages list. The assistant row still landed in state.db.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: assistant responses silently dropped after repair_message_sequence (conversation_history vs messages mismatch)

2 participants