fix(gateway): persist flushed-session set to DB to prevent double flush on restart#3078

Closed
Mibayy wants to merge 1 commit into
NousResearch:mainfrom
Mibayy:fix/double-memory-flush-on-restart

Mibayy commented Mar 25, 2026

Contributor

Problem

Fixes #3059.

When the gateway restarted, _pre_flushed_sessions (an in-memory set) was reset to empty. The background expiry watcher would then see the same expired session and flush it again, causing:

  • Duplicate LLM API calls
  • Memory tool errors when trying to save already-saved entries
  • Wasted tokens and potential memory corruption

Root cause

self._pre_flushed_sessions: set = set()  # resets on every restart

Fix

hermes_state.py: Schema v7 migration adds a flushed_sessions table, plus three methods that operate on it:

  • add_flushed_session(session_id) — called after each successful proactive flush
  • load_flushed_sessions() — returns the set of already-flushed session IDs
  • remove_flushed_session(session_id) — called when a session is fully reset
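
A minimal sketch of how these three methods could sit on top of SQLite is shown here. It is not the code from this PR: the `FlushedSessionStore` class name, the `flushed_at` column, and the exact table layout are assumptions made for illustration. Only the three method names and the `flushed_sessions` table name come from the description above.

```python
import sqlite3
import time


class FlushedSessionStore:
    """Hypothetical stand-in for the methods added to hermes_state.py."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(db_path)
        # Assumed schema: one row per flushed session, keyed by session ID.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS flushed_sessions ("
            "session_id TEXT PRIMARY KEY, flushed_at REAL NOT NULL)"
        )
        self._conn.commit()

    def add_flushed_session(self, session_id: str) -> None:
        # INSERT OR IGNORE keeps the call idempotent if the marker exists.
        self._conn.execute(
            "INSERT OR IGNORE INTO flushed_sessions (session_id, flushed_at) "
            "VALUES (?, ?)",
            (session_id, time.time()),
        )
        self._conn.commit()

    def load_flushed_sessions(self) -> set:
        rows = self._conn.execute("SELECT session_id FROM flushed_sessions")
        return {row[0] for row in rows}

    def remove_flushed_session(self, session_id: str) -> None:
        self._conn.execute(
            "DELETE FROM flushed_sessions WHERE session_id = ?", (session_id,)
        )
        self._conn.commit()
```

Using the session ID as the primary key means a repeated `add_flushed_session` call for the same session is harmless, which matters if a flush is retried.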

Also fixes an existing bug in the v6 migration block that incorrectly bumped schema_version to 5 instead of 6.
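
One way to avoid this class of off-by-one version bump is to derive the stored version from the migration's own key instead of typing it by hand in each block. The sketch below illustrates the idea under stated assumptions: the `MIGRATIONS` mapping, the `migrate` function, the single-row `schema_version` table layout, and the placeholder v6 statement are all hypothetical and are not taken from hermes_state.py.

```python
import sqlite3

# Hypothetical migration steps keyed by the version they produce.
# The v6 statement is a placeholder; the real v6 DDL is not shown in this PR.
MIGRATIONS = {
    6: "CREATE TABLE IF NOT EXISTS example_v6 (id INTEGER PRIMARY KEY)",
    7: "CREATE TABLE IF NOT EXISTS flushed_sessions (session_id TEXT PRIMARY KEY)",
}


def migrate(conn: sqlite3.Connection, current_version: int) -> int:
    """Apply every pending migration in order and return the new version."""
    for target in sorted(MIGRATIONS):
        if current_version < target:
            conn.execute(MIGRATIONS[target])
            # The stored version always equals the key of the step just run,
            # so a block can never record the wrong number.
            conn.execute("UPDATE schema_version SET version = ?", (target,))
            current_version = target
    conn.commit()
    return current_version
```

The sketch assumes a `schema_version` table that already holds exactly one row.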

gateway/session.py — On SessionStore init, load the persisted flushed-session set from DB so markers survive restarts. On session reset, call remove_flushed_session to clean up the DB record.
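
The load-on-init and remove-on-reset behaviour can be sketched roughly as follows. This is an illustration only: `InMemoryMarkerDB` is a hypothetical stand-in for the state DB, and `needs_flush`, `mark_flushed`, and `reset_session` are invented method names. In the PR itself the persist call after a flush lives in gateway/run.py rather than on the store.

```python
class InMemoryMarkerDB:
    """Hypothetical stand-in for the persistent state DB."""

    def __init__(self) -> None:
        self.rows = set()

    def add_flushed_session(self, session_id: str) -> None:
        self.rows.add(session_id)

    def load_flushed_sessions(self) -> set:
        return set(self.rows)

    def remove_flushed_session(self, session_id: str) -> None:
        self.rows.discard(session_id)


class SessionStore:
    def __init__(self, db) -> None:
        self._db = db
        # Reload markers so a restart does not forget earlier flushes.
        self._pre_flushed_sessions = db.load_flushed_sessions()

    def needs_flush(self, session_id: str) -> bool:
        return session_id not in self._pre_flushed_sessions

    def mark_flushed(self, session_id: str) -> None:
        # Update the in-memory set and the persistent marker together.
        self._pre_flushed_sessions.add(session_id)
        self._db.add_flushed_session(session_id)

    def reset_session(self, session_id: str) -> None:
        # Discard from memory and remove the DB record on a full reset.
        self._pre_flushed_sessions.discard(session_id)
        self._db.remove_flushed_session(session_id)
```

Constructing a second `SessionStore` over the same DB object simulates a gateway restart: a session marked as flushed before the "restart" is still reported as not needing a flush afterwards.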

gateway/run.py — After a successful proactive flush, persist the marker to DB. Also adds memory-near-full detection to the flush prompt: when memory is >=90% full, the agent is warned to consolidate or remove stale entries before adding new ones, preventing invalid tool calls at capacity (the second issue noted in #3059).
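
The near-full check could look something like the sketch below. Only the 90% threshold comes from the description above. The function name `build_flush_prompt`, the idea of measuring memory in characters, and the warning wording are assumptions for illustration, not the text used in gateway/run.py.

```python
NEAR_FULL_THRESHOLD = 0.90  # from the PR description: warn at >=90% full


def build_flush_prompt(base_prompt: str, used_chars: int, limit_chars: int) -> str:
    """Append a consolidation warning when memory usage is at or above 90%."""
    if limit_chars <= 0:
        # No meaningful limit configured; leave the prompt untouched.
        return base_prompt
    usage = used_chars / limit_chars
    if usage >= NEAR_FULL_THRESHOLD:
        warning = (
            f"Memory is {usage:.0%} full. Consolidate or remove stale entries "
            "before adding new ones."
        )
        return f"{base_prompt}\n\n{warning}"
    return base_prompt
```

Placing the warning in the prompt, rather than rejecting writes afterwards, gives the agent a chance to free space before it attempts a tool call that would fail at capacity.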

…sh on restart

When the gateway restarted, _pre_flushed_sessions (an in-memory set) was
reset to empty, causing the expiry watcher to flush the same expired session
a second time — triggering duplicate LLM calls, wasted tokens, and potential
memory corruption.

Fix:
- Add flushed_sessions table to state.db (schema v7 migration)
- Load the persisted set into _pre_flushed_sessions on SessionStore init
- Persist the marker to DB after each successful proactive flush in run.py
- Clean up the DB record when a session is fully reset (discard + remove)

Also fix the existing v6 migration that incorrectly bumped schema_version
to 5 instead of 6.

Additionally, add memory-near-full detection to the flush prompt: when
memory is >=90% full, the agent is warned to consolidate or remove stale
entries before adding new ones, preventing invalid tool calls at capacity.

Fixes NousResearch#3059
thakoreh added a commit to thakoreh/hermes-agent that referenced this pull request Mar 26, 2026
The _pre_flushed_sessions in-memory set was lost on gateway restart,
causing double-flushing of already-persisted session memories. Persist
to a JSON sidecar file and reload on startup.

Fixes NousResearch#3078

teknium1 commented Apr 1, 2026

Contributor

Thanks @Mibayy! Your approach of persisting flushed state to SQLite was the most thorough of the four PRs targeting this bug. The fix landed in #4481 using a simpler approach — a memory_flushed boolean on the SessionEntry itself, persisted in sessions.json. This avoids a new SQLite table while achieving the same goal. Appreciate the contribution!
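
For readers comparing the two approaches, a per-entry flag persisted in a JSON file might look roughly like this. It is a sketch under assumptions, not the code from #4481: apart from the `memory_flushed` field name and the `SessionEntry` type name mentioned above, the fields, the helper functions, and the file layout are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class SessionEntry:
    session_id: str
    # Set once the session's memory has been flushed; survives restarts
    # because it is written out with the rest of the entry.
    memory_flushed: bool = False


def save_sessions(path: Path, entries: dict) -> None:
    """Write all entries to a JSON file (hypothetical layout)."""
    payload = {sid: asdict(entry) for sid, entry in entries.items()}
    path.write_text(json.dumps(payload, indent=2))


def load_sessions(path: Path) -> dict:
    """Read entries back; a missing file means no sessions yet."""
    if not path.exists():
        return {}
    raw = json.loads(path.read_text())
    return {sid: SessionEntry(**fields) for sid, fields in raw.items()}
```

Because the flag travels with the entry, removing or resetting a session clears it without a separate cleanup step.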

teknium1 closed this Apr 1, 2026

Development

Successfully merging this pull request may close these issues.

Double memory flush on gateway restart when session expires
