Skip to content

feat(memory): notify providers on mid-process session_id rotation (#6672)#17409

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-139c33ab
Apr 29, 2026
Merged

feat(memory): notify providers on mid-process session_id rotation (#6672)#17409
teknium1 merged 1 commit into
mainfrom
hermes/hermes-139c33ab

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Closes #6672

Summary

Memory providers now receive on_session_switch() whenever AIAgent.session_id rotates mid-process — /resume, /branch, /reset, /new, and context compression. Before this, providers that cached per-session state in initialize() (Hindsight's _session_id, _document_id, _session_turns, _turn_counter) kept writing into the wrong session after the agent moved on.

Changes

  • agent/memory_provider.py: new optional hook on_session_switch(new_session_id, *, parent_session_id='', reset=False, **kwargs) with no-op ABC default (backward compat).
  • agent/memory_manager.py: on_session_switch() fan-out; per-provider try/except; empty-id no-op.
  • run_agent.py:
    • sync_all() + queue_prefetch_all() now pass session_id=self.session_id, activating existing defensive session_id updates in Hindsight's sync_turn.
    • Compression block (~L8884) now calls _memory_manager.on_session_switch(reason='compression') next to the existing context-engine notification.
  • cli.py: new_session() (reset=True), _handle_resume_command (reset=False, parent=old_sid), _handle_branch_command (reset=False, parent=parent_session_id) all fire the hook.
  • gateway/run.py: _handle_resume_command now evicts the cached AIAgent, mirroring /branch and /reset. Next message rebuilds with correct session_id end-to-end.
  • plugins/memory/hindsight/__init__.py: reference on_session_switch impl — updates _session_id, mints fresh _document_id (prevents Bug: Session reset leaves Hindsight OpenClaw plugin state uncleared — causes duplicate retains vectorize-io/hindsight#1303 overwrite), clears _session_turns / _turn_counter / _turn_index, preserves existing _parent_session_id when not passed.

Validation

Test surface Before After
ABC default on_session_switch did not exist no-op, accepts all kwarg shapes
MemoryManager fan-out n/a dispatches to all providers, isolates failures
sync_all(...) in run_agent.py no session_id passes session_id=self.session_id
CLI /resume, /branch, /reset, /new silent about provider fires on_session_switch with correct reset/parent
Gateway /resume reused stale cached agent evicts agent (matches /branch, /reset)
Compression session_id rotation only context engine notified memory manager also notified with reason='compression'
Hindsight after /resume stale _session_id, growing _session_turns buffer session_id updated, fresh _document_id, buffers cleared

Targeted tests: 54/54 passing across tests/agent/test_memory_session_switch.py, tests/run_agent/test_memory_sync_interrupted.py, tests/cli/test_branch_command.py, tests/gateway/test_resume_command.py. Broader sweep across tests/agent/, tests/run_agent/, tests/cli/, tests/hermes_cli/test_memory_reset.py, tests/honcho_plugin/, tests/plugins/memory/: 4332/4332 passing (one pre-existing failure in tests/agent/test_copilot_acp_client.py reproduces on current main, unrelated).

E2E verified with real imports + isolated HERMES_HOME: /resume, /branch, /new, compression, empty-id no-op, parent preservation, legacy-provider-without-override all behave correctly.

Reporter credit

Issue + Option A proposal by @nicoloboschi (Hindsight maintainer; already wrote the defensive sync_turn session_id update at hindsight/__init__.py:1199 that this PR now activates). Scope-widening note about compression rotation by @kidonng.

Fixes #6672

Memory providers now receive on_session_switch() whenever AIAgent.session_id
rotates mid-process — /resume, /branch, /reset, /new, and context
compression. Before this, providers that cached per-session state in
initialize() (Hindsight's _session_id, _document_id, accumulated
_session_turns, _turn_counter) kept writing into the old session's
record after the agent had moved on.

MemoryProvider ABC
------------------
- New optional hook on_session_switch(new_session_id, *,
  parent_session_id='', reset=False, **kwargs) with no-op default for
  backward compat. reset=True signals /reset or /new — providers should
  flush accumulated per-session buffers. reset=False for /resume,
  /branch, compression where the logical conversation continues.

MemoryManager
-------------
- on_session_switch() fans the hook out to every registered provider.
  Isolated try/except per provider — one bad provider can't block others.
- Empty/None new_session_id is a no-op to avoid corrupting provider state
  during shutdown paths.

run_agent.py
------------
- _sync_external_memory_for_turn now passes session_id=self.session_id
  into sync_all() and queue_prefetch_all(). Providers with defensive
  session_id updates in sync_turn (Hindsight already had this at
  plugins/memory/hindsight/__init__.py:1199) now actually receive the
  current id.
- Compression block at ~L8884 already notified the context engine of
  the rollover; now also calls
  _memory_manager.on_session_switch(reason='compression').

cli.py
------
- new_session() fires reset=True, reason='new_session' so providers
  flush buffers.
- _handle_resume_command fires reset=False, reason='resume' with the
  previous session as parent_session_id.
- _handle_branch_command fires reset=False, reason='branch' with the
  parent session_id already captured for the DB parent link.

gateway/run.py
--------------
- _handle_resume_command now evicts the cached AIAgent, mirroring
  /branch and /reset. The next message rebuilds a fresh agent whose
  memory provider initialize() runs with the correct session_id —
  matches the pattern the gateway already uses for provider state
  cross-session transitions.

Hindsight reference implementation
----------------------------------
- plugins/memory/hindsight/__init__.py adds on_session_switch that:
  updates _session_id, mints a fresh _document_id (prevents
  vectorize-io/hindsight#1303 overwrite), and clears _session_turns /
  _turn_counter / _turn_index so in-flight batches don't flush under
  the new document id. parent_session_id only overwritten when provided
  (avoids clobbering on a bare switch).

Tests
-----
- tests/agent/test_memory_session_switch.py: new dedicated file. ABC
  default no-op, manager fan-out, failure isolation, empty-id no-op,
  session_id propagation through sync_all/queue_prefetch_all, Hindsight
  state transitions for every reset/non-reset case, parent preservation.
- tests/cli/test_branch_command.py: new test verifying /branch fires
  the hook with correct parent_session_id + reset=False + reason.
- tests/gateway/test_resume_command.py: new test verifying /resume
  evicts the cached agent.
- tests/run_agent/test_memory_sync_interrupted.py: updated existing
  assertions to account for the session_id kwarg on sync_all and
  queue_prefetch_all.

E2E verified (real imports, tmp HERMES_HOME):
- /resume: session_id updates, doc_id fresh, buffers cleared, parent set
- /branch: session_id forks, parent links to original
- /new: reset=True clears accumulated state
- compression: reason='compression' propagated, lineage preserved
- Empty id: no-op, state preserved
- Legacy provider without on_session_switch: no crash

Reported by @nicoloboschi (Hindsight maintainer); related scope-widening
comment by @kidonng extending coverage to compression.
@github-actions

Copy link
Copy Markdown
Contributor

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@alt-glitch alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have comp/agent Core agent loop, run_agent.py, prompt builder comp/gateway Gateway runner, session dispatch, delivery comp/cli CLI entry point, hermes_cli/, setup wizard comp/plugins Plugin system and bundled plugins tool/memory Memory tool and memory providers labels Apr 29, 2026
@teknium1 teknium1 merged commit 13683c0 into main Apr 29, 2026
10 of 12 checks passed
@teknium1 teknium1 deleted the hermes/hermes-139c33ab branch April 29, 2026 11:57
nicoloboschi added a commit to nicoloboschi/hermes-agent that referenced this pull request Apr 29, 2026
…on switch

Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch
introduced by NousResearch#17409.

1. Buffered turns silently lost when retain_every_n_turns > 1.
   on_session_switch unconditionally cleared _session_turns without
   flushing. Users who batched every N>1 turns and switched mid-batch
   (/reset, /new, /resume, /branch, or context compression) had those
   buffered turns disappear. Same data-loss class as the shutdown race,
   different lifecycle event.

   Note commit_memory_session() -> on_session_end() runs *before*
   on_session_switch on /reset, but Hindsight doesn't implement
   on_session_end so the buffer survives that step and dies at clear
   time. /resume, /branch, and compression skip commit_memory_session
   entirely so an on_session_end impl wouldn't help them anyway.

   Fix: snapshot the old _session_id, _document_id, _parent_session_id,
   _turn_index, and _session_turns; spawn one final retain that lands
   under the OLD document_id; then rotate state. Metadata is built
   synchronously against the old self._* so session_id / lineage tags
   on the flushed item all reference the prior session consistently.

2. Stale _prefetch_result leaks across switch.
   If queue_prefetch ran in the old session and the result hadn't been
   consumed by prefetch() yet, on_session_switch left the cached recall
   text in place. The next session's first prefetch() call would return
   text mined from the prior session's bank/query.

   Fix: join any in-flight _prefetch_thread (3s bounded — matches
   shutdown()), then clear _prefetch_result under _prefetch_lock before
   rotating session_id.

Tests
-----
- tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush):
    - buffered turns flushed under OLD document_id with OLD lineage tags
    - empty buffer => no spurious retain
    - _prefetch_result cleared on switch
    - in-flight prefetch thread is awaited before clear (no race)
- tests/agent/test_memory_session_switch.py: factory extended to seed the
  attrs the new flush path reads (_retain_source, _platform, _bank_id,
  prefetch state, etc.) and stub _run_hindsight_operation so existing
  switch-state assertions keep passing without network setup.
teknium1 pushed a commit that referenced this pull request Apr 29, 2026
…on switch

Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch
introduced by #17409.

1. Buffered turns silently lost when retain_every_n_turns > 1.
   on_session_switch unconditionally cleared _session_turns without
   flushing. Users who batched every N>1 turns and switched mid-batch
   (/reset, /new, /resume, /branch, or context compression) had those
   buffered turns disappear. Same data-loss class as the shutdown race,
   different lifecycle event.

   Note commit_memory_session() -> on_session_end() runs *before*
   on_session_switch on /reset, but Hindsight doesn't implement
   on_session_end so the buffer survives that step and dies at clear
   time. /resume, /branch, and compression skip commit_memory_session
   entirely so an on_session_end impl wouldn't help them anyway.

   Fix: snapshot the old _session_id, _document_id, _parent_session_id,
   _turn_index, and _session_turns; spawn one final retain that lands
   under the OLD document_id; then rotate state. Metadata is built
   synchronously against the old self._* so session_id / lineage tags
   on the flushed item all reference the prior session consistently.

2. Stale _prefetch_result leaks across switch.
   If queue_prefetch ran in the old session and the result hadn't been
   consumed by prefetch() yet, on_session_switch left the cached recall
   text in place. The next session's first prefetch() call would return
   text mined from the prior session's bank/query.

   Fix: join any in-flight _prefetch_thread (3s bounded — matches
   shutdown()), then clear _prefetch_result under _prefetch_lock before
   rotating session_id.

Tests
-----
- tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush):
    - buffered turns flushed under OLD document_id with OLD lineage tags
    - empty buffer => no spurious retain
    - _prefetch_result cleared on switch
    - in-flight prefetch thread is awaited before clear (no race)
- tests/agent/test_memory_session_switch.py: factory extended to seed the
  attrs the new flush path reads (_retain_source, _platform, _bank_id,
  prefetch state, etc.) and stub _run_hindsight_operation so existing
  switch-state assertions keep passing without network setup.
donald131 pushed a commit to donald131/hermes-agent that referenced this pull request May 2, 2026
…usResearch#17409)

Fixes NousResearch#6672

Memory providers now receive on_session_switch() whenever AIAgent.session_id
rotates mid-process — /resume, /branch, /reset, /new, and context
compression. Before this, providers that cached per-session state in
initialize() (Hindsight's _session_id, _document_id, accumulated
_session_turns, _turn_counter) kept writing into the old session's
record after the agent had moved on.

MemoryProvider ABC
------------------
- New optional hook on_session_switch(new_session_id, *,
  parent_session_id='', reset=False, **kwargs) with no-op default for
  backward compat. reset=True signals /reset or /new — providers should
  flush accumulated per-session buffers. reset=False for /resume,
  /branch, compression where the logical conversation continues.

MemoryManager
-------------
- on_session_switch() fans the hook out to every registered provider.
  Isolated try/except per provider — one bad provider can't block others.
- Empty/None new_session_id is a no-op to avoid corrupting provider state
  during shutdown paths.

run_agent.py
------------
- _sync_external_memory_for_turn now passes session_id=self.session_id
  into sync_all() and queue_prefetch_all(). Providers with defensive
  session_id updates in sync_turn (Hindsight already had this at
  plugins/memory/hindsight/__init__.py:1199) now actually receive the
  current id.
- Compression block at ~L8884 already notified the context engine of
  the rollover; now also calls
  _memory_manager.on_session_switch(reason='compression').

cli.py
------
- new_session() fires reset=True, reason='new_session' so providers
  flush buffers.
- _handle_resume_command fires reset=False, reason='resume' with the
  previous session as parent_session_id.
- _handle_branch_command fires reset=False, reason='branch' with the
  parent session_id already captured for the DB parent link.

gateway/run.py
--------------
- _handle_resume_command now evicts the cached AIAgent, mirroring
  /branch and /reset. The next message rebuilds a fresh agent whose
  memory provider initialize() runs with the correct session_id —
  matches the pattern the gateway already uses for provider state
  cross-session transitions.

Hindsight reference implementation
----------------------------------
- plugins/memory/hindsight/__init__.py adds on_session_switch that:
  updates _session_id, mints a fresh _document_id (prevents
  vectorize-io/hindsight#1303 overwrite), and clears _session_turns /
  _turn_counter / _turn_index so in-flight batches don't flush under
  the new document id. parent_session_id only overwritten when provided
  (avoids clobbering on a bare switch).

Tests
-----
- tests/agent/test_memory_session_switch.py: new dedicated file. ABC
  default no-op, manager fan-out, failure isolation, empty-id no-op,
  session_id propagation through sync_all/queue_prefetch_all, Hindsight
  state transitions for every reset/non-reset case, parent preservation.
- tests/cli/test_branch_command.py: new test verifying /branch fires
  the hook with correct parent_session_id + reset=False + reason.
- tests/gateway/test_resume_command.py: new test verifying /resume
  evicts the cached agent.
- tests/run_agent/test_memory_sync_interrupted.py: updated existing
  assertions to account for the session_id kwarg on sync_all and
  queue_prefetch_all.

E2E verified (real imports, tmp HERMES_HOME):
- /resume: session_id updates, doc_id fresh, buffers cleared, parent set
- /branch: session_id forks, parent links to original
- /new: reset=True clears accumulated state
- compression: reason='compression' propagated, lineage preserved
- Empty id: no-op, state preserved
- Legacy provider without on_session_switch: no crash

Reported by @nicoloboschi (Hindsight maintainer); related scope-widening
comment by @kidonng extending coverage to compression.
donald131 pushed a commit to donald131/hermes-agent that referenced this pull request May 2, 2026
…on switch

Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch
introduced by NousResearch#17409.

1. Buffered turns silently lost when retain_every_n_turns > 1.
   on_session_switch unconditionally cleared _session_turns without
   flushing. Users who batched every N>1 turns and switched mid-batch
   (/reset, /new, /resume, /branch, or context compression) had those
   buffered turns disappear. Same data-loss class as the shutdown race,
   different lifecycle event.

   Note commit_memory_session() -> on_session_end() runs *before*
   on_session_switch on /reset, but Hindsight doesn't implement
   on_session_end so the buffer survives that step and dies at clear
   time. /resume, /branch, and compression skip commit_memory_session
   entirely so an on_session_end impl wouldn't help them anyway.

   Fix: snapshot the old _session_id, _document_id, _parent_session_id,
   _turn_index, and _session_turns; spawn one final retain that lands
   under the OLD document_id; then rotate state. Metadata is built
   synchronously against the old self._* so session_id / lineage tags
   on the flushed item all reference the prior session consistently.

2. Stale _prefetch_result leaks across switch.
   If queue_prefetch ran in the old session and the result hadn't been
   consumed by prefetch() yet, on_session_switch left the cached recall
   text in place. The next session's first prefetch() call would return
   text mined from the prior session's bank/query.

   Fix: join any in-flight _prefetch_thread (3s bounded — matches
   shutdown()), then clear _prefetch_result under _prefetch_lock before
   rotating session_id.

Tests
-----
- tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush):
    - buffered turns flushed under OLD document_id with OLD lineage tags
    - empty buffer => no spurious retain
    - _prefetch_result cleared on switch
    - in-flight prefetch thread is awaited before clear (no race)
- tests/agent/test_memory_session_switch.py: factory extended to seed the
  attrs the new flush path reads (_retain_source, _platform, _bank_id,
  prefetch state, etc.) and stub _run_hindsight_operation so existing
  switch-state assertions keep passing without network setup.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…usResearch#17409)

Fixes NousResearch#6672

Memory providers now receive on_session_switch() whenever AIAgent.session_id
rotates mid-process — /resume, /branch, /reset, /new, and context
compression. Before this, providers that cached per-session state in
initialize() (Hindsight's _session_id, _document_id, accumulated
_session_turns, _turn_counter) kept writing into the old session's
record after the agent had moved on.

MemoryProvider ABC
------------------
- New optional hook on_session_switch(new_session_id, *,
  parent_session_id='', reset=False, **kwargs) with no-op default for
  backward compat. reset=True signals /reset or /new — providers should
  flush accumulated per-session buffers. reset=False for /resume,
  /branch, compression where the logical conversation continues.

MemoryManager
-------------
- on_session_switch() fans the hook out to every registered provider.
  Isolated try/except per provider — one bad provider can't block others.
- Empty/None new_session_id is a no-op to avoid corrupting provider state
  during shutdown paths.

run_agent.py
------------
- _sync_external_memory_for_turn now passes session_id=self.session_id
  into sync_all() and queue_prefetch_all(). Providers with defensive
  session_id updates in sync_turn (Hindsight already had this at
  plugins/memory/hindsight/__init__.py:1199) now actually receive the
  current id.
- Compression block at ~L8884 already notified the context engine of
  the rollover; now also calls
  _memory_manager.on_session_switch(reason='compression').

cli.py
------
- new_session() fires reset=True, reason='new_session' so providers
  flush buffers.
- _handle_resume_command fires reset=False, reason='resume' with the
  previous session as parent_session_id.
- _handle_branch_command fires reset=False, reason='branch' with the
  parent session_id already captured for the DB parent link.

gateway/run.py
--------------
- _handle_resume_command now evicts the cached AIAgent, mirroring
  /branch and /reset. The next message rebuilds a fresh agent whose
  memory provider initialize() runs with the correct session_id —
  matches the pattern the gateway already uses for provider state
  cross-session transitions.

Hindsight reference implementation
----------------------------------
- plugins/memory/hindsight/__init__.py adds on_session_switch that:
  updates _session_id, mints a fresh _document_id (prevents
  vectorize-io/hindsight#1303 overwrite), and clears _session_turns /
  _turn_counter / _turn_index so in-flight batches don't flush under
  the new document id. parent_session_id only overwritten when provided
  (avoids clobbering on a bare switch).

Tests
-----
- tests/agent/test_memory_session_switch.py: new dedicated file. ABC
  default no-op, manager fan-out, failure isolation, empty-id no-op,
  session_id propagation through sync_all/queue_prefetch_all, Hindsight
  state transitions for every reset/non-reset case, parent preservation.
- tests/cli/test_branch_command.py: new test verifying /branch fires
  the hook with correct parent_session_id + reset=False + reason.
- tests/gateway/test_resume_command.py: new test verifying /resume
  evicts the cached agent.
- tests/run_agent/test_memory_sync_interrupted.py: updated existing
  assertions to account for the session_id kwarg on sync_all and
  queue_prefetch_all.

E2E verified (real imports, tmp HERMES_HOME):
- /resume: session_id updates, doc_id fresh, buffers cleared, parent set
- /branch: session_id forks, parent links to original
- /new: reset=True clears accumulated state
- compression: reason='compression' propagated, lineage preserved
- Empty id: no-op, state preserved
- Legacy provider without on_session_switch: no crash

Reported by @nicoloboschi (Hindsight maintainer); related scope-widening
comment by @kidonng extending coverage to compression.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…on switch

Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch
introduced by NousResearch#17409.

1. Buffered turns silently lost when retain_every_n_turns > 1.
   on_session_switch unconditionally cleared _session_turns without
   flushing. Users who batched every N>1 turns and switched mid-batch
   (/reset, /new, /resume, /branch, or context compression) had those
   buffered turns disappear. Same data-loss class as the shutdown race,
   different lifecycle event.

   Note commit_memory_session() -> on_session_end() runs *before*
   on_session_switch on /reset, but Hindsight doesn't implement
   on_session_end so the buffer survives that step and dies at clear
   time. /resume, /branch, and compression skip commit_memory_session
   entirely so an on_session_end impl wouldn't help them anyway.

   Fix: snapshot the old _session_id, _document_id, _parent_session_id,
   _turn_index, and _session_turns; spawn one final retain that lands
   under the OLD document_id; then rotate state. Metadata is built
   synchronously against the old self._* so session_id / lineage tags
   on the flushed item all reference the prior session consistently.

2. Stale _prefetch_result leaks across switch.
   If queue_prefetch ran in the old session and the result hadn't been
   consumed by prefetch() yet, on_session_switch left the cached recall
   text in place. The next session's first prefetch() call would return
   text mined from the prior session's bank/query.

   Fix: join any in-flight _prefetch_thread (3s bounded — matches
   shutdown()), then clear _prefetch_result under _prefetch_lock before
   rotating session_id.

Tests
-----
- tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush):
    - buffered turns flushed under OLD document_id with OLD lineage tags
    - empty buffer => no spurious retain
    - _prefetch_result cleared on switch
    - in-flight prefetch thread is awaited before clear (no race)
- tests/agent/test_memory_session_switch.py: factory extended to seed the
  attrs the new flush path reads (_retain_source, _platform, _bank_id,
  prefetch state, etc.) and stub _run_hindsight_operation so existing
  switch-state assertions keep passing without network setup.
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
…usResearch#17409)

Fixes NousResearch#6672

Memory providers now receive on_session_switch() whenever AIAgent.session_id
rotates mid-process — /resume, /branch, /reset, /new, and context
compression. Before this, providers that cached per-session state in
initialize() (Hindsight's _session_id, _document_id, accumulated
_session_turns, _turn_counter) kept writing into the old session's
record after the agent had moved on.

MemoryProvider ABC
------------------
- New optional hook on_session_switch(new_session_id, *,
  parent_session_id='', reset=False, **kwargs) with no-op default for
  backward compat. reset=True signals /reset or /new — providers should
  flush accumulated per-session buffers. reset=False for /resume,
  /branch, compression where the logical conversation continues.

MemoryManager
-------------
- on_session_switch() fans the hook out to every registered provider.
  Isolated try/except per provider — one bad provider can't block others.
- Empty/None new_session_id is a no-op to avoid corrupting provider state
  during shutdown paths.

run_agent.py
------------
- _sync_external_memory_for_turn now passes session_id=self.session_id
  into sync_all() and queue_prefetch_all(). Providers with defensive
  session_id updates in sync_turn (Hindsight already had this at
  plugins/memory/hindsight/__init__.py:1199) now actually receive the
  current id.
- Compression block at ~L8884 already notified the context engine of
  the rollover; now also calls
  _memory_manager.on_session_switch(reason='compression').

cli.py
------
- new_session() fires reset=True, reason='new_session' so providers
  flush buffers.
- _handle_resume_command fires reset=False, reason='resume' with the
  previous session as parent_session_id.
- _handle_branch_command fires reset=False, reason='branch' with the
  parent session_id already captured for the DB parent link.

gateway/run.py
--------------
- _handle_resume_command now evicts the cached AIAgent, mirroring
  /branch and /reset. The next message rebuilds a fresh agent whose
  memory provider initialize() runs with the correct session_id —
  matches the pattern the gateway already uses for provider state
  cross-session transitions.

Hindsight reference implementation
----------------------------------
- plugins/memory/hindsight/__init__.py adds on_session_switch that:
  updates _session_id, mints a fresh _document_id (prevents
  vectorize-io/hindsight#1303 overwrite), and clears _session_turns /
  _turn_counter / _turn_index so in-flight batches don't flush under
  the new document id. parent_session_id only overwritten when provided
  (avoids clobbering on a bare switch).

Tests
-----
- tests/agent/test_memory_session_switch.py: new dedicated file. ABC
  default no-op, manager fan-out, failure isolation, empty-id no-op,
  session_id propagation through sync_all/queue_prefetch_all, Hindsight
  state transitions for every reset/non-reset case, parent preservation.
- tests/cli/test_branch_command.py: new test verifying /branch fires
  the hook with correct parent_session_id + reset=False + reason.
- tests/gateway/test_resume_command.py: new test verifying /resume
  evicts the cached agent.
- tests/run_agent/test_memory_sync_interrupted.py: updated existing
  assertions to account for the session_id kwarg on sync_all and
  queue_prefetch_all.

E2E verified (real imports, tmp HERMES_HOME):
- /resume: session_id updates, doc_id fresh, buffers cleared, parent set
- /branch: session_id forks, parent links to original
- /new: reset=True clears accumulated state
- compression: reason='compression' propagated, lineage preserved
- Empty id: no-op, state preserved
- Legacy provider without on_session_switch: no crash

Reported by @nicoloboschi (Hindsight maintainer); related scope-widening
comment by @kidonng extending coverage to compression.
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
…on switch

Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch
introduced by NousResearch#17409.

1. Buffered turns silently lost when retain_every_n_turns > 1.
   on_session_switch unconditionally cleared _session_turns without
   flushing. Users who batched every N>1 turns and switched mid-batch
   (/reset, /new, /resume, /branch, or context compression) had those
   buffered turns disappear. Same data-loss class as the shutdown race,
   different lifecycle event.

   Note commit_memory_session() -> on_session_end() runs *before*
   on_session_switch on /reset, but Hindsight doesn't implement
   on_session_end so the buffer survives that step and dies at clear
   time. /resume, /branch, and compression skip commit_memory_session
   entirely so an on_session_end impl wouldn't help them anyway.

   Fix: snapshot the old _session_id, _document_id, _parent_session_id,
   _turn_index, and _session_turns; spawn one final retain that lands
   under the OLD document_id; then rotate state. Metadata is built
   synchronously against the old self._* so session_id / lineage tags
   on the flushed item all reference the prior session consistently.

2. Stale _prefetch_result leaks across switch.
   If queue_prefetch ran in the old session and the result hadn't been
   consumed by prefetch() yet, on_session_switch left the cached recall
   text in place. The next session's first prefetch() call would return
   text mined from the prior session's bank/query.

   Fix: join any in-flight _prefetch_thread (3s bounded — matches
   shutdown()), then clear _prefetch_result under _prefetch_lock before
   rotating session_id.

Tests
-----
- tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush):
    - buffered turns flushed under OLD document_id with OLD lineage tags
    - empty buffer => no spurious retain
    - _prefetch_result cleared on switch
    - in-flight prefetch thread is awaited before clear (no race)
- tests/agent/test_memory_session_switch.py: factory extended to seed the
  attrs the new flush path reads (_retain_source, _platform, _bank_id,
  prefetch state, etc.) and stub _run_hindsight_operation so existing
  switch-state assertions keep passing without network setup.
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
…usResearch#17409)

Fixes NousResearch#6672

Memory providers now receive on_session_switch() whenever AIAgent.session_id
rotates mid-process — /resume, /branch, /reset, /new, and context
compression. Before this, providers that cached per-session state in
initialize() (Hindsight's _session_id, _document_id, accumulated
_session_turns, _turn_counter) kept writing into the old session's
record after the agent had moved on.

MemoryProvider ABC
------------------
- New optional hook on_session_switch(new_session_id, *,
  parent_session_id='', reset=False, **kwargs) with no-op default for
  backward compat. reset=True signals /reset or /new — providers should
  flush accumulated per-session buffers. reset=False for /resume,
  /branch, compression where the logical conversation continues.

MemoryManager
-------------
- on_session_switch() fans the hook out to every registered provider.
  Isolated try/except per provider — one bad provider can't block others.
- Empty/None new_session_id is a no-op to avoid corrupting provider state
  during shutdown paths.

run_agent.py
------------
- _sync_external_memory_for_turn now passes session_id=self.session_id
  into sync_all() and queue_prefetch_all(). Providers with defensive
  session_id updates in sync_turn (Hindsight already had this at
  plugins/memory/hindsight/__init__.py:1199) now actually receive the
  current id.
- Compression block at ~L8884 already notified the context engine of
  the rollover; now also calls
  _memory_manager.on_session_switch(reason='compression').

cli.py
------
- new_session() fires reset=True, reason='new_session' so providers
  flush buffers.
- _handle_resume_command fires reset=False, reason='resume' with the
  previous session as parent_session_id.
- _handle_branch_command fires reset=False, reason='branch' with the
  parent session_id already captured for the DB parent link.

gateway/run.py
--------------
- _handle_resume_command now evicts the cached AIAgent, mirroring
  /branch and /reset. The next message rebuilds a fresh agent whose
  memory provider initialize() runs with the correct session_id —
  matches the pattern the gateway already uses for provider state
  cross-session transitions.

Hindsight reference implementation
----------------------------------
- plugins/memory/hindsight/__init__.py adds on_session_switch that:
  updates _session_id, mints a fresh _document_id (prevents
  vectorize-io/hindsight#1303 overwrite), and clears _session_turns /
  _turn_counter / _turn_index so in-flight batches don't flush under
  the new document id. parent_session_id only overwritten when provided
  (avoids clobbering on a bare switch).

Tests
-----
- tests/agent/test_memory_session_switch.py: new dedicated file. ABC
  default no-op, manager fan-out, failure isolation, empty-id no-op,
  session_id propagation through sync_all/queue_prefetch_all, Hindsight
  state transitions for every reset/non-reset case, parent preservation.
- tests/cli/test_branch_command.py: new test verifying /branch fires
  the hook with correct parent_session_id + reset=False + reason.
- tests/gateway/test_resume_command.py: new test verifying /resume
  evicts the cached agent.
- tests/run_agent/test_memory_sync_interrupted.py: updated existing
  assertions to account for the session_id kwarg on sync_all and
  queue_prefetch_all.

E2E verified (real imports, tmp HERMES_HOME):
- /resume: session_id updates, doc_id fresh, buffers cleared, parent set
- /branch: session_id forks, parent links to original
- /new: reset=True clears accumulated state
- compression: reason='compression' propagated, lineage preserved
- Empty id: no-op, state preserved
- Legacy provider without on_session_switch: no crash

Reported by @nicoloboschi (Hindsight maintainer); related scope-widening
comment by @kidonng extending coverage to compression.
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
…on switch

Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch
introduced by NousResearch#17409.

1. Buffered turns silently lost when retain_every_n_turns > 1.
   on_session_switch unconditionally cleared _session_turns without
   flushing. Users who batched every N>1 turns and switched mid-batch
   (/reset, /new, /resume, /branch, or context compression) had those
   buffered turns disappear. Same data-loss class as the shutdown race,
   different lifecycle event.

   Note commit_memory_session() -> on_session_end() runs *before*
   on_session_switch on /reset, but Hindsight doesn't implement
   on_session_end so the buffer survives that step and dies at clear
   time. /resume, /branch, and compression skip commit_memory_session
   entirely so an on_session_end impl wouldn't help them anyway.

   Fix: snapshot the old _session_id, _document_id, _parent_session_id,
   _turn_index, and _session_turns; spawn one final retain that lands
   under the OLD document_id; then rotate state. Metadata is built
   synchronously against the old self._* so session_id / lineage tags
   on the flushed item all reference the prior session consistently.

2. Stale _prefetch_result leaks across switch.
   If queue_prefetch ran in the old session and the result hadn't been
   consumed by prefetch() yet, on_session_switch left the cached recall
   text in place. The next session's first prefetch() call would return
   text mined from the prior session's bank/query.

   Fix: join any in-flight _prefetch_thread (3s bounded — matches
   shutdown()), then clear _prefetch_result under _prefetch_lock before
   rotating session_id.

Tests
-----
- tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush):
    - buffered turns flushed under OLD document_id with OLD lineage tags
    - empty buffer => no spurious retain
    - _prefetch_result cleared on switch
    - in-flight prefetch thread is awaited before clear (no race)
- tests/agent/test_memory_session_switch.py: factory extended to seed the
  attrs the new flush path reads (_retain_source, _platform, _bank_id,
  prefetch state, etc.) and stub _run_hindsight_operation so existing
  switch-state assertions keep passing without network setup.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…usResearch#17409)

Fixes NousResearch#6672

Memory providers now receive on_session_switch() whenever AIAgent.session_id
rotates mid-process — /resume, /branch, /reset, /new, and context
compression. Before this, providers that cached per-session state in
initialize() (Hindsight's _session_id, _document_id, accumulated
_session_turns, _turn_counter) kept writing into the old session's
record after the agent had moved on.

MemoryProvider ABC
------------------
- New optional hook on_session_switch(new_session_id, *,
  parent_session_id='', reset=False, **kwargs) with no-op default for
  backward compat. reset=True signals /reset or /new — providers should
  flush accumulated per-session buffers. reset=False for /resume,
  /branch, compression where the logical conversation continues.

MemoryManager
-------------
- on_session_switch() fans the hook out to every registered provider.
  Isolated try/except per provider — one bad provider can't block others.
- Empty/None new_session_id is a no-op to avoid corrupting provider state
  during shutdown paths.

run_agent.py
------------
- _sync_external_memory_for_turn now passes session_id=self.session_id
  into sync_all() and queue_prefetch_all(). Providers with defensive
  session_id updates in sync_turn (Hindsight already had this at
  plugins/memory/hindsight/__init__.py:1199) now actually receive the
  current id.
- Compression block at ~L8884 already notified the context engine of
  the rollover; now also calls
  _memory_manager.on_session_switch(reason='compression').

cli.py
------
- new_session() fires reset=True, reason='new_session' so providers
  flush buffers.
- _handle_resume_command fires reset=False, reason='resume' with the
  previous session as parent_session_id.
- _handle_branch_command fires reset=False, reason='branch' with the
  parent session_id already captured for the DB parent link.

gateway/run.py
--------------
- _handle_resume_command now evicts the cached AIAgent, mirroring
  /branch and /reset. The next message rebuilds a fresh agent whose
  memory provider initialize() runs with the correct session_id —
  matches the pattern the gateway already uses for provider state
  cross-session transitions.

Hindsight reference implementation
----------------------------------
- plugins/memory/hindsight/__init__.py adds on_session_switch that:
  updates _session_id, mints a fresh _document_id (prevents
  vectorize-io/hindsight#1303 overwrite), and clears _session_turns /
  _turn_counter / _turn_index so in-flight batches don't flush under
  the new document id. parent_session_id only overwritten when provided
  (avoids clobbering on a bare switch).

Tests
-----
- tests/agent/test_memory_session_switch.py: new dedicated file. ABC
  default no-op, manager fan-out, failure isolation, empty-id no-op,
  session_id propagation through sync_all/queue_prefetch_all, Hindsight
  state transitions for every reset/non-reset case, parent preservation.
- tests/cli/test_branch_command.py: new test verifying /branch fires
  the hook with correct parent_session_id + reset=False + reason.
- tests/gateway/test_resume_command.py: new test verifying /resume
  evicts the cached agent.
- tests/run_agent/test_memory_sync_interrupted.py: updated existing
  assertions to account for the session_id kwarg on sync_all and
  queue_prefetch_all.

E2E verified (real imports, tmp HERMES_HOME):
- /resume: session_id updates, doc_id fresh, buffers cleared, parent set
- /branch: session_id forks, parent links to original
- /new: reset=True clears accumulated state
- compression: reason='compression' propagated, lineage preserved
- Empty id: no-op, state preserved
- Legacy provider without on_session_switch: no crash

Reported by @nicoloboschi (Hindsight maintainer); related scope-widening
comment by @kidonng extending coverage to compression.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…on switch

Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch
introduced by NousResearch#17409.

1. Buffered turns silently lost when retain_every_n_turns > 1.
   on_session_switch unconditionally cleared _session_turns without
   flushing. Users who batched every N>1 turns and switched mid-batch
   (/reset, /new, /resume, /branch, or context compression) had those
   buffered turns disappear. Same data-loss class as the shutdown race,
   different lifecycle event.

   Note commit_memory_session() -> on_session_end() runs *before*
   on_session_switch on /reset, but Hindsight doesn't implement
   on_session_end so the buffer survives that step and dies at clear
   time. /resume, /branch, and compression skip commit_memory_session
   entirely so an on_session_end impl wouldn't help them anyway.

   Fix: snapshot the old _session_id, _document_id, _parent_session_id,
   _turn_index, and _session_turns; spawn one final retain that lands
   under the OLD document_id; then rotate state. Metadata is built
   synchronously against the old self._* so session_id / lineage tags
   on the flushed item all reference the prior session consistently.

2. Stale _prefetch_result leaks across switch.
   If queue_prefetch ran in the old session and the result hadn't been
   consumed by prefetch() yet, on_session_switch left the cached recall
   text in place. The next session's first prefetch() call would return
   text mined from the prior session's bank/query.

   Fix: join any in-flight _prefetch_thread (3s bounded — matches
   shutdown()), then clear _prefetch_result under _prefetch_lock before
   rotating session_id.

Tests
-----
- tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush):
    - buffered turns flushed under OLD document_id with OLD lineage tags
    - empty buffer => no spurious retain
    - _prefetch_result cleared on switch
    - in-flight prefetch thread is awaited before clear (no race)
- tests/agent/test_memory_session_switch.py: factory extended to seed the
  attrs the new flush path reads (_retain_source, _platform, _bank_id,
  prefetch state, etc.) and stub _run_hindsight_operation so existing
  switch-state assertions keep passing without network setup.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…usResearch#17409)

Fixes NousResearch#6672

Memory providers now receive on_session_switch() whenever AIAgent.session_id
rotates mid-process — /resume, /branch, /reset, /new, and context
compression. Before this, providers that cached per-session state in
initialize() (Hindsight's _session_id, _document_id, accumulated
_session_turns, _turn_counter) kept writing into the old session's
record after the agent had moved on.

MemoryProvider ABC
------------------
- New optional hook on_session_switch(new_session_id, *,
  parent_session_id='', reset=False, **kwargs) with no-op default for
  backward compat. reset=True signals /reset or /new — providers should
  flush accumulated per-session buffers. reset=False for /resume,
  /branch, compression where the logical conversation continues.

MemoryManager
-------------
- on_session_switch() fans the hook out to every registered provider.
  Isolated try/except per provider — one bad provider can't block others.
- Empty/None new_session_id is a no-op to avoid corrupting provider state
  during shutdown paths.

run_agent.py
------------
- _sync_external_memory_for_turn now passes session_id=self.session_id
  into sync_all() and queue_prefetch_all(). Providers with defensive
  session_id updates in sync_turn (Hindsight already had this at
  plugins/memory/hindsight/__init__.py:1199) now actually receive the
  current id.
- Compression block at ~L8884 already notified the context engine of
  the rollover; now also calls
  _memory_manager.on_session_switch(reason='compression').

cli.py
------
- new_session() fires reset=True, reason='new_session' so providers
  flush buffers.
- _handle_resume_command fires reset=False, reason='resume' with the
  previous session as parent_session_id.
- _handle_branch_command fires reset=False, reason='branch' with the
  parent session_id already captured for the DB parent link.

gateway/run.py
--------------
- _handle_resume_command now evicts the cached AIAgent, mirroring
  /branch and /reset. The next message rebuilds a fresh agent whose
  memory provider initialize() runs with the correct session_id —
  matches the pattern the gateway already uses for provider state
  cross-session transitions.

Hindsight reference implementation
----------------------------------
- plugins/memory/hindsight/__init__.py adds on_session_switch that:
  updates _session_id, mints a fresh _document_id (prevents
  vectorize-io/hindsight#1303 overwrite), and clears _session_turns /
  _turn_counter / _turn_index so in-flight batches don't flush under
  the new document id. parent_session_id only overwritten when provided
  (avoids clobbering on a bare switch).

Tests
-----
- tests/agent/test_memory_session_switch.py: new dedicated file. ABC
  default no-op, manager fan-out, failure isolation, empty-id no-op,
  session_id propagation through sync_all/queue_prefetch_all, Hindsight
  state transitions for every reset/non-reset case, parent preservation.
- tests/cli/test_branch_command.py: new test verifying /branch fires
  the hook with correct parent_session_id + reset=False + reason.
- tests/gateway/test_resume_command.py: new test verifying /resume
  evicts the cached agent.
- tests/run_agent/test_memory_sync_interrupted.py: updated existing
  assertions to account for the session_id kwarg on sync_all and
  queue_prefetch_all.

E2E verified (real imports, tmp HERMES_HOME):
- /resume: session_id updates, doc_id fresh, buffers cleared, parent set
- /branch: session_id forks, parent links to original
- /new: reset=True clears accumulated state
- compression: reason='compression' propagated, lineage preserved
- Empty id: no-op, state preserved
- Legacy provider without on_session_switch: no crash

Reported by @nicoloboschi (Hindsight maintainer); related scope-widening
comment by @kidonng extending coverage to compression.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…on switch

Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch
introduced by NousResearch#17409.

1. Buffered turns silently lost when retain_every_n_turns > 1.
   on_session_switch unconditionally cleared _session_turns without
   flushing. Users who batched every N>1 turns and switched mid-batch
   (/reset, /new, /resume, /branch, or context compression) had those
   buffered turns disappear. Same data-loss class as the shutdown race,
   different lifecycle event.

   Note commit_memory_session() -> on_session_end() runs *before*
   on_session_switch on /reset, but Hindsight doesn't implement
   on_session_end so the buffer survives that step and dies at clear
   time. /resume, /branch, and compression skip commit_memory_session
   entirely so an on_session_end impl wouldn't help them anyway.

   Fix: snapshot the old _session_id, _document_id, _parent_session_id,
   _turn_index, and _session_turns; spawn one final retain that lands
   under the OLD document_id; then rotate state. Metadata is built
   synchronously against the old self._* so session_id / lineage tags
   on the flushed item all reference the prior session consistently.

2. Stale _prefetch_result leaks across switch.
   If queue_prefetch ran in the old session and the result hadn't been
   consumed by prefetch() yet, on_session_switch left the cached recall
   text in place. The next session's first prefetch() call would return
   text mined from the prior session's bank/query.

   Fix: join any in-flight _prefetch_thread (3s bounded — matches
   shutdown()), then clear _prefetch_result under _prefetch_lock before
   rotating session_id.

Tests
-----
- tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush):
    - buffered turns flushed under OLD document_id with OLD lineage tags
    - empty buffer => no spurious retain
    - _prefetch_result cleared on switch
    - in-flight prefetch thread is awaited before clear (no race)
- tests/agent/test_memory_session_switch.py: factory extended to seed the
  attrs the new flush path reads (_retain_source, _platform, _bank_id,
  prefetch state, etc.) and stub _run_hindsight_operation so existing
  switch-state assertions keep passing without network setup.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder comp/cli CLI entry point, hermes_cli/, setup wizard comp/gateway Gateway runner, session dispatch, delivery comp/plugins Plugin system and bundled plugins P3 Low — cosmetic, nice to have tool/memory Memory tool and memory providers type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(memory): notify memory providers on in-chat /resume and /branch session switches

2 participants