Skip to content

[Bug]: MemoryProvider.on_session_end() never called on gateway session expiry #11205

@eric-tramel

Description

@eric-tramel

Bug Description

MemoryProvider.on_session_end() is defined in the ABC (agent/memory_provider.py:153) and fully dispatched by MemoryManager.on_session_end() (agent/memory_manager.py:285-293), but the gateway never invokes it when a session actually ends.

Gateway sessions end via one of three paths — all handled by GatewayRunner._async_flush_memories()_flush_memories_for_session() (gateway/run.py:850 and :743):

  1. Idle-timeout expiry from the _session_expiry_watcher (gateway/run.py:2046)
  2. Daily scheduled reset (same watcher)
  3. Explicit session reset from a platform handler (gateway/run.py:4343, :6422)

Each of these paths builds a separate flush AIAgent with the builtin memory tool and prompts it to save durable facts, but it never calls memory_manager.on_session_end(history) on the cached live agent's memory manager. The result: for any plugin that implements on_session_end as its final-pass extraction hook (matching the ABC docstring — "end-of-session extraction"), that hook fires exactly zero times on gateway platforms, regardless of how many sessions expire.

grep confirms zero call sites of on_session_end under gateway/ — every current call site lives in run_agent.py and only runs during CLI-side graceful shutdown (run_agent.py:3202, :3227), which gateway sessions don't exercise.

This is the same structural class of bug as #7193 (on_turn_start defined + dispatched but never called) and #7192 (on_pre_compress return value silently discarded): the hook contract exists, the dispatch layer works, but one caller is missing. It is adjacent to — but distinct from — #6157, which covers a different problem in the same flush path (the bespoke flush agent is built with skip_memory=True, breaking the builtin memory tool for that agent).

Steps to Reproduce

  1. Install a memory provider plugin that implements on_session_end() and logs when it's called. The built-in obsidian_consolidation plugin is a convenient reference — it logs a SESSION_END line to obsidian_consolidation.log from the hook.
  2. Run the gateway (hermes gateway) and interact with a platform session (Slack/Telegram/Discord) long enough to produce ≥4 messages.
  3. Stop messaging and let the session expire — either by waiting out the configured idle_minutes or by triggering the daily reset.
  4. Inspect the plugin's log. TRIGGERED | reason=periodic and TRIGGERED | reason=pre_compress entries appear as expected; SESSION_END entries never appear.

Quick sanity check:

grep -rn "on_session_end\|\.on_session_end(" gateway/ agent/ run_agent.py
# Under gateway/: zero matches.

Expected Behavior

When a gateway session ends (idle expiry, scheduled reset, or explicit reset), the flush path should call memory_manager.on_session_end(history) on the live session's memory manager before teardown, so any registered MemoryProvider gets its documented final-pass extraction opportunity.

Actual Behavior

The hook is never invoked. Plugin providers that depend on on_session_end for their final extraction pass only ever run during periodic / pre_compress triggers, and lose whatever final-pass work (comprehensive summary, reconciliation, durable write) they were designed to do at session boundaries.

Affected Component

Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

Slack

Debug Report

can't :)

Operating System

Ubuntu 24.04

Python Version

3.12.13

Hermes Version

0.9.0

Additional Logs / Traceback (optional)

Representative plugin log from a gateway deployment over a multi-day window — dozens of expired sessions, zero `on_session_end` dispatches:


TRIGGERED | reason=periodic       (many)
TRIGGERED | reason=pre_compress   (many)
TRIGGERED | reason=session_end    0

Root Cause Analysis (optional)

The gateway's flush path was designed around the builtin memory tool rather than the MemoryProvider plugin interface. In gateway/run.py:_flush_memories_for_session (line 74
3), the function:

  1. Loads the transcript from session_store.
  2. Constructs a fresh AIAgent(skip_memory=True, enabled_toolsets=["memory", "skills"], ...).
  3. Runs a one-shot prompt that asks the model to call memory/skill_manage tools.
  4. Returns.

Proposed Fix (optional)

In gateway/run.py:_flush_memories_for_session, before (or instead of) spawning the bespoke flush AIAgent, invoke the provider hook on the live memory manager. Sketch:

# gateway/run.py, inside _flush_memories_for_session(old_session_id, session_key=...)
try:
    cached_agent = self._cached_agents.get(session_key) if session_key else None
    if cached_agent and getattr(cached_agent, "_memory_manager", None):
        history = self.session_store.load_transcript(old_session_id) or []
        cached_agent._memory_manager.on_session_end(history)
except Exception as exc:
    logger.warning("on_session_end dispatch failed for session %s: %s", old_session_id, exc)

This keeps the existing bespoke-flush behavior intact (fixing #6157 independently) and restores the documented on_session_end contract for plugin providers. Providers that do
n't need the hook remain unaffected thanks to the ABC's default no-op.

A follow-up could deprecate the bespoke flush agent entirely once the plugin hook covers its functionality, but that's a larger scope and not required by this fix.

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

Metadata

Assignees

No one assigned

    Labels

    type/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions