Skip to content

fix(gateway): emit session:end from idle-expiry watcher + auto-reset paths#28750

Open
mbs-vhs wants to merge 3 commits into
NousResearch:mainfrom
mbs-vhs:fix/session-end-emit-from-idle-and-autoreset
Open

fix(gateway): emit session:end from idle-expiry watcher + auto-reset paths#28750
mbs-vhs wants to merge 3 commits into
NousResearch:mainfrom
mbs-vhs:fix/session-end-emit-from-idle-and-autoreset

Conversation

@mbs-vhs

@mbs-vhs mbs-vhs commented May 19, 2026

Copy link
Copy Markdown

What does this PR do?

Fix the gateway session:end hook so it fires from every session-close path, not just /new//reset. Without this, external hook subscribers (under ~/.hermes/hooks/) silently miss every idle-expiry- and auto-reset-driven close — leaving stale mirror state forever.

The gateway emits the lifecycle event session:end from exactly one call site, the /new//reset handler. Two other paths close sessions but don't emit:

  1. Idle-expiry watcher (gateway/run.py:_session_expiry_watcher) — fires the plugin-level on_session_finalize (per [Bug]: The on_session_finalize hook is not being fired when gateway sessions expire due to configured idle time. #14981), evicts the cached agent, marks entry.expiry_finalized = True. But never self.hooks.emit("session:end", ...).

  2. Auto-reset branch (gateway/session.py:get_or_create_session) — when a stale entry is rolled over for a new session_id, the local SQLite session ends but no gateway event fires for the old session_id.

This PR makes session:end symmetric with session:start — every close path now emits, every subscriber sees the close.

Related Issue

Fixes #28746

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • gateway/run.py:_session_expiry_watcher — emit session:end with reason="idle_expiry" right after entry.expiry_finalized = True. Wrapped in try/except logger.debug(..., exc_info=True) so a misbehaving subscriber can't break the watcher loop. (platform is pulled from the session_key just like the existing on_session_finalize invocation; user_id from entry.origin with the same or "" guard pattern used elsewhere in this file.)

  • gateway/run.py:_handle_message_with_agent — in the _is_new_session branch, before emitting session:start, check getattr(session_entry, "auto_reset_prior_session_id", None). If set, emit session:end with reason="auto_reset", then clear the field so a follow-up turn cannot re-emit for the same close.

  • gateway/session.py — add a transient field auto_reset_prior_session_id: Optional[str] = None to SessionEntry. get_or_create_session populates it (with the prior session_id) whenever the auto-reset branch fires. Not added to to_dict() — this is a transient signal consumed by the next emit pass, never persisted to sessions.json.

  • gateway/hooks.py — docstring update to reflect the new firing contract: session:end now lists the three close paths and notes the reason key on the new emit paths.

  • tests/gateway/test_session_boundary_hooks.py — three new tests:

    • test_idle_expiry_emits_session_end — runs _session_expiry_watcher against a mocked-out store with one expired entry; asserts the emit fires with the expired session_id, session_key, and reason="idle_expiry". Regression guard for the watcher path.
    • test_auto_reset_emits_session_end_for_prior_session — exercises the emit fragment in _handle_message_with_agent against a SessionEntry that was just produced by auto-reset (with auto_reset_prior_session_id="sess-old-prior"); asserts session:end fires before session:start, with the right session_id + reason="auto_reset", and that the transient field is cleared after consume.
    • test_session_entry_has_auto_reset_prior_session_id_field — dataclass-level sanity check (default None, writable, not in to_dict()).

The existing 6 tests in this file (test_reset_*, test_finalize_before_reset, test_shutdown_*, test_hook_error_*, test_idle_expiry_fires_finalize_hook) all still pass — no existing call site changes behavior.

How to Test

Automated

pytest tests/gateway/test_session_boundary_hooks.py -v --override-ini="addopts="

All 9 tests pass (6 pre-existing + 3 new).

Manual reproduction

  1. Install a gateway hook in ~/.hermes/hooks/test-session-end/:

    # HOOK.yaml
    name: test-session-end
    events: [session:start, session:end]
    # handler.py
    import datetime, json, pathlib
    LOG = pathlib.Path.home() / "session_end_test.log"
    async def handle(event_type, context):
        LOG.parent.mkdir(parents=True, exist_ok=True)
        with LOG.open("a") as f:
            f.write(json.dumps({"event": event_type, "ts": datetime.datetime.now().isoformat(), "ctx": context}) + "\n")
  2. Start the gateway. Open a Telegram DM session. Send one message. Confirm session:start lands in ~/session_end_test.log.

  3. Stop messaging. Wait for the configured idle-reset window to pass plus one 5-min watcher tick.

  4. Before this fix: the log shows only session:start (no session:end). ~/.hermes/sessions/sessions.json shows expiry_finalized: true for the session.

  5. After this fix: the log shows session:end with reason: "idle_expiry", session_id, session_key, platform, user_id. sessions.json still shows expiry_finalized: true.

  6. Repeat for the auto-reset path: start a fresh session, idle past the reset window, then send a NEW message (not /new). The handler logs session:end for the OLD session_id (with reason: "auto_reset"), then session:start for the NEW one.

Note on Hermes version

Tested against the fork's main branch (currently aligned with hermes-agent 0.14.0 per pyproject.toml). The patched code paths exist identically in 0.13.0 and earlier — the line numbers in the original issue reference the v0.13.0 layout. The semantic anchors (function names, code shapes) are unchanged.

Checklist

Code

Documentation & Housekeeping

  • I've updated relevant documentation — gateway/hooks.py Events docstring reflects the new firing contract for session:end
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — N/A (no platform-specific code paths touched)
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A

…paths

The gateway ``session:end`` hook only fired from the explicit
``/new``/``/reset`` command handler.  Two other paths that terminate a
session — the background idle-expiry watcher and the auto-reset branch
in ``SessionStore.get_or_create_session`` — fired only the plugin-level
``on_session_finalize`` hook and closed the session locally without
notifying gateway hook subscribers.

External hooks under ``~/.hermes/hooks/`` that subscribe to ``session:end``
(to close mirror rows in an external DB, finalize logging, etc.)
silently missed every idle-expiry- and auto-reset-driven close — leaving
stale state forever.

This change makes ``session:end`` symmetric with ``session:start``:

- ``gateway/run.py:_session_expiry_watcher`` now emits ``session:end``
  with ``reason="idle_expiry"`` right after marking
  ``entry.expiry_finalized = True``.  Wrapped in ``try/except`` so a
  misbehaving subscriber can't break the watcher loop.

- ``gateway/run.py:_handle_message_with_agent`` now emits ``session:end``
  for the OLD session_id (with ``reason="auto_reset"``) in the
  ``_is_new_session`` branch — BEFORE emitting ``session:start`` for the
  new one — when ``SessionEntry.auto_reset_prior_session_id`` is set.

- ``gateway/session.py`` adds a transient
  ``auto_reset_prior_session_id`` field on ``SessionEntry`` (default
  ``None``, not persisted to ``sessions.json``) so ``get_or_create_session``
  can carry the prior id forward to the emit pipeline.

- ``gateway/hooks.py`` docstring updated to document the new firing
  contract (``reason`` keys for the new emit paths).

Existing behavior is preserved: the explicit ``/new``/``/reset`` emit
shape at the original call site is unchanged, and existing subscribers
keep receiving the same events for the same paths.  The new emits
include ``session_id`` (the explicit ``/new`` emit currently doesn't —
that asymmetry is left for a separate change because it's load-bearing
for the new emit paths only).

Closes NousResearch#28746

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@alt-glitch alt-glitch added type/bug Something isn't working comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists labels May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: gateway session:end event not emitted from idle-expiry watcher or auto-reset path

2 participants