fix(gateway): keep idle cached agents alive until session expires #31856

Open
trismegistus-wanderer wants to merge 1 commit into NousResearch:main from trismegistus-wanderer:fix/gateway-agent-cache-session-expiry

trismegistus-wanderer commented May 25, 2026

What does this PR do?

The idle-TTL sweep (_sweep_idle_cached_agents) was evicting agents as soon as they passed _AGENT_CACHE_IDLE_TTL_SECS, even when the session hadn't expired yet. In daily-reset mode the reset can fire hours after the last user message — evicting the agent early means the session-expiry watcher has no agent in cache to call on_session_end() with, so memory providers miss the live transcript.

Now the sweep checks the session store before evicting: if the session still exists and hasn't expired, the agent stays in cache so the expiry watcher can tear it down properly later. When the session store is unavailable or throws, the sweep falls back to the original eviction behavior (safe default).
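The decision described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in gateway/run.py: `InMemorySessionStore`, `Session.expires_at`, `CachedAgent`, `session_still_live`, and the 3600-second TTL are all hypothetical stand-ins, since the real session store API and the value of `_AGENT_CACHE_IDLE_TTL_SECS` are not shown in this PR description.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumed value; stands in for _AGENT_CACHE_IDLE_TTL_SECS.
IDLE_TTL_SECS = 3600.0


@dataclass
class Session:
    expires_at: float  # hypothetical field: absolute expiry time in seconds


@dataclass
class CachedAgent:
    last_activity_ts: Optional[float]


class InMemorySessionStore:
    """Hypothetical store; the gateway's real session store API may differ."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}

    def put(self, key: str, session: Session) -> None:
        self._sessions[key] = session

    def get(self, key: str) -> Optional[Session]:
        return self._sessions.get(key)


def session_still_live(store, key: str, now: float) -> bool:
    """True only when the store confirms the session exists and is unexpired."""
    if store is None:
        return False  # store unavailable: fall back to plain idle eviction
    try:
        session = store.get(key)
    except Exception:
        return False  # store threw: fall back to plain idle eviction
    return session is not None and session.expires_at > now


def sweep_idle_cached_agents(
    cache: Dict[str, CachedAgent], store, now: Optional[float] = None
) -> List[str]:
    """Evict agents past the idle TTL unless their session is still live."""
    now = time.time() if now is None else now
    evicted: List[str] = []
    for key, agent in list(cache.items()):
        if agent.last_activity_ts is None:
            continue  # no activity timestamp: skipped, as the existing test name suggests
        if now - agent.last_activity_ts <= IDLE_TTL_SECS:
            continue  # not idle long enough
        if session_still_live(store, key, now):
            continue  # keep the agent so the expiry watcher can tear it down later
        del cache[key]
        evicted.append(key)
    return evicted
```

In this sketch an idle agent is evicted in three cases: the store reports the session as expired, the store has no record of the session, or the store cannot be consulted at all. Only a positive "still live" answer keeps the agent in cache.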

Related Issue

Fixes #11205

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • gateway/run.py: _sweep_idle_cached_agents() now checks the session store before evicting an agent past idle TTL. If the session hasn't actually expired (e.g., daily-reset mode where the reset fires hours after the last user message), the agent stays in cache so _session_expiry_watcher can find it later and call on_session_end() with the live transcript. The sweep falls back to the original eviction behavior when the session store is unavailable or throws.

  • tests/gateway/test_agent_cache.py — Two new tests in TestAgentCacheBoundedGrowth:

    • test_idle_sweep_keeps_agent_when_session_not_expired — agent past idle TTL is kept when session store says session is alive
    • test_idle_sweep_evicts_when_session_is_expired — agent past idle TTL is evicted when session store confirms session expired
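For readers without the diff at hand, the two cases might be exercised along these lines. This is a sketch under stated assumptions, not the tests in tests/gateway/test_agent_cache.py: the `sweep` function is a reduced stand-in for the patched sweep, and the store's `get()` method, the `expires_at` attribute, and the TTL value are hypothetical.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock

IDLE_TTL_SECS = 3600.0  # assumed value; stands in for _AGENT_CACHE_IDLE_TTL_SECS
NOW = 10_000.0          # fixed clock so the tests are deterministic


def sweep(cache, session_store, now):
    """Stand-in for the patched sweep, reduced to the decision under test."""
    for key in list(cache):
        if now - cache[key].last_activity_ts <= IDLE_TTL_SECS:
            continue  # not idle long enough
        try:
            session = session_store.get(key)  # hypothetical store method
            live = session is not None and session.expires_at > now
        except Exception:
            live = False  # store failure: evict as before
        if not live:
            del cache[key]


def make_cache():
    # One agent whose last activity is far past the idle TTL.
    return {"s1": SimpleNamespace(last_activity_ts=0.0)}


def test_idle_sweep_keeps_agent_when_session_not_expired():
    store = MagicMock()
    store.get.return_value = SimpleNamespace(expires_at=NOW + 60)
    cache = make_cache()
    sweep(cache, store, now=NOW)
    assert "s1" in cache
    store.get.assert_called_once_with("s1")


def test_idle_sweep_evicts_when_session_is_expired():
    store = MagicMock()
    store.get.return_value = SimpleNamespace(expires_at=NOW - 60)
    cache = make_cache()
    sweep(cache, store, now=NOW)
    assert cache == {}
```

Passing the clock in as `now` avoids patching `time.time()`; whether the real tests do the same is not stated in this description.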

How to Test

  1. Run the gateway in daily-reset mode with an active memory provider
  2. Send a message, then stop interacting for longer than _AGENT_CACHE_IDLE_TTL_SECS but before the daily reset time
  3. Without the fix: the agent is evicted from cache before the session expires; when the session-expiry watcher fires, it finds no agent and memory providers miss on_session_end()
  4. With the fix: the agent stays in cache until the session actually expires in the session store, then _session_expiry_watcher tears it down properly — memory providers receive the live transcript

Unit test verification: python -m pytest tests/gateway/test_agent_cache.py::TestAgentCacheBoundedGrowth -v -o 'addopts='

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: Fedora 44 (Linux 6.19)

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
  • I've considered cross-platform impact (Windows, macOS) — N/A (pure Python session-store logic, no OS surface)
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A

Screenshots / Logs

$ python -m pytest tests/gateway/test_agent_cache.py::TestAgentCacheBoundedGrowth -v -o 'addopts='
...
test_cap_evicts_lru_when_exceeded PASSED
test_cap_respects_move_to_end PASSED
test_cap_triggers_cleanup_thread PASSED
test_idle_ttl_sweep_evicts_stale_agents PASSED
test_idle_sweep_skips_agents_without_activity_ts PASSED
test_idle_sweep_keeps_agent_when_session_not_expired PASSED  ← new
test_idle_sweep_evicts_when_session_is_expired PASSED         ← new
test_plain_dict_cache_is_tolerated PASSED
test_main_lookup_updates_lru_order PASSED
9 passed in 0.40s

$ python -m pytest tests/gateway/test_shutdown_cache_cleanup.py tests/gateway/test_shutdown_memory_provider_messages.py -o 'addopts='
15 passed in 0.95s

@alt-glitch added the type/bug, comp/gateway, and P2 labels May 25, 2026
Labels

  • comp/gateway: Gateway runner, session dispatch, delivery
  • P2: Medium (degraded but workaround exists)
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

[Bug]: MemoryProvider.on_session_end() never called on gateway session expiry

2 participants