fix(gateway): publish compression route advances #26204

Closed
bizyumov wants to merge 1 commit into NousResearch:main from bizyumov:fix/compression-route-publication

@bizyumov

Summary

  • Publish compression-induced route advances through SessionStore without using explicit session-switch semantics.
  • Follow Telegram topic bindings to compression tips before transcript load and rebind topic lanes after compression splits.
  • Evict cached agents when their session id disagrees with the canonical route session.
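The third bullet can be illustrated with a minimal sketch. It is not the PR's code: `CachedAgent`, `AgentCache`, and `get_checked` are hypothetical names chosen for illustration, and the cache is assumed to be a plain in-memory dict keyed by route.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedAgent:
    """Hypothetical stand-in for a cached agent object."""
    session_id: str


class AgentCache:
    """Hypothetical in-memory cache keyed by route (for example a chat/thread key)."""

    def __init__(self) -> None:
        self._agents: Dict[str, CachedAgent] = {}

    def put(self, route_key: str, agent: CachedAgent) -> None:
        self._agents[route_key] = agent

    def get_checked(self, route_key: str, canonical_session_id: str) -> Optional[CachedAgent]:
        """Return the cached agent only if it matches the canonical route session.

        A mismatched entry is evicted, so the caller has to build a fresh agent
        for the canonical session instead of reusing a stale one.
        """
        agent = self._agents.get(route_key)
        if agent is None:
            return None
        if agent.session_id != canonical_session_id:
            del self._agents[route_key]  # evict the stale agent
            return None
        return agent
```

In this sketch a cache miss and an eviction look the same to the caller, which keeps the rebuild path in one place.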

Tests

  • /opt/hermes/src/venv/bin/python -m pytest -q tests/gateway/test_telegram_topic_mode.py tests/gateway/test_agent_cache.py tests/gateway/test_session_hygiene.py tests/gateway/test_compress_command.py

Closes #25921

@alt-glitch added the labels type/bug, P1, comp/gateway, and platform/telegram on May 15, 2026
@alt-glitch

Collaborator

Competing fix with #26088 — both address #25921 (infinite preflight compression loop from parent-sized history after compression split). This PR additionally handles Telegram topic rebinding after compression splits.

NishantEC

This comment was marked as outdated.

teknium1 added a commit that referenced this pull request May 29, 2026
…hildren (#34409)

Telegram DM topic bindings persist (chat_id, thread_id) -> session_id in
SQLite so reopening a topic resumes the right Hermes session. When
compression rotated session_entry.session_id mid-turn, the binding row
stayed pointed at the pre-compression parent. On the next inbound
message in that topic the gateway reloaded the oversized parent
transcript, retriggering preflight compression — sometimes in a loop.

Two-pronged fix:

1. `_sync_telegram_topic_binding(source, entry, *, reason)` helper
   called immediately after each of the three session_id rotation sites
   in _handle_message_with_agent (hygiene compression, agent-result
   compression rotation, /compress command). Keeps future bindings
   fresh.

2. Read-path self-heal: when resolving an existing topic binding, walk
   SessionDB.get_compression_tip() forward and switch_session to the
   descendant instead of the stored parent. Rewrites the binding row to
   the tip so subsequent messages skip the walk. Heals existing stale
   state on the next user message without requiring a gateway restart.
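The two prongs above can be sketched roughly as follows. This is a toy model under stated assumptions, not the merged implementation: `ToySessionDB`, `record_compression`, `sync_topic_binding`, and `resolve_topic_binding` are hypothetical names, plain dicts stand in for the SQLite tables, and the signature given here for `get_compression_tip` is assumed rather than taken from the real `SessionDB`.

```python
from typing import Dict, Optional, Tuple

TopicKey = Tuple[int, int]  # (chat_id, thread_id)


class ToySessionDB:
    """Hypothetical in-memory stand-in for the session store."""

    def __init__(self) -> None:
        # parent session_id -> session_id created by compressing it
        self.child_of: Dict[str, str] = {}
        # (chat_id, thread_id) -> bound session_id
        self.bindings: Dict[TopicKey, str] = {}

    def record_compression(self, parent_id: str, child_id: str) -> None:
        self.child_of[parent_id] = child_id

    def get_compression_tip(self, session_id: str) -> str:
        """Walk compression children forward until no further child exists."""
        seen = {session_id}
        current = session_id
        while current in self.child_of:
            current = self.child_of[current]
            if current in seen:  # guard against a corrupt cycle
                break
            seen.add(current)
        return current


def sync_topic_binding(db: ToySessionDB, key: TopicKey, new_session_id: str) -> None:
    """Write path: repoint the binding right after a session_id rotation."""
    db.bindings[key] = new_session_id


def resolve_topic_binding(db: ToySessionDB, key: TopicKey) -> Optional[str]:
    """Read path: follow a stale binding to its tip and rewrite the row."""
    stored = db.bindings.get(key)
    if stored is None:
        return None
    tip = db.get_compression_tip(stored)
    if tip != stored:
        db.bindings[key] = tip  # self-heal so later reads skip the walk
    return tip
```

With a binding stored as `"s0"` and compressions `s0 -> s1 -> s2` recorded, `resolve_topic_binding` returns `"s2"` and rewrites the binding, so the oversized parent transcript is not reloaded on the next message.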

Skipped from competing PRs as not load-bearing for the bug:
- advance_session_after_compression SessionStore primitive (#26204/
  #28870/#33416) — preserves end_reason='compression' analytics nicety
  but doesn't affect routing correctness.
- Cached-agent eviction on session_id mismatch — _compress_context()
  already mutates tmp_agent.session_id on the cached object so the
  in-memory agent self-corrects.
- Startup repair pass (#33416) — redundant once the read path heals on
  the next message; one-line CLI follow-up can address bindings for
  topics users never reopen.

Closes #20470, #29712, #33414. Acknowledges work in #23195
(@litvinovvo), #26204 (@bizyumov), #28870 (@donrhmexe), #29713
(@hehehe0803), #29945 (@eugeneb1ack), #33416 (@bizyumov).
@teknium1

Contributor

Thanks for this work. The fix landed via PR #34409, merged as commit db96fc60d.

I reviewed all six PRs targeting #20470 / #29712 / #33414 and synthesized the load-bearing minimum:

  1. Your _sync_telegram_topic_binding helper pattern after each session_id rotation site (Family A, this PR's most direct ancestor).
  2. The compression-tip read-path self-heal from @bizyumov's PRs (Family B), so existing already-stale bindings recover on the next message without a restart.

Skipped intentionally: the advance_session_after_compression SessionStore primitive (analytics nicety), explicit cached-agent eviction (_compress_context() already mutates tmp_agent.session_id on the cached object), and the startup repair pass (redundant once the read path self-heals).

Closing as superseded — your work shaped the final design. Appreciate the thorough analysis.

@teknium1 teknium1 closed this May 29, 2026
KKT-OPT pushed a commit to KKT-OPT/hermes-agent that referenced this pull request May 31, 2026
…hildren (NousResearch#34409)


Labels

  • comp/gateway: Gateway runner, session dispatch, delivery
  • P1: High, major feature broken, no workaround
  • platform/telegram: Telegram bot adapter
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

Gateway can reuse parent-sized history after compression split, causing infinite preflight compression

4 participants