fix(telegram): prune stale DM topic binding when Bot API returns Thread-not-found #31512

Open
xxxigm wants to merge 3 commits into NousResearch:main from xxxigm:fix/31501-telegram-prune-stale-topic-binding

xxxigm (Contributor) commented May 24, 2026

Summary

Closes #31501.

When a Telegram user deletes a DM topic in the client, the Bot
API responds to the gateway's next send with Thread not found.
The adapter already falls back to a plain send (no
message_thread_id), but the matching row in
telegram_dm_topic_bindings was left untouched — so
gateway.run._recover_telegram_topic_thread_id kept walking the
user's bindings newest-first and steering every later inbound
message back to the deleted topic. Tool progress, approvals and
replies all silently landed in the wrong place until the
operator hand-edited state.db.

What changed

  • SessionDB.delete_telegram_topic_binding(chat_id, thread_id)
    — targeted prune helper; returns the row count, silently
    no-ops when the topic-mode tables haven't been migrated yet
    so the helper is safe to call from a send-fallback hot path.
  • TelegramAdapter._prune_stale_dm_topic_binding(chat_id, thread_id)
    — adapter glue that calls the helper, logs at INFO when a
    row is dropped, and swallows DB exceptions so a failed
    cleanup never breaks the user-facing send.
  • Both existing Thread not found fallback sites now invoke
    the prune helper:
    • The streaming send loop's second-failure branch (the one
      that flips used_thread_fallback to True and retries
      without message_thread_id). The same-thread one-shot
      retry stays untouched — Bot API has been observed to
      return a transient Thread not found that recovers on
      immediate retry, so we only prune after the second
      failure confirms the topic really is gone.
    • The _send_message_with_thread_fallback helper used for
      control-style sends (approval prompts, model picker,
      update prompts).
  • Optional channel_directory.json cleanup mentioned in the
    bug report's "Proposed Fix" is not included — that file
    is a separate cron / Discord / MCP concept and the Telegram
    topic adapter never writes to it.
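
The two-stage behaviour described for the streaming send loop can be sketched roughly as follows. This is a simplified illustration, not the adapter's actual code: `ThreadNotFound`, `send_once` and `prune` are hypothetical stand-ins for the Bot API error, a single send attempt, and the prune helper.

```python
class ThreadNotFound(Exception):
    """Hypothetical stand-in for the Bot API 'Thread not found' error."""


def send_with_thread_fallback(send_once, prune, chat_id, thread_id, text):
    """Retry once on the same thread, then prune the binding and send plain.

    send_once(chat_id, text, thread_id) performs one send attempt.
    prune(chat_id, thread_id) drops the stale binding (best-effort).
    Both callables are assumptions made for this sketch.
    """
    try:
        return send_once(chat_id, text, thread_id)
    except ThreadNotFound:
        pass  # Possibly transient: give the same thread one more chance.

    try:
        return send_once(chat_id, text, thread_id)
    except ThreadNotFound:
        # Second failure: treat the topic as gone and prune before the retry.
        prune(chat_id, thread_id)

    # Plain send without message_thread_id.
    return send_once(chat_id, text, None)
```

A transient failure that recovers on the second attempt never reaches `prune`, which matches the intent of leaving the one-shot retry untouched.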

Test plan

  • pytest tests/gateway/test_telegram_prune_stale_topic_binding_31501.py
    — 13 new tests covering the SessionDB helper contract,
    the adapter glue's defensive paths, source-level guards
    on both fallback sites, and an end-to-end semantic test
    that confirms _recover_telegram_topic_thread_id no
    longer returns the pruned thread.
  • pytest tests/gateway/test_telegram_thread_fallback.py tests/gateway/test_telegram_topic_mode.py tests/test_hermes_state.py
    — 299 existing tests pass (no regression in the
    surrounding fallback / topic-mode / state-db suites).

xxxigm added 3 commits May 24, 2026 21:01
…ch#31501)

Targeted ``(chat_id, thread_id)`` prune for the
``telegram_dm_topic_bindings`` table — the missing piece for
NousResearch#31501, where the Telegram adapter detects a topic the user
deleted out-of-band but the binding row keeps living in
state.db.  The recovery logic in
``gateway.run._recover_telegram_topic_thread_id`` then steers
every future inbound message back to the dead topic, dropping
tool progress, approvals and replies into the wrong place.

Returns the number of rows deleted; silently no-ops when the
topic-mode tables haven't been migrated yet (read-only / pristine
profile) so the helper is safe to call from a send-fallback
hot path before the schema has run.
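
A minimal sketch of a helper with this contract, written against plain `sqlite3`. The real method lives on ``SessionDB``; the free-function form, the column names and the table-existence check via `sqlite_master` are assumptions made for illustration.

```python
import sqlite3

TABLE = "telegram_dm_topic_bindings"


def _table_exists(conn: sqlite3.Connection, name: str) -> bool:
    """Return True when the named table is present in this database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def delete_telegram_topic_binding(conn: sqlite3.Connection, chat_id: str, thread_id: str) -> int:
    """Delete one (chat_id, thread_id) binding and return the rows removed."""
    if not _table_exists(conn, TABLE):
        # Unmigrated profile: nothing to prune, so no-op silently.
        return 0
    cur = conn.execute(
        f"DELETE FROM {TABLE} WHERE chat_id = ? AND thread_id = ?",
        (chat_id, thread_id),
    )
    conn.commit()
    return cur.rowcount
```

Keying the DELETE on both columns is what keeps sibling bindings for the same chat intact.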
…ousResearch#31501)

Both fallback sites that currently log "Thread X not found,
retrying without message_thread_id" now also drop the
``telegram_dm_topic_bindings`` row keyed on
``(chat_id, thread_id)``:

* The streaming send loop (``send`` body) — fires on the
  second failure, after the same-thread one-shot retry confirms
  the thread really is gone (the first attempt is left alone
  because Bot API has been observed to return a transient
  "Thread not found" that recovers on immediate retry).
* The control-message helper ``_send_message_with_thread_fallback``
  (approval prompts, model picker, update prompts) — single-shot
  retry, prune unconditionally on the BadRequest match.

Without this prune, a user who deletes a Telegram DM topic in
the client keeps getting their next inbound message recovered
back to the dead thread by
``_recover_telegram_topic_thread_id`` in ``gateway/run.py``,
which walks the per-user binding list newest-first and treats
the deleted thread as authoritative.  The reproduction in the
bug report is exactly this: tool progress, approvals, activity
messages and replies all land in the wrong place until the user
manually runs DELETE on state.db.

Cleanup is best-effort — we log at INFO when it succeeds, swallow
any exception from the SessionDB call, and the user-facing send
proceeds either way.

Refs NousResearch#31501
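
The best-effort cleanup described in this commit could look roughly like the sketch below. It is written as a free function taking a `db` object rather than as the real ``TelegramAdapter`` method, and the log wording is invented; only the behaviour (INFO on success, swallow exceptions, skip `None` keys) comes from the PR text.

```python
import logging

logger = logging.getLogger(__name__)


def prune_stale_dm_topic_binding(db, chat_id, thread_id) -> int:
    """Best-effort prune; never raises, returns the number of rows dropped."""
    if chat_id is None or thread_id is None:
        # Refuse to issue a DELETE with a missing key.
        return 0
    delete = getattr(db, "delete_telegram_topic_binding", None)
    if delete is None:
        # The store does not expose the helper: nothing to do.
        return 0
    try:
        removed = delete(chat_id, thread_id)
    except Exception:
        # A failed cleanup must not break the user-facing send.
        logger.debug("DM topic binding prune failed", exc_info=True)
        return 0
    if removed:
        logger.info(
            "Pruned stale DM topic binding chat_id=%s thread_id=%s",
            chat_id,
            thread_id,
        )
    return removed
```

Returning 0 on every defensive path lets the caller continue with the plain send without branching on the cleanup result.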
…h#31501)

Thirteen tests across four layers:

* ``SessionDB.delete_telegram_topic_binding`` — pin the new
  helper's contract: removes only the (chat_id, thread_id) row
  it was asked about, leaves siblings alone, returns 0 silently
  when the row never existed, and is a no-op on a pristine
  database whose topic-mode tables haven't been migrated yet.
* ``TelegramAdapter._prune_stale_dm_topic_binding`` — the glue
  must drop the binding when ``self._session_store._db``
  exposes the helper, swallow exceptions so a failed cleanup
  never breaks the user-facing send, and refuse to issue a
  DELETE for ``chat_id=None`` / ``thread_id=None`` so a
  bookkeeping miss can't accidentally null-match every row.
* Source-level guards on ``TelegramAdapter.send`` and
  ``_send_message_with_thread_fallback`` — the prune call must
  sit beside the two existing "Thread X not found, retrying
  without message_thread_id" warnings, before the retry runs,
  so a future refactor can't silently drop the cleanup wire.
* End-to-end semantic — once a topic is pruned, the
  ``GatewayRunner._recover_telegram_topic_thread_id`` walk
  steers future inbound messages to the surviving binding
  instead of the dead one.  This is the exact behaviour change
  the bug report's reproduction asks for: no more landings in
  the wrong topic until the operator hand-edits ``state.db``.

Refs NousResearch#31501
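
The end-to-end expectation can be illustrated with a toy model of the newest-first walk. Everything here is hypothetical: the `user_id` and `updated_at` columns, the `recover_thread_id` function and the single-row lookup are simplifications, not the schema or logic of ``GatewayRunner._recover_telegram_topic_thread_id``.

```python
import sqlite3


def recover_thread_id(conn: sqlite3.Connection, user_id: str):
    """Return the newest bound thread for a user, or None (toy model)."""
    row = conn.execute(
        "SELECT thread_id FROM telegram_dm_topic_bindings "
        "WHERE user_id = ? ORDER BY updated_at DESC LIMIT 1",
        (user_id,),
    ).fetchone()
    return row[0] if row else None


def demo():
    """Show which thread is recovered before and after pruning the newest one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE telegram_dm_topic_bindings "
        "(user_id TEXT, chat_id TEXT, thread_id TEXT, updated_at REAL)"
    )
    conn.executemany(
        "INSERT INTO telegram_dm_topic_bindings VALUES (?, ?, ?, ?)",
        [("u1", "c1", "10", 1.0), ("u1", "c1", "20", 2.0)],
    )
    before = recover_thread_id(conn, "u1")  # newest binding, topic "20"
    # Simulate the prune after Bot API reported the topic as gone.
    conn.execute(
        "DELETE FROM telegram_dm_topic_bindings WHERE chat_id = ? AND thread_id = ?",
        ("c1", "20"),
    )
    after = recover_thread_id(conn, "u1")  # falls back to the surviving binding
    return before, after
```

In this model, the walk keeps returning the deleted topic until the row is removed, which is the failure mode the bug report describes.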
alt-glitch added labels on May 24, 2026: type/bug (Something isn't working), comp/gateway (Gateway runner, session dispatch, delivery), platform/telegram (Telegram bot adapter), P2 (Medium: degraded but workaround exists)

Development

Successfully merging this pull request may close these issues.

Telegram DM topic bindings are not pruned when Bot API returns Thread not found

2 participants