fix(telegram): prevent duplicate message delivery on send timeout #3899

Closed
tmdgusya wants to merge 2 commits into NousResearch:main from tmdgusya:fix/telegram-duplicate-send-on-timeout

@tmdgusya

Contributor

Summary

Fixes duplicate message delivery when send_message() times out but the message was already delivered to the user.

The Bug

TimedOut is a subclass of NetworkError in python-telegram-bot. The retry logic treats it as a transient connection error and retries — but send_message is not idempotent: the message may have already been delivered, so retrying sends it again.

Two retry layers compound the issue:

  • send() inner loop: catches NetworkError → retries 3x
  • _send_with_retry() outer loop: matches "timed out" in error string → retries 2x

Worst case: up to 9 delivery attempts for a single message.
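The arithmetic behind the worst case can be checked with a small simulation. This is only a sketch, not the gateway code: the exception classes are local stand-ins for the python-telegram-bot ones, `send_message`, `send`, and `send_with_retry` are simplified hypothetical versions of the real functions, and it assumes "retries 2x" in the outer loop means one initial call plus two retries (three outer attempts in total).

```python
class NetworkError(Exception):
    """Local stand-in for telegram.error.NetworkError."""


class TimedOut(NetworkError):
    """Local stand-in for telegram.error.TimedOut (a NetworkError subclass)."""


delivered = []  # every entry is one copy the user would see


def send_message(text):
    # Simulated worst case: the request reaches the server and the message
    # is delivered, but the HTTP response never arrives.
    delivered.append(text)
    raise TimedOut("Timed out")


def send(text, inner_attempts=3):
    """Inner layer: retries any NetworkError, which also catches TimedOut."""
    last_error = None
    for _ in range(inner_attempts):
        try:
            send_message(text)
            return None
        except NetworkError as exc:
            last_error = exc
    return str(last_error)


def send_with_retry(text, outer_retries=2):
    """Outer layer: retries when the error string contains 'timed out'."""
    error = send(text)
    for _ in range(outer_retries):
        if error is None or "timed out" not in error.lower():
            break
        error = send(text)
    return error


send_with_retry("Here is your answer")
print(len(delivered))  # 9
```

Each outer attempt runs the full inner loop, so the counts multiply rather than add: 3 inner attempts × 3 outer attempts = 9 deliveries.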

Reproduction

The new test test_telegram_timeout_duplicate.py reproduces this deterministically:

# BEFORE fix (original code):
DUPLICATE DELIVERY: send_message was called 3 times.
User would receive 3 copies of the same message.
Messages delivered: ['Here is your answer', 'Here is your answer', 'Here is your answer']

# AFTER fix:
send_message called 1 time. No duplicates.

The Fix

  1. telegram.py send(): TimedOut is no longer retried in the inner loop — it's raised immediately. SendResult is marked retryable=False for timeouts.
  2. base.py _RETRYABLE_ERROR_PATTERNS: Removed "timeout", "timed out", "readtimeout", "writetimeout", "network" (too broad). Only connection-level errors remain (connecterror, connectionreset, etc.) which are safe to retry because the request never reached the server.

Connection errors (ConnectionError, ConnectionReset, ConnectionRefused) are still retried — these fail before the request reaches the server, so retrying is safe and doesn't cause duplicates.
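A minimal sketch of the inner-loop behaviour described above, under stated assumptions: the exception classes are local stand-ins for the python-telegram-bot ones, the `SendResult` fields other than `retryable` are hypothetical, and `send` takes the sending callable as a parameter purely to keep the example self-contained. The actual code in telegram.py may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


class NetworkError(Exception):
    """Local stand-in for telegram.error.NetworkError."""


class TimedOut(NetworkError):
    """Local stand-in for telegram.error.TimedOut."""


@dataclass
class SendResult:
    # Hypothetical shape; only `retryable` is named in the PR description.
    success: bool
    error: Optional[str] = None
    retryable: bool = True


def send(send_message, text, max_attempts=3):
    """Retry connection-level failures, but never retry a timeout."""
    try:
        for attempt in range(max_attempts):
            try:
                send_message(text)
                return SendResult(success=True)
            except NetworkError as exc:
                # TimedOut is a NetworkError, so check for it first:
                # the message may already have been delivered.
                if isinstance(exc, TimedOut):
                    raise
                # Give up after the last attempt for other network errors.
                if attempt == max_attempts - 1:
                    raise
    except TimedOut as exc:
        return SendResult(False, str(exc), retryable=False)
    except NetworkError as exc:
        return SendResult(False, str(exc), retryable=True)


calls = []


def timing_out(text):
    calls.append(text)
    raise TimedOut("Timed out")


conn_calls = []


def refusing(text):
    conn_calls.append(text)
    raise NetworkError("ConnectError: connection refused")


result = send(timing_out, "Here is your answer")
conn_result = send(refusing, "Here is your answer")
print(len(calls), result.retryable)            # 1 False
print(len(conn_calls), conn_result.retryable)  # 3 True
```

The timeout path makes exactly one attempt and tells the caller not to retry, while the connection-refused path still uses all three attempts.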

Test plan

  • New reproduction test: test_telegram_timeout_duplicate.py — verifies single delivery on timeout
  • Updated test_send_retry.py — verifies timeout skips retry loop
  • Updated test_telegram_thread_fallback.py — added FakeTimedOut to fake module
  • Full gateway test suite: 1763 passed

🤖 Generated with Claude Code

tmdgusya and others added 2 commits March 30, 2026 15:50
TimedOut errors were retried at two layers — send() internally (3x)
and _send_with_retry() (2x) — risking up to 9 delivery attempts for
a single message. Since TimedOut means the request may have reached
Telegram's server, retrying a non-idempotent send_message causes
duplicate messages.

Now TimedOut is propagated immediately from send() and marked as
non-retryable in SendResult, so _send_with_retry skips it.
Connection-level errors (ConnectionError, ConnectionReset, etc.)
remain safely retryable.

Constraint: send_message is not idempotent — no dedup key available
Rejected: retry with dedup | Telegram Bot API has no idempotency keys
Confidence: high
Scope-risk: moderate
Directive: Do not add "timeout" back to _RETRYABLE_ERROR_PATTERNS without idempotency guarantees
Tested: all gateway tests pass (1763 passed)
Not-tested: live network timeout scenario (requires flaky connectivity)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Timeout errors were falling through to the plain-text fallback path
in _send_with_retry, sending "(Response formatting failed, plain
text:)" prefix to users. Timeouts are not formatting errors — return
failure immediately without retry or fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tmdgusya

Contributor Author
image

@dieutx

dieutx commented Mar 30, 2026

Contributor

Heads up — #3922 by @dlkakbs appears to address the same issue (duplicate Telegram messages on send timeout, also fixes #3906). Both PRs modify gateway/platforms/telegram.py and gateway/platforms/base.py. Might be worth coordinating to avoid merge conflicts.

@tmdgusya

Contributor Author

Thank you for the heads-up! Since @dlkakbs closed their PR, I'll keep working on this. Let me know if there's anything else I should check.

teknium1 added a commit that referenced this pull request Apr 5, 2026
TimedOut is a subclass of NetworkError in python-telegram-bot. The
inner retry loop in send() and the outer _send_with_retry() in base.py
both treated it as a transient connection error and retried — but
send_message is not idempotent. When the request reaches Telegram but
the HTTP response times out, the message is already delivered. Retrying
sends duplicates. Worst case: up to 9 copies (inner 3x × outer 3x).

Inner loop (telegram.py):
- Import TimedOut separately, isinstance-check before generic
  NetworkError retry (same pattern as BadRequest carve-out from #3390)
- Re-raise immediately — no retry
- Mark as retryable=False in outer exception handler

Outer loop (base.py):
- Remove 'timeout', 'timed out', 'readtimeout', 'writetimeout' from
  _RETRYABLE_ERROR_PATTERNS (read/write timeouts are delivery-ambiguous)
- Add 'connecttimeout' (safe — connection never established)
- Keep 'network' (other platforms still need it)
- Add _is_timeout_error() + early return to prevent plain-text fallback
  on timeout errors (would also cause duplicate delivery)

Connection errors (ConnectionReset, ConnectError, etc.) are still
retried — these fail before the request reaches the server.

Credit: tmdgusya (PR #3899), barun1997 (PR #3904) for identifying the
bug and proposing fixes.

Closes #3899, closes #3904.
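The outer-loop changes listed in this commit message can be sketched as a three-way classification of the error string. This is an illustration under assumptions, not the code in base.py: the pattern tuple is reconstructed from the names mentioned in this thread and may not match the real list, the body of `_is_timeout_error()` is a guess at one way to keep 'connecttimeout' retryable even though it contains the substring 'timeout', and `classify` is a hypothetical helper that stands in for the branching inside `_send_with_retry()`.

```python
# Assumed subset of patterns, based only on names mentioned in this thread.
_RETRYABLE_ERROR_PATTERNS = (
    "connecterror",
    "connectionreset",
    "connectionrefused",
    "connecttimeout",  # safe: the connection was never established
    "network",         # kept because other platforms still rely on it
)


def _is_timeout_error(error):
    """True for delivery-ambiguous timeouts (read/write/generic)."""
    lowered = (error or "").lower()
    # A connect timeout fails before the request is sent, so it is
    # not ambiguous and must not be treated as a timeout here.
    if "connecttimeout" in lowered:
        return False
    return "timeout" in lowered or "timed out" in lowered


def _is_retryable_error(error):
    lowered = (error or "").lower()
    return any(pattern in lowered for pattern in _RETRYABLE_ERROR_PATTERNS)


def classify(error):
    """Hypothetical helper: decide what the outer loop does with an error."""
    if _is_timeout_error(error):
        # Early return: no retry and no plain-text fallback, because
        # either one could deliver the message a second time.
        return "fail"
    if _is_retryable_error(error):
        return "retry"
    # Anything else is treated as a formatting problem.
    return "fallback"


for message in (
    "Timed out",
    "httpx.ReadTimeout",
    "httpx.ConnectTimeout",
    "ConnectionResetError: peer reset",
    "Can't parse entities",
):
    print(f"{message!r} -> {classify(message)}")
```

Checking for timeouts before the retryable patterns matters in this sketch: a message such as "NetworkError: Timed out" contains 'network', so testing the retry patterns first would put it back on the retry path.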
@teknium1 teknium1 closed this in 85cefc7 Apr 5, 2026
@teknium1

teknium1 commented Apr 5, 2026

Contributor

Merged via PR #5153. Your analysis of the two-layer retry problem (inner 3× × outer 3× = up to 9 copies) was spot-on and informed the fix. The inner-loop TimedOut carve-out follows the same isinstance pattern you proposed. Thanks @tmdgusya!
