fix(signal): back off sendTyping spam for unreachable recipients by teknium1 · Pull Request #12118 · NousResearch/hermes-agent

teknium1 · 2026-04-18T11:04:52Z

Summary

Signal send_typing backs off after repeated NETWORK_FAILUREs so an unreachable recipient stops producing a WARNING log every 2 seconds for as long as the agent is busy.

Problem

base.py::_keep_typing refreshes the typing indicator every ~2s while the agent processes a turn. When signal-cli returns NETWORK_FAILURE (recipient offline, unroutable, group membership lost), the unmitigated _rpc path logs WARNING on every single refresh. A user report showed 1048 warnings in 41 minutes for one offline contact, with a matching volume of pointless RPC traffic to signal-cli.

Changes

| File | What |
| --- | --- |
| `gateway/platforms/signal.py` | `_rpc()` takes `log_failures: bool = True`; `send_typing()` tracks consecutive failures per chat and short-circuits the RPC during an exponential cooldown (16s → 32s → 60s cap) after 3 failures; first failure still logs WARNING, subsequent ones log DEBUG; success resets counters; `_stop_typing_indicator()` clears the backoff state |
| `tests/gateway/test_signal.py` | `TestSignalTypingBackoff`: 5 tests covering log-level demotion, 3-failure cooldown engagement, per-chat isolation, success reset, stop-typing cleanup |
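
To make the per-chat state described in the table concrete, here is a minimal sketch of a backoff tracker. It is not the code from `gateway/platforms/signal.py`: the `TypingBackoff` class, its method names, and the `cooldown_for` formula are hypothetical, chosen only to match the stated behaviour (three failures, then 16s, 32s, 60s cap).

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict

FAILURE_THRESHOLD = 3   # cooldown engages after this many consecutive failures
BASE_COOLDOWN_S = 16    # first cooldown stated in the PR
COOLDOWN_CAP_S = 60     # cap stated in the PR


def cooldown_for(failures: int) -> int:
    """Assumed schedule: 16s after the 3rd failure, doubling, capped at 60s."""
    if failures < FAILURE_THRESHOLD:
        return 0
    # Clamp the exponent so very long failure streaks stay cheap to compute.
    doublings = min(failures - FAILURE_THRESHOLD, 4)
    return min(COOLDOWN_CAP_S, BASE_COOLDOWN_S * 2 ** doublings)


@dataclass
class _ChatState:
    failures: int = 0
    cooldown_until: float = 0.0


class TypingBackoff:
    """Hypothetical helper: tracks consecutive typing failures per chat."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._chats: Dict[str, _ChatState] = {}

    def should_send(self, chat_id: str) -> bool:
        """False while the chat is inside its cooldown window."""
        state = self._chats.get(chat_id)
        return state is None or self._clock() >= state.cooldown_until

    def record_failure(self, chat_id: str) -> str:
        """Count a failure and return the log level name to use for it."""
        state = self._chats.setdefault(chat_id, _ChatState())
        state.failures += 1
        state.cooldown_until = self._clock() + cooldown_for(state.failures)
        return "WARNING" if state.failures == 1 else "DEBUG"

    def record_success(self, chat_id: str) -> None:
        """A successful sendTyping resets the counters for that chat."""
        self._chats.pop(chat_id, None)

    def clear(self, chat_id: str) -> None:
        """Mirrors the stop-typing cleanup: the next turn starts fresh."""
        self._chats.pop(chat_id, None)
```

Keeping the state keyed by chat means one unreachable recipient does not suppress typing indicators for any other conversation, which is the isolation property the new tests check.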

Validation

| Metric | Before | After |
| --- | --- | --- |
| RPCs in 41-min offline window | 1230 | 45 (-96%) |
| WARNING log lines | 1048 | 1 |
| DEBUG log lines | 0 | 44 |
| `tests/gateway/test_signal.py` | 57 passed | 62 passed |

E2E simulation replays base.py::_keep_typing calling send_typing every 2s for the reported 41-minute duration against a stub _rpc that returns the exact NETWORK_FAILURE shape from the user's log.
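
The replay can be approximated with a simulated clock. The sketch below is not the simulation used for this PR: `simulate` and `cooldown_for` are hypothetical names, and the cooldown formula is an assumption that fits the stated 16s, 32s, 60s schedule. The stub always fails, standing in for a `_rpc` that returns NETWORK_FAILURE on every call.

```python
REFRESH_INTERVAL_S = 2   # _keep_typing refresh period from the PR description
WINDOW_S = 41 * 60       # the reported 41-minute offline window
FAILURE_THRESHOLD = 3


def cooldown_for(failures: int) -> int:
    """Assumed schedule: 16s after the 3rd failure, doubling, capped at 60s."""
    if failures < FAILURE_THRESHOLD:
        return 0
    doublings = min(failures - FAILURE_THRESHOLD, 4)
    return min(60, 16 * 2 ** doublings)


def simulate(backoff: bool = True) -> dict:
    """Replay one refresh tick every 2s against an always-failing stub."""
    counts = {"rpc": 0, "warning": 0, "debug": 0, "skipped": 0}
    failures = 0
    cooldown_until = 0
    for now in range(0, WINDOW_S, REFRESH_INTERVAL_S):
        if backoff and now < cooldown_until:
            counts["skipped"] += 1   # short-circuit: no RPC, no log line
            continue
        counts["rpc"] += 1           # stub RPC: always NETWORK_FAILURE
        failures += 1
        if backoff:
            # First failure stays visible; later ones are demoted.
            counts["warning" if failures == 1 else "debug"] += 1
            cooldown_until = now + cooldown_for(failures)
        else:
            counts["warning"] += 1   # unmitigated path: WARNING every tick
    return counts


if __name__ == "__main__":
    print(simulate(backoff=False))
    print(simulate(backoff=True))
```

Under these assumptions the replay issues 1230 RPCs without backoff and 45 with it (1 WARNING plus 44 DEBUG lines), which lines up with the RPC and post-fix log figures in the table. The 1048 WARNING count in the Before column comes from the user's log rather than from this replay, so the sketch does not reproduce it.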

Notes

Salvages the _rpc(log_failures=...) kwarg idea from #12056 (credits @kshitijk4poor). The broader restructure in that PR — a second nested per-chat loop inside send_typing interacting with base.py's _keep_typing via asyncio.Task cleanup — is avoided here in favour of stateful backoff that preserves the existing _keep_typing architecture. Closing #12056 in favour of this narrower fix; the session_search serialization half of that PR is unrelated to the reported incident (logs show aux timeouts, not 429s) and isn't included here.
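
For readers unfamiliar with the `log_failures` kwarg, a rough, synchronous sketch of the idea follows. The real `_rpc()` is not shown in this PR page, so the `rpc` function, the injected `transport` callable, and the JSON-RPC-style `error`/`result` response shape are all assumptions made for illustration.

```python
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger("signal_rpc_sketch")

# Hypothetical transport: takes (method, params) and returns a response dict.
Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def rpc(
    transport: Transport,
    method: str,
    params: Dict[str, Any],
    *,
    log_failures: bool = True,
) -> Optional[Dict[str, Any]]:
    """Call the transport; log failures at WARNING, or DEBUG when demoted."""
    level = logging.WARNING if log_failures else logging.DEBUG
    try:
        response = transport(method, params)
    except Exception as exc:  # transport-level failure
        logger.log(level, "Signal RPC %s raised: %s", method, exc)
        return None
    if "error" in response:
        logger.log(level, "Signal RPC %s failed: %s", method, response["error"])
        return None
    return response.get("result", {})
```

With a default of `True`, existing callers such as send and receive keep their WARNING-level logging, and only a caller that expects repeated failures needs to opt out.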

base.py's _keep_typing refresh loop calls send_typing every ~2s while the agent is processing. If signal-cli returns NETWORK_FAILURE for the recipient (offline, unroutable, group membership lost), the unmitigated path was a WARNING log every 2 seconds for as long as the agent stayed busy. A user report showed 1048 warnings in 41 minutes for one offline contact, plus the matching volume of pointless RPC traffic to signal-cli.

- _rpc() accepts log_failures=False so callers can route repeated expected failures (typing) to DEBUG while keeping send/receive at WARNING.
- send_typing() tracks consecutive failures per chat. First failure still logs WARNING so transport issues remain visible; subsequent failures log at DEBUG. After three consecutive failures we skip the RPC during an exponential cooldown (16s, 32s, 60s cap) so we stop hammering signal-cli for a recipient it can't deliver to. A successful sendTyping resets the counters.
- _stop_typing_indicator() clears the backoff state so the next agent turn starts fresh.

E2E simulation against the reported 41-minute window: RPCs drop from 1230 to 45 (-96%), log lines from 1048 WARNINGs to 1 WARNING + 44 DEBUGs.

Credits kshitijk4poor (#12056) for the _rpc log_failures kwarg idea; the broader restructure in that PR (nested per-chat loop inside send_typing) is avoided here in favour of stateful backoff that preserves base.py's existing _keep_typing architecture.

github-actions · 2026-04-18T11:05:06Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Install hook files modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

…#12113, NousResearch#12116, NousResearch#12118, NousResearch#12123)

- fix(signal): back off sendTyping spam for unreachable recipients
- docs(terminal): warn against stacking watch_patterns + notify_on_complete
- feat(steer): /steer <prompt> injects mid-run note after next tool call
- docs(browser): improve /browser connect setup guidance

Resolved conflicts:

- run_agent.py: keep budget pressure injection + tool-repeat hint + /steer

teknium1 merged commit 9527707 into main Apr 18, 2026
4 of 5 checks passed

teknium1 deleted the hermes/hermes-1a3c1633 branch April 18, 2026 11:13

teknium1 mentioned this pull request Apr 18, 2026

Fix paperclip recall fan-out and Signal typing retry spam #12056

Closed
