fix(gateway): honor group-chat silence in empty-response retry loop (#13248) #13252
WaseemTheDream wants to merge 6 commits into
3fc721c to 16c9e2d
Pushed two follow-ups (
1. WhatsApp adapter wasn't wired (
2. Streaming + interim drafts bypass
Fix is to gate both
After both commits, the live install passes the original repro:
Happy to split commit 2 out into a separate PR if maintainers prefer the streaming-vs-silence interaction tracked independently from the original retry-loop fix, but practically they share the same root signal (
When a message arrives in a Slack group channel without an @mention (thread continuity, existing session, etc.), the model may legitimately choose to stay silent per its group-chat rules. The agent's retry loop previously treated empty/thinking-only responses as generation failures: it nudged the model 2x for thinking-only responses and 3x for empty responses, then switched to a fallback provider, fighting the intended silence and ultimately posting a '(empty)' warning to the channel.

This adds a 'silence_allowed' signal plumbed from the Slack adapter (true when not DM and not @mentioned) through the gateway into the AIAgent. When set, all three retry branches short-circuit and the gateway suppresses the '(empty)' sentinel so nothing is sent.

DMs and @mentions still get full retry semantics: silence is never valid when the bot was explicitly addressed.

Fixes NousResearch#13248
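The retry short-circuit described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the code in run_agent.py: `run_turn`, `TurnResult`, and `MAX_EMPTY_RETRIES` are hypothetical names, and the separate thinking-only and fallback-provider branches are collapsed into a single empty-response budget.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: mirrors the "3x for empty" budget named in the commit message.
MAX_EMPTY_RETRIES = 3


@dataclass
class TurnResult:
    text: Optional[str]  # None means the turn ended silently
    attempts: int        # number of model calls made


def run_turn(generate: Callable[[], str], silence_allowed: bool = False) -> TurnResult:
    """Call `generate` until it yields visible text or the retry budget is spent."""
    attempts = 0
    while True:
        attempts += 1
        text = generate().strip()
        if text:
            return TurnResult(text=text, attempts=attempts)
        # Silence is a valid outcome here: stop on the first empty response.
        if silence_allowed:
            return TurnResult(text=None, attempts=attempts)
        # Addressed turn: retry until the budget is exhausted, then give up.
        if attempts > MAX_EMPTY_RETRIES:
            return TurnResult(text="(empty)", attempts=attempts)
```

With a model that always returns an empty string, a silence-valid turn ends after one call, while an addressed turn makes the initial call plus three retries before giving up.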
…ence_allowed turns

Follow-up to prior commit. The initial fix only gated the empty/thinking-only retry branches, but two other paths still leaked messages into non-addressed group channels:

1. _status_callback_sync: posts _emit_status() warnings directly to the channel (e.g. 'Max retries (3) for invalid responses — trying fallback...', 'Empty/malformed response — switching to fallback...', compression notices, rate-limit warnings). Now no-ops when silence_allowed is True; warnings continue to log internally.
2. The failed-agent error surface: when the agent returns failed=True with final_response=None, the gateway was wrapping 'error' into a user-facing 'The request failed: ...' message. Now returns an empty response in silence_allowed turns, matching the silence contract.

Together with the prior commit, these close off every path by which a non-@mention group message could produce a visible reply.

Refs NousResearch#13248
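A minimal sketch of the first path, assuming the status callback is a closure over a channel-send function. `make_status_callback` and `send` are hypothetical names used for illustration; the real `_status_callback_sync` may be structured differently.

```python
import logging
from typing import Callable

logger = logging.getLogger("gateway.status")


def make_status_callback(
    send: Callable[[str], None], silence_allowed: bool
) -> Callable[[str], None]:
    """Build a callback that always logs but only posts when silence is not allowed."""

    def status_callback(message: str) -> None:
        # The warning is still recorded internally in every case.
        logger.warning("agent status: %s", message)
        if silence_allowed:
            return  # no-op on the channel in silence-valid turns
        send(message)

    return status_callback
```

The flag is captured once per turn, so every status emitted during that turn follows the same rule.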
…-kaito)

Two leak paths converted raw provider HTTP response bodies into chat messages that bypassed the silence_allowed guard in group chats.

1. run.py ~L8942: final_response = "⚠️ {result[error]}" on agent failure flowed through the normal send path. Now returns an empty final_response and propagates the error via the error field.
2. run.py ~L4004: the user-facing fallback leaked 300 chars of raw error. New _sanitize_agent_error() extracts HTTP status + message only, strips URLs and request_ids, and hard-caps at 160 chars.

Real-world symptom: an Anthropic out-of-usage HTTP 400 dumped its JSON payload plus request_ids into a group chat with a business contact. The silence contract was honored for status telemetry but not for the failure-path final_response.

8-case sanitizer test: all outputs <=160 chars, no req_ids, no URLs. Gateway test suite regressions: 0 (106 failed, was 107).
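The sanitizer's stated contract (status + message only, no URLs, no request_ids, at most 160 chars) could look something like the sketch below. This is not the PR's `_sanitize_agent_error()`: the regular expressions, including the assumed request-id shapes, are illustrative heuristics.

```python
import re

MAX_ERROR_CHARS = 160  # hard cap named in the commit message

URL_RE = re.compile(r"https?://\S+")
# Assumption: request ids look like "req_abc123" or "request_id: abc-123".
REQ_ID_RE = re.compile(
    r'\breq_[A-Za-z0-9]+\b|"?request[_-]?id"?\s*[:=]\s*"?[\w-]+"?',
    re.IGNORECASE,
)
STATUS_RE = re.compile(r"\b(?:HTTP\s*)?([45]\d{2})\b")
MESSAGE_RE = re.compile(r'"message"\s*:\s*"([^"]*)"')


def sanitize_agent_error(raw: str) -> str:
    """Reduce a raw provider error to a short 'HTTP <status>: <message>' string."""
    status = STATUS_RE.search(raw)
    message = MESSAGE_RE.search(raw)
    parts = []
    if status:
        parts.append(f"HTTP {status.group(1)}")
    # Fall back to the raw text when no JSON "message" field is found.
    parts.append(message.group(1) if message else raw)
    text = ": ".join(parts)
    text = URL_RE.sub("", text)
    text = REQ_ID_RE.sub("", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text[:MAX_ERROR_CHARS]
```

Truncating last means the cap holds even when nothing else in the input matches.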
Mirrors the cron scheduler's SILENT_MARKER handling so messaging surfaces (Slack, WhatsApp, Telegram, Signal, Discord) honor the same silence contract. When the model emits the literal '[SILENT]' token as its entire response on a non-@mention group message, the gateway now suppresses outbound delivery entirely instead of posting the token verbatim. Pairs with cd5956df (silence_allowed flag) and 3fc721cf (status-telemetry suppression) to form the complete silence-contract fix. Refs NousResearch#13248.
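As a rough illustration of the exact-token rule, the check below suppresses delivery only when the whole response is the marker and the turn allows silence. `resolve_outbound` is a hypothetical helper, and stripping surrounding whitespace before comparing is an assumption rather than something the commit message states.

```python
from typing import Optional

SILENT_MARKER = "[SILENT]"


def resolve_outbound(final_response: Optional[str], silence_allowed: bool) -> Optional[str]:
    """Return the text to deliver, or None when nothing should be sent."""
    text = (final_response or "").strip()
    # Only an exact match on the whole response counts as a silence request.
    if silence_allowed and text == SILENT_MARKER:
        return None
    return final_response
```

A response that merely mentions the token inside a longer sentence is delivered unchanged, as is the bare token in a DM or @mention turn.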
The Slack adapter sets silence_allowed=(not is_dm) and (not is_mentioned) on the constructed MessageEvent, but the WhatsApp adapter's _build_message_event didn't, so even after this PR's plumbing, non-@mention WhatsApp group messages still ran the empty/thinking-only retry loop and surfaced (empty) warnings, sanitized errors, or the literal [SILENT] sentinel.

Mirror the Slack logic on WhatsApp using its existing addressed-detection helpers (slash command, reply-to-bot, @mention, name-pattern match).

Validated end-to-end on a live install: a non-@mention message in a multi-person WhatsApp group now produces no visible response, while DMs and direct @mentions retain full retry semantics.

Refs NousResearch#13248
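The flag derivation can be sketched as a pure function over the four addressed signals listed above. `AddressedSignals` and `compute_silence_allowed` are hypothetical names; the adapter's actual helpers are not shown in this PR text.

```python
from dataclasses import dataclass


@dataclass
class AddressedSignals:
    """Hypothetical bundle of addressed-detection results for one message."""
    is_dm: bool = False
    is_slash_command: bool = False
    is_reply_to_bot: bool = False
    is_mentioned: bool = False
    matches_name_pattern: bool = False


def compute_silence_allowed(sig: AddressedSignals) -> bool:
    """Silence is valid only in a group chat where the bot was not addressed."""
    addressed = (
        sig.is_slash_command
        or sig.is_reply_to_bot
        or sig.is_mentioned
        or sig.matches_name_pattern
    )
    return (not sig.is_dm) and (not addressed)
```

Any single addressed signal is enough to restore full retry semantics, which matches the rule that silence is never valid when the bot was explicitly addressed.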
The streaming consumer creates a placeholder via adapter.send() and edits it during generation via adapter.edit_message(). Both paths bypass the post-agent [SILENT] suppression in gateway/platforms/base.py, so any draft text, and notably the literal "[SILENT]" sentinel when it's the model's final output, leaks into the channel before suppression runs.

Real-world symptom on WhatsApp (with display.streaming: true): a non-@mention group message produced visible mid-deliberation drafts as message edits, with the final "[SILENT]" token landing as plain text in the chat.

Gating both _want_stream_deltas and _want_interim_messages on `not silence_allowed` ensures the post-agent dispatch is the single delivery path in silence-valid turns, where base.py's [SILENT] handler can suppress cleanly. DMs and @mentions stream as before.

Refs NousResearch#13248
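The gating can be pictured as two booleans computed once per turn. In this sketch `DeliveryPlan` and `plan_delivery` are hypothetical; only the two flag names and the `not silence_allowed` condition come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class DeliveryPlan:
    want_stream_deltas: bool
    want_interim_messages: bool


def plan_delivery(
    streaming_enabled: bool, interim_enabled: bool, silence_allowed: bool
) -> DeliveryPlan:
    """Disable both incremental delivery paths whenever silence is a valid outcome."""
    return DeliveryPlan(
        want_stream_deltas=streaming_enabled and not silence_allowed,
        want_interim_messages=interim_enabled and not silence_allowed,
    )
```

With both flags off, nothing reaches the channel until the final response has passed through the post-agent suppression check.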
6a08730 to 0d2e846
Rebased onto current
All 6 commits replayed; Tagging @teknium1, based on
Merged the narrow exact-token silence behavior via PR #46080, using #37940 as the implementation base. Your broader
Fixes #13248
On `claude-opus-4-7` in Slack group threads, a genuine discussion message that is not an @mention causes the model to reason into "don't reply" (per its group-chat rules) and emit no visible text. The gateway retry loop treated this as a generation failure: 2× thinking-only prefill-nudge → 3× empty-response retry → fallback-provider switch, ultimately posting a `(empty)` warning to the channel that fought the intended silence.

Fix
Plumb a `silence_allowed` signal from adapter → gateway → agent:

- `MessageEvent`: new `silence_allowed: bool = False` field.
- Slack adapter: sets `silence_allowed=(not is_dm) and (not is_mentioned)` on the constructed `MessageEvent`. Messages flowing through thread-continuity / existing-session gates without a direct address get the flag.
- `GatewayRunner._run_agent`: new `silence_allowed` kwarg, forwarded to `AIAgent.run_conversation`.
- `AIAgent.run_conversation`: stores `self._silence_allowed`.
- Retry branches (`run_agent.py` ~L10894, L10930, L10949): all three branches now also require `not self._silence_allowed`. When silence is valid, the model's first empty/thinking-only response terminates the turn instead of triggering retries.
- Gateway: when `silence_allowed` is set and the final response is `(empty)`, suppress the "

DMs and direct @mentions get unchanged retry semantics: silence is never valid when the bot was explicitly addressed.
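The plumbing path can be sketched end to end with simplified stand-ins. The class and method names below follow the description, but the bodies, the `text` field, and the `handle` entry point are assumptions made for illustration; the real classes carry far more state.

```python
from dataclasses import dataclass


@dataclass
class MessageEvent:
    text: str
    silence_allowed: bool = False  # default keeps existing callers unchanged


class AIAgent:
    def run_conversation(self, text: str, silence_allowed: bool = False) -> None:
        # Stored so the retry branches can consult it later in the turn.
        self._silence_allowed = silence_allowed


class GatewayRunner:
    def __init__(self, agent: AIAgent) -> None:
        self.agent = agent

    def _run_agent(self, event: MessageEvent, silence_allowed: bool = False) -> None:
        self.agent.run_conversation(event.text, silence_allowed=silence_allowed)

    def handle(self, event: MessageEvent) -> None:
        # Hypothetical entry point: forwards the adapter-set flag unchanged.
        self._run_agent(event, silence_allowed=event.silence_allowed)
```

Because every layer defaults the flag to False, adapters that never set it keep the original retry semantics.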
Tests
`tests/test_silence_allowed_retry_suppression.py` (8 tests, passing):

- `MessageEvent` dataclass accepts and defaults the flag correctly
- `run_conversation` and `_run_agent` signatures expose the kwarg with default False
- `silence_allowed=True`

Also verified `tests/test_empty_model_fallback.py` still passes (12/12).