Skip to content

fix(gateway): honor group-chat silence in empty-response retry loop (#13248)#13252

Closed
WaseemTheDream wants to merge 6 commits into
NousResearch:mainfrom
WaseemTheDream:fix/silence-allowed-empty-retry-13248
Closed

fix(gateway): honor group-chat silence in empty-response retry loop (#13248)#13252
WaseemTheDream wants to merge 6 commits into
NousResearch:mainfrom
WaseemTheDream:fix/silence-allowed-empty-retry-13248

Conversation

@WaseemTheDream

Copy link
Copy Markdown

Fixes #13248

On claude-opus-4-7 in Slack group threads, a genuine discussion message that is not an @mention causes the model to reason into "don't reply" (per its group-chat rules) and emit no visible text. The gateway retry loop treated this as a generation failure: 2× thinking-only prefill-nudge → 3× empty-response retry → fallback-provider switch, ultimately posting a (empty) warning to the channel that fought the intended silence.

Fix

Plumb a silence_allowed signal from adapter → gateway → agent:

  • MessageEvent — new silence_allowed: bool = False field.
  • Slack adapter — sets silence_allowed=(not is_dm) and (not is_mentioned) on the constructed MessageEvent. Messages flowing through thread-continuity / existing-session gates without a direct address get the flag.
  • GatewayRunner._run_agent — new silence_allowed kwarg, forwarded to AIAgent.run_conversation.
  • AIAgent.run_conversation — stores self._silence_allowed.
  • Retry gates (run_agent.py ~L10894, L10930, L10949) — all three branches now also require not self._silence_allowed. When silence is valid, the model's first empty/thinking-only response terminates the turn instead of triggering retries.
  • Gateway outbound — when silence_allowed is set and the final response is (empty), suppress the "⚠️ The model returned no response…" sentinel and send nothing. DMs and @mentions retain the warning.

DMs and direct @mentions get unchanged retry semantics — silence is never valid when the bot was explicitly addressed.

Tests

tests/test_silence_allowed_retry_suppression.py (8 tests, passing):

  • MessageEvent dataclass accepts and defaults the flag correctly
  • run_conversation and _run_agent signatures expose the kwarg with default False
  • The three retry gates suppress when silence_allowed=True
  • Addressed messages (flag=False) still take all retry branches

Also verified tests/test_empty_model_fallback.py still passes (12/12).

@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/gateway Gateway runner, session dispatch, delivery comp/agent Core agent loop, run_agent.py, prompt builder platform/slack Slack app adapter labels Apr 22, 2026
@WaseemTheDream WaseemTheDream force-pushed the fix/silence-allowed-empty-retry-13248 branch from 3fc721c to 16c9e2d Compare April 23, 2026 02:13
@WaseemTheDream

Copy link
Copy Markdown
Author

Pushed two follow-ups (a064f2b, 6a08730) after end-to-end validation on a live install surfaced two gaps that the prior 4 commits didn't fully close on the WhatsApp side:

1. WhatsApp adapter wasn't wired (a064f2b).
The Slack adapter sets silence_allowed=(not is_dm) and (not is_mentioned) on the constructed MessageEvent, but gateway/platforms/whatsapp.py:_build_message_event didn't set the flag at all — so non-@mention group messages on WhatsApp still ran the empty/thinking-only retry loop and surfaced (empty) warnings, sanitized errors, or the literal [SILENT] token as the final body. The new commit mirrors the Slack logic on WhatsApp using its existing addressed-detection helpers (slash command, reply-to-bot, _message_mentions_bot, _message_matches_mention_patterns). DMs always expect a reply and stay False.

2. Streaming + interim drafts bypass [SILENT] suppression (6a08730).
The base.py [SILENT] handler runs in the post-agent dispatch, but the streaming consumer creates a placeholder via adapter.send() and edits it during generation via adapter.edit_message() — both before suppression. With display.streaming: true, a non-@mention WhatsApp group message produced visible mid-deliberation drafts as message edits, with the final [SILENT] token landing as plain text. Reproducer: real chat, screenshotted in our local debug session — the agent edited its draft three times before settling on [SILENT], which then sat in the channel as a literal message.

Fix is to gate both _want_stream_deltas and _want_interim_messages on not silence_allowed in _run_agent, so silence-valid turns route through the single post-agent delivery path where base.py's suppression actually fires. DMs and @mentions stream as before.

After both commits, the live install passes the original repro:

Scenario Behaviour
Non-@mention group message Completely silent — no draft, no [SILENT], no warning
@mention in same group Replies normally, with streaming
DM Replies normally, with streaming

Happy to split commit 2 out into a separate PR if maintainers prefer the streaming-vs-silence interaction tracked independently from the original retry-loop fix — but practically they share the same root signal (silence_allowed) and one without the other still leaks into the channel.

WaseemTheDream and others added 6 commits May 4, 2026 12:23
When a message arrives in a Slack group channel without an @mention
(thread continuity, existing session, etc.), the model may legitimately
choose to stay silent per its group-chat rules. The agent's retry loop
previously treated empty/thinking-only responses as generation failures
and nudged the model 2x for thinking-only + 3x for empty + a fallback
provider switch, fighting the intended silence and ultimately posting
a '(empty)' warning to the channel.

This adds a 'silence_allowed' signal plumbed from the Slack adapter
(true when not DM and not @mentioned) through the gateway into the
AIAgent. When set, all three retry branches short-circuit and the
gateway suppresses the '(empty)' sentinel so nothing is sent.

DMs and @mentions still get full retry semantics — silence is never
valid when the bot was explicitly addressed.

Fixes NousResearch#13248
…ence_allowed turns

Follow-up to prior commit. The initial fix only gated the empty/
thinking-only retry branches, but two other paths still leaked
messages into non-addressed group channels:

1. _status_callback_sync — posts _emit_status() warnings directly to
   the channel (e.g. 'Max retries (3) for invalid responses — trying
   fallback...', 'Empty/malformed response — switching to fallback...',
   compression notices, rate-limit warnings). Now no-ops when
   silence_allowed is True; warnings continue to log internally.

2. The failed-agent error surface — when the agent returns failed=True
   with final_response=None, the gateway was wrapping 'error' into a
   user-facing 'The request failed: ...' message. Now returns an empty
   response in silence_allowed turns, matching the silence contract.

Together with the prior commit, these close off every path by which a
non-@mention group message could produce a visible reply.

Refs NousResearch#13248
…-kaito)

Two leak paths converted raw provider HTTP response bodies into chat
messages that bypassed the silence_allowed guard in group chats.

1. run.py ~L8942: final_response = "⚠️ {result[error]}" on agent failure
   flowed through the normal send path. Now return empty final_response
   and propagate via error field.

2. run.py ~L4004: user-facing fallback leaked 300 chars of raw error.
   New _sanitize_agent_error() extracts HTTP status + message only,
   strips URLs and request_ids, hard-caps at 160 chars.

Real-world symptom: Anthropic out-of-usage HTTP 400 dumped JSON payload
plus request_ids into a group chat with a business contact. Silence
contract was honored for status telemetry but not for the failure-path
final_response.

8-case sanitizer test: all outputs <=160 chars, no req_ids, no URLs.
Gateway test suite regressions: 0 (106 failed, was 107).
Mirrors the cron scheduler's SILENT_MARKER handling so messaging surfaces
(Slack, WhatsApp, Telegram, Signal, Discord) honor the same silence contract.

When the model emits the literal '[SILENT]' token as its entire response
on a non-@mention group message, the gateway now suppresses outbound
delivery entirely instead of posting the token verbatim.

Pairs with cd5956df (silence_allowed flag) and 3fc721cf (status-telemetry
suppression) to form the complete silence-contract fix.

Refs NousResearch#13248.
The Slack adapter sets silence_allowed=(not is_dm) and (not is_mentioned) on
the constructed MessageEvent, but the WhatsApp adapter's _build_message_event
didn't — so even after this PR's plumbing, non-@mention WhatsApp group messages
still ran the empty/thinking-only retry loop and surfaced (empty) warnings,
sanitized errors, or the literal [SILENT] sentinel. Mirror the Slack logic on
WhatsApp using its existing addressed-detection helpers (slash command,
reply-to-bot, @mention, name-pattern match).

Validated end-to-end on a live install: a non-@mention message in a
multi-person WhatsApp group now produces no visible response, while DMs and
direct @mentions retain full retry semantics.

Refs NousResearch#13248
The streaming consumer creates a placeholder via adapter.send() and edits it
during generation via adapter.edit_message(). Both paths bypass the post-agent
[SILENT] suppression in gateway/platforms/base.py, so any draft text — and
notably the literal "[SILENT]" sentinel when it's the model's final output —
leaks into the channel before suppression runs.

Real-world symptom on WhatsApp (with display.streaming: true): a non-@mention
group message produced visible mid-deliberation drafts as message edits, with
the final "[SILENT]" token landing as plain text in the chat.

Gating both _want_stream_deltas and _want_interim_messages on `not
silence_allowed` ensures the post-agent dispatch is the single delivery path
in silence-valid turns, where base.py's [SILENT] handler can suppress cleanly.
DMs and @mentions stream as before.

Refs NousResearch#13248
@WaseemTheDream WaseemTheDream force-pushed the fix/silence-allowed-empty-retry-13248 branch from 6a08730 to 0d2e846 Compare May 4, 2026 19:24
@WaseemTheDream

Copy link
Copy Markdown
Author

Rebased onto current main (was 22 commits behind). Two textual conflicts, both resolved cleanly:

  • gateway/platforms/slack.py — kept the upstream additions (reply_to_text=reply_to_text, auto_skill=_auto_skill) and this PR's silence_allowed=(not is_dm) and (not is_mentioned) on the MessageEvent constructor. Both are independent kwargs.
  • gateway/run.py — kept the upstream native-image-attachment pre-processing (_native_imgsbuild_native_content_parts_run_message) and added silence_allowed=silence_allowed to the agent.run_conversation(_run_message, …) call. Also kept upstream's new _gateway_runner_ref weakref alongside this PR's _sanitize_agent_error helper at module scope.

All 6 commits replayed; ast.parse clean on every touched file; force-pushed with --force-with-lease.

Tagging @teknium1 — based on git log you've authored/merged the bulk of recent gateway work (incl. last week's fix(gateway): commits to the same files), so this PR's rebase + the two follow-up commits would benefit from your eye when you have a chance. No rush.

@teknium1

Copy link
Copy Markdown
Contributor

Merged the narrow exact-token silence behavior via PR #46080, using #37940 as the implementation base. Your broader silence_allowed adapter/agent-loop plumbing is intentionally not included in this salvage because the accepted contract is exact-token final-response suppression only: preserve history, suppress delivery, do not reinterpret generic empty responses as silence. Thanks for the original report/context on #13248.

#46080

@teknium1 teknium1 closed this Jun 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder comp/gateway Gateway runner, session dispatch, delivery P1 High — major feature broken, no workaround platform/slack Slack app adapter type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Empty-response retry loop on claude-opus-4-7 in Slack group threads (non-@mention discussion messages)

3 participants