fix(clarify): remove awaiting_text filter from get_pending_for_session so text-intercept works on platforms without buttons #26008

Closed
Caixa-git wants to merge 1 commit into NousResearch:main from Caixa-git:fix/clarify-text-intercept-awaiting-text

Conversation

@Caixa-git

Problem

When the agent calls clarify with multiple choices on platforms that don't support native buttons (Discord, Slack, WhatsApp, etc.), the call hangs, posting ⏳ Still working... messages until the 10-minute timeout fires.

Root Cause

tools/clarify_gateway.py:get_pending_for_session() filters entries by awaiting_text=True. Choice-based clarifies are registered with awaiting_text=False (because the button path handles them on Telegram). On Discord, the base send_clarify renders choices as a numbered text list, but the user's reply ("1", "Option A") is never intercepted — the function returns None and the agent stays blocked on wait_for_response().
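
To make the root cause concrete, here is a minimal, self-contained sketch of the filtering behavior described above. It is not the code from tools/clarify_gateway.py: the `ClarifyEntry` dataclass, the `register` signature, and the session-key strings are hypothetical stand-ins, and the sketch assumes open-ended clarifies are registered with awaiting_text=True.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ClarifyEntry:
    """Hypothetical stand-in for one pending clarify record."""
    clarify_id: str
    session_key: str
    awaiting_text: bool


# Hypothetical registry. Dicts preserve insertion order, so the first
# match found while iterating is also the oldest pending entry.
_entries: Dict[str, ClarifyEntry] = {}


def register(clarify_id: str, session_key: str, awaiting_text: bool) -> None:
    """Record a pending clarify (signature is an assumption)."""
    _entries[clarify_id] = ClarifyEntry(clarify_id, session_key, awaiting_text)


def get_pending_for_session(session_key: str) -> Optional[ClarifyEntry]:
    """Pre-fix behavior: only entries flagged awaiting_text are visible."""
    for entry in _entries.values():
        if entry.session_key == session_key and entry.awaiting_text:
            return entry
    return None


# A choice-based clarify is registered with awaiting_text=False ...
register("choice-1", "discord:123", awaiting_text=False)
# ... so the lookup made when the user types "1" finds nothing.
choice_lookup = get_pending_for_session("discord:123")

# An open-ended clarify (assumed awaiting_text=True) is found as expected.
register("open-1", "discord:456", awaiting_text=True)
open_lookup = get_pending_for_session("discord:456")
```

In this model `choice_lookup` is None while `open_lookup` is the registered entry, which mirrors the hang: the typed reply has no entry to resolve.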

Fix

Removed the awaiting_text filter from get_pending_for_session(). It now returns the oldest pending clarify entry regardless of its text-capture mode.

Safety

  • Telegram (with buttons): Button callbacks resolve via resolve_gateway_clarify() directly, removing the entry from _entries before the text-intercept runs. No double-resolution.
  • Discord (text only): The text-intercept can now find and resolve choice-based clarifies. Same path as open-ended clarifies.
  • awaiting_text flag preserved: It is still used to track "Other" button state; it just no longer gates retrieval.

PoC

# Fresh Docker environment
docker run --rm ghcr.io/caixa-git/poc-clarify

# Or locally
cd /tmp/fork-hermes
python3 -m pytest tests/tools/test_clarify_gateway.py -v

PoC output confirms:

  • ✅ Bug reproduced: get_pending_for_session() returned None for choice-based clarify
  • ✅ Fix verified: entry found after fix
  • ✅ 41/41 clarify-related tests pass
  • ✅ Full flow (register → resolve → wait) works correctly

The get_pending_for_session() function filters entries by awaiting_text=True,
which means choice-based clarifies (awaiting_text=False) are invisible to the
gateway text-intercept. On platforms without native button support (Discord,
Slack, WhatsApp, etc.), the base send_clarify renders choices as a numbered
text list, and the user's reply ("1", "2", etc.) is never intercepted —
the clarify hangs until the 10-minute timeout.

Fix: remove the awaiting_text filter so get_pending_for_session returns
the oldest pending clarify entry regardless of its text-capture mode.
This is safe because:
- Button-based platforms (Telegram) resolve via callbacks first,
  removing the entry before text-intercept runs
- The text-intercept already handles both choice and free-text responses
- All existing confirm paths ("Other" button → mark_awaiting_text → resolve)
  still work correctly

PoC: https://github.com/Caixa-git/hermes-agent/tree/fix/clarify-text-intercept-awaiting-text
Docker: docker run --rm ghcr.io/caixa-git/poc-clarify

Fixes: #TBD
alt-glitch added labels on May 15, 2026: type/bug (Something isn't working), comp/gateway (Gateway runner, session dispatch, delivery), comp/tools (Tool registry, model_tools, toolsets), platform/discord (Discord bot adapter), P1 (High: major feature broken, no workaround)
@alt-glitch

Collaborator

Competes with #25584 and #25610 for the same underlying bug (#25567 / #26009).

Approach difference: This PR removes the awaiting_text filter entirely from get_pending_for_session(), while #25584 calls mark_awaiting_text(clarify_id) after sending choices as text. Both fix the symptom; this one is broader (returns any pending clarify regardless of mode).

@Caixa-git

Author

Thanks for the context! I see #25584 takes the narrower approach by calling mark_awaiting_text() after rendering choices as text. Do you have a preference between the two approaches? Happy to adjust mine accordingly, or if #25584 looks like the right direction I can close this one.

@alexzhu0

Contributor

Cross-checking against #25584 (@liuhao1024) since alt-glitch already linked them — adding mechanism-level comparison since they make different bets:

| | #26008 (this) | #25584 |
| --- | --- | --- |
| Site of change | tools/clarify_gateway.py::get_pending_for_session | gateway/platforms/base.py::send_clarify (default impl) |
| Approach | Reader: drop the awaiting_text filter so any pending entry is returned | Writer: call mark_awaiting_text(clarify_id) after rendering choices as text |
| What awaiting_text ends up meaning | "marker bit, not a gate": kept on the entry but no longer affects retrieval | "true source of truth": flipped at the moment text-capture actually starts |
| Behavior on Telegram (overrides send_clarify with inline buttons) | Unchanged: button callback still resolves before any text intercept fires | Unchanged: Telegram override never reaches the new mark_awaiting_text call |
| Risk surface | Wider: every caller of get_pending_for_session now sees pre-marked entries (only one caller today, but defines forward semantics) | Narrower: only the single send path that actually rendered choices as text |
| Race between register() and the user's first reply | Safer: entry already retrievable | Has a small window where choices were rendered but mark_awaiting_text hasn't been called yet (within the same await self.send() though, so usually unobservable) |

#25584 is the more mechanism-aligned fix (state flag flips at the boundary that actually changes mode), but it relies on platform overrides correctly NOT reaching the base path — which is the implicit contract today. #26008 makes the bug impossible at the read site at the cost of the bit losing one of its meanings.

Maintainer's call between "tighten the contract at the writer" vs "make the reader more permissive". Both fix the user-visible symptom; the test diff shows #26008 already updated test_button_choice_does_not_auto_await to match the new semantics, so the regression coverage is consistent.

pixu-bd pushed a commit to skb50bd/hermes-agent that referenced this pull request Jun 7, 2026
…ttons

Re-triggering CI / re-review. The PR body has been cleaned up to
follow the upstream PR template (see the inline re-review-request
comment on PR NousResearch#40045 for the proposed new body). No code change.

Fixes NousResearch#26009
Refs NousResearch#12573
Complements NousResearch#26008
@pixu-bd

pixu-bd commented Jun 7, 2026


Cross-link: a complementary adapter-side fix is in #41353

Heads-up: there's a related fix in #41353 (fix(discord): render clarify choices as Select menu, not truncated buttons) that touches the same code path from the adapter side. Worth a quick look before either lands to make sure they don't double-fix or conflict.

What #41353 does

Compatibility with #26008

The two PRs are complementary, not conflicting:

Order of merging:

Neither PR modifies the other's diff. No merge conflicts expected.

Suggested action

Either order is fine, but a maintainer looking at #26008 might want to coordinate with #41353 to merge them together. I'm fine with whichever lands first — just wanted to flag the overlap so the two don't get reviewed in isolation and miss the cross-platform implication.

Also: the related P1 bug is filed as #26009 ([Bug] clarify tool hangs indefinitely with 'Still working...' on Discord); I've posted a cross-link there too.

@teknium1

Contributor

This appears to be implemented on current main already, via the narrower writer-side fix from the competing clarify PR rather than by removing the reader-side awaiting_text filter.

Automated hermes-sweeper review evidence:

  • gateway/platforms/base.py:2504 imports mark_awaiting_text in the base send_clarify text-fallback branch, and gateway/platforms/base.py:2505 calls mark_awaiting_text(clarify_id) when choices are rendered as a numbered text list.
  • gateway/run.py:6553 checks get_pending_for_session(...) for clarify text replies, and gateway/run.py:6563 resolves the user's text through resolve_gateway_clarify(...) instead of routing it as a new agent turn.
  • tests/tools/test_clarify_gateway.py:207 covers the landed behavior: multi-choice clarifies remain non-text-awaiting initially, then the text fallback flips them so get_pending_for_session can find the entry.
  • The implementing commit on main is f26098e22f17025b9d57b176898c7d60d5b5ce8b (fix(gateway): enable text-intercept for multi-choice clarify fallback (#25567)), contained in v2026.5.16 and later tags.
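
The routing decision described for gateway/run.py can be pictured with a small model. This is a hedged sketch only: `handle_incoming_text`, the in-memory `pending`, `resolved`, and `agent_turns` structures, and the function signatures are hypothetical, and the actual gateway code is not reproduced here.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory state standing in for the registry and agent queue.
pending: Dict[str, str] = {"discord:123": "clarify-1"}  # session_key -> clarify_id
resolved: List[Tuple[str, str]] = []
agent_turns: List[Tuple[str, str]] = []


def get_pending_for_session(session_key: str) -> Optional[str]:
    """Return the pending clarify id for this session, if any."""
    return pending.get(session_key)


def resolve_gateway_clarify(clarify_id: str, response: str) -> None:
    """Record the answer and drop the entry so it cannot be resolved twice."""
    resolved.append((clarify_id, response))
    for key, value in list(pending.items()):
        if value == clarify_id:
            del pending[key]


def handle_incoming_text(session_key: str, text: str) -> str:
    """Intercept the reply for a pending clarify, else start a new agent turn."""
    clarify_id = get_pending_for_session(session_key)
    if clarify_id is not None:
        resolve_gateway_clarify(clarify_id, text)
        return "clarify"
    agent_turns.append((session_key, text))
    return "agent_turn"


first = handle_incoming_text("discord:123", "1")       # answers the clarify
second = handle_incoming_text("discord:123", "hello")  # ordinary message
```

The first message is consumed as the clarify answer; once the entry is gone, the next message in the same session is routed as a normal turn.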

Thanks for the clear root-cause writeup and the safety notes here. The mainline fix preserves the awaiting_text gate while enabling it at the point where the fallback actually starts accepting typed replies.

teknium1 closed this on Jun 12, 2026
teknium1 added the sweeper:implemented-on-main (Sweeper: behavior already present on current main) label on Jun 12, 2026
