
chore(whatsapp-mcp): remove HITL regex receive-side (superseded by waba-mcp payload routing)#1

Merged
Wizarck merged 394 commits into main from chore/remove-wa-hitl-regex-receive
May 5, 2026

Conversation

@Wizarck

@Wizarck Wizarck commented May 5, 2026

Owner

WhatsApp HITL approvals now flow via Meta interactive buttons → waba-mcp payload_routes → aiops /webhook/hitl. The free-form regex parser here is dead AND a security smell (any text with 'aprobar' stem could fire a resolution). Telegram inline-keyboard path is untouched. Diff: 605 deletions, 376 insertions (net -229).

helix4u and others added 30 commits April 19, 2026 00:28
Follow-up for the helix4u easy-fix salvage batch:
- route remaining context-engine quiet-mode output through
  _should_emit_quiet_tool_messages() so non-CLI/library callers stay
  silent consistently
- drop the extra senderAliases computation from WhatsApp allowlist-drop
  logging and remove the now-unused import

This keeps the batch scoped to the intended fixes while avoiding
leaked quiet-mode output and unnecessary duplicate work in the bridge.
Extends the existing cron script hook with a wake gate ported from
nanoclaw NousResearch#1232. When a cron job's pre-check Python script (already
sandboxed to HERMES_HOME/scripts/) writes a JSON line like
```json
{"wakeAgent": false}
```
on its last stdout line, `run_job()` returns the SILENT marker and
skips the agent entirely — no LLM call, no delivery, no tokens spent.
Useful for frequent polls (every 1-5 min) that only need to wake the
agent when something has genuinely changed.

Any other script output (non-JSON, missing key, non-dict, `wakeAgent: true`,
truthy/falsy non-False values) behaves as before: stdout is injected
as context and the agent runs normally. Strict `False` is required
to skip — avoids accidental gating from arbitrary JSON.
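
A minimal sketch of the gate check described above. The real `_parse_wake_gate` signature and return type are not shown in this message, so `parse_wake_gate` and its boolean "should the agent wake" result are assumptions made for illustration.

```python
import json


def parse_wake_gate(script_output: str) -> bool:
    """Return False only when the last non-blank stdout line is a JSON
    object whose "wakeAgent" value is strictly false; True otherwise."""
    lines = [ln for ln in script_output.splitlines() if ln.strip()]
    if not lines:
        return True
    try:
        payload = json.loads(lines[-1])
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return True
    if not isinstance(payload, dict):
        return True
    # Identity check: 0, "", None and a missing key must not gate the agent.
    return payload.get("wakeAgent") is not False
```

With this shape, `parse_wake_gate('polling...\n{"wakeAgent": false}\n')` skips the agent, while `{"wakeAgent": 0}` or a gate line followed by more output lets the run proceed.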

Refactor:
- New pure helper `_parse_wake_gate(script_output)` in cron/scheduler.py
- `_build_job_prompt` accepts optional `prerun_script` tuple so the
  script runs exactly once per job (run_job runs it for the gate check,
  reuses the output for prompt injection)
- `run_job` short-circuits with SILENT_MARKER when gate fires

Script failures (success=False) still cannot trigger the gate — the
failure is reported as context to the agent as before.

This replaces the approach in closed PR NousResearch#3837, which inlined bash
scripts via tempfile and lost the path-traversal/scripts-dir sandbox
that main's impl has. The wake-gate idea (the one net-new capability)
is ported on top of the existing sandboxed Python-script model.

Tests:
- 11 pure unit tests for _parse_wake_gate (empty, whitespace, non-JSON,
  non-dict JSON, missing key, truthy/falsy non-False, multi-line,
  trailing blanks, non-last-line JSON)
- 5 integration tests for run_job wake-gate (skip returns SILENT,
  wake-true passes through, script-runs-only-once, script failure
  doesn't gate, no-script regression)
- Full tests/cron/ suite: 194/194 pass
… streamed text (NousResearch#8124)

When a streaming edit fails mid-stream (flood control, transport error)
and a tool boundary arrives before the fallback threshold is reached,
the pre-boundary tail in `_accumulated` was silently discarded by
`_reset_segment_state`. The user saw a frozen partial message and
missing words on the other side of the tool call.

Flush the undelivered tail as a continuation message before the reset,
computed relative to the last successfully-delivered prefix so we don't
duplicate content the user already saw.
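
The "relative to the last successfully-delivered prefix" computation can be sketched roughly as follows. `undelivered_tail` is a hypothetical helper name, and the fallback for a non-matching prefix is an assumption rather than the adapter's confirmed behaviour.

```python
def undelivered_tail(accumulated: str, delivered_prefix: str) -> str:
    """Return the part of `accumulated` the user has not seen yet."""
    if accumulated.startswith(delivered_prefix):
        return accumulated[len(delivered_prefix):]
    # Assumption: if the delivered text is no longer a prefix of the
    # buffer, resend the whole buffer rather than guess at an overlap.
    return accumulated
```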
…esearch#12471)

During gateway shutdown, a message arriving while
cancel_background_tasks is mid-await (inside asyncio.gather) spawns
a fresh _process_message_background task via handle_message and adds
it to self._background_tasks.  The original implementation's
_background_tasks.clear() at the end of cancel_background_tasks
dropped the reference; the task ran untracked against a disconnecting
adapter, logged send-failures, and lingered until it completed on
its own.

Fix: wrap the cancel+gather in a bounded loop (MAX_DRAIN_ROUNDS=5).
If new tasks appeared during the gather, cancel them in the next
round.  The .clear() at the end is preserved as a safety net for
any task that appeared after MAX_DRAIN_ROUNDS — but in practice the
drain stabilizes in 1-2 rounds.
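
A simplified sketch of the bounded drain loop, assuming the tracked tasks live in a plain `set`. Only `MAX_DRAIN_ROUNDS` comes from this message; the standalone function signature and the `demo` helper are illustrative.

```python
import asyncio

MAX_DRAIN_ROUNDS = 5  # value taken from the commit message


async def cancel_background_tasks(tasks: set) -> int:
    """Cancel every tracked task; return how many rounds were needed."""
    rounds = 0
    for _ in range(MAX_DRAIN_ROUNDS):
        pending = [t for t in tasks if not t.done()]
        if not pending:
            break
        rounds += 1
        for task in pending:
            task.cancel()
        # New tasks may be added to `tasks` while we wait here.
        await asyncio.gather(*pending, return_exceptions=True)
        tasks.difference_update(pending)
    tasks.clear()  # safety net for anything that outlived the loop
    return rounds


async def demo() -> tuple:
    tasks: set = set()
    late: list = []

    async def first() -> None:
        try:
            await asyncio.sleep(3600)
        except asyncio.CancelledError:
            # Simulate a message arriving in the middle of shutdown.
            m2 = asyncio.create_task(asyncio.sleep(3600))
            tasks.add(m2)
            late.append(m2)
            raise

    tasks.add(asyncio.create_task(first()))
    await asyncio.sleep(0)  # let the first task start running
    rounds = await cancel_background_tasks(tasks)
    return rounds, late[0].cancelled()
```

In the demo the late arrival is picked up and cancelled in the second round instead of being dropped by the final `clear()`.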

Tests: tests/gateway/test_cancel_background_drain.py — 3 cases.
- test_cancel_background_tasks_drains_late_arrivals: spawn M1, start
  cancel, inject M2 during M1's shielded cleanup, verify M2 is
  cancelled.
- test_cancel_background_tasks_handles_no_tasks: no-op path still
  terminates cleanly.
- test_cancel_background_tasks_bounded_rounds: baseline — single
  task cancels in one round, loop terminates.

Regression-guard validated: against the unpatched implementation,
the late-arrival test fails with exactly the expected message
('task leaked').  With the fix it passes.

Blast radius is shutdown-only; the audit classified this as MED.
Shipping because the fix is small and the hygiene is worth it.

While investigating the audit's other MEDs (busy-handler double-ack,
Discord ExecApprovalView double-resolve, UpdatePromptView
double-resolve), I verified all three were false positives — the
check-and-set patterns have no await between them, so they're
atomic on single-threaded asyncio.  No fix needed for those.
…inuation (NousResearch#7183)

When _send_fallback_final() is called with nothing new to deliver
(the visible partial already matches final_text), the last edit may
still show the cursor character because fallback mode was entered
after a failed edit.  Before this fix the early-return path left
_already_sent = True without attempting to strip the cursor, so the
message stayed frozen with a visible ▉ permanently.

Adds a best-effort edit inside the empty-continuation branch to clean
the cursor off the last-sent text.  Harmless when fallback mode
wasn't actually armed or when the cursor isn't present.  If the strip
edit itself fails (flood still active), we return without crashing
and without corrupting _last_sent_text.

Adapted from PR NousResearch#7429 onto current main — the surrounding fallback
block grew the NousResearch#10807 stale-prefix handling since NousResearch#7429 was written,
so the cursor strip lives in the new else-branch where we still
return early.

3 unit tests covering: cursor stripped on empty continuation, no edit
attempted when cursor is not configured, cursor-strip edit failure
handled without crash.

Originally proposed as PR NousResearch#7429.
Follow-up on top of the helix4u NousResearch#6392 cherry-pick:
- reuse one helper for actionable Docker-local file-not-found errors
  across document/image/video/audio local-media send paths
- include /outputs/... alongside /output/... in the container-local
  path hint
- soften the gateway startup warning so it does not imply custom
  host-visible mounts are broken; the warning now targets the specific
  risky pattern of emitting container-local MEDIA paths without an
  explicit export mount
- add focused regressions for /outputs/... and non-document media hint
  coverage

This keeps the salvage aligned with the actual MEDIA delivery problem on
current main while reducing false-positive operator messaging.
Follow-up on top of the helix4u NousResearch#12388 cherry-picks:
- make deferred post-delivery callbacks generation-aware end-to-end so
  stale runs cannot clear callbacks registered by a fresher run for the
  same session
- bind callback ownership to the active session event at run start and
  snapshot that generation inside base adapter processing so later event
  mutation cannot retarget cleanup
- pass run_generation through proxy mode and drop stale proxy streams /
  final results the same way local runs are dropped
- centralize stop/new interrupt cleanup into one helper and replace the
  open-coded branches with shared logic
- unify internal control interrupt reason strings via shared constants
- remove the return from base.py's finally block so cleanup no longer
  swallows cancellation/exception flow
- add focused regressions for generation forwarding, proxy stale
  suppression, and newer-callback preservation

This addresses all review findings from the initial NousResearch#12388 review while
keeping the fix scoped to stale-output/typing-loop interrupt handling.
Built-in → optional-skills/:
  mlops/training/peft         → optional-skills/mlops/peft
  mlops/training/pytorch-fsdp → optional-skills/mlops/pytorch-fsdp
  mlops/models/clip           → optional-skills/mlops/clip
  mlops/models/stable-diffusion → optional-skills/mlops/stable-diffusion
  mlops/models/whisper        → optional-skills/mlops/whisper
  mlops/cloud/modal           → optional-skills/mlops/modal
  mcp/mcporter                → optional-skills/mcp/mcporter

Built-in mlops training kept: axolotl, trl-fine-tuning, unsloth.
Built-in mlops models kept: audiocraft, segment-anything.
Built-in mlops evaluation/research/huggingface-hub/inference all kept.
native-mcp stays built-in (documents the native MCP tool); mcporter was a
redundant alternative CLI.

Also: removed now-empty skills/mlops/cloud/ dir, refreshed
skills/mlops/models/DESCRIPTION.md and skills/mcp/DESCRIPTION.md to match
what's left, and synchronized both catalog pages (skills-catalog.md,
optional-skills-catalog.md).
…ousResearch#12473)

External services can now push plain-text notifications to a user's chat
via the webhook adapter without invoking the agent. Set deliver_only=true
on a route and the rendered prompt template becomes the literal message
body — dispatched directly to the configured target (Telegram, Discord,
Slack, GitHub PR comment, etc.).

Reuses all existing webhook infrastructure: HMAC-SHA256 signature
validation, per-route rate limiting, idempotency cache, body-size limits,
template rendering with dot-notation, home-channel fallback. No new HTTP
server, no new auth scheme, no new port.
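
To make the reused pieces concrete, here is a rough sketch of HMAC-SHA256 validation and dot-notation template rendering. The function names, the hex-digest signature format, and the `{a.b}` placeholder syntax are assumptions for illustration, not the adapter's confirmed interface.

```python
import hashlib
import hmac
import re


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time check of an HMAC-SHA256 hex digest over the raw body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


_PLACEHOLDER = re.compile(r"\{([A-Za-z0-9_.]+)\}")


def render_template(template: str, payload: dict) -> str:
    """Replace {a.b.c} with payload["a"]["b"]["c"]; leave unknown keys as-is."""

    def lookup(match: re.Match) -> str:
        value = payload
        for part in match.group(1).split("."):
            if not isinstance(value, dict) or part not in value:
                return match.group(0)
            value = value[part]
        return str(value)

    return _PLACEHOLDER.sub(lookup, template)
```

In deliver-only mode the rendered string would be the literal message body sent to the target, with no agent call in between.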

Use cases: Supabase/Firebase webhooks → user notifications, monitoring
alert forwarding, inter-agent pings, background job completion alerts.

Changes:
- gateway/platforms/webhook.py: new _direct_deliver() helper + early
  dispatch branch in _handle_webhook when deliver_only=true. Startup
  validation rejects deliver_only with deliver=log.
- hermes_cli/main.py + hermes_cli/webhook.py: --deliver-only flag on
  subscribe; list/show output marks direct-delivery routes.
- website/docs/user-guide/messaging/webhooks.md: new Direct Delivery
  Mode section with config example, CLI example, response codes.
- skills/devops/webhook-subscriptions/SKILL.md: document --deliver-only
  with use cases (bumped to v1.1.0).
- tests/gateway/test_webhook_deliver_only.py: 14 new tests covering
  agent bypass, template rendering, status codes, HMAC still enforced,
  idempotency still applies, rate limit still applies, startup
  validation, and direct-deliver dispatch.

Validation: 78 webhook tests pass (64 existing + 14 new). E2E verified
with real aiohttp server + real urllib POST — agent not invoked, target
adapter.send() called with rendered template, duplicate delivery_id
suppressed.

Closes the gap identified in PR NousResearch#12117 (thanks to @H1an1 / Antenna team)
without adding a second HTTP ingress server.
Adds a maps optional skill with 8 commands, 44 POI categories, and
zero external dependencies. Uses free open data: Nominatim, Overpass
API, OSRM, and TimeAPI.io.

Commands: search, reverse, nearby, distance, directions, timezone,
area, bbox.

Improvements over original PR NousResearch#2015:
- Fixed directory structure (optional-skills/productivity/maps/)
- Fixed distance argparse (--to flag instead of broken dual nargs=+)
- Fixed timezone (TimeAPI.io instead of broken worldtimeapi heuristic)
- Expanded POI categories from 12 to 44
- Added directions command with turn-by-turn OSRM steps
- Added area command (bounding box + dimensions for a named place)
- Added bbox command (POI search within a geographic rectangle)
- Added 23 unit tests
- Improved haversine (atan2 for numerical stability)
- Comprehensive SKILL.md with workflow examples
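
For reference, the atan2 form of the haversine formula looks roughly like this. It is a generic sketch, not the skill's actual code; the function name and the mean Earth radius constant are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    a = min(1.0, a)  # guard against rounding slightly above 1
    # atan2(sqrt(a), sqrt(1 - a)) stays well-conditioned near antipodal
    # points, where asin(sqrt(a)) loses precision.
    return 2 * EARTH_RADIUS_KM * math.atan2(math.sqrt(a), math.sqrt(1 - a))
```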

Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>
Skills are self-contained scripts — they don't need test suites in
the repo.
…skill

find-nearby and the (new) maps optional skill both used OpenStreetMap's
Overpass + Nominatim to answer the same question — 'what's near this
location?' — so shipping both would be duplicate code for overlapping
capability. Consolidate into one active-by-default skill at
skills/productivity/maps/ that is a strict superset of find-nearby.

Moves + deletions:
- optional-skills/productivity/maps/ → skills/productivity/maps/ (active,
  no install step needed)
- skills/leisure/find-nearby/ → DELETED (fully superseded)

Upgrades to maps_client.py so it covers everything find-nearby did:
- Overpass server failover — tries overpass-api.de then
  overpass.kumi.systems so a single-mirror outage doesn't break the skill
  (new overpass_query helper, used by both nearby and bbox)
- nearby now accepts --near "<address>" as a shortcut that auto-geocodes,
  so one command replaces the old 'search → copy coords → nearby' chain
- nearby now accepts --category (repeatable) for multi-type queries in
  one call (e.g. --category restaurant --category bar), results merged
  and deduped by (osm_type, osm_id), sorted by distance, capped at --limit
- Each nearby result now includes maps_url (clickable Google Maps search
  link) and directions_url (Google Maps directions from the search point
  — only when a ref point is known)
- Promoted commonly-useful OSM tags to top-level fields on each result:
  cuisine, hours (opening_hours), phone, website — instead of forcing
  callers to dig into the raw tags dict
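
The multi-category merge step can be sketched like this. The `distance_m` field name and the `merge_results` helper are hypothetical; only the dedupe key `(osm_type, osm_id)`, the distance sort, and the `--limit` cap come from the description above.

```python
def merge_results(batches: list, limit: int) -> list:
    """Merge per-category result lists, dedupe by OSM identity,
    sort by distance, and cap at `limit`."""
    by_key: dict = {}
    for batch in batches:
        for item in batch:
            key = (item["osm_type"], item["osm_id"])
            # Keep the first occurrence of each OSM element.
            by_key.setdefault(key, item)
    return sorted(by_key.values(), key=lambda r: r["distance_m"])[:limit]
```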

SKILL.md:
- Version bumped 1.1.0 → 1.2.0, description rewritten to lead with
  capability surface
- New 'Working With Telegram Location Pins' section replacing
  find-nearby's equivalent workflow
- metadata.hermes.supersedes: [find-nearby] so tooling can flag any
  lingering references to the old skill

External references updated:
- optional-skills/productivity/telephony/SKILL.md — related_skills
  find-nearby → maps
- website/docs/reference/skills-catalog.md — removed the (now-empty)
  'leisure' section, added 'maps' row under productivity
- website/docs/user-guide/features/cron.md — find-nearby example
  usages swapped to maps
- tests/tools/test_cronjob_tools.py, tests/hermes_cli/test_cron.py,
  tests/cron/test_scheduler.py — fixture string values swapped
- cli.py:5290 — /cron help-hint example swapped

Not touched:
- RELEASE_v0.2.0.md — historical record, left intact

E2E-verified live (Nominatim + Overpass, one query each):
- nearby --near "Times Square" --category restaurant --category bar → 3 results,
  sorted by distance, all with maps_url, directions_url, cuisine, phone, website
  where OSM had the tags

All 111 targeted tests pass across tests/cron/, tests/tools/, tests/hermes_cli/.
Adds the Mibayy noreply email to the AUTHOR_MAP so CI attribution checks
pass for the NousResearch#3884 maps skill feat commit (7fa01fa).
…unning (NousResearch#12548)

agent.switch_model() mutates self.model, self.provider, self.base_url,
self.api_key, self.api_mode, and rebuilds self.client / self._anthropic_client
in place.  The worker thread running agent.run_conversation reads those
fields on every iteration.  A concurrent config.set key=model or slash-
worker-mirrored /model / /personality / /prompt / /compress can send an
HTTP request with mismatched model + base_url (or the old client keeps
running against a new endpoint) — 400/404s the user never asked for.

Fix: same pattern as the session.undo / session.compress guards
(PR NousResearch#12416) and the gateway runner's running-agent /model guard (PR
NousResearch#12334).  Reject with 4009 'session busy' when session.running is True.

Two call sites guarded:
- config.set with key=model: primary /model entry point from Ink
- _mirror_slash_side_effects for model / personality / prompt /
  compress: slash-worker passthrough path that applies live-agent
  side effects

Idle sessions still switch models normally — regression guard test
verifies this.

Tests (tests/test_tui_gateway_server.py): 4 new cases.
- test_config_set_model_rejects_while_running
- test_config_set_model_allowed_when_idle (regression guard)
- test_mirror_slash_side_effects_rejects_mutating_commands_while_running
- test_mirror_slash_side_effects_allowed_when_idle (regression guard)

Validated: against unpatched server.py, the two 'rejects_while_running'
tests fail with the exact race they assert against.  With the fix all
4 pass.  Live E2E against the live Python environment confirmed both
guards enforce 4009 / 'session busy' exactly as designed.
Adds two complementary GitHub PR review guides from contest submissions:
- Cron-based PR review agent (from PR NousResearch#5836 by @dieutx) — polls on a
  schedule, no server needed, teaches skills + memory authoring
- Webhook-based PR review (from PR NousResearch#6503 by @gaijinkush) — real-time via
  GitHub webhooks, documents previously undocumented webhook feature
Both guides are cross-linked so users can pick the approach that fits.

Reworks quickstart.md by integrating the best content from PR NousResearch#5744
by @aidil2105:
- Opinionated decision table ('The fastest path')
- Common failure modes table with causes and fixes
- Recovery toolkit sequence
- Session lifecycle verification step
- Better first-chat guidance with example prompts

Slims down installation.md:
- Removes 10-step manual/dev install section (already covered in
  developer-guide/contributing.md)
- Links to Contributing guide for dev setup
- Keeps focused on the automated installer + prerequisites + troubleshooting
…races (NousResearch#12555)

When a user hits /new or /resume before the previous session finishes
initializing, session.close runs while the previous session.create's
_build thread is still constructing the agent.  session.close pops
_sessions[sid] and closes whatever slash_worker it finds (None at that
point — _build hasn't installed it yet), then returns.  _build keeps
running in the background, installs the slash_worker subprocess and
registers an approval-notify callback on a session dict that's now
unreachable via _sessions.  The subprocess leaks until process exit;
the notify callback lingers in the global registry.

Fix: _build now tracks what it allocates (worker, notify_registered)
and checks in its finally block whether _sessions[sid] still points
to the session it's building for.  If not, the build was orphaned by
a racing close, so clean up the subprocess and unregister the notify
ourselves.

tui_gateway/server.py:
- _build reads _sessions.get(sid) safely (returns early if already gone)
- tracks allocated worker + notify registration
- finally checks orphan status and cleans up

Tests (tests/test_tui_gateway_server.py): 2 new cases.
- test_session_create_close_race_does_not_orphan_worker: slow
  _make_agent, close mid-build, verify worker.close() and
  unregister_gateway_notify both fire from the build thread's
  cleanup path.
- test_session_create_no_race_keeps_worker_alive: regression guard —
  happy path does NOT over-eagerly clean up a live worker.

Validated: against the unpatched code, the race test fails with
'orphan worker was not cleaned up — closed_workers=[]'.  Live E2E
against the live Python environment confirmed the cleanup fires
exactly when the race happens.
Two small races in gateway/platforms/discord.py, bundled together
since they're adjacent in the adapter and both narrow in impact.

1. on_message vs _resolve_allowed_usernames (startup window)
   DISCORD_ALLOWED_USERS accepts both numeric IDs and raw usernames.
   At connect-time, _resolve_allowed_usernames walks the bot's guilds
   (fetch_members can take multiple seconds) to swap usernames for IDs.
   on_message can fire during that window; _is_allowed_user compares
   the numeric author.id against a set that may still contain raw
   usernames — legitimate users get silently rejected for a few
   seconds after every reconnect.

   Fix: on_message awaits _ready_event (with a 30s timeout) when it
   isn't already set.  on_ready sets the event after the resolve
   completes.  In steady state this is a no-op (event already set);
   only the startup / reconnect window ever blocks.

2. join_voice_channel check-and-connect
   The existing-connection check at _voice_clients.get() and the
   channel.connect() call straddled an await boundary with no lock.
   Two concurrent /voice channel invocations could both see None and
   both call connect(); discord.py raises ClientException
   ("Already connected") on the loser.  Same race class for leave
   running concurrently with _voice_timeout_handler.

   Fix: per-guild asyncio.Lock (_voice_locks dict with lazy alloc via
   _voice_lock_for).  join_voice_channel and leave_voice_channel both
   run their body under the lock.  Sequential within a guild, still
   fully concurrent across guilds.
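
   The per-guild lock pattern can be sketched as below. `VoiceManager` and the injected `connect` coroutine are hypothetical; only `_voice_clients`, `_voice_locks`, and `_voice_lock_for` are names taken from this message.

```python
import asyncio


class VoiceManager:
    def __init__(self, connect) -> None:
        self._connect = connect  # coroutine function: channel -> client
        self._voice_clients: dict = {}
        self._voice_locks: dict = {}

    def _voice_lock_for(self, guild_id: int) -> asyncio.Lock:
        # Lazy allocation: no await here, so this is atomic on the loop.
        lock = self._voice_locks.get(guild_id)
        if lock is None:
            lock = self._voice_locks[guild_id] = asyncio.Lock()
        return lock

    async def join_voice_channel(self, guild_id: int, channel: str):
        async with self._voice_lock_for(guild_id):
            existing = self._voice_clients.get(guild_id)
            if existing is not None:
                return existing
            client = await self._connect(channel)
            self._voice_clients[guild_id] = client
            return client


async def demo() -> int:
    calls = 0

    async def connect(channel: str) -> str:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0)  # yield, as a real connect would
        return f"client:{channel}"

    mgr = VoiceManager(connect)
    await asyncio.gather(
        mgr.join_voice_channel(1, "general"),
        mgr.join_voice_channel(1, "general"),
    )
    return calls
```

   Two concurrent joins on the same guild end up calling `connect` once; joins on different guilds use different locks and do not wait on each other.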

Both: LOW severity.  The first only affects username-based allowlists
on fast-follow-up messages at startup; the second is a narrow
exception on simultaneous voice commands.  Bundled so the adapter
gets a single coherent polish pass.

Tests (tests/gateway/test_discord_race_polish.py): 2 regression cases.
- test_concurrent_joins_do_not_double_connect: two concurrent
  join_voice_channel calls on the same guild result in exactly one
  channel.connect() invocation.
- test_on_message_blocks_until_ready_event_set: asserts the expected
  wait pattern is present in on_message (source inspection, since
  full discord.py client setup isn't practical here).

Regression-guard validated: against unpatched gateway/platforms/discord.py
both tests fail.  With the fix they pass.  Full Discord suite (118
tests) green.
…search#12546)

The stdin-read loop in entry.py calls handle_request() inline, so the
five handlers that can block for seconds to minutes
(slash.exec, cli.exec, shell.exec, session.resume, session.branch)
freeze the dispatcher. While one is running, any inbound RPC —
notably approval.respond and session.interrupt — sits unread in the
pipe buffer and lands only after the slow handler returns.

Route only those five onto a small ThreadPoolExecutor; every other
handler stays on the main thread so the fast-path ordering is
unchanged and the audit surface stays small. write_json is already
_stdout_lock-guarded, so concurrent response writes are safe. Pool
size defaults to 4 (overridable via HERMES_TUI_RPC_POOL_WORKERS).

- add _LONG_HANDLERS set + ThreadPoolExecutor + atexit shutdown
- new dispatch(req) function: pool for long handlers, inline for rest
- _run_and_emit wraps pool work in a try/except so a misbehaving
  handler still surfaces as a JSON-RPC error instead of silently
  dying in a worker
- entry.py swaps handle_request → dispatch
- 5 new tests: sync path still inline, long handlers emit via stdout,
  fast handler not blocked behind slow one, handler exceptions map to
  error responses, non-long methods always take the sync path

Manual repro confirms the fix: shell.exec(sleep 3) + terminal.resize
sent back-to-back now returns the resize response at t=0s while the
sleep finishes independently at t=3s. Before, both landed together
at t=3s.
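
A condensed sketch of the routing idea, assuming handlers are plain callables in a dict and `emit` is a thread-safe writer. The real `dispatch` and `_run_and_emit` signatures are not shown here, and the `-32000` error code is an assumption.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Optional

LONG_HANDLERS = {"slash.exec", "cli.exec", "shell.exec",
                 "session.resume", "session.branch"}
_pool = ThreadPoolExecutor(max_workers=4)


def _run(req: dict, handlers: dict) -> dict:
    """Run one handler; map any exception to a JSON-RPC style error."""
    try:
        result = handlers[req["method"]](req.get("params"))
        return {"id": req["id"], "result": result}
    except Exception as exc:
        return {"id": req["id"],
                "error": {"code": -32000, "message": str(exc)}}


def dispatch(req: dict, handlers: dict,
             emit: Callable[[dict], None]) -> Optional[Future]:
    if req["method"] in LONG_HANDLERS:
        # Slow handlers run on the pool so the read loop keeps draining.
        return _pool.submit(lambda: emit(_run(req, handlers)))
    emit(_run(req, handlers))  # fast path stays inline and ordered
    return None
```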

Fixes NousResearch#12546.
ethernet8023 and others added 21 commits April 20, 2026 16:50
…ersion

The link regex in format_message used [^)]+ for the URL portion, which
  stopped at the first ) character. URLs with nested parentheses (e.g.
  Wikipedia links like Python_(programming_language)) were improperly parsed.

  Use a better regex, the same one the Slack adapter uses.
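
  As an illustration of the problem, one way to allow a single level of nested parentheses in the URL part is sketched below. This pattern is an assumption written for this example; it is not copied from the Slack adapter.

```python
import re

# URL part: any run of non-paren, non-space characters, optionally
# interleaved with balanced "(...)" groups one level deep.
LINK_RE = re.compile(r"\[([^\]]+)\]\(((?:[^()\s]|\([^()\s]*\))+)\)")


def find_links(text: str) -> list:
    """Return (label, url) pairs for markdown links in `text`."""
    return [(m.group(1), m.group(2)) for m in LINK_RE.finditer(text)]
```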
… gibberish

The colored ✓/✗ marks in /tools list, /tools enable, and /tools disable
  were showing up as "?[32m✓ enabled?[0m" instead of green and red. The
  colors come out as ANSI escape codes, but the TUI eats
  the ESC byte and replaces it with "?" when those codes are printed
  straight to stdout. They need to go through prompt_toolkit's renderer.

  Fix: capture the command's output and re-print each line through
  _cprint(), the same workaround used elsewhere for NousResearch#2262. The capture
  buffer fakes isatty()=True so the color helper still emits escapes
  (StringIO.isatty() is False, which would otherwise strip colors).
  The capture path only runs inside the TUI; standalone CLI and tests
  go straight through to real stdout where colors already work.
…ousResearch#9091, NousResearch#13131

- Fix duplicate 'timezone' import in e2e conftest
- Fix test_text_before_command_not_detected asserting send() is awaited
  when no agent is present in mock setup (text messages don't produce
  command output)
The inline markdown regex matched `_..._` / `__...__` anywhere, so file
paths like `browser_screenshot_ecc1c3feab.png` got mid-path italics.

Require non-word flanking (`(?<!\w)` / `(?!\w)`) on underscore emphasis
so snake_case identifiers and paths render literally, matching the
CommonMark intraword rule. `*` / `**` keep intraword semantics.
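
The flanking rule can be illustrated with a small Python regex. This is a sketch of the idea only: the TUI's own pattern is not shown in this message, and `UNDERSCORE_EM` and `emphasised_spans` are hypothetical names.

```python
import re

# Opening underscores must not follow a word character and closing
# underscores must not precede one, so snake_case stays literal.
UNDERSCORE_EM = re.compile(r"(?<!\w)(_{1,2})(?=\S)(.+?)(?<=\S)\1(?!\w)")


def emphasised_spans(text: str) -> list:
    """Return the inner text of each underscore-emphasis span."""
    return [m.group(2) for m in UNDERSCORE_EM.finditer(text)]
```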
The agent emits `MEDIA:<path>` to signal file delivery to the gateway,
and `[[audio_as_voice]]` as a voice-delivery hint. The gateway strips
both before sending to Telegram/Discord/Slack, but the TUI was rendering
them raw through markdown — which is also how the intraword underscore
bug originally surfaced (`browser_screenshot_ecc…`).

At the `Md` layer, detect both sentinels on their own line:
- `MEDIA:<path>` → `▸ <path>` with the path rendered literal and wrapped
  in a `Link` for OSC 8 hyperlink support (absolute paths get a
  `file://` URL, so modern terminals make them click-to-open).
- `[[audio_as_voice]]` → dropped silently; it has no meaning in TUI.

Covers tests for quoted/backticked MEDIA variants, Windows drive paths,
whitespace, and the inline-in-prose case (left untouched — still
protected by the intraword-underscore guard).
…wn-intraword-underscore

fix(tui): markdown — guard intraword underscores + clean protocol sentinels
…tool execution (NousResearch#13205)

When /steer is sent during an API call (model thinking), the steer text
sits in _pending_steer until after the next tool batch — which may never
come if the model returns a final response. In that case the steer is
only delivered as a post-run follow-up, defeating the purpose.

Add a pre-API-call drain at the top of the main loop: before building
api_messages, check _pending_steer and inject into the last tool result
in the messages list. This ensures steers sent during model thinking are
visible on the very next API call.

If no tool result exists yet (first iteration), the steer is restashed
for the post-tool drain to pick up — injecting into a user message would
break role alternation.
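
A rough sketch of the drain, assuming OpenAI-style message dicts with a `role` key. The function name, its return convention, and the `[user steer]` marker text are assumptions for illustration.

```python
from typing import Optional


def drain_pending_steer(messages: list, pending_steer: Optional[str]) -> Optional[str]:
    """Inject the steer into the most recent tool result.

    Returns the steer text if it still has to wait (no tool message
    yet), or None once it has been injected or there was nothing to do.
    """
    if not pending_steer:
        return None
    for msg in reversed(messages):  # backward scan past non-tool messages
        if msg.get("role") == "tool":
            msg["content"] = f"{msg.get('content', '')}\n\n[user steer] {pending_steer}"
            return None
    return pending_steer  # restash for the post-tool drain
```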

Three new tests cover the pre-API-call drain: injection into last tool
result, restash when no tool message exists, and backward scan past
non-tool messages.
Three-layer defense against secrets leaking into compaction summaries:
1. Input redaction: redact_sensitive_text() on message content and tool
   call arguments in _serialize_for_summary() before sending to summarizer
2. Prompt instructions: NEVER include API keys/tokens/passwords in the
   summarizer preamble, template Critical Context section, and focus topic
3. Output redaction: redact_sensitive_text() on the summary output and
   _previous_summary for iterative updates

Reuses existing agent/redact.py patterns (sk-*, ghp_*, key=value, etc).
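
For a sense of what such redaction involves, here is a small sketch covering the pattern families named above. The exact regexes in agent/redact.py are not shown in this message, so these patterns and the `[REDACTED]` placeholder are assumptions.

```python
import re

_TOKEN_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9_-]{8,}"),  # sk-* style API keys
    re.compile(r"\bghp_[A-Za-z0-9]{8,}"),   # ghp_* style GitHub tokens
]
_KEY_VALUE = re.compile(
    r"\b(api[_-]?key|token|password|secret)\s*=\s*\S+", re.IGNORECASE
)


def redact_sensitive_text(text: str) -> str:
    """Mask token-shaped strings and the value side of key=value pairs."""
    for pattern in _TOKEN_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return _KEY_VALUE.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)
```

Applying the same function to the summarizer input and to its output is what gives the first and third layers.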

Cherry-picked from PR NousResearch#9200 by @entropidelic.
Adds a new gateway platform that lets Hermes act as a WhatsApp bot using the
official Meta Cloud Business API. The platform is webhook-only: it expects an
external WhatsApp MCP server to own the Meta Cloud webhook (verify handshake,
HMAC validation, contact tagging, multi-tenant routing) and forward messages
of interest to this gateway via a simple POST.

This complements the existing whatsapp.py adapter (Baileys / WhatsApp Web).
Many users run a number on Meta's official Cloud API and want a TOS-compliant
integration without giving up Hermes; this platform fills that gap.

Forward POST contract:
  POST <PATH> (default /wa)
  X-Webhook-Secret: <shared secret>
  body: {message_id, phone, type, content, ...}

The gateway responds 200 immediately so the MCP's short forward timeout (the
original Meta webhook handler must reply <5 s) is not exceeded by LLM latency;
the actual reply is sent asynchronously by send() via Meta's Graph API.
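
The "acknowledge first, reply later" flow can be sketched without any HTTP framework. `handle_forward`, the `process` coroutine, and the 401 status for a bad secret are assumptions; only the `X-Webhook-Secret` header and the immediate 200 come from the contract above.

```python
import asyncio
import hmac


async def handle_forward(headers: dict, body: dict, secret: str,
                         process, background: set) -> int:
    """Validate the shared secret, queue processing, return a status code."""
    supplied = headers.get("X-Webhook-Secret", "")
    if not hmac.compare_digest(supplied.encode(), secret.encode()):
        return 401
    # Keep a reference so the task is not garbage-collected mid-flight.
    task = asyncio.create_task(process(body))
    background.add(task)
    task.add_done_callback(background.discard)
    return 200  # returned before the (slow) agent reply exists


async def demo() -> tuple:
    release = asyncio.Event()
    replied: list = []

    async def process(body: dict) -> None:
        await release.wait()  # stands in for LLM latency
        replied.append(body["message_id"])

    background: set = set()
    status = await handle_forward({"X-Webhook-Secret": "s3"},
                                  {"message_id": "m1"}, "s3",
                                  process, background)
    early = (status, list(replied))
    release.set()
    await asyncio.gather(*background)
    bad = await handle_forward({}, {}, "s3", process, background)
    return early, replied, bad
```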

What's included:
  * gateway/platforms/whatsapp_via_mcp_meta_business_api.py - the adapter
  * gateway/config.py - Platform enum + _apply_env_overrides + _token_env_names
  * gateway/run.py - factory in _create_adapter + auth allowlists
  * agent/prompt_builder.py - PLATFORM_HINTS entry
  * toolsets.py - hermes-whatsapp-via-mcp-meta-business-api toolset
  * cron/scheduler.py - cron platform_map entry
  * tools/send_message_tool.py - send_message routing + standalone _send_*
  * tools/cronjob_tools.py - deliver schema mention
  * hermes_cli/status.py - Messaging Platforms status entry
  * hermes_cli/gateway.py - setup wizard _PLATFORMS entry
  * tests/gateway/test_whatsapp_via_mcp_meta_business_api.py - 25 tests
  * website/docs/user-guide/messaging/whatsapp-via-mcp-meta-business-api.md
  * website/docs/user-guide/messaging/index.md - platform comparison row
  * website/docs/reference/environment-variables.md - env vars table
  * README.md - feature row
Adds a callback_query routing hop that detects the "hitl:" prefix and
delegates to a new _handle_hitl_callback method. All other callbacks
(model picker mp:/mm:/mb/mx/mg:, default) flow unchanged.

The HITL handler:
- Validates identity against TELEGRAM_ALLOWED_USERS env (comma-separated
  user IDs); rejected attempts emit hitl.identity.rejected at warn and
  never write to the resolved file.
- Parses callback_data shape "hitl:<approve|reject>:<request_id>".
- Appends a single-line JSONL record to
  $ELIGIA_HITL_DIR/approvals-resolved.jsonl (default /opt/eligia/data/hitl)
  for the eligia-core langgraph workflow polling to pick up.
- Calls answerCallbackQuery so the inline-keyboard button stops spinning.
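
The parse, identity check, and JSONL append can be sketched as follows. The record field names, the `resolve_hitl_callback` helper, and passing the allowlist as a set are assumptions; the callback shape and the `approvals-resolved.jsonl` file name come from the list above.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Optional

RESOLVED_FILE = "approvals-resolved.jsonl"


def resolve_hitl_callback(callback_data: str, user_id: int,
                          allowed_user_ids: set,
                          hitl_dir: Path) -> Optional[dict]:
    """Parse "hitl:<approve|reject>:<request_id>" and append one JSONL record."""
    parts = callback_data.split(":", 2)
    if (len(parts) != 3 or parts[0] != "hitl"
            or parts[1] not in ("approve", "reject") or not parts[2]):
        return None
    if user_id not in allowed_user_ids:
        return None  # rejected identities never write to the resolved file
    record = {"request_id": parts[2], "decision": parts[1],
              "user_id": user_id, "ts": time.time()}
    hitl_dir.mkdir(parents=True, exist_ok=True)
    with open(hitl_dir / RESOLVED_FILE, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        hitl_dir = Path(tmp)
        ok = resolve_hitl_callback("hitl:approve:req-1", 42, {42}, hitl_dir)
        denied = resolve_hitl_callback("hitl:reject:req-1", 7, {42}, hitl_dir)
        other = resolve_hitl_callback("mp:something", 42, {42}, hitl_dir)
        lines = (hitl_dir / RESOLVED_FILE).read_text(encoding="utf-8").splitlines()
    return ok["decision"], denied, other, len(lines)
```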

This is the multiplex implementation for Change A Phase 2b (eligia-core
docs/openspec-slice-phase5.md). Single Elig_ia_bot now serves both
Hermes conversation (existing) and ELIGIA HITL approvals (new).

The existing _handle_callback_query dispatch was extended with one
early-return branch — no other behavior changes.

Auxiliary _log_hitl_event helper appends structured audit events to
$ELIGIA_HITL_DIR/events.jsonl. Best-effort; logging failures are
swallowed to avoid blocking the callback handler.

Apply-fix-contract spec authority lives in eligia-core via the
.ai-playbook submodule pin v0.7.1 (commit 8d5f68c).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…messages

Adds a regex-based pre-check at the top of _handle_webhook that catches
"aprobar X" / "rechazar X" replies (and Spanish/English variants) from
identity-bound senders and writes the resolution directly to the shared
$ELIGIA_HITL_DIR/approvals-resolved.jsonl file, skipping the LLM.

This is the receive half of Change A Phase 3 (WhatsApp HITL) — the send
half lives in eligia-core/langgraph-aiops/channels/whatsapp.py and POSTs
to Meta Graph API directly.

Architectural notes:
- WhatsApp has no inline keyboards. The envelope body explicitly tells
  the user the exact phrases ("aprobar <id>" / "rechazar <id>"). The
  regex is permissive: aprob*, aprueb*, ✅, ok, sí, dale on the approve
  side; rechaz*, recha*, ❌, no on the reject side. Failed pattern match
  falls through to Hermes' normal LLM handler.
- Identity binding via WA_HITL_ARTURO_E164 (env, comma-separated for
  future multi-signer; single-signer in v1). DISTINCT from the
  conversational WHATSAPP_VIA_MCP_META_BUSINESS_API_ALLOWED_USERS list,
  so an HITL-bound number can be a subset of conversational-allowed.
- Messages from non-bound senders pass silently to the LLM: they are
  treated as normal chat, not approval attempts, so no audit-log noise.
- Resolution write at $ELIGIA_HITL_DIR (default /opt/eligia/data/hitl,
  same shared volume Telegram uses) keeps both channels coherent for
  the workflow polling side.
- Ack reply ("✅ Aprobado <id>" / "❌ Rechazado <id>") is sent via the
  existing self.send() so the user gets confirmation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3b of Change A. Lets the user approve/reject the most recent open
HITL envelope with a single token — no request_id needed — when an open
request exists in $ELIGIA_HITL_DIR/approvals-pending.jsonl within the
24 h staleness window.

## What changed
- Two new bare-token regexes:
    APPROVE: aprob*/aprueb*/✅/ok/si/sí/dale/yes/y/1/👍
    REJECT:  rechaz*/recha*/❌/no/nope/n/0/👎
  They match a *whole-message* token (no id required).
- Bare-token branch added to _try_parse_hitl_intent AFTER the existing
  explicit "verb + id" match. Explicit form still wins when present.
- New helper _find_most_recent_open_request_id() reads pending JSONL,
  filters out already-resolved + stale (>24 h) entries, returns the
  latest open id (or None).
- When bare-token matches but no open request exists, returns None
  → falls through to the LLM. So plain "1" or "no" without HITL context
  remain conversational with Hermes.
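
The lookup described for _find_most_recent_open_request_id() could look roughly like this. It is a sketch under assumptions: the `request_id` and `created_at` field names and the ISO-8601 timestamp format are hypothetical; the two file names and the 24 h window are taken from this message.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Dict, List, Optional

STALENESS = timedelta(hours=24)


def _read_jsonl(path: Path) -> List[Dict]:
    """Read a JSONL file, skipping blank or malformed lines; missing file -> []."""
    if not path.exists():
        return []
    rows = []
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(row, dict):
            rows.append(row)
    return rows


def find_most_recent_open_request_id(
    hitl_dir: Path, now: Optional[datetime] = None
) -> Optional[str]:
    now = now or datetime.now(timezone.utc)
    resolved = {r.get("request_id") for r in _read_jsonl(hitl_dir / "approvals-resolved.jsonl")}
    best_id, best_ts = None, None
    for row in _read_jsonl(hitl_dir / "approvals-pending.jsonl"):
        rid, raw = row.get("request_id"), row.get("created_at")
        if not rid or rid in resolved or not isinstance(raw, str):
            continue
        try:
            ts = datetime.fromisoformat(raw)
        except ValueError:
            continue
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assume UTC when no offset is given
        if now - ts > STALENESS:
            continue  # stale: older than the 24 h window
        if best_ts is None or ts > best_ts:
            best_id, best_ts = rid, ts
    return best_id
```

A None result is what makes a bare "1" or "no" fall through to the LLM when nothing is pending.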

## Why
WhatsApp doesn't have inline keyboards. Typing the full request_id
("aprobar wa-smoke-1") feels like a CLI in a chat client. With this
change the user can reply "1" or "ok" and Hermes resolves the most
recent envelope — same UX shape as a Telegram inline-keyboard tap.

## Compatibility
- Existing "verb + id" form unchanged.
- Existing tests for that form continue to pass (still tested in
  eligia-core/tests/test_channels_whatsapp.py — the explicit pattern is
  exercised on the dispatcher side).
- Bare-token path is gated by presence of $ELIGIA_HITL_DIR/approvals-
  pending.jsonl. If the file is absent or empty, behavior is the
  pre-Phase-3b fall-through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n IVR convention

Phase 3b iteration. Users mentally read numbered choice lists as
"1) yes / 2) no", not as boolean integers (1/0). Switching the bare
reject regex to recognise "2" instead of "0" so the WhatsApp UX feels
natural ("aprobar = 1, rechazar = 2").

The envelope rendering side (langgraph-aiops/channels/whatsapp.py) now
prints an explicit numbered list:

    Cómo responder:
      1 — aprobar
      2 — rechazar

so the user sees the choice mapping inline; no more guessing "is 1
ok or yes?" before tapping the keyboard.

Companion change in eligia-core: channels/whatsapp.py body format +
test assertion update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The pre-LLM HITL parser in this WhatsApp adapter was the legacy path
that detected free-form text replies like "aprobar abc-123" / "1" /
"ok" and wrote a resolution to approvals-resolved.jsonl directly.

That path is dead since 2026-05-05: eligia-core landed payload-routes
in waba-mcp (PR #1) + a /webhook/hitl endpoint on aiops (PR NousResearch#126).
WhatsApp HITL approvals now flow as Meta `interactive` button replies
whose `interactive.button_reply.id` starts with `hitl:`. waba-mcp
short-circuits the personal_hermes tag forward when the prefix matches,
sending the click directly to aiops. Hermes never sees those clicks.
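
As a rough illustration of that short-circuit (not the waba-mcp code itself), the routing decision reduces to a prefix check on the button id. The function name and the returned route labels are hypothetical; the `interactive.button_reply.id` field and the `hitl:` prefix are the ones named above.

```python
from typing import Any, Dict

HITL_PREFIX = "hitl:"


def route_for_message(message: Dict[str, Any]) -> str:
    """Pick a destination for one inbound WhatsApp message object."""
    if message.get("type") == "interactive":
        interactive = message.get("interactive") or {}
        reply = interactive.get("button_reply") or {}
        button_id = reply.get("id")
        if isinstance(button_id, str) and button_id.startswith(HITL_PREFIX):
            return "aiops_hitl"  # hypothetical label for the /webhook/hitl route
    return "personal_hermes"  # everything else keeps the normal forward


# Free-form text never reaches the HITL route, whatever it says.
text_msg = {"type": "text", "text": {"body": "aprobar abc-123"}}
button_msg = {
    "type": "interactive",
    "interactive": {
        "type": "button_reply",
        "button_reply": {"id": "hitl:approve:abc-123", "title": "Aprobar"},
    },
}
```

Because the decision depends only on a structured payload id, message text can no longer trigger a resolution by accident.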

Removed:
  - the pre-LLM HITL pre-check in _handle_webhook
  - _try_parse_hitl_intent
  - _find_most_recent_open_request_id
  - _write_hitl_resolution
  - _HITL_APPROVE_RE / _HITL_REJECT_RE / _HITL_BARE_APPROVE_RE / _HITL_BARE_REJECT_RE
  - _HITL_PENDING_STALENESS_HOURS
  - newly-unused imports (re, timedelta, timezone, Path, List)

Why the cleanup matters beyond dead code:
  - Security smell: any free-form text containing "aprobar" stem
    triggered a resolution attempt. Casual chat could fire approvals
    by accident if an in-flight request happened to be open.
  - Identity binding still exists at the channel boundary (waba-mcp
    routing + aiops HMAC) — tighter than env-list match here.

Telegram receive-side (`gateway/platforms/telegram.py`) is UNCHANGED.
That path uses Telegram's native inline keyboard callbacks and is
still the primary HITL channel for cron-style approvals where the
phone might be elsewhere.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Wizarck Wizarck merged commit e6062f0 into main May 5, 2026
@Wizarck Wizarck deleted the chore/remove-wa-hitl-regex-receive branch May 5, 2026 22:51
@coderabbitai

coderabbitai Bot commented May 5, 2026


Important

Review skipped

Too many files!

This PR contains 300 files, which is 150 over the limit of 150.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fecc240e-2adf-497e-9986-5541cd8a239b

📥 Commits

Reviewing files that changed from the base of the PR and between 64b3547 and 8fb0b60.

📒 Files selected for processing (300)
  • .github/actions/nix-setup/action.yml
  • .github/workflows/docker-publish.yml
  • .github/workflows/nix-lockfile-check.yml
  • .github/workflows/nix-lockfile-fix.yml
  • .github/workflows/nix.yml
  • .github/workflows/supply-chain-audit.yml
  • .github/workflows/tests.yml
  • .gitignore
  • Dockerfile
  • README.md
  • acp_adapter/entry.py
  • agent/anthropic_adapter.py
  • agent/auxiliary_client.py
  • agent/codex_responses_adapter.py
  • agent/context_compressor.py
  • agent/context_references.py
  • agent/credential_pool.py
  • agent/display.py
  • agent/error_classifier.py
  • agent/gemini_cloudcode_adapter.py
  • agent/gemini_native_adapter.py
  • agent/gemini_schema.py
  • agent/insights.py
  • agent/model_metadata.py
  • agent/models_dev.py
  • agent/prompt_builder.py
  • agent/redact.py
  • agent/smart_model_routing.py
  • batch_runner.py
  • cli-config.yaml.example
  • cli.py
  • cron/jobs.py
  • cron/scheduler.py
  • docs/acp-setup.md
  • docs/honcho-integration-spec.html
  • docs/honcho-integration-spec.md
  • docs/migration/openclaw.md
  • docs/plans/2026-03-16-pricing-accuracy-architecture-design.md
  • docs/plans/2026-04-01-ink-gateway-tui-migration-plan.md
  • docs/skins/example-skin.yaml
  • docs/specs/container-cli-review-fixes.md
  • gateway/channel_directory.py
  • gateway/config.py
  • gateway/platforms/api_server.py
  • gateway/platforms/base.py
  • gateway/platforms/dingtalk.py
  • gateway/platforms/discord.py
  • gateway/platforms/feishu.py
  • gateway/platforms/feishu_comment.py
  • gateway/platforms/feishu_comment_rules.py
  • gateway/platforms/matrix.py
  • gateway/platforms/mattermost.py
  • gateway/platforms/signal.py
  • gateway/platforms/slack.py
  • gateway/platforms/telegram.py
  • gateway/platforms/webhook.py
  • gateway/platforms/wecom.py
  • gateway/platforms/whatsapp.py
  • gateway/platforms/whatsapp_via_mcp_meta_business_api.py
  • gateway/run.py
  • gateway/session.py
  • gateway/session_context.py
  • gateway/status.py
  • gateway/stream_consumer.py
  • hermes_cli/auth.py
  • hermes_cli/auth_commands.py
  • hermes_cli/backup.py
  • hermes_cli/codex_models.py
  • hermes_cli/commands.py
  • hermes_cli/config.py
  • hermes_cli/debug.py
  • hermes_cli/doctor.py
  • hermes_cli/dump.py
  • hermes_cli/gateway.py
  • hermes_cli/main.py
  • hermes_cli/model_switch.py
  • hermes_cli/models.py
  • hermes_cli/platforms.py
  • hermes_cli/plugins.py
  • hermes_cli/plugins_cmd.py
  • hermes_cli/profiles.py
  • hermes_cli/providers.py
  • hermes_cli/runtime_provider.py
  • hermes_cli/setup.py
  • hermes_cli/status.py
  • hermes_cli/timeouts.py
  • hermes_cli/tips.py
  • hermes_cli/uninstall.py
  • hermes_cli/web_server.py
  • hermes_cli/webhook.py
  • hermes_state.py
  • mcp_serve.py
  • mini_swe_runner.py
  • model_tools.py
  • nix/checks.nix
  • nix/devShell.nix
  • nix/lib.nix
  • nix/nixosModules.nix
  • nix/packages.nix
  • nix/python.nix
  • nix/tui.nix
  • nix/web.nix
  • optional-skills/autonomous-ai-agents/honcho/SKILL.md
  • optional-skills/creative/touchdesigner-mcp/SKILL.md
  • optional-skills/creative/touchdesigner-mcp/references/mcp-tools.md
  • optional-skills/creative/touchdesigner-mcp/references/network-patterns.md
  • optional-skills/creative/touchdesigner-mcp/references/operators.md
  • optional-skills/creative/touchdesigner-mcp/references/pitfalls.md
  • optional-skills/creative/touchdesigner-mcp/references/python-api.md
  • optional-skills/creative/touchdesigner-mcp/references/troubleshooting.md
  • optional-skills/creative/touchdesigner-mcp/scripts/setup.sh
  • optional-skills/mcp/mcporter/SKILL.md
  • optional-skills/mlops/clip/SKILL.md
  • optional-skills/mlops/clip/references/applications.md
  • optional-skills/mlops/guidance/SKILL.md
  • optional-skills/mlops/guidance/references/backends.md
  • optional-skills/mlops/guidance/references/constraints.md
  • optional-skills/mlops/guidance/references/examples.md
  • optional-skills/mlops/hermes-atropos-environments/SKILL.md
  • optional-skills/mlops/modal/SKILL.md
  • optional-skills/mlops/modal/references/advanced-usage.md
  • optional-skills/mlops/modal/references/troubleshooting.md
  • optional-skills/mlops/peft/SKILL.md
  • optional-skills/mlops/peft/references/advanced-usage.md
  • optional-skills/mlops/peft/references/troubleshooting.md
  • optional-skills/mlops/pytorch-fsdp/SKILL.md
  • optional-skills/mlops/pytorch-fsdp/references/index.md
  • optional-skills/mlops/pytorch-fsdp/references/other.md
  • optional-skills/mlops/stable-diffusion/SKILL.md
  • optional-skills/mlops/stable-diffusion/references/advanced-usage.md
  • optional-skills/mlops/stable-diffusion/references/troubleshooting.md
  • optional-skills/mlops/whisper/SKILL.md
  • optional-skills/mlops/whisper/references/languages.md
  • optional-skills/productivity/telephony/SKILL.md
  • optional-skills/research/duckduckgo-search/SKILL.md
  • optional-skills/research/duckduckgo-search/scripts/duckduckgo.sh
  • plans/gemini-oauth-provider.md
  • plugins/disk-cleanup/README.md
  • plugins/disk-cleanup/__init__.py
  • plugins/disk-cleanup/disk_cleanup.py
  • plugins/disk-cleanup/plugin.yaml
  • plugins/memory/honcho/__init__.py
  • plugins/memory/honcho/cli.py
  • plugins/memory/honcho/client.py
  • plugins/memory/honcho/session.py
  • pyproject.toml
  • run_agent.py
  • scripts/install.ps1
  • scripts/install.sh
  • scripts/release.py
  • scripts/whatsapp-bridge/bridge.js
  • skills/autonomous-ai-agents/hermes-agent/SKILL.md
  • skills/creative/baoyu-infographic/PORT_NOTES.md
  • skills/creative/baoyu-infographic/SKILL.md
  • skills/creative/baoyu-infographic/references/analysis-framework.md
  • skills/creative/baoyu-infographic/references/base-prompt.md
  • skills/creative/baoyu-infographic/references/layouts/bento-grid.md
  • skills/creative/baoyu-infographic/references/layouts/binary-comparison.md
  • skills/creative/baoyu-infographic/references/layouts/bridge.md
  • skills/creative/baoyu-infographic/references/layouts/circular-flow.md
  • skills/creative/baoyu-infographic/references/layouts/comic-strip.md
  • skills/creative/baoyu-infographic/references/layouts/comparison-matrix.md
  • skills/creative/baoyu-infographic/references/layouts/dashboard.md
  • skills/creative/baoyu-infographic/references/layouts/dense-modules.md
  • skills/creative/baoyu-infographic/references/layouts/funnel.md
  • skills/creative/baoyu-infographic/references/layouts/hierarchical-layers.md
  • skills/creative/baoyu-infographic/references/layouts/hub-spoke.md
  • skills/creative/baoyu-infographic/references/layouts/iceberg.md
  • skills/creative/baoyu-infographic/references/layouts/isometric-map.md
  • skills/creative/baoyu-infographic/references/layouts/jigsaw.md
  • skills/creative/baoyu-infographic/references/layouts/linear-progression.md
  • skills/creative/baoyu-infographic/references/layouts/periodic-table.md
  • skills/creative/baoyu-infographic/references/layouts/story-mountain.md
  • skills/creative/baoyu-infographic/references/layouts/structural-breakdown.md
  • skills/creative/baoyu-infographic/references/layouts/tree-branching.md
  • skills/creative/baoyu-infographic/references/layouts/venn-diagram.md
  • skills/creative/baoyu-infographic/references/layouts/winding-roadmap.md
  • skills/creative/baoyu-infographic/references/structured-content-template.md
  • skills/creative/baoyu-infographic/references/styles/aged-academia.md
  • skills/creative/baoyu-infographic/references/styles/bold-graphic.md
  • skills/creative/baoyu-infographic/references/styles/chalkboard.md
  • skills/creative/baoyu-infographic/references/styles/claymation.md
  • skills/creative/baoyu-infographic/references/styles/corporate-memphis.md
  • skills/creative/baoyu-infographic/references/styles/craft-handmade.md
  • skills/creative/baoyu-infographic/references/styles/cyberpunk-neon.md
  • skills/creative/baoyu-infographic/references/styles/hand-drawn-edu.md
  • skills/creative/baoyu-infographic/references/styles/ikea-manual.md
  • skills/creative/baoyu-infographic/references/styles/kawaii.md
  • skills/creative/baoyu-infographic/references/styles/knolling.md
  • skills/creative/baoyu-infographic/references/styles/lego-brick.md
  • skills/creative/baoyu-infographic/references/styles/morandi-journal.md
  • skills/creative/baoyu-infographic/references/styles/origami.md
  • skills/creative/baoyu-infographic/references/styles/pixel-art.md
  • skills/creative/baoyu-infographic/references/styles/pop-laboratory.md
  • skills/creative/baoyu-infographic/references/styles/retro-pop-grid.md
  • skills/creative/baoyu-infographic/references/styles/storybook-watercolor.md
  • skills/creative/baoyu-infographic/references/styles/subway-map.md
  • skills/creative/baoyu-infographic/references/styles/technical-schematic.md
  • skills/creative/baoyu-infographic/references/styles/ui-wireframe.md
  • skills/creative/pixel-art/ATTRIBUTION.md
  • skills/creative/pixel-art/SKILL.md
  • skills/creative/pixel-art/references/palettes.md
  • skills/creative/pixel-art/scripts/__init__.py
  • skills/creative/pixel-art/scripts/palettes.py
  • skills/creative/pixel-art/scripts/pixel_art.py
  • skills/creative/pixel-art/scripts/pixel_art_video.py
  • skills/devops/webhook-subscriptions/SKILL.md
  • skills/leisure/find-nearby/SKILL.md
  • skills/leisure/find-nearby/scripts/find_nearby.py
  • skills/mcp/DESCRIPTION.md
  • skills/mlops/cloud/DESCRIPTION.md
  • skills/mlops/inference/gguf/SKILL.md
  • skills/mlops/inference/llama-cpp/SKILL.md
  • skills/mlops/inference/llama-cpp/references/advanced-usage.md
  • skills/mlops/inference/llama-cpp/references/troubleshooting.md
  • skills/mlops/models/DESCRIPTION.md
  • skills/mlops/training/grpo-rl-training/README.md
  • skills/mlops/training/trl-fine-tuning/SKILL.md
  • skills/mlops/training/trl-fine-tuning/references/grpo-training.md
  • skills/mlops/training/trl-fine-tuning/templates/basic_grpo_training.py
  • skills/productivity/maps/SKILL.md
  • skills/productivity/maps/scripts/maps_client.py
  • skills/social-media/xitter/SKILL.md
  • skills/social-media/xurl/SKILL.md
  • tests/acp/test_ping_suppression.py
  • tests/agent/test_auxiliary_client.py
  • tests/agent/test_auxiliary_client_anthropic_custom.py
  • tests/agent/test_auxiliary_main_first.py
  • tests/agent/test_bedrock_integration.py
  • tests/agent/test_codex_cloudflare_headers.py
  • tests/agent/test_context_compressor.py
  • tests/agent/test_context_references.py
  • tests/agent/test_credential_pool_routing.py
  • tests/agent/test_display.py
  • tests/agent/test_error_classifier.py
  • tests/agent/test_gemini_cloudcode.py
  • tests/agent/test_gemini_native_adapter.py
  • tests/agent/test_insights.py
  • tests/agent/test_memory_provider.py
  • tests/agent/test_memory_user_id.py
  • tests/agent/test_prompt_builder.py
  • tests/agent/test_redact.py
  • tests/agent/test_smart_model_routing.py
  • tests/agent/test_subagent_progress.py
  • tests/cli/test_cli_external_editor.py
  • tests/cli/test_cli_markdown_rendering.py
  • tests/cli/test_cli_provider_resolution.py
  • tests/cli/test_cli_save_config_value.py
  • tests/cli/test_cli_status_bar.py
  • tests/cli/test_cli_user_message_preview.py
  • tests/cli/test_fast_command.py
  • tests/cli/test_gquota_command.py
  • tests/cli/test_manual_compress.py
  • tests/cli/test_quick_commands.py
  • tests/cli/test_reasoning_command.py
  • tests/cli/test_resume_display.py
  • tests/cron/test_codex_execution_paths.py
  • tests/cron/test_scheduler.py
  • tests/e2e/conftest.py
  • tests/e2e/test_discord_adapter.py
  • tests/gateway/restart_test_helpers.py
  • tests/gateway/test_api_server_multimodal.py
  • tests/gateway/test_cancel_background_drain.py
  • tests/gateway/test_channel_directory.py
  • tests/gateway/test_command_bypass_active_session.py
  • tests/gateway/test_dingtalk.py
  • tests/gateway/test_discord_channel_prompts.py
  • tests/gateway/test_discord_race_polish.py
  • tests/gateway/test_discord_send.py
  • tests/gateway/test_dm_topics.py
  • tests/gateway/test_fast_command.py
  • tests/gateway/test_feishu.py
  • tests/gateway/test_feishu_comment.py
  • tests/gateway/test_feishu_comment_rules.py
  • tests/gateway/test_pending_drain_race.py
  • tests/gateway/test_pending_event_none.py
  • tests/gateway/test_proxy_mode.py
  • tests/gateway/test_restart_drain.py
  • tests/gateway/test_restart_redelivery_dedup.py
  • tests/gateway/test_restart_resume_pending.py
  • tests/gateway/test_run_progress_topics.py
  • tests/gateway/test_runner_startup_failures.py
  • tests/gateway/test_running_agent_session_toggles.py
  • tests/gateway/test_safe_adapter_disconnect.py
  • tests/gateway/test_session_race_guard.py
  • tests/gateway/test_session_store_prune.py
  • tests/gateway/test_signal.py
  • tests/gateway/test_slack.py
  • tests/gateway/test_status.py
  • tests/gateway/test_status_command.py
  • tests/gateway/test_steer_command.py
  • tests/gateway/test_stream_consumer.py
  • tests/gateway/test_telegram_documents.py
  • tests/gateway/test_telegram_mention_boundaries.py
  • tests/gateway/test_text_batching.py
  • tests/gateway/test_unauthorized_dm_behavior.py
  • tests/gateway/test_voice_command.py
  • tests/gateway/test_voice_mode_platform_isolation.py
  • tests/gateway/test_webhook_deliver_only.py
  • tests/gateway/test_webhook_signature_rate_limit.py


@github-actions

github-actions Bot commented May 5, 2026


🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

Wizarck pushed a commit that referenced this pull request May 6, 2026
* ci(nix): auto-fix stale npm hashes on push to main

When a PR merges to main with updated package-lock.json or package.json
in ui-tui/ or web/, the new auto-fix-main job detects stale npmDepsHash
values and pushes a fix commit directly to main.

This eliminates the recurring manual hash-bump PRs (NousResearch#15420, NousResearch#15314,
NousResearch#15272, NousResearch#15244) by reusing the existing fix-lockfiles --apply pipeline.

The fix commit only touches nix/*.nix files, which are outside the push
path filter (package-lock.json / package.json), so it cannot re-trigger
itself.

Closes NousResearch#15314

* fix(ci): use GitHub App token for auto-fix-main push

GITHUB_TOKEN commits are invisible to workflow triggers (GitHub's
infinite-loop prevention). The auto-fix-main job pushes directly to
main, so the fix commit never triggered downstream nix.yml verification.

Mint a short-lived token via the repo's GitHub App (daimon-nous, APP_ID
+ APP_PRIVATE_KEY secrets) so the push is treated as a real event and
nix.yml fires to verify the corrected hashes.

Tested via workflow_dispatch dry-run: app token minted successfully,
checkout with app token succeeded, fix job correctly gated.

Resolves review feedback from Bugbot (r3144569551).

* ci(nix): rename lockfile check job for required status check

Rename 'check' → 'nix-lockfile-check' so the status check name is
unambiguous when added as a required check on main.

* fix(ci): harden auto-fix-main against races, loops, and silent failures

Address adversarial review findings:

1. Race condition (#1): Job-level concurrency with cancel-in-progress
   collapses back-to-back pushes; ref: main checkout always gets latest
   branch state; explicit push target (origin HEAD:main).

2. Loop prevention (#2): File-whitelist check before commit aborts if
   any file outside nix/{tui,web}.nix was modified, preventing
   accidental self-triggering.

3. Silent infra failures (#8): nix-lockfile-check now fails explicitly
   when fix-lockfiles exits without reporting stale status (catches nix
   setup failures, network errors, script bugs that bypass continue-on-error).

4. Commit traceability (#11): Auto-fix commits include source SHA and
   workflow run URL in the commit body.

5. Explicit push target (#12): git push origin HEAD:main instead of
   bare git push.

---------

Co-authored-by: alt-glitch <alt-glitch@users.noreply.github.com>
Wizarck pushed a commit that referenced this pull request May 6, 2026
…ch#16706)

* fix(tui): drop stale stream events after ctrl-c interrupt

Once interruptTurn() flips this.interrupted, only recordMessageDelta
short-circuited.  recordReasoningDelta/Available, recordToolStart/
Progress/Complete, and recordInlineDiffToolComplete kept populating
turnState until the python loop reached its next _interrupt_requested
check (~1s on busy turns), making it look like ctrl-c was ignored
while late "thinking" + tool calls kept landing in the UI.

Add the same interrupted guard to every stream-side recorder, and
clear the flag at startMessage() so the next turn isn't suppressed
if the previous turn never delivered message.complete.
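
The guard pattern, rendered in Python purely for illustration (the TUI itself is not Python, and the class and method names here are hypothetical analogues of the recorders named above):

```python
class TurnRecorder:
    """Drops late stream events once the current turn has been interrupted."""

    def __init__(self) -> None:
        self.interrupted = False
        self.reasoning = []
        self.tools = []

    def start_message(self) -> None:
        # Clear the flag so a new turn is never suppressed by a stale interrupt.
        self.interrupted = False

    def interrupt_turn(self) -> None:
        self.interrupted = True

    def record_reasoning_delta(self, text: str) -> None:
        if self.interrupted:
            return  # late event from the interrupted turn: ignore
        self.reasoning.append(text)

    def record_tool_start(self, name: str) -> None:
        if self.interrupted:
            return
        self.tools.append(name)
```

Every stream-side recorder gets the same early return, so no single event type can leak into the UI after Ctrl-C.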

* fix(tui): guard recordTodos against post-interrupt mutation; fake-timers in test

Copilot review on PR NousResearch#16706:

1. `recordToolStart` is interruption-guarded, but `tool.start`
   handler also calls `recordTodos(payload.todos)` first — so a
   late tool.start carrying todos could still mutate `turnState.todos`
   after Ctrl-C, leaving ghost rows in the panel.  Adds the same
   `if (this.interrupted) return` early-exit to `recordTodos` so
   *all* tool.start side-effects are dropped post-interrupt.

2. The interrupt test was leaking a real `setTimeout` (interrupt
   cooldown) across test files, which could fire later and mutate
   uiStore from the wrong test context.  Wraps the test in
   `vi.useFakeTimers()` + `vi.runAllTimers()` and restores real
   timers in finally.

3. Extends the same test with a todos payload on the post-interrupt
   tool.start so we have explicit regression coverage for #1.

* fix(tui): guard pushTrail post-interrupt; harden interrupt-test cleanup

Round 2 Copilot review on PR NousResearch#16706:

1. `tool.generating` events route through `pushTrail`, which was not
   interruption-guarded — late events could still write 'drafting …'
   into `turnTrail` after Ctrl-C, leaving a stale shimmer in the UI.
   Adds the same `if (this.interrupted) return` early-exit.

2. Test cleanup moved `vi.runAllTimers()` into `finally` (before
   `vi.useRealTimers()`) so a mid-test assertion failure can't leak
   the interrupt-cooldown setTimeout across other test files.

3. Replaced the misleading 'pre-interrupt todos … expected to be
   cleared by the interrupt cycle' comment with an accurate one
   reflecting current behaviour (interrupt does NOT clear todos).

4. Added an explicit assertion that a post-interrupt `tool.generating`
   event does not extend `turnTrail` — regression coverage for #1.
Wizarck added a commit that referenced this pull request May 6, 2026
Sync of fork with upstream/main (first sync since #1 WABA merge). Resolved
~219 file conflicts; ~210 of them were upstream-only changes (refactors of
agent/, tests/, hermes_cli/, web/, ui-tui/, nix/, docs/, etc.) where we had
no fork-specific changes — took theirs. The 10 WABA-touchpoint files needed
manual resolution to preserve the integration:

  * gateway/config.py — kept Platform.WHATSAPP_VIA_MCP_META_BUSINESS_API +
    env loader; added Platform.WEB_VIA_HTTP_SSE for the new platform.
  * gateway/run.py — kept WABA factory + auth allowlists; added WEB_VIA_HTTP_SSE
    factory + auth allowlists.
  * agent/prompt_builder.py — kept WABA PLATFORM_HINTS; added web_via_http_sse hint.
  * toolsets.py — restored hermes-whatsapp-via-mcp-meta-business-api toolset
    (upstream refactor dropped it) + added hermes-web-via-http-sse; both
    re-included in the hermes-gateway composite.
  * cron/scheduler.py, tools/send_message_tool.py, hermes_cli/{status,gateway}.py,
    website/docusaurus.config.ts — upstream refactored these to use the
    Platform enum dynamically (via _missing_) instead of explicit per-platform
    static maps; took theirs since WABA dispatch now works via the enum.

Net new: gateway/platforms/web_via_http_sse.py (~430 LOC, generic SSE
platform mirroring the WABA structure — new file, no conflict).

Verified: ast.parse passes for all modified Python files; gateway/platforms/
whatsapp_via_mcp_meta_business_api.py untouched (binary-equivalent); no merge
markers remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wizarck pushed a commit that referenced this pull request May 15, 2026
…registries

Both web_search_registry._resolve() and image_gen_registry.get_active_provider()
walked their registered providers and returned the first one matching the
capability flag — without checking whether that provider was actually
usable. On a fresh install with no credentials at all, this meant
get_active_search_provider() returned `brave-free` (legacy preference
order) even though BRAVE_SEARCH_API_KEY was unset, leading the
dispatcher to surface a "BRAVE_SEARCH_API_KEY is not set" error for a
provider the user never chose. Same bug shape in image_gen for FAL.

Resolution semantics now match tools.web_tools._get_backend():

  1. Explicit config name wins, ignoring is_available() — the dispatcher
     surfaces a precise "X_API_KEY is not set" error rather than silently
     switching backends. Matches user expectation: "I configured X, tell
     me what's wrong with X."
  2. Fallback (no explicit config) walks the legacy preference order
     filtered by is_available() — pick the highest-priority backend the
     user actually has credentials for.

is_available() is wrapped in a try/except so a buggy provider doesn't
brick resolution.
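
The two rules can be sketched as below. This is an illustration under assumptions, not the registry code: the `Provider` class, the `resolve` signature, and the behaviour for an unregistered explicit name (returning None) are hypothetical.

```python
from typing import Callable, Dict, List, Optional


class Provider:
    def __init__(self, name: str, available: Callable[[], bool]) -> None:
        self.name = name
        self._available = available

    def is_available(self) -> bool:
        return self._available()


def _safe_is_available(provider: Provider) -> bool:
    # A provider whose availability check raises is treated as unavailable.
    try:
        return bool(provider.is_available())
    except Exception:
        return False


def resolve(
    providers: Dict[str, Provider],
    preference: List[str],
    configured: Optional[str] = None,
) -> Optional[Provider]:
    # Rule 1: an explicit config name wins, without consulting is_available().
    if configured:
        return providers.get(configured)
    # Rule 2: otherwise walk the preference order, keeping only usable providers.
    for name in preference:
        provider = providers.get(name)
        if provider is not None and _safe_is_available(provider):
            return provider
    return None


def _boom() -> bool:
    raise RuntimeError("buggy provider")


registry = {
    "brave-free": Provider("brave-free", lambda: False),  # no API key set
    "buggy": Provider("buggy", _boom),
}
```

With this shape, `resolve(registry, ["brave-free", "buggy"])` yields no provider, while passing `configured="brave-free"` still returns it so the dispatcher can report the missing key.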

E2E verified:
  - No creds + no config: get_active_search_provider() -> None
  - Explicit brave-free + no key: get_active_search_provider() -> brave-free
    (and .is_available() correctly reports False)

This fix was identified during the spike (NousResearch#25182 finding #1) and is
folded into the same PR rather than shipped as a follow-up.