Skip to content

fix(hermes): add Discord facade#3293

Merged
ericksoa merged 13 commits into
mainfrom
aerickson/pr-3238-main-discord-facade-replay
May 9, 2026
Merged

fix(hermes): add Discord facade#3293
ericksoa merged 13 commits into
mainfrom
aerickson/pr-3238-main-discord-facade-replay

Conversation

@ericksoa

@ericksoa ericksoa commented May 8, 2026

Copy link
Copy Markdown
Contributor

Summary

Commit shape

Local validation

  • python3 -m py_compile agents/hermes/discord-facade.py agents/hermes/decode-proxy.py
  • bash -n agents/hermes/start.sh test/e2e/test-hermes-discord-e2e.sh
  • npm run build:cli
  • npx vitest run test/hermes-discord-facade.test.ts test/generate-hermes-config.test.ts src/lib/agent/runtime.test.ts test/sandbox-init.test.ts test/sandbox-provisioning.test.ts test/policies.test.ts
  • git diff --check

Original PRs: #3153, #3238

Signed-off-by: Aaron Erickson aerickson@nvidia.com

Summary by CodeRabbit

  • New Features

    • Local Discord facade added to emulate Discord Gateway+REST in the Hermes sandbox; runtime now exposes Discord proxy/facade URLs and augments PYTHONPATH for preload routing.
    • Sandbox policies updated to allow facade preload files and required Discord network routes.
  • Tests

    • Extensive unit and E2E coverage added for gateway auth boundary, REST forwarding/polling, interaction signature handling/token localization, preload routing, startup/init, and sandbox policy validation.

@coderabbitai

coderabbitai Bot commented May 8, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 283b1c01-98ba-4bb1-9f79-aab21b313390

📥 Commits

Reviewing files that changed from the base of the PR and between f7550cb and e6e5ead.

📒 Files selected for processing (1)
  • test/e2e/test-hermes-discord-e2e.sh

📝 Walkthrough

Walkthrough

This PR adds a local Discord REST/Gateway facade service to Hermes, a preload that rewrites aiohttp Discord traffic to the facade, Dockerfile and sandbox policy updates, start.sh wiring to run the facade and export Discord envs/PYTHONPATH, runtime recovery updates, and tests covering facade behavior and policies.

Changes

Discord Facade for Hermes Gateway Emulation

Layer / File(s) Summary
Data Contracts & Configuration
agents/hermes/config/messaging-config.ts, agents/hermes/Dockerfile
Adds internal Discord env constants; buildMessagingEnvLines() emits DISCORD_PROXY, NEMOCLAW_DISCORD_FACADE_URL, and NEMOCLAW_DISCORD_GUILD_IDS when discord is enabled; Dockerfile docs updated.
Build & Security Policy
agents/hermes/Dockerfile, agents/hermes/policy-additions.yaml, agents/hermes/policy-permissive.yaml
Installs nemoclaw-discord-facade and discord-preload/ with permissions; policies allowlist preload path and add scoped Discord network routes (gateway, commands, messages, reactions, webhooks) with explicit tls attributes where specified.
Preload & Decode-Proxy Clarifications
agents/hermes/discord-preload/sitecustomize.py, agents/hermes/decode-proxy.py
sitecustomize.py monkey-patches aiohttp to rewrite Discord REST (/api*) and gateway WebSocket (gateway.discord.gg) URLs to the local facade; decode-proxy.py docstring clarifies it does not inspect/modify WebSocket frames.
Sandbox Startup
agents/hermes/start.sh
Adds start_discord_facade(), uses Hermes venv python for proxy/facade, exports NEMOCLAW_DISCORD_FACADE_URL, prepends preload to PYTHONPATH, injects DISCORD_PROXY/facade into proxy env, launches facade before Hermes gateway, and tracks facade PID.
Facade Core Implementation
agents/hermes/discord-facade.py
aiohttp server with /gateway WebSocket handling (IDENTIFY/RESUME enforcement), READY/RESUMED opcodes, session/sequence tracking, broadcast dispatch, and lifecycle management.
REST Routing & Interaction Handling
agents/hermes/discord-facade.py
Handles gateway status and @me, application command storage, interaction webhook signature verification and token localization, webhook/callback token rewriting, forwarding to discord.com with hop-by-hop header filtering and token-redacted logs.
Polling & Normalization
agents/hermes/discord-facade.py
Polling loop and guild/channel discovery, per-channel message and thread polling, message normalization/change detection, reaction-count delta computation, and synthesized gateway events (MESSAGE_/THREAD_ and reaction add/remove).
Optional Tunnel & Main Wiring
agents/hermes/discord-facade.py
Optional cloudflared tunnel support, public interactions listener, interactions endpoint registration, main() wiring, and graceful shutdown.
Runtime Recovery & CLI Wiring
src/lib/agent/runtime.ts, src/lib/agent/runtime.test.ts
hermesGatewayEnvPrefix() includes Discord proxy/facade/PYTHONPATH; recovery command uses venv python, readiness checks for proxy (3129) and facade (3130), log setup, and retry loops; tests updated to assert presence of facade-related env/commands.
Unit Tests: Facade & Preload
test/hermes-discord-facade.test.ts
Vitest tests exercise in-process facade/preload: gateway IDENTIFY validation, REST forwarding, polling→event synthesis, Ed25519 signature verification/token localization, callback token mapping, and aiohttp rewriting.
Unit Tests: Config, Policy & Provisioning
test/generate-hermes-config.test.ts, test/policies.test.ts, test/sandbox-provisioning.test.ts
Validates generated .env contains Discord proxy/facade/guild IDs; policy YAML tests enforce gateway L4 WebSocket shape and scoped REST mutation allowlist; preload path allowlisted in policies.
Integration & E2E Tests
test/e2e/test-hermes-discord-e2e.sh, test/e2e/test-messaging-providers.sh, test/sandbox-init.test.ts
E2E checks for facade /health, gateway protocol (HELLO/READY/heartbeat), token isolation, env/guild propagation, gateway-auth-boundary log confirmation, and start.sh assertions (venv python usage, privileged launch, logs).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

I nibble code beneath the moonlit log,
A rabbit patches paths and guards the proxy bog,
Facade hums local, tokens tucked away,
Polls turn to events and gateways softly play,
Sandbox hops happy — bravo, tiny cog! 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(hermes): add Discord facade' directly describes the main change—adding Discord facade functionality to Hermes. It is concise, specific, and clearly summarizes the primary objective.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch aerickson/pr-3238-main-discord-facade-replay

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agents/hermes/discord-facade.py`:
- Around line 160-163: The public tunnel currently exposes the full facade
(routes registered via web.get("/gateway", self.handle_gateway), web.route("*",
"/api/{tail:.*}", self.handle_rest)) which allows _forward_rest() to accept
caller headers (including Authorization) and lets external users drive
bot-authenticated calls via _run_tunnel_command(); restrict the public tunnel to
only expose web.post("/interactions", self.handle_interaction) or instead create
a separate aiohttp/web listener for the public webhook and move
handle_gateway/handle_rest (and any _forward_rest logic) behind an auth gate
that strips/validates Authorization headers before proxying. Update the tunnel
registration logic in _run_tunnel_command and routing setup so only
handle_interaction is published, or implement a new private listener and attach
handle_gateway/handle_rest there with header sanitization.
- Around line 625-634: _handle_interaction_callback currently returns a local
204 for callbacks whose local_token is in self._interaction_tokens, which
discards the bot's real callback instead of forwarding it; update the logic so
that when local_token matches an entry in self._interaction_tokens you map it to
the original Discord token and forward the request upstream via _forward_rest
(using the mapped real token) instead of immediately returning 204, or
alternatively remove the eager ACK in handle_interaction so the bot performs the
initial {"type":5} itself; locate references to _interaction_tokens,
handle_interaction, _handle_interaction_callback, _forward_rest and _read_bytes
and implement the token mapping + forwarding path or the removal of the eager
ACK consistently (also apply the same fix to the duplicated block at lines
649-662).
- Around line 343-382: The MESSAGE_DELETE synthesis in _poll_channel_messages is
producing false deletes because it diffs the cached IDs against only the latest
25 messages; to mitigate, change _poll_channel_messages so it does not
synthesize deletes when the fetched page is saturated (len(messages) == 25) —
i.e., if the page is full, skip the loop that computes previous_ids -
current_ids and emitting MESSAGE_DELETE via dispatch_to_all; keep all other
behavior (normal CREATE/UPDATE/reaction handling and updating
_channel_message_ids and _message_cache) the same so the facade remains accurate
until a proper paginated/tombstone strategy is implemented.

In `@test/hermes-discord-facade.test.ts`:
- Around line 11-25: The test helpers leak parent DISCORD_* and TELEGRAM_*
environment variables into child Python processes; update both the
hasCryptography spawnSync call and the runPython function to use a sanitized
child environment: create sanitizedEnv by copying process.env but removing any
keys starting with "DISCORD_" or "TELEGRAM_", then when invoking spawnSync pass
env: { ...sanitizedEnv, ...env } (for hasCryptography pass sanitizedEnv as the
env arg, and for runPython use sanitizedEnv as the base before layering the
function's env parameter) so tests remain hermetic; reference hasCryptography
and runPython when making this change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6bb60e3e-cd76-4466-ac14-87440d193026

📥 Commits

Reviewing files that changed from the base of the PR and between 30545bb and 6e77d48.

📒 Files selected for processing (17)
  • agents/hermes/Dockerfile
  • agents/hermes/config/messaging-config.ts
  • agents/hermes/decode-proxy.py
  • agents/hermes/discord-facade.py
  • agents/hermes/discord-preload/sitecustomize.py
  • agents/hermes/policy-additions.yaml
  • agents/hermes/policy-permissive.yaml
  • agents/hermes/start.sh
  • src/lib/agent/runtime.test.ts
  • src/lib/agent/runtime.ts
  • test/e2e/test-hermes-discord-e2e.sh
  • test/e2e/test-messaging-providers.sh
  • test/generate-hermes-config.test.ts
  • test/hermes-discord-facade.test.ts
  • test/policies.test.ts
  • test/sandbox-init.test.ts
  • test/sandbox-provisioning.test.ts

Comment thread agents/hermes/discord-facade.py
Comment thread agents/hermes/discord-facade.py
Comment thread agents/hermes/discord-facade.py
Comment thread test/hermes-discord-facade.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
agents/hermes/discord-facade.py (1)

846-858: 💤 Low value

Store reference to background task for proper lifecycle management.

The task created by asyncio.create_task(_capture_url()) is not stored, which means:

  1. Exceptions raised in _capture_url will be silently ignored
  2. The task cannot be awaited or cancelled during shutdown
  3. Risk of garbage collection (though unlikely here due to stdout reference)
♻️ Suggested fix

Return the task from the function and manage it in main():

-    asyncio.create_task(_capture_url())
-    return proc
+    capture_task = asyncio.create_task(_capture_url())
+    return proc, capture_task

Then in main(), store and optionally cancel on shutdown.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agents/hermes/discord-facade.py` around lines 846 - 858, The background task
started with asyncio.create_task(_capture_url()) inside the function that
currently returns only proc should be preserved and surfaced so it can be
awaited/cancelled and so exceptions aren't lost; change the call site that
creates the task (the asyncio.create_task(_capture_url()) line) to capture the
Task object (e.g., task = asyncio.create_task(_capture_url())), ensure the
coroutine _capture_url has proper try/except logging or attach a done callback
to log exceptions, and return or attach that Task alongside the proc (or return
a tuple (proc, task)) so callers (e.g., main()) can store, await, and cancel the
task during shutdown.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agents/hermes/discord-facade.py`:
- Line 136: The _interaction_tokens dict in discord-facade.py is unbounded
(tokens are inserted in _localize_interaction_token and never removed); change
it to a bounded/TLL-aware structure and prune expired entries: replace or wrap
self._interaction_tokens with a TTL-aware cache (e.g., a small LRU or TTL cache
keyed by interaction id with 15-minute expiry) or store values as (token,
expiry_timestamp) and in _localize_interaction_token (and any lookup paths)
remove stale entries before inserting/returning; ensure cleanup runs on
insert/lookup to prevent growth and keep behavior identical otherwise.

---

Nitpick comments:
In `@agents/hermes/discord-facade.py`:
- Around line 846-858: The background task started with
asyncio.create_task(_capture_url()) inside the function that currently returns
only proc should be preserved and surfaced so it can be awaited/cancelled and so
exceptions aren't lost; change the call site that creates the task (the
asyncio.create_task(_capture_url()) line) to capture the Task object (e.g., task
= asyncio.create_task(_capture_url())), ensure the coroutine _capture_url has
proper try/except logging or attach a done callback to log exceptions, and
return or attach that Task alongside the proc (or return a tuple (proc, task))
so callers (e.g., main()) can store, await, and cancel the task during shutdown.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b02976f5-5219-44f0-9a11-5bc998d42e18

📥 Commits

Reviewing files that changed from the base of the PR and between 6e77d48 and be9b2b6.

📒 Files selected for processing (2)
  • agents/hermes/discord-facade.py
  • test/hermes-discord-facade.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/hermes-discord-facade.test.ts

Comment thread agents/hermes/discord-facade.py Outdated
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa self-assigned this May 8, 2026
@ericksoa ericksoa added v0.0.38 integration: hermes Hermes integration behavior bug Something fails against expected or documented behavior integration: discord Discord integration or channel behavior labels May 8, 2026
@github-actions

github-actions Bot commented May 8, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 25583784468
Branch: aerickson/pr-3238-main-discord-facade-replay
Requested jobs: hermes-discord-e2e,hermes-e2e,messaging-providers-e2e,token-rotation-e2e,rebuild-hermes-e2e,rebuild-hermes-stale-base-e2e
Summary: 5 passed, 1 failed, 0 skipped

Job Result
hermes-discord-e2e ❌ failure
hermes-e2e ✅ success
messaging-providers-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-hermes-stale-base-e2e ✅ success
token-rotation-e2e ✅ success

Failed jobs: hermes-discord-e2e. Check run artifacts for logs.

@github-actions

github-actions Bot commented May 9, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 25587605524
Branch: aerickson/pr-3238-main-discord-facade-replay
Requested jobs: hermes-discord-e2e,hermes-e2e,messaging-providers-e2e,token-rotation-e2e,rebuild-hermes-e2e,rebuild-hermes-stale-base-e2e
Summary: 6 passed, 0 failed, 0 skipped

Job Result
hermes-discord-e2e ✅ success
hermes-e2e ✅ success
messaging-providers-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-hermes-stale-base-e2e ✅ success
token-rotation-e2e ✅ success

@ericksoa ericksoa merged commit 1bfc136 into main May 9, 2026
16 of 17 checks passed
@ericksoa

ericksoa commented May 9, 2026

Copy link
Copy Markdown
Contributor Author

Post-merge housekeeping: the superseded source/replay PRs have been closed and labeled now that this main-based replay landed.

Thanks again to Ben Barclay and the NousResearch team for the partnership on this Hermes Discord work. Their follow-up in #3238 helped get the facade path from plausible to actually wired through the Hermes sandbox runtime, and that work is represented in this merged replay alongside the maintainer conflict-resolution and validation commits.

@wscurran wscurran added the bug-fix PR fixes a bug or regression label Jun 3, 2026
@wscurran wscurran removed the bug Something fails against expected or documented behavior label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression integration: discord Discord integration or channel behavior integration: hermes Hermes integration behavior

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants