fix(hermes): normalize Telegram send_message pseudo-call leaks by chengjiew · Pull Request #4175 · NVIDIA/NemoClaw

chengjiew · 2026-05-25T07:54:03Z

Summary

Fixes the NemoHermes first Telegram reply path so a raw send_message: "to telegram: ..." pseudo-tool response is not delivered to the user as plain text.

The PR adds both a first-turn prompt guard for messaging platforms and a narrow final-response normalizer for targeted raw messaging pseudo-calls.

Related Issue

Fixes #3893

Changes

Normalize whole-response raw messaging pseudo-calls such as send_message: "to telegram: Hello" into the actual message body before gateway delivery.
Add first-turn NemoClaw context telling messaging agents to reply to the current chat with normal assistant text, and reserve send_message for explicit cross-platform delivery requests.
Add Hermes plugin regression coverage for the normalizer and Telegram first-turn context.
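
The first bullet can be illustrated with a minimal sketch. This is not the plugin's code: the names `MESSAGING_PLATFORMS`, `RAW_CALL_RE`, and `normalize_raw_send_message`, and the contents of the allowlist, are assumptions made for illustration. It only rewrites a response that consists entirely of one pseudo-call, and it does not yet check which chat the response belongs to, which is the gap raised later in review.

```python
import re

# Hypothetical allowlist; the plugin's real platform list is not shown on this page.
MESSAGING_PLATFORMS = ("telegram", "discord", "slack")

# Matches only when the WHOLE response is a raw pseudo-call such as
#   send_message: "to telegram: Hello"
# The quote group is optional and must be closed by the same character.
RAW_CALL_RE = re.compile(
    r"""^\s*send_message:\s*(?P<quote>["']?)to\s+(?P<platform>[a-z]+):\s*(?P<body>.*?)(?P=quote)\s*$""",
    re.IGNORECASE | re.DOTALL,
)


def normalize_raw_send_message(text: str) -> str:
    """Return the message body if text is a whole-response pseudo-call, else text unchanged."""
    match = RAW_CALL_RE.match(text)
    if match is None:
        return text  # ordinary assistant text, or the pseudo-call is only part of it
    if match.group("platform").lower() not in MESSAGING_PLATFORMS:
        return text  # unknown target: leave it alone
    body = match.group("body").strip()
    return body or text  # never turn a response into an empty string
```

Anchoring the pattern at both ends keeps the normalizer narrow: prose that merely mentions `send_message:` passes through untouched.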

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Additional verification run locally:

npm test -- test/hermes-plugin-handlers.test.ts test/generate-hermes-config.test.ts
python3 -m py_compile agents/hermes/plugin/__init__.py
npm run build:cli
git diff --check HEAD~1..HEAD

Note: local pre-commit was not used for the final commit because its CLI coverage hook recursively triggered hooks through a temporary Git fixture. The targeted checks above were run explicitly after rebasing onto latest origin/main.

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>

Summary by CodeRabbit

Bug Fixes
- Prevented raw tool-like messaging strings from appearing in chat when Hermes is used as a gateway for messaging platforms.
Improvements
- Normalized and cleaned messaging outputs so assistant replies appear as natural chat text; cross-platform sends occur only when explicitly requested and platform-aware behavior is maintained across sessions.
Tests
- Added automated tests to verify messaging normalization and correct platform-specific reply behavior.

copy-pr-bot · 2026-05-25T07:54:07Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

coderabbitai · 2026-05-25T07:54:15Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 94c4bf01-b088-4092-b418-c56d918490b3

📥 Commits

Reviewing files that changed from the base of the PR and between f43e6e3 and fb02ff2.

📒 Files selected for processing (2)

agents/hermes/plugin/__init__.py
test/hermes-plugin-handlers.test.ts

🚧 Files skipped from review as they are similar to previous changes (2)

test/hermes-plugin-handlers.test.ts
agents/hermes/plugin/__init__.py

Walkthrough

Detects and extracts intended message bodies from raw send_message: pseudo-tool outputs for same-platform deliveries, patches the agent think-block stripper to return normalized text, and injects a platform-aware NemoClaw context to avoid raw tool-like final responses; tests cover normalization and context injection.
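
The context-injection half of the walkthrough might look roughly like the sketch below. The function name `build_context_lines`, the allowlist contents, the placeholder base line, and the exact instruction wording are all assumptions; only the behavior (append a platform-specific reply instruction for known messaging adapters on the first turn) comes from the description above.

```python
from typing import List, Optional

# Hypothetical allowlist of messaging adapters.
MESSAGING_PLATFORMS = frozenset({"telegram", "discord", "slack"})


def build_context_lines(platform: Optional[str], is_first_turn: bool) -> List[str]:
    """Build ordered context lines, adding a reply instruction for messaging chats."""
    # Placeholder for whatever context the plugin already injects.
    lines = ["You are running inside a NemoClaw sandbox."]
    if is_first_turn and platform in MESSAGING_PLATFORMS:
        lines.append(
            f"You are replying in a {platform} chat. "
            "Reply to the current chat with normal assistant text. "
            "Use send_message only when the user explicitly asks for delivery "
            "to another platform, and never output raw `send_message:` or "
            "`to <platform>:` strings as your final answer."
        )
    return lines
```

Building a list and joining it at the end keeps the line order explicit, so a conditional instruction can be appended without reshuffling the existing context.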

Changes

Messaging Platform Response Handling

| Layer / File(s) | Summary |
| --- | --- |
| Message Response Normalization and Patching `agents/hermes/plugin/__init__.py` | Adds `re` import, patch guard, platform identifiers and regexes, helpers to strip quotes and extract message bodies, current-platform tracking, and installs a guarded monkeypatch on `run_agent.AIAgent._strip_think_blocks` at register, `_pre_llm_call`, and `on_session_start`. |
| Context Instruction for Messaging Platforms `agents/hermes/plugin/__init__.py` | Refactors NemoClaw context construction to build ordered context lines and conditionally append a platform-specific instruction that tells the model to reply with normal assistant text for known messaging adapters and to avoid emitting raw `send_message:` / `to <platform>:` strings as final chat output. |
| Test Validation `test/hermes-plugin-handlers.test.ts` | Adds two Vitest tests (via Python harness): one verifies targeted/untargeted `send_message` normalization, cross-platform and unknown-platform blocking, and patched `_strip_think_blocks` behavior; the other verifies `_pre_llm_call` injects the Telegram-first-turn reply constraint. |
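
Because the patch is installed from three hooks, it has to be safe to call repeatedly. A minimal sketch of such a guarded monkeypatch follows, assuming a stand-in `AIAgent` class, a hypothetical marker attribute `_nemoclaw_messaging_patch`, and a deliberately simplified `normalize` placeholder; none of these reflect the plugin's actual implementation.

```python
import functools


class AIAgent:
    """Stand-in for run_agent.AIAgent; the real class is not shown on this page."""

    def _strip_think_blocks(self, text: str) -> str:
        return text.replace("<think>hidden</think>", "")


_PATCH_FLAG = "_nemoclaw_messaging_patch"  # hypothetical marker attribute


def normalize(text: str) -> str:
    """Simplified placeholder for the real response normalizer."""
    prefix = 'send_message: "to telegram: '
    if text.startswith(prefix) and text.endswith('"'):
        return text[len(prefix):-1]
    return text


def install_patch(cls) -> bool:
    """Wrap cls._strip_think_blocks once; return False if already wrapped."""
    original = cls._strip_think_blocks
    if getattr(original, _PATCH_FLAG, False):
        return False  # guard: a second install would double-wrap the method

    @functools.wraps(original)
    def patched(self, text, *args, **kwargs):
        # Run the original stripper first, then normalize its result.
        return normalize(original(self, text, *args, **kwargs))

    setattr(patched, _PATCH_FLAG, True)
    cls._strip_think_blocks = patched
    return True
```

Marking the wrapper itself, rather than keeping a module-level boolean, means the guard still works if the module is reloaded while the class stays patched.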

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through logs at break of dawn,
Found raw calls where chat belonged,
With regex teeth and patching paw,
I trimmed the quotes and fixed the flaw,
Now Hermes greets with proper song.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 58.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main fix: normalizing Telegram send_message pseudo-call leaks, which aligns perfectly with the core objective to prevent raw send_message text from leaking to users. |
| Linked Issues check | ✅ Passed | The code changes fully address issue `#3893` by normalizing send_message pseudo-calls for Telegram and adding first-turn context instructions to prevent raw tool text from appearing in user-facing responses. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to Hermes plugin initialization and test coverage, directly addressing the Telegram send_message leak issue without introducing unrelated modifications. |

github-actions · 2026-05-25T07:55:29Z

E2E Advisor Recommendation

Required E2E: hermes-e2e, hermes-discord-e2e, hermes-slack-e2e
Optional E2E: messaging-providers-e2e, hermes-inference-switch-e2e

Dispatch hint: hermes-e2e,hermes-discord-e2e,hermes-slack-e2e

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

hermes-e2e (high): Required because the Hermes plugin changed. This job runs install → onboard --agent hermes → sandbox health → live inference, validating plugin load and real Hermes assistant behavior in an OpenShell sandbox.
hermes-discord-e2e (high): Required because the change is explicitly platform-aware messaging response handling. This is existing Hermes Discord E2E coverage for the Hermes messaging onboarding/config/gateway path and is one of the closest real-channel checks for the new current-platform guard.
hermes-slack-e2e (high): Required because cross-platform send_message normalization now depends on the active messaging platform. Existing Hermes Slack E2E validates Hermes Slack onboarding, policy, provider placeholder handling, and sandbox integration for this affected platform.

Optional E2E

messaging-providers-e2e (high): Optional adjacent confidence for the broader Telegram/Discord/Slack messaging credential provider and L7 proxy chain. The PR does not change credential injection or network policy, but the runtime behavior is messaging-related.
hermes-inference-switch-e2e (high): Optional confidence that Hermes remains healthy after inference configuration changes, since the modified pre_llm_call context includes provider/model/gateway state used during live inference turns.

New E2E recommendations

hermes-telegram-first-message (high): The PR specifically references a first-turn messaging race and examples use Telegram, but there is no existing Hermes Telegram E2E job analogous to hermes-discord-e2e or hermes-slack-e2e. Add a Hermes Telegram E2E that onboards Hermes with Telegram enabled and verifies first inbound reply text is delivered as normal assistant text, while cross-platform raw send_message output is not silently delivered to the wrong chat.
- Suggested test: Add hermes-telegram-e2e using a fake or hermetic Telegram gateway path plus an optional real-token inbound reply mode.
hermes-messaging-response-normalization (medium): Current unit tests cover the normalizer and hook anchoring with stubs, but existing E2Es do not appear to assert that real Hermes gateway first-message responses avoid leaking raw send_message: ... final text.
- Suggested test: Extend Hermes messaging E2Es to assert first-turn Discord/Slack/Telegram assistant replies do not surface raw send_message pseudo-calls and preserve cross-platform targets as an error/dispatch path.

Dispatch hint

Workflow: .github/workflows/nightly-e2e.yaml
jobs input: hermes-e2e,hermes-discord-e2e,hermes-slack-e2e

github-actions · 2026-05-25T07:55:30Z

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-hermes-discord, ubuntu-repo-cloud-hermes-slack
Optional scenario E2E: ubuntu-repo-cloud-hermes

Dispatch required scenario E2E:

gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-discord
gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-slack

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

ubuntu-repo-cloud-hermes-discord: Hermes plugin changes alter messaging-platform grounding and raw send_message response normalization; the Hermes Discord scenario exercises the Hermes messaging adapter path on a dispatchable scenario route.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-discord
ubuntu-repo-cloud-hermes-slack: Hermes plugin changes alter messaging response filtering and cross-platform target handling; the Hermes Slack scenario provides targeted coverage for a second Hermes messaging adapter path.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-slack

Optional scenario E2E

ubuntu-repo-cloud-hermes: Optional baseline Hermes scenario to verify the plugin still loads and the Hermes sandbox remains healthy outside messaging-adapter onboarding.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes

Relevant changed files

agents/hermes/plugin/__init__.py

github-actions · 2026-05-25T07:57:33Z

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Top item: PR review advisor unavailable

Review findings

🛠️ Needs attention

None.

🔎 Worth checking

PR review advisor unavailable: The automated advisor could not complete: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt
- Recommendation: Re-run the PR Review Advisor or perform a manual review.
- Evidence: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt

🌱 Nice ideas

None.

This is an automated advisory review. A human maintainer must make the final merge decision.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agents/hermes/plugin/__init__.py`:
- Around line 1108-1111: The broad except Exception around the dynamic import of
"run_agent" (the __import__("run_agent", fromlist=["AIAgent"]) call) should be
narrowed to only catch import-related errors so other failures surface; change
the handler to catch ImportError and ModuleNotFoundError (and return False
there) instead of catching Exception so AttributeError or other runtime errors
during import are not swallowed.
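
A sketch of the narrowed handler this comment asks for is shown below. It uses `importlib.import_module` rather than the `__import__` call quoted above and returns `None` instead of `False`; the helper name `load_agent_class` is hypothetical. Since `ModuleNotFoundError` is a subclass of `ImportError`, catching `ImportError` covers both.

```python
import importlib


def load_agent_class(module_name: str = "run_agent", attr: str = "AIAgent"):
    """Return module.attr, or None when the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError too. Any other exception raised while the
        # module executes is deliberately NOT caught, so real bugs surface.
        return None
    # A missing attribute raises AttributeError to the caller instead of
    # being reported as "module not available".
    return getattr(module, attr)
```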

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d6dbc18b-0cb7-482b-83ca-c4d54be0aece

📥 Commits

Reviewing files that changed from the base of the PR and between 50c208b and 2e9e2b5.

📒 Files selected for processing (2)

agents/hermes/plugin/__init__.py
test/hermes-plugin-handlers.test.ts

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>

github-actions · 2026-05-26T05:41:36Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26434471925
Target ref: fix/3893_telegram-tool-call-leak-dco
Workflow ref: main
Requested jobs: hermes-e2e,hermes-discord-e2e
Summary: 2 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| hermes-discord-e2e | ✅ success |
| hermes-e2e | ✅ success |

github-actions · 2026-05-26T15:58:49Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26459374275
Target ref: f43e6e342f19a5d8cb3c7dd2dd560709b7d9297c
Workflow ref: main
Requested jobs: hermes-slack-e2e
Summary: 1 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| hermes-slack-e2e | ✅ success |

wscurran · 2026-05-26T17:58:12Z

✨
Related open issues:

#3893 [Brev][Agent&Skills] NemoHermes first Telegram message leaks raw send_message tool call as text instead of executing it

cv

Requesting changes based on the PR Review Advisor rubric.

The global raw send_message normalizer can misroute cross-platform pseudo-calls. _RAW_MESSAGING_TARGET_RE accepts many platforms and _install_messaging_response_patch() installs the normalization globally on AIAgent._strip_think_blocks() without knowing the current chat platform, first-turn state, or whether the user explicitly asked to send a separate cross-platform message. A malformed response like send_message: "to slack: ..." from a Telegram-origin chat could be stripped to plain text and delivered back to Telegram instead of remaining a tool-dispatch/error case.

Please restrict the fallback to the current platform/current-chat readiness case (ideally with platform/turn context), and add a negative test proving a non-current-platform target is not silently normalized into the current chat. Also document the source-of-truth rationale for the workaround and add/cite runtime validation through the Hermes gateway first-message path.

@cv

…4175)

Addresses the CHANGES_REQUESTED review by @cv on PR #4175.

The original patch installed `_normalize_raw_messaging_tool_response` globally on `AIAgent._strip_think_blocks` with no current-platform context, so a stray response like `send_message: "to slack: ..."` emitted in a Telegram chat would be silently normalized to plain text and delivered back to Telegram, misrouting the cross-platform send_message intent into the wrong chat.

This commit:

- Adds `_current_messaging_platform` module state + `_set_/_get_` helpers, validated against the `_MESSAGING_PLATFORMS` allowlist.
- Extends `_normalize_raw_messaging_tool_response` to take an optional `current_platform` argument and leave the raw `send_message:` text intact whenever the target platform doesn't match the current chat or the current platform is unknown, so dispatch / error paths still surface upstream rather than getting silently delivered into the wrong chat.
- Wires the patched `_strip_think_blocks` closure to read `_get_current_messaging_platform()`.
- Updates `_pre_llm_call` to call `_set_current_messaging_platform` on every turn (not gated on context injection), so the normalizer has a current-platform anchor even on non-first / non-grounding turns.
- Expands the docstring to call out (a) `agents/hermes/run_agent.py` as source of truth for send_message routing, (b) this normalizer as defense-in-depth output filtering, (c) `hermes-e2e`, `hermes-discord-e2e`, and `hermes-slack-e2e` as runtime validation through the gateway first-message path.
- Adds negative-test coverage in `test/hermes-plugin-handlers.test.ts` for the cross-platform-blocked and unknown-platform-blocked cases plus positive coverage of the same-platform normalization through the class patch.

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
Co-authored-by: Chengjie Wang <chengjiew@nvidia.com>

@cv

…m anchor

Addresses @cv's review ask #4 on PR #4175 (runtime validation through the Hermes gateway first-message path) by adding an integration-style test that drives the actual `_pre_llm_call` → `AIAgent._strip_think_blocks` chain Hermes uses, rather than only exercising the normalizer in isolation.

The new test:

- Stubs `run_agent.AIAgent._strip_think_blocks` as a real method.
- Calls `_pre_llm_call(platform="telegram", is_first_turn=True)` to simulate the first Telegram turn arriving via the gateway hook; this both sets the platform anchor and installs the patch via the same code path Hermes uses at runtime.
- Calls `AIAgent._strip_think_blocks(...)` twice to assert (a) same-platform body is extracted, (b) cross-platform `to slack: ...` is preserved.
- Calls `_pre_llm_call(platform="discord", is_first_turn=True)` to simulate a turn on a different platform and asserts the anchor refreshes per-turn (Discord body extracted, telegram target preserved).

Combined with the existing isolated-unit cases and the cited `hermes-e2e` / `hermes-discord-e2e` / `hermes-slack-e2e` scenarios, this gives both "added" and "cited" runtime validation for the gateway first-message path.

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
Co-authored-by: Chengjie Wang <chengjiew@nvidia.com>
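
The shape of that integration-style test can be sketched in plain Python with stubs. Everything below is a compressed stand-in under stated assumptions: the `AIAgent` stub, the `_current` state variable, the regex, and `run_scenario` are hypothetical, and the real test runs through Vitest with a Python harness.

```python
import re
from typing import List, Optional

_PLATFORMS = frozenset({"telegram", "discord", "slack"})  # assumed allowlist
_RAW_RE = re.compile(r'^\s*send_message:\s*"to\s+([a-z]+):\s*(.*)"\s*$', re.I | re.S)
_current: Optional[str] = None  # per-turn platform anchor


class AIAgent:
    """Stub standing in for run_agent.AIAgent."""

    def _strip_think_blocks(self, text: str) -> str:
        return text


def _install_patch() -> None:
    original = AIAgent._strip_think_blocks
    if getattr(original, "_patched", False):
        return  # already installed on an earlier turn

    def patched(self, text):
        cleaned = original(self, text)
        match = _RAW_RE.match(cleaned)
        # The anchor is read at call time, so it follows the latest turn.
        if match and _current is not None and match.group(1).lower() == _current:
            return match.group(2).strip() or cleaned
        return cleaned

    patched._patched = True
    AIAgent._strip_think_blocks = patched


def _pre_llm_call(platform: Optional[str], is_first_turn: bool) -> None:
    """Refresh the anchor on every turn (is_first_turn is unused in this sketch)."""
    global _current
    name = (platform or "").lower()
    _current = name if name in _PLATFORMS else None
    _install_patch()


def run_scenario() -> List[str]:
    """Drive a Telegram turn, then a Discord turn, through the patched method."""
    agent = AIAgent()
    out = []
    _pre_llm_call(platform="telegram", is_first_turn=True)
    out.append(agent._strip_think_blocks('send_message: "to telegram: Hi"'))
    out.append(agent._strip_think_blocks('send_message: "to slack: Hi"'))
    _pre_llm_call(platform="discord", is_first_turn=True)
    out.append(agent._strip_think_blocks('send_message: "to discord: Yo"'))
    out.append(agent._strip_think_blocks('send_message: "to telegram: Hi"'))
    return out
```

Under these assumptions, `run_scenario()` extracts the body only for the platform of the most recent turn and returns the other two responses unchanged.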

cjagwani · 2026-06-03T19:10:47Z

@cv The normalizer is now anchored to the current chat: _current_messaging_platform is set on every _pre_llm_call and read by the patched _strip_think_blocks, so cross-platform or unknown-platform targets pass through untouched.

Your `to slack:` from a Telegram chat case is pinned by negative tests at both the direct-API and class-patch layers, plus a new integration test that drives `_pre_llm_call` → `_strip_think_blocks` across a Telegram → Discord switch. The docstring names `agents/hermes/run_agent.py` as authoritative for routing and cites hermes-e2e / hermes-discord-e2e / hermes-slack-e2e as gateway coverage.

The #3893 readiness-gate suggestion is a different architectural approach to the same symptom, so it is left out of scope here. Happy to file a follow-up if you'd rather pursue it. Re-requesting review.

fix(hermes): normalize Telegram send_message pseudo-call leaks

2e9e2b5

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>

coderabbitai Bot reviewed May 25, 2026

Comment thread agents/hermes/plugin/__init__.py

fix(hermes): narrow messaging patch import errors

f43e6e3

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>

jyaunches mentioned this pull request May 26, 2026

test(e2e): migrate Hermes feature coverage to scenario suites #3811

Closed

chengjiew added the v0.0.50 Release target label May 26, 2026

wscurran added fix integration: telegram Telegram integration or channel behavior labels May 26, 2026

wscurran added the integration: hermes Hermes integration behavior label May 26, 2026

cv added v0.0.52 Release target v0.0.53 Release target and removed v0.0.50 Release target v0.0.52 Release target labels May 26, 2026

cv requested changes May 27, 2026

ericksoa added v0.0.55 and removed v0.0.53 Release target labels May 27, 2026

jyaunches added R2 v0.0.56 Release target and removed v0.0.55 labels May 29, 2026

sandl99 added the enhancement: messaging label Jun 1, 2026

cv added the v0.0.57 Release target label Jun 1, 2026

cv removed the v0.0.56 Release target label Jun 1, 2026

cv assigned cjagwani Jun 2, 2026

cjagwani and others added 2 commits June 2, 2026 12:37

Merge branch 'main' into fix/3893_telegram-tool-call-leak-dco

02a6255

cv added v0.0.58 Release target and removed v0.0.57 Release target labels Jun 3, 2026

wscurran added the labels area: cli (Command line interface, flags, terminal UX, or output), area: messaging (Messaging channels, bridges, manifests, or channel lifecycle), bug-fix (PR fixes a bug or regression), and feature (PR adds or expands user-visible functionality), and removed the NemoClaw CLI label Jun 3, 2026

cjagwani and others added 2 commits June 3, 2026 11:59

Merge branch 'main' into fix/3893_telegram-tool-call-leak-dco

b1571fe

cjagwani requested a review from cv June 3, 2026 19:10

Merge branch 'main' into fix/3893_telegram-tool-call-leak-dco

ee1a2a8

cv approved these changes Jun 3, 2026

cv enabled auto-merge (squash) June 3, 2026 19:11

cv disabled auto-merge June 3, 2026 19:21

cv merged commit 8637a78 into main Jun 3, 2026
18 checks passed

cv deleted the fix/3893_telegram-tool-call-leak-dco branch June 3, 2026 19:22

coderabbitai Bot mentioned this pull request Jun 3, 2026

fix(hermes): preserve strip think method binding #4731

Merged

12 tasks

miyoungc mentioned this pull request Jun 4, 2026

docs: refresh 0.0.58 release docs and refresh skills #4743

Merged

wscurran removed the feature PR adds or expands user-visible functionality label Jun 9, 2026
