docs(troubleshooting): add Hermes section #4169

Merged
miyoungc merged 4 commits into NVIDIA:main from latenighthackathon:docs/3658-hermes-troubleshooting-v2
Jun 2, 2026

@latenighthackathon (Contributor) commented May 25, 2026

Summary

Append a ## Hermes section to docs/reference/troubleshooting.mdx covering the most common surprises operators hit when running Hermes through nemohermes.

Related Issue

Fresh recreation of the closed #3658.

Changes

  • Add a ## Hermes troubleshooting section to docs/reference/troubleshooting.mdx with the following entries:
  • Port 8642 returns a blank page or Cannot GET /: Hermes serves an OpenAI-compatible API at that port, not a chat dashboard; documents the /health check and the /v1 client URL.
  • Sandbox 'X' already exists as OpenClaw: explains the one-agent-per-sandbox-name rule and the destroy plus re-onboard pattern to convert.
  • nemohermes: command not found immediately after install: covers the shim publication path and the NEMOCLAW_AGENT=hermes re-install workaround.
  • Choosing between OAuth and API key for the Hermes Provider: documents NEMOCLAW_HERMES_AUTH_METHOD accepted values, the NEMOCLAW_HERMES_AUTH and NEMOCLAW_NOUS_AUTH_METHOD back-compat aliases, and the headless-host device-code fallback.
  • 401 Unauthorized against port 8642: Hermes requires an Authorization: Bearer header, not the OpenClaw #token= URL fragment.
  • Brave Search policy preset has no effect under Hermes: Hermes does not use NemoClaw's OpenClaw web-search configuration; explains the wizard omission and why adding the preset post-onboard opens egress without wiring it into the agent.
  • Re-onboarding asks every messaging prompt again: links #3581 ([Ubuntu 22.04][Onboard] nemohermes re-onboard re-asks all 5 messaging per-channel prompts; credentials.json never written; "Messaging: none" on Run 2 review) and documents the env-var workaround for unattended re-onboards.
  • Use absolute cross-reference paths (/get-started/quickstart-hermes, /reference/network-policies) per the file's convention after the Fern MDX migration (#3837, "docs: remove legacy markdown docs and refresh MDX checks"). The original PR landed with relative .md paths flagged in review; this recreation uses absolute paths from the first commit.
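
To make the port 8642 and 401 entries above concrete, here is a minimal Python sketch of how a client might address Hermes. Only the port, the /health path, the /v1 client path, and the Authorization: Bearer requirement come from this PR. The localhost host name and the HERMES_API_TOKEN variable are hypothetical names chosen for illustration, and the sketch is not taken from the NemoClaw or Hermes code.

```python
import os
import urllib.request

# Assumption: Hermes is reachable on localhost. Only port 8642, /health,
# and /v1 are taken from the PR description.
BASE_URL = "http://localhost:8642"


def health_url(base_url: str = BASE_URL) -> str:
    """Return the health-check URL; the root path is an API, not a dashboard."""
    return base_url.rstrip("/") + "/health"


def client_base_url(base_url: str = BASE_URL) -> str:
    """Return the base URL an OpenAI-compatible client would be pointed at."""
    return base_url.rstrip("/") + "/v1"


def authed_request(url: str, token: str) -> urllib.request.Request:
    """Build a request that carries the token in an Authorization: Bearer header.

    The token travels in the header, not in a '#token=' URL fragment.
    """
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


if __name__ == "__main__":
    # HERMES_API_TOKEN is a hypothetical variable name used only in this sketch.
    token = os.environ.get("HERMES_API_TOKEN", "")
    request = authed_request(client_base_url(), token)
    print("health check:", health_url())
    print("client base URL:", request.full_url)
```

The request is only constructed, not sent, so the sketch runs without a live sandbox; passing it to urllib.request.urlopen would perform the actual call.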

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Ran: python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx --dry-run (clean); markdownlint-cli2 (clean). Pre-commit hooks pass except the pre-existing tsc-plugin infra failure present on stock upstream/main.


Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>

Summary by CodeRabbit

  • Documentation
    • Expanded troubleshooting with Hermes operational guidance: port 8642 expectations (API vs dashboard), sandbox name conflicts and re-onboard flow, shim verification and reinstall steps, provider auth-method selection (OAuth vs API key) with non‑interactive env-var setup, common 401/Auth header fixes, Brave Search preset behavior, and messaging re-onboarding using exported bot-token env vars
    • Clarified that host-to-sandbox routing via host.docker.internal is unreliable and recommended using the local host proxy route (host.openshell.internal:11435), with diagnostic examples and tips
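
The auth-method entry names one primary variable and two back-compat aliases. As a rough illustration of how such aliases could be resolved, the Python sketch below checks them in a fixed order. The three variable names come from this PR; the lookup order (primary first, then the aliases) is an assumption, and the sketch is not NemoClaw's implementation.

```python
import os
from typing import Mapping, Optional

# Variable names are from the PR description; the precedence order is assumed.
AUTH_METHOD_VARS = (
    "NEMOCLAW_HERMES_AUTH_METHOD",  # primary name
    "NEMOCLAW_HERMES_AUTH",         # back-compat alias
    "NEMOCLAW_NOUS_AUTH_METHOD",    # back-compat alias
)


def resolve_auth_method(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty auth-method value, or None if none is set.

    The value is returned unvalidated: the accepted values are listed in the
    troubleshooting page itself and are not reproduced in this sketch.
    """
    for name in AUTH_METHOD_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


if __name__ == "__main__":
    print(resolve_auth_method())
```

Passing a plain dict instead of os.environ makes the lookup easy to exercise in isolation before running an unattended onboard.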

copy-pr-bot Bot commented May 25, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai Bot commented May 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9e62f034-39db-48d3-8b2f-947681ae978a

📥 Commits

Reviewing files that changed from the base of the PR and between a46d1ad and 729787d.

📒 Files selected for processing (2)
  • .agents/skills/nemoclaw-user-reference/references/troubleshooting.md
  • docs/reference/troubleshooting.mdx
✅ Files skipped from review due to trivial changes (2)
  • docs/reference/troubleshooting.mdx
  • .agents/skills/nemoclaw-user-reference/references/troubleshooting.md

📝 Walkthrough

Walkthrough

Clarifies that host.docker.internal is unreliable inside OpenShell sandboxes and adds a comprehensive Hermes troubleshooting section (port 8642 behavior, sandbox-name conflicts, missing shim, provider auth method, 401 bearer-token requirements, Brave preset note, and re-onboard resume behavior).

Changes

Troubleshooting Documentation Expansion

| Layer / File(s) | Summary |
| --- | --- |
| Host service networking inside OpenShell sandbox (.agents/skills/nemoclaw-user-reference/references/troubleshooting.md, docs/reference/troubleshooting.mdx) | Clarified that host.docker.internal does not reliably reach host services within the OpenShell k3s sandbox; documented gateway alias host.openshell.internal, described DNS/port-forwarding failure modes, and recommended using the NemoClaw auth-proxy (host.openshell.internal:11435). |
| Hermes operational and onboarding troubleshooting (.agents/skills/nemoclaw-user-reference/references/troubleshooting.md, docs/reference/troubleshooting.mdx) | Added Hermes troubleshooting covering port 8642 access/health expectations (OpenAI-compatible API), sandbox name vs agent-type conflicts and conversion steps, missing nemohermes shim remediation, Hermes Provider auth-method selection (OAuth vs API key) including non-interactive env-var setup and accepted NEMOCLAW_HERMES_AUTH_METHOD values, 401 Unauthorized bearer-token header requirements and credential reset flow, note that the Brave Search preset has no effect under Hermes, and guidance that --resume re-prompts messaging credentials unless messaging env vars are exported for non-interactive resume. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#4451: Related documentation updates around Hermes exposure/auth (port 8642 and bearer token usage).

Suggested reviewers

  • ericksoa

A rabbit hops through troubleshooting pages bright,
Whispering Hermes ports and BotFather privacy light,
Gateway hosts reclaimed with a gentle ping,
Shim and auth tips make the docs sing,
🐰✨ Logs cleared, and operators delight.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'docs(troubleshooting): add Hermes section' accurately and concisely describes the primary change: adding a new Hermes troubleshooting section to documentation. |
| Linked Issues check | ✅ Passed | The code changes comprehensively meet all objectives from #3658: added Hermes troubleshooting section with all required entries (port 8642 confusion, sandbox collisions, shim omissions, auth selection, 401 errors, Brave Search ineffectiveness, re-onboarding prompt behavior), used absolute cross-reference paths per Fern MDX conventions, and regenerated the auto-mirrored file. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the linked issue objectives: documentation-only updates to troubleshooting guidance for Hermes operator issues with no extraneous modifications. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
.agents/skills/nemoclaw-user-reference/references/troubleshooting.md (1)

1390-1391: ⚡ Quick win

Replace superlative phrasing with neutral wording.

“most common surprises” reads like marketing/superlative language; prefer direct factual wording. LLM pattern detected.

As per coding guidelines, "Superlatives and marketing language ('powerful,' 'robust,' 'seamless,' 'cutting-edge'). Say what it does, not how great it is."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/nemoclaw-user-reference/references/troubleshooting.md around
lines 1390 - 1391, Replace the superlative/marketing phrasing in the
Troubleshooting intro: change the sentence "The issues below cover the most
common surprises operators hit when running Hermes through `nemohermes`." to
neutral factual wording such as "The issues below cover common or frequent
issues operators encounter when running Hermes through `nemohermes`." Update
only that phrase in the
.agents/skills/nemoclaw-user-reference/references/troubleshooting.md content so
it uses "common" or "frequent issues" (or similar neutral wording) and keep the
next sentence about Quickstart with Hermes unchanged.
docs/reference/troubleshooting.mdx (1)

1403-1403: ⚡ Quick win

Use direct second-person voice in the section intro.

“The issues below cover the most common surprises operators hit…” addresses a generic audience instead of the reader; prefer “you” phrasing to match the docs voice standard.

As per coding guidelines, "Second person ("you") when addressing the reader."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/troubleshooting.mdx` at line 1403, Replace the section intro
sentence "The issues below cover the most common surprises operators hit when
running Hermes through `nemohermes`." with a direct second-person phrasing that
addresses the reader (for example: "The issues below cover the most common
surprises you may encounter when running Hermes with `nemohermes`.") so the
intro uses "you" and matches the docs' voice standard.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.agents/skills/nemoclaw-user-reference/references/troubleshooting.md:
- Around line 1078-1085: The subsection contains multiple sentences on the same
source line (e.g., the line beginning "Configuring an inference provider with a
base URL like `http://host.docker.internal:11434/v1`..."); update the paragraph
in .agents/skills/nemoclaw-user-reference/references/troubleshooting.md so every
sentence is on its own line — split the long sentence about OpenShell/k3s and
the sentence describing the sandbox DNS/connection failures into separate lines,
and ensure each remaining sentence in that subsection follows the
one-sentence-per-line rule.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: efe9aa65-3ddd-49fa-9696-9dc952357b0b

📥 Commits

Reviewing files that changed from the base of the PR and between 50c208b and 695753d.

📒 Files selected for processing (2)
  • .agents/skills/nemoclaw-user-reference/references/troubleshooting.md
  • docs/reference/troubleshooting.mdx

Comment thread .agents/skills/nemoclaw-user-reference/references/troubleshooting.md Outdated
@wscurran added the documentation, enhancement, and integration: hermes labels May 26, 2026
@wscurran (Contributor) commented

✨ Thanks for submitting this detailed PR about adding a Hermes section to the troubleshooting documentation. This proposes a way to improve the documentation for common issues encountered when running Hermes through nemohermes.


Related open PRs:


Related open issues:

@latenighthackathon force-pushed the docs/3658-hermes-troubleshooting-v2 branch from a84bc93 to a46d1ad May 27, 2026 03:08

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.agents/skills/nemoclaw-user-reference/references/troubleshooting.md (1)

1-1500: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep documentation edits in docs/ source pages only.

This PR includes direct changes to a generated mirror file under .agents/skills/...; that risks source/mirror drift and bypasses the intended docs workflow.

As per coding guidelines, "docs/**/*.{md,mdx}: For normal documentation changes, include only source pages under docs/; the docs-to-skills hook runs in dry-run mode to validate generated output."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/nemoclaw-user-reference/references/troubleshooting.md around
lines 1 - 1500, You edited the generated mirror of the Troubleshooting page
instead of the docs source; revert your edits to the generated mirror (undo
changes to the “Troubleshooting” mirror file) and apply your content changes to
the canonical docs source Troubleshooting page under the docs/ source tree, then
run the docs-to-skills validation hook (dry-run) to confirm the generated output
matches and only docs/**/*.{md,mdx} were changed before committing.
🧹 Nitpick comments (2)
docs/reference/troubleshooting.mdx (2)

1511-1511: ⚡ Quick win

Use active voice for the issue reference sentence.

“This is tracked in #3581.” is passive; prefer an active form (for example, “Issue #3581 tracks this behavior.”).

As per coding guidelines, "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/troubleshooting.mdx` at line 1511, Change the passive sentence
"This is tracked in `#3581`." to an active-voice form such as "Issue `#3581` tracks
this behavior." — edit the sentence that contains the issue reference (`#3581`) so
it uses an active subject ("Issue `#3581`") and an active verb ("tracks") instead
of the passive construction.

1426-1428: ⚡ Quick win

Use active voice in the sandbox-type conflict explanation.

The sentence “If a sandbox named X was created…” is passive; rewrite it in active voice.

As per coding guidelines, "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/troubleshooting.mdx` around lines 1426 - 1428, The sentence
"If a sandbox named `X` was created with the default OpenClaw agent, a later
`nemohermes onboard` for the same name exits with:" uses passive voice; rewrite
it in active voice so the subject performs the action — e.g., "When you create a
sandbox named `X` with the default OpenClaw agent, running `nemohermes onboard`
later for the same name exits with:" — update the sentence in
docs/reference/troubleshooting.mdx replacing the passive construction (the
sentence beginning "If a sandbox named `X` was created...") with an active-voice
variant.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 52f44745-be78-481f-8b7e-69da3a18dc16

📥 Commits

Reviewing files that changed from the base of the PR and between a84bc93 and a46d1ad.

📒 Files selected for processing (2)
  • .agents/skills/nemoclaw-user-reference/references/troubleshooting.md
  • docs/reference/troubleshooting.mdx

Append a new `## Hermes` section to `docs/reference/troubleshooting.mdx`
covering the most common surprises operators hit when running Hermes
through `nemohermes`:

  - Port 8642 returns blank page or `Cannot GET /` (Hermes serves an
    OpenAI-compat API at that port, not a chat dashboard).
  - `Sandbox 'X' already exists as OpenClaw` (each sandbox name maps
    to one agent type).
  - `nemohermes: command not found` immediately after install (shim
    publication path).
  - Choosing between OAuth and API key (`NEMOCLAW_HERMES_AUTH_METHOD`
    and the back-compat aliases).
  - `401 Unauthorized` against port 8642 (Hermes requires
    `Authorization: Bearer`, not the OpenClaw `#token=` fragment).
  - `Brave Search` policy preset has no effect under Hermes (Hermes
    does not use NemoClaw's OpenClaw web-search configuration).
  - Re-onboarding asks every messaging prompt again (tracked in
    NVIDIA#3581; documents the env-var workaround).

Cross-references use absolute paths (`/get-started/quickstart-hermes`,
`/reference/network-policies`) per the file's convention post Fern
MDX migration (NVIDIA#3837) and per the CodeRabbit nit on the original PR.

The mirror at
`.agents/skills/nemoclaw-user-reference/references/troubleshooting.md`
is regenerated by `scripts/docs-to-skills.py`.

Fresh recreation of the closed NVIDIA#3658 rebuilt cleanly on top of
current upstream/main as a single signed commit.

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
…ternal note

Three CodeRabbit nits on the Hermes troubleshooting section:

- The host.docker.internal paragraph was hard-wrapped at ~80 chars, so a
  few sentences landed on the same source line. Reflow to
  one-sentence-per-line for cleaner diffs.
- The Hermes intro used "the most common surprises operators hit".
  Drop the superlative phrasing for neutral wording.
- In the .mdx (user-facing) file, address the reader in second person
  ("you may encounter") instead of generic "operators hit". The .md
  agent-mirror keeps the impersonal form used elsewhere in that file.

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
@latenighthackathon force-pushed the docs/3658-hermes-troubleshooting-v2 branch from a46d1ad to e6ee234 May 31, 2026 16:07
Comment thread docs/reference/troubleshooting.mdx Outdated
@miyoungc miyoungc enabled auto-merge (squash) June 2, 2026 02:38
@miyoungc miyoungc merged commit 96946a1 into NVIDIA:main Jun 2, 2026
20 checks passed
@latenighthackathon latenighthackathon deleted the docs/3658-hermes-troubleshooting-v2 branch June 2, 2026 02:47
@wscurran added the feature and area: docs labels and removed the documentation label Jun 3, 2026
@wscurran removed the enhancement label Jun 8, 2026
Labels

area: docs, feature, integration: hermes
