feat(secrets/bitwarden): EU Cloud + self-hosted server URL support#31378
Merged
Conversation
Closes #31370. bws defaults to the US identity endpoint, so EU Cloud and self-hosted machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"} during 'hermes secrets bitwarden setup'. The token is valid — it's just being checked against the wrong region. Add a Bitwarden region step to the wizard between the access-token and project-list steps: Step 1 Install bws Step 2 Provide access token Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL) Step 4 Pick project (now talks to the right endpoint) Step 5 Test fetch Region is stored in config.yaml as secrets.bitwarden.server_url and plumbed into every bws subprocess as BWS_SERVER_URL (project list, secret list, test fetch, and the env_loader startup pull). Also: - Non-interactive: 'hermes secrets bitwarden setup --server-url ...' - Pre-existing BWS_SERVER_URL in the shell is detected and reused - Cache key includes server_url so EU/US fetches don't collide - 'hermes secrets bitwarden status' shows the configured region - 'invalid_client' / '400 Bad Request' from bws now triggers a hint pointing at the region setting instead of looking like a bad token
Contributor
🔎 Lint report:
|
exosyphon
pushed a commit
to exosyphon/hermes-agent
that referenced
this pull request
May 24, 2026
…ousResearch#31378) Closes NousResearch#31370. bws defaults to the US identity endpoint, so EU Cloud and self-hosted machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"} during 'hermes secrets bitwarden setup'. The token is valid — it's just being checked against the wrong region. Add a Bitwarden region step to the wizard between the access-token and project-list steps: Step 1 Install bws Step 2 Provide access token Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL) Step 4 Pick project (now talks to the right endpoint) Step 5 Test fetch Region is stored in config.yaml as secrets.bitwarden.server_url and plumbed into every bws subprocess as BWS_SERVER_URL (project list, secret list, test fetch, and the env_loader startup pull). Also: - Non-interactive: 'hermes secrets bitwarden setup --server-url ...' - Pre-existing BWS_SERVER_URL in the shell is detected and reused - Cache key includes server_url so EU/US fetches don't collide - 'hermes secrets bitwarden status' shows the configured region - 'invalid_client' / '400 Bad Request' from bws now triggers a hint pointing at the region setting instead of looking like a bad token
sahilm-ti
pushed a commit
to sahilm-ti/hermes-agent
that referenced
this pull request
May 25, 2026
…ousResearch#31378) Closes NousResearch#31370. bws defaults to the US identity endpoint, so EU Cloud and self-hosted machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"} during 'hermes secrets bitwarden setup'. The token is valid — it's just being checked against the wrong region. Add a Bitwarden region step to the wizard between the access-token and project-list steps: Step 1 Install bws Step 2 Provide access token Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL) Step 4 Pick project (now talks to the right endpoint) Step 5 Test fetch Region is stored in config.yaml as secrets.bitwarden.server_url and plumbed into every bws subprocess as BWS_SERVER_URL (project list, secret list, test fetch, and the env_loader startup pull). Also: - Non-interactive: 'hermes secrets bitwarden setup --server-url ...' - Pre-existing BWS_SERVER_URL in the shell is detected and reused - Cache key includes server_url so EU/US fetches don't collide - 'hermes secrets bitwarden status' shows the configured region - 'invalid_client' / '400 Bad Request' from bws now triggers a hint pointing at the region setting instead of looking like a bad token
sahilm-ti
pushed a commit
to sahilm-ti/hermes-agent
that referenced
this pull request
May 25, 2026
…ousResearch#31378) Closes NousResearch#31370. bws defaults to the US identity endpoint, so EU Cloud and self-hosted machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"} during 'hermes secrets bitwarden setup'. The token is valid — it's just being checked against the wrong region. Add a Bitwarden region step to the wizard between the access-token and project-list steps: Step 1 Install bws Step 2 Provide access token Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL) Step 4 Pick project (now talks to the right endpoint) Step 5 Test fetch Region is stored in config.yaml as secrets.bitwarden.server_url and plumbed into every bws subprocess as BWS_SERVER_URL (project list, secret list, test fetch, and the env_loader startup pull). Also: - Non-interactive: 'hermes secrets bitwarden setup --server-url ...' - Pre-existing BWS_SERVER_URL in the shell is detected and reused - Cache key includes server_url so EU/US fetches don't collide - 'hermes secrets bitwarden status' shows the configured region - 'invalid_client' / '400 Bad Request' from bws now triggers a hint pointing at the region setting instead of looking like a bad token
bot-ted
added a commit
to bot-ted/hermes-agent
that referenced
this pull request
May 25, 2026
* fix(tui): log parent gateway lifecycle exits (#31051)
* fix(tui): log parent gateway lifecycle exits
Add parent-side breadcrumbs for TUI gateway shutdown and transport exits so future backend EOF/SIGTERM reports identify the parent action that caused them.
* chore(tui): retrigger lifecycle logging checks
Retry transient GitHub checkout failures on the lifecycle logging PR.
* fix(tui): ignore late thinking deltas after completion (#31055)
* fix(tui): ignore late thinking deltas after completion
Prevent stale reasoning events from repainting the TUI status after a turn has already completed and the UI is idle.
* test(tui): restore timers after thinking delta assertion
Keep fake timer cleanup in a finally block so assertion failures cannot leak timer mode into later tests.
* fix(tui): refresh virtual transcript on viewport resize
Notify scroll subscribers when ScrollBox viewport bounds change and key virtual-history updates on viewport height so resize/keyboard changes remount the tail rows instead of leaving stale spacers visible.
* fix(tui): keep status rule one-line in skinny terminals
Clamp and truncate the cwd/branch segment so narrow status bars cannot wrap into the composer input row.
* test(tui): isolate viewport-height remount regression
Keep the resize delta below the virtual history scroll quantum so the regression test specifically depends on viewport height entering the snapshot key.
* fix(tui): reclaim status width when cwd is hidden
Make the cwd separator width conditional so the computed status layout matches the rendered row on ultra-narrow terminals.
* test(tui): clarify virtual history resize snapshot
Update the resize regression and comments so the test specifically guards viewport-height changes in the virtual-history snapshot key.
* fix(tui): measure status cwd by display width
Budget the right-hand status label by terminal display width so wide Unicode paths cannot wrap skinny status bars.
* docs(tui): clarify scrollbox subscription signals
Document that ScrollBox subscribers are notified for renderer-computed viewport and content bound changes, not only imperative scrolls.
* fix(tui): recompute virtual tail after width resize
Avoid preserving a frozen virtual transcript range when wrapped rows shrink enough that the old tail window no longer covers the viewport.
* fix(tui): preserve transcript tail across resizes
Wraps + heights are column-dependent, so a width change must remeasure
every row and the renderer must repaint the full viewport.
- Key virtualRows on cols so React remounts wrapped rows on resize.
- Snap back to bottom after sticky-mode resize once React rerenders.
- Reserve a scrollbar + gap column in transcriptBodyWidth (non-termux).
- Full repaint on any viewport height change (was: shrink-only).
- ScrollBox scrollHeight uses deepest child bottom so sticky-bottom
math can reach the real final rendered row after reflow.
- DECSTBM fast-path now requires full container rect match.
* feat(tui): responsive banner tiers
Terminals can't scale glyphs, so the banner now picks a layout per
column width instead of always rendering the full 101-col logo:
- Wide (>= logo width): full ASCII logo + tagline.
- Mid (>= 58 cols): centered rule banner that expands with viewport.
- Narrow (>= 34 cols): brand line + tagline, both width-aware.
- < 34 cols: hidden.
SessionPanel surfaces model/cwd/sid inline when the hero column is
hidden, so narrow layouts don't lose that info. Logo width constants
derive from the art itself.
* fix(tui): re-check sticky inside resize debounce + document remount
Addresses Copilot review on PR #31077:
- onResize now re-checks isSticky() inside the 100ms timer so manual
scrolls during the debounce window don't get snapped back to tail.
- Comment on the virtualRows cols-keying calls out the deliberate
trade-off: per-row local state (e.g. systemOpen) resets on resize so
yoga can remeasure off live geometry. The hook's scale-by-ratio path
is too approximate for mixed markdown widths.
* feat(ntfy): add ntfy platform adapter with atomic reconnect, identity fix, and 81 tests
* refactor(ntfy): convert built-in adapter to platform plugin
ntfy now ships as a self-contained plugin under plugins/platforms/ntfy/
instead of editing 8 core files (gateway/config.py Platform enum,
gateway/run.py factory + auth maps, cron/scheduler.py, toolsets.py,
hermes_cli/status.py, agent/prompt_builder.py, gateway/channel_directory.py,
tools/send_message_tool.py).
All routing goes through gateway/platform_registry via register_platform():
- adapter_factory, check_fn, validate_config, is_connected
- env_enablement_fn seeds PlatformConfig.extra from NTFY_* env vars so
gateway status reflects env-only setups without instantiating httpx
- standalone_sender_fn handles deliver=ntfy cron jobs when cron runs
out-of-process from the gateway
- allowed_users_env / allow_all_env hook into _is_user_authorized
- cron_deliver_env_var=NTFY_HOME_CHANNEL for cron home routing
- platform_hint surfaces in the system prompt
- pii_safe=True (topic names are the only identifier; no PII to redact)
Tests moved to tests/gateway/test_ntfy_plugin.py using _plugin_adapter_loader
so the module lives under plugin_adapter_ntfy in sys.modules and cannot
collide with sibling plugin-adapter tests on the same xdist worker. The
core-file grep tests (Platform.NTFY in source, hermes-ntfy in toolsets,
etc.) are replaced with plugin-shape tests covering register() metadata,
env_enablement_fn output, and standalone_sender_fn behavior.
68 tests pass under scripts/run_tests.sh.
* ntfy: tighten robustness, dedupe auth/truncation, add docs
Robustness:
- Surface 401/404 stream failures via _set_fatal_error() so the gateway's
runtime status reflects 'fatal: ntfy_unauthorized' / 'ntfy_topic_not_found'
instead of staying 'connected' when the reconnect loop halts. Matches
the pattern in whatsapp / telegram / sms adapters.
- Strip whitespace from auth tokens so pasted tokens with trailing
newlines don't produce malformed Authorization headers.
Simplicity:
- Extract _build_auth_header() and _truncate_body() to module-level
helpers, used by both NtfyAdapter and _standalone_send. Removes the
duplicated auth/truncation logic between the two paths.
Docs:
- website/docs/user-guide/messaging/ntfy.md — full setup guide,
identity-model warning, self-hosting, cron usage, troubleshooting.
- website/docs/reference/environment-variables.md — all 9 NTFY_* vars.
- website/docs/user-guide/messaging/index.md — platform comparison row.
- website/sidebars.ts — sidebar entry between simplex and open-webui.
Tests: 78/78 (+ 10 new robustness tests covering token hygiene, fatal
error propagation for 401/404, and the _truncate_body helper).
* fix(cli): prevent temp directory leak on ZIP update failure
Move shutil.rmtree into a finally block so the temp directory is always
cleaned up, even when an exception occurs during download, extraction,
or file copying.
* fix(tui): clear TTS env var on voice off, and add TTS indicator to status bar
Bug 1: /voice off in TUI mode did not clear HERMES_VOICE_TTS,
leaving TTS stuck ON with no way to disable it (the voice.toggle
tts handler requires voice mode to be ON).
Bug 2: TUI status bar only showed 'voice on/off' without any
indication of whether TTS speech output is active, because the
frontend never tracked voiceTts state.
- tui_gateway/server.py: clear HERMES_VOICE_TTS when voice is turned off
- ui-tui/src/app/useMainApp.ts: add voiceTts state, thread setVoiceTts
through voice contexts, display [tts] in status bar
- ui-tui/src/app/slash/commands/session.ts: sync tts from voice.toggle response
- ui-tui/src/app/interfaces.ts: add setVoiceTts to all voice context interfaces
* fix(telegram): preserve observed group slash commands
* chore: map Glucksberg noreply email in AUTHOR_MAP
* docs(simplex): remove broken Docker install command (#26974) (#26975)
* docs(simplex): remove broken Docker install command (#26974)
The "Or Docker" snippet pointed at `simplexchat/simplex-chat`, which is
not a published Docker Hub image. Users following the docs hit:
docker: Error response from daemon: pull access denied for
simplexchat/simplex-chat, repository does not exist or may require
'docker login'.
The SimpleX Chat project only publishes Docker images for its server
components (smp-server, xftp-server) — the chat CLI is distributed as a
binary release. Drop the broken `docker run` line and keep the verified
binary-download path, with a note pointing users to the upstream
Dockerfile if they want to build a container themselves.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(simplex): drop misleading "Dockerfile" link text
Copilot review flagged that the link text claimed "Dockerfile in the
upstream repo" but the URL pointed at the repository root, not a
specific Dockerfile path. Reword to "build from source from the
simplex-chat repository" so the link text and target match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(env): strip null bytes from .env before python-dotenv loads
Null bytes in API key values (introduced by copy-paste) crash
os.environ[k] = v with ValueError: embedded null byte, preventing
hermes from starting at all.
* fix(compressor): ABC compliance — total_tokens, api_mode, logger consistency
* fix(compressor): propagate api_mode and fix root logger calls
- Add api_mode to 4 update_model() call sites:
- conversation_loop.py: long_context failover and probe stepping
- agent_runtime_helpers.py: rollback restore (also saves compressor_api_mode)
- chat_completion_helpers.py: fallback activation
- Fix 31 root-logger calls across 5 files (logging.warning/error/info
-> logger.warning/error/info) to respect module-level log filtering
* fix(cli): synchronize HERMES_SESSION_ID across environment and contextvar during session switches
* feat(skills): add opt-in AST deep diagnostics
Add opt-in AST diagnostics for skill review without making Skills Guard stricter by default.
- Add hermes skills inspect --ast-deep to scan fetched skill bundles before installation
- Add hermes skills audit --deep to scan already-installed hub skills
- Keep AST analysis in tools/skills_ast_audit.py, separate from tools/skills_guard.py
- Label output as diagnostic hints, not security verdicts
- Cover dynamic import/access patterns: importlib, __import__(computed), getattr(computed), and __dict__[computed]
This follows the maintainer guidance from closed PR #7436: useful AST-level analysis belongs in an opt-in diagnostic path, not in Skills Guard's default heuristic scan.
* refactor(skills): slim AST diagnostic to single entry point
Trim ~600 LOC off the original contribution while keeping the same
operator-facing surface and detection coverage.
- Collapse three entry points (file / dir / bundle) into one
ast_scan_path(path) that handles both files and directories.
- Drop AstFinding dataclass + severity field — replaced with plain
(file, line, pattern_id, description) tuples. Severity ordering was
display-only for a diagnostic that explicitly disclaims security
verdicts, so the field added bookkeeping without earning its place.
- Replace Rich-markup formatter with plain text grouped by file.
- Drop the 'inspect --ast-deep' surface — same scanner, same output as
'audit --deep', single CLI entry is enough. Operators audit after
install; pre-install inspection signal isn't worth the second surface.
- Trim test file to the cases that earn their place: bypass payload,
syntax error survival, RecursionError survival, false-positive guard
(importer lookalike), literal-arg false-positive guard, non-.py
ignored, directory recursion + cache-dir skipping, missing-path,
getattr/__dict__ detection, formatter empty + populated.
Net: tools/skills_ast_audit.py 353 -> 133 LOC,
tests/tools/test_skills_ast_audit.py 299 -> 103 LOC, full diff
+704/-12 -> +264/-6. No change to tools/skills_guard.py — Skills Guard
verdicts remain untouched per SECURITY.md §2.4.
* fix(cli): validate runtime token refresh capability in Qwen auth status
* feat(plugins): add register_auxiliary_task() to PluginContext API
Auxiliary LLM tasks (vision, compression, web_extract, etc.) currently
require modifications to core files for any plugin that needs its own
task slot — specifically the _AUX_TASKS list in hermes_cli/main.py and
the hardcoded env-var bridging dict in gateway/run.py. This violates
the 'plugins must not modify core files' rule and forces every memory
or context plugin that wants its own auxiliary task to either fork
core or open a coupled core+plugin PR.
This change adds a generic plugin surface for auxiliary task
registration:
ctx.register_auxiliary_task(
key='memory_retain_filter',
display_name='Memory retain filter',
description='hindsight pre-retain dedup/extract',
defaults={'timeout': 30, 'extra_body': {'reasoning_effort': 'low'}},
)
After registration, the task automatically:
- Appears in 'hermes model → Configure auxiliary models' picker via
a new _all_aux_tasks() merge of built-in + plugin tasks
- Has its provider/model/base_url/api_key bridged from config.yaml
to AUXILIARY_<KEY_UPPER>_* env vars at gateway startup
(gateway/run.py now uses a dynamic bridged-keys set instead of
a hardcoded per-task dict)
- Gets plugin-declared defaults (timeout, extra_body, etc.) layered
underneath user config so unconfigured plugin tasks still work
(agent/auxiliary_client._get_auxiliary_task_config)
- Resets to auto via 'Reset all to auto' alongside built-ins
Validation:
- Rejects shadowing of built-in keys (vision, compression, etc.)
- Rejects invalid key shapes (must match [A-Za-z0-9_]+)
- Rejects cross-plugin collisions (clear error)
- Allows same-plugin re-registration (idempotent updates)
Plugin discovery failures (rare) fall back gracefully — the aux
config UI still shows built-in tasks if get_plugin_auxiliary_tasks()
raises, and gateway env-var bridging keeps working for built-ins.
Built-in tasks remain hardcoded in _AUX_TASKS for stability — they're
the baseline UX, and DEFAULT_CONFIG already ships their defaults.
Plugin tasks layer on top.
Tests: 15 new tests in test_plugin_auxiliary_tasks.py covering API
validation, manager state lifecycle, helper sort order, _all_aux_tasks
merge semantics, _reset_aux_to_auto inclusion of plugin tasks, and
default-layering in auxiliary_client.
Updates the gateway-bridge code-parity test (test_auxiliary_config_bridge)
to assert the new dynamic shape rather than the hardcoded literal env
var names which no longer appear post-refactor.
Motivation: this unblocks PR #20262 (hindsight smart retain pipeline)
and similar plugins that need a dedicated aux task slot. The change
is non-breaking — built-in env vars (AUXILIARY_VISION_PROVIDER, etc.)
keep working since they're produced by the same f-string template
that built the hardcoded names.
* chore: map edison@mcclean.codes for PR #29817 salvage
* fix(provider): make config.yaml model.provider the single source of truth (#31222)
Policy: if it ain't a secret it goes in config.yaml. HERMES_INFERENCE_PROVIDER
was leaking behavioral config into the .env surface, including from the gateway,
which bypassed config.yaml entirely.
Behavior:
- gateway/run.py: drop HERMES_INFERENCE_PROVIDER read in _resolve_runtime_agent_kwargs.
Gateway now flows through resolve_runtime_provider() with no `requested` override,
which reads model.provider from config.yaml first.
Docs/UX (strip env var from user-facing surface):
- --provider help text no longer mentions the env var
- cli-config.yaml.example same
- reference/environment-variables.md: remove HERMES_INFERENCE_PROVIDER row and
the cross-reference from HERMES_INFERENCE_MODEL
- reference/cli-commands.md: blank the env-var column for --provider
- guides/xai-grok-oauth.md, guides/minimax-oauth.md: replace
HERMES_INFERENCE_PROVIDER=x hermes invocations with config.yaml / --provider
- developer-guide/adding-providers.md, model-provider-plugin.md: reframe
Internal mechanism (kept as-is):
- hermes_cli/main.py writes HERMES_INFERENCE_PROVIDER into the TUI subprocess env
- tui_gateway/server.py reads it on TUI startup
- resolve_requested_provider() / oneshot.py / cli.py still fall through to the
env var as a last-resort behind config.yaml, which is what makes the TUI
parent->child handoff work
This stays. We just stop documenting it as a user knob.
Tests: tests/gateway/test_auth_fallback.py — simplify mock to fail on first
call, succeed on second; drop monkeypatch.setenv lines that no longer matter.
Supersedes #31064 (closed with credit to @novax635 who surfaced the underlying
issue but proposed aligning gateway *to* the env var rather than removing it).
* docs(providers): rewrite Nous Portal section as primary recommended path (#31230)
The old section sold Nous Portal as access to Hermes-4 models, which is
backwards — Hermes 4 is a chat/reasoning family that's NOT recommended
for Hermes Agent (per portal.nousresearch.com/info itself). The actual
value prop is the 300+ frontier agentic models (Claude, GPT, Gemini,
DeepSeek, etc.) plus the Tool Gateway plus Nous Chat under one
subscription.
Rewrite to lead with that, position the portal as the recommended way
to run Hermes Agent, demote Hermes 4 to a 'note' explaining why it's
not the right pick for agent workloads, and link to the
manage-subscription page from setup.
* fix(browser): use process-tree termination for daemon cleanup
os.kill(pid, SIGTERM) only signals the parent, leaving Chromium child
processes (renderer, GPU, etc.) orphaned. Reuse the existing
ProcessRegistry._terminate_host_pid() helper which walks the process
tree leaf-up via psutil, terminating children before the parent.
* fix(process_registry): use taskkill /T /F for tree-kill on Windows
The Windows branch of `_terminate_host_pid` early-returned after
`os.kill(pid, SIGTERM)` (which Python maps to `TerminateProcess` for
the target handle only), leaving descendant processes — e.g. Chromium
renderer/GPU/network helpers spawned by an `agent-browser` daemon —
running on Windows even after the preceding commit fixed POSIX.
The right Windows primitive is `taskkill /PID <pid> /T /F`:
`/T` walks the tree, `/F` force-terminates. Same approach
`gateway.status.terminate_pid(force=True)` already uses for the
gateway's own shutdown path; reuse the same shape here.
Why NOT extend the POSIX psutil tree-walk to Windows:
1. Windows doesn't maintain a Unix-style process tree. `psutil.
Process.children(recursive=True)` walks PPID links that go stale
when intermediate processes exit, so enumeration is best-effort
and silently misses orphaned descendants. The whole bug we're
fixing is orphaned descendants.
2. `psutil.Process.terminate()` on Windows is `TerminateProcess()`
for one handle — same single-PID scope as the existing
`os.kill`. The existing comment in `gateway/status.py::
terminate_pid` warns this explicitly: 'os.kill SIGTERM is not
equivalent to a tree-killing hard stop' on Windows.
3. Headless Chromium has no GUI window, so the softer
`taskkill /T` without `/F` (which sends WM_CLOSE) won't reach
it either. `/F` is required.
POSIX path is unchanged. The taskkill subprocess uses the same
`creationflags=windows_hide_flags()` pattern other Windows shellouts
in this codebase use. `FileNotFoundError` / `TimeoutExpired` /
`OSError` fall back to bare `os.kill(SIGTERM)` as cheap insurance.
Tests cover the Windows branch via the codebase's standard
`monkeypatch _IS_WINDOWS` pattern (`references/windows-native-
support.md`), plus POSIX tree-walk order, NoSuchProcess swallow,
and the OSError fallback path. 7 new tests, all green on Linux CI.
* fix(tui): handle images with codex app-server
* docs(providers): move Nous Portal first, Google Gemini OAuth last (#31287)
Reorder the per-provider subsections under '## Inference Providers'
so Nous Portal — the recommended setup — leads the list, and Google
Gemini via OAuth (which carries a policy-risk warning) drops to last
position right before the '## Custom & Self-Hosted LLM Providers'
section. All other provider sections keep their relative order. Pure
section move; no content changes.
* fix(dingtalk): finalize open streaming cards before disconnect
AI Card "tool progress" cards created with finalize=False were left in
streaming state on DingTalk's UI after a gateway restart because
disconnect() called _streaming_cards.clear() without first closing
them via _close_streaming_siblings.
Move the finalization loop before self._http_client.aclose() so the
HTTP client is still available when the finalize requests are sent.
Adds a regression test that asserts the HTTP client is alive during
finalization.
* fix(terminal): warn at call time when background=true runs silently (#31289)
`terminal(background=true)` without `notify_on_complete=true` or
`watch_patterns` runs the process SILENTLY — the agent has no way
to learn it finished short of calling `process(action='poll')`
explicitly. That's correct for genuine long-lived processes (servers,
watchers, daemons) but is a footgun for every bounded task (tests,
builds, deploys, CI pollers, batch jobs), which is the vast majority
of background uses.
Hit on May 23, 2026 (PR #31231 incident): agent launched a CI-watch
loop with `background=true` only. The poller ran fine, exited green
6 minutes later, agent never noticed. User had to surface 'we are
green CI, you can merge.' Memory and skill docs said *what* to do
(poll in background) but not *how* to receive the result. The
`notify_on_complete=true` flag exists and works, but is easy to
forget when bg seems sufficient on its own.
Two changes here, mutually reinforcing:
1. Runtime nudge: tool result for `background=true` w/o notify or
watch_patterns now includes a `hint` field explaining the silent-
process failure mode and pointing at the corrective flag. Agent
sees it on the same turn and self-corrects without needing the
user to surface anything. Cost for legitimate server cases is one
ignored read (~50 tokens); cost for forgot-notify cases is
prevented blindness (potentially many turns, or a user nudge).
False positives << false negatives.
2. Schema/description rewrite: top-level TERMINAL_TOOL_DESCRIPTION
and the `background` field description now lead with 'Almost
always pair with notify_on_complete=true' instead of presenting
it as one of two equally-likely patterns. The two legitimate
non-notify shapes (long-lived servers; watch_patterns mid-process
signals) are still documented, but as the minority case.
Tests cover all four shapes: bg-only emits hint, bg+notify doesn't,
bg+watch_patterns doesn't, foreground doesn't. 4 new tests; full
suite of background/process tests stays green (160/160 across the
relevant 6 test files).
* Fix CLI verbose tool progress config fallback
* fix(cli): surface tool failures with specific error messages
Improves the failure suffix on tool completion lines. Instead of always
showing '[error]' for non-terminal failures, parse the tool's JSON result
and surface the actual message:
Before: ┊ 📖 read foo.py 0.1s [error]
After: ┊ 📖 read foo.py 0.1s [File not found: foo.py]
Before: ┊ 💻 $ ls bad 0.1s [exit 127]
After: ┊ 💻 $ ls bad 0.1s [ls: cannot access 'bad'...]
Adds a _trim_error helper that strips long absolute paths down to the
filename and caps the suffix at 48 chars so it stays readable on narrow
terminals.
Threads the tool result through the tool.completed progress callback so
agent/display.get_cute_tool_message can inspect it. The cli.py [error]
post-suffix is removed in favor of the richer suffix _detect_tool_failure
now produces directly.
Originally proposed in PR #17194 by Albert.Zhou; salvaged onto current
main with the dead-code preview-length bumps dropped (tool_preview_length
config already strictly caps previews, so the per-tool n= defaults are
unreachable).
Co-authored-by: Albert.Zhou <albert748@gmail.com>
* feat(cli): show todo progress as done/total fraction
Parse the todo_tool result summary to display completion progress in
CLI tool preview lines:
Read: ┊ 📋 plan 3/4 task(s) 0.5s
Update: ┊ 📋 plan update 3/4 ✓ 0.5s
Create: falls back to plain count when no completed tasks
Falls back gracefully to the existing 'N task(s)' format when the
result is missing, malformed, or has no completed items.
Originally proposed in PR #17194 by Albert.Zhou; salvaged onto current
main.
Co-authored-by: Albert.Zhou <albert748@gmail.com>
* test(display): cover failure-suffix rendering + update scrollback test
The original PR #17194 description claimed test_display_tool_preview.py
but only ever shipped test_display_todo_progress.py. Add the missing
coverage for the failure-suffix path:
- _trim_error: whitespace strip, length cap, File-not-found path collapse
- _detect_tool_failure: terminal exit codes, memory full, structured
{error}/{message} extraction, malformed JSON, None result
- get_cute_tool_message E2E: read_file failure, terminal exit-only,
terminal stderr message, memory full, success path, no-result path
Also update test_tool_progress_scrollback.test_error_suffix_on_failed_tool
to reflect the new behavior: the generic '[error]' fallback in cli.py
has been removed; failure suffixes now come from the result-aware
_detect_tool_failure (e.g. '[exit 1]', '[File not found: x]').
* docs: dedicated Nous Portal integration page and setup guide (#31296)
If Nous Portal is the recommended way to run Hermes Agent, it deserves
more than a sub-section buried under `## Inference Providers`. Add two
new pages and shrink the existing providers.md section to a stub that
points at them.
New pages:
- `website/docs/integrations/nous-portal.md` — landing page. What's in
the subscription (300+ model catalog table, Tool Gateway breakdown,
Nous Chat, cross-platform parity, no-dotfile-credentials). Hermes 4
recommendation note. Setup paths (fresh install, existing install,
headless / SSH, profiles). Day-to-day usage (portal status / portal
tools / portal open, switching models, mixing gateway with own
backends, subscription management). Configuration reference. Token
handling. Troubleshooting. Cross-links. Sidebar-position 1 — first
entry under Integrations.
- `website/docs/guides/run-hermes-with-nous-portal.md` — task script.
Eight numbered steps: subscribe → setup --portal → verify with
portal status → first chat → switch models → customize gateway
routing → voice mode → cron/always-on. Per-step troubleshooting.
'What this gets you in plain numbers' comparison table. Sidebar
position 1 — first entry under Guides & Tutorials.
Existing providers.md:
- Replace the 80-line `### Nous Portal` deep-dive with a 13-line stub
that summarizes the value prop, lists the three CLI commands, and
links to the new pages. Saves ~6KB. Other provider sections and
callouts (Codex Note, Two Commands, Tool Gateway tip) preserved.
Sidebar:
- `integrations/nous-portal` inserted right after `integrations/index`,
before `integrations/providers`.
- `guides/run-hermes-with-nous-portal` inserted first in Guides &
Tutorials.
* fix(tui): stop slash dropdown from chopping last char of /goal (#31311)
Two independent bugs caused the slash-command autocomplete to render
`/goal` as `/goa` (and `/gquota` as `/gquot` for that matter) in the TUI:
1. `tui_gateway/server.py` was forwarding `c.display` from
prompt_toolkit's `Completion` straight into the JSON-RPC payload.
prompt_toolkit normalizes `display=` into `FormattedText` (a `list`
subclass), so the wire format became `[["", "/goal"]]` instead of
the `string` that `CompletionItem.display` in the TUI declares.
`meta` already went through `to_plain_text` — `display` did not.
2. The dropdown row in `appOverlays.tsx` used `flexDirection="row"`
with the display `<Text>` and the (very long) meta `<Text>` as
siblings. When the meta overflows the row width, Ink/Yoga shrinks
the *first* column by one cell, lopping the trailing character off
the command name. `/goal` triggers it reliably because its meta
string is the longest of any built-in command (description +
embedded `[text | pause | resume | clear | status]` usage hint).
Wrapping the display column in `<Box flexShrink={0}>` keeps it at
its natural width and lets the meta wrap or truncate instead.
* fix(background-review): allow pinned skills to be improved
The post-turn background reviewer prompt listed pinned skills under
'Protected skills (DO NOT edit these)' alongside bundled and
hub-installed skills, with the instruction to say 'Nothing to save.'
if only protected skills needed updating. This meant the reviewer
would refuse to patch a pinned skill even when the user explicitly
wanted that skill improved.
The underlying tool layer already gets this right: skill_manage's
_pinned_guard only fires on delete; patch/edit/write_file go through
on pinned skills. Curator archive/consolidation still skips pinned
at the data layer (agent/curator.py), which is the correct place for
that protection — pin's job is anti-deletion, not anti-improvement.
Both _SKILL_REVIEW_PROMPT and _COMBINED_REVIEW_PROMPT now explicitly
tell the reviewer that pinned skills can be patched, with rationale,
so it doesn't bail out of an improvement just because the target is
pinned.
* fix(cli): reuse canonical root model key normalization in load_cli_config
* feat(cli): kanban promote verb for manual todo->ready recovery
Adds `hermes kanban promote <task_id>` for manual lifecycle recovery
when an auto-promote daemon misses the parent-done transition (issue
#28822). Refuses promotion unless every parent dep is done/archived
(override with --force). Emits a `promoted_manual` audit event distinct
from the automatic `promoted` kind, so audit consumers can filter
human-driven from system-driven promotions. Supports --dry-run and
--json for orchestration. Does not mutate assignee/claim state — the
dispatcher picks the card up via its normal ready polling path.
Closes #28822.
* feat(kanban): --ids bulk promote + AUTHOR_MAP entry for #29464
Adds an --ids flag to 'hermes kanban promote' mirroring the existing
block/schedule convention, so the marquee use case from issue #28822
(promote all children of a closed organizational parent in one shot)
doesn't require a shell loop. Single-id JSON output stays a flat
object for back-compat; bulk emits a list. Dedupes positional + --ids
so the same id can't be promoted twice in one call. 5 new CLI-level
tests cover bulk happy path, partial-failure exit code, JSON shapes,
and dedup.
Also adds the thedavidmurray noreply-email -> github-login mapping in
scripts/release.py so the salvage cherry-pick passes the AUTHOR_MAP
contributor-credit check.
* fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290)
* fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint
Adds a soft guard so an agent running under one Hermes profile cannot
silently edit a different profile's skills/plugins/cron/memories.
Three layers:
A. agent/file_safety.classify_cross_profile_target
Classifies a write target against the active HERMES_HOME. Returns
a {active_profile, target_profile, area, target_path} dict when the
path lands in another profile's scoped area. PROFILE_SCOPED_AREAS =
(skills, plugins, cron, memories). get_cross_profile_warning()
wraps it into a model-facing error string that names both profiles,
names the area, and points at the cross_profile=True bypass.
Defense-in-depth, NOT a security boundary — the terminal tool runs
as the same OS user and can write any of these paths directly. The
guard exists to prevent confused-agent corruption, not to stop a
determined attacker. SECURITY.md §3.2 (terminal-bypass posture)
still applies.
Wired into tools/file_tools.write_file_tool and patch_tool with a
cross_profile=False kwarg. WRITE_FILE_SCHEMA and PATCH_SCHEMA both
advertise cross_profile so the model can pass it after explicit
user direction. patch_tool extracts target paths from V4A patch
bodies before checking (same shape as the existing sensitive-path
check).
skill_manage is already scoped to the active profile's SKILLS_DIR
by construction, so no extra guard wiring is needed there. The
D-side error message (below) still names other profiles when the
skill exists elsewhere.
B. agent/system_prompt
One deterministic line near the environment-hints block names the
active profile and tells the model not to modify another profile's
skills/plugins/cron/memories without explicit direction. Profile
name is stable for the lifetime of the AIAgent, so the line is
prompt-cache-safe.
D. tools/skill_manager_tool._skill_not_found_error
Replaces the bare "Skill 'X' not found." with a message that:
- names the active profile,
- searches OTHER profiles' skills dirs for the same name,
- names the profile(s) where the skill exists and the path,
- suggests `hermes -p <name>` to switch profiles, or
cross_profile=True for an explicit edit.
All 5 "not found" sites in skill_manager_tool (edit, patch, delete,
write_file, remove_file) now go through the helper.
Reference incident (May 2026): a hermes-security profile session
edited skills under both ~/.hermes/profiles/hermes-security/skills/
AND ~/.hermes/skills/ (the default profile's skills) without
realizing the second path belonged to a different profile. Three of
the four skill files needed manual restoration afterward.
What this PR does NOT do:
* No hard block. The terminal tool can still touch any of these
paths with no guard — same posture as the dangerous-command
approval flow. SECURITY.md §3.2 applies.
* No regex sweep on terminal commands for cross-profile paths.
That direction is a Skills-Guard-style arms race (cd + relative
paths, base64, etc.) and would false-positive on legitimate
cross-profile reads. Filed as a follow-up.
* No on-disk path migration. ~/.hermes/skills/ remains the
default profile's skills dir; this PR is about telling the
agent about that boundary, not changing the layout.
Tests:
tests/agent/test_file_safety_cross_profile.py (16 tests)
- _resolve_active_profile_name covers default/named/failure paths
- classify_cross_profile_target covers all four scoped areas,
both directions (default → named, named → default, named → named),
non-Hermes paths, and root-level config files
- get_cross_profile_warning covers in-profile no-op, cross-profile
message shape, and the defense-in-depth self-documentation
tests/tools/test_cross_profile_guard.py (12 tests)
- write_file: in-profile allow, cross-profile block, cross_profile=True
bypass, non-Hermes pass-through
- patch: replace-mode block, cross_profile=True bypass, V4A patch
path extraction
- skill_manage: error names the other profile (single + multiple),
missing-everywhere falls back to skills_list hint
- system prompt: contract-level checks (both branches present,
cross_profile=True mentioned, ~/.hermes/profiles/ referenced)
All 207 existing tests in file_safety/file_operations/skill_manager
still pass. 10 system-prompt tests still pass.
E2E verified: the exact incident scenario (security profile editing
default's hermes-agent-dev skill) is now blocked with the warning
message; cross_profile=True unblocks.
* fix(code_execution): add cross_profile to write_file/patch stubs
The cross_profile kwarg added to write_file_tool/patch_tool needs to
flow through the execute_code sandbox stubs in _TOOL_STUBS so the
test_stubs_cover_all_schema_params drift test passes. Without this,
scripts running inside execute_code couldn't pass cross_profile=True
through hermes_tools.write_file().
Caught by CI on PR #31290.
* fix(wecom-callback): retry send with fresh token on errcode 40001/42001
When WeCom returns errcode=40001 (invalid credential) or 42001 (token
expired), send() was returning a failure without evicting the bad token
from _access_tokens. All subsequent sends then kept using the same
invalid cached token until its TTL naturally expired (~7200s).
Fix: on the first token-rejection errcode, evict the cache entry and
retry once with a freshly fetched token. Non-token errcodes fail
immediately as before. If the refreshed token also fails, the error
is returned without looping further.
Adds four regression tests covering: successful retry on 40001,
successful retry on 42001, no retry on unrelated errcode, and clean
failure when the refresh does not help.
* gateway: debounce queued text follow-ups
* chore: map pnascimento9596@gmail.com for PR #31235 salvage
* fix(gateway): drop text snippet from debounce debug log (CodeQL)
CodeQL py/clear-text-logging-sensitive-data flagged the candidate-accept
debug log including event.text[:60]. Log text_len instead — sufficient for
debugging burst behavior without surfacing message contents.
Co-authored-by: Paulo Nascimento <pnascimento9596@gmail.com>
* fix(wecom): guard flush task against cancel-delivery race to prevent message loss
When asyncio.sleep() fires just before Task.cancel() is called, CPython
sets _must_cancel=True but cannot cancel the already-completed sleep
future, so CancelledError is delivered at the next await (handle_message)
rather than at the sleep. By that point the superseded task has already
popped the merged event from _pending_text_batches, so the superseding
task sees an empty batch and silently drops the message.
Fix: add a synchronous task-registry check between the sleep and the pop.
No await between the check and the pop means no other coroutine can
interleave, so the guard is race-free.
* fix(cli): decouple tool_progress=verbose from global DEBUG logging (#31379)
PR #6a1aa420e coupled `display.tool_progress: verbose` (a per-tool display
toggle for full args / results / think blocks) to `self.verbose` — which
controls root-logger DEBUG level. Result: setting tool_progress: verbose
in config silently flipped every module in the process to DEBUG and
flooded the terminal with internal logging, far beyond just full tool
calls.
The two concepts are separate:
- `tool_progress_mode == 'verbose'` → display behavior (tool rendering)
- `self.verbose` → logging behavior (root logger → DEBUG, line 9795)
This change keeps PR #6a1aa420e's argparse.SUPPRESS / config-fallback
plumbing but severs the verbose-display → debug-logging link.
Changes:
- cli.py:2868 — `self.verbose` only follows explicit `verbose=` arg; no
longer auto-True when tool_progress_mode == 'verbose'.
- cli.py:_toggle_verbose — slash-cycle through tool progress modes no
longer flips `self.verbose` / `agent.verbose_logging` / `agent.quiet_mode`.
- cli.py:9355 — fix misleading label (drop 'and debug logs').
- tui_gateway/server.py:_make_agent — same decoupling on the TUI side
(verbose_logging no longer derived from tool_progress_mode).
- tests/cli/test_tool_progress_scrollback.py — invert the test that
asserted the broken coupling; add coverage for explicit `--verbose`
still enabling DEBUG independent of tool_progress.
Live verified:
- tool_progress: verbose, no --verbose flag → 0 DEBUG/INFO log lines
- --verbose flag explicit → 32 DEBUG/INFO log lines (as expected)
* feat(secrets/bitwarden): EU Cloud + self-hosted server URL support (#31378)
Closes #31370.
bws defaults to the US identity endpoint, so EU Cloud and self-hosted
machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"}
during 'hermes secrets bitwarden setup'. The token is valid — it's just
being checked against the wrong region.
Add a Bitwarden region step to the wizard between the access-token and
project-list steps:
Step 1 Install bws
Step 2 Provide access token
Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL)
Step 4 Pick project (now talks to the right endpoint)
Step 5 Test fetch
Region is stored in config.yaml as secrets.bitwarden.server_url and
plumbed into every bws subprocess as BWS_SERVER_URL (project list,
secret list, test fetch, and the env_loader startup pull).
Also:
- Non-interactive: 'hermes secrets bitwarden setup --server-url ...'
- Pre-existing BWS_SERVER_URL in the shell is detected and reused
- Cache key includes server_url so EU/US fetches don't collide
- 'hermes secrets bitwarden status' shows the configured region
- 'invalid_client' / '400 Bad Request' from bws now triggers a hint
pointing at the region setting instead of looking like a bad token
* fix(security): restrict dashboard websockets to loopback clients (#30741)
* fix(gateway): stop enabling dingtalk allow-all during setup (#30743)
* fix(gateway): remove discord role allowlist auth bypass (#30742)
* security: restrict default webhook toolset capabilities (#30745)
* fix(qqbot): authorize approval button interactions by session owner (#30737)
* Harden msgraph webhook auth requirements (#30169)
* fix(docker): keep dashboard side-process loopback by default (#30740)
* security: harden API server key placeholder handling (#30738)
* fix(feishu): authorize interactive exec approval callbacks (#30739)
* fix(feishu): enforce auth and chat binding for approval buttons (#30744)
* fix(feishu): require webhook auth secret and honor config extras (#30746)
* fix(acp): deliver final_response after streaming — transform_llm_output hook now visible
When streaming is active, streamed_message=True skipped the final_response
update, causing plugin hooks like transform_llm_output to be silently
invisible. Remove the `not streamed_message` guard so the final response
(possibly transformed by plugins) is always delivered to the ACP client.
* fix: propagate response_transformed flag — plugin hook output survives streaming suppression
When a transform_llm_output hook modifies final_response after streaming,
the gateway was silently discarding the transformed content because
streamed=True / content_delivered=True triggered the final-send
suppression. Three changes:
1. conversation_loop: set `_response_transformed=True` when a
transform_llm_output hook returns a non-empty string, and expose it
as `response_transformed` in the result dict.
2. gateway/run: skip the final-send suppression when
`response_transformed` is True — the transformed response must
reach the client even if streaming already sent the original text.
3. acp_adapter/server: remove `not streamed_message` guard so
final_response is always delivered (ACP path fixed separately).
* fix(gateway): propagate response_transformed flag through run_sync return dict
run_sync() cherry-picks fields from the run_conversation result dict into
a new response dict for the gateway. response_transformed was missing from
the cherry-pick list, so the gateway always saw it as False and suppressed
the final send even though a transform_llm_output hook had modified the content.
* fix(gateway): edit streamed message instead of sending duplicate when response_transformed
When a transform_llm_output hook appends content after streaming, the previous
fix skipped the final-send suppression which caused the full response to be
sent as a NEW message (duplicate). Instead, edit the existing streamed message
in-place to append the transformed content, then set already_sent=True.
Added stream_consumer.message_id and .accumulated_text public properties.
* test(gateway): regression for plugin-transformed response after streaming
Adds a test that fails without the gateway fix, exercising the
response_transformed=True branch in _finalize_response: a streamed
response whose final text was modified by a transform_llm_output
plugin hook must be edit_message'd in place (not duplicate-sent),
with already_sent=True so the normal final-send is skipped.
Also drops two minor leftovers from the salvaged PR #29119:
* accumulated_text property on GatewayStreamConsumer (unused)
* duplicate _response_transformed=False inside the hook try block
* chore: map kenyon1977@gmail.com for PR #29119 salvage
* fix(acp): only deliver final_response after streaming when transformed
PR #29119 dropped the 'not streamed_message' guard unconditionally so
that plugin-transformed responses (transform_llm_output hook) would
reach ACP clients. That regressed test_prompt_does_not_duplicate_streamed_final_message:
when no transform happened, the streamed text was re-sent as a duplicate
final delivery.
Tighten the condition to mirror the gateway side: deliver after streaming
only when response_transformed=True. Otherwise keep the old guard.
Adds test_prompt_delivers_transformed_response_after_streaming so the
transformed path stays covered.
* fix(streaming): emit finish_reason=length on text-only partial-stream stub
When the API connection drops mid-stream after text deltas have already
been delivered, chat_completion_helpers returned a stub response with
finish_reason=stop. The conversation loop then classified the stub as a
clean text completion (text_response(finish_reason=stop)) and exited
with iteration budget remaining — even when the goal-judge verdict
came back as "continue" milliseconds later (issue #30963).
Switch the text-only partial-stream stub to finish_reason=length. The
existing length-continuation path (length_continue_retries up to 3,
"continue exactly where you left off" prompt, partial parts merged
into final_response) then fires automatically: the partial assistant
content is persisted, the model is asked to continue from the cut
point, and the loop keeps making progress against the goal.
The mid-tool-call branch keeps finish_reason=stop on purpose — its
user-facing warning ("Ask me to retry if you want to continue") asks
the user to drive the retry rather than auto-replaying a tool call
with possible side effects.
#5544's "no duplicate message" contract is preserved verbatim: the
partial content is reused, never re-emitted as a fresh API call, so
the user never sees two copies of the same delta.
Refs: NousResearch/hermes-agent#30963
* fix(conversation-loop): tailor length-continuation prompt for partial stream
The length-continue path's user-facing vprint and continuation prompt
both told the model "your response was truncated by the output length
limit." That's a lie when the stub came from a partial-stream network
error (issue #30963) — and a lie the model can detect, leading to "I
wasn't truncated, I'm done" no-op responses that defeat the
continuation entirely.
Detect the partial-stream-stub via response.id and swap in:
- vprint: "Stream interrupted by network error
(finish_reason='length' on partial-stream-stub)"
- prompt: "[System: The previous response was cut off by a network
error mid-stream. Continue exactly where you left off.
Do not restart or repeat prior text. Finish the answer
directly.]"
Real length truncations still see the original "truncated by output
length limit" prompt — the model needs to know which class of failure
it's recovering from. Same length_continue_retries=3 budget,
truncated_response_parts merging, and final-response stitching
infrastructure on both branches.
Refs: NousResearch/hermes-agent#30963
* test(streaming): pin partial-stream-stub finish_reason + continuation contract
Three test classes lock in the #30963 fix:
1. TestPartialStreamStubFinishReason — drives _interruptible_streaming_api_call
through the two recovery branches and asserts:
- text-only partial → finish_reason="length" (the new behaviour),
- mid-tool-call partial → finish_reason="stop" (unchanged on purpose).
2. TestLengthContinuationPromptBranching — pure-Python check on the branch
that picks the continuation prompt by response.id. Locks the network
error wording for partial-stream-stub vs. the output-length wording
for everything else.
3. TestConversationLoopPartialStreamContinuation — feeds a stub +
continuation pair into run_conversation, verifies the loop makes a
second API call (instead of exiting with text_response(stop)),
confirms the network-error continuation prompt actually reaches the
model on call #2, and that final_response stitches both halves.
Refs: NousResearch/hermes-agent#30963
* fix(mcp): raise ImportError instead of NameError when stdio SDK missing (#31450)
When the 'mcp' Python SDK isn't installed, _run_stdio leaked a bare
'NameError: name StdioServerParameters is not defined' because the
top-level 'from mcp import ...' fails inside try/except ImportError,
leaving the names unbound at module scope.
Mirror the _MCP_HTTP_AVAILABLE gate that _run_http already had: raise
a clear ImportError with install instructions instead.
Fixes #30904
* fix(dashboard): require auth for plugin rescan (#27340)
* fix(gateway): validate Svix webhook signatures (#30200)
* fix: fail closed for webhook routes without secrets
Reject unsigned webhook requests when a route has no effective HMAC secret, even if the request handler is reached without the normal connect-time validation. Add regression coverage for the direct-handler path.
* fix(webhook): use 403 not 500 for missing-secret rejection
Operator misconfiguration is a client/setup error, not an internal server
exception. 403 "forbidden" more accurately reflects "this route refuses
to authenticate" than 500 "internal server error" — the latter triggers
incident alerting on operator monitoring and conflates real bugs with
config drift.
Follow-up tweak to PR #29629 by @m0n3r0.
* chore(release): map m0n3r0 for PR #29629 salvage
* fix(feishu): validate verification token before reflecting url_verification challenge
When FEISHU_VERIFICATION_TOKEN is configured, an unauthenticated remote
could previously prove endpoint control by sending a url_verification
payload with any attacker-controlled challenge string — the handler
reflected the challenge BEFORE running the token check.
Move the verification_token check ahead of the url_verification echo so
the challenge response is gated on a valid token. Add a regression test
covering the wrong-token case. Also fix the stale
test_connect_webhook_mode_starts_local_server fixture to set
FEISHU_VERIFICATION_TOKEN (post #30746 webhook mode requires a secret).
Salvaged from PR #29663 by @m0n3r0 — kept the url_verification reorder
and its regression test; dropped the host-conditional weakening of the
#30746 secret guard (we want webhook secrets required regardless of
bind host, not only on 0.0.0.0/::).
Docs updated to call out the gating.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
* fix(state): restrict sensitive store file permissions
response_store.db (api server) holds conversation history including tool
payloads, prompts, and results. webhook_subscriptions.json holds per-route
HMAC secrets. Under a permissive umask (e.g. 0o022, default on most
distros) both files were created mode 0o644 — readable by other local
users on shared boxes.
- gateway/platforms/api_server.py: ResponseStore tightens itself + WAL/SHM
sidecars to 0o600 after __init__, then trusts the inode. (Original
contributor patch chmod'd after every _commit() — wasteful on a hot
api_server path; chmod-on-create is sufficient since SQLite preserves
mode bits across writes.)
- hermes_cli/webhook.py: _save_subscriptions writes via tempfile.mkstemp
(which itself creates the file with 0o600), chmods the temp before the
atomic rename, and re-asserts 0o600 on the destination so an existing
permissive file from before this fix gets narrowed.
Tests cover (a) creation under permissive umask leaves 0o600 and (b) an
existing 0o644 webhook_subscriptions.json gets narrowed on next save.
Tests guarded with skipif os.name=='nt' since POSIX mode bits don't apply
on Windows.
Salvaged from PR #30917 by @Hinotoi-agent. Reworked the api_server.py
side from chmod-on-every-commit to chmod-on-create.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
* fix(tests): align CI tests with recent security hardening (#31470)
Four recent security PRs landed on main with stale/missing test updates,
breaking 4 test shards on every subsequent PR's CI run:
- test_discord_bot_auth_bypass.py (PR #30742 c3caca658):
DISCORD_ALLOWED_ROLES no longer bypasses _is_user_authorized.
Inverted 3 tests to assert the new (correct) behavior: role config
alone does NOT authorize at the gateway layer.
- test_msgraph_webhook.py (PR #30169 4ca77f105):
adapter.is_connected is a @property, not a method. Test was calling
it with () after the connect() change; TypeError: 'bool' is not
callable. Removed the parens.
- test_feishu_approval_buttons.py (PR #30744 bdb97b857):
Card-action callbacks now go through _allow_group_message
authorization. 3 tests in TestCardActionCallbackResponse didn't
populate adapter._allowed_group_users so the operator's open_id got
rejected. Added the allowlist setup to each test, matching the
existing pattern in test_returns_card_for_approve_action.
Also raise tolerance on test_wait_for_process_kills_subprocess_on_keyboardinterrupt:
the SIGTERM → 3s TimeoutStopSec → SIGKILL → reap chain can exceed 10s
under loaded xdist (40 workers). Bumped _wait_for_pgid_exit timeout
10→30s and worker join timeout 5→15s. Passes 100% in isolation
already; this just makes it tolerant of CI-host load.
Validation: 270/270 tests pass across the 5 affected files.
* fix: emit guardrail halt message to client before closing stream
When the tool loop guardrail fires (max_tool_failures, etc.), the
turn exits with guardrail_halt but no final assistant message was
emitted to the client. The SSE stream closed silently —
indistinguishable from a crash.
The stream_delta_callback(None) before tool execution is a display
flush, not a hard close. After generating the halt response, emit
it through both _safe_print (CLI) and stream_delta_callback (SSE)
so clients see the explanation.
Fixes #30770
* test(guardrail): assert halt message reaches stream_delta_callback
Regression guard for #30770 — verifies the guardrail-halt branch in
agent/conversation_loop.py pushes the synthesized halt message through
stream_delta_callback before breaking out of the loop. Without the
emit, chat-completions SSE writers drain an empty queue and clients
(Open WebUI, etc.) see a finish chunk with zero content delta —
indistinguishable from a crash.
Verified: the test fails when the production fix is reverted.
* fix(dashboard): validate WebSocket Host and Origin
* test(dashboard): send loopback headers for WebSocket sidecar test
* fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452)
* fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main for vision
Fixes #31179. Three coupled fixes so a configured aux vision backend
actually serves vision tasks instead of silently routing images to the
user's main provider:
1. agent/auxiliary_client.py: `auxiliary.<task>.provider: openai` resolves
to `custom` + `https://api.openai.com/v1`. "openai" was not in
PROVIDER_REGISTRY (we have `openai-codex` for OAuth and `custom` for
manual base_url), so the obvious config name silently failed to build a
client. User-supplied base_url is still preserved; only the provider
name normalises to `custom` so resolution doesn't hit the
PROVIDER_REGISTRY-only path.
2. agent/auxiliary_client.py: the vision auto-detect chain now skips the
user's main provider when models.dev reports `supports_vision=False`.
Without this guard, a misconfigured aux provider would fall back to
`auto`, which happily returned the main-provider client. The caller
would then send image content to e.g. api.deepseek.com with model
`gpt-4o-mini` and get a cryptic `unknown variant 'image_url',
expected 'text'` from the provider's parser.
3. tools/vision_tools.py + tools/browser_tool.py: `check_vision_requirements`
now mirrors the runtime fallback chain (explicit provider, then auto),
so `vision_analyze` shows up whenever vision is actually serviceable.
`browser_vision` gets a new `check_browser_vision_requirements` check_fn
that AND-gates browser + vision availability, so it doesn't get
advertised to the model when the call would fail at runtime.
Reproduction (config from the bug report):
model.provider: deepseek
model.default: deepseek-v4-pro
auxiliary.vision.provider: openai
auxiliary.vision.model: gpt-4o-mini
Before: resolve_vision_provider_client() returns None for the explicit
provider, fallback auto returns the deepseek client with model='gpt-4o-mini',
image hits api.deepseek.com → 'unknown variant image_url'. vision_analyze
hidden from tool list; browser_vision exposed but fails at call time.
After: resolves to custom + api.openai.com/v1 with model gpt-4o-mini.
vision_analyze and browser_vision both gate correctly on capability.
Tests: tests/agent/test_vision_routing_31179.py covers all three fixes
(12 cases including the user's exact scenario, base_url preservation,
text-only-main skip, capability-unknown permissive fallback, and tool
gating parity). Existing 382 tests across auxiliary/vision/image_routing
suites still pass.
* test(vision): use exact hostname check to silence CodeQL substring-sanitization alert
* fix(auxiliary): drop model name from vision-skip debug log to silence CodeQL
The new `logger.debug(...)` added in the previous commit interpolated
both `main_provider` and `vision_model` (a public model slug \u2014 not
sensitive). CodeQL's `py/clear-text-logging-sensitive-data` heuristic
re-flagged it twice because the rule mis-detects multi-value
interpolations near tainted-via-config provider strings.
Drop the model from the log args (provider alone is enough to diagnose
the skip; the same sibling branch a few lines up already logs provider
only). Behavior unchanged; CodeQL false positive cleared.
* fix(gateway): swallow transient Telegram TimedOut at loop level
Closes #31066. Closes #31110.
An unhandled `telegram.error.TimedOut` (or peer `NetworkError` /
`httpx` connection error) propagating to the asyncio event loop killed
the entire gateway process, taking down every profile attached to the
same runner. systemd restarted the service after ~5s but the active
conversation turn was lost.
Public adapter methods (`adapter.send`, `adapter.edit_message`,
`adapter.send_voice`, …) are individually try/except-wrapped on
current main, but at least one async path was reaching the loop with
TimedOut unhandled — the report's traceback ends at the deepest httpx
frame and doesn't pinpoint the caller.
Rather than audit 30+ call sites blind, install a loop-level safety net:
`_gateway_loop_exception_handler` is set as the loop's exception handler
in `start_gateway()` after `asyncio.get_running_loop()`. It classifies
the exception via `_is_transient_network_error()` (walks the
__cause__/__context__ chain, matches on class name so the test suite
doesn't need the real telegram/httpx packages installed). Transient
errors are logged at WARNING with full traceback so the originating
call site stays diagnosable; everything else forwards to
`loop.default_exception_handler` so real bugs still surface.
Tests cover the classifier (known transients accepted, real bugs
rejected, cause/context chain unwrap, cyclic-cause termination) and the
handler (swallow + log warning, forward unknowns, missing-exception
context). One end-to-end test schedules an orphan task raising TimedOut
and asserts `asyncio.run` returns cleanly.
* fix(agent): abort on HTTP 402 after pool rotation and fallback fail (#31443)
Closes #31273.
HTTP 402 (insufficient credits) was retried up to agent.api_max_retries
times (default 3), burning paid requests against an exhausted balance.
Real-world impact: ~$40 in 48h on a 24/7 Telegram+Discord gateway.
Root cause: FailoverReason.billing was in the is_client_error
exclusion set in agent/conversation_loop.py, which prevents the
non-retryable-abort branch from firing.
By the time control reaches that predicate:
* credential-pool rotation has already run for billing and either
continued the loop or returned False (pool exhausted/absent)
* the eager-fallback branch has also fired on billing and either
continued the loop or fell through (no fallback configured)
Falling through to the backoff retry from here has no recovery
mechanism left — it just burns more paid requests. Removing billing
from the exclusion set makes 402 abort cleanly once pool+fallback
recovery has failed, mirroring how 401/403 (also should_fallback=True)
already behave.
Added tests/run_agent/test_31273_402_not_retried.py which mirrors the
is_client_error predicate shape from the source and asserts the
invariant (plus a source-inspection guard against accidental
re-introduction).
* feat(security): on-demand supply-chain audit via OSV.dev (#31460)
Adds 'hermes security audit' — a one-shot vulnerability scan against
OSV.dev covering three surfaces a Hermes user actually controls:
1. The running Python's installed PyPI dists (importlib.metadata)
2. Plugin requirements.txt / pyproject.toml pins under ~/.hermes/plugins/
3. Pinned npx/uvx MCP servers in config.yaml
Zero new dependencies (stdlib urllib + importlib.metadata + tomllib +
concurrent.futures). No auth required for OSV's public batch API.
Flags: --json, --fail-on {low,moderate,high,critical} (default: critical),
--skip-venv, --skip-plugins, --skip-mcp
Output groups findings by source, sorts by severity descending, surfaces
fixed-versions inline. Exit 1 when any finding meets the --fail-on tier.
Deliberately out of scope: globally-installed pip/npm, editor/browser
extensions, daily background scans, auto-blocking of installs. The audit
is on-demand by design — daily scans become noise the user trains
themselves to ignore.
* fix(transport): strip Hermes-internal scaffolding keys before chat.completions
The empty-response recovery path in run_agent.py appends synthetic
messages tagged with _empty_recovery_synthetic (and the agent loop uses
_thinking_prefill / _empty_terminal_sentinel similarly). These are
internal bookkeeping markers — they must never reach the wire.
chat_completions' convert_messages only stripped Codex Responses leak
fields (codex_reasoning_items, call_id, etc.), not these _-prefixed
markers. Permissive providers (real OpenAI, Anthropic) silently ignore
unknown message keys so the bug stayed hidden, but strict
OpenAI-compatible gateways reject them outright. Observed against
codex.nekos.me:
502: [ObjectParam] [input[617]._empty_recovery_synthetic]
[unknown_parameter] Unknown parameter:
'_empty_recovery_synthetic'
Because the synthetic messages persist in the session, every
subsequent request in that session carries the poisoned key and
fails identically — a deterministic 502 the retry loop mistakes for
a transient server error.
Fix: convert_messages now drops any top-level message key starting
with '_'. OpenAI's message schema has no '_'-prefixed fields, so this
is safe and future-proofs against new internal markers.
Origin: local-author
Upstream-PR: none
Patch-State: local-only
* fix(error-classifier): treat 5xx request-validation errors as non-retryable
Standard OpenAI returns request-validation failures (unknown/
unsupported parameter, malformed request) as 4xx. Some
OpenAI-compatible gateways return them as 5xx instead — codex.nekos.me
returns 502 for an unknown parameter.
The generic '5xx -> retryable server_error' rule then misfires: the
error is deterministic (every retry gets the identical rejection), so
the retry loop burns all 3 attempts, the transport-recovery path
resets the counter and burns 3 more, and the result is a request
flood against a request that can never succeed.
Fix: when a 500/502 body carries an unambiguous request-validation
signal — 'unknown parameter' / 'unsupported parameter' /
'invalid_request_error' in the message text, or invalid_request_error
/ unknown_parameter / unsupported_parameter as the structured error
code — classify as a non-retryable format_error so the loop fails
fast and falls back. Genuine 502 Bad Gateway with no such signal
stays retryable as before.
Origin: local-author
Upstream-PR: none
Patch-State: local-only
* chore: map soju06@users.noreply.github.com for PR #26054 salvage
* fix(matrix,gateway): Matrix E2EE installs full dep set; plugins respect is_connected
Fixes #31116 — two distinct bugs in fresh-install Matrix gateway:
1. Matrix E2EE setup installed only mautrix[encryption], leaving asyncpg
/ aiosqlite / Markdown / aiohttp-socks uninstalled. The first encrypted
connect failed with 'No…
xbrrr
added a commit
to xbrrr/hermes-agent
that referenced
this pull request
May 26, 2026
* fix(cli): reuse canonical root model key normalization in load_cli_config
* feat(cli): kanban promote verb for manual todo->ready recovery
Adds `hermes kanban promote <task_id>` for manual lifecycle recovery
when an auto-promote daemon misses the parent-done transition (issue
#28822). Refuses promotion unless every parent dep is done/archived
(override with --force). Emits a `promoted_manual` audit event distinct
from the automatic `promoted` kind, so audit consumers can filter
human-driven from system-driven promotions. Supports --dry-run and
--json for orchestration. Does not mutate assignee/claim state — the
dispatcher picks the card up via its normal ready polling path.
Closes #28822.
* feat(kanban): --ids bulk promote + AUTHOR_MAP entry for #29464
Adds an --ids flag to 'hermes kanban promote' mirroring the existing
block/schedule convention, so the marquee use case from issue #28822
(promote all children of a closed organizational parent in one shot)
doesn't require a shell loop. Single-id JSON output stays a flat
object for back-compat; bulk emits a list. Dedupes positional + --ids
so the same id can't be promoted twice in one call. 5 new CLI-level
tests cover bulk happy path, partial-failure exit code, JSON shapes,
and dedup.
Also adds the thedavidmurray noreply-email -> github-login mapping in
scripts/release.py so the salvage cherry-pick passes the AUTHOR_MAP
contributor-credit check.
* fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290)
* fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint
Adds a soft guard so an agent running under one Hermes profile cannot
silently edit a different profile's skills/plugins/cron/memories.
Three layers:
A. agent/file_safety.classify_cross_profile_target
Classifies a write target against the active HERMES_HOME. Returns
a {active_profile, target_profile, area, target_path} dict when the
path lands in another profile's scoped area. PROFILE_SCOPED_AREAS =
(skills, plugins, cron, memories). get_cross_profile_warning()
wraps it into a model-facing error string that names both profiles,
names the area, and points at the cross_profile=True bypass.
Defense-in-depth, NOT a security boundary — the terminal tool runs
as the same OS user and can write any of these paths directly. The
guard exists to prevent confused-agent corruption, not to stop a
determined attacker. SECURITY.md §3.2 (terminal-bypass posture)
still applies.
Wired into tools/file_tools.write_file_tool and patch_tool with a
cross_profile=False kwarg. WRITE_FILE_SCHEMA and PATCH_SCHEMA both
advertise cross_profile so the model can pass it after explicit
user direction. patch_tool extracts target paths from V4A patch
bodies before checking (same shape as the existing sensitive-path
check).
skill_manage is already scoped to the active profile's SKILLS_DIR
by construction, so no extra guard wiring is needed there. The
D-side error message (below) still names other profiles when the
skill exists elsewhere.
B. agent/system_prompt
One deterministic line near the environment-hints block names the
active profile and tells the model not to modify another profile's
skills/plugins/cron/memories without explicit direction. Profile
name is stable for the lifetime of the AIAgent, so the line is
prompt-cache-safe.
D. tools/skill_manager_tool._skill_not_found_error
Replaces the bare "Skill 'X' not found." with a message that:
- names the active profile,
- searches OTHER profiles' skills dirs for the same name,
- names the profile(s) where the skill exists and the path,
- suggests `hermes -p <name>` to switch profiles, or
cross_profile=True for an explicit edit.
All 5 "not found" sites in skill_manager_tool (edit, patch, delete,
write_file, remove_file) now go through the helper.
Reference incident (May 2026): a hermes-security profile session
edited skills under both ~/.hermes/profiles/hermes-security/skills/
AND ~/.hermes/skills/ (the default profile's skills) without
realizing the second path belonged to a different profile. Three of
the four skill files needed manual restoration afterward.
What this PR does NOT do:
* No hard block. The terminal tool can still touch any of these
paths with no guard — same posture as the dangerous-command
approval flow. SECURITY.md §3.2 applies.
* No regex sweep on terminal commands for cross-profile paths.
That direction is a Skills-Guard-style arms race (cd + relative
paths, base64, etc.) and would false-positive on legitimate
cross-profile reads. Filed as a follow-up.
* No on-disk path migration. ~/.hermes/skills/ remains the
default profile's skills dir; this PR is about telling the
agent about that boundary, not changing the layout.
Tests:
tests/agent/test_file_safety_cross_profile.py (16 tests)
- _resolve_active_profile_name covers default/named/failure paths
- classify_cross_profile_target covers all four scoped areas,
both directions (default → named, named → default, named → named),
non-Hermes paths, and root-level config files
- get_cross_profile_warning covers in-profile no-op, cross-profile
message shape, and the defense-in-depth self-documentation
tests/tools/test_cross_profile_guard.py (12 tests)
- write_file: in-profile allow, cross-profile block, cross_profile=True
bypass, non-Hermes pass-through
- patch: replace-mode block, cross_profile=True bypass, V4A patch
path extraction
- skill_manage: error names the other profile (single + multiple),
missing-everywhere falls back to skills_list hint
- system prompt: contract-level checks (both branches present,
cross_profile=True mentioned, ~/.hermes/profiles/ referenced)
All 207 existing tests in file_safety/file_operations/skill_manager
still pass. 10 system-prompt tests still pass.
E2E verified: the exact incident scenario (security profile editing
default's hermes-agent-dev skill) is now blocked with the warning
message; cross_profile=True unblocks.
* fix(code_execution): add cross_profile to write_file/patch stubs
The cross_profile kwarg added to write_file_tool/patch_tool needs to
flow through the execute_code sandbox stubs in _TOOL_STUBS so the
test_stubs_cover_all_schema_params drift test passes. Without this,
scripts running inside execute_code couldn't pass cross_profile=True
through hermes_tools.write_file().
Caught by CI on PR #31290.
* fix(wecom-callback): retry send with fresh token on errcode 40001/42001
When WeCom returns errcode=40001 (invalid credential) or 42001 (token
expired), send() was returning a failure without evicting the bad token
from _access_tokens. All subsequent sends then kept using the same
invalid cached token until its TTL naturally expired (~7200s).
Fix: on the first token-rejection errcode, evict the cache entry and
retry once with a freshly fetched token. Non-token errcodes fail
immediately as before. If the refreshed token also fails, the error
is returned without looping further.
Adds four regression tests covering: successful retry on 40001,
successful retry on 42001, no retry on unrelated errcode, and clean
failure when the refresh does not help.
* gateway: debounce queued text follow-ups
* chore: map pnascimento9596@gmail.com for PR #31235 salvage
* fix(gateway): drop text snippet from debounce debug log (CodeQL)
CodeQL py/clear-text-logging-sensitive-data flagged the candidate-accept
debug log including event.text[:60]. Log text_len instead — sufficient for
debugging burst behavior without surfacing message contents.
Co-authored-by: Paulo Nascimento <pnascimento9596@gmail.com>
* fix(wecom): guard flush task against cancel-delivery race to prevent message loss
When asyncio.sleep() fires just before Task.cancel() is called, CPython
sets _must_cancel=True but cannot cancel the already-completed sleep
future, so CancelledError is delivered at the next await (handle_message)
rather than at the sleep. By that point the superseded task has already
popped the merged event from _pending_text_batches, so the superseding
task sees an empty batch and silently drops the message.
Fix: add a synchronous task-registry check between the sleep and the pop.
No await between the check and the pop means no other coroutine can
interleave, so the guard is race-free.
* fix(cli): decouple tool_progress=verbose from global DEBUG logging (#31379)
PR #6a1aa420e coupled `display.tool_progress: verbose` (a per-tool display
toggle for full args / results / think blocks) to `self.verbose` — which
controls root-logger DEBUG level. Result: setting tool_progress: verbose
in config silently flipped every module in the process to DEBUG and
flooded the terminal with internal logging, far beyond just full tool
calls.
The two concepts are separate:
- `tool_progress_mode == 'verbose'` → display behavior (tool rendering)
- `self.verbose` → logging behavior (root logger → DEBUG, line 9795)
This change keeps PR #6a1aa420e's argparse.SUPPRESS / config-fallback
plumbing but severs the verbose-display → debug-logging link.
Changes:
- cli.py:2868 — `self.verbose` only follows explicit `verbose=` arg; no
longer auto-True when tool_progress_mode == 'verbose'.
- cli.py:_toggle_verbose — slash-cycle through tool progress modes no
longer flips `self.verbose` / `agent.verbose_logging` / `agent.quiet_mode`.
- cli.py:9355 — fix misleading label (drop 'and debug logs').
- tui_gateway/server.py:_make_agent — same decoupling on the TUI side
(verbose_logging no longer derived from tool_progress_mode).
- tests/cli/test_tool_progress_scrollback.py — invert the test that
asserted the broken coupling; add coverage for explicit `--verbose`
still enabling DEBUG independent of tool_progress.
Live verified:
- tool_progress: verbose, no --verbose flag → 0 DEBUG/INFO log lines
- --verbose flag explicit → 32 DEBUG/INFO log lines (as expected)
* feat(secrets/bitwarden): EU Cloud + self-hosted server URL support (#31378)
Closes #31370.
bws defaults to the US identity endpoint, so EU Cloud and self-hosted
machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"}
during 'hermes secrets bitwarden setup'. The token is valid — it's just
being checked against the wrong region.
Add a Bitwarden region step to the wizard between the access-token and
project-list steps:
Step 1 Install bws
Step 2 Provide access token
Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL)
Step 4 Pick project (now talks to the right endpoint)
Step 5 Test fetch
Region is stored in config.yaml as secrets.bitwarden.server_url and
plumbed into every bws subprocess as BWS_SERVER_URL (project list,
secret list, test fetch, and the env_loader startup pull).
Also:
- Non-interactive: 'hermes secrets bitwarden setup --server-url ...'
- Pre-existing BWS_SERVER_URL in the shell is detected and reused
- Cache key includes server_url so EU/US fetches don't collide
- 'hermes secrets bitwarden status' shows the configured region
- 'invalid_client' / '400 Bad Request' from bws now triggers a hint
pointing at the region setting instead of looking like a bad token
* fix(security): restrict dashboard websockets to loopback clients (#30741)
* fix(gateway): stop enabling dingtalk allow-all during setup (#30743)
* fix(gateway): remove discord role allowlist auth bypass (#30742)
* security: restrict default webhook toolset capabilities (#30745)
* fix(qqbot): authorize approval button interactions by session owner (#30737)
* Harden msgraph webhook auth requirements (#30169)
* fix(docker): keep dashboard side-process loopback by default (#30740)
* security: harden API server key placeholder handling (#30738)
* fix(feishu): authorize interactive exec approval callbacks (#30739)
* fix(feishu): enforce auth and chat binding for approval buttons (#30744)
* fix(feishu): require webhook auth secret and honor config extras (#30746)
* fix(acp): deliver final_response after streaming — transform_llm_output hook now visible
When streaming is active, streamed_message=True skipped the final_response
update, causing plugin hooks like transform_llm_output to be silently
invisible. Remove the `not streamed_message` guard so the final response
(possibly transformed by plugins) is always delivered to the ACP client.
* fix: propagate response_transformed flag — plugin hook output survives streaming suppression
When a transform_llm_output hook modifies final_response after streaming,
the gateway was silently discarding the transformed content because
streamed=True / content_delivered=True triggered the final-send
suppression. Three changes:
1. conversation_loop: set `_response_transformed=True` when a
transform_llm_output hook returns a non-empty string, and expose it
as `response_transformed` in the result dict.
2. gateway/run: skip the final-send suppression when
`response_transformed` is True — the transformed response must
reach the client even if streaming already sent the original text.
3. acp_adapter/server: remove `not streamed_message` guard so
final_response is always delivered (ACP path fixed separately).
* fix(gateway): propagate response_transformed flag through run_sync return dict
run_sync() cherry-picks fields from the run_conversation result dict into
a new response dict for the gateway. response_transformed was missing from
the cherry-pick list, so the gateway always saw it as False and suppressed
the final send even though a transform_llm_output hook had modified the content.
* fix(gateway): edit streamed message instead of sending duplicate when response_transformed
When a transform_llm_output hook appends content after streaming, the previous
fix skipped the final-send suppression which caused the full response to be
sent as a NEW message (duplicate). Instead, edit the existing streamed message
in-place to append the transformed content, then set already_sent=True.
Added stream_consumer.message_id and .accumulated_text public properties.
* test(gateway): regression for plugin-transformed response after streaming
Adds a test that fails without the gateway fix, exercising the
response_transformed=True branch in _finalize_response: a streamed
response whose final text was modified by a transform_llm_output
plugin hook must be edit_message'd in place (not duplicate-sent),
with already_sent=True so the normal final-send is skipped.
Also drops two minor leftovers from the salvaged PR #29119:
* accumulated_text property on GatewayStreamConsumer (unused)
* duplicate _response_transformed=False inside the hook try block
* chore: map kenyon1977@gmail.com for PR #29119 salvage
* fix(acp): only deliver final_response after streaming when transformed
PR #29119 dropped the 'not streamed_message' guard unconditionally so
that plugin-transformed responses (transform_llm_output hook) would
reach ACP clients. That regressed test_prompt_does_not_duplicate_streamed_final_message:
when no transform happened, the streamed text was re-sent as a duplicate
final delivery.
Tighten the condition to mirror the gateway side: deliver after streaming
only when response_transformed=True. Otherwise keep the old guard.
Adds test_prompt_delivers_transformed_response_after_streaming so the
transformed path stays covered.
* fix(streaming): emit finish_reason=length on text-only partial-stream stub
When the API connection drops mid-stream after text deltas have already
been delivered, chat_completion_helpers returned a stub response with
finish_reason=stop. The conversation loop then classified the stub as a
clean text completion (text_response(finish_reason=stop)) and exited
with iteration budget remaining — even when the goal-judge verdict
came back as "continue" milliseconds later (issue #30963).
Switch the text-only partial-stream stub to finish_reason=length. The
existing length-continuation path (length_continue_retries up to 3,
"continue exactly where you left off" prompt, partial parts merged
into final_response) then fires automatically: the partial assistant
content is persisted, the model is asked to continue from the cut
point, and the loop keeps making progress against the goal.
The mid-tool-call branch keeps finish_reason=stop on purpose — its
user-facing warning ("Ask me to retry if you want to continue") asks
the user to drive the retry rather than auto-replaying a tool call
with possible side effects.
#5544's "no duplicate message" contract is preserved verbatim: the
partial content is reused, never re-emitted as a fresh API call, so
the user never sees two copies of the same delta.
Refs: NousResearch/hermes-agent#30963
* fix(conversation-loop): tailor length-continuation prompt for partial stream
The length-continue path's user-facing vprint and continuation prompt
both told the model "your response was truncated by the output length
limit." That's a lie when the stub came from a partial-stream network
error (issue #30963) — and a lie the model can detect, leading to "I
wasn't truncated, I'm done" no-op responses that defeat the
continuation entirely.
Detect the partial-stream-stub via response.id and swap in:
- vprint: "Stream interrupted by network error
(finish_reason='length' on partial-stream-stub)"
- prompt: "[System: The previous response was cut off by a network
error mid-stream. Continue exactly where you left off.
Do not restart or repeat prior text. Finish the answer
directly.]"
Real length truncations still see the original "truncated by output
length limit" prompt — the model needs to know which class of failure
it's recovering from. Same length_continue_retries=3 budget,
truncated_response_parts merging, and final-response stitching
infrastructure on both branches.
Refs: NousResearch/hermes-agent#30963
* test(streaming): pin partial-stream-stub finish_reason + continuation contract
Three test classes lock in the #30963 fix:
1. TestPartialStreamStubFinishReason — drives _interruptible_streaming_api_call
through the two recovery branches and asserts:
- text-only partial → finish_reason="length" (the new behaviour),
- mid-tool-call partial → finish_reason="stop" (unchanged on purpose).
2. TestLengthContinuationPromptBranching — pure-Python check on the branch
that picks the continuation prompt by response.id. Locks the network
error wording for partial-stream-stub vs. the output-length wording
for everything else.
3. TestConversationLoopPartialStreamContinuation — feeds a stub +
continuation pair into run_conversation, verifies the loop makes a
second API call (instead of exiting with text_response(stop)),
confirms the network-error continuation prompt actually reaches the
model on call #2, and that final_response stitches both halves.
Refs: NousResearch/hermes-agent#30963
* fix(mcp): raise ImportError instead of NameError when stdio SDK missing (#31450)
When the 'mcp' Python SDK isn't installed, _run_stdio leaked a bare
'NameError: name StdioServerParameters is not defined' because the
top-level 'from mcp import ...' fails inside try/except ImportError,
leaving the names unbound at module scope.
Mirror the _MCP_HTTP_AVAILABLE gate that _run_http already had: raise
a clear ImportError with install instructions instead.
Fixes #30904
* fix(dashboard): require auth for plugin rescan (#27340)
* fix(gateway): validate Svix webhook signatures (#30200)
* fix: fail closed for webhook routes without secrets
Reject unsigned webhook requests when a route has no effective HMAC secret, even if the request handler is reached without the normal connect-time validation. Add regression coverage for the direct-handler path.
* fix(webhook): use 403 not 500 for missing-secret rejection
Operator misconfiguration is a client/setup error, not an internal server
exception. 403 "forbidden" more accurately reflects "this route refuses
to authenticate" than 500 "internal server error" — the latter triggers
incident alerting on operator monitoring and conflates real bugs with
config drift.
Follow-up tweak to PR #29629 by @m0n3r0.
* chore(release): map m0n3r0 for PR #29629 salvage
* fix(feishu): validate verification token before reflecting url_verification challenge
When FEISHU_VERIFICATION_TOKEN is configured, an unauthenticated remote
could previously prove endpoint control by sending a url_verification
payload with any attacker-controlled challenge string — the handler
reflected the challenge BEFORE running the token check.
Move the verification_token check ahead of the url_verification echo so
the challenge response is gated on a valid token. Add a regression test
covering the wrong-token case. Also fix the stale
test_connect_webhook_mode_starts_local_server fixture to set
FEISHU_VERIFICATION_TOKEN (post #30746 webhook mode requires a secret).
Salvaged from PR #29663 by @m0n3r0 — kept the url_verification reorder
and its regression test; dropped the host-conditional weakening of the
#30746 secret guard (we want webhook secrets required regardless of
bind host, not only on 0.0.0.0/::).
Docs updated to call out the gating.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
* fix(state): restrict sensitive store file permissions
response_store.db (api server) holds conversation history including tool
payloads, prompts, and results. webhook_subscriptions.json holds per-route
HMAC secrets. Under a permissive umask (e.g. 0o022, default on most
distros) both files were created mode 0o644 — readable by other local
users on shared boxes.
- gateway/platforms/api_server.py: ResponseStore tightens itself + WAL/SHM
sidecars to 0o600 after __init__, then trusts the inode. (Original
contributor patch chmod'd after every _commit() — wasteful on a hot
api_server path; chmod-on-create is sufficient since SQLite preserves
mode bits across writes.)
- hermes_cli/webhook.py: _save_subscriptions writes via tempfile.mkstemp
(which itself creates the file with 0o600), chmods the temp before the
atomic rename, and re-asserts 0o600 on the destination so an existing
permissive file from before this fix gets narrowed.
Tests cover (a) creation under permissive umask leaves 0o600 and (b) an
existing 0o644 webhook_subscriptions.json gets narrowed on next save.
Tests guarded with skipif os.name=='nt' since POSIX mode bits don't apply
on Windows.
Salvaged from PR #30917 by @Hinotoi-agent. Reworked the api_server.py
side from chmod-on-every-commit to chmod-on-create.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
* fix(tests): align CI tests with recent security hardening (#31470)
Four recent security PRs landed on main with stale/missing test updates,
breaking 4 test shards on every subsequent PR's CI run:
- test_discord_bot_auth_bypass.py (PR #30742 c3caca658):
DISCORD_ALLOWED_ROLES no longer bypasses _is_user_authorized.
Inverted 3 tests to assert the new (correct) behavior: role config
alone does NOT authorize at the gateway layer.
- test_msgraph_webhook.py (PR #30169 4ca77f105):
adapter.is_connected is a @property, not a method. Test was calling
it with () after the connect() change; TypeError: 'bool' is not
callable. Removed the parens.
- test_feishu_approval_buttons.py (PR #30744 bdb97b857):
Card-action callbacks now go through _allow_group_message
authorization. 3 tests in TestCardActionCallbackResponse didn't
populate adapter._allowed_group_users so the operator's open_id got
rejected. Added the allowlist setup to each test, matching the
existing pattern in test_returns_card_for_approve_action.
Also raise tolerance on test_wait_for_process_kills_subprocess_on_keyboardinterrupt:
the SIGTERM → 3s TimeoutStopSec → SIGKILL → reap chain can exceed 10s
under loaded xdist (40 workers). Bumped _wait_for_pgid_exit timeout
10→30s and worker join timeout 5→15s. Passes 100% in isolation
already; this just makes it tolerant of CI-host load.
Validation: 270/270 tests pass across the 5 affected files.
* fix: emit guardrail halt message to client before closing stream
When the tool loop guardrail fires (max_tool_failures, etc.), the
turn exits with guardrail_halt but no final assistant message was
emitted to the client. The SSE stream closed silently —
indistinguishable from a crash.
The stream_delta_callback(None) before tool execution is a display
flush, not a hard close. After generating the halt response, emit
it through both _safe_print (CLI) and stream_delta_callback (SSE)
so clients see the explanation.
Fixes #30770
* test(guardrail): assert halt message reaches stream_delta_callback
Regression guard for #30770 — verifies the guardrail-halt branch in
agent/conversation_loop.py pushes the synthesized halt message through
stream_delta_callback before breaking out of the loop. Without the
emit, chat-completions SSE writers drain an empty queue and clients
(Open WebUI, etc.) see a finish chunk with zero content delta —
indistinguishable from a crash.
Verified: the test fails when the production fix is reverted.
* fix(dashboard): validate WebSocket Host and Origin
* test(dashboard): send loopback headers for WebSocket sidecar test
* fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452)
* fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main for vision
Fixes #31179. Three coupled fixes so a configured aux vision backend
actually serves vision tasks instead of silently routing images to the
user's main provider:
1. agent/auxiliary_client.py: `auxiliary.<task>.provider: openai` resolves
to `custom` + `https://api.openai.com/v1`. "openai" was not in
PROVIDER_REGISTRY (we have `openai-codex` for OAuth and `custom` for
manual base_url), so the obvious config name silently failed to build a
client. User-supplied base_url is still preserved; only the provider
name normalises to `custom` so resolution doesn't hit the
PROVIDER_REGISTRY-only path.
2. agent/auxiliary_client.py: the vision auto-detect chain now skips the
user's main provider when models.dev reports `supports_vision=False`.
Without this guard, a misconfigured aux provider would fall back to
`auto`, which happily returned the main-provider client. The caller
would then send image content to e.g. api.deepseek.com with model
`gpt-4o-mini` and get a cryptic `unknown variant 'image_url',
expected 'text'` from the provider's parser.
3. tools/vision_tools.py + tools/browser_tool.py: `check_vision_requirements`
now mirrors the runtime fallback chain (explicit provider, then auto),
so `vision_analyze` shows up whenever vision is actually serviceable.
`browser_vision` gets a new `check_browser_vision_requirements` check_fn
that AND-gates browser + vision availability, so it doesn't get
advertised to the model when the call would fail at runtime.
Reproduction (config from the bug report):
model.provider: deepseek
model.default: deepseek-v4-pro
auxiliary.vision.provider: openai
auxiliary.vision.model: gpt-4o-mini
Before: resolve_vision_provider_client() returns None for the explicit
provider, fallback auto returns the deepseek client with model='gpt-4o-mini',
image hits api.deepseek.com → 'unknown variant image_url'. vision_analyze
hidden from tool list; browser_vision exposed but fails at call time.
After: resolves to custom + api.openai.com/v1 with model gpt-4o-mini.
vision_analyze and browser_vision both gate correctly on capability.
Tests: tests/agent/test_vision_routing_31179.py covers all three fixes
(12 cases including the user's exact scenario, base_url preservation,
text-only-main skip, capability-unknown permissive fallback, and tool
gating parity). Existing 382 tests across auxiliary/vision/image_routing
suites still pass.
* test(vision): use exact hostname check to silence CodeQL substring-sanitization alert
* fix(auxiliary): drop model name from vision-skip debug log to silence CodeQL
The new `logger.debug(...)` added in the previous commit interpolated
both `main_provider` and `vision_model` (a public model slug \u2014 not
sensitive). CodeQL's `py/clear-text-logging-sensitive-data` heuristic
re-flagged it twice because the rule mis-detects multi-value
interpolations near tainted-via-config provider strings.
Drop the model from the log args (provider alone is enough to diagnose
the skip; the same sibling branch a few lines up already logs provider
only). Behavior unchanged; CodeQL false positive cleared.
* fix(gateway): swallow transient Telegram TimedOut at loop level
Closes #31066. Closes #31110.
An unhandled `telegram.error.TimedOut` (or peer `NetworkError` /
`httpx` connection error) propagating to the asyncio event loop killed
the entire gateway process, taking down every profile attached to the
same runner. systemd restarted the service after ~5s but the active
conversation turn was lost.
Public adapter methods (`adapter.send`, `adapter.edit_message`,
`adapter.send_voice`, …) are individually try/except-wrapped on
current main, but at least one async path was reaching the loop with
TimedOut unhandled — the report's traceback ends at the deepest httpx
frame and doesn't pinpoint the caller.
Rather than audit 30+ call sites blind, install a loop-level safety net:
`_gateway_loop_exception_handler` is set as the loop's exception handler
in `start_gateway()` after `asyncio.get_running_loop()`. It classifies
the exception via `_is_transient_network_error()` (walks the
__cause__/__context__ chain, matches on class name so the test suite
doesn't need the real telegram/httpx packages installed). Transient
errors are logged at WARNING with full traceback so the originating
call site stays diagnosable; everything else forwards to
`loop.default_exception_handler` so real bugs still surface.
Tests cover the classifier (known transients accepted, real bugs
rejected, cause/context chain unwrap, cyclic-cause termination) and the
handler (swallow + log warning, forward unknowns, missing-exception
context). One end-to-end test schedules an orphan task raising TimedOut
and asserts `asyncio.run` returns cleanly.
* fix(agent): abort on HTTP 402 after pool rotation and fallback fail (#31443)
Closes #31273.
HTTP 402 (insufficient credits) was retried up to agent.api_max_retries
times (default 3), burning paid requests against an exhausted balance.
Real-world impact: ~$40 in 48h on a 24/7 Telegram+Discord gateway.
Root cause: FailoverReason.billing was in the is_client_error
exclusion set in agent/conversation_loop.py, which prevents the
non-retryable-abort branch from firing.
By the time control reaches that predicate:
* credential-pool rotation has already run for billing and either
continued the loop or returned False (pool exhausted/absent)
* the eager-fallback branch has also fired on billing and either
continued the loop or fell through (no fallback configured)
Falling through to the backoff retry from here has no recovery
mechanism left — it just burns more paid requests. Removing billing
from the exclusion set makes 402 abort cleanly once pool+fallback
recovery has failed, mirroring how 401/403 (also should_fallback=True)
already behave.
Added tests/run_agent/test_31273_402_not_retried.py which mirrors the
is_client_error predicate shape from the source and asserts the
invariant (plus a source-inspection guard against accidental
re-introduction).
* feat(security): on-demand supply-chain audit via OSV.dev (#31460)
Adds 'hermes security audit' — a one-shot vulnerability scan against
OSV.dev covering three surfaces a Hermes user actually controls:
1. The running Python's installed PyPI dists (importlib.metadata)
2. Plugin requirements.txt / pyproject.toml pins under ~/.hermes/plugins/
3. Pinned npx/uvx MCP servers in config.yaml
Zero new dependencies (stdlib urllib + importlib.metadata + tomllib +
concurrent.futures). No auth required for OSV's public batch API.
Flags: --json, --fail-on {low,moderate,high,critical} (default: critical),
--skip-venv, --skip-plugins, --skip-mcp
Output groups findings by source, sorts by severity descending, surfaces
fixed-versions inline. Exit 1 when any finding meets the --fail-on tier.
Deliberately out of scope: globally-installed pip/npm, editor/browser
extensions, daily background scans, auto-blocking of installs. The audit
is on-demand by design — daily scans become noise the user trains
themselves to ignore.
* fix(transport): strip Hermes-internal scaffolding keys before chat.completions
The empty-response recovery path in run_agent.py appends synthetic
messages tagged with _empty_recovery_synthetic (and the agent loop uses
_thinking_prefill / _empty_terminal_sentinel similarly). These are
internal bookkeeping markers — they must never reach the wire.
chat_completions' convert_messages only stripped Codex Responses leak
fields (codex_reasoning_items, call_id, etc.), not these _-prefixed
markers. Permissive providers (real OpenAI, Anthropic) silently ignore
unknown message keys so the bug stayed hidden, but strict
OpenAI-compatible gateways reject them outright. Observed against
codex.nekos.me:
502: [ObjectParam] [input[617]._empty_recovery_synthetic]
[unknown_parameter] Unknown parameter:
'_empty_recovery_synthetic'
Because the synthetic messages persist in the session, every
subsequent request in that session carries the poisoned key and
fails identically — a deterministic 502 the retry loop mistakes for
a transient server error.
Fix: convert_messages now drops any top-level message key starting
with '_'. OpenAI's message schema has no '_'-prefixed fields, so this
is safe and future-proofs against new internal markers.
Origin: local-author
Upstream-PR: none
Patch-State: local-only
* fix(error-classifier): treat 5xx request-validation errors as non-retryable
Standard OpenAI returns request-validation failures (unknown/
unsupported parameter, malformed request) as 4xx. Some
OpenAI-compatible gateways return them as 5xx instead — codex.nekos.me
returns 502 for an unknown parameter.
The generic '5xx -> retryable server_error' rule then misfires: the
error is deterministic (every retry gets the identical rejection), so
the retry loop burns all 3 attempts, the transport-recovery path
resets the counter and burns 3 more, and the result is a request
flood against a request that can never succeed.
Fix: when a 500/502 body carries an unambiguous request-validation
signal — 'unknown parameter' / 'unsupported parameter' /
'invalid_request_error' in the message text, or invalid_request_error
/ unknown_parameter / unsupported_parameter as the structured error
code — classify as a non-retryable format_error so the loop fails
fast and falls back. Genuine 502 Bad Gateway with no such signal
stays retryable as before.
Origin: local-author
Upstream-PR: none
Patch-State: local-only
* chore: map soju06@users.noreply.github.com for PR #26054 salvage
* fix(matrix,gateway): Matrix E2EE installs full dep set; plugins respect is_connected
Fixes #31116 — two distinct bugs in fresh-install Matrix gateway:
1. Matrix E2EE setup installed only mautrix[encryption], leaving asyncpg
/ aiosqlite / Markdown / aiohttp-socks uninstalled. The first encrypted
connect failed with 'No module named asyncpg' deep inside
MatrixAdapter.connect(). Root cause: the setup wizard hand-rolled a
pip install of one package instead of using lazy_deps.ensure(
'platform.matrix'), and check_matrix_requirements() short-circuited the
runtime installer on 'import mautrix' alone — so the other 4 packages
were never pulled in.
2. Discord auto-enabled itself on every gateway start, even when the user
never selected Discord and had no DISCORD_BOT_TOKEN. Root cause:
gateway/config.py plugin-enablement loop gated enablement on
entry.check_fn() (just 'is the SDK importable?') and ignored
entry.is_connected (the 'did the user configure credentials?' probe).
Same bug class as commit 7849a3d73 fixed for _platform_status in the
setup wizard; this is the runtime counterpart. Affects Discord, Teams,
and Google Chat.
Changes:
- hermes_cli/setup.py::_setup_matrix — install via
lazy_deps.ensure('platform.matrix') to pull the full feature group.
- gateway/platforms/matrix.py::_check_e2ee_deps — verify asyncpg +
aiosqlite + PgCryptoStore in addition to OlmMachine, so E2EE failures
surface at startup instead of at first encrypted-room connect.
- gateway/platforms/matrix.py::check_matrix_requirements — use
feature_missing('platform.matrix') as the install gate instead of a
single 'import mautrix' check, so partial installs trigger the lazy
installer correctly.
- gateway/config.py plugin-enablement loop — consult entry.is_connected
before flipping enabled=True. Explicit YAML enabled=true still wins.
Tests: 3 new in tests/gateway/test_matrix.py (asyncpg-required,
aiosqlite-required, partial-install lazy-runs), 5 new in
tests/gateway/test_platform_registry.py (is_connected=False blocks,
is_connected=True enables, is_connected=None falls back to check_fn,
raising probe doesn't enable, explicit YAML wins).
Validation: 310 tests across affected test modules pass.
* fix(telegram): gate send() on send-path health after reconnect storms (#31165)
After sustained Bad Gateway / TimedOut reconnect cycles, the PTB httpx
client can enter a state where bot.send_message() returns a valid
Message (real message_id) but the message never reaches the recipient.
TelegramAdapter.send returns SendResult(success=True) and cron's
live-adapter branch marks the run delivered while the message is
silently dropped.
Add a _send_path_degraded flag. _handle_polling_network_error sets it
on reconnect storms; the existing _verify_polling_after_reconnect
heartbeat probe clears it once getMe() confirms the Bot client is
healthy. While the flag is set, send() short-circuits with
SendResult(success=False, retryable=True) so cron falls through to
the standalone delivery path (fresh HTTP session).
Closes #31165.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
* fix(agent): only strip mcp_ prefix for OAuth-injected tools (GH-25255)
When strip_tool_prefix=True (Anthropic OAuth path), normalize_response
unconditionally stripped the mcp_ prefix from ALL tool names starting
with mcp_. This broke Hermes-native MCP server tools (registered under
their full mcp_<server>_<tool> name in the registry) because the stripped
name doesn't match any registry entry.
Fix: check the tool registry before stripping. Only strip when:
- The stripped name EXISTS in the registry (OAuth-injected tool)
- The full name does NOT exist in the registry
This preserves backward compatibility for OAuth-injected tools while
protecting native MCP server tools from incorrect prefix removal.
7 new tests covering: OAuth strip, native preserve, no-flag, non-mcp,
unknown tools, mixed responses, and dual-registration edge case.
Signed-off-by: HKPA <hayka-pacha@users.noreply.github.com>
* fix(anthropic): skip mcp_ prefix on outgoing tool schemas when already prefixed
Companion to the GH-25255 incoming-strip fix from @hayka-pacha. Without
this, build_anthropic_kwargs unconditionally added 'mcp_' to every tool
name in step 3, so a native MCP server tool registered as
'mcp_composio_X' was sent as 'mcp_mcp_composio_X' on the wire. The
incoming strip only removes ONE prefix, which still worked on first
call, but on subsequent calls the model pattern-matched the
single-prefixed form from message history and produced names that
stripped to 'composio_X' — registry miss, dispatch fail.
The history-rewrite block (#4) already has this guard. Apply the same
guard to the schema-rewrite block (#3) so round-trip is symmetric.
Added 4 outgoing-side tests. Existing 7 incoming-side tests still pass.
Author map: hayka-pacha added for PR #25270 salvage attribution.
Refs GH-25255.
* fix(telegram): preserve new DM topic lanes
* test(telegram): add brand-new-topic regression for #31086
The cherry-picked fix from #28605 inverts an existing test (an unknown
non-lobby thread_id no longer rewrites to the most-recent binding), but
that test only seeds two bindings and queries a third thread_id. Add a
second regression test that more closely mirrors the live failure mode:
seed exactly one prior binding, then query a brand-new thread_id and
assert recovery returns None — so the new topic is allowed to get its
own session row instead of being silently merged into the previous
topic's session.
Co-authored-by: Fábio Siqueira <fabioxxx@gmail.com>
Co-authored-by: dillweed <dillweed@users.noreply.github.com>
* fix: show recap after in-session resume
* fix(cli): skip tool-call-only entries in resume recap, expose limits as config options
* chore(release): map zhangsamuel12@gmail.com to SamuelZ12 (PR #7480)
* test(cli): reconcile resume-recap tests with skip-tool-only default and compression-chain helper
- test_tool_calls_shown_as_summary: explicitly disable resume_skip_tool_only
(#4434 made True the default; the legacy assertion relied on tool-only
entries being rendered as a summary).
- test_tool_only_message_skipped_by_default: add coverage for the new
default skip behavior.
- test_resume_command_*: mock_db.resolve_resume_session_id now returns the
same id (no compression chain) so the post-#15000 redirect block doesn't
shove a MagicMock into HERMES_SESSION_ID.
* feat(config): document resume-recap tuning keys in DEFAULT_CONFIG
The hardcoded constants in _display_resumed_history were exposed as
config in PR #4434; declare them in DEFAULT_CONFIG and the CLI fallback
dict so they show up in 'hermes config' diagnostics and the schema
validator.
* fix(debug): redact BlueBubbles webhook secrets
* fix(gateway): seed plugin extras before is_connected gate (#31703)
Follow-up to 54e61f933. The plugin enablement gate calls
``entry.is_connected(probe_cfg)`` BEFORE ``env_enablement_fn`` runs,
and the probe is built as ``existing_cfg or PlatformConfig()`` — empty
extras, ``enabled=False``.
For plugins whose ``is_connected`` reads ``config.extra`` instead
of env vars directly, that probe is a misrepresentation of what the
platform will look like after enablement. Google Chat's
``_is_connected`` short-circuits on ``config.enabled`` and inspects
``config.extra["project_id"]`` / ``config.extra["subscription_name"]``
— both False on the default probe even when the user has set
``GOOGLE_CHAT_PROJECT_ID`` and ``GOOGLE_CHAT_SUBSCRIPTION_NAME``. Result:
Google Chat silently fails the gate on every env-var-only setup.
Build a candidate probe that mirrors what the platform will look like
post-enablement:
- pre-call ``env_enablement_fn`` and layer its result into the probe's
``extra`` (without mutating any existing platform config)
- pass ``enabled=True`` on the probe — we're asking "would this BE
configured if we let it in?" not "is it currently enabled?"
- reuse the same seeded extras when we commit the platform to
``config.platforms`` (avoids calling ``env_enablement_fn`` twice)
Discord/IRC/Teams/LINE/ntfy/Simplex ``_is_connected`` hooks read env
vars directly, so they are unaffected. This change only restores
Google Chat on env-var-only setups while keeping the original #31116
Discord-no-token block intact.
All 6 shipped ``env_enablement_fn`` implementations were audited and
are pure reads (no ``os.environ`` writes), so running them earlier in
the loop has no observable side effects.
Tests: 2 new in tests/gateway/test_platform_registry.py covering
extras-seeded-before-is_connected and don't-leak-extras-on-gate-fail.
693 tests across 11 adjacent suites pass (platform_registry, config,
google_chat, matrix, discord_connect, ntfy_plugin, simplex_plugin,
line_plugin, irc_adapter, teams, gateway_platform_gating).
Refs #31116.
* fix(kanban): refuse to rmtree workspace_path outside managed scratch root (#28818)
A board's ``default_workdir`` (e.g. ``hermes kanban boards
set-default-workdir my-board /path/to/real/source``) is copied into
``tasks.workspace_path`` for tasks created without an explicit
``workspace_kind``. Those tasks default to ``workspace_kind='scratch'``,
so completion calls ``_cleanup_workspace`` and unconditionally runs
``shutil.rmtree(wp, ignore_errors=True)`` — deleting the user's real
source tree as if it were disposable scratch storage.
Add ``_is_managed_scratch_path()`` and gate ``_cleanup_workspace`` on
it: only delete paths under ``HERMES_KANBAN_WORKSPACES_ROOT`` (the
worker-side override the dispatcher injects) or under the active kanban
home's ``kanban/`` subtree (covering both the legacy default-board root
and per-board ``kanban/boards/<slug>/workspaces`` roots). Anything else
gets a warning log and is left alone, so a misconfigured
``default_workdir`` can no longer destroy user data on task completion.
* fix(kanban): restrict managed-scratch roots to workspaces/ dirs only
Copilot review on PR #28819 flagged that `_is_managed_scratch_path` accepted
the entire `<kanban_home>/kanban` subtree as managed scratch storage. With
that, a task whose `workspace_kind='scratch'` and `workspace_path` was
mis-set to `<kanban_home>/kanban`, `.../kanban/logs`, or a board's
metadata directory (e.g. `.../kanban/boards/<slug>` without the
`workspaces/` child) would pass the containment guard and let task
completion `shutil.rmtree` Hermes' own DB, metadata, and log subtrees.
Tighten the guard:
* Allowed roots are now exclusively `workspaces/` directories — the
`HERMES_KANBAN_WORKSPACES_ROOT` override, `<kanban_home>/kanban/workspaces`,
and each `<kanban_home>/kanban/boards/<slug>/workspaces` discovered on
disk.
* Require strict descendancy: a path equal to a root itself is rejected
too, because deleting a workspaces root would wipe every task's scratch
dir at once.
Add a regression test covering the three Copilot-named attack paths
(kanban root, kanban/logs, board root without `workspaces/`) plus the
workspaces-root-itself case, and confirm the inner task-id dir still
matches.
* fix(kanban): scratch tasks must not inherit board.default_workdir (#28818)
Board defaults represent persistent project checkouts. Scratch workspaces
are auto-deleted on completion and must stay under the per-board scratch
root that resolve_workspace() creates. Inheriting default_workdir for a
scratch task pointed the cleanup path at the user's source tree — the
data-loss vector documented in #28818.
The containment guard in _cleanup_workspace (just added) is the safety
rail. This commit prevents the bad state from being created in the first
place: only persistent kinds (dir/worktree) inherit board defaults.
Tests updated to cover the new semantics: scratch with default_workdir
set keeps workspace_path=None; dir/worktree still inherits the board
default.
Salvaged from PR #31315 by @leeseoki0 — prevention layer on top of the
#28819 containment fix by @briandevans.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
* chore(release): map leeseoki0 for PR #31315 salvage
* fix(cli): add inline --yes/now skip for destructive slash commands (#30768)
Issue #30768 reports that on native Windows PowerShell the destructive-slash
confirmation modal renders but never registers keypresses, leaving the user
unable to confirm or cancel /reset, /new, /clear, or /undo. The modal works
on macOS, Linux, and WSL; PR #23907 (merged May 11) replaced the
daemon-thread input() pattern with a prompt_toolkit-native keybinding modal
but the win32 input pipeline apparently doesn't dispatch keys to the
filter-conditioned handlers. The modal investigation is ongoing.
This change ships the immediate escape hatch: append `now`, `--yes`, or `-y`
to any destructive slash command to bypass the modal and run the action
immediately. Works on every platform without touching the broken Windows
code path.
/reset now -> reset, no modal
/new --yes my-session -> new session titled "my-session", no modal
/clear -y -> clear, no modal
/undo -y -> undo, no modal
The default behavior (modal prompts when approvals.destructive_slash_confirm
is True) is unchanged for users who don't pass a skip token.
Implementation:
- New classmethod HermesCLI._split_destructive_skip(text) -> (remainder, skip)
parses a destructive-slash command string, strips the leading "/cmd" word
and any recognized skip tokens (case-insensitive exact match, not substring),
and reports whether a skip was requested.
- HermesCLI._confirm_destructive_slash gains an optional cmd_original= arg.
When the arg contains a skip token, it returns "once" immediately —
before the gate check and before any modal rendering.
- The /clear, /new, /undo handlers in process_command pass cmd_original
through. /new additionally uses _split_destructive_skip to strip skip
tokens from the remaining text before deriving the session title, so
"/new now My Session" yields title="My Session" (not "now My Session").
Tests:
- 7 new unit tests in tests/cli/test_destructive_slash_confirm.py covering
the helper (recognized tokens, command-word stripping, case-insensitive
exact match, None/empty input) and the modal bypass (now and --yes both
skip; no-skip-token still consults the modal).
- 3 new integration tests in tests/cli/test_destructive_slash_inline_skip_e2e.py
driving HermesCLI.process_command end-to-end and asserting (a) new_session
is invoked, (b) the modal is never reached, (c) the skip token does not
leak into the session title, and (d) the no-skip-token path still reaches
the modal as a sanity check that we haven't accidentally short-circuited
the normal flow.
All 31 tests across the destructive-slash test surface pass.
Docs:
- website/docs/reference/slash-commands.md documents the new flags both in
the destructive-commands table and the dedicated approval section, with a
link back to issue #30768 explaining why the escape hatch exists.
* fix(cli): show full session titles in /resume list
* fix(file-safety): write-deny pairing/ directory to prevent approved-list injection
The gateway pairing directory (~/.hermes/pairing/) stores per-platform
access-control files (telegram-approved.json, discord-approved.json, etc.).
A prompt-injected agent using write_file could add arbitrary user IDs to an
approved file, granting persistent gateway access without going through the
pairing code flow — the same threat class that motivated protecting
webhook_subscriptions.json (#14157).
The pairing directory was not included in the original control-plane protection
because it postdates PR #14157. PR #30383 introduced the hashed-pending schema
and made the approved files the sole source of truth for gateway access, raising
the security sensitivity of the directory.
Apply the same mcp-tokens pattern: block writes to pairing/ and any path within
it, under both the active hermes_home and the root path (for profile-mode parity
with the fix in #30382).
Regression tests verify denial for pairing/telegram-approved.json,
pairing/discord-pending.json, and the directory itself, in both normal and
profile-mode layouts.
* feat: support numbered resume selection in cli and gateway
* chore(release): map 490408354@qq.com to daizhonggeng (PR #9020)
* i18n+tests: add list_item_numbered, list_footer_numbered, out_of_range for 15 locales
The numbered /resume feature added new i18n keys to en.yaml; the catalog parity
tests require every locale to carry matching keys and placeholders, so add
translations to all 15 supported locales.
Also unblock tests/cli/test_cli_resume_command.py:
- _make_cli stub now sets self.resume_display = 'minimal' since
_handle_resume_command (post-#31695) calls _display_resumed_history.
- mock_db.resolve_resume_session_id returns the input id (no compression
chain) so HERMES_SESSION_ID is set to a real string, not a MagicMock.
* test(cli): update resume usage-hint assertion for numbered selection
PR #9020's salvage changed the /resume list footer from
'Use /resume <session id or title> to continue.' to
'Use /resume <number>, /resume <session id>, or /resume <session title> to continue.\n Example: /resume 2'.
test_resume_without_target_lists_recent_sessions still pinned the old
string verbatim and failed in CI. Relax to substring assertions that
allow both the new numbered footer and any future tweaks while still
verifying the hint is shown.
* fix(security): add missing credential paths to write denylist (#27217)
The write denylist already protects SSH keys, AWS, GPG, npm, PyPI,
Docker, Azure, and GitHub CLI credentials. Two common credential
stores were missing:
~/.git-credentials stores plaintext git tokens in the format
https://username:token@github.com when using git credential-store.
It is directly analogous to ~/.netrc which was already protected.
~/.config/gcloud/ contains Google Cloud OAuth tokens and service
account credentials. It is directly analogous to ~/.aws/ which
was already protected.
Under prompt injection, an agent could be instructed to overwrite
these files, destroying credentials or planting malicious ones.
Verified before and after with is_write_denied() on both paths.
* fix(file-safety): deny reads of Google OAuth tokens (#30972)
* fix(security): close TOCTOU window when saving Claude Code OAuth credentials (#21152)
_write_claude_code_credentials wrote ~/.claude/.credentials.json via
Path.write_text + replace + post-write chmod(0o600). Both the temp file
and the destination briefly inherited the process umask (commonly 0o644
= world-readable) between create/replace and chmod, exposing the OAuth
access/refresh tokens to other local users on multi-user hosts.
Use os.open with O_WRONLY|O_CREAT|O_EXCL and an explicit S_IRUSR|S_IWUSR
mode so the temp file is created atomically at 0o600. After os.replace,
the destination inherits the temp's mode, so the post-write chmod is no
longer needed. The temp name also gains a per-process random suffix to
avoid collisions between concurrent writers and stale leftovers from a
crashed prior write.
Parent dir (~/.claude/) is owned by Claude Code itself and shared with
its native auth, so we deliberately don't tighten its mode here (unlike
the mcp_oauth fix which owns its own subtree under HERMES_HOME).
Mirrors the fix shipped for agent/google_oauth.py in #19673 and the
parallel fix for tools/mcp_oauth.py in #21148.
Adds a regression test in TestWriteClaudeCodeCredentials asserting the
resulting file mode is 0o600 (skipped on Windows where POSIX mode bits
aren't enforced).
* ci(supply-chain): anchor install-hook regex at repo root (#31744)
The SETUP_HITS check matched any file ending in setup.py/setup.cfg/
sitecustomize.py/usercustomize.py at any path depth. This produced
false positives on every PR touching hermes_cli/setup.py (the CLI
setup wizard), which is unrelated to pip/site install hooks.
Only the top-level setup.py/setup.cfg execute during 'pip install',
and only top-level sitecustomize.py/usercustomize.py are auto-loaded
by site.py at interpreter startup. Anchor the regex with '^' so only
repo-root matches fire.
Symptom: PR #30916 (Mattermost plugin migration) flagged purely
because it deletes _setup_mattermost() from hermes_cli/setup.py.
Discord migration (#30591) hit the same false positive yesterday.
* fix(security): restrict write access to Anthropic OAuth credential store
* chore(release): map kronexoi for PR #30553 salvage
* Protect dashboard OAuth credentials with the same file-safety guarantees as other auth paths
The web dashboard's Anthropic OAuth helper wrote the credential file
straight to its final destination and relied on the process umask for
permissions. That left the dashboard-specific path weaker than the
existing auth writers, which already use owner-only permissions and
safer write semantics.
This change keeps the scope narrow: make the dashboard helper write via
a temp file + replace, chmod the final file to owner-only, and add a
focused regression test for both permission handling and atomic-write
behavior.
Constraint: Must preserve the existing dashboard OAuth flow and credential-pool side effects
Rejected: Broader auth-storage refactor | unnecessary scope for a single verified inconsistency
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep dashboard credential writes aligned with existing auth storage semantics; do not reintroduce direct write_text() here without matching chmod/atomic behavior
Tested: pytest -o addopts='' tests/hermes_cli/test_web_server_oauth_write.py tests/hermes_cli/test_web_server.py -q (78 passed)
Not-tested: Cross-platform permission semantics on Windows-managed filesystems
* fix(security): redact credentials before persistence in session capture
Two-layer redaction at the persistence boundary so credentials never reach
state.db, session_*.json, or compression:
1. agent/chat_completion_helpers.py :: build_assistant_message
- Redact assistant content before the message dict is constructed
(catches PATs / API keys the model inlines into natural language)
- Redact tool_call.function.arguments at the same site (catches secrets
inlined into tool args, e.g. terminal command=curl -H 'Authorization: ...')
Tool execution uses the raw API response object, not this dict, so
redacting the persisted shape is safe.
2. run_agent.py :: _save_session_log
- Add _redact_message_content() static helper that handles both string
content and OpenAI/Anthropic multimodal list-of-parts (image parts
pass through untouched, only text/content fields are redacted)
- Apply to every message + the cached system prompt before writing
session_*.json
Both layers respect HERMES_REDACT_SECRETS via redact_sensitive_text —
no-op when disabled.
Tests (TestSaveSessionLogRedactsSecrets, 4 cases):
- api key in tool content
- api key in user message
- api key in system prompt
- multimodal list-of-parts (image part preserved, text redacted)
Tests use an autouse fixture to force _REDACT_ENABLED=True because the
hermetic conftest defaults the env var to false.
Salvaged from PR #24758 by @vgocoder (build_assistant_message + session_log)
+ PR #19855 by @liuhao1024 (multimodal list helper, system_prompt redaction).
Kept only the redaction concern from #19855; its unrelated whatsapp npm
timeout + PATCH_SCHEMA changes are out of scope and dropped.
Refs #19798 (PAT leak via assistant inline mention), #19845 (session capture
credential leak).
Co-authored-by: liuhao1024 <liuhao03@bilibili.com>
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
* chore(release): map vgocoder for PR #24758 salvage
* fix(google_chat): harden oauth credential persistence with atomic private writes (#24788)
* feat(tts): add register_tts_provider() plugin hook (closes #30398)
Adds a `TTSProvider(ABC)` + `register_tts_provider()` extension point
to the plugin context API, **alongside** the existing config-driven
`tts.providers.<name>: type: command` registry from PR #17843. This is
additive — the command-provider surface stays as the primary way to
add a TTS backend.
The hook covers cases the shell-template grammar can't reasonably
express:
- Native Python SDKs without a CLI (Cartesia, Fish Audio, etc.)
- Streaming synthesis (chunked Opus → voice-bubble delivery)
- Voice metadata API for the `hermes tools` picker
- OAuth-refreshing auth flows
None of the 10 inline built-in providers (`edge`, `openai`,
`elevenlabs`, `minimax`, `gemini`, `mistral`, `xai`, `piper`,
`kittentts`, `neutts`) are migrated to plugins. They stay inline. The
hook is for *new* engines that aren't built-in.
## Resolution order
The dispatcher's resolution order is the load-bearing invariant:
1. `tts.provider` is a built-in name → built-in dispatch. **Always wins.**
2. `tts.provider` matches `tts.providers.<name>` with `command:` set
→ command-provider dispatch (PR #17843).
3. `tts.provider` matches a plugin-registered `TTSProvider`
→ plugin dispatch (new).
4. No match → falls through to Edge TTS default (legacy behavior).
Built-ins-always-win is enforced at THREE layers:
- Registry: `register_provider()` rejects shadowing names with a warning.
- Dispatcher: `_dispatch_to_plugin_provider()` short-circuits built-in
names defensively before consulting the registry.
- Picker: `_plugin_tts_providers()` filters built-in shadows out of
the `hermes tools` row list defensively.
Command-providers-win-over-plugins is enforced at TWO layers:
- The caller in `text_to_speech_tool` checks
`_resolve_command_provider_config` first.
- `_dispatch_to_plugin_provider` re-checks for a same-name command
config defensively so a refactor of the caller can't silently break
the invariant.
## New files
- `agent/tts_provider.py` — `TTSProvider(ABC)` with `synthesize()` (required),
`list_voices()`, `list_models()`, `get_setup_schema()`, `stream()`,
`voice_compatible` (all optional with sane defaults). Mirrors
`agent/image_gen_provider.py` shape.
- `agent/tts_registry.py` — `register_provider`/`get_provider`/`list_providers`
with `_BUILTIN_NAMES` reject-shadowing invariant. Mirrors
`agent/image_gen_registry.py` shape.
- `plugins/tts/...` directory ready for community plugins (none shipped).
## Modified files
- `hermes_cli/plugins.py` — `register_tts_provider()` method on
`PluginContext`. Matches the gating shape of
`register_image_gen_provider()` / `register_browser_provider()`.
- `tools/tts_tool.py` — `_dispatch_to_plugin_provider()` +
`_plugin_provider_is_voice_compatible()` + walrus-elif wiring into
the main dispatcher. Built-in elif chain untouched.
- `hermes_cli/tools_config.py` — `_plugin_tts_providers()` injects
plugin rows into the Text-to-Speech picker category alongside the
10 hardcoded built-in rows.
## Tests
- `tests/agent/test_tts_registry.py` — 47 tests covering registration,
lookup, ABC contract, helpers, AND a `TestBuiltinSync` regression
test that fails if `agent.tts_registry._BUILTIN_NAMES` drifts from
`tools.tts_tool.BUILTIN_TTS_PROVIDERS` (kept duplicated due to
circular import constraints).
- `tests/tools/test_tts_plugin_dispatch.py` — 35 tests covering
built-in-always-wins, command-wins-over-plugin, plugin dispatch,
exception passthrough, voice_compatible helper.
- `tests/hermes_cli/test_tts_picker.py` — 10 tests covering the
picker surface, builtin shadowing defense, integration with
`_visible_providers`.
- `tests/hermes_cli/test_plugins_tts_registration.py` — 3 end-to-end
tests via `PluginManager.discover_and_load()`.
- `tests/plugins/tts/check_parity_vs_main.py` — 9-scenario subprocess
parity harness vs `origin/main`. The only intentional diff is
`fallback_edge → plugin` for the `plugin-installed` scenario.
## Verification
- 95/95 new tests pass.
- 170/170 pre-existing TTS tests (test_tts_command_providers,
test_tts_max_text_length, test_tts_speed, etc.) pass unchanged.
- Parity harness against `origin/main`: 8 OK + 1 expected DIFF.
- E2E smoke: a registered plugin's `synthesize()` is called via
`text_to_speech_tool` with the standard JSON envelope returned.
- Ruff clean on all touched files.
## Docs
- `website/docs/user-guide/features/tts.md` — new "Python plugin
providers" section with a decision table (command-provider vs
plugin), minimal plugin example, and the optional-hook reference.
- `website/docs/user-guide/features/plugins.md` — TTS row updated to
mention both surfaces (command-provider primary, plugin for
SDK/streaming).
Closes #30398
* docs(plans): add s6-overlay supervision plan (v3)
Replace tini with s6-overlay as PID 1 in the Hermes Docker image so that
main hermes, the dashboard, and dynamically-created per-profile gateways
all run as supervised services. Includes container-boot reconciliation
(Task 4.0) so per-profile gateways survive docker restart.
Plan history:
- v1: 2026-05-07 — original design (subagent gateways scope)
- v2: 2026-05-18 — re-validated, scope narrowed to per-profile gateways,
WindowsServiceManager added to protocol
- v3: 2026-05-21 — re-validated in docker_s6 worktree, install-method
stamp preservation noted in Task 2.3, Task 4.0 added for container
restart survival
12.5 engineering days estimated across 7 phases.
* test(docker): add conftest fixtures for docker harness
Task 0.1 of the s6-overlay supervision plan. Establishes the test
infrastructure for tests/docker/: skip-on-missing-Docker collection
hook, session-scoped image-build fixture (overridable via the
HERMES_TEST_IMAGE env var for faster local iteration), and a
container_name fixture that ensures cleanup on test exit.
Refs: docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md
* test(docker): lock baseline behavior for Phase 0 harness
Tasks 0.2-0.6 of the s6-overlay supervision plan. Locks the
user-visible behavior we must preserve through the Phase 2 init-
system swap:
- test_main_invocation.py (Task 0.2): docker run <image> with no
args, chat subcommand passthrough, bare executable passthrough,
bash pattern, exit-code propagation
- test_tui_passthrough.py (Task 0.3): T…
1 task
mathias3
pushed a commit
to mathias3/hermes-agent
that referenced
this pull request
May 28, 2026
…ousResearch#31378) Closes NousResearch#31370. bws defaults to the US identity endpoint, so EU Cloud and self-hosted machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"} during 'hermes secrets bitwarden setup'. The token is valid — it's just being checked against the wrong region. Add a Bitwarden region step to the wizard between the access-token and project-list steps: Step 1 Install bws Step 2 Provide access token Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL) Step 4 Pick project (now talks to the right endpoint) Step 5 Test fetch Region is stored in config.yaml as secrets.bitwarden.server_url and plumbed into every bws subprocess as BWS_SERVER_URL (project list, secret list, test fetch, and the env_loader startup pull). Also: - Non-interactive: 'hermes secrets bitwarden setup --server-url ...' - Pre-existing BWS_SERVER_URL in the shell is detected and reused - Cache key includes server_url so EU/US fetches don't collide - 'hermes secrets bitwarden status' shows the configured region - 'invalid_client' / '400 Bad Request' from bws now triggers a hint pointing at the region setting instead of looking like a bad token
shuv1337
pushed a commit
to shuv1337/hermes-agent
that referenced
this pull request
May 28, 2026
…ousResearch#31378) Closes NousResearch#31370. bws defaults to the US identity endpoint, so EU Cloud and self-hosted machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"} during 'hermes secrets bitwarden setup'. The token is valid — it's just being checked against the wrong region. Add a Bitwarden region step to the wizard between the access-token and project-list steps: Step 1 Install bws Step 2 Provide access token Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL) Step 4 Pick project (now talks to the right endpoint) Step 5 Test fetch Region is stored in config.yaml as secrets.bitwarden.server_url and plumbed into every bws subprocess as BWS_SERVER_URL (project list, secret list, test fetch, and the env_loader startup pull). Also: - Non-interactive: 'hermes secrets bitwarden setup --server-url ...' - Pre-existing BWS_SERVER_URL in the shell is detected and reused - Cache key includes server_url so EU/US fetches don't collide - 'hermes secrets bitwarden status' shows the configured region - 'invalid_client' / '400 Bad Request' from bws now triggers a hint pointing at the region setting instead of looking like a bad token
Bryce-huang
pushed a commit
to wbkunlun/hermes-agent
that referenced
this pull request
May 29, 2026
…ousResearch#31378) Closes NousResearch#31370. bws defaults to the US identity endpoint, so EU Cloud and self-hosted machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"} during 'hermes secrets bitwarden setup'. The token is valid — it's just being checked against the wrong region. Add a Bitwarden region step to the wizard between the access-token and project-list steps: Step 1 Install bws Step 2 Provide access token Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL) Step 4 Pick project (now talks to the right endpoint) Step 5 Test fetch Region is stored in config.yaml as secrets.bitwarden.server_url and plumbed into every bws subprocess as BWS_SERVER_URL (project list, secret list, test fetch, and the env_loader startup pull). Also: - Non-interactive: 'hermes secrets bitwarden setup --server-url ...' - Pre-existing BWS_SERVER_URL in the shell is detected and reused - Cache key includes server_url so EU/US fetches don't collide - 'hermes secrets bitwarden status' shows the configured region - 'invalid_client' / '400 Bad Request' from bws now triggers a hint pointing at the region setting instead of looking like a bad token #AI commit#
mosaiq-systems
pushed a commit
to mosaiq-systems/hermes-agent
that referenced
this pull request
May 29, 2026
…ousResearch#31378) Closes NousResearch#31370. bws defaults to the US identity endpoint, so EU Cloud and self-hosted machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"} during 'hermes secrets bitwarden setup'. The token is valid — it's just being checked against the wrong region. Add a Bitwarden region step to the wizard between the access-token and project-list steps: Step 1 Install bws Step 2 Provide access token Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL) Step 4 Pick project (now talks to the right endpoint) Step 5 Test fetch Region is stored in config.yaml as secrets.bitwarden.server_url and plumbed into every bws subprocess as BWS_SERVER_URL (project list, secret list, test fetch, and the env_loader startup pull). Also: - Non-interactive: 'hermes secrets bitwarden setup --server-url ...' - Pre-existing BWS_SERVER_URL in the shell is detected and reused - Cache key includes server_url so EU/US fetches don't collide - 'hermes secrets bitwarden status' shows the configured region - 'invalid_client' / '400 Bad Request' from bws now triggers a hint pointing at the region setting instead of looking like a bad token
gweeteve
pushed a commit
to gweeteve/hermes-agent
that referenced
this pull request
Jun 2, 2026
…ousResearch#31378) Closes NousResearch#31370. bws defaults to the US identity endpoint, so EU Cloud and self-hosted machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"} during 'hermes secrets bitwarden setup'. The token is valid — it's just being checked against the wrong region. Add a Bitwarden region step to the wizard between the access-token and project-list steps: Step 1 Install bws Step 2 Provide access token Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL) Step 4 Pick project (now talks to the right endpoint) Step 5 Test fetch Region is stored in config.yaml as secrets.bitwarden.server_url and plumbed into every bws subprocess as BWS_SERVER_URL (project list, secret list, test fetch, and the env_loader startup pull). Also: - Non-interactive: 'hermes secrets bitwarden setup --server-url ...' - Pre-existing BWS_SERVER_URL in the shell is detected and reused - Cache key includes server_url so EU/US fetches don't collide - 'hermes secrets bitwarden status' shows the configured region - 'invalid_client' / '400 Bad Request' from bws now triggers a hint pointing at the region setting instead of looking like a bad token
MAX007AI
added a commit
to MAX007AI/hermes-agent
that referenced
this pull request
Jun 10, 2026
The branch was cut from an older checkout, so several files silently reverted upstream work that landed before the merge-base. Restore: - hermes_state.py: SCHEMA_VERSION 13 and the messages.observed column (4a91e36). gateway/session.py passes observed= to append_message(); without the parameter every session append raised TypeError inside a debug-logged except block, silently breaking message persistence. - tests/conftest.py: TIRITH_ENABLED=false test-hermeticity guard (71291d8). - tools/code_execution_tool.py: cross_profile parameter on the write_file/patch sandbox stubs (d3c167b, NousResearch#31290). - hermes_cli/env_loader.py: null-byte stripping before _sanitize_env_lines (75643a6) and the server_url/home_path arguments to apply_bitwarden_secrets (bc3f1f4, NousResearch#31378).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #31370.
hermes secrets bitwarden setupnow asks which Bitwarden region the machine account belongs to (US / EU / self-hosted custom URL) so EU Cloud users stop getting[400 Bad Request] {"error":"invalid_client"}against the US identity endpoint.Changes
agent/secret_sources/bitwarden.py:fetch_bitwarden_secrets()/_run_bws_list()/apply_bitwarden_secrets()acceptserver_urland plumb it into the bws subprocess asBWS_SERVER_URL. Cache key includes server_url so EU/US fetches don't collide.hermes_cli/secrets_cli.py: new Step 3 region prompt (US default / EU / custom). New--server-urlflag for non-interactive setup.BWS_SERVER_URLin the shell env is detected and reused.statusshows the region.syncreads server_url from config.invalid_client/400 Bad Requesterrors now show a region hint instead of looking like a bad token.hermes_cli/env_loader.py: passes server_url through toapply_bitwarden_secretsat startup.hermes_cli/config.py: newsecrets.bitwarden.server_urldefault (empty = US Cloud).website/docs/user-guide/secrets/bitwarden.md: documents region step, non-interactive flags, and the invalid_client failure mode.Validation
invalid_clientvault.bitwarden.euBWS_SERVER_URLshell envsecrets.bitwarden.server_urlE2E verified with mocked subprocess: server_url flows from config.yaml → env_loader → apply_bitwarden_secrets →
_run_bws_listenv, and from--server-urlflag → wizard →_list_projectsenv.Backwards compatibility
server_url: ""(the default for existing installs) keeps bws on its built-in default endpoint (US Cloud), so nothing changes for current users. The new field is only consulted when set.Infographic