Skip to content

chore: sync with upstream main (2026-05-07)#22

Merged
bot-ted merged 605 commits into
mainfrom
sync/upstream-20260507
May 7, 2026
Merged

chore: sync with upstream main (2026-05-07)#22
bot-ted merged 605 commits into
mainfrom
sync/upstream-20260507

Conversation

@bot-ted

@bot-ted bot-ted commented May 7, 2026

Copy link
Copy Markdown
Owner

Daily sync with upstream. Auto-created by cron job.\n\nCommits: 604 new upstream commits merged.\n\nKey changes include:\n- Security fixes (TOCTOU closures, credential handling)\n- Discord role-auth scope fix (CVSS 8.1)\n- Kanban worker lifecycle fixes\n- New features: SSE transport for MCP, transform_llm_output plugin hook, Lightpanda browser engine\n- Multiple platform fixes (Telegram, WhatsApp, Feishu, Weixin, Discord)\n- TUI improvements and dashboard updates\n- New locales: Turkish, Ukrainian, French, Chinese\n- 33 providers made pluggable\n- Docker and deployment improvements

zeejaytan and others added 30 commits May 5, 2026 10:15
When a provider returns a 429 rate-limit error (not billing-related),
the auxiliary client's call_llm/async_call_llm previously did NOT trigger
the fallback chain. This caused auxiliary tasks like session_search to
exhaust all 3 retries against the same rate-limited endpoint, losing
session metadata that depended on the summarization completing.

Root cause: `_is_payment_error()` only matched 429s containing billing
keywords ("credits", "insufficient funds", etc.). Provider-specific
rate-limit messages like Nous's "Hold up for a bit, you've exceeded the
rate limit on your API key" didn't match, so `_is_payment_error` returned
False, `_is_connection_error` returned False, and `should_fallback` was
False — all retries hit the same rate-limited provider.

Fix:
- New `_is_rate_limit_error()` function that detects 429 + rate-limit
  keywords, generic 429 without billing keywords, and OpenAI SDK
  `RateLimitError` class instances (which may omit .status_code).
- Updated `should_fallback` in both `call_llm` and `async_call_llm` to
  include `_is_rate_limit_error`.
- Updated the max_tokens retry path to also check for rate-limit errors.
- Updated the reason string to include "rate limit".

This complements the Nous rate guard (PR NousResearch#10568) which prevents new calls
to Nous when already rate-limited — this fix handles the case where a
request is already in flight when the 429 arrives.

Related: NousResearch#8023, NousResearch#12554, NousResearch#11034
Co-authored-by: Zeejay <zjtan1@gmail.com>
The Telegram/Discord /model pickers currently call
list_authenticated_providers(), which returns every provider whose
credentials resolve locally and every model in its curated snapshot.
Two failure modes fall out:

- OpenRouter rows can include IDs the live catalog no longer carries.
- Provider rows can surface with zero callable models (e.g. a slug
  whose credential pool entry exists but has nothing behind it).

list_picker_providers() wraps the base function and post-processes the
result so the interactive picker only shows models the user can
actually select:

- OpenRouter's models come from fetch_openrouter_models() (live-catalog
  filtered against the curated OPENROUTER_MODELS snapshot).
- Rows with an empty models list are dropped, except custom endpoints
  (is_user_defined=True with an api_url) where the user may enter
  model ids manually.
- All other fields pass through unchanged.

The gateway /model handler switches to the new helper for the
interactive picker payload only. Typed /model <name> and the text
fallback list stay on list_authenticated_providers() so nothing is
hidden from power users or platforms without a picker.

Covered by nine focused unit tests in
tests/hermes_cli/test_list_picker_providers.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/model is the canonical command; /provider was a redundant alias that
dispatched to the same ModelPicker overlay. Drop the alias, the regex
branch in useCompletion, and the alias-coverage test.
…llout) (NousResearch#20363)

Fix three regressions introduced by PR NousResearch#18370 (lazy session creation):

1. _finalize_session() uses stale session_key after compression (NousResearch#20001)
2. session_key not synced after auto-compression in run_conversation (NousResearch#20001)
3. pending_title ValueError leaves title wedged forever (NousResearch#19029)
4. Gateway silently swallows null responses when agent did work (NousResearch#18765)
5. One-time cleanup for accumulated ghost compression continuations (NousResearch#20001)

Changes:
- tui_gateway/server.py: _finalize_session() now uses agent.session_id
  (falls back to session_key when agent is None). Refactor
  _sync_session_key_after_compress() with clear_pending_title and
  restart_slash_worker policy flags. Call it post-run_conversation()
  to sync session_key after auto-compression. Add ValueError handler
  to pending_title flush.
- gateway/run.py: Extract _normalize_empty_agent_response() helper that
  consolidates failed/partial/null response handling. Surfaces user-facing
  error when agent did work (api_calls > 0) but returned no text.
- hermes_state.py: Add finalize_orphaned_compression_sessions() — marks
  ghost continuation sessions as ended (non-destructive, preserves data).
- cli.py: One-time startup migration for orphaned compression sessions.

Test changes:
- tests/test_tui_gateway_server.py: Update pending_title ValueError test
  for post-NousResearch#18370 architecture (title applied post-message, not at create).
- tests/test_lazy_session_regressions.py: 14 new regression tests covering
  all fixed paths.
The comment at tools/web_tools.py:700-702 stated the runtime default for
auxiliary.web_extract.timeout is 360s. The actual runtime default is 30s
(_DEFAULT_AUX_TIMEOUT in agent/auxiliary_client.py:3140), used by
_get_task_timeout when no auxiliary.web_extract.timeout key is present in
config.yaml.

The 360s figure is the config template default written by
hermes_cli/config.py:697 into freshly-generated config.yaml files. It only
takes effect when that key exists in the user's config — not as a fallback.
Users on configs that predate commit 20b4060 (Apr 5, 2026), or who removed
the key, fall through to the 30s _DEFAULT_AUX_TIMEOUT runtime default.

The comment was introduced in 20b4060 alongside the template-default bump
from 30 to 360. The runtime default in auxiliary_client.py was not changed
in that commit and has remained 30s since 839d9d7 (Mar 28, 2026).
Mirror _message_thread_id_for_typing() with _message_thread_id_for_send():
both now map the General forum topic (thread id "1") to None upfront.

That removes the need for the retry-without-thread fallback in send_typing()
entirely — if _message_thread_id_for_typing() returns a non-None value, it's
a real user-created topic and falling back to the root chat is never correct.
If Telegram rejects the typing action (e.g. topic deleted mid-session), we
swallow it at debug level instead of bleeding the indicator into All Messages.

Updates the General-topic typing regression test to assert the new single-call
contract.
PR NousResearch#13743 replaced the global MAX_TEXT_LENGTH=4000 with a per-provider
table and a user-override 'max_text_length:' key, but the user-guide
TTS page documented no length behaviour at all. Users hitting truncation
had no way to discover the new caps or the override.

Add an 'Input length limits' subsection after the existing Configuration
YAML block: provider default caps (Edge 5000 / OpenAI 4096 / xAI 15000 /
MiniMax 10000 / Mistral 4000 / Gemini 5000 / ElevenLabs model-aware /
NeuTTS,KittenTTS 2000), ElevenLabs model_id -> cap table (5k-40k), an
override example, and the validation rules (non-positive / non-integer /
boolean values fall through to the provider default).
…ection

Mirrors the AGENTS.md NousResearch#20226 additions (Toolsets / Delegation / Curator /
Cron / Kanban) into the user-facing hermes-agent skill, and closes the
drift in the in-session slash command list.

User report (wxrrior in Discord): the skill did not mention /goal, so a
brand-new session answering "/hermes-agent do you have any info on /goal"
confidently said it did not exist. Cross-check against the CommandDef
registry found 16 commands missing from the static list: /goal, /agents,
/busy, /copy, /curator, /debug, /footer, /gquota, /indicator, /kanban,
/redraw, /reload, /reload-skills, /snapshot, /steer, /topic.

Changes:
- Slash Commands header now tells the reader to run /help or check the
  live docs reference as the source of truth, and names the registry
  of record (hermes_cli/commands.py) so future drift gets flagged
  honestly instead of answered confidently wrong.
- Added all 16 missing commands, slotted into existing subsections
  (/goal and /steer in Session; /busy + /indicator + /footer in
  Configuration; /curator + /kanban + /reload-skills + /reload in
  Tools & Skills; /topic in Gateway; /copy in Utility; /gquota +
  /debug in Info).
- Toolsets table updated to the authoritative 30-key list from
  toolsets.py (added kanban, yuanbao, spotify, safe, debugging, video,
  feishu_doc, feishu_drive, discord, discord_admin, clarify; previously
  stopped at 20 keys).
- New "Durable & Background Systems" section before Troubleshooting
  covers Delegation, Cron, Curator, Kanban - each with a short rundown
  of CLI verbs, key invariants, and a pointer to the user-facing docs.
  Mirrors AGENTS.md NousResearch#20226 but in the skill's user-facing register.
- Bumped version 2.0.0 -> 2.1.0.
PR NousResearch#12473 (merged 2026-04-19) added a new --deliver-only flag to
`hermes webhook subscribe` for zero-LLM direct delivery, but
website/docs/reference/cli-commands.md options table did not
reference it. Add the row so CLI users can discover the flag from
the reference page instead of having to read the source.
Subscribe overlay components to computed theme/session selectors instead of the full UI store so unrelated UI state updates trigger fewer overlay renders.
PR NousResearch#11468 added `hermes skills reset` but cli-commands.md was not
updated. Adds the subcommand to the table and usage examples.

Closes NousResearch#11543
…ousResearch#20332)

* feat(kanban): generic diagnostics engine for task distress signals

Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR NousResearch#20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.

Closes the follow-up from NousResearch#20232 discussion.

New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.

v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
  ``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
  ``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
  fires when ``tasks.spawn_failures >= 3``; suggests
  ``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
  ``crashed`` run outcomes with no successful completion between;
  suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
  state with no comments / unblock attempts; suggests commenting.

Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.

Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.

API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
  ``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
  section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
  severity, sorted critical-first.

CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
  [--json]`` — fleet view or single-task view, matches dashboard rule
  output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
  the top with severity markers + suggested actions.

Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
  !!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
  diagnostics (not just hallucinations), severity-coloured, lists
  affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
  ``DiagnosticsSection`` rendering a card per active diagnostic:
  title + detail + structured data (task-id chips when payload keys
  look like id lists) + action buttons. Reassign profile picker is
  inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
  environments where writeText rejects.
* Three-rung severity palette; amber for warning, orange for error,
  red for critical. Uses CSS variables so theming is straightforward.

Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
  covering each rule's positive/negative/threshold paths, severity
  sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
  populated, severity-filtered), ``/board`` exposes both diagnostic
  list and compact summary with ``highest_severity``.
* Existing hallucination-specific test (``test_board_surfaces_
  warnings_field_for_hallucinated_completions``) updated to reflect
  the new contract: warning summary keys by diagnostic kind
  (``hallucinated_cards``) not event kind.

379 kanban-suite tests pass (+16 net from this PR).

Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:

* Attention strip: shows ``!! 5 tasks need attention`` in the
  error-severity orange; Show expands to a list of 5 rows ordered
  critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
  render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
  diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
  ``broken-ml-worker → alice`` and the drawer refreshed with the
  new assignee + the same diagnostic still firing (correct:
  spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
  ``--severity error`` narrows to 3; ``kanban show <id>`` includes
  the Diagnostics block at the top with suggested action hint.

Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.

* feat(kanban/diagnostics): lead titles with the actual error text

The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.

New titles:

  Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
  Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
  Agent crashed 3x: provider auth error: 401 Unauthorized
  Agent spawn failed 4x: insufficient_quota: You exceeded your current

Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').

Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).

Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
…y module docstrings

The BuiltinMemoryProvider class was removed from the codebase but its
name lingered in the module-level docstrings of memory_manager.py and
memory_provider.py, creating false expectations:

- memory_manager.py docstring showed example code doing
  add_provider(BuiltinMemoryProvider(...)) which ImportError at runtime
- memory_provider.py docstring listed BuiltinMemoryProvider as
  'always present, not removable' — misleading for new contributors

The regression test (test_memory_user_id.py) already passes without
any reference to BuiltinMemoryProvider; it uses RecordingProvider
instances directly. The stale references were docs-only drift.

Update both docstrings to reflect the actual current architecture:
MemoryManager accepts external plugin providers only (one at a time).

Closes NousResearch#14402
Resolved merge against current main (new No-agent mode section added in parallel).

Co-authored-by: Tony Simons <tony@tonysimons.dev>
…els, transport single-path

Introduces providers/ package — single source of truth for every
inference provider. Adding a simple api-key provider now requires one
providers/<name>.py file with zero edits anywhere else.

What this PR ships:
- providers/ package (ProviderProfile ABC + 33 profiles across 4 api_modes)
- ProviderProfile declarative fields: name, api_mode, aliases, display_name,
  env_vars, base_url, models_url, auth_type, fallback_models, hostname,
  default_headers, fixed_temperature, default_max_tokens, default_aux_model
- 4 overridable hooks: prepare_messages, build_extra_body,
  build_api_kwargs_extras, fetch_models
- chat_completions.build_kwargs: profile path via _build_kwargs_from_profile,
  legacy flag path retained for lmstudio/tencent-tokenhub (which have
  session-aware reasoning probing that doesn't map cleanly to hooks yet)
- run_agent.py: profile path for all registered providers; legacy path
  variable scoping fixed (all flags defined before branching)
- Auto-wires: auth.PROVIDER_REGISTRY, models.CANONICAL_PROVIDERS,
  doctor health checks, config.OPTIONAL_ENV_VARS, model_metadata._URL_TO_PROVIDER
- GeminiProfile: thinking_config translation (native + openai-compat nested)
- New tests/providers/ (79 tests covering profile declarations, transport
  parity, hook overrides, e2e kwargs assembly)

Deltas vs original PR (salvaged onto current main):
- Added profiles: alibaba-coding-plan, azure-foundry, minimax-oauth
  (were added to main since original PR)
- Skipped profiles: lmstudio, tencent-tokenhub stay on legacy path (their
  reasoning_effort probing has no clean hook equivalent yet)
- Removed lmstudio alias from custom profile (it's a separate provider now)
- Skipped openrouter/custom from PROVIDER_REGISTRY auto-extension
  (resolve_provider special-cases them; adding breaks runtime resolution)
- runtime_provider: profile.api_mode only as fallback when URL detection
  finds nothing (was breaking minimax /v1 override)
- Preserved main's legacy-path improvements: deepseek reasoning_content
  preserve, gemini Gemma skip, OpenRouter response caching, Anthropic 1M
  beta recovery, etc.
- Kept agent/copilot_acp_client.py in place (rejected PR's relocation —
  main has 7 fixes landed since; relocation would revert them)
- _API_KEY_PROVIDER_AUX_MODELS alias kept for backward compat with existing
  test imports

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Closes NousResearch#14418
…providers/

Every provider profile is now a self-contained plugin under
plugins/model-providers/<name>/, mirroring the plugins/platforms/
pattern established for IRC and Teams. The ProviderProfile ABC
stays in providers/; the per-provider profile data moves out.

- plugins/model-providers/<name>/__init__.py calls register_provider()
- plugins/model-providers/<name>/plugin.yaml declares kind: model-provider
- providers/__init__.py._discover_providers() lazily scans bundled plugins
  then $HERMES_HOME/plugins/model-providers/<name>/ (user override path)
- User plugins with the same name override bundled ones (last-writer-wins
  in register_provider)
- Legacy providers/<name>.py layout still supported for back-compat with
  out-of-tree editable installs
- Hermes PluginManager: new kind=model-provider; skipped like memory
  plugins (providers/ discovery owns them); standalone plugins with
  register_provider+ProviderProfile in their __init__.py auto-coerce to
  this kind (same heuristic as memory providers)
- skip_names extended to include 'model-providers' so the general
  PluginManager doesn't double-scan the category
- 4 new tests in tests/providers/test_plugin_discovery.py covering
  bundled discovery, user override, and general-loader isolation
- Docs updated: website/docs/developer-guide/adding-providers.md,
  provider-runtime.md, providers/README.md, plugins/model-providers/README.md

No API break: auth.py / config.py / doctor.py / models.py / runtime_provider.py /
model_metadata.py / auxiliary_client.py / chat_completions.py / run_agent.py
all still consume providers via get_provider_profile() / list_providers() —
they just now see plugin-discovered entries instead of pkgutil-iterated ones.

Third parties can now drop a single directory into
~/.hermes/plugins/model-providers/<name>/ to add or override an inference
provider without touching the repo.
teknium1 and others added 26 commits May 7, 2026 05:38
…usResearch#16234) (NousResearch#21228)

Cloud metadata endpoints (169.254.169.254 etc.) are now always blocked
by browser_navigate regardless of hybrid routing, allow_private_urls,
or backend.

Bug: commit 42c076d (NousResearch#16136) added hybrid routing that flips
auto_local_this_nav=True for private URLs and short-circuits
_is_safe_url(). IMDS endpoints are technically private (169.254/16
link-local), so the sidecar happily routed them to a local Chromium,
and the agent could read IAM credentials via browser_snapshot. On
EC2/GCP/Azure this is a full SSRF-to-credential-theft.

Fix: new is_always_blocked_url() in url_safety.py — a narrow floor
that checks _BLOCKED_HOSTNAMES, _ALWAYS_BLOCKED_IPS,
_ALWAYS_BLOCKED_NETWORKS only. Applied as an independent gate in
browser_navigate's pre-nav and post-redirect checks, BEFORE
auto_local_this_nav gets a chance to short-circuit. Ordinary private
URLs (localhost, 192.168.x, 10.x, .local, CGNAT) still route to the
local sidecar as the NousResearch#16136 feature intends.

Secondary fix (reporter's finding): _url_is_private() now explicitly
checks 172.16.0.0/12. ipaddress.is_private only covers that range on
Python ≥3.11 (bpo-40791), so on 3.10 runtimes those URLs were routed
to cloud instead of the local sidecar. No security impact — just a
correctness fix for the hybrid-routing feature.

Closes NousResearch#16234.
- Add PID file mechanism to track bridge processes and kill stale ones on startup
- Improve _kill_port_process() with lsof fallback when fuser is not available
- Support explicit WhatsApp disable via config.yaml (whatsapp.enabled: false)
- Respect WHATSAPP_ENABLED=false env var to disable WhatsApp

Fixes NousResearch#19124
…nt file (NousResearch#21230)

When the user created a new board via the dashboard with "switch" checked,
the server-side `current` file was flipped to the new board. Clicking the
original board's tab then showed no cards even though the count badge read
correctly — the REST fetch dropped `?board=` when the selection was
"default" and the backend fell through to `current` (= the new board),
returning a different board's data than the tab the user clicked.

Fix:
- `withBoard()` always appends `?board=<slug>` when a board is selected,
  including "default". The dashboard's tab selection becomes authoritative
  instead of silently deferring to the server's `current` file.
- `writeSelectedBoard()` persists every selection (including "default")
  to localStorage. Previously "default" was stripped, which meant the
  next page load had nothing to pin to and fell through to `current`.
- Same change applied to the WebSocket query builder in `openWs()`.

Contract verified live:
  current_board = "proj2"
  GET /board                  → proj2's tasks   (bug shape: falls through to current)
  GET /board?board=default    → default's tasks (fix: explicit pin wins)
  GET /board?board=proj2      → proj2's tasks

Closes NousResearch#20879.
…headers (NousResearch#21232)

Authenticated remote OpenViking servers derive tenancy from the Bearer
key, but the client was always sending X-OpenViking-Account and
X-OpenViking-User — defaulted to the literal string "default" — which
overrode the key-derived tenant and broke auth.

- _headers(): skip X-OpenViking-Account/-User when blank or "default"
  (treats the legacy default value as unset, so existing installs don't
  need to touch their .env)
- _headers(): send Authorization: Bearer <key> alongside X-API-Key for
  standard HTTP auth compatibility
- health(): include auth headers so /health works against servers that
  require authentication

Tests cover bearer emission, legacy "default" suppression, empty
suppression, real tenant passthrough, and authenticated health checks.

Fixes the same user report as NousResearch#20695 (from @ZaynJarvis); that PR could
not be merged because its branch was stale against main and would have
reverted recent OpenViking work (NousResearch#15696, local resource uploads, summary
URI normalization, fs-stat pre-check).
Enables plugins to transform LLM output text after generation,
useful for vocabulary/personality transformation without burning
inference tokens.

Follows same pattern as transform_tool_result and transform_terminal_output:
- First non-empty string result wins
- Fail-open: exceptions logged as warnings, agent continues
- Signature: (response_text, session_id, model, platform)
- tests/test_transform_llm_output_hook.py: dispatch semantics
  (kwargs contract, first-non-empty-string-wins, empty-string
  pass-through, raising-plugin fail-open, no-plugins = no-op)
- tests/hermes_cli/test_plugins.py: assert the new hook name is in
  VALID_HOOKS alongside the other transform_* hooks
- website/docs/user-guide/features/hooks.md: summary-table entry +
  full section mirroring transform_tool_result / transform_terminal_output
- scripts/release.py: map barnacleboy.jezzahehn@agentmail.to -> JezzaHehn
  (existing entry only covers the gmail address)
…ch#21236)

Lists the skills sitting in ~/.hermes/skills/.archive/ so users have
something to pass to `hermes curator restore`. `curator status` already
shows counts; this fills the name-discovery gap.

Archive layout is flat (`archive_skill` writes to `.archive/<skill>/`),
so the directory name IS the skill name — no frontmatter parsing
needed. Timestamped collision directories (`<skill>-<ts>`) are listed
literally; user can still pass them to `restore`.

Reshape of @EvilDrag0n's NousResearch#20651, simplified: drop the frontmatter
rglob + preamble/trailer output + duplicate subcommand registration.

Co-authored-by: EvilDrag0n <lxl694522264@gmail.com>
…8.1)

The initial DISCORD_ALLOWED_ROLES implementation (NousResearch#11608, merged from NousResearch#9873)
scans every mutual guild when resolving a user's roles. This allows a
cross-guild DM bypass:

1. Bot is in both public server A and private server B.
2. User holds the allowed role in server A only.
3. User DMs the bot. The role check finds the role in A and authorizes the
   DM, granting access as if the user were trusted in server B.

Fix:
- DMs (no guild context) disable role-based auth by default. Opt-in via
  DISCORD_DM_ROLE_AUTH_GUILD=<guild_id> restricts role lookup to one
  explicitly-trusted guild.
- Guild messages check roles only in the originating guild
  (message.guild), never in other mutual guilds.
- Reject cached author.roles when the Member came from a different guild
  than the current message.

Backwards compatibility:
- DISCORD_ALLOWED_USERS behavior is unchanged (still works in both DMs
  and guild messages).
- Deployments that rely on roles in guild channels continue to work;
  role checks are now strictly scoped to that guild.
- Deployments that intentionally want role-based DM auth can opt into a
  single trusted guild via DISCORD_DM_ROLE_AUTH_GUILD.

Tests: 9 new regression guards in
tests/gateway/test_discord_roles_dm_scope.py covering the bypass path,
the opt-in path, cross-guild guild-message bypass, and backwards-compat
user-ID paths. 47/47 discord-auth tests pass.

Refs: NousResearch#11608 (initial implementation), NousResearch#7871 (feature request),
  NousResearch#9873 (PR author credit @0xyg3n)
Sibling-site fix: _evaluate_slash_authorization was the fourth
_is_allowed_user caller and didn't pass guild/is_dm through, so slash
interactions would take the DM branch regardless of whether they came
from a guild channel. Now reads interaction.guild + in_dm and forwards.

Also updates test_discord_slash_auth fixture (_make_interaction) so
the SimpleNamespace guild mock has a get_member(uid)->None method —
required by the new guild-scoped fallback path in _is_allowed_user.
Tests exercising positive role paths still work via user.roles.

Three new regression tests in test_discord_roles_dm_scope:
- Slash DM + role in mutual public guild → rejected
- Slash in guild B + role only in guild A → rejected
- Slash in guild B + role in guild B → allowed (positive control)

368 Discord tests pass. test_discord_free_channel_skips_auto_thread
also fails on clean main (pre-existing, unrelated to this fix).
…var)

Per repo policy, ~/.hermes/.env is for secrets only. Guild IDs are
behavioral configuration, not secrets. Replacing the
DISCORD_DM_ROLE_AUTH_GUILD env var from the original fix with
discord.dm_role_auth_guild in config.yaml.

- New module-level _read_dm_role_auth_guild() helper reads
  hermes_cli.config.read_raw_config()['discord']['dm_role_auth_guild'].
  Fails closed on any parse error (safe default = DM role-auth off).
- DEFAULT_CONFIG['discord'] gains dm_role_auth_guild: '' with a comment
  documenting the opt-in.
- Tests patch hermes_cli.config.read_raw_config directly (via the
  _set_dm_role_auth_guild helper) instead of setenv/delenv. 12 tests
  in test_discord_roles_dm_scope pass; no env var involvement.
- Docstring + module docstring + comments updated to reference
  discord.dm_role_auth_guild.
- E2E verified with real imports across 6 scenarios: unset, int,
  string, garbage, zero, and (crucially) env-var-only-no-config all
  return None except the valid int/string cases. Env var has zero
  effect — policy compliance confirmed.
…s empty (NousResearch#21245)

When the provider rejects a request (e.g. invalid model slug like
'--provider nous --model kimi-k2.6' where the valid slug is
'moonshotai/kimi-k2.6'), run_conversation() returns
{failed: True, error: <detail>, final_response: None}. The TUI gateway
and one-shot CLI mode both dropped the error on the floor and emitted
an empty turn, so the user saw a blank response with no indication
that anything went wrong.

Mirror the interactive CLI's existing pattern (cli.py:9832): when
final_response is empty AND (failed|partial) is set AND error is
populated, surface 'Error: <detail>' as the visible text. Leaves
the None-with-no-error path and the '(empty)' sentinel path
untouched — an empty successful turn still renders empty, and
existing sentinel handlers keep owning their lane.

Reported by @counterposition in PR NousResearch#20873; taking a minimal fix
rather than the broader structured-failure refactor proposed there.
…ng suffix

Discord (and similar platforms) can serve a PNG image cached as
discord_xxx.webp because the CDN reports content_type=image/webp for
proxied stickers, custom emoji, and certain bot-uploaded images even
when the actual bytes are PNG. Hermes' agent.image_routing._guess_mime
trusted the file suffix and declared media_type=image/webp to
Anthropic, which strict-validates and returns:

  HTTP 400 messages.N.content.M.image.source.base64:
  The image was specified using the image/webp media type,
  but the image appears to be a image/png image

The Discord image attachment never reaches the model; the whole turn
fails with no salvage path.

Fix: sniff magic bytes in _file_to_data_url before declaring MIME.
Suffix-based detection is kept as a fallback when bytes aren't
available. New helper _sniff_mime_from_bytes covers PNG, JPEG, GIF,
WEBP, BMP, and HEIC/HEIF.

Tests:
- Two existing tests asserted the old broken behaviour (PNG bytes in
  a .jpg/.webp file should report jpeg/webp); rewritten with real
  jpeg/webp magic bytes so they still cover suffix-aligned cases.
- New regression test test_mime_sniff_overrides_misleading_extension
  reproduces the exact Discord scenario (PNG bytes, .webp suffix) and
  asserts the data URL comes back as image/png.

All 28 tests in tests/agent/test_image_routing.py pass.
Merged 604 upstream commits into fork.
Resolved conflicts by accepting upstream version.
@github-actions

github-actions Bot commented May 7, 2026

Copy link
Copy Markdown

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@github-actions

github-actions Bot commented May 7, 2026

Copy link
Copy Markdown

🔎 Lint report: sync/upstream-20260507 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 7525 on HEAD, 0 on base (🆕 +7525)

🆕 New issues (3949):

Rule Count
unresolved-import 1216
invalid-argument-type 905
unresolved-attribute 865
invalid-assignment 442
unsupported-operator 121
invalid-parameter-default 118
not-subscriptable 82
invalid-method-override 56
invalid-return-type 36
no-matching-overload 32
call-non-callable 27
unresolved-reference 20
invalid-type-form 13
unused-type-ignore-comment 4
not-iterable 3
+6 more rules
First entries
plugins/google_meet/cli.py:166: [unresolved-import] unresolved-import: Cannot resolve imported module `playwright`
tests/gateway/test_teams.py:136: [unresolved-attribute] unresolved-attribute: Unresolved attribute `HttpRequest` on type `ModuleType`
environments/tool_call_parsers/glm45_parser.py:21: [unresolved-import] unresolved-import: Cannot resolve imported module `openai.types.chat.chat_completion_message_tool_call`
tests/gateway/test_feishu_onboard.py:5: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/hermes_cli/test_atomic_json_write.py:8: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/cron/test_cron_workdir.py:283: [unresolved-attribute] unresolved-attribute: Unresolved attribute `AIAgent` on type `ModuleType`
gateway/platforms/discord.py:1642: [unresolved-import] unresolved-import: Cannot resolve imported module `discord`
tests/tools/test_tool_output_limits.py:20: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
gateway/platforms/api_server.py:567: [invalid-assignment] invalid-assignment: Object of type `None` is not assignable to `def pause_job(job_id: str, reason: str | None = None) -> dict[str, Any] | None`
tests/gateway/test_weixin.py:852: [invalid-assignment] invalid-assignment: Object of type `Mock` is not assignable to attribute `is_duplicate` of type `def is_duplicate(self, msg_id: str) -> bool`
gateway/platforms/dingtalk.py:42: [unresolved-import] unresolved-import: Cannot resolve imported module `dingtalk_stream.frames`
tests/gateway/test_approve_deny_commands.py:18: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
optional-skills/security/oss-forensics/scripts/evidence-store.py:165: [unresolved-attribute] unresolved-attribute: Attribute `get` is not defined on `list[Unknown]` in union `dict[str, str | Unknown] | list[Unknown] | Any`
tests/cli/test_reasoning_command.py:553: [invalid-argument-type] invalid-argument-type: Method `__getitem__` of type `Overload[(i: SupportsIndex, /) -> str, (s: slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> list[str]]` cannot be called with key of type `Literal["show_reasoning"]` on object of type `list[str]`
gateway/platforms/slack.py:779: [invalid-method-override] invalid-method-override: Invalid override of method `send_private_notice`: Definition is incompatible with `BasePlatformAdapter.send_private_notice`
tests/tools/test_image_generation.py:413: [unresolved-attribute] unresolved-attribute: Unresolved attribute `response` on type `Exception`
gateway/platforms/telegram.py:27: [unresolved-import] unresolved-import: Cannot resolve imported module `telegram.ext`
gateway/platforms/matrix.py:1055: [unresolved-import] unresolved-import: Cannot resolve imported module `httpx`
tests/gateway/test_teams.py:177: [unresolved-attribute] unresolved-attribute: Unresolved attribute `TypingActivityInput` on type `ModuleType`
tools/memory_tool.py:43: [invalid-assignment] invalid-assignment: Object of type `None` is not assignable to `<module 'fcntl'>`
gateway/platforms/dingtalk.py:934: [unresolved-attribute] unresolved-attribute: Attribute `CreateCardRequestImRobotOpenSpaceModel` is not defined on `None` in union `Unknown | None`
gateway/platforms/telegram.py:1342: [unresolved-attribute] unresolved-attribute: Attribute `message_id` is not defined on `None` in union `None | Unknown`
optional-skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py:2724: [invalid-argument-type] invalid-argument-type: Argument to bound method `Migrator.record` is incorrect: Expected `Path | None`, found `Literal["archive/tools-config.json"]`
tests/test_timezone.py:65: [unresolved-attribute] unresolved-attribute: Attribute `total_seconds` is not defined on `None` in union `timedelta | None`
tests/gateway/test_telegram_thread_fallback.py:65: [unresolved-attribute] unresolved-attribute: Unresolved attribute `constants` on type `ModuleType`
... and 3924 more

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@bot-ted bot-ted merged commit 5b89079 into main May 7, 2026
5 of 10 checks passed
@bot-ted bot-ted deleted the sync/upstream-20260507 branch May 7, 2026 13:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.