chore: sync with upstream main (2026-05-11)#28

Merged
bot-ted merged 1108 commits into main from sync/upstream-20260511 on May 11, 2026

bot-ted commented on May 11, 2026

Daily sync with upstream. Auto-created by cron job.

8052 new commits from upstream main.

Recent commits:
64145a1 fix(nix): replace chown -R with targeted find in container entrypoint (NousResearch#23633)
5606258 feat(nix): add extraDependencyGroups for sealed venv extras (NousResearch#21817)
d992fd9 feat(deps): add hindsight-client as optional dependency (NousResearch#21818)
ebf2ea5 feat(terminal,cli): docker_extra_args + display.timestamps
228b7d2 fix(auxiliary): cache 402'd providers as unhealthy with TTL to stop per-call retry storms (NousResearch#23597)
ace1c4e fix(discord): typing indicator task not cleaned up after API error
0458d99 chore(release): AUTHOR_MAP entry for Mibayy clawhub email
9526040 chore(skills/stocks): tighten SKILL.md to modern format
2ea957f chore(skills/stocks): relocate to optional-skills/finance/stocks/
896a7ce feat: add stocks & finance skill (Yahoo Finance, no API key)
bf2cc8b Merge pull request NousResearch#20317 from NousResearch/meta/security-policy
228a4d1 fix(config): warn loudly on YAML parse failure instead of silent default fallback (NousResearch#23585)
3af3c4e fix(misc): three small defensive fixes from PR NousResearch#1974
482d49c chore: AUTHOR_MAP entry for wilsen0
edb4a2b test(telegram): cover env-clamped helper + adaptive text-batch tiers
ac95b8c perf(gateway): tune Telegram cadence + adaptive fast-path for short replies
e3b88a8 rename(skills): api-testing -> rest-graphql-debug (NousResearch#23589)
5f76787 chore(release): AUTHOR_MAP entry for Hugo-SEQUIER
1f89939 chore(skills/hyperliquid): tighten SKILL.md to modern format
f2e8ed2 Add unit tests for hyperliquid skill functionality

...
Full log: 1107 commits total

kshitijk4poor and others added 30 commits May 9, 2026 02:33
Maps zhekinmaksim@gmail.com to GitHub login Zhekinmaksim so
contributor_audit.py recognizes their authored commit in the
upcoming NousResearch#21930 salvage PR.
… contexts

Follow-up to PR NousResearch#21293 (cli.py), which fixed the same anti-pattern.
`asyncio.get_event_loop()` is documented as effectively "always returns
the running loop when called from a coroutine" and emits
DeprecationWarning/RuntimeWarning in some interpreter configurations.
The Python docs explicitly recommend get_running_loop() inside coroutines.

Replaces the remaining 9 call sites that are unconditionally inside
async def bodies:

- tools/browser_cdp_tool.py — _cdp_call() (4 sites): deadline + remaining
  computations inside the async websockets.connect context manager.
- hermes_cli/web_server.py — get_status, _start_device_code_flow,
  submit_oauth_code (3 sites): all FastAPI async endpoints offloading
  blocking httpx / PKCE work to run_in_executor.
- environments/agent_loop.py — HermesAgentLoop (1 site): tool dispatch
  inside the async rollout loop.
- environments/benchmarks/terminalbench_2/terminalbench2_env.py —
  rollout_and_score_eval (1 site): test verification thread offload.

All 9 sites are unconditionally inside async def bodies, so a running
loop is guaranteed and no try/except RuntimeError fallback is needed
(unlike the cli.py case in NousResearch#21293, which ran from a background thread).

Behavior is identical on supported Python versions; aligns the codebase
with the post-NousResearch#21293 idiom and avoids future warnings as the deprecation
hardens.
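
The swap described above can be sketched as follows. This is a minimal illustration, not code from the repository: `blocking_work` and `offload` are hypothetical names, and the only point shown is that `asyncio.get_running_loop()` needs no fallback when called inside a coroutine.

```python
import asyncio
import time


def blocking_work(seconds: float) -> str:
    # Hypothetical stand-in for blocking I/O (for example an httpx call).
    time.sleep(seconds)
    return "done"


async def offload() -> str:
    # Inside a coroutine a running loop always exists, so no
    # try/except RuntimeError fallback is needed here.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_work, 0.01)


result = asyncio.run(offload())
```

Called from plain synchronous code with no loop running, `get_running_loop()` raises `RuntimeError`, which is why the background-thread case in cli.py needed a fallback and these call sites do not.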

Salvaged from PR NousResearch#21930 by @Zhekinmaksim onto current main (the
original branch was 109 commits behind and carried unintended
stale-branch reverts of unrelated landed changes — _tail_lines
encoding=utf-8 and the Windows PTY bridge guard). Only the 9 swaps
from the PR's intended scope are applied here.

Maps obafemiferanmi1999@gmail.com (the commit-author email used on
PR NousResearch#21473's branch) to GitHub login KvnGz (the PR/branch owner) so
contributor_audit.py recognizes the authored commit in the upcoming
salvage PR.

Three tests in tests/agent/test_auxiliary_config_bridge.py read
in-tree source files (gateway/run.py and cli.py) via
Path.read_text() with no encoding argument.  The default falls
back to the system locale, which on Western Windows installs is
cp1252, and the read fails as soon as the source contains a byte
that cp1252 leaves unassigned (e.g. 0x8f, which occurs in the UTF-8
encoding of some non-ASCII characters):

    UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f
    in position 41190: character maps to <undefined>

Linux CI doesn't catch this because the default Linux locale is
UTF-8.  Windows contributors hit it on every run of the test suite.

Pin encoding="utf-8" on the three call sites that read repo
source files.  This matches the existing precedent in
hermes_cli/doctor.py:363, where the same pattern (with an
explanatory comment) was applied to fix the .env read on
non-UTF-8 Windows locales.
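
A small sketch of the failure mode and the fix, under stated assumptions: the file name and its contents are invented for illustration, and a cp1252 locale is simulated by decoding the raw bytes explicitly rather than by changing the system locale.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.py"
    # U+00CF encodes to b"\xc3\x8f" in UTF-8; 0x8f is unassigned in cp1252.
    path.write_text("x = 1  # \u00cf\n", encoding="utf-8")

    # Pinned encoding: the read behaves the same on every platform.
    text = path.read_text(encoding="utf-8")

    # Simulate a cp1252 default locale by decoding the same bytes.
    try:
        path.read_bytes().decode("cp1252")
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True
```

The pinned read succeeds regardless of locale, while the cp1252 decode raises the same kind of `UnicodeDecodeError` quoted above.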

Affected tests now pass on Windows + Python 3.12:
  - TestGatewayBridgeCodeParity.test_gateway_has_auxiliary_bridge
  - TestGatewayBridgeCodeParity.test_gateway_no_compression_env_bridge
  - TestCLIDefaultsHaveAuxiliaryKeys.test_cli_defaults_can_merge_auxiliary

Plugin platforms (IRC, Teams, Google Chat) currently fail with
`No live adapter for platform '<name>'` when a `deliver=<plugin>` cron
job runs in a separate process from the gateway, even though the
platforms are eligible cron targets via `cron_deliver_env_var` (added
in NousResearch#21306). Built-in platforms (Telegram, Discord, Slack, etc.) use
direct REST helpers in `tools/send_message_tool.py` so cron can deliver
without holding the gateway in the same process; plugin platforms
historically depended on `_gateway_runner_ref()` which returns `None`
out of process.

This change adds an optional `standalone_sender_fn` field to
`PlatformEntry` so plugins can register an ephemeral send path that
opens its own connection, sends, and closes without needing the live
adapter. The dispatch site in `_send_via_adapter` falls through to the
hook when the gateway runner is unavailable, with a descriptive error
when neither path applies. The hook is optional, so existing plugins
are unaffected.

Reference migrations land in the same change for IRC, Teams, and
Google Chat, exercising the hook across stdlib (asyncio + IRC protocol),
Bot Framework OAuth client_credentials, and Google service-account
flows respectively.

Security hardening on the new code paths:
* IRC: control-character stripping on chat_id and message body to
  block CRLF command injection; bounded nick-collision retries; JOIN
  before PRIVMSG so channels with the default `+n` mode accept the
  delivery.
* Teams: TEAMS_SERVICE_URL validated against an allowlist of known
  Bot Framework hosts (`smba.trafficmanager.net`,
  `smba.infra.gov.teams.microsoft.us`) to block SSRF; chat_id and
  tenant_id constrained to the documented Bot Framework character set;
  per-request timeouts so a slow STS endpoint cannot starve the
  activity POST.
* Google Chat: chat_id and thread_id validated against strict
  resource-name regexes; service-account refresh wrapped in
  `asyncio.wait_for` so a hung token endpoint cannot stall the
  scheduler.

Test coverage: 20 new tests covering happy path, missing-config errors,
network failure modes, and each defensive validation. Existing tests
unchanged. `bash scripts/run_tests.sh tests/tools/test_send_message_tool.py
tests/gateway/test_irc_adapter.py tests/gateway/test_teams.py
tests/gateway/test_google_chat.py` reports 341 passed, 0 regressions.

Documentation: new "Out-of-process cron delivery" section in
website/docs/developer-guide/adding-platform-adapters.md and an entry
in gateway/platforms/ADDING_A_PLATFORM.md naming the hook.
…lone-all

When the source profile is the default (~/.hermes), shutil.copytree()
was copying multi-GB infrastructure alongside the ~40 MB of actual
profile data: hermes-agent/ (repo checkout + 3 GB venv), .worktrees/,
profiles/ (sibling profiles — recursive!), bin/ (installed binaries),
node_modules/ (hundreds of MB).

Add _CLONE_ALL_DEFAULT_EXCLUDE_ROOT frozenset with these five entries
and pass an ignore callback to copytree().  Exclusions are gated on
the source actually being the default profile (is_default_source) so
named-profile sources are never affected.

Also exclude at any depth: __pycache__/, *.pyc, *.pyo, *.sock, *.tmp.
Profile data (config.yaml, .env, auth.json, state.db, sessions/,
skills/, logs/) is preserved intact — clone-all means 'complete
snapshot minus infrastructure'.

Mirrors the approach already used by _default_export_ignore() and
_DEFAULT_EXPORT_EXCLUDE_ROOT (the export-side exclusion set which is
broader because it produces a portable archive, not a live clone).
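
The ignore-callback approach can be sketched as below. The constant and function names are hypothetical stand-ins (the real names are the ones given in the commit message), and the logic is a simplified reading of the description: root-level exclusions apply only at the top of a default-profile source, while pattern exclusions apply at any depth.

```python
import fnmatch
import shutil
import tempfile
from pathlib import Path

# Hypothetical stand-ins for the exclusion sets named in the commit.
EXCLUDE_ROOT = frozenset({"hermes-agent", ".worktrees", "profiles", "bin", "node_modules"})
EXCLUDE_ANY_DEPTH = ("__pycache__", "*.pyc", "*.pyo", "*.sock", "*.tmp")


def make_ignore(source_root: Path, is_default_source: bool):
    root = source_root.resolve()

    def _ignore(directory: str, names: list[str]) -> set[str]:
        # Pattern exclusions apply in every directory.
        ignored = {n for n in names if any(fnmatch.fnmatch(n, p) for p in EXCLUDE_ANY_DEPTH)}
        # Root exclusions apply only at the top level of a default-profile source.
        if is_default_source and Path(directory).resolve() == root:
            ignored |= {n for n in names if n in EXCLUDE_ROOT}
        return ignored

    return _ignore


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    (src / "profiles" / "other").mkdir(parents=True)
    (src / "skills" / "bin").mkdir(parents=True)  # nested "bin" is kept
    (src / "skills" / "__pycache__").mkdir()
    (src / "config.yaml").write_text("model: demo\n", encoding="utf-8")
    dst = Path(tmp) / "dst"
    shutil.copytree(src, dst, ignore=make_ignore(src, is_default_source=True))
    copied = sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*"))
```

In this sketch the top-level `profiles/` directory and the nested `__pycache__/` are skipped, while `skills/bin` survives because the root set is only consulted at the source root.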

Co-authored-by: MustafaKara7 <karamusti912@gmail.com>
Co-authored-by: fahdad <30740087+fahdad@users.noreply.github.com>
Fixes NousResearch#5022
Based on PRs NousResearch#5025, NousResearch#5026, and NousResearch#21728
…-scoped path

check_for_updates() and _resolve_repo_dir() were preferring
$HERMES_HOME/hermes-agent/ over Path(__file__).parent.parent.resolve()
when looking for a .git checkout.  For profiles created with
--clone-all, $HERMES_HOME/hermes-agent/ points to a stale copy
with a frozen HEAD, causing persistent "N commits behind" banners
that never resolved.

Flip the resolution order: prefer the running code's location first,
fall back to $HERMES_HOME/hermes-agent/ only when the live checkout
doesn't have a .git (system-wide pip installs, distro packages).

The embedded-rev branch (HERMES_REVISION env var, set by nix builds)
is unaffected — it uses git ls-remote against upstream, never reads
the local checkout's HEAD.

Based on PR NousResearch#21728 by @fahdad
Nikita Nosov (nik1t7n, PR NousResearch#22264) — first-time contributor email
and noreply alias.
…ndering

When a GFM table has a row-label column (first column with no header),
_render_table_block_for_telegram incorrectly included the row-label cell
in the bullet zip alongside the data cells, producing a spurious bullet
like '• 維度: 核心賣點' before the real data rows.

Detect the row-label column by comparing the first data row cell count
against the header count (has_row_label_col = len(first_data_row) ==
len(headers) + 1). When present, use cells[0] as the heading and
zip headers against cells[1:] only, correctly excluding the row-label
from the bullet list.
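
The detection rule can be illustrated with a simplified renderer. `render_rows` is a hypothetical reduction of the real function; only the `has_row_label_col` check and the `cells[0]` / `cells[1:]` split follow the description above.

```python
def render_rows(headers: list[str], rows: list[list[str]]) -> list[str]:
    # A row-label column is present when data rows carry one more cell
    # than there are headers.
    has_row_label_col = bool(rows) and len(rows[0]) == len(headers) + 1
    lines: list[str] = []
    for cells in rows:
        if has_row_label_col:
            lines.append(cells[0])  # the row label becomes the heading
            data = cells[1:]
        else:
            data = cells
        lines.extend(f"• {h}: {c}" for h, c in zip(headers, data))
    return lines


out = render_rows(["Price", "Stock"], [["Widget", "$5", "12"]])
```

Here the label `Widget` is emitted once as a heading and never appears inside a bullet.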

Fixes NousResearch#22604
…ble-row-label-duplicate-bullet

fix(telegram): exclude row-label column from bullet items in table re…
…-confirm-boundary-cleanup

fix gateway: clear slash confirm state during session boundary cleanup
… (NousResearch#22687)

/clear, /new, /reset, and /undo now ask the user to confirm before
discarding conversation state — three-option prompt routed through the
existing tools.slash_confirm primitive.

Native yes/no buttons render on Telegram, Discord, and Slack (their
adapters already implement send_slash_confirm); other platforms get a
text-fallback prompt and reply with /approve, /always, or /cancel.

The classic prompt_toolkit CLI uses the same three-option flow via the
established _prompt_text_input pattern (see _confirm_and_reload_mcp).
TUI keeps its existing modal overlay (NousResearch#12312).

Gated by new config key approvals.destructive_slash_confirm (default
true). Picking 'Always Approve' flips the gate to false so subsequent
destructive commands run silently — matches the established
mcp_reload_confirm UX.

Out of scope: /cron remove (separate domain — scheduled jobs, not
session history). Existing TUI overlay env-var (HERMES_TUI_NO_CONFIRM)
left unchanged; cosmetic unification can come later.

Closes NousResearch#4069.
…ency

Problem:
unlink_tasks() removes a parent→child dependency edge but does not trigger
recompute_ready().  A child whose last blocking parent is unlinked stays
stuck in 'todo' indefinitely — it only promotes to 'ready' on the next
dispatcher tick or a manual 'hermes kanban recompute'.  For CLI-only users
without a dispatcher, the child is permanently stuck.

Root cause:
complete_task() and unblock_task() both call recompute_ready() after their
write transaction so downstream children are evaluated immediately.
unlink_tasks() was missing this call — removing a dependency is
semantically equivalent to completing one, so the same recompute is needed.

Fix:
Capture the rowcount result before the write_txn exits, then call
recompute_ready(conn) outside the transaction when a row was actually
deleted (so the child sees the updated task_links state).
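
A minimal sqlite3 sketch of the fix, assuming a two-table schema invented for illustration (the real kanban schema and transaction helper are not shown in the commit message). It mirrors the described order: capture `rowcount` inside the write transaction, then recompute after the transaction has committed.

```python
import sqlite3


def recompute_ready(conn: sqlite3.Connection) -> None:
    # Promote every 'todo' task whose remaining parents are all 'done'.
    with conn:
        conn.execute(
            """
            UPDATE tasks SET status = 'ready'
            WHERE status = 'todo'
              AND NOT EXISTS (
                SELECT 1 FROM task_links AS l
                JOIN tasks AS p ON p.id = l.parent_id
                WHERE l.child_id = tasks.id AND p.status != 'done'
              )
            """
        )


def unlink_tasks(conn: sqlite3.Connection, parent_id: str, child_id: str) -> bool:
    with conn:  # write transaction; commits on exit
        cur = conn.execute(
            "DELETE FROM task_links WHERE parent_id = ? AND child_id = ?",
            (parent_id, child_id),
        )
        removed = cur.rowcount > 0  # capture before the transaction exits
    if removed:
        recompute_ready(conn)  # runs after commit, so it sees the deletion
    return removed


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE task_links (parent_id TEXT, child_id TEXT);
    INSERT INTO tasks VALUES ('A', 'done'), ('B', 'todo'), ('C', 'running');
    INSERT INTO task_links VALUES ('A', 'B'), ('C', 'B');
    """
)
removed = unlink_tasks(conn, "C", "B")
status_b = conn.execute("SELECT status FROM tasks WHERE id = 'B'").fetchone()[0]
```

This reproduces the scenario from the test description: parent A is done, parent C is running, and child B becomes ready as soon as the C to B link is removed.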

Tests:
Added test_unlink_tasks_triggers_recompute_ready in
tests/hermes_cli/test_kanban_db.py: creates parent A (done) + parent C
(running), child B with both parents (todo), unlinks C→B, asserts B is
ready immediately.  Stash-verified: FAILS without fix (child stays todo),
PASSES with fix.
62/62 tests green in tests/hermes_cli/test_kanban_db.py.

Closes NousResearch#22459.
…use (NousResearch#22681)

Plugin discovery imports every bundled platform plugin at model_tools
import time. The google_chat adapter unconditionally pulled in
google.cloud.pubsub_v1, googleapiclient, grpc, httplib2, and friends at
module top — about 33 MB RSS and 110 ms wall on every CLI invocation,
even ones that never construct a gateway adapter.

Wrap the heavy imports in _load_google_modules(): an idempotent loader
that rebinds the module-level globals (pubsub_v1, service_account,
HttpError, MediaFileUpload, …) on first call and is invoked from
GoogleChatAdapter.__init__, connect(), and check_google_chat_requirements().

The HttpError = Exception placeholder is preserved for the brief window
before the loader runs, so 'except HttpError as exc:' clauses stay
correct (Python looks up the name at try/except evaluation time, not
at function definition time).

Measured impact on a 9950X3D, 7-run medians:
  import cli:              895 → 787 ms  (-108 ms / -12%)
                           133 → 110 MB  ( -23 MB / -17%)
  import model_tools:      491 → 400 ms  ( -91 ms / -19%)
                            95 →  66 MB  ( -29 MB / -31%)
  google_chat alone:       244 → 132 ms  (-112 ms / -46%)
                            83 →  50 MB  ( -33 MB / -40%)
  hermes chat -q (cold):   177 → 145 MB  ( -32 MB / -18%)

Real-world win lands on every path that imports cli.py: hermes chat,
hermes gateway, cron jobs, batch runs, subagents. Long-lived gateway
processes save ~30 MB resident.

All 157 google_chat tests pass; full gateway suite (5050 tests) green.
…ousResearch#22684)

Plugin authors had no easy way to figure out why their plugin wasn't
loading — failures were buried in agent.log at WARNING and skip reasons
(disabled, not enabled, depth cap, exclusive) were DEBUG-only and
invisible by default.

Set HERMES_PLUGINS_DEBUG=1 to attach a stderr handler at DEBUG to the
hermes_cli.plugins logger only. Surfaces:

  - which directories were scanned + manifest counts per source
  - per manifest: resolved key, name, kind, source, on-disk path
  - skip reasons (disabled, not enabled, exclusive, depth cap, no register)
  - per load: tools/hooks/slash/CLI commands the plugin registered
  - full traceback on YAML parse failure (exc_info on the existing warning)
  - full traceback on register() exceptions, pointing at the plugin author's line

Env var off (default) → zero new stderr output, same as before.
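
A rough sketch of the attach logic, assuming a marker attribute on the handler for idempotency (the marker name and the function name are hypothetical; the commit message does not say how the real code detects an existing handler).

```python
import logging
import os
import sys

_HANDLER_FLAG = "_plugins_debug_handler"  # hypothetical marker attribute


def attach_debug_handler(logger_name: str = "hermes_cli.plugins") -> bool:
    """Attach a DEBUG stderr handler once, and only when the env var is set."""
    if os.environ.get("HERMES_PLUGINS_DEBUG") != "1":
        return False  # default: no new stderr output
    logger = logging.getLogger(logger_name)
    if any(getattr(h, _HANDLER_FLAG, False) for h in logger.handlers):
        return True  # already attached; do not add a second handler
    handler = logging.StreamHandler(sys.stderr)
    handler.setLevel(logging.DEBUG)
    setattr(handler, _HANDLER_FLAG, True)
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return True


# Demo on a throwaway logger name so the real logger is untouched.
os.environ.pop("HERMES_PLUGINS_DEBUG", None)
attached_when_off = attach_debug_handler("demo.plugins")
os.environ["HERMES_PLUGINS_DEBUG"] = "1"
attached_first = attach_debug_handler("demo.plugins")
attached_again = attach_debug_handler("demo.plugins")
handler_count = len(logging.getLogger("demo.plugins").handlers)
```

The three outcomes correspond to the no-attach, attach, and idempotent cases that the new tests are described as covering.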

Touches only hermes_cli/plugins.py + a doc section in the plugin-build
guide + an entry in the env-vars reference. 3 new tests lock the
attach/idempotent/no-attach behavior.

Enforce the parent-completion invariant at claim_task (the single
ready->running chokepoint) and re-gate unblock_task so blocked->ready
only fires when parents are done. Prevents child tasks from running
ahead of in-progress parents under the create-then-link race.

Also adds a stress test that races concurrent create+link against
hammered claim_task and asserts no child runs while any parent is undone.

Ref: kanban/boards/cookai/workspaces/t_a6acd07d/root-cause.md
Refs: t_8d6af9d6
…n tool description (NousResearch#22694)

The delegate_task tool description hardcoded 'default 3' / 'default 2' for
max_concurrent_children / max_spawn_depth, which misled the model on any
install that raised these limits — the schema text said 'default 3' even
when the user had set max_concurrent_children=15 / max_spawn_depth=3, so
the model would self-cap at 3 and never use the headroom.

Make the description dynamic. ToolEntry gains an optional
dynamic_schema_overrides callable; registry.get_definitions() merges its
output on top of the static schema before returning it. delegate_tool
registers a builder that reads the current delegation.* config and emits:

- 'up to N items concurrently for this user' (N = max_concurrent_children)
- 'Nested delegation IS enabled / OFF for this user (max_spawn_depth=N)'
- 'orchestrator children can themselves delegate up to M more level(s)'
- 'orchestrator_enabled=false' when the kill switch is set

The model_tools cache key already includes config.yaml mtime+size, so
edits to delegation.* in config invalidate the cached tool definitions
without an explicit hook. CLI_CONFIG staleness within a process is a
pre-existing limitation of _load_config and out of scope here.
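
The merge step might look roughly like this. The `ToolEntry` shape, the `config` dict, and the description wording are illustrative assumptions; only the idea of an optional callable whose output is merged over the static schema comes from the text above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToolEntry:
    name: str
    schema: dict[str, Any]
    # Optional callable returning fields to overlay on the static schema.
    dynamic_schema_overrides: Optional[Callable[[], dict[str, Any]]] = None


def get_definitions(entries: list[ToolEntry]) -> list[dict[str, Any]]:
    definitions = []
    for entry in entries:
        merged = dict(entry.schema)  # shallow copy; static schema stays untouched
        if entry.dynamic_schema_overrides is not None:
            merged.update(entry.dynamic_schema_overrides())
        definitions.append(merged)
    return definitions


config = {"max_concurrent_children": 15}  # hypothetical config snapshot

entry = ToolEntry(
    name="delegate_task",
    schema={"name": "delegate_task", "description": "placeholder"},
    dynamic_schema_overrides=lambda: {
        "description": f"Run up to {config['max_concurrent_children']} items concurrently."
    },
)
definitions = get_definitions([entry])
```

The override is evaluated each time definitions are requested, so the text handed to the model reflects the configured limit rather than a hardcoded default.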

Static description / tasks.description / role.description in
DELEGATE_TASK_SCHEMA are placeholders so module import doesn't trigger
cli.CLI_CONFIG load before the test conftest can redirect HERMES_HOME.
…latform_toolsets

When platform_toolsets[<platform>] contains both a composite (e.g.
hermes-cli) and at least one configurable opt-in (e.g. spotify), the
has_explicit_config branch in _get_platform_tools silently dropped the
composite, leaving sessions with only the configurable + plugin tools
and no native tools (terminal, file, web, browser, memory, etc.).

Mirror the else-branch's subset inference for composites that sit
alongside the configurables, but apply _DEFAULT_OFF_TOOLSETS only to the
implicit expansion so user-listed default-off toolsets (spotify,
discord) survive.

Resolve git via shutil.which with POSIX and Git-for-Windows fallbacks before clone and pull so Dashboard/API installs do not misreport Git as missing.

Add regression tests for the resolver and pull subprocess invocation.

When a Telegram user replies using the native quote feature to select
only part of a prior message, _build_message_event was injecting the
ENTIRE replied-to message into reply_to_text via
message.reply_to_message.text/caption. python-telegram-bot exposes
the user-selected substring as message.quote (TextQuote.text); we now
prefer that and fall back to the full replied-to text only when no
native quote is present.

The agent-visible "[Replying to: \"...\"]" prefix can otherwise expand
the user's narrow quote into the full prior message, causing the agent
to act on unrelated actionable-looking text the user did not select
(e.g. multi-item briefings where the user quotes one bullet but the
prefix injects every bullet). Falls back cleanly when message.quote
is absent (PTB <21 or replies that don't quote a substring).
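
The preference order can be sketched without the Telegram library. `pick_reply_text` is a hypothetical helper, and the message objects below are plain `SimpleNamespace` stand-ins that only mimic the attribute names mentioned above (`quote.text`, `reply_to_message.text`, `reply_to_message.caption`).

```python
from types import SimpleNamespace
from typing import Optional


def pick_reply_text(message) -> Optional[str]:
    # Prefer the user-selected substring when a native quote is present.
    quote = getattr(message, "quote", None)
    if quote is not None and getattr(quote, "text", None):
        return quote.text
    # Otherwise fall back to the full replied-to text or caption.
    replied = getattr(message, "reply_to_message", None)
    if replied is None:
        return None
    return getattr(replied, "text", None) or getattr(replied, "caption", None)


full = SimpleNamespace(text="1. Buy milk\n2. Call Bob", caption=None)
quoted = SimpleNamespace(quote=SimpleNamespace(text="2. Call Bob"), reply_to_message=full)
plain = SimpleNamespace(reply_to_message=full)  # no `quote` attribute at all

quoted_text = pick_reply_text(quoted)
plain_text = pick_reply_text(plain)
```

Using `getattr` with a default keeps the fallback working on objects that lack a `quote` attribute entirely.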

Fixes NousResearch#22619

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s root

Problem:
After `hermes profile use NAME`, the gateway (started via systemd with
HERMES_HOME=/root/.hermes hardcoded) ignores the active profile and
always runs as the Default profile.  WebUI, Telegram, and all non-CLI
platforms are affected.

Root cause:
_apply_profile_override() contained an early-return guard:

    if profile_name is None and os.environ.get("HERMES_HOME"):
        return   # trust the inherited value

The intent was to let child processes inherit their parent's profile via
HERMES_HOME without redundantly re-reading active_profile.  But
systemd also sets HERMES_HOME — to the hermes root (/root/.hermes),
not a profile directory — so the guard fired and silently skipped the
active_profile check.  The user's `hermes profile use NAME` write to
~/.hermes/active_profile was never seen by the gateway process.

Fix:
Only skip the active_profile check when HERMES_HOME is already a
profile directory, identified by its immediate parent directory being
named "profiles" (e.g. ~/.hermes/profiles/coder or
/opt/data/profiles/coder).  When HERMES_HOME points to a root
directory (parent name != "profiles"), continue to read active_profile.
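
The revised guard reduces to a parent-name check. The helper names below are hypothetical, and the handling of an explicit profile name is an assumption inferred from the original guard's `profile_name is None` condition.

```python
from pathlib import PurePosixPath
from typing import Optional


def is_profile_dir(hermes_home: str) -> bool:
    # A profile directory sits directly under a directory named "profiles".
    return PurePosixPath(hermes_home).parent.name == "profiles"


def should_read_active_profile(profile_name: Optional[str], hermes_home: Optional[str]) -> bool:
    if profile_name is not None:
        return False  # assumption: an explicit profile needs no lookup
    if hermes_home and is_profile_dir(hermes_home):
        return False  # inherited profile directory is trusted as-is
    return True  # unset or root-level HERMES_HOME: consult active_profile


root_case = should_read_active_profile(None, "/root/.hermes")
profile_case = should_read_active_profile(None, "/opt/data/profiles/coder")
unset_case = should_read_active_profile(None, None)
```

`PurePosixPath` is used only to keep the example platform-independent; it performs no filesystem access.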

Tests:
- test_hermes_home_at_root_with_active_profile_is_redirected: the
  bug scenario — HERMES_HOME=/root/.hermes + active_profile=coder →
  HERMES_HOME must be redirected to .../profiles/coder.
  Stash-verified: FAILS without fix, PASSES with fix.
- test_hermes_home_already_profile_dir_is_trusted: child-process
  inheritance contract unchanged — .../profiles/coder is trusted as-is.
- test_hermes_home_unset_reads_active_profile: classic path unchanged.
- test_hermes_home_unset_default_profile_no_redirect: "default" still
  produces no redirect.
4/4 tests green.

Closes NousResearch#22502.

Adds 'codex' to the _MCP_PRESETS registry so users can add it without
manually specifying the command and args. Output captured when adding
the preset:

  Connecting to 'codex'...

  ✓ Connected! Found 2 tool(s) from 'codex':

    codex                                    Run a Codex session. Accepts configuration parameters matchi...
    codex-reply                              Continue a Codex conversation by providing the thread id and...

  Enable all 2 tools? [Y/n/select]:
  Cancelled.

Enables: codex mcp-server → Hermes native MCP client → Codex tools
available as first-class Hermes tools.

teknium1 and others added 26 commits May 10, 2026 22:02

Two new tests:

- tests/gateway/test_telegram_format.py
  test_message_too_long_splits_into_continuations_not_silent_truncation:
  asserts edit_message returns success=True with continuation_message_ids
  populated and message_id pointing at the last continuation when
  content exceeds MAX_MESSAGE_LENGTH (NousResearch#19537). Replaces the original
  fail-on-overflow assertion with the split-and-deliver contract.

- tests/gateway/test_stream_consumer.py
  TestEditOverflowSplitAndDeliver.test_consumer_advances_message_id_on_split_and_deliver:
  asserts the consumer side updates _message_id to the latest
  continuation, clears _last_sent_text, and fires on_new_message when
  the adapter reports a split-and-deliver result.

Adds optional-skills/software-development/api-testing/SKILL.md — a single-file
runbook for systematic REST/GraphQL API debugging via Hermes tools (terminal,
execute_code, web_extract, delegate_task).

- 60-char description; gated to platforms: [linux, macos]
- Layered debug flow (connectivity → TLS → auth → format → parse → semantics)
- HTTP status playbook (401/403/404/409/422/429/5xx)
- Pagination, idempotency, contract validation, correlation IDs
- pytest smoke template, token-redaction patterns, leak checklist
- Hermes tool patterns replace generic curl/python examples

Lands in optional-skills/ (not always-active skills/) so it's installed via
hermes skills install official/software-development/api-testing.

scripts/release.py: AUTHOR_MAP entry for erenkar950@gmail.com → eren-karakus0.

Closes NousResearch#1800.

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
…ove write-only _pending_messages

1. Quick command exec ran in the gateway process's full environment
   without env sanitization or output redaction. A quick command like
   "env" or "printenv" would leak all API keys, OAuth tokens, and
   bot credentials to the messaging user.

   Fix: apply _sanitize_subprocess_env() before exec and
   redact_sensitive_text() on output before returning.

2. GatewayRunner._pending_messages was written on every interrupt
   (lines 1331-1334) but never read or consumed anywhere. The actual
   interrupt delivery uses adapter._pending_messages (a separate dict).
   Removed the write-only accumulation to prevent unbounded growth.

agent.redact._REDACT_ENABLED is snapshotted at import time from
HERMES_REDACT_SECRETS env. Under xdist a prior test in the same worker
can flip it, so test_exec_command_output_is_redacted was order-dependent.
Pin it via monkeypatch like test_terminal_output_transform_still_runs_strip_and_redact does.

- Implement tests for normalizing perpetual markets and DEXs.
- Validate JSON output for main commands including markets, candles, and review.
- Ensure environment variable resolution and dotenv file reading are covered.
- Test export functionality for market data with expected output structure.

- description shortened to <=60 chars
- platforms gated to [linux, macos, windows] (stdlib-only, all OK)
- author credits Hugo Sequier
- collapse redundant prerequisites/setup blocks
- terminal-tool-oriented procedure section

More specific name. The skill is REST + GraphQL debugging end-to-end,
not generic 'api testing' (a smoke-test pytest scaffold is one short
section out of ~500 lines). Renames directory + frontmatter name +
self-reference in the delegate_task example body.
…eplies

Re-authored against current main from PR NousResearch#10388 by @wilsen0.  The
original branch is 3800+ commits stale and could not be cherry-picked
without reverting unrelated work; this change carries only the perf
intent forward.

Tuning summary
==============

Text-batch ingress (gateway/platforms/telegram.py):
  - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3
  - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0
  - Adaptive fast-path tiers in _flush_text_batch:
      total <= 320 cp -> min(cap, 0.18)
      total <= 1024 cp -> min(cap, 0.24)
      else            -> cap
    A single short reply now reaches the agent in ~180ms instead of
    600ms.  Tier constants compose with the configured cap via min()
    so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS
    below 0.18 still wins on every tier.
  - _env_float_clamped helper replaces bare float(os.getenv()).
    Rejects NaN / Inf, applies optional min/max bounds.  Used for
    text-batch + media-batch knobs.  Prevents asyncio.sleep(NaN)
    crashes when an operator typos an env var.
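
The two pieces above can be sketched together. The tier thresholds and delays are the ones listed in the commit message; the function names and exact signatures are simplified assumptions, not the helpers' real definitions.

```python
import math
import os
from typing import Optional


def env_float_clamped(name: str, default: float,
                      min_value: Optional[float] = None,
                      max_value: Optional[float] = None) -> float:
    raw = os.getenv(name)
    try:
        value = float(raw) if raw is not None else default
    except ValueError:
        value = default  # garbage input falls back to the default
    if not math.isfinite(value):
        value = default  # reject NaN and +/-Inf
    if min_value is not None:
        value = max(value, min_value)
    if max_value is not None:
        value = min(value, max_value)
    return value


def flush_delay(total_codepoints: int, cap: float) -> float:
    # min() lets an operator-configured cap below the tier value still win.
    if total_codepoints <= 320:
        return min(cap, 0.18)
    if total_codepoints <= 1024:
        return min(cap, 0.24)
    return cap


os.environ["DEMO_BATCH_DELAY"] = "nan"
nan_result = env_float_clamped("DEMO_BATCH_DELAY", 0.3)
os.environ["DEMO_BATCH_DELAY"] = "5"
clamped_result = env_float_clamped("DEMO_BATCH_DELAY", 0.3, max_value=2.0)
delays = [flush_delay(100, 0.3), flush_delay(500, 0.3), flush_delay(5000, 0.3), flush_delay(100, 0.1)]
```

`float("nan")` parses without error, which is why the finite check is needed in addition to the `ValueError` handler.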

Stream cadence (gateway/config.py + stream_consumer.py):
  - StreamingConfig.edit_interval default 1.0s -> 0.8s
  - StreamingConfig.buffer_threshold default 40 -> 24 chars
  - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now
    a single source of truth.  StreamConsumerConfig imports them
    instead of duplicating the literals; the prior dual-source drift
    is fixed.

Tool progress (gateway/display_config.py):
  - Telegram default tool_progress 'all' -> 'new'.  Inside
    Telegram's ~1 edit/s flood envelope the 'all' default would
    accumulate edit pressure on busy chats; 'new' shows only the
    leading bubble per tool batch and feels less spammy.
  - Slack tier_low override (tool_progress='off') is preserved.

Composition with native draft streaming (NousResearch#23512)
================================================

The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH
the draft path (send_draft) and the edit path (edit_message), so the
tighter cadence helps native draft as much as edit-based.  The
text-batch fast-path applies before the consumer starts, so it speeds
up the first-token latency on every transport.  No conflict.

Stale-base avoidance
====================

Re-authored from scratch rather than cherry-picked.  Dropped from the
original branch:
  - Unrelated d2f043f 'fix(anthropic): preserve third-party thinking
    continuity' commit
  - boot_md.py builtin gateway hook (unrelated)
  - Reverted Slack tool_progress='off' (NousResearch#14663) restoration
  - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO
    members deletion
  - 2300+ lines of run.py base-skew noise

Tests
=====

New tests/gateway/test_telegram_text_batch_perf.py:
  - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds).
  - 4 tests for the adaptive-tier composition rules.

Updated tests/gateway/test_display_config.py:
  - test_platform_default_when_no_user_config: 'all' -> 'new' for
    Telegram, with comment.
  - test_high_tier_platforms: split into Telegram-overrides-to-new
    and Discord-stays-all assertions.

Closes NousResearch#10388.

Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
- New tests/gateway/test_telegram_text_batch_perf.py:
  TestEnvFloatClamped — 7 tests covering default-when-unset, valid
  parse, garbage fallback, NaN rejection, Inf rejection, min-clamp,
  max-clamp.  Asserts asyncio.sleep() always gets a finite number.

  TestAdaptiveTextBatchTiers — 4 tests covering the tier-constant
  invariants and the min(cap, tier_delay) composition rule.

- tests/gateway/test_display_config.py: update assertions for
  Telegram's new tool_progress='new' default.

Salvages the three substantive low-severity fixes from Gutslabs' NousResearch#1974
"misc bug fixes" bundle.  The other 8 claims in that PR were either
already fixed on main with superior implementations (state lock,
firecrawl lazy import, fcntl/msvcrt guard, path normalization, schema
migrations) or did not survive review.

- run_agent: `_materialize_data_url_for_vision` uses
  `NamedTemporaryFile(delete=False)`; if `base64.b64decode` raises on a
  corrupt data URL the temp file would persist forever.  Wrap the
  write in try/except and `os.unlink` the temp on failure.

- gateway/session: `append_to_transcript` JSONL write had no error
  handling, so disk-full / read-only-fs / permission errors crashed the
  message handler.  The SQLite write above is the primary store, so
  swallow OSError on the JSONL fallback with a debug log.

- gateway/status: `_read_pid_record` reads `pid_path.read_text()` after
  an `exists()` check; if the PID file is deleted between the two
  calls (concurrent gateway restart) we hit an unhandled OSError.
  Catch it and return None.

Adds a regression test for the tempfile cleanup; the other two paths
are defensive try/excepts on infrequent OSError that don't warrant
dedicated tests.
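
The tempfile cleanup from the first fix can be illustrated in isolation. `materialize_base64` is a hypothetical, reduced version of the real helper; it takes a bare base64 payload rather than a data URL and accepts a directory argument only so the demo can check for leftovers.

```python
import base64
import binascii
import os
import tempfile
from typing import Optional


def materialize_base64(payload: str, directory: Optional[str] = None) -> str:
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".bin", dir=directory)
    try:
        with tmp:
            tmp.write(base64.b64decode(payload, validate=True))
    except Exception:
        # Decoding or writing failed: remove the orphaned file, then re-raise.
        try:
            os.unlink(tmp.name)
        except OSError:
            pass
        raise
    return tmp.name


with tempfile.TemporaryDirectory() as workdir:
    good_path = materialize_base64(base64.b64encode(b"hi").decode("ascii"), workdir)
    with open(good_path, "rb") as fh:
        good_bytes = fh.read()
    os.unlink(good_path)
    try:
        materialize_base64("!!not base64!!", workdir)
        decode_failed = False
    except binascii.Error:
        decode_failed = True
    leftovers = os.listdir(workdir)
```

With `delete=False` the file outlives the handle, so the failure path has to unlink it explicitly; the empty `leftovers` list shows nothing is left behind after a corrupt payload.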

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
…ult fallback (NousResearch#23585)

A YAML parse error in ~/.hermes/config.yaml caused load_config() to print
one line to stdout (Warning: Failed to load config: ...) and silently fall
back to DEFAULT_CONFIG, dropping every user override (auxiliary providers,
fallback chain, model settings). Users only noticed when downstream
behavior misbehaved — see issue NousResearch#23570 where a tab-indent error in the
auxiliary section caused aux fallback to use OpenRouter (depleted) instead
of the configured Codex/MiniMax chain.

Now: log at WARNING (so 'hermes logs' surfaces it), write a prominent line
to stderr, dedup on (path, mtime_ns, size) so concurrent loads don't spam,
and re-warn after the user edits the file. Both call sites (raw read +
merged load) route through the same helper.
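
The dedup rule can be sketched as a small helper keyed on the file's identity. The function name, message wording, and module-level set are assumptions; the `(path, mtime_ns, size)` key is the one described above.

```python
import logging
import os
import sys
import tempfile

logger = logging.getLogger("demo.config")
_warned: set[tuple[str, int, int]] = set()


def warn_config_once(path: str, error: str) -> bool:
    """Warn once per (path, mtime_ns, size); return True when a warning fired."""
    st = os.stat(path)
    key = (path, st.st_mtime_ns, st.st_size)
    if key in _warned:
        return False  # same file state already reported
    _warned.add(key)
    logger.warning("Failed to load config %s: %s", path, error)
    print(f"WARNING: failed to load config {path}: {error}", file=sys.stderr)
    return True


with tempfile.TemporaryDirectory() as tmp:
    cfg = os.path.join(tmp, "config.yaml")
    with open(cfg, "w", encoding="utf-8") as fh:
        fh.write("a:\n\tb: 1\n")
    first = warn_config_once(cfg, "tab indent")
    repeat = warn_config_once(cfg, "tab indent")
    with open(cfg, "a", encoding="utf-8") as fh:
        fh.write("c: 2\n")  # size changes, so the key changes
    after_edit = warn_config_once(cfg, "tab indent")
```

Repeated loads of an unchanged broken file stay quiet, and any edit that changes the size or modification time triggers a fresh warning.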

Refs NousResearch#23570
…-policy

docs(security): rewrite policy around OS-level isolation as the boundary

5 commands: quote, search, history, compare, crypto
Zero dependencies, Python stdlib only.
Supports multi-symbol queries and crypto prices.

When the Discord typing API call fails (rate limit, network error, 403),
_typing_loop returns early but the stale task remains in _typing_tasks.
Subsequent send_typing calls see the stale entry and skip, leaving no
typing indicator for the rest of the agent invocation.

Add finally block to _typing_loop to always remove the task from
_typing_tasks on exit, whether from cancellation, error, or normal
completion. This allows send_typing to create a fresh task.
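
A self-contained sketch of the cleanup, assuming a simplified loop and a `trigger` coroutine standing in for the Discord typing API call (neither matches the adapter's real signatures). The identity check in the `finally` block is an extra precaution added here so a finishing task never removes a newer task's entry.

```python
import asyncio

_typing_tasks: dict[str, asyncio.Task] = {}


async def _typing_loop(channel_id: str, trigger) -> None:
    try:
        while True:
            await trigger()  # may raise on rate limit, network error, 403
            await asyncio.sleep(0.01)
    except Exception:
        return  # give up quietly on API failure
    finally:
        # Always drop our own entry, whether we exit by error,
        # cancellation, or normal completion.
        if _typing_tasks.get(channel_id) is asyncio.current_task():
            del _typing_tasks[channel_id]


def send_typing(channel_id: str, trigger) -> bool:
    if channel_id in _typing_tasks:
        return False  # a loop is already active for this channel
    _typing_tasks[channel_id] = asyncio.create_task(_typing_loop(channel_id, trigger))
    return True


async def demo():
    async def failing():
        raise RuntimeError("403")

    first = send_typing("c1", failing)
    await asyncio.sleep(0.05)  # let the first loop fail and clean up
    second = send_typing("c1", failing)
    await asyncio.sleep(0.05)
    return first, second, dict(_typing_tasks)


first_started, second_started, remaining = asyncio.run(demo())
```

Without the `finally` block the second `send_typing` call would see the stale entry and return `False`, which is the behavior the commit fixes.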

3 new tests in test_discord_send.py:
- Task removed after API error
- Typing restartable after failure
- stop_typing cleans up
…er-call retry storms (NousResearch#23597)

When an auxiliary provider returns HTTP 402 (credit / payment), every
subsequent compression / title-gen / session-search / vision call still
retried it as the FIRST entry in the chain, burning ~1 RTT to hit 402
again before falling back. On a long Discord/LCM session that meant dozens
of doomed 402s per minute (issue NousResearch#23570).

Add a per-process unhealthy-provider cache with a 10 min TTL. When any
caller observes a payment error against a provider, the label is marked
unhealthy and skipped by:
  * _resolve_auto Step-1 (main provider use-as-aux path)
  * _resolve_auto Step-2 (aggregator/fallback chain)
  * _try_payment_fallback (used by call_llm/acall_llm on first 402)

Skip-logs are throttled to once per minute per label so a bursty session
doesn't spam agent.log. Entries auto-expire so a topped-up account
recovers without manual intervention. The cache is in-process only by
design — multi-profile users with different keys per profile must each
hit the 402 once.
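
One way such a cache could look, as a minimal sketch. The class `UnhealthyProviderCache`, its method names, and the helper `healthy_chain` are hypothetical and do not reflect the real internals of `_resolve_auto` or `_try_payment_fallback`. Only the 10 minute TTL and the once-per-minute log throttle are taken from the description above.

```python
import time

UNHEALTHY_TTL_SECONDS = 600.0     # 10 min TTL, per the commit message
SKIP_LOG_INTERVAL_SECONDS = 60.0  # throttle skip-logs to once per minute per label


class UnhealthyProviderCache:
    """In-process record of providers that recently returned a payment error."""

    def __init__(self, ttl: float = UNHEALTHY_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._expires_at: dict[str, float] = {}
        self._last_logged: dict[str, float] = {}

    def mark_unhealthy(self, label: str) -> None:
        self._expires_at[label] = self._clock() + self._ttl

    def is_unhealthy(self, label: str) -> bool:
        expires = self._expires_at.get(label)
        if expires is None:
            return False
        if self._clock() >= expires:
            # Entry expired: a topped-up account becomes eligible again.
            del self._expires_at[label]
            self._last_logged.pop(label, None)
            return False
        return True

    def should_log_skip(self, label: str) -> bool:
        """Return True at most once per interval for a given label."""
        now = self._clock()
        last = self._last_logged.get(label)
        if last is not None and now - last < SKIP_LOG_INTERVAL_SECONDS:
            return False
        self._last_logged[label] = now
        return True


def healthy_chain(chain: list[str], cache: UnhealthyProviderCache) -> list[str]:
    """Filter a provider chain down to labels not currently marked unhealthy."""
    return [label for label in chain if not cache.is_unhealthy(label)]
```

Using a monotonic clock keeps the TTL immune to wall-clock adjustments, and the cache needs no background sweeper because expiry is checked lazily on lookup.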

Refs NousResearch#23570
Two independent opt-in QoL toggles, both off by default.

terminal.docker_extra_args:
- List of extra flags appended verbatim to docker run after security
  defaults. Useful for adding capabilities (e.g. --cap-add SETUID) or
  other docker run options not exposed by existing config keys.
- Non-string entries are logged and skipped.
- Also available via TERMINAL_DOCKER_EXTRA_ARGS='[...]' env var.
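
The filtering and ordering described in this list could be sketched as below. The function names are hypothetical, and the assumption that the env var holds a JSON list is inferred only from the `'[...]'` shape shown above. Placing the image after the extra flags follows from the docker CLI, which expects options before the image name.

```python
import json
import logging

logger = logging.getLogger("docker_extra_args_sketch")  # hypothetical logger name


def parse_extra_args_env(raw: str) -> list:
    """Decode the env var value, assumed here to be a JSON list."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        logger.warning("Ignoring extra args, not valid JSON: %r", raw)
        return []
    if not isinstance(value, list):
        logger.warning("Ignoring extra args, expected a list: %r", value)
        return []
    return value


def clean_extra_args(values: list) -> list[str]:
    """Keep string entries verbatim; log and skip everything else."""
    cleaned = []
    for value in values:
        if isinstance(value, str):
            cleaned.append(value)
        else:
            logger.warning("Skipping non-string docker_extra_args entry: %r", value)
    return cleaned


def build_docker_run(security_defaults: list[str], extra_args: list, image: str) -> list[str]:
    """Assemble argv with the extra flags placed after the security defaults."""
    return ["docker", "run", *security_defaults, *clean_extra_args(extra_args), image]
```

Passing the result as an argv list (rather than a joined shell string) keeps each flag a separate argument, so a flag and its value such as `--cap-add` and `SETUID` would be given as two list entries.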

display.timestamps:
- Appends [HH:MM] to user input bullet and the assistant response box
  header. Single hub in _format_submitted_user_message_preview()
  covers both single-line and multi-line user previews; assistant
  response label gets the timestamp at box-open time.

Closes NousResearch#1569 (timestamps).

Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>
…#21818)

Declares hindsight-client as an optional dependency group [hindsight]
in pyproject.toml. This allows build-time inclusion for environments
where runtime pip install is not possible (NixOS sealed venvs, Docker,
Kubernetes).

Not included in [all] — memory providers are plugins and should be
opted into explicitly.

Install via:
  uv sync --extra hindsight
  pip install hermes-agent[hindsight]

NixOS (with extraDependencyGroups):
  services.hermes-agent.extraDependencyGroups = [ "hindsight" ];

Closes NousResearch#8873
…arch#21817)

Expose the dependency-groups parameter from python.nix through
hermes-agent.nix and the NixOS module, allowing users to opt into
pyproject.toml optional extras (e.g. hindsight, voice, matrix) that
are resolved by uv inside the sealed venv.

Unlike extraPythonPackages (which appends to PYTHONPATH and requires
collision checking), extraDependencyGroups resolves the full dependency
graph in a single uv pass — no PYTHONPATH patching, no version
conflicts, no collision risk.

When to use which:
- extraDependencyGroups: enable a pyproject.toml optional extra
- extraPythonPackages: add an external Python plugin not in pyproject.toml

Usage:
  services.hermes-agent.extraDependencyGroups = [ "hindsight" ];

Or via overlay:
  pkgs.hermes-agent.override { extraDependencyGroups = [ "hindsight" ]; }

Refs: NousResearch#8873, NousResearch#9194
…NousResearch#23633)

The container entrypoint ran `chown -R` on $HERMES_HOME every start.
`chown` strips the setgid bit (kernel security behavior), destroying
the 2770 permissions the NixOS activation script sets for group access
by hostUsers. This caused PermissionError for interactive CLI users
even though they were in the hermes group.

Replace with `find ... ! -user $UID -exec chown` which only touches
files with wrong ownership, leaving correctly-owned directories and
their permission bits intact.
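
The entrypoint fix itself is a shell `find` invocation. The same selection logic can be illustrated in Python, as a sketch only: both function names below are hypothetical, and the code is not part of the entrypoint. It changes ownership only for paths whose current owner differs from the target uid and never touches paths that are already correct.

```python
import os
from typing import Iterator


def paths_with_wrong_owner(root: str, uid: int) -> Iterator[str]:
    """Yield root and every path beneath it not owned by uid (like `find root ! -user uid`)."""
    for dirpath, dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        # Symlinks to directories are listed in dirnames but never walked into.
        candidates += [
            os.path.join(dirpath, name)
            for name in dirnames
            if os.path.islink(os.path.join(dirpath, name))
        ]
        for path in candidates:
            if os.lstat(path).st_uid != uid:
                yield path


def fix_ownership(root: str, uid: int, gid: int) -> int:
    """chown only the mis-owned paths; return how many were changed."""
    changed = 0
    for path in paths_with_wrong_owner(root, uid):
        os.chown(path, uid, gid, follow_symlinks=False)
        changed += 1
    return changed
```

On a tree that is already correctly owned, `fix_ownership` makes no `chown` calls at all, which is the property the commit relies on to leave existing permission bits alone.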

Affects: container.enable + container.hostUsers + addToSystemPackages

Related: NousResearch#19795, NousResearch#19788, NousResearch#9383
@github-actions

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@github-actions

🔎 Lint report: sync/upstream-20260511 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8147 on HEAD, 7976 on base (🆕 +171)

🆕 New issues (120):

| Rule | Count |
| --- | --- |
| invalid-argument-type | 68 |
| unresolved-attribute | 17 |
| unresolved-import | 17 |
| invalid-assignment | 8 |
| invalid-method-override | 3 |
| unsupported-operator | 2 |
| no-matching-overload | 1 |
| deprecated | 1 |
| unresolved-reference | 1 |
| invalid-return-type | 1 |
| not-subscriptable | 1 |

First entries
run_agent.py:9525: [invalid-argument-type] invalid-argument-type: Argument to function `github_model_reasoning_efforts` is incorrect: Expected `str | None`, found `str | Unknown | dict[Unknown, Unknown] | int | dict[Unknown | str, Unknown | str | dict[str, str]]`
tests/hermes_cli/test_goals.py:905: [unresolved-attribute] unresolved-attribute: Attribute `decomposed` is not defined on `None` in union `GoalState | None`
tests/hermes_cli/test_kanban_db.py:1435: [invalid-argument-type] invalid-argument-type: Argument is incorrect: Expected `int`, found `str | None | int`
plugins/platforms/line/adapter.py:1195: [unresolved-attribute] unresolved-attribute: Attribute `reply` is not defined on `None` in union `_LineClient | None`
run_agent.py:12798: [invalid-argument-type] invalid-argument-type: Argument to function `save_context_length` is incorrect: Expected `str`, found `str | Unknown | dict[Unknown, Unknown] | int | dict[Unknown | str, Unknown | str | dict[str, str]]`
run_agent.py:12827: [invalid-argument-type] invalid-argument-type: Argument to function `estimate_usage_cost` is incorrect: Expected `str | None`, found `str | Unknown | dict[Unknown, Unknown] | int | dict[Unknown | str, Unknown | str | dict[str, str]]`
optional-skills/blockchain/hyperliquid/scripts/hyperliquid_client.py:288: [no-matching-overload] no-matching-overload: No overload of bound method `dict.get` matches arguments
run_agent.py:13546: [invalid-argument-type] invalid-argument-type: Argument to function `_pool_may_recover_from_rate_limit` is incorrect: Expected `str | None`, found `str | Unknown | dict[Unknown, Unknown] | int | dict[Unknown | str, Unknown | str | dict[str, str]]`
cli.py:8483: [invalid-argument-type] invalid-argument-type: Argument to function `estimate_usage_cost` is incorrect: Expected `str`, found `str | Unknown | dict[Unknown, Unknown] | int | dict[Unknown | str, Unknown | str | dict[str, str]]`
run_agent.py:9498: [invalid-argument-type] invalid-argument-type: Argument to function `lmstudio_model_reasoning_options` is incorrect: Expected `str`, found `str | Unknown | dict[Unknown, Unknown] | int | dict[Unknown | str, Unknown | str | dict[str, str]]`
optional-skills/blockchain/hyperliquid/scripts/hyperliquid_client.py:290: [invalid-argument-type] invalid-argument-type: Argument to bound method `dict.get` is incorrect: Expected `Never`, found `Literal["fullName"]`
agent/plugin_llm.py:999: [invalid-argument-type] invalid-argument-type: Argument to function `async_call_llm` is incorrect: Expected `int`, found `int | None`
run_agent.py:5777: [unresolved-attribute] unresolved-attribute: Attribute `split` is not defined on `dict[Unknown, Unknown]`, `int`, `dict[Unknown | str, Unknown | str | dict[str, str]]` in union `str | Unknown | dict[Unknown, Unknown] | int | dict[Unknown | str, Unknown | str | dict[str, str]]`
run_agent.py:8396: [unresolved-attribute] unresolved-attribute: Attribute `strip` is not defined on `dict[Unknown, Unknown] & ~AlwaysFalsy`, `int & ~AlwaysFalsy`, `dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy` in union `Divergent | (Unknown & ~AlwaysFalsy) | (str & ~AlwaysFalsy) | ... omitted 4 union elements`
tests/hermes_cli/test_kanban_db.py:1435: [invalid-argument-type] invalid-argument-type: Argument is incorrect: Expected `int | None`, found `str | None | int`
agent/plugin_llm.py:996: [invalid-argument-type] invalid-argument-type: Argument to function `async_call_llm` is incorrect: Expected `str`, found `str | None`
tests/gateway/test_goal_verdict_send.py:143: [invalid-argument-type] invalid-argument-type: Argument to function `save_goal` is incorrect: Expected `GoalState`, found `GoalState | None`
optional-skills/blockchain/hyperliquid/scripts/hyperliquid_client.py:289: [invalid-argument-type] invalid-argument-type: Argument to bound method `dict.get` is incorrect: Expected `Never`, found `Literal["name"]`
optional-skills/blockchain/hyperliquid/scripts/hyperliquid_client.py:174: [deprecated] deprecated: The function `utcfromtimestamp` is deprecated: Use timezone-aware objects to represent datetimes in UTC; e.g. by calling .fromtimestamp(datetime.timezone.utc)
tests/tools/test_browser_eval_supervisor_path.py:13: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/test_live_system_guard_self_test.py:24: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/hermes_cli/test_goals.py:1114: [unresolved-attribute] unresolved-attribute: Attribute `checklist` is not defined on `None` in union `GoalState | None`
hermes_cli/kanban_db.py:4330: [invalid-argument-type] invalid-argument-type: Argument to function `_safe_int` is incorrect: Expected `str | None`, found `int`
plugins/platforms/line/adapter.py:972: [unresolved-attribute] unresolved-attribute: Class `MessageType` has no attribute `IMAGE`
tests/conftest.py:869: [unresolved-attribute] unresolved-attribute: Unresolved attribute `__class_getitem__` on type `def _guarded(cmd, *args, **kwargs) -> Unknown`
... and 95 more

✅ Fixed issues (50):

| Rule | Count |
| --- | --- |
| invalid-argument-type | 33 |
| unresolved-attribute | 6 |
| unresolved-import | 6 |
| unsupported-operator | 4 |
| not-subscriptable | 1 |

First entries
tests/agent/test_codex_cloudflare_headers.py:163: [unresolved-attribute] unresolved-attribute: Attribute `startswith` is not defined on `dict[str, str]` in union `Unknown | str | dict[str, str]`
run_agent.py:2593: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `(Unknown & ~AlwaysFalsy) | (str & ~AlwaysFalsy) | (dict[str, str] & ~AlwaysFalsy) | ... omitted 4 union elements`
run_agent.py:8746: [invalid-argument-type] invalid-argument-type: Argument to function `get_provider_request_timeout` is incorrect: Expected `str`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
skills/mlops/training/trl-fine-tuning/templates/basic_grpo_training.py:16: [unresolved-import] unresolved-import: Cannot resolve imported module `transformers`
run_agent.py:4765: [invalid-argument-type] invalid-argument-type: Argument to function `save_trajectory` is incorrect: Expected `str`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
run_agent.py:4176: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `str`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
run_agent.py:2590: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `(Unknown & ~AlwaysFalsy) | (str & ~AlwaysFalsy) | (dict[str, str] & ~AlwaysFalsy) | ... omitted 4 union elements`
run_agent.py:7160: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:13287: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown, Unknown] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
run_agent.py:9324: [invalid-argument-type] invalid-argument-type: Argument to function `_get_anthropic_max_output` is incorrect: Expected `str`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
run_agent.py:5777: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["/"]` and `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
tests/agent/test_codex_cloudflare_headers.py:163: [unresolved-attribute] unresolved-attribute: Attribute `get` is not defined on `str & ~AlwaysFalsy`, `int & ~AlwaysFalsy` in union `(Unknown & ~AlwaysFalsy) | (str & ~AlwaysFalsy) | (dict[str, str] & ~AlwaysFalsy) | ... omitted 3 union elements`
tests/agent/test_codex_cloudflare_headers.py:181: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["originator"]` and `(Unknown & ~AlwaysFalsy) | (str & ~AlwaysFalsy) | (dict[str, str] & ~AlwaysFalsy) | ... omitted 3 union elements`
run_agent.py:2339: [invalid-argument-type] invalid-argument-type: Argument to function `query_ollama_num_ctx` is incorrect: Expected `str`, found `(str & ~AlwaysFalsy) | (dict[str, str] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 4 union elements`
skills/mlops/training/trl-fine-tuning/templates/basic_grpo_training.py:15: [unresolved-import] unresolved-import: Cannot resolve imported module `datasets`
tests/run_agent/test_provider_attribution_headers.py:156: [unsupported-operator] unsupported-operator: Operator `not in` is not supported between objects of type `Literal["X-OpenRouter-Cache-TTL"]` and `Unknown | str | dict[str, str] | ... omitted 3 union elements`
run_agent.py:5313: [invalid-argument-type] invalid-argument-type: Argument to function `parse_rate_limit_headers` is incorrect: Expected `str`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
tests/run_agent/test_provider_attribution_headers.py:154: [not-subscriptable] not-subscriptable: Cannot subscript object of type `int` with no `__getitem__` method
run_agent.py:8746: [invalid-argument-type] invalid-argument-type: Argument to function `get_provider_request_timeout` is incorrect: Expected `str | None`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
skills/mlops/training/trl-fine-tuning/templates/basic_grpo_training.py:18: [unresolved-import] unresolved-import: Cannot resolve imported module `trl`
run_agent.py:12799: [invalid-argument-type] invalid-argument-type: Argument to function `estimate_usage_cost` is incorrect: Expected `str | None`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
tests/run_agent/test_provider_attribution_headers.py:155: [unsupported-operator] unsupported-operator: Operator `not in` is not supported between objects of type `Literal["X-OpenRouter-Cache"]` and `Unknown | str | dict[str, str] | ... omitted 3 union elements`
run_agent.py:3347: [invalid-argument-type] invalid-argument-type: Argument to function `get_provider_stale_timeout` is incorrect: Expected `str | None`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
run_agent.py:11248: [invalid-argument-type] invalid-argument-type: Argument to function `_fixed_temperature_for_model` is incorrect: Expected `str | None`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
run_agent.py:8662: [invalid-argument-type] invalid-argument-type: Argument to bound method `ContextCompressor.update_model` is incorrect: Expected `int`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
... and 25 more

Unchanged: 4165 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@bot-ted merged commit 0651813 into main on May 11, 2026
12 of 17 checks passed
@bot-ted deleted the sync/upstream-20260511 branch on May 11, 2026 at 13:17