Skip to content

fix: remove litellm dependency (PyPI supply chain compromise)#2796

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-764f7842
Mar 24, 2026
Merged

fix: remove litellm dependency (PyPI supply chain compromise)#2796
teknium1 merged 1 commit into
mainfrom
hermes/hermes-764f7842

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

litellm 1.82.7/1.82.8 contained a credential stealer (.pth auto-exec payload that exfiltrates API keys, SSH keys, cloud credentials, etc. on any Python startup). PyPI has quarantined the entire litellm package, which blocks all fresh hermes-agent installs.

Ref: BerriAI/litellm#24512

Changes

  • Remove litellm>=1.75.5, typer, platformdirs from hermes-agent's pyproject.toml — these are mini-swe-agent deps that are redundantly duplicated here. mini-swe-agent has its own pyproject.toml and manages its own dependencies.
  • Fix install.shlog_success "mini-swe-agent installed" was printing unconditionally even on install failure. Now uses proper if/else.
  • Update warning messages in both install.sh and setup-hermes.sh — clarifies that only Docker/Modal terminal backends are affected when mini-swe-agent fails to install; local terminal is unaffected.

Impact

  • Unblocks fresh installs immediately
  • No hermes-agent functionality is lost (litellm is not imported anywhere in the main codebase)
  • mini-swe-agent sub-install will soft-fail with a warning until litellm quarantine lifts
  • Local terminal backend (used by ~all users) works without mini-swe-agent

Test plan

  • 6086 tests pass, 0 failures

… chain compromise)

litellm 1.82.7/1.82.8 contained a credential stealer (.pth auto-exec
payload). PyPI quarantined the entire package, blocking all fresh
hermes-agent installs since litellm was listed as a hard dependency.

These three deps (litellm, typer, platformdirs) are only used by the
mini-swe-agent submodule, which has its own pyproject.toml and manages
its own dependencies. They were redundantly duplicated in hermes-agent's
pyproject.toml.

Also fixes install.sh to not print 'mini-swe-agent installed' on
failure, and updates warning messages in both install scripts to clarify
that only Docker/Modal backends are affected — local terminal is
unaffected.

Ref: BerriAI/litellm#24512
@teknium1 teknium1 merged commit 18cbd18 into main Mar 24, 2026
1 check passed
lws803 added a commit to lattice-pns/lattice-agent that referenced this pull request Mar 24, 2026
…ts) (#16)

* fix(security): add SSRF protection to vision_tools and web_tools (hardened)

* fix(security): add SSRF protection to vision_tools and web_tools

Both vision_analyze and web_extract/web_crawl accept arbitrary URLs
without checking if they target private/internal network addresses.
A prompt-injected or malicious skill could use this to access cloud
metadata endpoints (169.254.169.254), localhost services, or private
network hosts.

Adds a shared url_safety.is_safe_url() that resolves hostnames and
blocks private, loopback, link-local, and reserved IP ranges. Also
blocks known internal hostnames (metadata.google.internal).

Integrated at the URL validation layer in vision_tools and before
each website_policy check in web_tools (extract, crawl).

* test(vision): update localhost test to reflect SSRF protection

The existing test_valid_url_with_port asserted localhost URLs pass
validation. With SSRF protection, localhost is now correctly blocked.
Update the test to verify the block, and add a separate test for
valid URLs with ports using a public hostname.

* fix(security): harden SSRF protection — fail-closed, CGNAT, multicast, redirect guard

Follow-up hardening on top of dieutx's SSRF protection (PR NousResearch#2630):

- Change fail-open to fail-closed: DNS errors and unexpected exceptions
  now block the request instead of allowing it (OWASP best practice)
- Block CGNAT range (100.64.0.0/10): Python's ipaddress.is_private
  does NOT cover this range (returns False for both is_private and
  is_global). Used by Tailscale/WireGuard and carrier infrastructure.
- Add is_multicast and is_unspecified checks: multicast (224.0.0.0/4)
  and unspecified (0.0.0.0) addresses were not caught by the original
  four-check chain
- Add redirect guard for vision_tools: httpx event hook re-validates
  each redirect target against SSRF checks, preventing the classic
  redirect-based SSRF bypass (302 to internal IP)
- Move SSRF filtering before backend dispatch in web_extract: now
  covers Parallel and Tavily backends, not just Firecrawl
- Extract _is_blocked_ip() helper for cleaner IP range checking
- Add 24 new tests (CGNAT, multicast, IPv4-mapped IPv6, fail-closed
  behavior, parametrized blocked/allowed IP lists)
- Fix existing tests to mock DNS resolution for test hostnames

---------

Co-authored-by: dieutx <dangtc94@gmail.com>

* fix(vision): make SSRF redirect guard async for httpx.AsyncClient

httpx.AsyncClient awaits event hooks. The sync _ssrf_redirect_guard
returned None, causing 'object NoneType can't be used in await
expression' on any vision_analyze call that followed redirects.

Caught during live PTY testing of the merged SSRF protection.

* fix(config): log warning instead of silently swallowing config.yaml errors (NousResearch#2683)

A bare `except Exception: pass` meant any YAML syntax error, bad value,
or unexpected structure in config.yaml was silently ignored and the
gateway fell back to .env / gateway.json without any indication.
Users had no way to know why their config changes had no effect.

Co-authored-by: sprmn24 <oncuevtv@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): prevent shell injection in _expand_path via ~user path suffix (NousResearch#2047)

echo was called with the full unquoted path (~username/suffix), allowing
command substitution in the suffix (e.g. ~user/$(malicious)) to execute
arbitrary shell commands. The fix expands only the validated ~username
portion via the shell and concatenates the suffix as a plain string.

Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>

* feat(config): support ${ENV_VAR} substitution in config.yaml (NousResearch#2684)

* feat(config): support ${ENV_VAR} substitution in config.yaml

* fix: extend env var expansion to CLI and gateway config loaders

The original PR (NousResearch#2680) only wired _expand_env_vars into load_config(),
which is used by 'hermes tools' and 'hermes setup'. The two primary
config paths were missed:

- load_cli_config() in cli.py (interactive CLI)
- Module-level _cfg in gateway/run.py (gateway — bridges api_keys to env vars)

Also:
- Remove redundant 'import re' (already imported at module level)
- Add missing blank lines between top-level functions (PEP 8)
- Add tests for load_cli_config() expansion

---------

Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>

* fix(gateway): prevent stale memory overwrites by flush agent (NousResearch#2670)

The gateway memory flush agent reviews old conversation history on session
reset/expiry and writes to memory. It had no awareness of memory changes
made after that conversation ended (by the live agent, cron jobs, or other
sessions), causing silent overwrites of newer entries.

Two fixes:

1. Skip memory flush entirely for cron sessions (session IDs starting with
   'cron_'). Cron sessions are headless with no meaningful user conversation
   to extract memories from.

2. Inject the current live memory state (MEMORY.md + USER.md) directly into
   the flush prompt. The flush agent can now see what's already saved and
   make informed decisions — only adding genuinely new information rather
   than blindly overwriting entries that may have been updated since the
   conversation ended.

Addresses the root cause identified in NousResearch#2670: the flush agent was making
memory decisions blind to the current state of memory, causing stale
context to overwrite newer entries on gateway restarts and session resets.

Co-authored-by: devorun <devorun@users.noreply.github.com>
Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>

* chore: release v0.4.0 (v2026.3.23)

* docs: revise v0.4.0 changelog — fix feature attribution, reorder sections

* fix: add macOS Homebrew paths to browser and terminal PATH resolution

On macOS with Homebrew (Apple Silicon), Node.js and agent-browser
binaries live under /opt/homebrew/bin/ which is not included in the
_SANE_PATH fallback used by browser_tool.py and environments/local.py.
When Hermes runs with a filtered PATH (e.g. as a systemd service),
these binaries are invisible, causing 'env: node: No such file or
directory' errors when using browser tools.

Changes:
- Add /opt/homebrew/bin and /opt/homebrew/sbin to _SANE_PATH in both
  browser_tool.py and environments/local.py
- Add _discover_homebrew_node_dirs() to find versioned Node installs
  (e.g. brew install node@24) that aren't linked into /opt/homebrew/bin
- Extend _find_agent_browser() to search Homebrew and Hermes-managed
  dirs when agent-browser isn't on the current PATH
- Include discovered Homebrew node dirs in subprocess PATH when
  launching agent-browser
- Add 11 new tests covering all Homebrew path discovery logic

* feat(cli, agent): add tool generation callback for streaming updates

- Introduced `_on_tool_gen_start` in `HermesCLI` to indicate when tool-call arguments are being generated, enhancing user feedback during streaming.
- Updated `AIAgent` to support a new `tool_gen_callback`, notifying the display layer when tool generation starts, allowing for better user experience during large payloads.
- Ensured that the callback is triggered appropriately during streaming events to prevent user interface freezing.

* fix(cli): ensure single closure of streaming boxes during tool generation

- Updated `_on_tool_gen_start` method in `HermesCLI` to close open streaming boxes exactly once, preventing potential multiple closures.
- Added a check for `_stream_box_opened` to manage the state of the streaming box more effectively, enhancing user experience during large payload streaming.

* fix(auth): preserve 'custom' provider instead of silently remapping to 'openrouter'

resolve_provider('custom') was silently returning 'openrouter', causing
users who set provider: custom in config.yaml to unknowingly route
through OpenRouter instead of their local/custom endpoint. The display
showed 'via openrouter' even when the user explicitly chose custom.

Changes:
- auth.py: Split the conditional so 'custom' returns 'custom' as-is
- runtime_provider.py: _resolve_named_custom_runtime now returns
  provider='custom' instead of 'openrouter'
- runtime_provider.py: _resolve_openrouter_runtime returns
  provider='custom' when that was explicitly requested
- Add 'no-key-required' placeholder for keyless local servers
- Update existing test + add 5 new tests covering the fix

Fixes NousResearch#2562

* feat(model): /model command overhaul — Phases 2, 3, 5

* feat(model): persist base_url on /model switch, auto-detect for bare /model custom

Phase 2+3 of the /model command overhaul:

Phase 2 — Persist base_url on model switch:
- CLI: save model.base_url when switching to a non-OpenRouter endpoint;
  clear it when switching away from custom to prevent stale URLs
  leaking into the new provider's resolution
- Gateway: same logic using direct YAML write

Phase 3 — Better feedback and edge cases:
- Bare '/model custom' now auto-detects the model from the endpoint
  using _auto_detect_local_model() and saves all three config values
  (model, provider, base_url) atomically
- Shows endpoint URL in success messages when switching to/from
  custom providers (both CLI and gateway)
- Clear error messages when no custom endpoint is configured
- Updated test assertions for the additional save_config_value call

Fixes NousResearch#2562 (Phase 2+3)

* feat(model): support custom:name:model triple syntax for named custom providers

Phase 5 of the /model command overhaul.

Extends parse_model_input() to handle the triple syntax:
  /model custom:local-server:qwen → provider='custom:local-server', model='qwen'
  /model custom:my-model          → provider='custom', model='my-model' (unchanged)

The 'custom:local-server' provider string is already supported by
_get_named_custom_provider() in runtime_provider.py, which matches
it against the custom_providers list in config.yaml. This just wires
the parsing so users can do it from the /model slash command.

Added 4 tests covering single, triple, whitespace, and empty model cases.

* fix: remove litellm/typer/platformdirs from hermes-agent deps (supply chain compromise) (NousResearch#2796)

litellm 1.82.7/1.82.8 contained a credential stealer (.pth auto-exec
payload). PyPI quarantined the entire package, blocking all fresh
hermes-agent installs since litellm was listed as a hard dependency.

These three deps (litellm, typer, platformdirs) are only used by the
mini-swe-agent submodule, which has its own pyproject.toml and manages
its own dependencies. They were redundantly duplicated in hermes-agent's
pyproject.toml.

Also fixes install.sh to not print 'mini-swe-agent installed' on
failure, and updates warning messages in both install scripts to clarify
that only Docker/Modal backends are affected — local terminal is
unaffected.

Ref: BerriAI/litellm#24512

* fix(gateway): detect virtualenv path instead of hardcoding venv/ (NousResearch#2797)

Fixes NousResearch#2492.

`generate_systemd_unit()` and `get_python_path()` hardcoded `venv`
as the virtualenv directory name. When the virtualenv is `.venv`
(which `setup-hermes.sh` and `.gitignore` both reference), the
generated systemd unit had incorrect VIRTUAL_ENV and PATH variables.

Introduce `_detect_venv_dir()` which:
1. Checks `sys.prefix` vs `sys.base_prefix` to detect the active venv
2. Falls back to probing `.venv` then `venv` under PROJECT_ROOT

Both `get_python_path()` and `generate_systemd_unit()` now use
this detection instead of hardcoded paths.

Co-authored-by: Hermes <hermes@nousresearch.ai>

* refactor(model): extract shared switch_model() from CLI and gateway handlers

Phase 4 of the /model command overhaul.

Both the CLI (cli.py) and gateway (gateway/run.py) /model handlers
had ~50 lines of duplicated core logic: parsing, provider detection,
credential resolution, and model validation. This extracts that
pipeline into hermes_cli/model_switch.py.

New module exports:
- ModelSwitchResult: dataclass with all fields both handlers need
- CustomAutoResult: dataclass for bare '/model custom' results
- switch_model(): core pipeline — parse → detect → resolve → validate
- switch_to_custom_provider(): resolve endpoint + auto-detect model

The shared functions are pure (no I/O side effects). Each caller
handles its own platform-specific concerns:
- CLI: sets self.model/provider/etc, calls save_config_value(), prints
- Gateway: writes config.yaml directly, sets env vars, returns markdown

Net result: -244 lines from handlers, +234 lines in shared module.
The handlers are now ~80 lines each (down from ~150+) and can't drift
apart on core logic.

* fix(agent): ensure first delta is fired during reasoning updates

- Added calls to `_fire_first_delta()` in the `AIAgent` class to ensure that the first delta is triggered for both reasoning and thinking updates. This change improves the handling of delta events during streaming, enhancing the responsiveness of the agent's reasoning capabilities.

* docs: update all docs for /model command overhaul and custom provider support

Documents the full /model command overhaul across 6 files:

AGENTS.md:
- Add model_switch.py to project structure tree

configuration.md:
- Rewrite General Setup with 3 config methods (interactive, config.yaml, env vars)
- Add new 'Switching Models with /model' section documenting all syntax variants
- Add 'Named Custom Providers' section with config.yaml examples and
  custom:name:model triple syntax

slash-commands.md:
- Update /model descriptions in both CLI and messaging tables with
  full syntax examples (provider:model, custom:model, custom:name:model,
  bare custom auto-detect)

cli-commands.md:
- Add /model slash command subsection under hermes model with syntax table
- Add custom endpoint config to hermes model use cases

faq.md:
- Add config.yaml example for offline/local model setup
- Note that provider: custom is a first-class provider
- Document /model custom auto-detect

provider-runtime.md:
- Add model_switch.py to implementation file list
- Update provider families to show Custom as first-class with named variants

* fix: make browser command timeout configurable via config.yaml (NousResearch#2801)

browser_vision and other browser commands had a hardcoded 30-second
subprocess timeout that couldn't be overridden. Users with slower
machines (local Chromium without GPU) would hit timeouts on screenshot
capture even when setting browser.command_timeout in config.yaml,
because nothing read that value.

Changes:
- Add browser.command_timeout to DEFAULT_CONFIG (default: 30s)
- Add _get_command_timeout() helper that reads config, falls back to 30s
- _run_browser_command() now defaults to config value instead of constant
- browser_vision screenshot no longer hardcodes timeout=30
- browser_navigate uses max(config_timeout, 60) as floor for navigation

Reported by Gamer1988.

* fix(tools): handle 402 insufficient credits error in vision tool (NousResearch#2802)

Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>

* refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (NousResearch#2804)

Drop the mini-swe-agent git submodule. All terminal backends now use
hermes-agent's own environment implementations directly.

Docker backend:
- Inline the `docker run -d` container startup (was 15 lines in
  minisweagent's DockerEnvironment). Our wrapper already handled
  execute(), cleanup(), security hardening, volumes, and resource limits.

Modal backend:
- Import swe-rex's ModalDeployment directly instead of going through
  minisweagent's 90-line passthrough wrapper.
- Bake the _AsyncWorker pattern (from environments/patches.py) directly
  into ModalEnvironment for Atropos compatibility without monkey-patching.

Cleanup:
- Remove minisweagent_path.py (submodule path resolution helper)
- Remove submodule init/install from install.sh and setup-hermes.sh
- Remove mini-swe-agent from .gitmodules
- environments/patches.py is now a no-op (kept for backward compat)
- terminal_tool.py no longer does sys.path hacking for minisweagent
- mini_swe_runner.py guards imports (optional, for RL training only)
- Update all affected tests to mock the new direct subprocess calls
- Update README.md, CONTRIBUTING.md

No functionality change — all Docker, Modal, local, SSH, Singularity,
and Daytona backends behave identically. 6093 tests pass.

* docs: fix stale and incorrect documentation across 18 files

Cross-referenced all 84 docs pages against the actual codebase and
corrected every discrepancy found.

Reference docs:
- faq.md: Fix non-existent commands (/stats→/usage, /context→/usage,
  hermes models→hermes model, hermes config get→hermes config show,
  hermes gateway logs→cat gateway.log, async→sync chat() call)
- cli-commands.md: Fix --provider choices list (remove providers not
  in argparse), add undocumented -s/--skills flag
- slash-commands.md: Add missing /queue and /resume commands, fix
  /approve args_hint to show [session|always]
- tools-reference.md: Remove duplicate vision and web toolset sections
- environment-variables.md: Fix HERMES_INFERENCE_PROVIDER list (add
  copilot-acp, remove alibaba to match actual argparse choices)

Configuration & user guide:
- configuration.md: Fix approval_mode→approvals.mode (manual not ask),
  checkpoints.enabled default true not false, human_delay defaults
  (500/2000→800/2500), remove non-existent delegation.max_iterations
  and delegation.default_toolsets, fix website_blocklist nesting
  under security:, add .hermes.md and CLAUDE.md to context files
  table with priority system explanation
- security.md: Fix website_blocklist nesting under security:
- context-files.md: Add .hermes.md/HERMES.md and CLAUDE.md support,
  document priority-based first-match-wins loading behavior
- cli.md: Fix personalities config nesting (top-level, not under agent:)
- delegation.md: Fix model override docs (config-level, not per-call
  tool parameter)
- rl-training.md: Fix log directory (tinker-atropos/logs/→
  ~/.hermes/logs/rl_training/)
- tts.md: Fix Discord delivery format (voice bubble with fallback,
  not just file attachment)
- git-worktrees.md: Remove outdated v0.2.0 version reference

Developer guide:
- prompt-assembly.md: Add .hermes.md, CLAUDE.md, document priority
  system for context files
- agent-loop.md: Fix callback list (remove non-existent
  message_callback, add stream_delta_callback, tool_gen_callback,
  status_callback)

Messaging & guides:
- webhooks.md: Fix command (hermes setup gateway→hermes gateway setup)
- tips.md: Fix session idle timeout (120min→24h), config file
  (gateway.json→config.yaml)
- build-a-hermes-plugin.md: Fix plugin.yaml provides: format
  (provides_tools/provides_hooks as lists), note register_command()
  as not yet implemented

* fix: reject relative cwd paths for container terminal backends

When TERMINAL_CWD is set to '.' or any relative path (common when the
CLI config defaults to cwd='.'), container backends (docker, modal,
singularity, daytona) would pass it directly to the container where it's
meaningless. This caused 'docker run -d -w .' to fail.

Now relative paths are caught alongside host paths and replaced with
the default '/root' for container backends.

* docs: add missing skills, CLI commands, and messaging env vars

Complete the documentation gaps identified in the previous audit:

Skills catalogs:
- skills-catalog.md: Add 7 missing bundled skills — data-science/
  jupyter-live-kernel, dogfood/hermes-agent-setup, inference-sh/
  inference-sh-cli, mlops/huggingface-hub, productivity/linear,
  research/parallel-cli, social-media/xitter
- optional-skills-catalog.md: Add 8 missing optional skills —
  blockchain/base, creative/blender-mcp, creative/meme-generation,
  mcp/fastmcp, productivity/telephony, research/bioinformatics,
  security/oss-forensics, security/sherlock

CLI commands reference:
- cli-commands.md: Add full documentation for hermes mcp (add/remove/
  list/test/configure) and hermes plugins (install/update/remove/list)

Messaging platform docs:
- discord.md: Add DISCORD_REQUIRE_MENTION and
  DISCORD_FREE_RESPONSE_CHANNELS to manual config env vars section
- signal.md: Add SIGNAL_ALLOW_ALL_USERS to env var reference table
- slack.md: Add SLACK_HOME_CHANNEL_NAME to config section

* chore: remove all remaining mini-swe-agent references

Complete cleanup after dropping the mini-swe-agent submodule (PR NousResearch#2804):

- Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var
  settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py
- Remove mini-swe-agent health check from hermes doctor
- Remove 'minisweagent' from logger suppression lists
- Remove litellm/typer/platformdirs from requirements.txt
- Remove mini-swe-agent install steps from install.ps1 (Windows)
- Remove mini-swe-agent install steps from website docs
- Update all stale comments/docstrings referencing mini-swe-agent
  in terminal_tool.py, tools/__init__.py, code_execution_tool.py,
  environments/README.md, environments/agent_loop.py
- Remove mini_swe_runner from pyproject.toml py-modules
  (still exists as standalone script for RL training use)
- Shrink test_minisweagent_path.py to empty stub

The orphaned mini-swe-agent/ directory on disk needs manual removal:
  rm -rf mini-swe-agent/

* feat: env var passthrough for skills and user config (NousResearch#2807)

* feat: env var passthrough for skills and user config

Skills that declare required_environment_variables now have those vars
passed through to sandboxed execution environments (execute_code and
terminal).  Previously, execute_code stripped all vars containing KEY,
TOKEN, SECRET, etc. and the terminal blocklist removed Hermes
infrastructure vars — both blocked skill-declared env vars.

Two passthrough sources:

1. Skill-scoped (automatic): when a skill is loaded via skill_view and
   declares required_environment_variables, vars that are present in
   the environment are registered in a session-scoped passthrough set.

2. Config-based (manual): terminal.env_passthrough in config.yaml lets
   users explicitly allowlist vars for non-skill use cases.

Changes:
- New module: tools/env_passthrough.py — shared passthrough registry
- hermes_cli/config.py: add terminal.env_passthrough to DEFAULT_CONFIG
- tools/skills_tool.py: register available skill env vars on load
- tools/code_execution_tool.py: check passthrough before filtering
- tools/environments/local.py: check passthrough in _sanitize_subprocess_env
  and _make_run_env
- 19 new tests covering all layers

* docs: add environment variable passthrough documentation

Document the env var passthrough feature across four docs pages:

- security.md: new 'Environment Variable Passthrough' section with
  full explanation, comparison table, and security considerations
- code-execution.md: update security section, add passthrough subsection,
  fix comparison table
- creating-skills.md: add tip about automatic sandbox passthrough
- skills.md: add note about passthrough after secure setup docs

Live-tested: launched interactive CLI, loaded a skill with
required_environment_variables, verified TEST_SKILL_SECRET_KEY was
accessible inside execute_code sandbox (value: passthrough-test-value-42).

* chore: pin all dependency version ranges (supply chain hardening) (NousResearch#2810)

Adds upper-bound version pins (<next_major) to all dependencies in
pyproject.toml — both core and optional. Previously most deps were
unpinned or had only floor bounds, meaning fresh installs would pull
whatever version was latest on PyPI.

This limits blast radius from supply chain attacks like the litellm
1.82.7/1.82.8 credential stealer (BerriAI/litellm#24512). With bounded
ranges, a compromised major version bump won't be pulled automatically.

Floors are set to current known-good installed versions.

* refactor: update mini_swe_runner to use Hermes built-in backends

Replace all minisweagent imports with Hermes-Agent's own environment
classes (LocalEnvironment, DockerEnvironment, ModalEnvironment).

mini_swe_runner.py no longer has any dependency on mini-swe-agent.
The runner now uses the same backends as the terminal tool, so Docker
and Modal environments work out of the box without extra submodules.

Tested: local and Docker backends verified working through the runner.

* chore: regenerate uv.lock with hashes, use lockfile in setup (NousResearch#2812)

- Regenerate uv.lock with sha256 hashes for all 2965 package artifacts
- Add python_version marker to yc-bench (requires >=3.12)
- Update setup-hermes.sh to prefer 'uv sync --locked' for hash-verified
  installs, with fallback to 'uv pip install' when lockfile is stale

This completes the supply chain hardening: pyproject.toml bounds the
version ranges, and uv.lock pins exact versions with cryptographic
hashes so tampered packages are rejected at install time.

* ci: add supply chain audit workflow for PR scanning (NousResearch#2816)

Scans every PR diff for patterns associated with supply chain attacks:

CRITICAL (blocks merge):
- .pth files (auto-execute on Python startup — litellm attack vector)
- base64 decode + exec/eval combo (obfuscated payload execution)
- subprocess with encoded/obfuscated commands

WARNING (comment only, no block):
- base64 encode/decode alone (legitimate uses: images, JWT, etc.)
- exec/eval alone
- Outbound POST/PUT requests
- setup.py/sitecustomize.py/usercustomize.py changes
- marshal.loads/pickle.loads/compile()

Posts a detailed comment on the PR with matched lines and context.
Excludes lockfiles (uv.lock, package-lock.json) from scanning.

Motivated by the litellm 1.82.7/1.82.8 credential stealer attack
(BerriAI/litellm#24512).

* docs: document 9 previously undocumented features

New documentation for features that existed in code but had no docs:

New page:
- context-references.md: Full docs for @-syntax inline context
  injection (@file:, @folder:, @diff, @StaGeD, @git:, @url:) with
  line ranges, CLI autocomplete, size limits, sensitive path blocking,
  and error handling

configuration.md additions:
- Environment variable substitution: ${VAR_NAME} syntax in config.yaml
  with expansion, fallback, and multi-reference support
- Gateway streaming: Progressive token delivery on messaging platforms
  via message editing (StreamingConfig: enabled, transport, edit_interval,
  buffer_threshold, cursor) with platform support matrix
- Web search backends: Three providers (Firecrawl, Parallel, Tavily)
  with web.backend config key, capability matrix, auto-detection from
  API keys, self-hosted Firecrawl, and Parallel search modes

security.md additions:
- SSRF protection: Always-on URL validation blocking private networks,
  loopback, link-local, CGNAT, cloud metadata hostnames, with
  fail-closed DNS and redirect chain re-validation
- Tirith pre-exec security scanning: Content-level command scanning
  for homograph URLs, pipe-to-interpreter, terminal injection with
  auto-install, SHA-256/cosign verification, config options, and
  fail-open/fail-closed modes

sessions.md addition:
- Auto-generated session titles: Background LLM-powered title
  generation after first exchange

creating-skills.md additions:
- Conditional skill activation: requires_toolsets, requires_tools,
  fallback_for_toolsets, fallback_for_tools frontmatter fields with
  matching logic and use cases
- Environment variable requirements: required_environment_variables
  frontmatter for automatic env passthrough to sandboxed execution,
  plus terminal.env_passthrough user config

* docs: fix api-server response storage — SQLite, not in-memory (NousResearch#2819)

* docs: update all docs for /model command overhaul and custom provider support

Documents the full /model command overhaul across 6 files:

AGENTS.md:
- Add model_switch.py to project structure tree

configuration.md:
- Rewrite General Setup with 3 config methods (interactive, config.yaml, env vars)
- Add new 'Switching Models with /model' section documenting all syntax variants
- Add 'Named Custom Providers' section with config.yaml examples and
  custom:name:model triple syntax

slash-commands.md:
- Update /model descriptions in both CLI and messaging tables with
  full syntax examples (provider:model, custom:model, custom:name:model,
  bare custom auto-detect)

cli-commands.md:
- Add /model slash command subsection under hermes model with syntax table
- Add custom endpoint config to hermes model use cases

faq.md:
- Add config.yaml example for offline/local model setup
- Note that provider: custom is a first-class provider
- Document /model custom auto-detect

provider-runtime.md:
- Add model_switch.py to implementation file list
- Update provider families to show Custom as first-class with named variants

* docs: fix api-server response storage description — SQLite, not in-memory

The ResponseStore class uses SQLite persistence (with in-memory
fallback), not pure in-memory storage. Responses survive gateway
restarts.

---------

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
Co-authored-by: dieutx <dangtc94@gmail.com>
Co-authored-by: Teknium <teknium1@gmail.com>
Co-authored-by: sprmn24 <oncuevtv@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>
Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>
Co-authored-by: devorun <devorun@users.noreply.github.com>
Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>
Co-authored-by: Hermes <hermes@nousresearch.ai>
Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>
InB4DevOps pushed a commit to InB4DevOps/hermes-agent that referenced this pull request Mar 25, 2026
… chain compromise) (NousResearch#2796)

litellm 1.82.7/1.82.8 contained a credential stealer (.pth auto-exec
payload). PyPI quarantined the entire package, blocking all fresh
hermes-agent installs since litellm was listed as a hard dependency.

These three deps (litellm, typer, platformdirs) are only used by the
mini-swe-agent submodule, which has its own pyproject.toml and manages
its own dependencies. They were redundantly duplicated in hermes-agent's
pyproject.toml.

Also fixes install.sh to not print 'mini-swe-agent installed' on
failure, and updates warning messages in both install scripts to clarify
that only Docker/Modal backends are affected — local terminal is
unaffected.

Ref: BerriAI/litellm#24512
outsourc-e pushed a commit to outsourc-e/hermes-agent that referenced this pull request Mar 26, 2026
… chain compromise) (NousResearch#2796)

litellm 1.82.7/1.82.8 contained a credential stealer (.pth auto-exec
payload). PyPI quarantined the entire package, blocking all fresh
hermes-agent installs since litellm was listed as a hard dependency.

These three deps (litellm, typer, platformdirs) are only used by the
mini-swe-agent submodule, which has its own pyproject.toml and manages
its own dependencies. They were redundantly duplicated in hermes-agent's
pyproject.toml.

Also fixes install.sh to not print 'mini-swe-agent installed' on
failure, and updates warning messages in both install scripts to clarify
that only Docker/Modal backends are affected — local terminal is
unaffected.

Ref: BerriAI/litellm#24512
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
… chain compromise) (NousResearch#2796)

litellm 1.82.7/1.82.8 contained a credential stealer (.pth auto-exec
payload). PyPI quarantined the entire package, blocking all fresh
hermes-agent installs since litellm was listed as a hard dependency.

These three deps (litellm, typer, platformdirs) are only used by the
mini-swe-agent submodule, which has its own pyproject.toml and manages
its own dependencies. They were redundantly duplicated in hermes-agent's
pyproject.toml.

Also fixes install.sh to not print 'mini-swe-agent installed' on
failure, and updates warning messages in both install scripts to clarify
that only Docker/Modal backends are affected — local terminal is
unaffected.

Ref: BerriAI/litellm#24512
alt-glitch added a commit that referenced this pull request May 12, 2026
…ain policy

After the Mini Shai-Hulud supply chain campaign (May 2026) and the litellm
compromise (March 2026), codify the dependency pinning policy that was
established in PRs #2810 and #9801 but never written down for contributors.

Changes:
- pyproject.toml: Add tight upper bounds to the 5 deps that slipped
  through as review escapes from external contributor PRs:
  - hindsight-client>=0.4.22,<0.5 (was >=0.4.22)
  - aiosqlite>=0.20,<0.23 (was >=0.20)
  - asyncpg>=0.29,<0.32 (was >=0.29)
  - alibabacloud-dingtalk>=2.0.0,<3 (was >=2.0.0)
  - youtube-transcript-api>=1.2.0,<2 (was >=1.2.0)

  Pre-1.0 packages get <0.(current_minor+2) — tight enough to block
  hostile minor releases but loose enough to not require bumps every week.

- CONTRIBUTING.md: Add 'Dependency pinning policy' section under Security
  with the full rationale, table of source types + treatments, and examples.

- AGENTS.md: Add concise 'Dependency Pinning Policy' section for AI coding
  agents with the decision table and step-by-step checklist.

- supply-chain-audit.yml: Add dep-bounds job that fails PRs introducing
  PyPI deps without <ceiling upper bounds. Fires on pyproject.toml changes.
  Posts a PR comment with the specific unbounded specs found.

Refs: #2796 #2810 #9801 #24205
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
… chain compromise) (NousResearch#2796)

litellm 1.82.7/1.82.8 contained a credential stealer (.pth auto-exec
payload). PyPI quarantined the entire package, blocking all fresh
hermes-agent installs since litellm was listed as a hard dependency.

These three deps (litellm, typer, platformdirs) are only used by the
mini-swe-agent submodule, which has its own pyproject.toml and manages
its own dependencies. They were redundantly duplicated in hermes-agent's
pyproject.toml.

Also fixes install.sh to not print 'mini-swe-agent installed' on
failure, and updates warning messages in both install scripts to clarify
that only Docker/Modal backends are affected — local terminal is
unaffected.

Ref: BerriAI/litellm#24512
kshitijk4poor pushed a commit that referenced this pull request May 15, 2026
…ain policy (#24226)

After the Mini Shai-Hulud supply chain campaign (May 2026) and the litellm
compromise (March 2026), codify the dependency pinning policy that was
established in PRs #2810 and #9801 but never written down for contributors.

Changes:
- pyproject.toml: Add tight upper bounds to the 5 deps that slipped
  through as review escapes from external contributor PRs:
  - hindsight-client>=0.4.22,<0.5 (was >=0.4.22)
  - aiosqlite>=0.20,<0.23 (was >=0.20)
  - asyncpg>=0.29,<0.32 (was >=0.29)
  - alibabacloud-dingtalk>=2.0.0,<3 (was >=2.0.0)
  - youtube-transcript-api>=1.2.0,<2 (was >=1.2.0)

  Pre-1.0 packages get <0.(current_minor+2) — tight enough to block
  hostile minor releases but loose enough to not require bumps every week.

- CONTRIBUTING.md: Add 'Dependency pinning policy' section under Security
  with the full rationale, table of source types + treatments, and examples.

- AGENTS.md: Add concise 'Dependency Pinning Policy' section for AI coding
  agents with the decision table and step-by-step checklist.

- supply-chain-audit.yml: Add dep-bounds job that fails PRs introducing
  PyPI deps without <ceiling upper bounds. Fires on pyproject.toml changes.
  Posts a PR comment with the specific unbounded specs found.

Refs: #2796 #2810 #9801 #24205
hawknewton added a commit to AmbulnzLLC/hermes-agent that referenced this pull request May 15, 2026
* test(ci): stabilize shared optional dependency baselines

* fix(install): preserve pip entry point when re-running on symlinked install

setup_path() writes the user-facing hermes shim with `cat >`, which
follows existing symlinks. Older installs created
`$command_link_dir/hermes` as a symlink to `$HERMES_BIN`
(`venv/bin/hermes`), so re-running install.sh stomped the pip entry
point with a bash shim that exec'd itself in an infinite loop.

`rm -f` the link target before writing so the shim lands at
`$command_link_dir/hermes` and the venv entry point is left intact.

Adds a regression test that reproduces the symlink-stomp end-to-end
(creates the symlink, drives the real shim-write block from setup_path,
asserts the venv pip script body survives and the shim is now a regular
file). Both new assertions fail on origin/main and pass with the fix.

Closes #21454.

* feat(discord): render clarify choices as buttons

Brings Discord to parity with Telegram on the clarify tool's interactive
UX. Overrides BasePlatformAdapter.send_clarify on DiscordAdapter to attach
a button view when choices are present.

  - ClarifyChoiceView: one discord.ui.Button per choice (max 24, Discord's
    25-component view cap leaves one slot for Other) plus a final
    'Other (type answer)' button.
  - Numeric click -> tools.clarify_gateway.resolve_gateway_clarify(
    clarify_id, choice_text) using the canonical choice text from the
    gateway entry (falls back to the button label if the entry vanished).
  - Other click -> tools.clarify_gateway.mark_awaiting_text(clarify_id) so
    the gateway's text-intercept captures the next user message in this
    session as the response.
  - Auth via the shared _component_check_auth helper (same OR-semantics as
    ExecApprovalView / SlashConfirmView / UpdatePromptView / ModelPickerView).
  - Open-ended (no choices) path renders the prompt as a plain embed and
    relies on the existing text-intercept resolution.
  - Single-use: first valid click disables every button and updates the
    embed footer with who answered and what they chose.

No changes to BasePlatformAdapter.send_clarify or the gateway's
clarify_callback wiring -- the existing scaffolding already drives all
adapters; Discord just inherits the default text fallback today and gains
buttons by virtue of this override.

Test conftest extended: _FakeEmbed gains add_field() / set_footer() stubs
so tests can construct embedded views without monkey-patching per-test.

Original PR: #19249 by @LeonSGP43. This is a reshape of the contributor's
work onto current main's clarify infrastructure (clarify_id + entry-based
resolution shared with Telegram, instead of a parallel on_answer-closure
mechanism). The button view structure and UX shape are preserved.

Tests: 14 new tests in tests/gateway/test_discord_clarify_buttons.py.
391/391 existing Discord gateway tests still pass.

Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>

* fix(cli): allow rotating broken OpenRouter / AI Gateway key in `hermes model` flow (#25750)

Before: when `OPENROUTER_API_KEY` (or `AI_GATEWAY_API_KEY`) was already
set in ~/.hermes/.env, `hermes model openrouter` / `hermes model
ai-gateway` skipped the API-key prompt entirely and jumped straight to
the model picker. Users with a broken / expired / wrong key had no way
to replace it without editing ~/.hermes/.env by hand or re-running
`hermes setup` from scratch.

Both flows now route through the existing `_prompt_api_key()` helper,
which surfaces [K]eep / [R]eplace / [C]lear when a key is already
configured — the same UX the generic API-key providers (z.ai, MiniMax,
Gemini, etc.) and the Daytona setup already use.

* fix(install.ps1): pin uv sync to venv\, verify baseline imports on Windows (#25755)

* fix(cli): allow rotating broken OpenRouter / AI Gateway key in `hermes model` flow

Before: when `OPENROUTER_API_KEY` (or `AI_GATEWAY_API_KEY`) was already
set in ~/.hermes/.env, `hermes model openrouter` / `hermes model
ai-gateway` skipped the API-key prompt entirely and jumped straight to
the model picker. Users with a broken / expired / wrong key had no way
to replace it without editing ~/.hermes/.env by hand or re-running
`hermes setup` from scratch.

Both flows now route through the existing `_prompt_api_key()` helper,
which surfaces [K]eep / [R]eplace / [C]lear when a key is already
configured — the same UX the generic API-key providers (z.ai, MiniMax,
Gemini, etc.) and the Daytona setup already use.

* fix(install.ps1): pin uv sync target to venv\, verify baseline imports

Two related Windows-installer bugs that produce a broken venv with
`ModuleNotFoundError: No module named 'dotenv'` on first `hermes` run.

## Bug 1: uv sync ignores VIRTUAL_ENV, syncs into .venv\ instead of venv\

`Install-Dependencies` creates the venv at `venv\` via `uv venv venv`,
sets `$env:VIRTUAL_ENV = "$InstallDir\venv"`, then runs
`uv sync --extra all --locked`. Modern uv (>=0.5) ignores `VIRTUAL_ENV`
for the `sync` subcommand and uses the project default `.venv\`
instead. Result: deps land in `$InstallDir\.venv\`, `venv\` stays
empty except for the python.exe stub from the earlier `uv venv` call,
`hermes.exe` ends up wired to the wrong site-packages.

The bash installer (`scripts/install.sh`) already worked around this in
`install_deps()` line 1127 by passing `UV_PROJECT_ENVIRONMENT` — that
flag tells uv exactly where to put the project env regardless of
`VIRTUAL_ENV`. Port the same fix to PowerShell.

## Bug 2: no post-install verification

If the sync still misdirects for any other reason (uv version drift,
filesystem quirk, user re-run scenarios), the installer reports success
and the user only finds out by running `hermes` and getting an
unhelpful traceback. Add a baseline-import probe that runs the venv's
own python against the four packages every `hermes` invocation needs
(`dotenv`, `openai`, `rich`, `prompt_toolkit`). On failure, throw
with a recovery command tailored to whether a sibling `.venv\` exists.

User report (Windows 11, Python 3.13.5, Hermes v0.13.0): manual repro
steps were exactly this — `uv sync` landed in `.venv\`, recovered by
junctioning `venv\` → `.venv\` to bridge the path mismatch.

* fix(telegram): escape dynamic markdown in callback flows

Use MarkdownV2 formatting for Telegram callback follow-ups and interactive prompts where dynamic names or user text can break legacy Markdown parsing. Add regression coverage for reload-mcp, model picker, approval callbacks, and update prompts.

* fix(telegram): restore model-switch success path + author map

The cherry-picked PR over-indented the edit_message_text block for
the mm: (model selected → switch) success path so the confirmation
edit lived inside the preceding 'except Exception as exc' branch and
only fired when the callback raised. Dedent the try/except back to
12-space indent so it runs after the callback succeeds, restoring
the original flow that removes the inline buttons and shows the
'Switched to ...' confirmation.

Add a regression test (test_model_selected_edits_message_on_success)
that asserts edit_message_text is awaited and the result text is
routed through format_message (MARKDOWN_V2 + backtick survival).

Add phuongvm to scripts/release.py AUTHOR_MAP.

* fix(memory): skip OpenViking upload symlinks

* fix(codex-runtime): retire wedged sessions + post-tool watchdog + OAuth refresh classify (#25769)

Mirrors openclaw beta.8's app-server resilience fixes so a stuck codex
subprocess can't burn the full turn deadline and so users get a
`codex login` pointer instead of raw RPC errors when their token expires.

- TurnResult.should_retire signals the caller to drop+respawn codex.
- Deadline-hit path and dead-subprocess detection set should_retire so
  the next turn doesn't ride a CPU-spinning or auth-broken process.
- Post-tool watchdog (post_tool_quiet_timeout=90s): if a tool item
  completes and codex goes silent past the threshold without further
  output or turn/completed, fast-fail instead of waiting the full 600s.
  Resets on any non-tool activity so normal think-after-tool flows are
  not affected.
- <turn_aborted> and <turn_aborted/> in agent text are treated as
  terminal — some codex builds tear down a turn that way without
  emitting turn/completed.
- _classify_oauth_failure() inspects RPC error message + stderr tail
  for invalid_grant / token refresh / 401 / etc. and rewrites
  user-facing errors to 'run codex login'. Conservative: generic
  failures still surface verbatim. Fires at turn/start failure,
  turn/completed failure, and dead-subprocess paths.
- thread/start cross-fill: tolerate thread.id, thread.sessionId,
  top-level sessionId/threadId so future codex schema drift doesn't
  KeyError us at handshake.
- run_agent.py: when run_turn returns should_retire=True OR raises,
  close + null self._codex_session so the next turn respawns.

Tests: +30 cases across session + integration suites.
  tests/agent/transports/test_codex_app_server_session.py 50/50 pass
  tests/run_agent/test_codex_app_server_integration.py 27/27 pass
  Broader codex scope (transports + cli runtime/migration) 376/376 pass

* chore(release): add AUTHOR_MAP entries for second new-contributor batch

Pre-stages AUTHOR_MAP for 7 new contributors in the upcoming batch:

- HxT9          (#25760)
- evgyur        (#25651)
- AsoTora       (#25624)
- oxngon        (#25603)
- yifengingit   (#25589)
- vanthinh6886  (#25562)
- Arkmusn       (#25559)

EthanGuo-coder, wesleysimplicio, and zccyman are already in the map.

* fix: read approvals.timeout from config in CLI approval callback

The _approval_callback method in HermesCLI hardcoded timeout=60
instead of reading the approvals.timeout config value. This meant
the config setting was silently ignored for CLI interactive prompts.

Other approval paths (callbacks.py, tools/approval.py) already read
the config correctly — only cli.py was missed.

* fix: use AUTOINCREMENT id for message ordering instead of timestamp

On WSL2 (and similar environments), time.time() is not strictly monotonic
due to NTP sync or host clock adjustments. When clock regression occurs
during a multi-tool flush, later-inserted rows get earlier timestamps,
causing ORDER BY timestamp, id to sort them before rows that were written
first. This breaks the tool_calls/tool_response adjacency invariant and
triggers HTTP 400 from the API.

Use ORDER BY id instead, since id (INTEGER PRIMARY KEY AUTOINCREMENT)
always reflects true insertion order regardless of system clock behavior.

* docs: clarify media impact on session context

* fix: stop retrying initial MCP auth failures

* fix(gateway): enable text-intercept for multi-choice clarify fallback (#25567)

* fix: restrict .env file permissions to 0600

Set file mode 0600 on ~/.hermes/.env after creation in the installer and
after every write via memory_setup._write_env_vars(). This ensures only
the file owner can read/write API keys and tokens, matching standard
practice for credential files (.netrc, .aws/credentials, .ssh/config).

Fixes #25477

* fix(gateway): forward image attachments to background agent tasks

When the gateway spawned a background agent (e.g. for delegation), media
URLs and types from the originating message weren't forwarded — the bg
agent saw the prompt but no attached images. Vision-enabled tasks
effectively lost their inputs.

Forwards media_urls/media_types through the bg-task spawn path and
runs the same vision-enrichment step the main flow uses, so the bg
agent gets image descriptions inlined into its prompt.

Closes #25614.

Salvage of #25603 by @oxngon (manually re-applied — original branch
was severely stale against current main).

* fix(terminal): prevent safety filter false positives on keywords inside quoted strings

The _foreground_background_guidance() function matched background-wrapper
keywords (nohup/disown/setsid) anywhere in the command text, including
inside quoted strings, Python -c code, commit messages, and PR body text.

Two-layer fix:
1. Strip single-quoted, double-quoted, and backtick-quoted content before
   pattern matching via _strip_quotes() helper.
2. Tighten the regex to only match keywords at command-start positions
   (after ^, ;, &, &&, ||, or $() — not mid-argument.

Both layers are needed: quote stripping handles the common case of keywords
in string literals, and the position-aware regex handles unquoted cases
like 'export FOO=setsid' (word boundary match, wrong position).

Fixes #20064

* chore(release): map oswaldb22 noreply email for AUTHOR_MAP

Co-Authored-By: Oswald <oswaldb22@users.noreply.github.com>

* test(toolsets): lock web search into default platform coverage

Adds regression tests pinning web search into the WhatsApp and api-server
default platform-coverage toolsets. Pure test additions, no runtime change.

Salvage of the test-addition commit from #25692 by @wesleysimplicio.
(The AUTHOR_MAP fixup commit from the same PR landed separately as
529ec85c7.)

* fix(update): refresh lazy-installed backends on hermes update (#25766)

Pyproject's [all] extra was slimmed down in May 2026 — ~20 optional
backends moved to tools/lazy_deps.py and only install on first use.
hermes update runs uv pip install -e .[all] which doesn't touch any of
them, so pin bumps in LAZY_DEPS (CVE response, transitive fixes) were
silently ignored on already-activated backends.

Two changes:

1. _is_satisfied() now parses the spec and checks the installed version
   against the constraint via packaging.specifiers. Previously it
   returned True the moment the package name was importable, which made
   ensure() a name-presence gate rather than a version-pin gate.

2. New active_features() / refresh_active_features() pair: lists every
   feature with at least one of its packages currently installed, then
   re-runs ensure() on each. Refresh is invoked at the end of
   _cmd_update_impl, right after the [all] install completes. Cold
   backends (never activated) stay quiet — no churn for them.

Output during update is one summary block:
  → Refreshing 4 active lazy backend(s)...
    ↑ 1 refreshed: provider.anthropic
    ✓ 3 already current
or
    ⚠ memory.honcho failed to refresh: <pip stderr>

Failures never raise out of update — backends keep their previously-
installed version and we tell the user to rerun once upstream is fixed.
security.allow_lazy_installs=false is honored: features get marked
"skipped" with the reason shown.

Tests: 18 new unit tests covering version-aware satisfaction (exact pin,
range, extras blocks, missing package, malformed spec), active feature
discovery, and refresh status reporting. All 61 lazy_deps tests pass.

* fix(agent/gemini-cloudcode): seed delta defaults for reasoning-only stream chunks

_make_stream_chunk built delta_kwargs with only `role`, so a reasoning-only
chunk produced a SimpleNamespace without a `.content` attribute. Downstream
consumers that read `delta.content` then raised AttributeError on Gemini 2.5
Flash, where the thinking delta arrives before any content delta.

Seed `content`, `tool_calls`, `reasoning`, and `reasoning_content` as None
up front, matching the pattern already used in gemini_native_adapter.py.
Key-present arguments still override the defaults.

Fixes #24974
References: Related open PR #24984 (luyao618) applies the same 1-line fix; this PR adds a regression test that #24984 omits
Co-Authored-By: Claude <noreply@anthropic.com>

* fix(install): support non-sudo service-user installs on apt distros (#25814)

The Debian/Ubuntu branch of install_node_deps() ran 'npx playwright install
--with-deps chromium' unconditionally. Playwright invokes sudo interactively
to apt-install Chromium's system libraries, which blocks the installer for
non-sudo users (systemd service accounts, unprivileged operator users) on
an unsatisfiable password prompt.

Changes:
- install.sh: gate --with-deps behind a sudo capability check on the apt
  branch (matches the existing Arch/pacman branch pattern). Non-sudo users
  fall back to 'npx playwright install chromium' alone and the installer
  prints the exact 'sudo npx playwright install-deps chromium' command an
  administrator can run separately.
- install.sh: add --skip-browser (alias --no-playwright) to skip the
  Playwright step entirely for headless installs that don't need browser
  automation. Mirrors the existing --no-venv / --skip-setup shape.
- installation.md: add a 'Non-Sudo / System Service User Installs' section
  covering the admin/service-user split, the --skip-browser flag, and the
  ~/.local/bin PATH gotcha (the root cause of the 'No module named dotenv'
  error users hit when running the repo source 'hermes' script with system
  Python instead of the venv launcher).
- test_install_sh_browser_install.py: regression coverage for the
  --skip-browser flag and the sudo-gate on the apt branch.

Reported by @ssilver in Discord.

* skill(comfyui): add template-integrity reference from @purzbeats (#25828)

Adds references/template-integrity.md covering safe conversion of the
official comfyui-workflow-templates package from editor format to API
format — Reroute bypass via link tracing, dotted dynamic-input keys
(values.a, resize_type.width) that must NOT be flattened, server-error
"patch don't rebuild" loop, Cloud quirks (302 redirect to signed GCS
URL, free-tier 1 concurrent job, 1920x1080 OOM on RTX 5090), and a
Discord-compatible ffmpeg stitch recipe (yuv420p + xfade/acrossfade).

SKILL.md lists the new reference so the agent loads it when starting
from an official template. purzbeats added to author list and to
scripts/release.py AUTHOR_MAP.

Co-authored-by: purzbeats <97489706+purzbeats@users.noreply.github.com>

* fix(whatsapp): drop status broadcasts and channel newsletters before agent dispatch (#25845)

WhatsApp pseudo-chats (Status updates / Stories, Channels / Newsletters,
broadcast lists) were being routed through the full agent pipeline. A
user's gateway.log showed the agent replying to a contact's Story
('status@broadcast') with 345 chars plus title-generation cost, which
also shows up in the contact's status feed.

Drop these JIDs at _should_process_message() before the policy gate so
they're filtered regardless of dm_policy or allowlist state. Covers:
- status@broadcast (Stories)
- *@newsletter (Channels)
- *@broadcast (broadcast lists, future-proofing)

The bridge.js already filters these on the fromMe outbound path, but
inbound events on self-chat mode skipped that check.

Tests:
- status@broadcast dropped on open policy
- broadcast filter wins over allowlisted senders
- real DMs still pass through
- helper unit cases (case-insensitive, whitespace-tolerant)

26/26 tests/gateway/test_whatsapp_group_gating.py pass; 59/59 adjacent
WhatsApp test suites pass.

* fix(ci): stabilize shared test state after 21012

* fix(telegram): set REQUIRES_EDIT_FINALIZE so final MarkdownV2 edit is not skipped

When the final streamed text is identical to the last plain-text edit,
stream_consumer._send_or_edit short-circuits and never calls
adapter.edit_message(finalize=True).  For Telegram, this skips the
plain-text → MarkdownV2 conversion, leaving raw Markdown syntax visible
to the user.

Set REQUIRES_EDIT_FINALIZE = True on TelegramAdapter so the finalize
edit is always delivered, matching the existing DingTalk pattern.

Fixes #25710

* fix(gateway): load streaming config from nested gateway.streaming key

`hermes config set gateway.streaming.*` writes the streaming block
nested under a `gateway:` key in config.yaml, but the config loader
only checked for a top-level `streaming:` key — silently ignoring
the nested variant.

Fall back to `yaml_cfg['gateway']['streaming']` when the top-level
key is absent, matching the pattern already used for other nested
config sections.

Closes #25676

* fix(gateway): prevent duplicate final send when only cosmetic edit failed

When the stream consumer's got_done handler successfully delivers the
final response content via _send_or_edit but the subsequent edit
(e.g. cursor removal) fails, final_response_sent remains False even
though the user has already received the final answer. The gateway's
fallback send path then re-delivers the same content, causing the
user to see the response twice on Telegram.

Introduce a new _final_content_delivered flag on the stream consumer,
set by the got_done handler when the final content has reached the
user. The _run_agent suppression logic now treats this flag as an
additional signal (alongside final_response_sent and
response_previewed) that final delivery is already complete.

This preserves the existing behavior for intermediate-text-only
streams (where already_sent=True but no final content has been
delivered) — those still receive the gateway's fallback send, matching
the test expectation in test_partial_stream_output_does_not_set_already_sent.

Adds TestFinalContentDeliveredSuppression with two cases covering
both the suppression (content delivered + edit failed) and the
non-suppression (intermediate text only) branches.

* fix(agent): keep image tool results from poisoning text-only sessions

* fix(codex-app-server): attach redacted stderr tail to generic failures (#25929)

When codex app-server fails outside the OAuth-classified path
(non-auth turn/start errors, plain TimeoutErrors, generic turn-ended
status, subprocess silently exits, hard deadline timeout), the user
got a bare 'Internal error' / 'turn/start failed: ...' with no
context. Diagnosing config/provider/auth-bridge issues forced a
re-run with verbose codex flags.

Add a _format_error_with_stderr helper that appends the last few
stderr lines via agent.redact.redact_sensitive_text(force=True),
and use it at every catch-all error site:

- ensure_started() failures (codex init / thread/start) now return
  a TurnResult.error with should_retire=True instead of bubbling
- non-OAuth turn/start CodexAppServerError / TimeoutError
- subprocess-died branch (previously dumped raw stderr_blob[-300:]
  with no redaction — a leak risk)
- turn ended with non-completed status
- hard turn-timeout deadline

OAuth-classified failures and the post-tool quiet watchdog already
produce clean hints and stay unchanged. The redactor catches sk-*,
gh*_*, Authorization: Bearer, query-string tokens, JWTs, private
keys, etc., so provider error payloads can't leak into chat output
or trajectories.

Inspired by openclaw#80718, adapted for our app-server transport.

* fix(cli): batch resize history replay

* chore(release): map @1000Delta in AUTHOR_MAP

* fix(voice): remove per-tool-call beep in CLI voice mode (#25967)

The spinner already shows tool activity visually; the 1.2 kHz tone on
every tool.started event was unwanted noise (especially on WSL2, where
each beep also triggers Windows Terminal's bell notification).

Removed the play_beep call in _on_tool_progress entirely. Record
start/stop beeps (gated by voice.beep_enabled) are unaffected.

* fix: preserve ansi output history on resize replay

* chore(release): map @LeonSGP43 commit email in AUTHOR_MAP

* fix(cli): clamp scrollback box widths + suppress status bar after resize (#25975)

When the terminal shrinks, already-printed box-drawing rules (response,
reasoning, streaming TTS, background-task Panels) reflow into multiple
narrower rows — visible as duplicated horizontal separators / ghost
lines in scrollback. Similarly, prompt_toolkit redraws a fresh status
bar on SIGWINCH on top of one the terminal just reflowed, producing
double-bar artifacts on column shrink.

Two surgical changes:

1. Decorative scrollback boxes now use a new
   `HermesCLI._scrollback_box_width()` helper that clamps to
   `max(32, min(width, 56))`. The live TUI footer is unaffected and still
   uses the full width. Covers: streaming response box (open + close),
   reasoning box (open + close, both streaming and post-stream paths),
   streaming-TTS box close, final-response Rich Panel, and the
   background-task Rich Panel.

2. `_recover_after_resize()` now also sets a new
   `_status_bar_suppressed_after_resize` flag so the dynamic status bar
   and both input separator rules stay hidden until the next user input.
   The flag is cleared in the process loop the moment the user submits
   their next prompt, restoring chrome cleanly.

Tests:
- New `test_input_rules_hide_after_resize_until_next_input` covers the
  flag's effect on rule heights.
- New `test_scrollback_box_width_caps_to_resize_safe_value` covers the
  helper at floor / cap / mid-range / overflow.
- Existing resize-recovery test extended to assert the flag flips.

Refs: #18449 #19280 #22976
Salvage of #24403.

Co-authored-by: Szymonclawd <szymonclawd@mac.home>

* fix(ui-tui): heal same-dimension alt-screen resize drift

- Treat same-dimension resize events in alt-screen mode as a repaint
  signal, because terminal hosts can reflow or restore the physical
  buffer without changing columns/rows.
- Ensure pending resize erases are emitted even when the virtual diff
  is empty, so stale physical glyphs are still cleared.
- Extract alt-screen resize repaint into prepareAltScreenResizeRepaint()
  for readability.
- Add defensive clearTimeout in prepareAltScreenResizeRepaint so rapid
  resize bursts don't stack redundant delayed repaints.
- Add a focused regression test for same-dimension alt-screen resize
  healing.

Addresses #18449
Related to #17961

* chore(release): map @luoyuctl in AUTHOR_MAP

* feat(proxy): local OpenAI-compatible proxy for OAuth providers (#25969)

Adds 'hermes proxy start' — a local HTTP server that lets external apps
(OpenViking, Karakeep, Open WebUI, ...) use a Hermes-managed provider
subscription as their LLM endpoint. The proxy attaches the user's real
OAuth-resolved credentials to each forwarded request, refreshing them
automatically; the client can send any bearer (it gets stripped).

Ships with one adapter — Nous Portal. The UpstreamAdapter ABC and
registry in hermes_cli/proxy/adapters/ are designed for additional
OAuth providers to plug in by name without server changes.

Commands:
  hermes proxy start [--provider nous] [--host 127.0.0.1] [--port 8645]
  hermes proxy status
  hermes proxy providers

Allowed Portal paths: /v1/chat/completions, /v1/completions,
/v1/embeddings, /v1/models. Anything else returns 404 with a clear
error pointing at the allowed list.

aiohttp is gated like gateway/platforms/api_server.py (try-import,
clean runtime error if missing). No new core dependency.

Tests: 24 unit tests + 1 separate E2E that spawns the real subprocess
and verifies the upstream receives the right bearer with the client's
header stripped.

* feat(discord): channel history backfill for multi-user sessions

Adds optional channel-context backfill for Discord shared-channel sessions
so the agent can see recent messages it missed between its own turns
(typically when require_mention=true filters out most traffic).

Previously the agent only saw the @mention message that triggered it, which
led to disorienting replies in active multi-user channels where the
conversation context was invisible. With backfill enabled, a configurable
number of recent messages are fetched per-turn and prepended to the trigger
message as a context block, kept separate from sender-prefix logic so
attribution remains clean.

This re-opens the work from #13063 (approved by @OutThisLife on 2026-04-20,
closed when I closed the branch to address the simpolism:main head-branch
issue plus an ordering bug I caught later in live use). Filing against the
freshly-rewritten problem statement in #13054 so the design is grounded in
the failure mode rather than the implementation shape.

The implementation follows the **push-mode last-self-anchored** design from
the two options laid out in #13054. See the issue for the trade-off
discussion vs pull-mode (#13120 was an earlier closed PR using that shape).
Treating this as a reference implementation — happy to rewrite as
last-trigger anchoring or as a hybrid with #13120 if maintainers prefer.

Changes:

- gateway/platforms/discord.py:
  - new `_discord_history_backfill()` / `_discord_history_backfill_limit()`
    helpers (config.extra > env > default), mirroring the existing
    `_discord_require_mention()` shape
  - new `_fetch_channel_context()` that scans `channel.history()` backwards
    from the trigger to the bot's last message (or limit), formats as
    `[Recent channel messages] / [name] msg / ...`, respects DISCORD_ALLOW_BOTS,
    skips system messages
  - per-channel `_last_self_message_id` cache to narrow the fetch window
    on hot paths (avoids full history scan when the bot has spoken recently)
  - **IMPORTANT**: passes `oldest_first=False` explicitly to `channel.history()`.
    discord.py 2.x silently flips the default to True when `after=` is supplied,
    which would select the EARLIEST N messages after our last response instead
    of the LATEST N before the trigger. In high-traffic windows this would
    return stale tool traces and drop the actual final answer the user is
    asking about. See regression test below. Caught in live use during a
    Codex tool-trace burst on May 13 2026.
- gateway/config.py: discord_history_backfill + discord_history_backfill_limit
  settings + yaml→env bridge
- gateway/platforms/base.py: channel_context field on MessageEvent
- gateway/run.py: prepend channel_context after sender-prefix so the
  [sender name] tag applies to the trigger message alone, not to the backfill
- hermes_cli/config.py: defaults for new discord.history_backfill and
  discord.history_backfill_limit keys
- cli-config.yaml.example: documented defaults
- tests/gateway/test_discord_free_response.py: 7 new tests covering
  cold-start backfill, self-message stop boundary, other-bot filtering,
  cache hot-path narrowing, stale-cache fallback, shared-channel +
  per-user backfill paths, and the ordering regression test
  (`test_fetch_channel_context_cache_uses_latest_window_when_after_set`)
- tests/gateway/test_config.py: yaml→env bridge tests
- tests/gateway/test_session.py: prefix-order edge cases
- website/docs/user-guide/messaging/discord.md: env vars + config keys +
  usage docs

Tested on Ubuntu 24.04 — empirically validated in my own multi-bot Discord
research server for the past three weeks.

Fixes #13054
Supersedes #13063 (closed)

* feat(discord): default history backfill on, expand to per-user + threads

Follow-up to snav's PR #25463 contribution: flip default to on, broaden
scope so backfill fires whenever require_mention gates the bot (not just
shared-session channels).

Why:
- The mention-gate creates a session-transcript gap regardless of whether
  the channel is shared or per-user. In per-user sessions, Alice's session
  is still missing other participants' messages and her own pre-mention
  messages — backfill fills both gaps.
- Threads naturally scope to thread-only history because discord.py's
  channel.history() on a thread returns only that thread's messages.
- DMs still skip — every DM triggers the bot, so the session transcript
  is already complete.

Changes:
- hermes_cli/config.py: discord.history_backfill default → true
- gateway/platforms/discord.py: drop the _is_shared gate, keep _is_dm
  skip and _needed_mention gate; env var DISCORD_HISTORY_BACKFILL
  default → 'true'
- cli-config.yaml.example + website docs: update defaults and prose;
  add the DISCORD_HISTORY_BACKFILL / _LIMIT env var rows that were
  documented in the PR description but missing from the env-var table
- tests/gateway/test_discord_free_response.py:
  - flip test_discord_per_user_channel_does_not_backfill →
    test_discord_per_user_channel_backfills_too (new behavior)
  - add test_discord_dm_does_not_backfill (DM skip is invariant)
  - give FakeThread a no-op history() so existing thread tests don't hit
    a fake discord.Forbidden when backfill now fires on threads too

Tests: 160/160 in target files; 400/400 across all tests/gateway/ -k discord.

* fix(web): make sync-assets script cross-platform

The prebuild step used `rm -rf` and `cp -r`, which fail on Windows
(`'rm' is not recognized`). Replace with an inline Node one-liner
using fs.rmSync / fs.cpSync so the build works on Windows, macOS,
and Linux without adding a dependency.

* fix(lsp): shift baseline diagnostics into post-edit coordinates (#25978)

Pre-existing diagnostics below an edit point used to surface as 'LSP
diagnostics introduced by this edit' whenever the edit deleted or
inserted lines.  The delta-filter key included the diagnostic's
range, so the same logical error reported at a different line in
the post-edit snapshot looked like a brand new diagnostic.

Concrete case: deleting 14 lines in cli.py caused Pyright errors at
lines 9873, 10590, 12413, 13004 (unrelated to the edit) to be
reported as introduced by it.

Fix: build a piecewise-linear line-shift map (via difflib's
SequenceMatcher) from pre and post content, and remap baseline
diagnostics into post-edit coordinates before the set-difference.
Diagnostics in deleted regions drop out cleanly; diagnostics below
the edit shift by the right amount; diagnostics above are untouched.
The strict (range-aware) equality key stays — so a genuinely new
instance of an identical error class at a different line still
surfaces as new.

Pieces:
- agent/lsp/range_shift.py — build_line_shift, shift_diagnostic_range,
  shift_baseline.  Pure functions, no LSP state.
- agent/lsp/manager.py — LSPService.get_diagnostics_sync gains an
  optional line_shift kwarg; baseline is shift_baseline'd before
  computing the seen-set.  _diag_key keeps the strict range key.
- tools/file_operations.py — write_file captures pre_content for any
  LSP-handled extension (not just LINTERS_INPROC) and passes pre/post
  to _maybe_lsp_diagnostics, which builds the shift map.
- New _lsp_handles_extension helper guards the pre_content read.

Trade-offs preserved:
- Genuinely new same-class errors at different lines still surface
  (content-only key would have swallowed them).
- Pre-existing errors at unshifted positions still get filtered
  (covered by the strict-key path with no shift).
- Best-effort: when pre_content can't be captured (file didn't
  exist, permissions), the unshifted comparison still catches
  most pre-existing errors; the edge case it misses is a new file
  with a non-empty baseline, which is structurally impossible.

* fix(web): cross-platform sync-assets + surface build errors on failure

Three Windows-only bugs in the web-dashboard build path. Each is small,
scoped, and verified end-to-end on Windows 11 — including under a stock
cmd.exe / PowerShell console with its default cp1252 encoding.

1. `sync-assets` shells out to Unix-only commands

   web/package.json hard-codes `rm -rf … && cp -r …`. Neither exists on
   Windows cmd.exe. `hermes_cli/main.py::_build_web_ui` runs npm via
   subprocess (which on Windows defaults to cmd.exe), so the prebuild
   hook crashed before Vite ever ran and the dashboard never built.

   Fix: web/scripts/sync-assets.mjs — ~20 lines of Node using fs.rmSync
   + fs.cpSync (stdlib, Node >= 16.7). No new deps, identical behavior
   on POSIX and Windows.

2. Build failures were silent

   _build_web_ui ran both subprocess calls with capture_output=True and
   never relayed the captured buffers on failure. Users saw 'Web UI
   build failed' and nothing else — no stdout, no stderr, no hint that
   the real problem was 'rm is not recognized'.

   Fix: inner _relay() helper that decodes and prints stdout + stderr
   (utf-8, errors='replace') whenever a step returns non-zero. Replaces
   the existing stderr_tail-only relay on the build path; success path
   is unchanged. (stderr_tail is preserved for the stale-dist fallback
   branch added by #23817.)

Salvaged from #13368 by @johnisag onto current main. Conflict
resolution preserves main's improvements:
- _run_npm_install_deterministic() (replaces bare subprocess.run for
  npm install)
- npm-build retry-after-sleep for Windows boot-time races (#23817)
- stale-dist fallback for non-interactive callers (#23817)

Closes #25073, #13368.

* fix(web): handle non-UTF8 Windows console encodings in _build_web_ui

Codex review pointed out that even with the sync-assets fix applied,
_build_web_ui still crashes on a stock Windows console before reaching
npm: Python stdout defaults to cp1252 (or similar) and raises
UnicodeEncodeError when print() hits the arrow/check glyphs used for
status messages (→, ✗, ⚠, ✓). Reproduced locally in PowerShell:

    $ PYTHONIOENCODING=cp1252 python -c "from hermes_cli.main import _build_web_ui; _build_web_ui(Path('web'), fatal=True)"
    UnicodeEncodeError: 'charmap' codec can't encode character '\u2192' ...

The previous PR body claimed "end-to-end verified on Windows 11", but
that was under the venv's default (utf-8) stdout. A plain `py` or
PowerShell invocation would still fail before sync-assets ever ran.

Fix: inner _say() helper that falls back to
  text.encode(sys.stdout.encoding, errors="replace")
when print() raises UnicodeEncodeError. Glyphs degrade to '?' on
ASCII / cp1252 consoles; utf-8 consoles are unaffected. Verified the
full build pipeline runs to completion with PYTHONIOENCODING=cp1252.

Scoped tightly to _build_web_ui (the function this PR already touches);
other call sites in the codebase with the same risk are out of scope.

* chore(release): map agorgianitisj@hotmail.com -> johnisag

* fix(proxy): suppress false-positive windows-footgun on guarded add_signal_handler

The call site at line 246 is already wrapped in try/except NotImplementedError
(added in #25969). The checker just doesn't peek at surrounding context.
Mark with the suppression comment so the blocking check passes.

* fix(cli): wire /sessions slash command in the classic CLI

The 'sessions' command has been registered in the central command
registry since #20805 (May 2025) and surfaces in /help and tab-completion,
but the classic CLI's process_command() never had an elif branch for it.
The canonical name fell through and printed 'Unknown command: sessions'.
The TUI side was wired up correctly via the SessionPicker overlay; only
the legacy CLI was missing the dispatch.

Adds _handle_sessions_command() which mirrors /resume's no-arg behavior
inline (the CLI has no overlay primitive equivalent to the TUI picker):

- /sessions and /sessions list  → print the recent-sessions table
- /sessions <id_or_title>       → delegates to _handle_resume_command

Includes regression tests covering the dispatcher wiring (the original
bug) plus the three handler branches.

* chore(release): map phil.thomas@gametime.co -> explainanalyze

* chore(release): map phil.thomas@gametime.co -> explainanalyze

* fix(browser): use correct env var for --no-sandbox bypass

AGENT_BROWSER_CHROME_FLAGS is not read by agent-browser CLI.
The correct env var is AGENT_BROWSER_ARGS, with comma-separated values.

This fixes Chrome 'No usable sandbox' crash on Ubuntu 23.10+ systems
where AppArmor restricts unprivileged user namespaces. The detection
logic was correct but the fix used the wrong environment variable name
and space-separated instead of comma-separated args.

* fix(browser): honor pre-set AGENT_BROWSER_ARGS and document the bypass

Follow-up to the sandbox-bypass env-var fix:

- Update the opt-out gate so a user-provided AGENT_BROWSER_ARGS is also
  respected, not just the legacy AGENT_BROWSER_CHROME_FLAGS. Previously
  the gate only checked the broken legacy var, so a user who pre-set
  AGENT_BROWSER_ARGS would still get clobbered by Hermes's auto-injection.
- Document AGENT_BROWSER_ARGS in .env.example, the browser feature page,
  and the env var reference, with notes about the auto-injection on
  AppArmor-restricted systems (Ubuntu 23.10+, DGX Spark, containers).
- Add Anadi Jaggia to AUTHOR_MAP.

* fix(cli): fall back to SelectSelector when kqueue can't watch stdin

On macOS with uv-managed cPython 3.11, the default kqueue selector cannot
register fd 0, so prompt_toolkit's loop.add_reader raises
OSError(EINVAL) ("[Errno 22] Invalid argument") from kqueue.control()
and the agent crashes immediately on startup (#5884, also reported in
#6393).

Probe KqueueSelector.register(0, EVENT_READ) before launching
prompt_toolkit. If it fails, install an event-loop policy that returns a
SelectorEventLoop backed by SelectSelector — select() works fine on
stdin in this Python build, so add_reader succeeds and the agent
launches normally.

Also extend the existing #6393 fallback handler to recognize EINVAL /
EBADF / "Invalid argument" so that any future selector failure on stdin
shows the friendly "reinstall Python via pyenv or Homebrew" guidance
instead of an opaque traceback.

Verified on macOS (Darwin 24.6.0) with uv-managed cPython 3.11.15: the
kqueue probe fails, the policy switch fires, and `hermes` launches
cleanly. No effect on platforms where kqueue can register fd 0.

* chore(release): add AUTHOR_MAP entry for outdoorsea

* fix(aux): surface Nous auth-unavailable warning in auxiliary client

When the auxiliary client falls through Nous (e.g. no stored auth, or
runtime credential mint failed), users currently see only `debug`-level
lines, so the next provider in the fallback chain takes over silently.
Promote the no-auth path to a warning that tells operators to run
`hermes auth`, and add a debug breadcrumb on the rarer
mint-failed-but-stored-auth-still-present fallback path so the existing
behavior (use the raw stored token) is preserved while staying
investigable.

Salvaged from #23881 by @0xharryriddle. The contributor's original
patch also short-circuited the second branch with a return, which broke
the pool-entry fallback path covered by
`test_try_nous_uses_pool_entry` — kept the warning intent, dropped the
return so the fallback still works. Dropped the contributor's changes
to `hermes_cli/goals.py` because the goal-pause path is unreachable
when the auxiliary client is None (`judge_goal` returns
`parse_failed=False`, which resets `consecutive_parse_failures`),
so the reason string they added never surfaces in the pause message.

Refs #23876

* feat: add ACP registry metadata for Zed

* chore(release): bump ACP Registry assets in lockstep with pyproject

The ACP Registry manifest (acp_registry/agent.json), the npm launcher
package.json, and the launcher's HERMES_AGENT_VERSION constant must all
match pyproject.toml exactly — tests/acp/test_registry_manifest.py
enforces this lockstep.

Without a release-script hook, the next weekly version bump fails that
test until someone hand-edits four files. Extend update_version_files()
to drive the ACP bump alongside __init__.py and pyproject.toml, and
add tests covering the lockstep and the missing-files no-op path.

Also map adam.manning@gmail.com -> am423 for the salvage commit.

* chore: remove Atropos RL environments and tinker-atropos integration (#26106)

* chore: remove Atropos RL environments, tools, tests, skill, and tinker-atropos submodule

Delete:
- environments/ (43 files — base env, agent loop, tool call parsers, benchmarks)
- rl_cli.py (standalone RL training CLI)
- tools/rl_training_tool.py (all 10 rl_* tools)
- tests: test_rl_training_tool, test_tool_call_parsers, test_managed_server_tool_support,
  test_agent_loop, test_agent_loop_vllm, test_agent_loop_tool_calling,
  test_terminalbench2_env_security
- optional-skills/mlops/hermes-atropos-environments/
- tinker-atropos git submodule + .gitmodules

* chore: remove RL/Atropos references from Python source

- toolsets.py: remove rl toolset block + update comment
- model_tools.py: remove rl_tools group + update async bridging comment
- hermes_cli/tools_config.py: remove RL display entry, _DEFAULT_OFF_TOOLSETS,
  setup block, and rl_training post-setup handler
- tools/budget_config.py: remove RL environment reference in docstring
- tests/test_model_tools.py: remove rl_tools from expected groups
- tests/run_agent/test_streaming_tool_call_repair.py: fix stale cross-reference

* chore: remove rl/yc-bench extras and tinker-atropos refs from pyproject.toml

- Remove rl extra (atroposlib, tinker, fastapi, uvicorn, wandb)
- Remove yc-bench extra
- Remove rl_cli from py-modules
- Remove [tool.ty.src] exclude for tinker-atropos
- Remove [tool.ruff] exclude for tinker-atropos
- Regenerate uv.lock

* chore: remove tinker-atropos from install/setup scripts

- setup-hermes.sh: remove entire tinker-atropos submodule install block
- scripts/install.sh: remove both tinker-atropos blocks (Termux + standard)
- scripts/install.ps1: remove tinker-atropos block
- nix/hermes-agent.nix: remove tinker-atropos pip install line

* chore: remove RL references from cli-config.yaml.example

* docs: remove Atropos/RL references from README, CONTRIBUTING, AGENTS.md

* docs: remove RL/Atropos references from website

- Delete: environments.md, rl-training.md, mlops-hermes-atropos-environments.md
- sidebars.ts: remove rl-training and environments sidebar entries
- optional-skills-catalog.md: remove hermes-atropos-environments row
- tools-reference.md: remove entire rl toolset section
- toolsets-reference.md: remove rl row + update example
- integrations/index.md: remove RL Training bullet
- architecture.md: remove environments/ from tree + RL section
- contributing.md: remove tinker-atropos setup
- updating.md: remove tinker-atropos install + stale submodule update

* chore: remove remaining RL/Atropos stragglers

- hermes_cli/config.py: remove TINKER_API_KEY + WANDB_API_KEY env var defs
- hermes_cli/doctor.py: remove Submodules check section (tinker-atropos)
- hermes_cli/setup.py: remove RL Training status check
- hermes_cli/status.py: remove Tinker + WandB from API key status display
- agent/display.py: remove both rl_* tool preview/activity blocks
- website/docs: remove RL references from providers.md + env-variables.md
- tests: remove TINKER_API_KEY from conftest, set_config_value, setup_script

* chore: remove RL training section from .env.example

* feat(acp-registry): switch to uvx distribution, drop npm launcher

The ACP Registry schema supports uvx as a first-class distribution method
alongside npx and binary. Pointing the registry directly at the existing
hermes-agent PyPI release removes:

- the @nousresearch npm scope (we don't own it)
- a separate npm publish step on every weekly release
- 90 lines of Node launcher + tests in packages/hermes-agent-acp/

The Zed registry now installs Hermes via:

  uvx --from 'hermes-agent[acp]==<version>' hermes-acp

This is the same command the npm launcher was shelling out to anyway, so
end-user behavior is unchanged. Registry CI validates the PyPI URL +
version-pin exact match automatically.

Changes:
- acp_registry/agent.json: distribution.npx -> distribution.uvx
- delete packages/hermes-agent-acp/ entirely
- scripts/release.py: drop npm-launcher bump paths, keep manifest lockstep
- tests/acp/test_registry_manifest.py: assert uvx shape + version pin
- tests/scripts/test_release_acp_registry.py: rewrite for uvx-only shape
- docs (user-guide + dev-guide): drop all npm-launcher references
- delete docs/plans/acp-registry-zed-integration.md (stale, npm-shaped)

Validated against agentclientprotocol/registry agent.schema.json via
jsonschema. hermes-agent==0.13.0 is already live on PyPI.

* fix(deps): pin brotlicffi so aiohttp can decode Discord's Brotli attachments

Discord's CDN serves attachments with Content-Encoding: br. aiohttp's
compression_utils tries 'import brotlicffi as brotli' first and falls back
to google's Brotli, but Brotli<1.2.0's Decompressor.process() is 1-arg
while aiohttp calls it with 2 args (data, max_length). Result: every
.txt/.md/.doc uploaded to a Discord-gateway session fails to decode at
att.read() with 'Can not decode content-encoding: br' / 'TypeError:
process() takes exactly 1 argument (2 given)', the agent never sees the
bytes, and falls back to filesystem guessing.

Pin brotlicffi==1.2.0.1 in both surfaces:

  - tools/lazy_deps.py 'platform.discord' tuple: Discord users on the
    lazy-install path get it on first discord.py import.
  - pyproject.toml [messaging] extra: users who explicitly install
    hermes-agent[messaging] (skipping the lazy path) get it eagerly.

brotlicffi wins aiohttp's import race regardless of what else is
installed (try brotlicffi / except: import brotli), so existing setups
that already pulled google's Brotli transitively don't change behavior
beyond the bug fix. ~1.5 MB wheel, manylinux/macOS/Windows coverage.

E2E verified: round-trip decode of Brotli-compressed payload via
aiohttp.compression_utils.brotli succeeds with brotlicffi pinned; same
test against Brotli==1.1.0 alone reproduces the reported TypeError.

Credit to @Korkyzer for the original diagnosis and fix shape in #15744;
the lazy-deps gating layer was added on top to keep brotlicffi out of
the install path for users who don't run a Discord gateway.

Fixes #12511.
Closes #15744.

Co-authored-by: Korky <korkyzer@gmail.com>

* fix(cli): kill resize scrollback duplication + light-mode visibility

Two long-standing prompt_toolkit bugs in the base hermes CLI:

1. Resize duplication. Column-shrink resize used to push 40+ rows of
   duplicate chrome (status bar, input rules) into terminal scrollback
   every resize. Same wall as pt issues #29 (open since 2014), #1675,
   #1933 — aider/xonsh/ipython all use alt-screen to dodge it.

   Root cause (verified by reading prompt_toolkit/renderer.py):
   _output_screen_diff (renderer.py L232-242) deliberately moves the
   cursor to the bottom of the canvas after every paint 'to make sure
   the terminal scrolls up'. In non-fullscreen mode this scrolls chrome
   content into terminal scrollback on every render — not just on
   resize.

   Fix: monkey-patch prompt_toolkit.renderer._output_screen_diff to
   bypass the reserve-vertical-space cursor move. When pt's logic checks
   'if current_height > previous_screen.height', we inflate the previous
   screen height so the branch falls through. ~30-line wrapper, no fork
   of pt, no alt-screen, no DECSTBM scroll region.

   Verified empirically in real Terminal.app: 10 resizes (mixed
   shrinks/widens 1300→500→1400) during streaming produced ZERO
   scrollback delta, full agent response preserved, status bar pinned
   at bottom, no visible duplicates. pt is pinned to ==3.0.52 so the
   private-function patch is safe; future pt bumps will need to
   re-verify the signature matches.

2. Light-mode terminal visibility. Hardcoded skin colors (#FFF8DC
   cornsilk, #FFD700 gold, #B8860B dark goldenrod) are tuned for dark
   Terminal.app — invisible on light/cream backgrounds.

   Port ui-tui/src/theme.ts detectLightMode() to Python so the base CLI
   adapts. Detection priority: HERMES_LIGHT/HERMES_TUI_LIGHT env →
   HERMES_TUI_THEME=light|dark → HERMES_TUI_BACKGROUND=#RRGGBB →
   COLORFGBG env (xterm/Konsole/urxvt) → OSC 11 query
   (\x1b]11;?\x1b\\) with 100ms timeout → default dark. OSC 11 is
   tty-gated so gateway/cron/batch/subagent code paths don't pay the
   timeout cost.

   When light mode is detected, dark-mode colors auto-remap to readable
   equivalents (#FFF8DC → #1A1A1A, #FFD700 → #9A6B00, etc). Hooked at
   three points:
   - _hex_to_ansi() — auto-remaps any color emitted via the ANSI helper
   - _build_tui_style_dict() — rewrites pt style strings (chrome bg/fg)
   - SkinConfig.get_color() — wrapped at module load so Rich Panel
     borders/body text get the remap too

   Status-bar foreground colors (#C0C0C0, #888888, etc.) are explicitly
   skipped because they're paired with a dark navy bg — remapping them
   would make them invisible in dark mode.

3. Other visibility fixes: [thinking] reasoning preview now uses ANSI
   dim+italic (\x1b[2;3m) instead of #B8860B so it inherits terminal
   default fg color. Input/prompt area defaults to terminal default fg
   (was #FFF8DC cornsilk → invisible on cream).

Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com>

* test(cli): cover light-mode detection + SkinConfig.get_color remap

Adds 16 unit tests covering the light/dark terminal detection path
introduced in the previous commit:

- Env override priority (HERMES_LIGHT, HERMES_TUI_LIGHT,
  HERMES_TUI_THEME, HERMES_TUI_BACKGROUND, COLORFGBG)
- Detection cache stickiness
- _maybe_remap_for_light_mode() no-op in dark mode
- Known dark-mode color remap (#FFF8DC -> #1A1A1A etc)
- Case-insensitive lookup
- Unknown color passthrough
- Status-bar paired colors (#C0C0C0, #888888, #555555, #8B8682) are
  intentionally NOT remapped — regression guard for the patch-11 fix,
  since remapping them would produce dark-on-dark on the status bar's
  navy bg
- SkinConfig.get_color() wrapper is installed and idempotent
- SkinConfig.get_color() does remap in light mode and passes through
  in dark mode

We don't try to fake an OSC 11 reply — that path is exercised
end-to-end in real Terminal.app; the env-override path covers the
algorithmic logic.

* revert(cli): drop scrollback box width clamp (#25975), restore full-width borders (#26163)

#25975 (salvaging #24403) clamped decorative scrollback Panels and
streaming box rules to `max(32, min(width, 56))` as a defense against
terminal-emulator reflow when columns shrink. On any modern wide
terminal this made the response/reasoning borders look stubby — 56
cols inside a 200-col viewport.

#26137 (salvaging #25981, by @OutThisLife) landed a more fundamental
fix: prompt_toolkit's `_output_screen_diff` is monkey-patched so its
reserve-vertical-space cursor move no longer pushes chrome into
scrollback at all. With that in place, the clamp is no longer
load-bearing for the chrome-into-scrollback class of bugs — the
remaining risk is purely cosmetic reflow of *already stamped*
Panel borders during an aggressive column shrink, which we now
accept as a tradeoff for restoring proper full-width rendering.

Changes:
- `_scrollback_box_width()` returns `max(32, width)` (just the floor,
  no upper cap). All 10 call sites stay valid.
- Updated `test_scrollback_box_width_caps_to_resize_safe_value` to
  the new `test_scrollback_box_width_returns_viewport_width` asserting
  full-width passthrough above the 32-col floor.

Floor of 32 is kept so `'─' * (w - 2)` math stays positive on tiny
terminals.

Refs #18449 #19280 #22976 (the original reflow class) and #25975
(the clamp this reverts).

* fix(goals): raise judge max_tokens 200 → 4096, make configurable

The freeform /goal judge was capped at max_tokens=200, which reliably
truncated the JSON verdict on reasoning-heavy models (deepseek-v4-pro,
qwq, etc.) — the model burns tokens on hidden reasoning before emitting
visible content, and the first /goal turn's prompt is larger than later
turns, blowing past 200. Symptom: agent.log shows
`judge reply was not JSON: '{"done": true, "reason": "The agent successfully'`
followed by repeated `judge returned empty response` lines, then the
goal pauses with a misleading 'judge model isn't returning the required
JSON verdict' message.

Diagnosed live by @helix4u — empirically verified that raising the
budget on an unmodified worktree makes the failures go away on the
exact configs users were hitting on Nous Plus subscription paths.

Changes:
- DEFAULT_JUDGE_MAX_TOKENS = 4096 (up from 200)
- New auxiliary.goal_judge.max_tokens config knob for tuning in
  specifically constrained setups
- _goal_judge_max_tokens() resolves the value with fail-open semantics
  (non-int / non-positive / load failure → default). load_config() is
  mtime-cached so per-turn lookup is cheap.

Scoped narrowly to the verified root cause — does not introduce a
submit_verdict tool-call schema (see #26162 / #23671 for that direction;
they can land separately if we want them).

Tests: tests/hermes_cli/test_goals.py + tests/cli/test_cli_goal_interrupt.py
+ tests/gateway/test_goal_verdict_send.py — 62/62 passing.

E2E verified: config override honored (8192), missing/garbage/zero
values fall back to 4096, no-auxiliary-section falls back to 4096.

Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>

Credits:
- @helix4u (Gille) — diagnosed the max_tokens=200 truncation via live
  testing on an unmodified worktree, drafted the original fix shape
  in #26162.
- @AhmetArif0 — flagged the freeform judge fragility in #23671 from
  the tool-call angle.
- @0xharryriddle (HarryRiddle.eth) — reported the issue from a Nous
  Plus subscription setup in #23876 with full debug reports.

Closes #23876
Supersedes #26162, #23671, #23881

* ci: add PyPI publish workflow (salvaged from #25901) (#26148)

* ci(pypi): add publish workflow for automated PyPI releases

Triggered by CalVer tag pushes from scripts/release.py (v20* pattern).
Three jobs: build (uv build) → publish (OIDC trusted publishing) → sign
(Sigstore + attach to existing GitHub Release).

- workflow_dispatch as manual escape hatch
- skip-existing for safe re-runs
- Graceful skip when GitHub Release not found (sign job)
- Top-level permissions: contents: read (CodeQL compliant)

Requires one-time setup: PyPI trusted publisher + GitHub pypi environment.

Co-authored-by: dmahan93 <44207705+dmahan93@users.noreply.github.com>

* fix(release): address review findings

- Stage acp_registry/agent.json in version bump commit (was silently left unstaged)
- Add missing return when no previous tags found without --first-release
- Fix get_pr_number return type annotation (str -> str | None)
- Prefer uv build over python -m build (matches CI workflow), with fallback
- Use unit separator (%x1f) in git log format to handle | in author names
- Add explicit encoding='utf-8' to .release_notes.md write

Workflow hardening:
- Gracefully skip signing when GitHub Release not found (env var gate
  instead of exit 1, so PyPI publish still shows green)

* fix(ci): harden PyPI workflow — SHA-pin actions, guard workflow_dispatch, explicit build flags

- Pin all actions to commit SHAs (supply-chain hardening for id-token:write)
- workflow_dispatch now requires confirm_tag input + checks out that tag
- Both uv build paths explicitly pass --sdist --wheel

---------

Co-authored-by: dmahan93 <44207705+dmahan93@users.noreply.github.com>

* feat(yuanbao): add _parse_resource_id and update _extract_text for ybres anchors

* feat(yuanbao): add quote_media_refs extraction to QuoteContextMiddleware

* feat(yuanbao): prioritize quote media refs over history backfill in DispatchMiddleware

* fix(yuanbao): resolve quoted file/image via transcript lookup when quote desc lacks ybres

When a user quotes a file message (type=3) and @bot, the quote's desc field
only contains the filename without a ybres:// resource reference. The existing
QuoteContextMiddleware only extracted media refs from desc using the ybres regex,
which always returned empty for file quotes.

Fix: add a transcript lookup fallback in QuoteContextMiddleware.handle() —
when quote_media_refs is empty but reply_to_message_id is set, search the
session transcript for the quoted message_id and extract ybres anchors from
its content.

Also fix message_type classification: when quote media resolves non-image files,
override message_type to DOCUMENT so gateway/run.py's document injection logic
properly prepends the file path and content for the agent.

* refactor(yuanbao): improve quote media fallback — move to DispatchMiddleware, tighten conditions

* feat(skills-hub): add huggingface/skills as trusted default tap (#2549)

Adds Hugging Face's official skill catalog to the default GitHub taps and
classifies it as a trusted source alongside openai/skills and anthropics/skills.

- tools/skills_guard.py: huggingface/skills -> TRUSTED_REPOS
- tools/skills_hub.py: GitHubSource.DEFAULT_TAPS += huggingface/skills (skills/)
- website/docs: list it under default taps + trusted-source examples

Closes #2549.

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>

* fix(session): persist auto-reset state across gateway restarts

was_auto_reset, auto_reset_reason, and reset_had_activity were not
included in SessionEntry.to_dict() / from_dict(), so a gateway restart
between session expiry and the user's next message would silently drop
the auto-reset notification and context note.

Add the three fields to the serialization roundtrip with safe defaults
(False / None / False) so existing sessions.json files load cleanly.

Add three roundtrip tests to test_session_reset_notify.py.

* fix(gateway): isinstance-guard string-form 429 error body

When a non-Anthropic provider (e.g. Morpheus proxy) returns a 429 with
`{"error": "Too Many Requests"}` instead of the expected
`{"error": {"type": ...}}` dict, _err_body.json().get("error", {})
returns the raw string and the next .get("type") line crashes with
AttributeError, taking down the message handler.

Guard with isinstance(_err_json, dict) so non-dict error bodies fall
through to the generic rate-limit hint.

Salvaged from PR #2587 by @KiraKatana. The PR's fallback-config
`base_url`/`api_key_env` fix was already implemented independently
on main (run_agent.py:8759-8780) with additional aliases and Ollama
Cloud host handling, so only the gateway guard is cherry-picked.

Co-authored-by: KiraKatana <kira.ops@proton.me>

* fix: clean stale conversation mappings on response eviction/deletion

ResponseStore.put() and .delete() now remove conversations rows that
reference evicted or deleted response IDs, preventing 404 errors when
a conversation name is reused after its backing response was purged.

Adds regression tests for delete, eviction, and handler-level reuse.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore(release): add AUTHOR_MAP entry for CoinTheHat

* fix(whatsapp): fail fast when Baileys sendMessage hangs

Baileys' sock.sendMessage() can hang indefinitely while uploading
media to WhatsApp servers (and, less often, on text sends), pinning
the bridge's Express handler until the gateway's aiohttp timeout
fires — surfacing to the user as a 120s wait followed by an empty
error from the TTS/voice path.

Wrap every sock.sendMessage() call inside the bridge in a
sendWithTimeout() helper that rejects after WHATSAPP_SEND_TIMEOUT_MS
(default 60s) via Promise.race. The four call sites are /send,
/edit, and /send-media's primary send. Express handlers catch the
rejection in their existing try/catch and return a real 500 to the
gateway, which can then surface a retryable error.

Salvaged from #2608 — wysie diagnosed the hang and the
Promise.race shape; the other two parts of that PR (gateway HTTP
session pooling, base.py metadata kwarg removal) already landed on
main via separate routes and are no longer needed.

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>

* security(deps): add upper bounds to 5 loose deps + document supply chain policy (#24226)

After the Mini Shai-Hulud supply chain campaign (May 2026) and the litellm
compromise (March 2026), codify the dependency pinning policy that was
established in PRs #2810 and #9801 but never written down for contributors.

Changes:
- pyproject.toml: Add tight upper bounds to the 5 deps that slipped
  through as review escapes from external contributor PRs:
  - hindsight-client>=0.4.22,<0.5 (was >=0.4.22)
  - aiosqlite>=0.20,<0.23 (was >=0.20)
  - asyncpg>=0.29,<0.32 (was >=0.29)
  - alibabacloud-dingtalk>=2.0.0,<3 (was >=2.0.0)
  - youtube-transcript-api>=1.2.0,<2 (was >=1.2.0)

  Pre-1.0 packages get <0.(current_minor+2) — tight enough to block
  hostile minor releases but loose enough to not require bumps every week.

- CONTRIBUTING.md: Add 'Dependency pinning policy' section under Security
  with the full rationale, table of source types + treatments, and examples.

- AGENTS.md: Add concise 'Dependency Pinning Policy' section for AI coding
  agents with the decision table and step-by-step checklist.

- supply-chain-audit.yml: Add dep-bounds job that fails PRs introducing
  PyPI deps without <ceiling upper bounds. Fires on pyproject.toml changes.
  Posts a PR comment with the specific unbounded specs found.

Refs: #2796 #2810 #9801 #24205

* feat(image-gen): actionable setup message when no FAL backend is reachable (#26222)

When the in-tree FAL path has no API key (and no managed gateway), the
handler used to return a bare 'FAL_KEY environment variable not set'
error. Users had no idea where to get a key, that a managed Nous
gateway exists, or that plugin-registered providers are an option.

Now `image_generate_tool` returns a structured multi-line message:
  - signup link (https://fal.ai)
  - managed-gateway status (if Nous tools are enabled)
  - pointer to `hermes tools` / `hermes plugins list` for alternate
    backends, so users on a stale `image_gen.provider` know where to look

The schema is untouched — `check_fn` still gates the tool out of the
schema when no backend is reachable at startup, consistent with every
other conditional tool. This patch fixes the call-time failure modes:
managed-gateway 5xx, plugin provider disappearing mid-session, etc.

Inspired by #2546 / @Mibayy. The PR was ~5700 commits stale against
the new plugin-aware image_gen architecture, so this is a forward port
of the actionable-error idea rather than a cherry-pick.


Closes #2543

Co-authored-by: Mibayy <mibayy@users.noreply.github.com>

* docs(cron): worked recipes for the wakeAgent pre-run gat…
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
… chain compromise) (NousResearch#2796)

litellm 1.82.7/1.82.8 contained a credential stealer (.pth auto-exec
payload). PyPI quarantined the entire package, blocking all fresh
hermes-agent installs since litellm was listed as a hard dependency.

These three deps (litellm, typer, platformdirs) are only used by the
mini-swe-agent submodule, which has its own pyproject.toml and manages
its own dependencies. They were redundantly duplicated in hermes-agent's
pyproject.toml.

Also fixes install.sh to not print 'mini-swe-agent installed' on
failure, and updates warning messages in both install scripts to clarify
that only Docker/Modal backends are affected — local terminal is
unaffected.

Ref: BerriAI/litellm#24512
DIZ-admin pushed a commit to DIZ-admin/hermes-agent that referenced this pull request May 16, 2026
…ain policy (NousResearch#24226)

After the Mini Shai-Hulud supply chain campaign (May 2026) and the litellm
compromise (March 2026), codify the dependency pinning policy that was
established in PRs NousResearch#2810 and NousResearch#9801 but never written down for contributors.

Changes:
- pyproject.toml: Add tight upper bounds to the 5 deps that slipped
  through as review escapes from external contributor PRs:
  - hindsight-client>=0.4.22,<0.5 (was >=0.4.22)
  - aiosqlite>=0.20,<0.23 (was >=0.20)
  - asyncpg>=0.29,<0.32 (was >=0.29)
  - alibabacloud-dingtalk>=2.0.0,<3 (was >=2.0.0)
  - youtube-transcript-api>=1.2.0,<2 (was >=1.2.0)

  Pre-1.0 packages get <0.(current_minor+2) — tight enough to block
  hostile minor releases but loose enough to not require bumps every week.

- CONTRIBUTING.md: Add 'Dependency pinning policy' section under Security
  with the full rationale, table of source types + treatments, and examples.

- AGENTS.md: Add concise 'Dependency Pinning Policy' section for AI coding
  agents with the decision table and step-by-step checklist.

- supply-chain-audit.yml: Add dep-bounds job that fails PRs introducing
  PyPI deps without <ceiling upper bounds. Fires on pyproject.toml changes.
  Posts a PR comment with the specific unbounded specs found.

Refs: NousResearch#2796 NousResearch#2810 NousResearch#9801 NousResearch#24205
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
… chain compromise) (NousResearch#2796)

litellm 1.82.7/1.82.8 contained a credential stealer (.pth auto-exec
payload). PyPI quarantined the entire package, blocking all fresh
hermes-agent installs since litellm was listed as a hard dependency.

These three deps (litellm, typer, platformdirs) are only used by the
mini-swe-agent submodule, which has its own pyproject.toml and manages
its own dependencies. They were redundantly duplicated in hermes-agent's
pyproject.toml.

Also fixes install.sh to not print 'mini-swe-agent installed' on
failure, and updates warning messages in both install scripts to clarify
that only Docker/Modal backends are affected — local terminal is
unaffected.

Ref: BerriAI/litellm#24512
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…ain policy (NousResearch#24226)

After the Mini Shai-Hulud supply chain campaign (May 2026) and the litellm
compromise (March 2026), codify the dependency pinning policy that was
established in PRs NousResearch#2810 and NousResearch#9801 but never written down for contributors.

Changes:
- pyproject.toml: Add tight upper bounds to the 5 deps that slipped
  through as review escapes from external contributor PRs:
  - hindsight-client>=0.4.22,<0.5 (was >=0.4.22)
  - aiosqlite>=0.20,<0.23 (was >=0.20)
  - asyncpg>=0.29,<0.32 (was >=0.29)
  - alibabacloud-dingtalk>=2.0.0,<3 (was >=2.0.0)
  - youtube-transcript-api>=1.2.0,<2 (was >=1.2.0)

  Pre-1.0 packages get <0.(current_minor+2) — tight enough to block
  hostile minor releases but loose enough to not require bumps every week.

- CONTRIBUTING.md: Add 'Dependency pinning policy' section under Security
  with the full rationale, table of source types + treatments, and examples.

- AGENTS.md: Add concise 'Dependency Pinning Policy' section for AI coding
  agents with the decision table and step-by-step checklist.

- supply-chain-audit.yml: Add dep-bounds job that fails PRs introducing
  PyPI deps without <ceiling upper bounds. Fires on pyproject.toml changes.
  Posts a PR comment with the specific unbounded specs found.

Refs: NousResearch#2796 NousResearch#2810 NousResearch#9801 NousResearch#24205
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
… chain compromise) (NousResearch#2796)

litellm 1.82.7/1.82.8 contained a credential stealer (.pth auto-exec
payload). PyPI quarantined the entire package, blocking all fresh
hermes-agent installs since litellm was listed as a hard dependency.

These three deps (litellm, typer, platformdirs) are only used by the
mini-swe-agent submodule, which has its own pyproject.toml and manages
its own dependencies. They were redundantly duplicated in hermes-agent's
pyproject.toml.

Also fixes install.sh to not print 'mini-swe-agent installed' on
failure, and updates warning messages in both install scripts to clarify
that only Docker/Modal backends are affected — local terminal is
unaffected.

Ref: BerriAI/litellm#24512
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant