fix(security): cherry-pick upstream security hardening by jecruz · Pull Request #2 · jecruz/hermes-agent

jecruz · 2026-04-01T20:17:05Z

What does this PR do?

Cherry-picks upstream security hardening from NousResearch/hermes-agent. Adds credential pooling with multi-provider failover, comprehensive secret redaction for logs, memory provider plugins (7 backends), subscription feature management, and multi-platform gateway support (Discord, Matrix, Slack, Telegram, WhatsApp, custom API).

Major features:

Credential pooling: Multiple credentials per provider with round-robin/least-used/random strategies
Secret redaction: Snapshots env at import time; redacts API keys, tokens, passwords, private keys, DB connection strings, and phone numbers
Memory plugins: Honcho, Mem0, Holographic, Byterover, RetainDB, OpenViking, Hindsight (pluggable, one active at a time)
Gateway delivery: Event-driven message handling across 5+ platforms with schema injection
Nous subscription: Feature state tracking for web search, image generation, TTS, browser, Modal

Related Issue

Upstream security sync for v0.7.0 release

Type of Change

🔒 Security fix
✨ New feature (non-breaking change that adds functionality)
♻️ Refactor (no behavior change)

Changes Made

Core Security Hardening

agent/redact.py: Regex-based secret redaction for logs; snapshots HERMES_REDACT_SECRETS at import time to prevent runtime bypass; patterns for 30+ API key types + base64-encoded tokens + DB connection strings
agent/credential_pool.py: Multi-credential failover with pluggable strategies (fill_first, round_robin, random, least_used); per-provider priority chaining; 429/402 cooldowns
hermes_cli/auth.py: Provider registry (Nous Portal, OpenRouter, GitHub Copilot, custom endpoints); OAuth device-code flow; JWT token refresh with skew; cross-process file locking on auth.json

Memory & Persistence

agent/memory_manager.py: Lifecycle orchestration for memory providers; prefetch before each turn; sync after each turn
agent/memory_provider.py: ABC for pluggable backends (honorable mention: already defined, plugins comply)
plugins/memory/: 7 memory backends (honcho, mem0, holographic, byterover, retaindb, openviking, hindsight) with standardized tool schemas

Gateway & Delivery

gateway/run.py: Event-driven message handling; platform abstraction; schema injection
gateway/platforms/: Discord, Matrix, Slack, Telegram, WhatsApp, generic API server

Nous Features & Setup

hermes_cli/nous_subscription.py: Subscription state tracking; feature availability flags
hermes_cli/memory_setup.py: Interactive memory provider selection (NEW)
hermes_cli/setup.py: Enhanced onboarding with memory/tool configuration

Test Coverage

tests/test_nous_subscription.py: NousFeatureState immutability, feature aggregation, model config parsing (NEW)

Infrastructure

Dockerfile: Updated Python base; env handling
.github/workflows/: Enhanced test/deploy/doc CI
requirements.txt: New dependencies (e.g., for memory backends)

How to Test

Secret redaction: Log a message with an API key (sk-abc123...), Greptile token, phone number → verify redaction in logs

python -c "from agent.redact import redact_sensitive_text; print(redact_sensitive_text('key=e93WIf0Vc78igjtgSINvawwBwkdPu5ctglEvo8uvA/dOaiP+'))"

Credential pool: Add multiple credentials per provider; verify round-robin/least-used selection

hermes config credentials add-pool --provider openrouter --label primary
hermes config credentials add-pool --provider openrouter --label fallback

Memory setup: Complete onboarding with memory provider selection
```
python -m hermes_cli.main setup
```

Tests: Run full test suite

pytest tests/ -q
pytest tests/test_nous_subscription.py -v

Checklist

Code

My commit messages follow Conventional Commits
My PR contains only changes related to this feature (no unrelated commits)
I've added tests for my changes
Tested on macOS 15.2

Documentation & Housekeeping

I've updated relevant documentation (RELEASE_v0.7.0.md, skill docs)
I've updated cli-config.yaml.example for new config keys
I've considered cross-platform impact (Windows, macOS, Linux)
I've updated tool descriptions/schemas

Screenshots / Logs

No UI changes (backend hardening only).

- add managed modal and gateway-backed tool integrations\n- improve CLI setup, auth, and configuration for subscriber flows\n- expand tests and docs for managed tool support

…l we're ready. Even if users enable it, it'll be blocked server-side for now, until we unlock for non-admin users on tool-gateway.

…gnore - Combine apt-get update and install into single RUN with cache clearing - Remove APT lists after installation - Add --no-cache-dir to pip install - Add --prefer-offline --no-audit to npm install - Create .dockerignore to exclude unnecessary files from build context - Update docker-publish.yml workflow to tag images with release names - Ensure buildx caching is used (type=gha)

…edact flag - Add gho_, ghu_, ghs_, ghr_ prefix patterns (OAuth, user-to-server, server-to-server, and refresh tokens) — all four types used by GitHub Apps and Copilot auth flows were absent from _PREFIX_PATTERNS - Snapshot HERMES_REDACT_SECRETS at module import time instead of re-reading os.getenv() on every call, preventing runtime env mutations (e.g. LLM-generated export commands) from disabling redaction

The _REDACT_ENABLED constant is snapshotted at import time, so monkeypatch.delenv() alone doesn't re-enable redaction during tests when HERMES_REDACT_SECRETS=false is set in the host environment.

…hromium) (NousResearch#4292) The SSRF protection added in NousResearch#3041 blocks all private/internal addresses unconditionally in browser_navigate(). This prevents legitimate local use cases (localhost apps, LAN devices) when using Camofox or the built-in headless Chromium without a cloud provider. The check is only meaningful for cloud backends (Browserbase, BrowserUse) where the agent could reach internal resources on a remote machine. Local backends give the user full terminal and network access already — the SSRF check adds zero security value. Add _is_local_backend() helper that returns True when Camofox is active or no cloud provider is configured. Both the pre-navigation and post-redirect SSRF checks now skip when running locally. The browser.allow_private_urls config option remains available as an explicit opt-out for cloud mode.

…docs * docs: clarify WhatsApp allowlist behavior and document WHATSAPP_ALLOW_ALL_USERS - Add WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to env vars reference - Warn that * is not a wildcard and silently blocks all messages - Show WHATSAPP_ALLOWED_USERS as optional, not required - Update troubleshooting with the * trap and debug mode tip - Fix Security section to mention the allow-all alternative Prompted by a user report in Discord where WHATSAPP_ALLOWED_USERS=* caused all incoming messages to be silently dropped at the bridge level. * feat: support * wildcard in platform allowlists Follow the precedent set by SIGNAL_GROUP_ALLOWED_USERS which already supports * as an allow-all wildcard. Bridge (allowlist.js): matchesAllowedUser() now checks for * in the allowedUsers set before iterating sender aliases. Gateway (run.py): _is_authorized() checks for * in allowed_ids after parsing the allowlist. This is generic — works for all platforms, not just WhatsApp. Updated docs to document * as a supported value instead of warning against it. Added WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to the env vars reference. Tests: JS allowlist test + 2 Python gateway tests (WhatsApp + Telegram to verify cross-platform behavior).

The delivery target parser uses split(':', 1) which only splits on the first colon. For the documented format platform:chat_id:thread_id (e.g. 'telegram:-1001234567890:17585'), thread_id gets munged into chat_id and is never extracted. Fix: split(':', 2) to correctly extract all three parts. Also fix to_string() to include thread_id for proper round-tripping. The downstream plumbing in _deliver_to_platform() already handles thread_id correctly (line 292-293) — it just never received a value.

`hermes config set KEY ""` and `hermes config set KEY 0` were rejected because the guard used `not value` which is truthy for empty strings, zero, and False. Changed to `value is None` so only truly missing arguments are rejected. Closes NousResearch#4277 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Some models (e.g. Kimi K2.5 on Alibaba OpenAI-compatible endpoint) emit reasoning text followed by a closing </think> without a matching opening <think> tag. The existing paired-tag regexes in _strip_think_blocks() cannot match these orphaned tags, so </think> leaks into user-facing responses on all platforms. Add a catch-all regex that strips any remaining opening or closing think/thinking/reasoning/REASONING_SCRATCHPAD tags after the existing paired-block removal pass. Closes NousResearch#4285

… warning (NousResearch#4294) * docs: update llama.cpp section with --jinja flag and tool calling guide The llama.cpp docs were missing the --jinja flag which is required for tool calling to work. Without it, models output tool calls as raw JSON text instead of structured API responses, making Hermes unable to execute them. Changes: - Add --jinja and -fa flags to the server startup example - Replace deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with hermes model interactive setup - Add caution block explaining the --jinja requirement and symptoms - List models with native tool calling support - Add /props endpoint verification tip * docs+feat: comprehensive local LLM provider guides and context length warning Docs (providers.md): - Rewrote Ollama section with context length warning (defaults to 4k on <24GB VRAM), three methods to increase it, and verification steps - Rewrote vLLM section with --max-model-len, tool calling flags (--enable-auto-tool-choice, --tool-call-parser), and context guidance - Rewrote SGLang section with --context-length, --tool-call-parser, and warning about 128-token default max output - Added LM Studio section (port 1234, context length defaults to 2048, tool calling since 0.3.6) - Added llama.cpp context length flag (-c) and GPU offload (-ngl) - Added Troubleshooting Local Models section covering: - Tool calls appearing as text (with per-server fix table) - Silent context truncation and diagnosis commands - Low detected context at startup - Truncated responses - Replaced all deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with hermes model interactive setup and config.yaml examples - Added deprecation warning for legacy env vars in General Setup Code (cli.py): - Added context length warning in show_banner() when detected context is <= 8192 tokens, with server-specific fix hints: - Ollama (port 11434): suggests OLLAMA_CONTEXT_LENGTH env var - LM Studio (port 1234): suggests model settings adjustment - Other servers: suggests config.yaml 
override Tests: - 9 new tests covering warning thresholds, server-specific hints, and no-warning cases

…g.yaml (NousResearch#4298) The _has_any_provider_configured() guard only checked env vars, .env file, and auth.json — missing config.yaml model.provider/base_url/api_key entirely. Users who configured a provider through setup (saving to config.yaml) but had empty API key placeholders in .env from the install template were permanently blocked by the 'not configured' message. Changes: - _has_any_provider_configured() now checks config.yaml model section for explicit provider, base_url, or api_key — covers custom endpoints and providers that store credentials in config rather than env vars - .env.example: comment out all empty API key placeholders so they don't pollute the environment when copied to .env by the installer - .env.example: mark LLM_MODEL as deprecated (config.yaml is source of truth) - 4 new tests for the config.yaml detection path Reported by OkadoOP on Discord.

…iled refresh (NousResearch#4300) When an OAuth token refresh fails on a 401 error, the pool recovery would return 'not recovered' without trying the next credential in the pool. This meant users who added a second valid credential via 'hermes auth add' would never see it used when the primary credential was dead. Now: try refresh first (handles expired tokens quickly), and if that fails, rotate to the next available credential — same as 429/402 already did. Adds three tests covering 401 refresh success, refresh-fail-then-rotate, and refresh-fail-with-no-remaining-credentials.

Exposes the existing max_turns parameter (cli.py main()) as a CLI flag so programmatic callers (Paperclip adapter, scripts) can control the agent's tool-calling iteration limit without editing config.yaml. Priority chain unchanged: CLI flag > config agent.max_turns > env HERMES_MAX_ITERATIONS > default 90.

WSL detection was treated as a hard fail, blocking voice mode even when audio worked via PulseAudio bridge. Now PULSE_SERVER env var presence makes WSL a soft notice instead of a blocking warning. Device query failures in WSL with PULSE_SERVER are also treated as non-blocking.

Fixes a zip-slip path traversal vulnerability in hermes profile import. shutil.unpack_archive() on untrusted tar members allows entries like ../../escape.txt to write files outside ~/.hermes/profiles/. - Add _normalize_profile_archive_parts() to reject absolute paths (POSIX and Windows), traversal (..), empty paths, backslash tricks - Add _safe_extract_profile_archive() for manual per-member extraction that only allows regular files and directories (rejects symlinks) - Replace shutil.unpack_archive() with the safe extraction path - Add regression tests for traversal and absolute-path attacks Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>

…ile read guard Two pre-existing issues causing test_file_read_guards timeouts on CI: 1. agent/redact.py: _ENV_ASSIGN_RE used unbounded [A-Z_]* with IGNORECASE, matching any letter/underscore to end-of-string at each position → O(n²) backtracking on 100K+ char inputs. Bounded to {0,50} since env var names are never that long. 2. tools/file_tools.py: redact_sensitive_text() ran BEFORE the character-count guard, so oversized content (that would be rejected anyway) went through the expensive regex first. Reordered to check size limit before redaction.

Add MiniMax as a fifth TTS provider alongside Edge TTS, ElevenLabs, OpenAI, and NeuTTS. Supports speech-2.8-hd (recommended default) and speech-2.8-turbo models via the MiniMax T2A HTTP API. Changes: - Add _generate_minimax_tts() with hex-encoded audio decoding - Add MiniMax to provider dispatch, requirements check, and Telegram Opus compatibility handling - Add MiniMax to interactive setup wizard with API key prompt - Update TTS documentation and config example Configuration: tts: provider: "minimax" minimax: model: "speech-2.8-hd" voice_id: "English_Graceful_Lady" Requires MINIMAX_API_KEY environment variable. API reference: https://platform.minimax.io/docs/api-reference/speech-t2a-http

* feat: add /branch (/fork) command for session branching Inspired by Claude Code's /branch command. Creates a copy of the current session's conversation history in a new session, allowing the user to explore a different approach without losing the original. Works like 'git checkout -b' for conversations: - /branch — auto-generates a title from the parent session - /branch my-idea — uses a custom title - /fork — alias for /branch Implementation: - CLI: _handle_branch_command() in cli.py - Gateway: _handle_branch_command() in gateway/run.py - CommandDef with 'fork' alias in commands.py - Uses existing parent_session_id field in session DB - Uses get_next_title_in_lineage() for auto-numbered branches - 14 tests covering session creation, history copy, parent links, title generation, edge cases, and agent sync * fix: clear ghost status-bar lines on terminal resize When the terminal shrinks (e.g. un-maximize), the emulator reflows previously full-width rows (status bar, input rules) into multiple narrower rows. prompt_toolkit's _on_resize only cursor_up()s by the stored layout height, missing the extra rows from reflow — leaving ghost duplicates of the status bar visible. Fix: monkey-patch Application._on_resize to detect width shrinks, calculate the extra rows created by reflow, and inflate the renderer's cursor_pos.y so the erase moves up far enough to clear ghosts.

OAuth refresh tokens are single-use. When multiple consumers share the same Anthropic OAuth session (credential pool entries, Claude Code CLI, multiple Hermes profiles), whichever refreshes first invalidates the refresh token for all others. This causes a cascade: 1. Pool entry tries to refresh with a consumed refresh token → 400 2. Pool marks the credential as "exhausted" with a 24-hour cooldown 3. All subsequent heartbeats skip the credential entirely 4. The fallback to resolve_anthropic_token() only works while the access token in ~/.claude/.credentials.json hasn't expired 5. Once it expires, nothing can auto-recover without manual re-login Fix: - Add _sync_anthropic_entry_from_credentials_file() to detect when ~/.claude/.credentials.json has a newer refresh token and sync it into the pool entry, clearing exhaustion status - After a successful pool refresh, write the new tokens back to ~/.claude/.credentials.json so other consumers stay in sync - On refresh failure, check if the credentials file has a different (newer) refresh token and retry once before marking exhausted - In _available_entries(), sync exhausted claude_code entries from the credentials file before applying the 24-hour cooldown, so a manual re-login or external refresh immediately unblocks agents Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Address review feedback: replace bare `except: pass` with a debug log when the post-retry write-back to ~/.claude/.credentials.json fails. The write-back is best-effort (token is already resolved), but logging helps troubleshooting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…les (NousResearch#4738) Add docker_env option to terminal config — a dict of key-value pairs that get set inside Docker containers via -e flags at both container creation (docker run) and per-command execution (docker exec) time. This complements docker_forward_env (which reads values dynamically from the host process environment). docker_env is useful when Hermes runs as a systemd service without access to the user's shell environment — e.g. setting SSH_AUTH_SOCK or GNUPGHOME to known stable paths for SSH/GPG agent socket forwarding. Precedence: docker_env provides baseline values; docker_forward_env overrides for the same key. Config example: terminal: docker_env: SSH_AUTH_SOCK: /run/user/1000/ssh-agent.sock GNUPGHOME: /root/.gnupg docker_volumes: - /run/user/1000/ssh-agent.sock:/run/user/1000/ssh-agent.sock - /run/user/1000/gnupg/S.gpg-agent:/root/.gnupg/S.gpg-agent

Audit found 24+ discrepancies between docs and code. Fixed: HIGH severity: - Remove honcho toolset from tools-reference, toolsets-reference, and tools.md (converted to memory provider plugin, not a built-in toolset) - Add note that Honcho is available via plugin MEDIUM severity: - Add hermes memory command family to cli-commands.md (setup/status/off) - Add --clone-all, --clone-from to profile create in cli-commands.md - Add --max-turns option to hermes chat in cli-commands.md - Add /btw slash command to slash-commands.md - Fix profile show example output (remove nonexistent disk usage, add .env and SOUL.md status lines) - Add missing hermes-webhook toolset to toolsets-reference.md - Add 5 missing providers to fallback-providers.md table - Add 7 missing providers to providers.md fallback list - Fix outdated model examples: glm-4-plus→glm-5, moonshot-v1-auto→kimi-for-coding

…ty-v5

…, and test flakiness Bug fixes: - agent/redact.py: catastrophic regex backtracking in _ENV_ASSIGN_RE — removed re.IGNORECASE and changed [A-Z_]* to [A-Z0-9_]* to restrict matching to actual env var name chars. Without this, the pattern backtracks exponentially on large strings (e.g. 100K tool output), causing test_file_read_guards to time out. - tools/file_operations.py: over-escaped newline in find -printf format string produced literal backslash-n instead of a real newline, breaking file search result parsing (total_count always 1, paths concatenated). Test fixes: - Remove stale pytestmark.skip from 4 test modules that were blanket-skipped as 'Hangs in non-interactive environments' but actually run fine: - test_413_compression.py (12 tests, 25s) - test_file_tools_live.py (71 tests, 24s) - test_code_execution.py (61 tests, 99s) - test_agent_loop_tool_calling.py (has proper OPENROUTER_API_KEY skip already) - test_413_compression.py: fix threshold values in 2 preflight compression tests where context_length was too small for the compressed output to fit in one pass. - test_mcp_probe.py: add missing _MCP_AVAILABLE mock so tests work without MCP SDK. - test_mcp_tool_issue_948.py: inject MCP symbols (StdioServerParameters etc.) when SDK is not installed so patch() targets exist. - test_approve_deny_commands.py: replace time.sleep(0.3) with deterministic polling of _gateway_queues — fixes race condition where resolve fires before threads register their approval entries, causing the test to hang indefinitely. Net effect: +256 tests recovered from skip, 8 real failures fixed.

… diagnostics (NousResearch#5077) Provider coverage: - Add 6 missing providers to _PROVIDER_ENV_HINTS (Nous, DeepSeek, DashScope, HF, OpenCode Zen/Go) - Add 5 missing providers to API connectivity checks (DeepSeek, Hugging Face, Alibaba/DashScope, OpenCode Zen, OpenCode Go) New diagnostics: - Config version check — detects outdated config, --fix runs non-interactive migration automatically - Stale root-level config keys — detects provider/base_url at root level (known bug source, PR NousResearch#4329), --fix migrates them into the model section - WAL file size check — warns on >50MB WAL files (indicates missed checkpoints from the duplicate close() bug), --fix runs PASSIVE checkpoint - Mem0 memory plugin status — checks API key resolution including the env+json merge we just fixed

- Browse: POST /api/v1/browse → GET /api/v1/fs/{ls,tree,stat} - Read: POST /api/v1/read[/abstract] → GET /api/v1/content/{read,abstract,overview} - System prompt: result.get('children') → len(result) (API returns list) - Content: result.get('content') → result is a plain string - Browse: result['entries'] → result is the list; is_dir → isDir (camelCase) - Browse: add rel_path and abstract fields to entry output Based on PR NousResearch#4742 by catbusconductor. Auth header changes dropped (already on main via NousResearch#4825).

…h#5082) Add an optional 'script' parameter to cron jobs that references a Python script. The script runs before each agent turn, and its stdout is injected into the prompt as context. This enables stateful monitoring — the script handles data collection and change detection, the LLM analyzes and reports. - cron/jobs.py: add script field to create_job(), stored in job dict - cron/scheduler.py: add _run_job_script() executor with timeout handling, inject script output/errors into _build_job_prompt() - tools/cronjob_tools.py: add script to tool schema, create/update handlers, _format_job display - hermes_cli/cron.py: add --script to create/edit, display in list/edit output - hermes_cli/main.py: add --script argparse for cron create/edit subcommands - tests/cron/test_cron_script.py: 20 tests covering job CRUD, script execution, path resolution, error handling, prompt injection, tool API Script paths can be absolute or relative (resolved against ~/.hermes/scripts/). Scripts run with a 120s timeout. Failures are injected as error context so the LLM can report the problem. Empty string clears an attached script.

…oviders (NousResearch#5091) Users on direct API-key providers (Alibaba, DeepSeek, ZAI, etc.) without an OpenRouter or Nous key would get broken auxiliary tasks (compression, vision, etc.) because _resolve_auto() only tried aggregator providers first, then fell back to iterating PROVIDER_REGISTRY with wrong default model names. Now _resolve_auto() checks the user's main provider first. If it's not an aggregator (OpenRouter/Nous), it uses their main model directly for all auxiliary tasks. Aggregator users still get the cheap gemini-flash model as before. Adds _read_main_provider() to read model.provider from config.yaml, mirroring the existing _read_main_model(). Reported by SkyLinx — Alibaba Coding Plan user getting 400 errors from google/gemini-3-flash-preview being sent to DashScope.

- Dedicated asyncio event loop for Hindsight async calls (fixes aiohttp session leaks) - Client caching (reuse instead of creating per-call) - Local mode daemon management with config change detection and auto-restart - Memory mode support (hybrid/context/tools) and prefetch method (recall/reflect) - Proper shutdown with event loop and client cleanup - Disable HindsightEmbedded.__del__ to avoid GC loop errors - Update API URLs (app -> ui.hindsight.vectorize.io, api_url -> base_url) - Setup wizard: conditional fields (when clause), dynamic defaults (default_from) - Switch dependency install from pip to uv (correct for uv-based venvs) - Add hindsight-all to plugin.yaml and import mapping - 12 new tests for dispatch routing and setup field filtering Original PR NousResearch#5044 by cdbartholomew.

* feat: execute_code runs on remote terminal backends (Docker/SSH/Modal/Daytona/Singularity) When TERMINAL_ENV is not 'local', execute_code now ships the script to the remote environment and runs it there via the terminal backend -- the same container/sandbox/SSH session used by terminal() and file tools. Architecture: - Local backend: unchanged (UDS RPC, subprocess.Popen) - Remote backends: file-based RPC via execute_oneshot() polling - Script writes request files, parent polls and dispatches tool calls - Responses written atomically (tmp + rename) via base64/stdin - execute_oneshot() bypasses persistent shell lock for concurrency Changes: - tools/environments/base.py: add execute_oneshot() (delegates to execute()) - tools/environments/persistent_shell.py: override execute_oneshot() to bypass _shell_lock via _execute_oneshot(), enabling concurrent polling - tools/code_execution_tool.py: add file-based transport to generate_hermes_tools_module(), _execute_remote() with full env get-or-create, file shipping, RPC poll loop, output post-processing * fix: use _get_env_config() instead of raw TERMINAL_ENV env var Read terminal backend type through the canonical config resolution path (terminal_tool._get_env_config) instead of os.getenv directly. * fix: use echo piping instead of stdin_data for base64 writes Modal doesn't reliably deliver stdin_data to chained commands (base64 -d > file && mv), producing 0-byte files. Switch to echo 'base64' | base64 -d which works on all backends. Verified E2E on both Docker and Modal.

Bring Matrix feature parity with Discord by adding mention gating and auto-threading. Both default to true, matching Discord behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move mention stripping outside the `if not is_dm` guard so mentions are stripped in DMs too. Remove the bare-mention early return so a message containing only a mention passes through as empty string, matching Discord's behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ty-v5

…verage - Add base64 token pattern for Greptile/custom endpoint API keys in redaction - Simplify JSON field regex to rely on re.IGNORECASE - Fix PooledCredential.extra typing with field(default_factory=dict) - Add comprehensive test suite for NousFeatureState and model config - Add CLAUDE.md with project documentation Fixes: Generic base64 tokens (e.g. Greptile API keys) were not redacted in logs. Tested: Redaction patterns, dataclass immutability, config parsing edge cases. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Fix PooledCredential.extra field typing with field(default_factory=dict) - Add comprehensive test_nous_subscription.py with 14 test cases - Add CLAUDE.md with tech stack, structure, conventions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Resolved 15 merge conflicts. Kept upstream security hardening: - Gateway approval system with session tokens - Memory provider shutdown during session expiry - Discord message deduplication for RESUME replays - Comprehensive credential pooling and redaction Conflicts resolved by taking upstream version for gateway platforms, CLI commands, tool implementations, and approval tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-04-04T22:27:40Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: base64 encoding/decoding detected

Base64 has legitimate uses (images, JWT, etc.) but is also commonly used to obfuscate malicious payloads. Verify the usage is appropriate.

Matches (first 20):

28550:+    encoded = base64.b64encode(content.encode("utf-8")).decode("ascii")
28676:+                encoded_result = base64.b64encode(

⚠️ WARNING: exec() or eval() usage

Dynamic code execution can hide malicious behavior, especially when combined with base64 or network fetches.

Matches (first 20):

7033:+Persistent memory via the `brv` CLI — hierarchical knowledge tree with tiered retrieval (fuzzy text → LLM-driven search).
7081:+a hierarchical context tree with tiered retrieval (fuzzy text → LLM-driven
19607:+shell injection, SQL injection, path traversal, eval()/exec() with user input,
32259:+Persistent memory via the `brv` CLI — hierarchical knowledge tree with tiered retrieval (fuzzy text → LLM-driven search). Local-first with optional cloud sync.

⚠️ WARNING: Outbound network calls (POST/PUT)

Outbound POST/PUT requests in new code could be data exfiltration. Verify the destination URLs are legitimate.

Matches (first 10):

12212:+        resp = self._httpx.post(
30728:+    response = requests.post(base_url, json=payload, headers=headers, timeout=60)

⚠️ WARNING: Install hook files modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/memory_setup.py
hermes_cli/setup.py
skills/productivity/google-workspace/scripts/setup.py
tests/skills/test_google_oauth_setup.py

⚠️ WARNING: marshal/pickle/compile usage

These can deserialize or construct executable code objects.

Matches:

19608:+pickle.loads(), obfuscated commands.

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

Set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 to address deprecation of Node.js 20 on GitHub Actions runners (deprecated June 2, 2026). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-04-04T22:31:27Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: base64 encoding/decoding detected

Base64 has legitimate uses (images, JWT, etc.) but is also commonly used to obfuscate malicious payloads. Verify the usage is appropriate.

Matches (first 20):

28572:+    encoded = base64.b64encode(content.encode("utf-8")).decode("ascii")
28698:+                encoded_result = base64.b64encode(

⚠️ WARNING: exec() or eval() usage

Dynamic code execution can hide malicious behavior, especially when combined with base64 or network fetches.

Matches (first 20):

7055:+Persistent memory via the `brv` CLI — hierarchical knowledge tree with tiered retrieval (fuzzy text → LLM-driven search).
7103:+a hierarchical context tree with tiered retrieval (fuzzy text → LLM-driven
19629:+shell injection, SQL injection, path traversal, eval()/exec() with user input,
32281:+Persistent memory via the `brv` CLI — hierarchical knowledge tree with tiered retrieval (fuzzy text → LLM-driven search). Local-first with optional cloud sync.
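Three of the four matches above are prose lines in which the word "retrieval" is followed by an opening parenthesis. The scanner's actual pattern is not shown in this thread; the sketch below only assumes a pattern without a word boundary (the name `LOOSE` and both regexes are hypothetical) and shows how such a pattern would flag "retrieval (" while a word-boundary variant would not.

```python
import re

# Hypothetical patterns: the real supply-chain-audit regex is not shown in this PR.
LOOSE = re.compile(r"(exec|eval)\s*\(")      # no word boundary
STRICT = re.compile(r"\b(exec|eval)\s*\(")   # requires a word boundary before the name

lines = [
    # Prose fragment taken from the scan output above.
    "tiered retrieval (fuzzy text → LLM-driven search).",
    # Documentation that mentions the calls by name.
    "shell injection, SQL injection, path traversal, eval()/exec() with user input,",
    # A genuine dynamic-evaluation call, added here for contrast.
    "result = eval(user_input)",
]

loose_hits = [bool(LOOSE.search(line)) for line in lines]
strict_hits = [bool(STRICT.search(line)) for line in lines]

for line, loose, strict in zip(lines, loose_hits, strict_hits):
    print(f"loose={loose!s:5} strict={strict!s:5} {line}")
```

Even the stricter pattern still matches the documentation line that names `eval()` and `exec()`, so manual review of each match remains necessary.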

⚠️ WARNING: Outbound network calls (POST/PUT)

Outbound POST/PUT requests in new code could be data exfiltration. Verify the destination URLs are legitimate.

Matches (first 10):

12234:+        resp = self._httpx.post(
30750:+    response = requests.post(base_url, json=payload, headers=headers, timeout=60)

⚠️ WARNING: Install hook files modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/memory_setup.py
hermes_cli/setup.py
skills/productivity/google-workspace/scripts/setup.py
tests/skills/test_google_oauth_setup.py

⚠️ WARNING: marshal/pickle/compile usage

These can deserialize or construct executable code objects.

Matches:

19630:+pickle.loads(), obfuscated commands.

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

github-actions · 2026-04-04T23:57:52Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: base64 encoding/decoding detected

Base64 has legitimate uses (images, JWT, etc.) but is also commonly used to obfuscate malicious payloads. Verify the usage is appropriate.

Matches (first 20):

28957:+    encoded = base64.b64encode(content.encode("utf-8")).decode("ascii")
29083:+                encoded_result = base64.b64encode(

⚠️ WARNING: exec() or eval() usage

Dynamic code execution can hide malicious behavior, especially when combined with base64 or network fetches.

Matches (first 20):

7143:+Persistent memory via the `brv` CLI — hierarchical knowledge tree with tiered retrieval (fuzzy text → LLM-driven search).
7191:+a hierarchical context tree with tiered retrieval (fuzzy text → LLM-driven
19717:+shell injection, SQL injection, path traversal, eval()/exec() with user input,
32666:+Persistent memory via the `brv` CLI — hierarchical knowledge tree with tiered retrieval (fuzzy text → LLM-driven search). Local-first with optional cloud sync.

⚠️ WARNING: Outbound network calls (POST/PUT)

Outbound POST/PUT requests in new code could be data exfiltration. Verify the destination URLs are legitimate.

Matches (first 10):

12322:+        resp = self._httpx.post(
31135:+    response = requests.post(base_url, json=payload, headers=headers, timeout=60)

⚠️ WARNING: Install hook files modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/memory_setup.py
hermes_cli/setup.py
skills/productivity/google-workspace/scripts/setup.py
tests/skills/test_google_oauth_setup.py

⚠️ WARNING: marshal/pickle/compile usage

These can deserialize or construct executable code objects.

Matches:

19718:+pickle.loads(), obfuscated commands.

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

…#13354)

Classic-CLI /steer typed during an active agent run was queued through self._pending_input alongside ordinary user input. process_loop, which drains that queue, is blocked inside self.chat() for the entire run, so the queued command was not pulled until AFTER _agent_running had flipped back to False. At that point process_command() took the idle fallback ("No agent running; queued as next turn") and delivered the steer as an ordinary next-turn user message.

From Utku's bug report on PR NousResearch#13205: mid-run /steer arrived minutes later at the end of the turn as a /queue-style message, completely defeating its purpose.

Fix: add _should_handle_steer_command_inline() gating. When _agent_running is True and the user typed /steer, dispatch process_command(text) directly from the prompt_toolkit Enter handler on the UI thread instead of queueing. This mirrors the existing _should_handle_model_command_inline() pattern for /model and is safe because agent.steer() is thread-safe (uses _pending_steer_lock, no prompt_toolkit state mutation, instant return).

No changes to the idle-path behavior: /steer typed with no active agent still takes the normal queue-and-drain route, so the fallback "No agent running; queued as next turn" message is preserved.

Validation:

- 7 new unit tests in tests/cli/test_cli_steer_busy_path.py covering the detector, dispatch path, and idle-path control behavior.
- All 21 existing tests in tests/run_agent/test_steer.py still pass.
- Live PTY end-to-end test with real agent + real openrouter model:

  22:36:22 API call #1 (model requested execute_code)
  22:36:26 ENTER FIRED: agent_running=True, text='/steer ...'
  22:36:26 INLINE STEER DISPATCH fired
  22:36:43 agent.log: 'Delivered /steer to agent after tool batch'
  22:36:44 API call #2 included the steer; response contained marker

Same test on the tip of main without this fix shows the steer landing as a new user turn ~20s after the run ended.
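The inline-dispatch gating described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the project's code: `SteerableAgent`, `CliSketch`, and `on_enter` are hypothetical stand-ins, and only the names `_should_handle_steer_command_inline`, `process_command`, `_pending_input`, `_agent_running`, `_pending_steer_lock`, and `steer` come from the message itself.

```python
import threading


class SteerableAgent:
    """Hypothetical stand-in: models only the thread-safe steer mailbox."""

    def __init__(self):
        self._pending_steer_lock = threading.Lock()
        self._pending_steer = []

    def steer(self, text):
        # Returns immediately; the running agent would pick this up later.
        with self._pending_steer_lock:
            self._pending_steer.append(text)


class CliSketch:
    """Hypothetical stand-in for the CLI's Enter handling."""

    def __init__(self, agent):
        self.agent = agent
        self._agent_running = False
        self._pending_input = []  # drained by process_loop (not modeled here)
        self.notices = []

    def _should_handle_steer_command_inline(self, text):
        stripped = text.strip()
        is_steer = stripped == "/steer" or stripped.startswith("/steer ")
        return self._agent_running and is_steer

    def process_command(self, text):
        payload = text.strip()[len("/steer"):].strip()
        if self._agent_running:
            self.agent.steer(payload)
        else:
            # Idle fallback: the steer becomes an ordinary next-turn message.
            self.notices.append("No agent running; queued as next turn")
            self._pending_input.append(payload)

    def on_enter(self, text):
        if self._should_handle_steer_command_inline(text):
            # Busy path: dispatch on the UI thread instead of queueing.
            self.process_command(text)
        else:
            # Everything else keeps the normal queue-and-drain route.
            self._pending_input.append(text)


agent = SteerableAgent()
cli = CliSketch(agent)

cli._agent_running = True
cli.on_enter("/steer focus on the failing test")

cli._agent_running = False
cli.on_enter("/steer later")
```

With the agent marked as running, the steer text reaches the agent's mailbox without touching the input queue; when idle, the raw command is queued as before.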

The MCP circuit breaker previously had no path back to the closed state: once _server_error_counts[srv] reached _CIRCUIT_BREAKER_THRESHOLD the gate short-circuited every subsequent call, so the only reset path (on successful call) was unreachable. A single transient 3-failure blip (bad network, server restart, expired token) permanently disabled every tool on that MCP server for the rest of the agent session.

Introduce a classic closed/open/half-open state machine:

- Track a per-server breaker-open timestamp in _server_breaker_opened_at alongside the existing failure count.
- Add _CIRCUIT_BREAKER_COOLDOWN_SEC (60s). Once the count reaches threshold, calls short-circuit for the cooldown window.
- After the cooldown elapses, the *next* call falls through as a half-open probe that actually hits the session. Success resets the breaker via _reset_server_error; failure re-bumps the count via _bump_server_error, which re-stamps the open timestamp and re-arms the cooldown.

The error message now includes the live failure count and an "Auto-retry available in ~Ns" hint so the model knows the breaker will self-heal rather than giving up on the tool for the whole session.

Covers tests 1 (half-opens after cooldown) and 2 (reopens on probe failure); test 3 (cleared on reconnect) still fails pending fix #2.
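A minimal sketch of the closed/open/half-open behavior described above, under stated assumptions: the threshold value of 3 is inferred from the "3-failure blip" wording, the `BreakerSketch` class, its `call` method, and the injectable `clock` are hypothetical, and the error strings are illustrative rather than the project's actual messages.

```python
import time

_CIRCUIT_BREAKER_THRESHOLD = 3         # assumed from the "3-failure blip" wording
_CIRCUIT_BREAKER_COOLDOWN_SEC = 60.0   # stated in the commit message


class BreakerSketch:
    """Hypothetical per-server circuit breaker with a cooldown-based half-open probe."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._server_error_counts = {}
        self._server_breaker_opened_at = {}

    def _bump_server_error(self, srv):
        count = self._server_error_counts.get(srv, 0) + 1
        self._server_error_counts[srv] = count
        if count >= _CIRCUIT_BREAKER_THRESHOLD:
            # (Re-)stamp the open time, which re-arms the cooldown.
            self._server_breaker_opened_at[srv] = self._clock()

    def _reset_server_error(self, srv):
        self._server_error_counts.pop(srv, None)
        self._server_breaker_opened_at.pop(srv, None)

    def call(self, srv, fn):
        count = self._server_error_counts.get(srv, 0)
        opened_at = self._server_breaker_opened_at.get(srv)
        if count >= _CIRCUIT_BREAKER_THRESHOLD and opened_at is not None:
            remaining = _CIRCUIT_BREAKER_COOLDOWN_SEC - (self._clock() - opened_at)
            if remaining > 0:
                # Open: short-circuit without touching the server.
                return {"error": f"{srv} unavailable after {count} failures. "
                                 f"Auto-retry available in ~{int(remaining)}s"}
            # Cooldown elapsed: fall through as a half-open probe.
        try:
            result = fn()
        except Exception as exc:
            self._bump_server_error(srv)
            return {"error": str(exc)}
        self._reset_server_error(srv)
        return {"result": result}


# Usage with a fake clock so the cooldown can be stepped deterministically.
now = [0.0]
breaker = BreakerSketch(clock=lambda: now[0])
probe_calls = []


def failing():
    raise RuntimeError("down")


def healthy():
    probe_calls.append(1)
    return "pong"


for _ in range(3):
    breaker.call("srv", failing)
blocked = breaker.call("srv", healthy)     # open: healthy() is never invoked
now[0] = 61.0
reopened = breaker.call("srv", failing)    # half-open probe fails, cooldown re-armed
still_blocked = breaker.call("srv", healthy)
now[0] = 200.0
recovered = breaker.call("srv", healthy)   # half-open probe succeeds, breaker closes
```

Injecting the clock keeps the sketch testable without sleeping; a real implementation would typically rely on a monotonic clock directly.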

rewbs and others added 30 commits March 26, 2026 16:17

feat: add managed tool gateway and Nous subscription support

95dc9aa

- add managed modal and gateway-backed tool integrations
- improve CLI setup, auth, and configuration for subscriber flows
- expand tests and docs for managed tool support

Merge branch 'main' into rewbs/tool-use-charge-to-subscription

e95965d

Gate tool-gateway behind an env var, so it's not in users' faces until we're ready.

1cbb1b9

Even if users enable it, it'll be blocked server-side for now, until we unlock for non-admin users on tool-gateway.

Merge branch 'NousResearch:main' into docker-optimization

fdef045

Further npm optimizations

48942c8

Simplify dockerignore

5de312c

Merge branch 'NousResearch:main' into docker-optimization

4f4d7c4

Add build-essential to Dockerfile dependencies

3a1e489

Optimize Playwright install

0287597

Merge branch 'main' into rewbs/tool-use-charge-to-subscription

6e4598c

Merge branch 'main' into rewbs/tool-use-charge-to-subscription

1126284

Fixes and refactors enabled by recent updates to main.

1b7473e

Merge branch 'NousResearch:main' into docker-optimization

289df5d

fix: patch _REDACT_ENABLED in test fixture for module-level snapshot

fad3f33

The _REDACT_ENABLED constant is snapshotted at import time, so monkeypatch.delenv() alone doesn't re-enable redaction during tests when HERMES_REDACT_SECRETS=false is set in the host environment.
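The effect of an import-time snapshot can be reproduced in isolation. In the sketch below, `_load_redact_module`, the `redact_sketch` module, and the way the flag is parsed are assumptions made for illustration; only `_REDACT_ENABLED` and `HERMES_REDACT_SECRETS` come from the commit message. It shows that changing the environment after the snapshot has no effect, while patching the module attribute does.

```python
import os
import types
from unittest import mock


def _load_redact_module():
    """Build a stand-in module that snapshots the env flag once, as an import would."""
    mod = types.ModuleType("redact_sketch")
    # Assumed parsing rule: anything other than "false" leaves redaction on.
    mod._REDACT_ENABLED = (
        os.environ.get("HERMES_REDACT_SECRETS", "true").lower() != "false"
    )

    def redact(text):
        return "[REDACTED]" if mod._REDACT_ENABLED else text

    mod.redact = redact
    return mod


# Simulate a host environment that disables redaction at "import" time.
with mock.patch.dict(os.environ, {"HERMES_REDACT_SECRETS": "false"}):
    redact_sketch = _load_redact_module()

# Removing the variable afterwards does not change the snapshot.
with mock.patch.dict(os.environ):
    os.environ.pop("HERMES_REDACT_SECRETS", None)
    after_delenv = redact_sketch.redact("sk-secret")

# Patching the module-level constant is what re-enables redaction.
with mock.patch.object(redact_sketch, "_REDACT_ENABLED", True):
    after_patch = redact_sketch.redact("sk-secret")
```

In a pytest fixture the equivalent step would be a `monkeypatch.setattr` on the module attribute, in addition to or instead of `monkeypatch.delenv`.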

fix: harden codex responses transport handling

e75964d

test: add codex transport drop regression

cac9d20

fix(security): reject path traversal in credential file registration

a97641b
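The commit title does not describe how the rejection is implemented. A common approach, sketched here with a hypothetical `resolve_credential_path` helper and not taken from the project's code, is to resolve the requested path against a base directory and refuse anything that lands outside it.

```python
from pathlib import Path


def resolve_credential_path(base_dir, relative_name):
    """Hypothetical helper: return the resolved path, or raise if it escapes base_dir."""
    base = Path(base_dir).resolve()
    rel = Path(relative_name)
    if rel.is_absolute():
        raise ValueError(f"absolute paths are not allowed: {relative_name!r}")
    # resolve() collapses ".." segments and follows symlinks where they exist.
    candidate = (base / rel).resolve()
    if base not in candidate.parents:
        raise ValueError(f"path escapes the credential directory: {relative_name!r}")
    return candidate


accepted = resolve_credential_path(".", "tokens/google.json")

try:
    resolve_credential_path(".", "../outside.json")
    traversal_rejected = False
except ValueError:
    traversal_rejected = True
```

Checking the resolved path, not the raw string, also catches traversal hidden behind nested segments such as `tokens/../../outside.json`.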

acsezen and others added 21 commits April 3, 2026 22:40

Merge remote-tracking branch 'upstream/main' into fix/upstream-security-v5

bc40007

feat(gateway): add MATRIX_REQUIRE_MENTION and MATRIX_AUTO_THREAD support

d86be33

Bring Matrix feature parity with Discord by adding mention gating and auto-threading. Both default to true, matching Discord behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
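How two default-on environment flags might gate behavior can be sketched as below. Only the variable names `MATRIX_REQUIRE_MENTION` and `MATRIX_AUTO_THREAD` and their default of true come from the commit message; the helper names, the accepted truthy spellings, and the threading rule are assumptions for illustration.

```python
import os

_TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings


def env_flag(name, default=True, environ=None):
    """Read a boolean flag; unset or blank values fall back to the default."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default
    return raw.strip().lower() in _TRUTHY


def should_respond(bot_mentioned, environ=None):
    """Mention gating: with the flag on, only respond when the bot is mentioned."""
    if env_flag("MATRIX_REQUIRE_MENTION", environ=environ):
        return bot_mentioned
    return True


def reply_thread_root(event_id, existing_thread_id=None, environ=None):
    """Auto-threading: stay in an existing thread, else start one at the event."""
    if existing_thread_id is not None:
        return existing_thread_id
    if env_flag("MATRIX_AUTO_THREAD", environ=environ):
        return event_id
    return None
```

Passing `environ` explicitly keeps the helpers testable without mutating the process environment.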

Merge remote-tracking branch 'upstream/main' into fix/upstream-security-v5

bbb8fa1

chore(ci): opt into Node.js 24 for GitHub Actions runner

570cb61

Set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 to address deprecation of Node.js 20 on GitHub Actions runners (deprecated June 2, 2026). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix: patch get_all_skills_dirs and fix voice restart on no-speech

1bb5916

jecruz merged commit d78ca06 into main Apr 5, 2026
6 of 7 checks passed

jecruz deleted the fix/upstream-security-v5 branch April 5, 2026 00:48

jecruz merged 227 commits into main from fix/upstream-security-v5

20 participants
