chore: sync with upstream main (2026-05-25)#50
Merged
* fix(tui): log parent gateway lifecycle exits
  Add parent-side breadcrumbs for TUI gateway shutdown and transport exits so future backend EOF/SIGTERM reports identify the parent action that caused them.
* chore(tui): retrigger lifecycle logging checks
  Retry transient GitHub checkout failures on the lifecycle logging PR.
…31055) * fix(tui): ignore late thinking deltas after completion Prevent stale reasoning events from repainting the TUI status after a turn has already completed and the UI is idle. * test(tui): restore timers after thinking delta assertion Keep fake timer cleanup in a finally block so assertion failures cannot leak timer mode into later tests.
Notify scroll subscribers when ScrollBox viewport bounds change and key virtual-history updates on viewport height so resize/keyboard changes remount the tail rows instead of leaving stale spacers visible.
Clamp and truncate the cwd/branch segment so narrow status bars cannot wrap into the composer input row.
Keep the resize delta below the virtual history scroll quantum so the regression test specifically depends on viewport height entering the snapshot key.
Make the cwd separator width conditional so the computed status layout matches the rendered row on ultra-narrow terminals.
Update the resize regression and comments so the test specifically guards viewport-height changes in the virtual-history snapshot key.
Budget the right-hand status label by terminal display width so wide Unicode paths cannot wrap skinny status bars.
Document that ScrollBox subscribers are notified for renderer-computed viewport and content bound changes, not only imperative scrolls.
Avoid preserving a frozen virtual transcript range when wrapped rows shrink enough that the old tail window no longer covers the viewport.
Wraps + heights are column-dependent, so a width change must remeasure every row and the renderer must repaint the full viewport.
- Key virtualRows on cols so React remounts wrapped rows on resize.
- Snap back to bottom after sticky-mode resize once React rerenders.
- Reserve a scrollbar + gap column in transcriptBodyWidth (non-termux).
- Full repaint on any viewport height change (was: shrink-only).
- ScrollBox scrollHeight uses deepest child bottom so sticky-bottom math can reach the real final rendered row after reflow.
- DECSTBM fast-path now requires full container rect match.
Terminals can't scale glyphs, so the banner now picks a layout per column width instead of always rendering the full 101-col logo:
- Wide (>= logo width): full ASCII logo + tagline.
- Mid (>= 58 cols): centered rule banner that expands with viewport.
- Narrow (>= 34 cols): brand line + tagline, both width-aware.
- < 34 cols: hidden.
SessionPanel surfaces model/cwd/sid inline when the hero column is hidden, so narrow layouts don't lose that info. Logo width constants derive from the art itself.
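The tier selection above can be sketched as a pure function of the column count. This is an illustrative Python sketch only: the TUI itself lives under ui-tui/ and is not written in Python, and the names `pick_banner_tier`, `LOGO_WIDTH`, `MID_MIN_COLS`, and `NARROW_MIN_COLS` are hypothetical. The thresholds are the ones quoted in the commit text.

```python
# Thresholds taken from the commit text; the constant names are hypothetical.
LOGO_WIDTH = 101       # "the full 101-col logo"
MID_MIN_COLS = 58
NARROW_MIN_COLS = 34


def pick_banner_tier(cols: int) -> str:
    """Return the banner layout tier for a terminal that is `cols` wide."""
    if cols >= LOGO_WIDTH:
        return "wide"      # full ASCII logo + tagline
    if cols >= MID_MIN_COLS:
        return "mid"       # centered rule banner
    if cols >= NARROW_MIN_COLS:
        return "narrow"    # brand line + tagline
    return "hidden"        # banner not rendered at all
```

Checking the widest tier first keeps each branch a single comparison, so a new tier only needs one more threshold in the right position.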
Addresses Copilot review on PR NousResearch#31077: - onResize now re-checks isSticky() inside the 100ms timer so manual scrolls during the debounce window don't get snapped back to tail. - Comment on the virtualRows cols-keying calls out the deliberate trade-off: per-row local state (e.g. systemOpen) resets on resize so yoga can remeasure off live geometry. The hook's scale-by-ratio path is too approximate for mixed markdown widths.
… fix, and 81 tests
ntfy now ships as a self-contained plugin under plugins/platforms/ntfy/ instead of editing 8 core files (gateway/config.py Platform enum, gateway/run.py factory + auth maps, cron/scheduler.py, toolsets.py, hermes_cli/status.py, agent/prompt_builder.py, gateway/channel_directory.py, tools/send_message_tool.py). All routing goes through gateway/platform_registry via register_platform(): - adapter_factory, check_fn, validate_config, is_connected - env_enablement_fn seeds PlatformConfig.extra from NTFY_* env vars so gateway status reflects env-only setups without instantiating httpx - standalone_sender_fn handles deliver=ntfy cron jobs when cron runs out-of-process from the gateway - allowed_users_env / allow_all_env hook into _is_user_authorized - cron_deliver_env_var=NTFY_HOME_CHANNEL for cron home routing - platform_hint surfaces in the system prompt - pii_safe=True (topic names are the only identifier; no PII to redact) Tests moved to tests/gateway/test_ntfy_plugin.py using _plugin_adapter_loader so the module lives under plugin_adapter_ntfy in sys.modules and cannot collide with sibling plugin-adapter tests on the same xdist worker. The core-file grep tests (Platform.NTFY in source, hermes-ntfy in toolsets, etc.) are replaced with plugin-shape tests covering register() metadata, env_enablement_fn output, and standalone_sender_fn behavior. 68 tests pass under scripts/run_tests.sh.
Robustness:
- Surface 401/404 stream failures via _set_fatal_error() so the gateway's runtime status reflects 'fatal: ntfy_unauthorized' / 'ntfy_topic_not_found' instead of staying 'connected' when the reconnect loop halts. Matches the pattern in whatsapp / telegram / sms adapters.
- Strip whitespace from auth tokens so pasted tokens with trailing newlines don't produce malformed Authorization headers.
Simplicity:
- Extract _build_auth_header() and _truncate_body() to module-level helpers, used by both NtfyAdapter and _standalone_send. Removes the duplicated auth/truncation logic between the two paths.
Docs:
- website/docs/user-guide/messaging/ntfy.md: full setup guide, identity-model warning, self-hosting, cron usage, troubleshooting.
- website/docs/reference/environment-variables.md: all 9 NTFY_* vars.
- website/docs/user-guide/messaging/index.md: platform comparison row.
- website/sidebars.ts: sidebar entry between simplex and open-webui.
Tests: 78/78 (+ 10 new robustness tests covering token hygiene, fatal error propagation for 401/404, and the _truncate_body helper).
Move shutil.rmtree into a finally block so the temp directory is always cleaned up, even when an exception occurs during download, extraction, or file copying.
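A minimal sketch of the cleanup pattern described here, assuming a caller-supplied `work` callable that stands in for the download, extraction, and copy steps. The helper name `with_temp_dir` is hypothetical and is not the function touched by this commit.

```python
import shutil
import tempfile
from pathlib import Path


def with_temp_dir(work):
    """Run work(tmp_dir) and remove tmp_dir afterwards, even on failure."""
    tmp_dir = Path(tempfile.mkdtemp(prefix="sketch-"))
    try:
        return work(tmp_dir)
    finally:
        # Runs on success and on any exception raised inside work().
        shutil.rmtree(tmp_dir, ignore_errors=True)
```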
…atus bar Bug 1: /voice off in TUI mode did not clear HERMES_VOICE_TTS, leaving TTS stuck ON with no way to disable it (the voice.toggle tts handler requires voice mode to be ON). Bug 2: TUI status bar only showed 'voice on/off' without any indication of whether TTS speech output is active, because the frontend never tracked voiceTts state. - tui_gateway/server.py: clear HERMES_VOICE_TTS when voice is turned off - ui-tui/src/app/useMainApp.ts: add voiceTts state, thread setVoiceTts through voice contexts, display [tts] in status bar - ui-tui/src/app/slash/commands/session.ts: sync tts from voice.toggle response - ui-tui/src/app/interfaces.ts: add setVoiceTts to all voice context interfaces
…) (NousResearch#26975) * docs(simplex): remove broken Docker install command (NousResearch#26974) The "Or Docker" snippet pointed at `simplexchat/simplex-chat`, which is not a published Docker Hub image. Users following the docs hit: docker: Error response from daemon: pull access denied for simplexchat/simplex-chat, repository does not exist or may require 'docker login'. The SimpleX Chat project only publishes Docker images for its server components (smp-server, xftp-server) — the chat CLI is distributed as a binary release. Drop the broken `docker run` line and keep the verified binary-download path, with a note pointing users to the upstream Dockerfile if they want to build a container themselves. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(simplex): drop misleading "Dockerfile" link text Copilot review flagged that the link text claimed "Dockerfile in the upstream repo" but the URL pointed at the repository root, not a specific Dockerfile path. Reword to "build from source from the simplex-chat repository" so the link text and target match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Null bytes in API key values (introduced by copy-paste) crash
os.environ[k] = v with ValueError: embedded null byte, preventing
hermes from starting at all.
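The failure mode can be reproduced in isolation, and one possible mitigation is to drop NUL bytes before the assignment. The commit text above does not say how the fix is implemented, so the helper below (`set_env_safely`, a hypothetical name) is an assumption, not the project's code.

```python
import os


def set_env_safely(key: str, value: str) -> None:
    """Assign an env var after removing NUL bytes (assumed mitigation)."""
    # os.environ[key] = "a\x00b" raises ValueError: embedded null byte,
    # so stray NULs from copy-paste are stripped first.
    os.environ[key] = value.replace("\x00", "")
```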
- Add api_mode to 4 update_model() call sites:
  - conversation_loop.py: long_context failover and probe stepping
  - agent_runtime_helpers.py: rollback restore (also saves compressor_api_mode)
  - chat_completion_helpers.py: fallback activation
- Fix 31 root-logger calls across 5 files (logging.warning/error/info -> logger.warning/error/info) to respect module-level log filtering
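Why the root-logger calls matter can be shown with the standard library alone: a level set on a module's logger filters `logger.warning(...)` but has no effect on `logging.warning(...)`, which always goes through the root logger. The logger name `sketch.module` and the helper names are hypothetical.

```python
import logging

# Module-level logger; in real code this is usually getLogger(__name__).
logger = logging.getLogger("sketch.module")


def warn_via_module_logger(msg: str) -> None:
    # Respects any level or filter configured for "sketch.module".
    logger.warning(msg)


def warn_via_root_logger(msg: str) -> None:
    # Bypasses per-module configuration: the record is emitted by "root".
    logging.warning(msg)
```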
…#31077) * fix(tui): refresh virtual transcript on viewport resize Notify scroll subscribers when ScrollBox viewport bounds change and key virtual-history updates on viewport height so resize/keyboard changes remount the tail rows instead of leaving stale spacers visible. * test(tui): isolate viewport-height remount regression Keep the resize delta below the virtual history scroll quantum so the regression test specifically depends on viewport height entering the snapshot key. * test(tui): clarify virtual history resize snapshot Update the resize regression and comments so the test specifically guards viewport-height changes in the virtual-history snapshot key. * docs(tui): clarify scrollbox subscription signals Document that ScrollBox subscribers are notified for renderer-computed viewport and content bound changes, not only imperative scrolls. * fix(tui): recompute virtual tail after width resize Avoid preserving a frozen virtual transcript range when wrapped rows shrink enough that the old tail window no longer covers the viewport. * fix(tui): preserve transcript tail across resizes Wraps + heights are column-dependent, so a width change must remeasure every row and the renderer must repaint the full viewport. - Key virtualRows on cols so React remounts wrapped rows on resize. - Snap back to bottom after sticky-mode resize once React rerenders. - Reserve a scrollbar + gap column in transcriptBodyWidth (non-termux). - Full repaint on any viewport height change (was: shrink-only). - ScrollBox scrollHeight uses deepest child bottom so sticky-bottom math can reach the real final rendered row after reflow. - DECSTBM fast-path now requires full container rect match. * feat(tui): responsive banner tiers Terminals can't scale glyphs, so the banner now picks a layout per column width instead of always rendering the full 101-col logo: - Wide (>= logo width): full ASCII logo + tagline. - Mid (>= 58 cols): centered rule banner that expands with viewport. 
- Narrow (>= 34 cols): brand line + tagline, both width-aware. - < 34 cols: hidden. SessionPanel surfaces model/cwd/sid inline when the hero column is hidden, so narrow layouts don't lose that info. Logo width constants derive from the art itself. * fix(tui): re-check sticky inside resize debounce + document remount Addresses Copilot review on PR NousResearch#31077: - onResize now re-checks isSticky() inside the 100ms timer so manual scrolls during the debounce window don't get snapped back to tail. - Comment on the virtualRows cols-keying calls out the deliberate trade-off: per-row local state (e.g. systemOpen) resets on resize so yoga can remeasure off live geometry. The hook's scale-by-ratio path is too approximate for mixed markdown widths.
…tvar during session switches
Add opt-in AST diagnostics for skill review without making Skills Guard stricter by default. - Add hermes skills inspect --ast-deep to scan fetched skill bundles before installation - Add hermes skills audit --deep to scan already-installed hub skills - Keep AST analysis in tools/skills_ast_audit.py, separate from tools/skills_guard.py - Label output as diagnostic hints, not security verdicts - Cover dynamic import/access patterns: importlib, __import__(computed), getattr(computed), and __dict__[computed] This follows the maintainer guidance from closed PR NousResearch#7436: useful AST-level analysis belongs in an opt-in diagnostic path, not in Skills Guard's default heuristic scan.
Trim ~600 LOC off the original contribution while keeping the same operator-facing surface and detection coverage. - Collapse three entry points (file / dir / bundle) into one ast_scan_path(path) that handles both files and directories. - Drop AstFinding dataclass + severity field — replaced with plain (file, line, pattern_id, description) tuples. Severity ordering was display-only for a diagnostic that explicitly disclaims security verdicts, so the field added bookkeeping without earning its place. - Replace Rich-markup formatter with plain text grouped by file. - Drop the 'inspect --ast-deep' surface — same scanner, same output as 'audit --deep', single CLI entry is enough. Operators audit after install; pre-install inspection signal isn't worth the second surface. - Trim test file to the cases that earn their place: bypass payload, syntax error survival, RecursionError survival, false-positive guard (importer lookalike), literal-arg false-positive guard, non-.py ignored, directory recursion + cache-dir skipping, missing-path, getattr/__dict__ detection, formatter empty + populated. Net: tools/skills_ast_audit.py 353 -> 133 LOC, tests/tools/test_skills_ast_audit.py 299 -> 103 LOC, full diff +704/-12 -> +264/-6. No change to tools/skills_guard.py — Skills Guard verdicts remain untouched per SECURITY.md §2.4.
Auxiliary LLM tasks (vision, compression, web_extract, etc.) currently
require modifications to core files for any plugin that needs its own
task slot — specifically the _AUX_TASKS list in hermes_cli/main.py and
the hardcoded env-var bridging dict in gateway/run.py. This violates
the 'plugins must not modify core files' rule and forces every memory
or context plugin that wants its own auxiliary task to either fork
core or open a coupled core+plugin PR.
This change adds a generic plugin surface for auxiliary task
registration:
ctx.register_auxiliary_task(
key='memory_retain_filter',
display_name='Memory retain filter',
description='hindsight pre-retain dedup/extract',
defaults={'timeout': 30, 'extra_body': {'reasoning_effort': 'low'}},
)
After registration, the task automatically:
- Appears in 'hermes model → Configure auxiliary models' picker via
a new _all_aux_tasks() merge of built-in + plugin tasks
- Has its provider/model/base_url/api_key bridged from config.yaml
to AUXILIARY_<KEY_UPPER>_* env vars at gateway startup
(gateway/run.py now uses a dynamic bridged-keys set instead of
a hardcoded per-task dict)
- Gets plugin-declared defaults (timeout, extra_body, etc.) layered
underneath user config so unconfigured plugin tasks still work
(agent/auxiliary_client._get_auxiliary_task_config)
- Resets to auto via 'Reset all to auto' alongside built-ins
Validation:
- Rejects shadowing of built-in keys (vision, compression, etc.)
- Rejects invalid key shapes (must match [A-Za-z0-9_]+)
- Rejects cross-plugin collisions (clear error)
- Allows same-plugin re-registration (idempotent updates)
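The four validation rules can be sketched as a small registry. This is an assumption-laden illustration: the class `AuxTaskRegistry`, its storage layout, and the error messages are hypothetical, and `BUILTIN_KEYS` lists only the built-ins named in this message.

```python
import re

# Only the built-in keys named in the commit text; the real list is longer.
BUILTIN_KEYS = {"vision", "compression", "web_extract"}
_KEY_RE = re.compile(r"[A-Za-z0-9_]+")


class AuxTaskRegistry:
    """Hypothetical stand-in for the plugin manager's aux-task state."""

    def __init__(self):
        self._tasks = {}  # key -> (owning plugin name, task spec)

    def register(self, plugin: str, key: str, **spec) -> None:
        if not _KEY_RE.fullmatch(key):
            raise ValueError(f"invalid auxiliary task key: {key!r}")
        if key in BUILTIN_KEYS:
            raise ValueError(f"{key!r} shadows a built-in auxiliary task")
        owner = self._tasks.get(key)
        if owner is not None and owner[0] != plugin:
            raise ValueError(f"{key!r} is already registered by {owner[0]!r}")
        # Same-plugin re-registration simply replaces the stored spec.
        self._tasks[key] = (plugin, spec)
```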
Plugin discovery failures (rare) fall back gracefully — the aux
config UI still shows built-in tasks if get_plugin_auxiliary_tasks()
raises, and gateway env-var bridging keeps working for built-ins.
Built-in tasks remain hardcoded in _AUX_TASKS for stability — they're
the baseline UX, and DEFAULT_CONFIG already ships their defaults.
Plugin tasks layer on top.
Tests: 15 new tests in test_plugin_auxiliary_tasks.py covering API
validation, manager state lifecycle, helper sort order, _all_aux_tasks
merge semantics, _reset_aux_to_auto inclusion of plugin tasks, and
default-layering in auxiliary_client.
Updates the gateway-bridge code-parity test (test_auxiliary_config_bridge)
to assert the new dynamic shape rather than the hardcoded literal env
var names which no longer appear post-refactor.
Motivation: this unblocks PR NousResearch#20262 (hindsight smart retain pipeline)
and similar plugins that need a dedicated aux task slot. The change
is non-breaking — built-in env vars (AUXILIARY_VISION_PROVIDER, etc.)
keep working since they're produced by the same f-string template
that built the hardcoded names.
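The naming template can be sketched as follows, assuming the four bridged fields listed earlier in this message (provider, model, base_url, api_key) each map to an upper-cased suffix. Only `AUXILIARY_VISION_PROVIDER` is confirmed by the text; the other generated names and the helper `bridged_env_names` are assumptions.

```python
# Field list taken from the message above; the suffix casing is an assumption.
BRIDGED_FIELDS = ("provider", "model", "base_url", "api_key")


def bridged_env_names(task_key: str) -> list[str]:
    """Return the env-var names one auxiliary task would bridge to."""
    return [f"AUXILIARY_{task_key.upper()}_{field.upper()}" for field in BRIDGED_FIELDS]
```

Because built-in and plugin keys go through the same template, a plugin task such as `memory_retain_filter` gets names of the same shape as the built-ins.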
…ssion exit A Ctrl+C during a slow slash command (e.g. /skills browse on a large skill tree, /sessions list against a multi-GB SQLite DB) used to unwind past self.process_command() to the outer prompt_toolkit event loop, which killed the entire session — losing all conversation state. Fix: wrap the slash-command dispatch in try/except KeyboardInterrupt so Ctrl+C aborts the command but the prompt loop continues. Other exceptions still propagate so real bugs aren't silently swallowed. Surgical reapply of PR NousResearch#5189. Original branch was many months stale (3764 files / 1M+ LOC of unrelated reverts); the substantive ~6 LOC change in cli.py was reapplied by hand onto current main with the contributor's authorship preserved via --author.
4 tests: KBI during slash command does not set _should_exit; truthy return keeps session alive; falsy return still sets exit (legit /exit path); non-KBI exceptions propagate normally.
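The guard and the four tested behaviors can be sketched in a few lines. `dispatch_slash_command` is a hypothetical wrapper, and returning the exit decision as a boolean is a simplification of the `_should_exit` flag mentioned above.

```python
def dispatch_slash_command(process_command, text: str) -> bool:
    """Run one slash command; return True when the session should exit."""
    try:
        keep_going = process_command(text)
    except KeyboardInterrupt:
        # Ctrl+C aborts only this command; the prompt loop continues.
        print("Command interrupted.")
        return False
    # Any other exception propagates to the caller unchanged.
    return not keep_going
```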
…usResearch#16263) When the terminal drops the ESC[201~ end mark during a bracketed paste (terminal race, torn write, SSH glitch, macOS sleep/wake), prompt_toolkit's Vt100Parser keeps buffering all later input in _paste_buffer forever. From the user's perspective, the CLI appears frozen — the only recovery was closing the tab/session. This patch monkey-patches Vt100Parser.feed() so that bracketed-paste mode flushes buffered content as a normal BracketedPaste event after 2 seconds without an end marker, then restores normal parsing. Includes 8 regression tests covering normal paste, timeout recovery, torn end marks, and edge cases. Surgical reapply of PR NousResearch#27518. Original branch was many months stale (1193 files / 172k LOC of unrelated reverts); the substantive ~77 LOC patch in cli.py plus the new 157-line test file were reapplied onto current main with the contributor's authorship preserved via --author.
…o Tranquil-Flow (PR NousResearch#27518)
…ousResearch#4609) The read_file tool and terminal cat can access /proc/self/environ to recover all process env vars including secrets stripped by the subprocess blocklist. Output redaction partially mitigates (catches known-format tokens) but misses custom/proprietary key formats, especially when values are printed without their key names. Add /proc/*/environ, /proc/*/cmdline, and /proc/*/maps to the blocked device paths in _is_blocked_device(): - /proc/*/environ: leaks full process env (API keys, tokens) - /proc/*/cmdline: leaks command-line args (may contain passwords) - /proc/*/maps: leaks memory layout (ASLR bypass for exploitation) Legitimate /proc reads (cpuinfo, meminfo, uptime, version) remain accessible — the check only blocks per-pid pseudo-files with known sensitive suffixes. Complements PR NousResearch#4432 (PID namespace isolation for child processes) which prevents children from reading the parent's /proc, but does not prevent the parent process itself from being read via file tools. Partially addresses NousResearch#4427 Changes: tools/file_tools.py | +6 tests/tools/test_file_read_guards.py | +18 -1 Co-authored-by: dsr-restyn <dsr-restyn@users.noreply.github.com>
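A minimal sketch of a per-pid suffix check, assuming a regular expression is an acceptable way to express it. The real check lives in _is_blocked_device() and may be written differently; `is_sensitive_proc_path` is a hypothetical name.

```python
import posixpath
import re

# /proc/<pid or "self">/<sensitive pseudo-file>, per the list in the commit text.
_SENSITIVE_PROC = re.compile(r"^/proc/[^/]+/(environ|cmdline|maps)$")


def is_sensitive_proc_path(path: str) -> bool:
    """True for per-process pseudo-files that leak env, argv, or memory layout."""
    return bool(_SENSITIVE_PROC.match(posixpath.normpath(path)))
```

Normalizing first means spellings such as `/proc/self/./environ` are caught, while system-wide files like `/proc/cpuinfo` have no per-pid component and pass through.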
* fix: reject read_file symlinks to blocking devices The read_file guard already refused direct device paths such as /dev/zero, but a workspace symlink resolving to one of those devices could still reach the shell-backed read path and hang on wc/head/sed. Keep the literal alias check and add a resolved-path pass so local symlinks to blocked device/fd endpoints are rejected before I/O. Constraint: Preserve literal /dev/stdin handling before terminal-specific realpath resolution Confidence: high Scope-risk: narrow Tested: pytest tests/tools/test_file_read_guards.py tests/tools/test_file_tools.py -q; python -m compileall tools/file_tools.py tests/tools/test_file_read_guards.py; git diff --check Signed-off-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com> * Keep file guard tests off sensitive macOS temp paths The branch now inherits a sensitive-path write guard from upstream main. On macOS, tempfile.mkdtemp() resolves under /private/var/folders, so the new write-path guard fired before the file read dedup assertions could exercise their intended behavior. The tests now create their scratch files inside the worktree temp checkout, outside those system-sensitive prefixes, without changing production behavior. Constraint: Rebased branch must pass the expanded file read guard suite on macOS. Rejected: Loosen the production sensitive-path prefix list | broader behavior change unrelated to this PR. Confidence: high Scope-risk: narrow Tested: pytest tests/tools/test_file_read_guards.py -q --------- Signed-off-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com> Co-authored-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com>
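The two-pass check (literal alias first, then resolved path) can be sketched like this on a POSIX system. `BLOCKED_DEVICES` contains only the two devices named in this message, and `is_blocked_device` here is a simplified stand-in, not the function in tools/file_tools.py.

```python
import os

# Only the devices named in the commit text; the real blocklist is longer.
BLOCKED_DEVICES = {"/dev/zero", "/dev/stdin"}


def is_blocked_device(path: str) -> bool:
    """Reject blocked devices, including symlinks that resolve to them."""
    # Pass 1: literal check, before any resolution rewrites aliases like /dev/stdin.
    if path in BLOCKED_DEVICES:
        return True
    # Pass 2: follow symlinks so a workspace link to /dev/zero is caught before I/O.
    return os.path.realpath(path) in BLOCKED_DEVICES
```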
* fix(transcription): reject symlinked audio inputs Validation runs before provider selection, so rejecting symbolic-link paths there prevents supported-extension links from being treated as normal audio files. Use os.path.islink to avoid perturbing the existing Path.stat error path and to reject links before resolving targets. Constraint: Keep validation platform-safe and avoid requiring symlink support where unavailable. Rejected: Use Path.is_symlink | it consumes pathlib stat calls and broke the existing stat error regression. Confidence: high Scope-risk: narrow Directive: Keep path hardening in _validate_audio_file before provider dispatch. Tested: source venv/bin/activate && python -m pytest tests/tools/test_transcription_tools.py::TestValidateAudioFileEdgeCases -q (5 passed) Tested: source venv/bin/activate && python -m pytest tests/tools/test_transcription_tools.py::TestValidateAudioFileEdgeCases tests/tools/test_transcription_tools.py::TestTranscribeAudioDispatch::test_invalid_file_short_circuits -q (6 passed) Tested: source venv/bin/activate && python -m compileall tools/transcription_tools.py tests/tools/test_transcription_tools.py Tested: git diff --check Not-tested: Full tests/tools/test_transcription_tools.py under .[dev] only; existing faster_whisper optional dependency tests fail with ModuleNotFoundError. * Keep transcription tests independent of optional whisper install The transcription suite mocks faster-whisper directly, so a minimal test stub keeps the branch verifiable in environments where the optional package is not installed. This preserves the existing mock-based coverage without adding a dependency. 
Constraint: faster-whisper is an optional local STT dependency and is absent from the current validation environment Rejected: Install faster-whisper just for branch validation | would add heavyweight environment coupling outside the patch scope Confidence: high Scope-risk: narrow Directive: Keep this as a test-only stub unless production import semantics change Tested: pytest tests/tools/test_transcription_tools.py -q --------- Co-authored-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com>
…es_cron_deny The test set HERMES_YOLO_MODE=1 via monkeypatch.setenv, expecting check_dangerous_command() to honor yolo and bypass cron_mode=deny. But tools.approval._YOLO_MODE_FROZEN is intentionally frozen at module import time (security: prevents prompt-injection runtime escalation). When CI imports the module BEFORE the test sets the env, the frozen value stays False and the yolo bypass never activates. Local runs missed this because the conftest leaked a non-empty HERMES_YOLO_MODE into the import-time env. CI's clean-env path exposed the bug deterministically on test (3) / test (4) shards. Fix: patch the module attribute directly via mock.patch.object so the test simulates process-startup-with-yolo regardless of import order. The behavior under test (yolo bypasses cron_mode=deny for non-hardline commands) is unchanged; the security invariant (_YOLO_MODE_FROZEN can't be set at runtime by skills) is preserved. Reproduced locally with: env -i HOME=$HOME PATH=$PATH python3 -m pytest tests/tools/test_cron_approval_mode.py -o 'addopts=' -v Without the fix: 1 failed, 23 passed. With the fix: 24 passed.
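The import-order problem and the fix can be illustrated with a stand-in object. `approval` below is a `SimpleNamespace` playing the role of the tools.approval module, and `is_yolo` / `run_with_yolo` are hypothetical helpers; only `mock.patch.object` and the attribute name come from the text.

```python
from types import SimpleNamespace
from unittest import mock

# Stand-in for a module that snapshots HERMES_YOLO_MODE once, at import time.
approval = SimpleNamespace(_YOLO_MODE_FROZEN=False)


def is_yolo() -> bool:
    # Reads the frozen snapshot, so setting the env var later has no effect.
    return approval._YOLO_MODE_FROZEN


def run_with_yolo(fn):
    """Simulate process-startup-with-yolo for the duration of fn()."""
    with mock.patch.object(approval, "_YOLO_MODE_FROZEN", True):
        return fn()
```

The patch is scoped to the `with` block, so the frozen value is restored afterwards and later tests see the original state.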
…h continuation (NousResearch#31998) (NousResearch#32012) * fix(streaming): route mid-tool-call partial-stream-stub through length continuation (NousResearch#31998) When a stream stalls mid-tool-call (e.g. a large write_file), the partial-stream-stub recovery used finish_reason='stop' which caused the conversation loop to treat the turn as complete, returning only the warning text. When users said 'continue', the model retried the same large tool call, hit the same stale timeout, and looped indefinitely. Changes: - chat_completion_helpers.py: change _stub_finish_reason from 'stop' to 'length' for mid-tool-call partials. The stub still has tool_calls=None so no tool auto-executes — the model gets a fresh API call through the existing length-continuation machinery (bounded to 3 retries). Also attach _dropped_tool_names to the stub for downstream use. - conversation_loop.py: add a third continuation prompt branch for partial-stream-stubs with dropped tool calls. Instead of the generic 'continue where you left off' (which would retry the same large call), tell the model to break the output into smaller tool calls (~8K tokens each) to avoid stream timeouts. - test_partial_stream_finish_reason.py: update existing test from finish_reason='stop' to 'length', add _dropped_tool_names assertion, add new test_dropped_tool_call_uses_chunking_prompt for the 3-way prompt branching. Safety: tool_calls=None is preserved on the stub, so the conversation loop enters the text-continuation branch (line 1513), NOT the tool-call execution branch (line 3246). No tool auto-executes. The model simply gets another API call with targeted guidance. 
* refactor: extract constants and continuation prompt helper - Move magic strings to hermes_constants.py (PARTIAL_STREAM_STUB_ID, FINISH_REASON_LENGTH) - Extract _get_continuation_prompt() in conversation_loop.py — DRYs the 3-way prompt branching and lets tests import the real function - Trim verbose inline comments in chat_completion_helpers.py - Tests import constants + helper instead of duplicating logic --------- Co-authored-by: alt-glitch <balyan.sid@gmail.com>
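A simplified sketch of the stub and the prompt choice, under stated assumptions: the dict shape, `make_partial_stub`, `continuation_prompt`, and the prompt wording are all hypothetical, and only two of the three branches are modeled because the message does not spell out the other two. The `'length'` finish reason, `tool_calls=None`, and `_dropped_tool_names` come from the text.

```python
FINISH_REASON_LENGTH = "length"  # constant name from the refactor commit


def make_partial_stub(text: str, dropped_tool_names: list[str]) -> dict:
    """Build a partial-stream stub that never auto-executes a tool."""
    return {
        "content": text,
        "tool_calls": None,                    # keeps the loop in text continuation
        "finish_reason": FINISH_REASON_LENGTH,
        "_dropped_tool_names": list(dropped_tool_names),
    }


def continuation_prompt(stub: dict) -> str:
    """Pick a continuation prompt (two of the three branches, assumed wording)."""
    if stub.get("_dropped_tool_names"):
        return ("The previous tool call was cut off. Break the output into "
                "smaller tool calls (~8K tokens each).")
    return "Continue where you left off."
```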
text_to_speech_tool accepts an explicit output_path. Without a traversal guard, a path containing '..' components (whether prompt-injection- controlled, from a confused skill, or just a buggy caller) could escape its declared base and write the audio to a system location — e.g. `output_path='audio/../../etc/cron.d/x'` lands the file outside the intended audio cache. Reject '..' components in the user-supplied path. Explicit absolute paths are unchanged (the agent legitimately writes audio wherever the user/caller asks); only traversal-style escapes are blocked. The terminal tool can still write anywhere with approval — this just keeps the unattended TTS surface from materializing files via traversal. Regression tests cover: '..' in the middle (audio/../../etc/...), bare '..' prefix, and the negative cases (absolute paths + relative paths without '..' both pass through unchanged). Salvaged from PR NousResearch#6693 by @aaronlab. The original PR confined output to DEFAULT_OUTPUT_DIR-or-cwd, which broke 9 existing tests that legitimately write to tmp_path locations. The traversal-only check covers the actual threat (path-escape via '..' from prompt injection) without restricting where users can choose to write their audio. The remaining pieces of NousResearch#6693 (skill_commands rglob symlink rejection, delegate_tool batch prefix display) are dropped: - skill_commands rglob: breaks the documented design supporting ~/.hermes/skills/<name> as a symlink to a checked-out skill elsewhere (see comment at agent/skill_commands.py:73-75) - delegate_tool batch prefix: pure UX, doesn't belong in a security PR Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
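A traversal-only check of this kind can be sketched with `pathlib`, assuming POSIX-style separators. `has_traversal` is a hypothetical name and not the guard added to the TTS tool.

```python
from pathlib import PurePath


def has_traversal(output_path: str) -> bool:
    """True when any path component is exactly '..'."""
    # PurePath does not collapse '..', so every component is inspected as written.
    return ".." in PurePath(output_path).parts
```

Comparing whole components rather than substrings means a name such as `a..b` is not flagged, and absolute paths without `..` pass through unchanged.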
_update_via_zip downloads a source ZIP from GitHub and calls zipfile.ZipFile.extractall. The existing zip-slip path guard validates each member's path stays under tmp_dir, but does not check member type — so a ZIP containing a symlink member would still be materialized by extractall, and a symlink target could point outside the extracted tree (or to a sensitive system path). This isn't a high-likelihood threat for hermes-agent's actual GitHub source ZIPs (we don't ship symlinks), but the extractall path runs as the user's account and a compromised mirror could plant arbitrary files via the symlink → target → write chain. Reject any member whose Unix mode bits (upper 16 bits of external_attr) are S_IFLNK before extractall. Hermes source ZIPs contain only regular files and directories; a symlink member is unambiguously suspicious. Regression tests cover: symlink member rejection (raises ValueError, caught by the outer try/except as a clean SystemExit, no extraction), and the happy-path verification that a normal ZIP doesn't trigger the symlink reject message. Salvaged from PR NousResearch#15881 by @codeblackhole1024. The remaining pieces of that PR were already on main or contradicted explicit design decisions: - config.yaml write-deny: already in agent/file_safety.py's control_file_names denylist (the modern guard); the proposed addition to build_write_denied_paths was the legacy path. - Quick commands danger detection: contradicts the explicit cli.py:8491-8492 comment 'shell=True is intentional: quick_commands are user-defined shell snippets from config.yaml — not agent/LLM controlled.' - Memory plugin shlex.split for dep checks: already on main (hermes_cli/memory_setup.py:133). Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
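The member-type check can be sketched with the standard `zipfile` and `stat` modules. The helper names are hypothetical; the technique (Unix mode in the upper 16 bits of `external_attr`, compared against S_IFLNK) is the one described above.

```python
import stat
import zipfile


def find_symlink_members(zf: zipfile.ZipFile) -> list[str]:
    """Return names of members whose Unix mode bits mark them as symlinks."""
    return [
        info.filename
        for info in zf.infolist()
        if stat.S_ISLNK(info.external_attr >> 16)
    ]


def reject_symlink_members(zf: zipfile.ZipFile) -> None:
    """Raise ValueError before extraction if the archive contains a symlink."""
    links = find_symlink_members(zf)
    if links:
        raise ValueError(f"refusing to extract symlink members: {links}")
```

Calling `reject_symlink_members(zf)` before `zf.extractall(...)` keeps the check separate from the existing per-member path guard.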
…ousResearch#32053) When the user runs OAuth on a remote/SSH machine without a port forward, the OAuth provider redirects to http://127.0.0.1:<port>/callback which only the listener on the remote machine can receive — the user's browser on another box just shows a connection error. _wait_for_callback() now races the HTTP listener against a stdin reader on interactive TTYs. The user can copy the URL from the browser's address bar after authorization (which contains code=...&state=...) and paste it back at the prompt. Whichever fills the result dict first wins; the HTTP listener remains the primary path for local sessions and SSH tunnels. Accepts any of: - Full local redirect URL: http://127.0.0.1:N/callback?code=...&state=... - Provider URL after redirect: https://mcp.linear.app/callback?code=...&state=... - Just the query string: ?code=...&state=... or code=...&state=... The paste thread only spawns when _is_interactive() is true, preserving the existing 'no input() in headless runs' invariant — verified by TestWaitForCallbackPasteIntegration.test_paste_prompt_NOT_shown_when_noninteractive. The SSH-session hint in _redirect_handler is updated to surface the paste option as the primary remedy, with ssh -L tunneling as the alternative.
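Parsing the three accepted paste shapes can be sketched with `urllib.parse`. `parse_pasted_callback` is a hypothetical helper, not the code in _wait_for_callback(), and the port number in the usage below is made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def parse_pasted_callback(text: str):
    """Extract (code, state) from a full URL or a bare query string."""
    text = text.strip()
    # Full URLs carry the parameters in the query part; otherwise the
    # pasted text is the query string itself, with or without a leading '?'.
    query = urlsplit(text).query if "://" in text else text.lstrip("?")
    params = parse_qs(query)
    code = params.get("code", [None])[0]
    state = params.get("state", [None])[0]
    return code, state
```

For example, `parse_pasted_callback("http://127.0.0.1:8123/callback?code=abc&state=xyz")` and `parse_pasted_callback("code=abc&state=xyz")` both yield the same pair.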
…ousResearch#32060) The gateway's media delivery allowlist required that files live inside `~/.hermes/cache/{documents,images,...}`, which is the wrong shape for real agent usage. Agents naturally produce artifacts via terminal tools (`pandoc -o /tmp/report.pdf`, `matplotlib savefig`, etc.) or write_file into project directories, and these never land under the cache. Result: users got a raw file path in chat instead of an attachment.
This is doubly bad in deployment shapes where the cache directories aren't writable by the agent at all: Hermes running in Docker with a read-only mount, or with a Docker/Modal/SSH terminal backend whose filesystem isn't the gateway host's filesystem.
Layered trust model:
1. Cache-dir allowlist (unchanged): Hermes-managed roots are always trusted.
2. Operator allowlist: `HERMES_MEDIA_ALLOW_DIRS` env var, now also surfaced as `gateway.media_delivery_allow_dirs` in config.yaml.
3. Recency-based trust (new, default on): files whose mtime is within `gateway.trust_recent_files_seconds` (default 600s) of "now" are trusted even outside the cache/operator allowlist. Old host files (`/etc/passwd`, `~/.bashrc`, `~/.ssh/id_rsa`) have mtimes measured in days/months, well outside the window, so prompt-injection paths pointing at pre-existing files are still rejected.
4. Hard denylist: `/etc`, `/proc`, `/sys`, `/dev`, `/root`, `/boot`, `/var/{log,lib,run}`, plus `$HOME/.{ssh,aws,gnupg,kube,docker,config,azure,gcloud}` and `Library/Keychains`. The denylist blocks delivery even when recency would trust the file, in case an attacker somehow refreshes a sensitive file's mtime.
Operators who want strict-allowlist behavior set `gateway.trust_recent_files: false` and the system reverts to the pre-existing behavior.
Tests: 6 new cases in test_platform_base.py cover the recency window, disabled mode, system-path denylist, and the motivating PDF-in-project scenario.
3 existing tests (test_platform_base, test_tts_media_routing, test_send_message_tool) that exercised the strict-allowlist path are updated to disable recency trust explicitly. E2E validation: real `validate_media_delivery_path()` accepts fresh PDFs in /tmp and project dirs, rejects /etc/passwd, ~/.ssh/id_rsa, and files older than the window; config.yaml `gateway.*` keys bridge correctly to the env vars the validator reads.
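The layered check described above might be sketched roughly as follows. This is an illustration under stated assumptions, not the body of `validate_media_delivery_path()`: the function name `is_delivery_trusted`, its parameters, the abbreviated `DENY_ROOTS` tuple, and the evaluation order (allowlist first, then denylist, then recency) are all hypothetical.

```python
from __future__ import annotations

import time
from pathlib import Path

# Abbreviated, illustrative denylist; the commit lists more roots.
DENY_ROOTS = ("/etc", "/proc", "/sys", "/dev", "/root", "/boot")


def _is_under(path: Path, root: Path) -> bool:
    """True when path equals root or sits somewhere below it."""
    try:
        path.relative_to(root)
        return True
    except ValueError:
        return False


def is_delivery_trusted(path, allow_roots, *, trust_recent=True,
                        window_seconds=600.0, now=None,
                        deny_roots=DENY_ROOTS) -> bool:
    """Hypothetical layered trust check for a file the agent wants to deliver."""
    resolved = Path(path).resolve()  # follow symlinks before matching roots
    # Layers 1 and 2: cache and operator allowlists are trusted regardless of age.
    if any(_is_under(resolved, Path(r).resolve()) for r in allow_roots):
        return True
    # Layer 4: the denylist wins over recency.
    if any(_is_under(resolved, Path(r).resolve()) for r in deny_roots):
        return False
    # Layer 3: recency-based trust, which can be switched off.
    if not trust_recent or not resolved.is_file():
        return False
    now = time.time() if now is None else now
    return abs(now - resolved.stat().st_mtime) <= window_seconds
```

Passing `trust_recent=False` reduces the sketch to the strict-allowlist behavior the commit describes as the opt-out.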
The chatgpt.com/backend-api/codex endpoint has an intermittent failure mode where it accepts the connection but never emits a single stream event: the socket just hangs. Direct sequential probing reproduces it (0 events, no HTTP status), and a fresh reconnect then succeeds in ~2s.

Today the only guard is the wall-clock stale timeout in interruptible_api_call, so a dead-on-arrival connection is held for the full stale window (90-900s depending on context / config) before the retry loop can reconnect. That is minutes of wasted wall time per stall, at a rate of ~20% of calls during affected windows.

Add a TTFB watchdog scoped to the codex_responses path:
- codex_runtime.run_codex_stream stamps agent._codex_stream_last_event_ts on *every* stream event (not just output-text deltas), so reasoning-only and tool-call-only turns are not mistaken for a stall.
- interruptible_api_call resets that marker before the worker starts and, while it is still None, kills the connection once elapsed exceeds the TTFB cutoff (default 45s, tunable via HERMES_CODEX_TTFB_TIMEOUT_SECONDS, 0 disables). The raised TimeoutError flows through the existing retry path unchanged.

Once any event has arrived the stream is healthy and only the existing wall-clock stale timeout applies, so legitimate long generations are never interrupted. Gated to codex_responses; the chat_completions non-stream, anthropic and bedrock branches have no first-event signal and are untouched.

Adds tests/agent/test_codex_ttfb_watchdog.py covering the stall kill, the events-flowing pass-through, and the env-disable path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
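The first-event watchdog idea can be illustrated with a small threaded sketch. The names here (`StreamState`, `call_with_ttfb_watchdog`, `poll_interval`) are hypothetical, and the sketch only raises `TimeoutError` and abandons the daemon worker; the real change is described as also closing the underlying connection.

```python
import threading
import time


class StreamState:
    """Holds the timestamp of the latest stream event (None until one arrives)."""

    def __init__(self) -> None:
        self.last_event_ts = None


def call_with_ttfb_watchdog(worker, state, ttfb_timeout, poll_interval=0.01):
    """Run worker(state) in a thread; fail fast if no event arrives in time.

    A ttfb_timeout of 0 disables the watchdog, mirroring the documented
    env-var semantics. Worker exceptions are not propagated in this sketch.
    """
    state.last_event_ts = None  # reset the marker before the worker starts
    result = {}
    done = threading.Event()

    def _run():
        try:
            result["value"] = worker(state)
        finally:
            done.set()

    threading.Thread(target=_run, daemon=True).start()
    start = time.monotonic()
    while not done.wait(poll_interval):
        no_event_yet = state.last_event_ts is None
        if ttfb_timeout > 0 and no_event_yet and time.monotonic() - start > ttfb_timeout:
            raise TimeoutError("no stream event within the TTFB cutoff")
    return result.get("value")
```

Because the check only fires while the marker is still `None`, a worker that has stamped at least one event is never interrupted by this watchdog, however long it then runs.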
…ousResearch#32061) The CLI status bar tracked /background agent tasks (▶ N) but not shell processes spawned via terminal(background=true). Both kinds of work can run concurrently and a user has no in-bar signal for shell processes. Add an independent indicator (⚙ N) sourced from tools.process_registry.process_registry._running. The two indicators render side-by-side when both are active (▶ 1 │ ⚙ 2), hidden when their count is zero. Renders at all four status-bar tiers (text fallback + prompt_toolkit fragments, narrow + wide widths). The narrow <52 tier still drops both for space — unchanged. New ProcessRegistry.count_running() returns len(_running) without acquiring _lock; CPython dict len is atomic and we're polling on every status-bar tick, so lock-free is the right tradeoff.
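A rough sketch of the lock-free count and the side-by-side rendering is shown below. Only `count_running`, `_running`, and `_lock` come from the commit message; the `register`/`unregister` methods and the `format_indicators` helper are hypothetical and exist just to make the example self-contained.

```python
import threading


class ProcessRegistry:
    """Toy registry; only count_running() mirrors the commit's description."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running = {}

    def register(self, pid, proc) -> None:  # hypothetical helper
        with self._lock:
            self._running[pid] = proc

    def unregister(self, pid) -> None:  # hypothetical helper
        with self._lock:
            self._running.pop(pid, None)

    def count_running(self) -> int:
        # Deliberately lock-free: a single len() on a dict is safe to read
        # under CPython, and this is polled on every status-bar tick.
        return len(self._running)


def format_indicators(agent_tasks: int, shell_procs: int) -> str:
    """Render the two indicators, hiding each one when its count is zero."""
    parts = []
    if agent_tasks:
        parts.append(f"▶ {agent_tasks}")
    if shell_procs:
        parts.append(f"⚙ {shell_procs}")
    return " │ ".join(parts)
```

Writers still take the lock, so only the high-frequency read path skips it.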
…MCP OAuth (NousResearch#32067)

Follow-up to NousResearch#32053. The OAuth-over-SSH guide and the MCP feature page previously only covered xAI and Spotify. Now that MCP servers can complete OAuth via stdin paste-back on remote/headless hosts, document it.

oauth-over-ssh.md:
- Add MCP servers to the 'Which Providers Need This' table.
- New 'MCP Servers' section covering: paste-back (no setup, works anywhere), SSH port forward (same pattern as xAI/Spotify), and the 30s config-auto-reload race pitfall (use 'hermes mcp login <server>' from a fresh terminal instead of editing config from inside a running session).

mcp.md:
- New 'OAuth-authenticated HTTP servers' section under HTTP servers, covering auth: oauth config, token cache path, paste-back vs SSH tunnel for headless hosts, and the same reload-race pitfall.
- Cross-links to the OAuth-over-SSH guide anchor.
… disabling server (NousResearch#32069)

When an MCP server triggers OAuth at startup, the user can now type 'skip' (or 'cancel', 's', 'n', 'no', 'q', 'quit') at the paste prompt + Enter to exit the flow cleanly and continue agent startup without that server.

Previously the only ways to bypass an unwanted OAuth prompt were:
- Wait the full 5-minute paste timeout
- Ctrl+C (also kills the whole reload, may leave half-state)
- Edit config.yaml to set 'enabled: false' on the server

Skip writes a sentinel to result['error'] which _wait_for_callback maps to OAuthNonInteractiveError('user_skipped'). mcp_tool already classifies that as an auth error in _is_auth_error() and the reconnect loop logs it as 'not retrying automatically': the server stays disconnected for the session, other MCP servers continue normally, no infinite retry burn.

The skip message tells users how to re-auth later ('hermes mcp login') or disable persistently ('enabled: false'), so they don't have to remember.

14 new tests covering: case-insensitive skip parsing, all 7 skip tokens, skip not stomping an HTTP-listener win, skip routed to skip path rather than URL-parse path, sentinel mapped to OAuthNonInteractiveError, prompt mentions the skip option.
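The skip-token routing can be sketched in a few lines. The seven tokens are taken from the commit message; the names `SKIP_TOKENS`, `is_skip_token`, and `route_paste` are hypothetical and the real prompt handler may be structured differently.

```python
# The seven tokens listed in the commit message.
SKIP_TOKENS = frozenset({"skip", "cancel", "s", "n", "no", "q", "quit"})


def is_skip_token(text: str) -> bool:
    """Case-insensitive match, ignoring surrounding whitespace."""
    return text.strip().lower() in SKIP_TOKENS


def route_paste(text: str) -> str:
    """Decide which path handles a pasted line: 'skip' or 'callback'."""
    # Check skip tokens first so they never reach the URL parser.
    return "skip" if is_skip_token(text) else "callback"
```

Matching on the whole stripped line, rather than a prefix, keeps a pasted query string such as `state=...` from being mistaken for the `s` token.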
The TUI frontend's slash command registry shadowed /queue's 'q' alias with /quit's 'q' alias. Since /quit appeared later in the registry, the flat lookup kept the later entry, making /q always quit instead of queueing a prompt. This mirrors the backend fix in PR NousResearch#10538 (hermes_cli/commands.py) but applies the same correction to the TUI TypeScript registry. Fixes NousResearch#10467
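The shadowing is easy to reproduce with any flat alias map that lets later entries overwrite earlier ones. The sketch below is in Python for illustration only (the affected registry is TypeScript), and the `build_alias_index` helper and the two-entry registries are hypothetical.

```python
def build_alias_index(commands):
    """Flatten (name, aliases) pairs into one lookup; later entries win."""
    index = {}
    for name, aliases in commands:
        for key in (name, *aliases):
            index[key] = name  # silently overwrites an earlier owner of key
    return index


# Before the fix: both commands claim 'q', and quit is registered later.
shadowed = build_alias_index([("queue", ["q"]), ("quit", ["q", "exit"])])

# After the fix: only queue keeps the 'q' alias.
fixed = build_alias_index([("queue", ["q"]), ("quit", ["exit"])])
```

With the first registry, looking up `q` returns `quit`; with the second it returns `queue`, which is the behavior the regression test asserts.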
…rch#31983) Adapted from @hclsys's test in PR NousResearch#31985. Asserts findSlashCommand('q') resolves to the queue command, not quit.
…ousResearch#32072)

All four tests were broken by the security cluster (NousResearch#10082 / NousResearch#10133 / NousResearch#4609 / symlink-reject batch) merging on May 25. They were red on origin/main HEAD when NousResearch#32042 and NousResearch#32061 ran, gating PRs that touched unrelated code.

1) tests/hermes_cli/test_update_zip_symlink_reject.py
test_update_via_zip_accepts_normal_member called the real _update_via_zip without sandboxing PROJECT_ROOT, so the function's shutil.copytree() actually copied the fake README from the test ZIP over the real repo's README.md, which then made test_readme_mentions_powershell_installer fail in any test run that happened to pick this test up earlier. Mock PROJECT_ROOT to an isolated tmp_path / install_dir, stub subprocess so pip/uv reinstall doesn't actually run, and assert the fake README lands in the sandbox (not the real tree).

2) tests/tools/test_windows_native_support.py
test_readme_mentions_powershell_installer was the victim of (1). Nothing is wrong with the test itself; the fix in (1) clears it.

3) tests/tools/test_file_read_guards.py
test_proc_fd_other_not_blocked called _is_blocked_device('/proc/self/fd/3') expecting False. But _is_blocked_device runs realpath() and on pytest xdist workers fd 3 happens to be dup'd to /dev/urandom (because the worker subprocess inherits open fds from pytest's collection pipe machinery). Switch to the lower-level _is_blocked_device_path, which is the path-pattern check the test actually means to exercise; realpath-resolution coverage already lives in test_symlink_to_blocked_device_is_blocked.

4) tests/tools/test_transcription_tools.py
The module installed a faster_whisper stub via sys.modules without setting __spec__, then later @pytest.mark.skipif called importlib.util.find_spec('faster_whisper'), which raises 'ValueError: __spec__ is None' for modules with a None spec attr. Set __spec__ on the stub to a real ModuleSpec.

Validation: 195/195 green across the 4 affected files.
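The failure in item 4 can be reproduced in isolation: `importlib.util.find_spec` raises `ValueError` when the name is already in `sys.modules` and that module's `__spec__` is `None`, which is the default for a bare `types.ModuleType`. The sketch below uses a hypothetical `install_stub` helper; `loader=None` on the `ModuleSpec` is an illustrative choice, not necessarily what the test file uses.

```python
import importlib.machinery
import importlib.util
import sys
import types


def install_stub(name: str, with_spec: bool) -> types.ModuleType:
    """Register a stub module in sys.modules, optionally with a real spec."""
    stub = types.ModuleType(name)  # __spec__ defaults to None here
    if with_spec:
        stub.__spec__ = importlib.machinery.ModuleSpec(name, loader=None)
    sys.modules[name] = stub
    return stub


def stub_is_discoverable(name: str) -> bool:
    """True when find_spec() succeeds for a module registered in sys.modules."""
    try:
        return importlib.util.find_spec(name) is not None
    except ValueError:
        # Raised when sys.modules[name].__spec__ is None.
        return False
```

Any later `find_spec`-based skip condition therefore works only when the stub carries a spec.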
…arch#31845)

Aux callers (title generation, vision, session search, etc.) can reach resolve_provider_client() without an explicit model when the user picked their main provider via 'hermes model' and didn't bother configuring a per-task auxiliary.<task>.model override. The expectation in that case is universal: 'use my main model for side tasks too.'

Before, the OAuth providers (xai-oauth, openai-codex) silently returned (None, None) on an empty model, since both lack a catalog default because their accepted-model lists drift on the backend. That caused _resolve_auto to drop to its Step-2 fallback chain (OpenRouter / Nous / etc.), so aux tasks billed against the wrong subscription without warning.

The fix is at the top of resolve_provider_client(): a single 3-step universal fallback that runs before any provider branch, so no provider-specific empty-model guards are needed (now or for any future provider we add):
1. caller-passed model (caller knew what they wanted)
2. provider's catalog default (cheap aux model, if registered)
3. user's main model from config.yaml

Behaviour by provider class:
- OAuth providers (xai-oauth, openai-codex): no catalog default, so step 3 applies. Title gen runs on grok-4.3 / gpt-5.4 against the user's actual subscription instead of leaking to OpenRouter.
- API-key providers (anthropic, gemini, kimi-coding, etc.): catalog default wins at step 2, preserving the original 'cheap aux model' behaviour. Anthropic users still get claude-haiku-4-5 for titles, not opus.
- Explicit-model callers (auxiliary.<task>.model config, programmatic callers): caller wins at step 1, no surprise switching.

Salvaged from @wysie's PR NousResearch#31845 which fixed the xai-oauth branch specifically. The universal shape supersedes the per-branch fix and covers openai-codex (same bug class) plus any future OAuth providers.

4 new tests in TestResolveProviderClientUniversalModelFallback:
- empty_model_for_oauth_provider_falls_back_to_main_model
- empty_model_for_codex_also_uses_main_model
- empty_model_for_catalog_provider_uses_catalog_default
- explicit_model_takes_precedence_over_fallbacks

365/365 across tests/agent/test_auxiliary_*, tests/run_agent/test_codex_xai_oauth_recovery.py, tests/hermes_cli/test_auth_xai_oauth_provider.py, and tests/hermes_cli/test_plugin_auxiliary_tasks.py.

Co-authored-by: wysie <wysie@users.noreply.github.com>
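The three-step precedence reduces to "first non-empty candidate wins". A minimal sketch, assuming a hypothetical `resolve_aux_model` helper that receives the three candidates directly (the real resolve_provider_client() looks them up itself):

```python
from __future__ import annotations


def resolve_aux_model(caller_model: str | None,
                      catalog_default: str | None,
                      main_model: str | None) -> str | None:
    """Return the first non-empty model in precedence order.

    1. caller-passed model
    2. provider's catalog default
    3. user's main model from config
    """
    for candidate in (caller_model, catalog_default, main_model):
        if candidate:  # treats None and "" as missing
            return candidate
    return None
```

For an OAuth provider with no catalog default, `resolve_aux_model(None, None, "grok-4.3")` falls through to the main model, while a catalog provider with `"claude-haiku-4-5"` registered stops at step 2.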
Merged 2584 upstream commits into fork. Auto-resolved merge conflicts by accepting upstream changes.
🔎 Lint report:
| Rule | Count |
|---|---|
| invalid-argument-type | 40 |
| unresolved-import | 36 |
| unsupported-operator | 14 |
| unresolved-attribute | 9 |
| invalid-method-override | 4 |
| invalid-assignment | 3 |
| not-subscriptable | 2 |
| no-matching-overload | 1 |
| not-iterable | 1 |
First entries
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `dict[str, Any]`, found `str | bool`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `list[dict[str, Any]]`, found `str | bool`
tests/hermes_cli/test_tts_picker.py:105: [invalid-assignment] invalid-assignment: Invalid subscript assignment with key of type `Literal["bogus"]` and value of type `_NoName` on object of type `dict[str, TTSProvider]`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `list[str]`, found `str | bool`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `IterationBudget`, found `str | bool`
run_agent.py:963: [unresolved-attribute] unresolved-attribute: Object of type `Self@_codex_silent_hang_hint` has no attribute `model`
tests/tools/test_transcription_plugin_dispatch.py:176: [invalid-argument-type] invalid-argument-type: Argument to `_FakeProvider.__init__` is incorrect: Expected `dict[Unknown, Unknown] | None`, found `Literal["weird string"]`
gateway/platforms/api_server.py:2582: [invalid-argument-type] invalid-argument-type: Argument to function `create_job` is incorrect: Expected `list[str] | None`, found `Unknown | LiteralString | dict[str, str]`
tests/tools/test_memory_tool.py:117: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["deception_hide"]` and `str | None`
tests/cli/test_destructive_slash_inline_skip_e2e.py:126: [invalid-argument-type] invalid-argument-type: Argument to bound method `HermesCLI.process_command` is incorrect: Expected `HermesCLI`, found `SimpleNamespace`
tests/cli/test_bracketed_paste_timeout.py:45: [invalid-argument-type] invalid-argument-type: Argument to function `exec` is incorrect: Expected `str | Buffer | CodeType`, found `str | None`
tests/run_agent/test_codex_silent_hang_hint.py:13: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/gateway/test_telegram_topic_mode.py:1229: [invalid-argument-type] invalid-argument-type: Argument to bound method `SessionDB.bind_telegram_topic` is incorrect: Expected `str`, found `str | None`
tests/tools/test_memory_tool.py:172: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["hermes_config_mod"]` and `str | None`
tests/agent/test_tts_registry.py:282: [invalid-argument-type] invalid-argument-type: Argument to function `resolve_output_format` is incorrect: Expected `str | None`, found `Literal[123]`
tests/tools/test_memory_tool.py:112: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["hidden_div"]` and `str | None`
tests/agent/test_non_stream_stale_timeout.py:38: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `str`, found `str | bool`
tests/cli/test_resume_quiet_stderr.py:16: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/tools/test_memory_tool.py:164: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["agent_config_mod"]` and `str | None`
tests/tools/test_memory_tool.py:107: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["html_comment_injection"]` and `str | None`
tests/tools/test_memory_tool.py:102: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["translate_execute"]` and `str | None`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int | float | None`, found `str | bool`
plugins/platforms/mattermost/adapter.py:379: [invalid-method-override] invalid-method-override: Invalid override of method `send_voice`: Definition is incompatible with `BasePlatformAdapter.send_voice`
tests/agent/test_non_stream_stale_timeout.py:38: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int`, found `str | bool`
run_agent.py:952: [unresolved-attribute] unresolved-attribute: Object of type `Self@_codex_silent_hang_hint` has no attribute `api_mode`
... and 85 more
✅ Fixed issues (11):
| Rule | Count |
|---|---|
| invalid-argument-type | 5 |
| invalid-method-override | 4 |
| unresolved-import | 2 |
First entries
gateway/platforms/api_server.py:2520: [invalid-argument-type] invalid-argument-type: Argument to function `create_job` is incorrect: Expected `dict[str, Any] | None`, found `Unknown | LiteralString`
gateway/platforms/mattermost.py:867: [invalid-argument-type] invalid-argument-type: Argument is incorrect: Expected `list[str]`, found `(list[str] & ~AlwaysFalsy) | None`
gateway/platforms/mattermost.py:352: [invalid-method-override] invalid-method-override: Invalid override of method `send_image_file`: Definition is incompatible with `BasePlatformAdapter.send_image_file`
gateway/platforms/mattermost.py:379: [invalid-method-override] invalid-method-override: Invalid override of method `send_voice`: Definition is incompatible with `BasePlatformAdapter.send_voice`
gateway/platforms/mattermost.py:365: [invalid-method-override] invalid-method-override: Invalid override of method `send_document`: Definition is incompatible with `BasePlatformAdapter.send_document`
gateway/platforms/matrix.py:243: [unresolved-import] unresolved-import: Cannot resolve imported module `mautrix`
gateway/platforms/api_server.py:2520: [invalid-argument-type] invalid-argument-type: Argument to function `create_job` is incorrect: Expected `bool`, found `Unknown | LiteralString`
gateway/platforms/api_server.py:2520: [invalid-argument-type] invalid-argument-type: Argument to function `create_job` is incorrect: Expected `int | None`, found `Unknown | LiteralString`
gateway/platforms/mattermost.py:392: [invalid-method-override] invalid-method-override: Invalid override of method `send_video`: Definition is incompatible with `BasePlatformAdapter.send_video`
gateway/platforms/api_server.py:2520: [invalid-argument-type] invalid-argument-type: Argument to function `create_job` is incorrect: Expected `list[str] | None`, found `Unknown | LiteralString`
gateway/platforms/mattermost.py:809: [unresolved-import] unresolved-import: Cannot resolve imported module `aiohttp`
Unchanged: 4824 pre-existing issues carried over.
Diagnostics are surfaced as warnings — this check never fails the build.
Daily sync with upstream. Auto-created by cron job.
Commits: 2584 new upstream commits merged.
Recent upstream commits
dbe5d84 fix(auxiliary): universal main-model fallback for aux tasks (NousResearch#31845)
46c1ae8 fix(tests): four pre-existing flakes from the security cluster merge (NousResearch#32072)
f5bb595 chore(release): map 8bit64k + hclsys in AUTHOR_MAP
85a0b34 test(tui): regression test for /q alias resolving to queue (NousResearch#31983)
064ac28 fix(tui): remove 'q' alias from /quit, add to /queue
8191f66 feat(mcp-oauth): accept 'skip' at paste prompt to bypass auth without disabling server (NousResearch#32069)
bdf3696 docs(mcp-oauth): document paste-back flow and SSH options for remote MCP OAuth (NousResearch#32067)
1c3c364 feat(cli): show live background terminal-process count in status bar (NousResearch#32061)
2b16de0 chore(release): map adam91holt for PR NousResearch#31984 salvage
8601c4d fix(codex): add time-to-first-byte watchdog for stalled Codex streams
a989a79 fix(gateway): allow native delivery of freshly-produced agent files (NousResearch#32060)
0ff7c09 feat(mcp-oauth): stdin paste-back fallback for headless OAuth flow (NousResearch#32053)
e9119e0 chore(release): map dsr-restyn + WuKongAI-CMU + codeblackhole1024 for S04 cluster
bd2756d fix(update): reject symlink members in update ZIP
5f20322 fix(tts): reject '..' traversal in output_path
ac5359a fix(streaming): route mid-tool-call partial-stream-stub through length continuation (NousResearch#31998) (NousResearch#32012)
46d8b5d fix(profile): reject symlinks in distributions (NousResearch#25292)
0d55315 fix(backup): skip symlinked files in zip archives (NousResearch#25289)
79799c8 test(approval): patch _YOLO_MODE_FROZEN directly in test_yolo_overrides_cron_deny
95848b1 fix(transcription): reject symlinked audio inputs (NousResearch#10082)