chore: sync upstream NousResearch/hermes-agent (727 commits, v0.14.0) #10
Wizarck wants to merge 727 commits.
Salvages the substantive part of NousResearch#22295 by @steezkelly. Adds the missing HERMES_KANBAN_HOME, HERMES_KANBAN_RUN_ID, HERMES_KANBAN_CLAIM_LOCK, HERMES_KANBAN_DISPATCH_IN_GATEWAY entries to _HERMES_BEHAVIORAL_VARS so ambient developer-shell pins on those vars don't bleed into pytest runs. The frozenset extraction + standalone regression test from the original PR were dropped to keep the change minimal — main already maintains the list inline.
Salvages NousResearch#22981 by @SimbaKingjoe. Adds 'kanban.max_in_progress' config that caps simultaneously running tasks. When the board already has N running, dispatcher skips spawning so slow workers (local LLMs, resource-constrained hosts) don't pile up and time out. Threads through dispatch_once(max_in_progress=) and gateway dispatcher config parsing with validation (warns on invalid/below-1 values).
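The cap can be sketched roughly as follows. This is a minimal illustration, not the dispatcher's code: `parse_max_in_progress` and `spawn_budget` are hypothetical helper names, and treating an invalid value as "no cap" after the warning is an assumption (the description only says such values are warned about).

```python
import logging
from typing import Any, Optional

logger = logging.getLogger("kanban.sketch")


def parse_max_in_progress(raw: Any) -> Optional[int]:
    """Return a positive integer cap, or None (no cap) when unset or invalid."""
    if raw is None:
        return None
    try:
        value = int(raw)
    except (TypeError, ValueError):
        logger.warning("ignoring invalid kanban.max_in_progress=%r", raw)
        return None
    if value < 1:
        logger.warning("ignoring kanban.max_in_progress=%r (must be >= 1)", raw)
        return None
    return value


def spawn_budget(running: int, ready: int, max_in_progress: Optional[int]) -> int:
    """How many ready tasks may be spawned this tick without exceeding the cap."""
    if max_in_progress is None:
        return ready
    # When the board already has N running, the budget is zero and spawning is skipped.
    return max(0, min(ready, max_in_progress - running))
```

With a cap of 3 and two workers already running, only one more ready task would be spawned; once three are running, the tick spawns nothing.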
Salvages NousResearch#23738 by @LeonSGP43. Wheel installs were missing skills/ and optional-skills/ because pyproject's [tool.setuptools.packages.find] only includes Python packages — the skills directories don't have __init__.py so they were silently dropped from the wheel. Adds setup.py with data_files spec emitting skills/* and optional-skills/* under hermes_agent-<v>.data/data/, and a get_bundled_skills_dir() helper in hermes_constants that discovers the wheel-installed location via sysconfig before falling back to a source-checkout path. tools/skills_sync uses the helper so 'hermes update' works for pip-installed users.
Salvages NousResearch#23302 by @Bartok9. Four independent one-area fixes: 1. kanban boards delete alias now hard-deletes (not archives) — the alias didn't carry --delete, so getattr(args, 'delete', False) returned False. Detect boards_action=='delete' explicitly. 2. Gateway auto-title failures no longer leak as user-visible warnings — debug-log only since they're not actionable. 3. Background process completion notification snaps truncation to the next newline boundary, prepends a marker when content is dropped. 4. _cprint() schedules the run_in_terminal coroutine via asyncio.ensure_future so output isn't silently dropped from background threads (fixes NousResearch#23185 Bug A). Skips the double-print fallback that would fire for mock paths.
Salvages NousResearch#24402 by @RyanRana. The KANBAN_GUIDANCE block (~835 tokens) is session-static — the dispatcher decides at spawn time whether the process is a kanban worker via the kanban_show tool's check_fn (gated on HERMES_KANBAN_TASK env var). Re-checking 'kanban_show' in valid_tool_names and re-loading the reference on every system-prompt rebuild (init + each context compression) is wasted work. Caches the resolved string on agent._kanban_worker_guidance once in agent_init and consumes it in system_prompt.build_system_prompt(), with a getattr fallback for code paths that bypass agent_init.
Salvages NousResearch#25745 by @LizerAIDev. Adds --sort {created,created-desc,priority,priority-desc,status,assignee,title,updated} to 'hermes kanban list'. Validated against VALID_SORT_ORDERS map; invalid values raise ValueError. Default behaviour (priority DESC, created ASC) is unchanged when --sort is omitted.
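A sketch of how validation against such a map might look. Only the eight sort keys, the `VALID_SORT_ORDERS` name, the ValueError, and the default ordering come from the description; the column names, the ORDER BY fragments, and the `resolve_order_by` helper are assumptions for illustration.

```python
from typing import Optional

# Hypothetical mapping from --sort value to an ORDER BY fragment.
VALID_SORT_ORDERS = {
    "created": "created_at ASC",
    "created-desc": "created_at DESC",
    "priority": "priority ASC",
    "priority-desc": "priority DESC",
    "status": "status ASC",
    "assignee": "assignee ASC",
    "title": "title ASC",
    "updated": "updated_at DESC",
}

# Default used when --sort is omitted (priority DESC, created ASC).
DEFAULT_ORDER = "priority DESC, created_at ASC"


def resolve_order_by(sort: Optional[str] = None) -> str:
    """Translate a --sort value into an ORDER BY fragment, rejecting unknown keys."""
    if sort is None:
        return DEFAULT_ORDER
    try:
        return VALID_SORT_ORDERS[sort]
    except KeyError:
        allowed = ", ".join(sorted(VALID_SORT_ORDERS))
        raise ValueError(f"invalid sort order {sort!r}; expected one of: {allowed}") from None
```

Looking the fragment up in a fixed map, rather than interpolating the user's string, also keeps the CLI value out of the SQL text.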
… inspect)
Adds three read-only endpoints to the kanban dashboard plugin so the
SwitchUI workspace (and any other dashboard consumer) can track
workers across tasks without N+1 round-trips through /tasks/{task_id}.
- GET /workers/active
Single SQL JOIN of task_runs + tasks where ended_at IS NULL,
worker_pid IS NOT NULL, status='running'. Returns
{workers: [...], count, checked_at}.
- GET /runs/{run_id}
Direct lookup of any task_run row by id. Reuses existing
kanban_db.get_run() helper and _run_dict() serialiser. 404 when
not found. Mirrors GET /tasks/{task_id} 404 shape.
- GET /runs/{run_id}/inspect
Live PID stats via psutil.Process.as_dict() — cpu_percent,
memory_rss_bytes, memory_vms_bytes, num_threads, num_fds, status,
create_time, cmdline. Short-circuits with alive:false when run
has ended, has no worker_pid, the pid is gone, or psutil is
unavailable. AccessDenied surfaces as alive:true with error
rather than a 500.
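The single-JOIN shape of GET /workers/active can be sketched with the standard-library sqlite3 module. The table and column layout below is a guess reduced to the columns named above, and placing `status` on the tasks table is an assumption; `active_workers` and `demo_conn` are hypothetical names.

```python
import sqlite3
import time


def active_workers(conn: sqlite3.Connection) -> dict:
    """One JOIN instead of one /tasks/{task_id} round-trip per task."""
    rows = conn.execute(
        """
        SELECT r.id AS run_id, r.task_id, r.worker_pid, t.title
        FROM task_runs AS r
        JOIN tasks AS t ON t.id = r.task_id
        WHERE r.ended_at IS NULL
          AND r.worker_pid IS NOT NULL
          AND t.status = 'running'
        ORDER BY r.id
        """
    ).fetchall()
    workers = [dict(row) for row in rows]
    return {"workers": workers, "count": len(workers), "checked_at": int(time.time())}


def demo_conn() -> sqlite3.Connection:
    """In-memory board with one live run, one ended run, and one run without a pid."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.executescript(
        """
        CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT, status TEXT);
        CREATE TABLE task_runs (
            id INTEGER PRIMARY KEY, task_id INTEGER, worker_pid INTEGER, ended_at INTEGER
        );
        INSERT INTO tasks VALUES (1, 'live', 'running'), (2, 'done', 'done'), (3, 'no pid', 'running');
        INSERT INTO task_runs VALUES (10, 1, 111, NULL), (11, 2, 222, 100), (12, 3, NULL, NULL);
        """
    )
    return conn
```

Only run 10 satisfies all three filters, which mirrors the ended-run and missing-pid filtering cases the tests describe.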
11 new tests in tests/plugins/test_kanban_worker_runs.py cover the
empty-board case, running-task case, ended-run filtering,
missing-pid filtering, 404 paths, already-ended inspect, no-pid
inspect, dead-pid inspect, and live-pid inspect (psutil mocked).
All pass.
Companion termination endpoint (POST /runs/{run_id}/terminate) is
intentionally out of scope here — opening a separate issue first
since the RBAC and dispatcher-mediated soft-cancel design needs
maintainer input before code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ng (NousResearch#26744) - Existing ``test_patch_drag_drop_move_todo_to_ready`` now asserts the enriched 409 detail names the blocking parent (id, quoted title, and current status), so the dashboard always has something actionable to render. - New bundle-assertion test ``test_dashboard_surfaces_ready_blocked_error_inline`` pins the frontend wiring: the ``parseApiErrorMessage`` helper exists, the drag/drop banner runs through it, and the drawer maintains a visible ``patchErr`` state that's cleared between PATCHes and tasks.
…usResearch#27941) Update the Codex app-server runtime guide's Kanban section to reflect the new behaviour: * The sandbox override now adds the board DB directory plus every Kanban path the dispatcher pinned (HERMES_KANBAN_WORKSPACES_ROOT, HERMES_KANBAN_WORKSPACE, legacy HERMES_KANBAN_ROOT) -- deduplicated, DB-dir first. * The motivation note now includes the cross-mount artifact-write scenario (e.g. ``/media/.../kanban-workspaces/...`` on a separate drive) and links to issue NousResearch#27941 so readers can find the original bug report.
Salvages substantive part of NousResearch#26490 by @aqilaziz. Detects corrupt board DBs ("file is not a database" / "database disk image is malformed") and disables them by fingerprint until they're repaired, instead of flooding the gateway log with repeated logger.exception tracebacks every tick. Cherry-picked the substantive commit (ea5b4ec); the tip commit was an unrelated _is_dir OSError fix for service-path lookup. Dropped a small test reformat that was bundled in the same commit.
Salvages NousResearch#28199 by @bensargotest-sys. Aligns Kanban docs with current tool registration: dispatcher-spawned task workers get task tools, profiles that explicitly enable the kanban toolset get orchestrator routing tools (kanban_list, kanban_unblock). Corrects failure-limit text to current default of 2. Hardens the e2e subprocess script to resolve repo root and use the spawnable default assignee. Updates the diagnostics severity fixture to assert error below the critical threshold.
Salvages NousResearch#26897 by @loicnico96. The per-task model_override DB column already exists on main, but it wasn't exposed in user-facing surfaces. This adds: - 'kanban show' prints 'model: <name>' when model_override is set - kanban_show / kanban_list tool responses include the model_override field Original branch was stale (PR was authored against an older field name 'model'); applied the substantive surface exposure manually using the current 'model_override' field name.
Salvages NousResearch#26791 by @Niraven. Adds 'hermes kanban swarm' to create a durable Kanban Swarm v1 graph: a completed root/blackboard card, parallel worker cards, a verifier gated on all workers, and a synthesizer gated on the verifier. Stores shared swarm blackboard updates as structured JSON comments on the root card. Self-contained: new hermes_cli/kanban_swarm.py module + CLI wiring + unit tests.
Salvages NousResearch#27598 by @nnnet. Adds optional 'board' parameter to all 9 kanban_* MCP tools via shared _connect helper. Backwards compatible — omitting board keeps current pinned-board behavior. Useful for orchestrator profiles that route across multiple boards. Two-file scope: tools/kanban_tools.py + tests.
Salvages NousResearch#23208 by @awizemann. Tracks which chat session created a kanban task so clients can render a per-session board without falling back to tenant + time-window heuristics.
- Schema: tasks gains nullable session_id TEXT column with index (additive migration in _migrate_add_optional_columns).
- ACP: server.py exposes the originating session id via HERMES_SESSION_ID with save/restore around the agent loop.
- Tool: kanban_create reads HERMES_SESSION_ID (with explicit override).
- CLI: 'hermes kanban list --session <id>' filter; JSON output exposes session_id.
…olumn
Salvages NousResearch#23772 by @thewillhuang. Adds 'review' as a valid kanban task status and extends dispatch_once to monitor the review column as a second dispatch source (in addition to the existing ready column).
- Adds 'review' to VALID_STATUSES
- Adds claim_review_task(), which atomically transitions review → running
- Adds has_spawnable_review() as a health telemetry mirror
- Extends dispatch_once with a review column dispatch loop
- Review agents get 'sdlc-review' skill auto-loaded
Resolved 2 conflicts (VALID_STATUSES merge with main's 'scheduled' state, test file additions). Adapted claim_review_task to main's ttl_seconds: Optional[int] = None convention (matches claim_task).
Salvages NousResearch#23790 by @thewillhuang. Adds detect_stale_running() to the dispatcher cycle. Tasks that have been running for longer than dispatch_stale_timeout_seconds (default 14400 = 4h) without a heartbeat in the last hour are auto-reclaimed to ready.
- New config kanban.dispatch_stale_timeout_seconds (default 14400, 0 disables)
- New 'stale' field on DispatchResult
- detect_stale_running() in kanban_db.py with heartbeat freshness check
- Records outcome='stale' on run close + 'stale' event; ticks failure counter
- Wires config through gateway embedded dispatcher
- Updates _cmd_dispatch verbose/JSON output and daemon logging
Resolved test-file end-of-file conflict by appending both halves.
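The staleness rule reads as a two-part predicate, sketched below. The 14400-second default, the one-hour heartbeat window, and the "0 disables" behaviour come from the description; the `is_stale` name, its signature, and the use of plain epoch seconds are assumptions.

```python
from typing import Optional

HEARTBEAT_FRESH_SECONDS = 3600  # "a heartbeat in the last hour"


def is_stale(
    started_at: float,
    last_heartbeat_at: Optional[float],
    now: float,
    stale_timeout_seconds: int = 14400,
) -> bool:
    """True when a running task should be reclaimed to ready."""
    if stale_timeout_seconds <= 0:
        return False  # 0 disables stale detection
    if now - started_at <= stale_timeout_seconds:
        return False  # has not been running long enough to be suspect
    if last_heartbeat_at is not None and now - last_heartbeat_at <= HEARTBEAT_FRESH_SECONDS:
        return False  # long-running, but the worker is still heartbeating
    return True
```

The heartbeat check is what lets a legitimately long task survive past the timeout: it is reclaimed only when it is both old and silent.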
Salvages NousResearch#26745 by @nehaaprasaad. Exposes filtering for the existing workflow_template_id and current_step_key columns: - list_tasks() accepts workflow_template_id and current_step_key kwargs - 'hermes kanban list' adds matching CLI flags - dashboard plugin_api also exposes the filters Resolved a small conflict in list_tasks signature alongside main's session_id and order_by additions; combined all three into the single filter list.
Salvages NousResearch#27484 by @fardoche6. Adds a respawn guard that skips worker spawn for tasks where: - a recent run already succeeded (recent_success — within guard window) - the previous run hit a quota/auth error (blocker_auth, also auto-blocks) - a recent task comment includes a GitHub PR URL (active_pr) The guard prevents repeat worker storms on the same bug/task. Includes the contributor's review-findings fixup (regex hardening, observability, auth coverage). Resolved a small DispatchResult conflict alongside main's 'stale' field; kept both. Authorship preserved via rebase merge.
Salvages NousResearch#27568 by @SerenityTn. Dashboard cron page now lists cron jobs from all profiles, with profile-aware filter UI and storage routing. Includes test coverage for cross-profile listing, mutation, deletion, and validation. Also fixes orphan conflict markers in config.py left by an earlier salvage merge (kanban.dispatch_stale_timeout_seconds was double-nested in HEAD/PR markers from NousResearch#28452 salvage of NousResearch#23790).
…rch#28458) PR NousResearch#28452 (salvage of NousResearch#23790, stale detection) merged with leftover git conflict markers in hermes_cli/config.py around the `dispatch_stale_timeout_seconds` config block, breaking config import and any code path that loads it. Cleans up the markers and keeps both config blocks (worker log rotation/orchestrator + stale detection). Resolves a self-introduced regression.
…rch#28459) PR NousResearch#28454 (salvage of NousResearch#26745, workflow filter) merged with leftover git conflict markers in hermes_cli/kanban.py at three sites: - _task_to_dict() (session_id alongside workflow_template_id/current_step_key) - p_list parser (--sort alongside --workflow-template-id/--step-key) - _cmd_list (order_by alongside the new filter kwargs) Cleans up the markers and keeps both halves at each site. Resolves a self-introduced regression.
Salvages NousResearch#26496 by @aqilaziz. Adds branch_name column + CLI flag so tasks with workspace_kind='worktree' can pin a target branch on create. Schema migration added to _migrate_add_optional_columns.
- Task.branch_name field + DB column + migration
- create_task accepts branch_name kwarg
- hermes kanban create --branch <name> flag
- kanban show output includes 'Branch: <name>' when set
Cherry-picked the substantive commit (a7558cf); the PR's tip was an unrelated service-path-dirs commit. Resolved 2 INSERT-column-list and show-output conflicts alongside main's session_id and max_runtime_seconds additions; kept all three.
…NousResearch#28373)
Skill bundles are tiny YAML files in ~/.hermes/skill-bundles/ that group several skills under one slash command. Invoking /<bundle-name> from any surface (CLI, TUI, dashboard, any gateway platform) loads every referenced skill into a single combined user message.
Use cases:
- /backend-dev → loads github-code-review + test-driven-development + github-pr-workflow as one bundle.
- /research → loads several research skills together.
- Team task profiles shared via dotfiles.
Behavior:
- Bundles take precedence over individual skills when slugs collide.
- Missing skills are skipped with a note, not fatal.
- No system-prompt mutation: bundles generate a fresh user message at invocation time, the same way /<skill> does. Prompt cache stays intact.
- Works in CLI dispatch, gateway dispatch, autocomplete (CLI + TUI), /help display.
Schema (~/.hermes/skill-bundles/<slug>.yaml):
    name: backend-dev
    description: Backend feature work.
    skills:
      - github-code-review
      - test-driven-development
    instruction: |
      Optional extra guidance prepended to the loaded skills.
New module: agent/skill_bundles.py (load, scan, resolve, build invocation message, save, delete). yaml.safe_load only; broken bundles log a warning and are skipped, never raise.
New CLI subcommand: hermes bundles {list,show,create,delete,reload}. Implementation in hermes_cli/bundles.py; wired in hermes_cli/main.py. 'bundles' added to _BUILTIN_SUBCOMMANDS so plugin discovery skips it.
New in-session slash command: /bundles lists installed bundles in both CLI and gateway. /<bundle-name> dispatch added to CLI (cli.py) and gateway (gateway/run.py) before the existing /<skill-name> path.
Autocomplete: SlashCommandCompleter gained an optional skill_bundles_provider parameter that defaults to None. The prompt shows '▣ <description> (N skills)' for bundles vs '⚡' for skills.
Tests:
- tests/agent/test_skill_bundles.py: 33 tests covering slugify, scan/cache freshness, resolve (including underscore→hyphen Telegram alias), build_bundle_invocation_message (loading, missing skills, user/bundle instruction injection, dedup), save/delete, reload diff, list sort.
- tests/hermes_cli/test_bundles.py: 8 tests for the CLI subcommand (create/list/show/delete/reload, --force, missing bundle errors).
- tests/gateway/test_bundles_command.py: 4 tests for the gateway handler and bundle resolution priority.
Live E2E: verified subprocess invocations of hermes bundles {list,create,show,reload,delete} round-trip correctly against an isolated HERMES_HOME.
Docs:
- website/docs/user-guide/features/skills.md: new 'Skill Bundles' section with quick example, YAML schema, management commands, behavior notes.
- website/docs/reference/cli-commands.md: 'hermes bundles' added to the top-level command table and given its own subcommand section.
Salvages NousResearch#24533 by @roycepersonalassistant. Adds a first-class 'scheduled' Kanban status for time-delay follow-ups that aren't waiting on human input. - hermes kanban schedule <task_id> [reason] CLI command - Dashboard/API transitions to/from Scheduled - unblock_task() now releases both 'blocked' AND 'scheduled' tasks (re-checking parent dependencies before moving to ready/todo) - i18n + docs updates Resolved conflicts: kept HEAD's failure-counter reset on unblock alongside the PR's scheduled state, kept HEAD's 'running' direct-set rejection, combined both bulk-status branches. Dropped the dist/ bundle changes (months-stale; would need rebuild from source).
Salvages NousResearch#28125 by @Jpalmer95. Adds: - Drag-to-delete trash zone in the kanban dashboard - Bulk delete endpoint with cascading delete_task cleanup - Frontend updates (drag visual + drop handler) - Confirmation prompt before delete Resolved end-of-file test conflict by appending both halves.
Salvages NousResearch#21823 by @pochi-gio. Adds Korean (ko) Docusaurus locale and translates Kanban documentation (kanban.md, kanban-tutorial.md) and the two related skills (devops-kanban-orchestrator, devops-kanban-worker). Purely additive — adds ko to the locales list in docusaurus.config.ts and creates the website/i18n/ko/ tree.
…nges (NousResearch#28465)
- aux_config: drop session_search from _AUX_TASKS and remove stale test (PR NousResearch#27590 removed auxiliary.session_search from DEFAULT_CONFIG)
- compression_boundary_hook: set compressor._last_compress_aborted=False on MagicMock so the post-compress abort branch (PR NousResearch#28117) doesn't short-circuit before the session-id rotation under test
- kanban_dashboard_plugin: use consecutive_failures=3 so severity stays 'error' (failure_threshold default dropped from 3 to 2 in d9fef0c, so failures=5 now crosses the critical floor of 2*2=4)
- cli_manual_compress: accept force kwarg on DummyAgent._compress_context (cli._manual_compress now passes force=True)
Only caller was the removed _save_session_log. Also removes the unused convert_scratchpad_to_think and has_incomplete_scratchpad imports from run_agent.py (both still used elsewhere via their own imports).
…tespace Adds TestNoSessionJsonSnapshot to lock the contract that session_log_file attribute, _save_session_log method, and the per-session JSON snapshot writer are gone. logs_dir is retained for request_dump_*.json. Also cleans up stray trailing whitespace in test_run_agent_codex_responses introduced when the _save_session_log stub line was deleted.
The email "jonny@nousresearch.com" belongs to @yoniebans (GitHub id 5584832, display name "jonny"), not to Jeffrey Quesnelle (@jquesnelle, id 687076, who commits as emozilla@nousresearch.com). Verified across all 60 historical commits on the repo authored from this email — every one of them was a yoniebans commit being mis-credited to jquesnelle in the changelog. Surfaced while salvaging PR NousResearch#29182 (yoniebans's session-log refactor).
PR NousResearch#29182 deleted the per-session JSON snapshot writer outright because state.db is canonical and the snapshots had no in-tree consumer. Some users have external tooling that reads `~/.hermes/sessions/session_{sid}.json` directly, so reintroduce the writer behind a config flag that defaults to off.
- Add `sessions.write_json_snapshots` (default False) to DEFAULT_CONFIG
- Restore `AIAgent._save_session_log` + `_clean_session_content` as gated methods. When the flag is off the call is a fast no-op; when on, the writer behaves as before (atomic write, truncation guard preserved, REASONING_SCRATCHPAD → think tag normalization)
- Re-derive the target path from `agent.session_id` on each call so `/branch` and `/compress` re-points happen automatically; there is no need to restore the explicit re-point bookkeeping at call sites
- Wire the single call site in `_persist_session` (the cleanup-on-exit hook). Did NOT restore the 7 intra-turn calls the original PR deleted: those were redundant writes within the same turn that doubled disk I/O without adding any persistence guarantee `_persist_session` does not already provide
- Read the flag once at agent init via `load_config()`, cache as `agent._session_json_enabled`
- Update `TestNoSessionJsonSnapshot` → `TestSessionJsonSnapshotOptIn` to pin behavior: default off (no file), opt-in true (file written), no-op method on default agents, logs_dir retained unconditionally
- Update CONTRIBUTING.md and the bundled `hermes-agent` skill to document the flag and its default
…ousResearch#29426)
`splitReasoning()` strips paired `<think>…</think>` blocks first, then runs an unclosed-trailing regex to catch reasoning that hasn't yet streamed its closer. That second regex was unanchored and greedy:
    new RegExp(`<${tag}>([\\s\\S]*)$`, 'i')
So any literal `<think>` somewhere in prose (a model quoting the tag, a code example, or a stream-mid-tag before the closer arrives) consumed every paragraph after it to EOF. User-visible symptom: "TUI eats last paragraph of output," both during streaming and on settled turns.
Real reasoning streams always lead the message (that's the only place an unclosed opener can legitimately appear during streaming). Anchor the regex to `^\s*` so mid-prose mentions of the tag are preserved.
Empirical repro before the fix:
    splitReasoning('final answer paragraph one.\n\n<think>internal note\n\nfinal answer paragraph two.')
    → text: 'final answer paragraph one.' ← paragraph two GONE
After:
    → text: 'final answer paragraph one.\n\n<think>internal note\n\nfinal answer paragraph two.'
Updated the existing trailing-unclosed test to lead with `<think>` (the real-world shape) and added a regression test pinning the mid-text case. ui-tui type-check clean, 808/808 vitest pass.
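The effect of the anchor can be reproduced outside the TUI. The sketch below is a Python port for illustration only, not the ui-tui TypeScript source: `split_reasoning` and its `(text, reasoning)` return shape are assumptions, and only the two regex shapes come from the commit message.

```python
import re
from typing import List, Tuple


def split_reasoning(text: str, tag: str = "think") -> Tuple[str, str]:
    """Separate reasoning from visible text; returns (text, reasoning)."""
    reasoning: List[str] = []

    # Step 1: strip paired <tag>...</tag> blocks wherever they appear.
    paired = re.compile(rf"<{tag}>([\s\S]*?)</{tag}>", re.IGNORECASE)

    def _take(match: "re.Match[str]") -> str:
        reasoning.append(match.group(1).strip())
        return ""

    text = paired.sub(_take, text)

    # Step 2: an unclosed opener counts as reasoning only when it leads the
    # message. Without the ^\s* anchor, a mid-prose <tag> would swallow
    # everything after it up to the end of the string.
    trailing = re.compile(rf"^\s*<{tag}>([\s\S]*)$", re.IGNORECASE)
    match = trailing.match(text)
    if match:
        reasoning.append(match.group(1).strip())
        text = ""

    return text.strip(), "\n\n".join(r for r in reasoning if r)
```

Running it on the repro string from the commit message leaves both answer paragraphs in the visible text, while a message that starts with an unclosed `<think>` is still treated entirely as reasoning.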
NousResearch#28975) Bumps [ws](https://github.com/websockets/ws) from 8.20.0 to 8.20.1. - [Release notes](https://github.com/websockets/ws/releases) - [Commits](websockets/ws@8.20.0...8.20.1) --- updated-dependencies: - dependency-name: ws dependency-version: 8.20.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…h#28889) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.5.6 to 7.6.0. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/protobufjs-v7.6.0/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.5.6...protobufjs-v7.6.0) --- updated-dependencies: - dependency-name: protobufjs dependency-version: 7.6.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [idna](https://github.com/kjd/idna) from 3.11 to 3.15. - [Release notes](https://github.com/kjd/idna/releases) - [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.md) - [Commits](kjd/idna@v3.11...v3.15) --- updated-dependencies: - dependency-name: idna dependency-version: '3.15' dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ault (NousResearch#29021)
* fix(deps): bump pydantic to 2.13.4 to avoid pydantic-core thread segfault
pydantic-core 2.41.5 (pulled by pydantic==2.12.5) segfaults when the OpenAI SDK's Responses API resource (client.responses.create / client.responses.stream) is exercised from a non-main threading.Thread. Hermes always dispatches codex_responses calls from a daemon thread in agent/chat_completion_helpers.py:_call, so the crash is 100% reproducible whenever the active provider is xai-oauth or openai-codex.
Symptom: `hermes -z "ping"` (or any oneshot path) dies with SIGSEGV / exit 139 and zero output, because hermes_cli/oneshot.py redirects stderr to /dev/null, hiding the crash.
Bumping pydantic to 2.13.4 pulls in pydantic-core 2.46.4, which eliminates the crash. Verified end-to-end: `hermes -z "ping"` against xai-oauth/grok-4.3 now returns the expected response.
Minimal repro (any OpenAI base_url; not xAI-specific):
    import threading
    from openai import OpenAI
    cli = OpenAI(api_key="sk-bogus", base_url="https://api.openai.com/v1")
    def go():
        try:
            cli.responses.create(model="gpt-4o", input="ping")
        except BaseException as e:
            print(type(e).__name__)
    threading.Thread(target=go).start()
    # → SIGSEGV with pydantic-core 2.41.5; clean 401 with 2.46.4
* chore(deps): regenerate uv.lock for pydantic 2.13.4 bump
state.db is canonical. The 'use whichever source is longer' branch was defensive code for the pre-DB migration; on every real DB it has not fired (verified on a session corpus with 27 jsonl files / 950 sessions — zero jsonl-bigger cases). Test changes: - TestLoadTranscriptCorruptLines: deleted (tested dead JSONL code path) - TestLoadTranscriptPreferLongerSource: deleted (tested removed fallback) - Replaced with TestLoadTranscriptDBOnly (DB-only reads) - TestSessionStoreRewriteTranscript: fixture now creates DB session - test_gateway_retry_replaces_last_user_turn: fixture uses real DB
Yuanbao's recall feature was reading the gateway JSONL directly to look up messages by platform message_id, which state.db does not preserve. Migrated to use load_transcript() which returns DB messages. Recall branch A1 (message_id match) now falls through to A2 (content match) or B (system note) for all sessions — a documented degradation. Follow-up issue: add platform_message_id column to state.db messages to restore exact-id matching.
…te_transcript state.db is canonical. JSONL transcripts were a transition fallback; the fallback was removed in the previous commit. Existing *.jsonl files on disk are left untouched.
Mirror messages are persisted via _append_to_sqlite. JSONL writer was a redundant dual-write. Updated test assertions from JSONL file checks to SQLite mock verification.
…db writes Fixtures that instantiate SessionStore() trigger SessionDB() with no args, which resolves to ~/.hermes/state.db via the DEFAULT_DB_PATH module constant (snapshot of get_hermes_home() at hermes_state import time). The autouse _hermetic_environment fixture in tests/conftest.py monkeypatches HERMES_HOME env, but DEFAULT_DB_PATH is already cached by then. Per-test monkeypatch.setattr(hermes_state, 'DEFAULT_DB_PATH', tmp_path/'state.db') forces the DB into tmp_path so the tests can't leak into the real profile. Verified by counting u1-prefixed sessions in real state.db before/after: delta=0.
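Why patching the environment alone is not enough can be shown with a small stand-in. This is a sketch under assumptions: `load_state_module` imitates a module that snapshots its home directory at import time, and it assumes (as the fix implies) that the no-argument constructor looks the constant up at call time rather than binding it as a default argument.

```python
import os
from pathlib import Path
from types import SimpleNamespace


def load_state_module() -> SimpleNamespace:
    """Stand-in for importing hermes_state: the home dir is read once, here."""
    home = Path(os.environ.get("HERMES_HOME", str(Path.home() / ".hermes")))
    mod = SimpleNamespace(DEFAULT_DB_PATH=home / "state.db")
    # Like SessionDB() with no args: the constant is looked up at call time.
    mod.resolve_db_path = lambda path=None: Path(path) if path else mod.DEFAULT_DB_PATH
    return mod


def demo():
    """Return (path seen after env-only patching, path seen after pinning)."""
    saved = os.environ.get("HERMES_HOME")
    try:
        os.environ["HERMES_HOME"] = "/real/profile"
        hermes_state = load_state_module()  # import-time snapshot

        os.environ["HERMES_HOME"] = "/tmp/test-home"  # env-only fixture: too late
        leaked = hermes_state.resolve_db_path()

        # Equivalent of monkeypatch.setattr(hermes_state, "DEFAULT_DB_PATH", ...)
        hermes_state.DEFAULT_DB_PATH = Path("/tmp/test-home") / "state.db"
        pinned = hermes_state.resolve_db_path()
        return leaked, pinned
    finally:
        # Restore the caller's environment.
        if saved is None:
            os.environ.pop("HERMES_HOME", None)
        else:
            os.environ["HERMES_HOME"] = saved
```

The first path still points at the original profile even though the environment variable changed; only replacing the cached constant redirects the database.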
…fixture PR NousResearch#29211 review findings: 1. test_retry_replacement: pin DEFAULT_DB_PATH so SessionDB() doesn't write to the real ~/.hermes/state.db. Same fix as the other DB-only fixtures. 2. yuanbao recall branch A1 (message_id exact match) was structurally dead once load_transcript() became DB-only — state.db never preserves the platform message_id. Removed the dead loop, consolidated to a single content-match branch (renamed 'A: content match'). Branch B (system note) unchanged. Updated the test name + docstring to reflect this. Note: self._lock is no longer taken in append_to_transcript (was guarding the JSONL file append). SQLite append_message handles its own concurrency via WAL mode, so this is safe; flagging for awareness.
… recall
PR NousResearch#29211 dropped JSONL gateway transcripts and noted that the platform's own `message_id` field (used by Yuanbao's recall guard to redact a message by exact platform id) was no longer preserved, falling back to content-match. That fallback works for the common case but redacts the wrong row when two messages share text (or fails to match when content is post-processed).
Restore exact-id matching by giving state.db a column for it:
- New `platform_message_id TEXT` column on the messages table (SCHEMA_VERSION bump 11 → 12; column added via declarative reconciler on existing DBs, no version-gated migration block needed)
- Partial index `idx_messages_platform_msg_id` on (session_id, platform_message_id) to keep recall's point-lookup cheap even on large sessions
- `append_message()` and `replace_messages()` accept the new value: the gateway-facing `append_to_transcript` in `gateway/session.py` forwards either `message["platform_message_id"]` or the legacy `message["message_id"]` key (yuanbao's existing convention)
- `get_messages_as_conversation()` surfaces the column back on the message dict as `message_id` so platform code reads the same shape it used to read from JSONL
- Yuanbao `_patch_transcript`: restore branch A1 (exact id match) ahead of A2 (content match) ahead of B (system-note). Both branches log which one fired so operators can tell from gateway.log whether recall hit the canonical path or had to fall back.
Tests:
- New low-level round-trip tests in `test_hermes_state.py` for both `append_message` and `replace_messages` paths
- The PR's `test_yuanbao_recall_db_only.py` was rewritten to assert the new contract: branch A1 (id match) works against DB-only transcripts, and branch A2 (content match) still recovers rows that were observed without a platform id (e.g. agent-processed @bot messages where run.py doesn't carry msg_id through)
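The column plus partial index can be sketched with the standard-library sqlite3 module. The column and index names are taken from the description; the minimal messages schema, the `WHERE platform_message_id IS NOT NULL` predicate, and the helper names are assumptions, and this is not the project's declarative reconciler.

```python
import sqlite3
from typing import Optional


def add_platform_message_id(conn: sqlite3.Connection) -> None:
    """Idempotently add the column and its partial index."""
    cols = {row[1] for row in conn.execute("PRAGMA table_info(messages)")}
    if "platform_message_id" not in cols:
        conn.execute("ALTER TABLE messages ADD COLUMN platform_message_id TEXT")
    # Partial index: rows without a platform id take no index space.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_messages_platform_msg_id "
        "ON messages (session_id, platform_message_id) "
        "WHERE platform_message_id IS NOT NULL"
    )


def find_by_platform_id(
    conn: sqlite3.Connection, session_id: str, platform_message_id: str
) -> Optional[int]:
    """Exact-id point lookup (the branch A1 shape); None when nothing matches."""
    row = conn.execute(
        "SELECT id FROM messages WHERE session_id = ? AND platform_message_id = ?",
        (session_id, platform_message_id),
    ).fetchone()
    return row[0] if row else None


def demo_conn() -> sqlite3.Connection:
    """Two messages with identical text, distinguishable only by platform id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, session_id TEXT, content TEXT)")
    add_platform_message_id(conn)
    add_platform_message_id(conn)  # second call is a no-op
    conn.executemany(
        "INSERT INTO messages (session_id, content, platform_message_id) VALUES (?, ?, ?)",
        [("s1", "hi", "m-1"), ("s1", "hi", "m-2"), ("s1", "no id", None)],
    )
    return conn
```

The demo board holds two rows with the same content, which is exactly the case where content matching would redact the wrong row and an id lookup would not.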
The xAI Responses API for x_search returns 200 OK with a
synthesized fluff answer in two failure modes that callers currently
cannot distinguish from a real, citation-backed result:
1. Any narrowing filter (allowed_x_handles, excluded_x_handles,
from_date, to_date) was active, but the X index returned no
matching posts. The model then answers from training data.
2. The date range is malformed, inverted, or pure-future (e.g.
from_date=2030-01-01). The API call burns quota and Grok
responds with a generic answer.
Mitigations, both client-side:
* Validate from_date / to_date before the HTTP call:
- Strict YYYY-MM-DD.
- from_date <= to_date when both set.
- from_date <= today UTC (no posts in a window that hasn't
started). to_date in the future remains allowed so callers
can request 'from yesterday to tomorrow'.
* Add 'degraded' + 'degraded_reason' to successful responses.
degraded=True iff any narrowing filter was active AND both the
top-level 'citations' array and inline 'url_citation'
annotations came back empty. A broad query with no filters that
returns no citations is *not* flagged degraded — that case is
just an unsourced answer, not a filter miss.
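Both mitigations can be sketched as pure functions. The rules are the ones listed above; `validate_date_range`, `degraded_reason`, the reason string, and the dict-based filter argument are hypothetical and not the tool's actual interface.

```python
import re
from datetime import date, datetime, timezone
from typing import Optional, Sequence

_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
NARROWING_FILTERS = ("allowed_x_handles", "excluded_x_handles", "from_date", "to_date")


def _parse_date(name: str, value: str) -> date:
    """Strict YYYY-MM-DD only; fromisoformat alone accepts other ISO shapes."""
    if not _DATE_RE.fullmatch(value):
        raise ValueError(f"{name} must be YYYY-MM-DD, got {value!r}")
    try:
        return date.fromisoformat(value)
    except ValueError:
        raise ValueError(f"{name} is not a real calendar date: {value!r}") from None


def validate_date_range(
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
    today: Optional[date] = None,
) -> None:
    """Raise ValueError before any HTTP call would be made."""
    today = today or datetime.now(timezone.utc).date()
    start = _parse_date("from_date", from_date) if from_date else None
    end = _parse_date("to_date", to_date) if to_date else None
    if start and end and start > end:
        raise ValueError("from_date must be <= to_date")
    if start and start > today:
        raise ValueError("from_date is in the future")
    # A future to_date is deliberately allowed.


def degraded_reason(
    filters: dict, citations: Sequence, inline_citations: Sequence
) -> Optional[str]:
    """Non-None iff a narrowing filter was active and no citation of either kind came back."""
    active = [name for name in NARROWING_FILTERS if filters.get(name)]
    if active and not citations and not inline_citations:
        return "no citations returned for filters: " + ", ".join(active)
    return None
```

A caller would set `degraded = degraded_reason(...) is not None` on the success response, so a broad query without citations stays unflagged.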
Tests cover all four validation paths plus six degraded-flag
scenarios (each filter type, inline vs top-level citation
recovery, broad query baseline). All existing tests continue to
pass; the additions are purely additive on the success-path
response shape.
Discovered while testing the x_search toolset end-to-end:
queries scoped to @teknium1 returned confident-sounding generic
text about Nous Research with zero citations, and from_date in
2030 produced sassy non-answers. Both are now detectable by the
caller.
…degraded-flag Merged after self-review + local verification of date validation and degraded flag. All tests pass, claims confirmed end-to-end.
Browse.sh exposes skills by task name (e.g. "search-listings"), which is shared across hundreds of sites. Deduplicating by name silently dropped every browse-sh skill after the first one with a given task name — e.g. only Airbnb's "search-listings" would survive, collapsing Booking.com, Zillow, and every other site's variant into nothing. Switch unified_search() and do_browse() to use r.identifier as the dedup key. identifier is always globally unique (e.g. "browse-sh/airbnb.com/search-listings-ddgioa"), so same-named skills from different browse-sh hostnames are preserved as distinct results. Update existing TestUnifiedSearchDedup tests to model the real scenario (same identifier appearing from two sources) and add a regression test that asserts browse-sh skills with the same name but different hostnames are never collapsed.
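The change of dedup key can be sketched in a few lines. `SkillResult` and `dedup_by_identifier` are hypothetical stand-ins for the real result type and the loops inside unified_search() and do_browse(); the Booking.com identifier is invented for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SkillResult:
    name: str        # task name, shared across many sites
    identifier: str  # globally unique
    source: str


def dedup_by_identifier(results: Iterable[SkillResult]) -> List[SkillResult]:
    """Keep the first result per identifier, preserving order."""
    seen = set()
    unique = []
    for r in results:
        if r.identifier in seen:
            continue
        seen.add(r.identifier)
        unique.append(r)
    return unique


results = [
    SkillResult("search-listings", "browse-sh/airbnb.com/search-listings-ddgioa", "browse-sh"),
    # Hypothetical identifier for a second site publishing the same task name.
    SkillResult("search-listings", "browse-sh/booking.com/search-listings-example", "browse-sh"),
    # Same identifier arriving from a second source: a true duplicate.
    SkillResult("search-listings", "browse-sh/airbnb.com/search-listings-ddgioa", "index"),
]
unique = dedup_by_identifier(results)
```

Keyed on `name`, this list would collapse to a single entry; keyed on `identifier`, only the genuine duplicate is dropped.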
browse_skills() is the TUI gateway's API for the web UI skills browser (tui_gateway/server.py:6574). It had the same dedup-by-name bug as do_browse() and unified_search() fixed in the parent commit: r.name is not unique for browse-sh skills (Airbnb, Booking.com, Zillow all publish "search-listings"), so the dedup loop silently dropped all but the first skill with each task name. Switch to r.identifier, which is always globally unique. Add a regression test asserting that two browse-sh skills with the same name but different hostnames both appear in the browse_skills() result.
…tch path
Sibling fix on top of @EloquentBrush0x's PR NousResearch#29441.
- tools/skills_hub.py: GitHubSource.search() had the same r.name dedup bug. Two configured GitHub taps publishing same-named skills would collapse to one.
- tests/hermes_cli/test_skills_hub.py:test_browse_skills_dedup_uses_identifier_not_name patched hermes_cli.skills_hub.create_source_router, but browse_skills() imports it locally from tools.skills_hub. Fixed patch path.
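The patch-path half of this fix is the usual `unittest.mock.patch` rule: patch the name where it is looked up. A self-contained sketch with two throwaway modules standing in for tools.skills_hub and hermes_cli.skills_hub. The `demo_*` module names are hypothetical, and the re-export on the CLI module is an assumption about why the old patch target resolved at all.

```python
import sys
import types
from unittest.mock import patch

# Stand-in for tools.skills_hub, where the function is defined.
tools_mod = types.ModuleType("demo_tools_hub")
tools_mod.create_source_router = lambda: "real router"
sys.modules["demo_tools_hub"] = tools_mod

# Stand-in for hermes_cli.skills_hub, assumed to hold its own reference.
cli_mod = types.ModuleType("demo_cli_hub")
cli_mod.create_source_router = tools_mod.create_source_router
sys.modules["demo_cli_hub"] = cli_mod


def browse_skills():
    # Function-local import: the attribute is read from demo_tools_hub at call time.
    from demo_tools_hub import create_source_router
    return create_source_router()


# Patching the CLI module's reference leaves the local import untouched.
with patch("demo_cli_hub.create_source_router", return_value="fake router"):
    wrong_target = browse_skills()

# Patching the defining module is what the local import actually sees.
with patch("demo_tools_hub.create_source_router", return_value="fake router"):
    right_target = browse_skills()
```

In this sketch `wrong_target` still comes from the real function, which is how a test patched at the wrong path can pass without exercising the mock at all.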
…-2026-05-21 # Conflicts: # plugins/observability/langfuse/plugin.yaml
Important
Review skipped
Too many files! This PR contains 296 files, which is 146 over the limit of 150. To get a review, narrow the scope:
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
⛔ Files ignored due to path filters (4)
📒 Files selected for processing (296)
🚨 CRITICAL Supply Chain Risk Detected
This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.
🚨 CRITICAL: Install-hook file added or modified
These files can execute code during package installation or interpreter startup.
Files:
Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally; if you're seeing this comment, the finding is worth inspecting.
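To make the "high-signal indicators" concrete, here is a rough sketch of that style of check. It is not the scanner that posted this comment: the hook file names, regexes, and finding labels are all assumptions, and the subprocess-with-encoded-commands case is left out.

```python
import re
from pathlib import PurePosixPath

# Assumed hook names; the real scanner's list is not shown in this thread.
INSTALL_HOOK_NAMES = {"setup.py", "sitecustomize.py", "usercustomize.py"}
B64_DECODE = re.compile(r"b64decode|base64\s+(-d|--decode)")
DYNAMIC_EXEC = re.compile(r"\b(exec|eval)\s*\(")


def high_signal_findings(path: str, content: str) -> list[str]:
    """Return labels for the high-signal patterns found in one changed file."""
    findings = []
    p = PurePosixPath(path)
    if p.suffix == ".pth":
        # .pth files can run import lines at interpreter startup.
        findings.append("pth-file")
    if p.name in INSTALL_HOOK_NAMES:
        # These files can run at install time or interpreter startup.
        findings.append("install-hook")
    if B64_DECODE.search(content) and DYNAMIC_EXEC.search(content):
        # Decoding plus dynamic execution in the same file.
        findings.append("base64+exec")
    return findings


hook_hit = high_signal_findings("setup.py", "from setuptools import setup\nsetup()")
```

A hit like `hook_hit` only says the file deserves a human look; a plain packaging `setup.py` would trigger it as readily as a malicious one.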
🔎 Lint report:
| Rule | Count |
|---|---|
| unresolved-attribute | 335 |
| unresolved-import | 98 |
| invalid-argument-type | 76 |
| invalid-assignment | 49 |
| unsupported-operator | 22 |
| invalid-parameter-default | 12 |
| unresolved-reference | 8 |
| unresolved-global | 5 |
| invalid-type-form | 3 |
| unused-type-ignore-comment | 3 |
| invalid-return-type | 3 |
| no-matching-overload | 3 |
| not-subscriptable | 2 |
| unknown-argument | 2 |
| call-non-callable | 1 |
First entries
agent/agent_init.py:1223: [invalid-argument-type] invalid-argument-type: Argument to function `get_custom_provider_context_length` is incorrect: Expected `str`, found `str | dict[str, str] | Any`
optional-skills/research/darwinian-evolver/templates/custom_problem_template.py:32: [unresolved-import] unresolved-import: Cannot resolve imported module `darwinian_evolver.evolve_problem_loop`
run_agent.py:3077: [unresolved-attribute] unresolved-attribute: Object of type `Self@_has_stream_consumers` has no attribute `stream_delta_callback`
agent/agent_init.py:113: [invalid-type-form] invalid-type-form: Function `callable` is not valid in a parameter annotation: Did you mean `collections.abc.Callable`?
hermes_cli/xai_retirement.py:84: [invalid-argument-type] invalid-argument-type: Argument is incorrect: Expected `str`, found `str | None`
gateway/platforms/api_server.py:623: [invalid-assignment] invalid-assignment: Object of type `None` is not assignable to `def create_job(prompt: str | None, schedule: str, name: str | None = None, repeat: int | None = None, deliver: str | None = None, origin: dict[str, Any] | None = None, skill: str | None = None, skills: list[str] | None = None, model: str | None = None, provider: str | None = None, base_url: str | None = None, script: str | None = None, context_from: str | list[str] | None = None, enabled_toolsets: list[str] | None = None, workdir: str | None = None, profile: str | None = None, no_agent: bool = False) -> dict[str, Any]`
tests/gateway/test_agent_cache.py:416: [unresolved-attribute] unresolved-attribute: Object of type `AIAgent` has no attribute `reasoning_config`
hermes_cli/config.py:3893: [unresolved-attribute] unresolved-attribute: Attribute `get` is not defined on `str`, `list[Unknown]`, `list[str]`, `int` in union `str | dict[Unknown, Unknown] | list[Unknown] | ... omitted 27 union elements`
tests/cron/test_cron_profile.py:153: [invalid-argument-type] invalid-argument-type: Method `__getitem__` of type `Overload[(key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> LiteralString, (key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> str]` cannot be called with key of type `Literal["properties"]` on object of type `str`
cli.py:5051: [unresolved-attribute] unresolved-attribute: Object of type `AIAgent & ~AlwaysFalsy` has no attribute `_checkpoint_mgr`
acp_adapter/session.py:627: [unresolved-attribute] unresolved-attribute: Unresolved attribute `_print_fn` on type `AIAgent`
gateway/run.py:16949: [invalid-assignment] invalid-assignment: Invalid subscript assignment with key of type `Literal["title_callback"]` and value of type `(title: str) -> None` on object of type `dict[str, ((task: str, exc: BaseException) -> None) | dict[str, Any | None] | None]`
run_agent.py:3267: [unresolved-attribute] unresolved-attribute: Object of type `Self@_get_transport` has no attribute `api_mode`
tests/run_agent/test_steer.py:192: [unresolved-attribute] unresolved-attribute: Unresolved attribute `_execution_thread_id` on type `AIAgent`
tests/gateway/test_base_topic_sessions.py:346: [unresolved-attribute] unresolved-attribute: Object of type `bound method DummyTelegramAdapter.play_tts(chat_id: str, audio_path: str, **kwargs) -> CoroutineType[Any, Any, SendResult]` has no attribute `await_args`
gateway/run.py:16880: [invalid-assignment] invalid-assignment: Object of type `object` is not assignable to attribute `session_id` on type `SessionEntry & ~AlwaysFalsy`
tests/gateway/test_telegram_channel_posts.py:28: [unresolved-attribute] unresolved-attribute: Unresolved attribute `InlineKeyboardMarkup` on type `ModuleType`
tests/gateway/test_telegram_channel_posts.py:25: [unresolved-attribute] unresolved-attribute: Unresolved attribute `Bot` on type `ModuleType`
tests/tools/test_process_registry.py:923: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["3 earlier matches were suppressed"]` and `str | None`
run_agent.py:617: [unresolved-attribute] unresolved-attribute: Object of type `Self@_safe_print` has no attribute `_print_fn`
run_agent.py:1662: [unresolved-attribute] unresolved-attribute: Object of type `Self@interrupt` has no attribute `quiet_mode`
plugins/web/xai/provider.py:189: [unresolved-attribute] unresolved-attribute: Attribute `strip` is not defined on `None` in union `Any | None | Literal["grok-4.3"]`
tests/hermes_cli/test_destructive_slash_confirm_gate.py:32: [unresolved-attribute] unresolved-attribute: Attribute `get` is not defined on `str`, `list[Unknown]`, `list[str]`, `int` in union `str | dict[Unknown, Unknown] | list[Unknown] | ... omitted 27 union elements`
run_agent.py:894: [unresolved-attribute] unresolved-attribute: Object of type `Self@_resolved_api_call_stale_timeout_base` has no attribute `model`
run_agent.py:3039: [unresolved-attribute] unresolved-attribute: Object of type `Self@_fire_stream_delta` has no attribute `_stream_callback`
... and 597 more
✅ Fixed issues (197):
| Rule | Count |
|---|---|
| invalid-argument-type | 94 |
| invalid-assignment | 38 |
| unresolved-attribute | 30 |
| unsupported-operator | 16 |
| unresolved-import | 5 |
| unresolved-reference | 5 |
| not-subscriptable | 3 |
| invalid-return-type | 2 |
| no-matching-overload | 2 |
| invalid-method-override | 2 |
First entries
run_agent.py:1659: [invalid-assignment] invalid-assignment: Invalid subscript assignment with key of type `Literal["timeout"]` and value of type `int | float` on object of type `dict[str, str | dict[str, str]]`
hermes_cli/config.py:3710: [unresolved-attribute] unresolved-attribute: Attribute `get` is not defined on `str`, `list[Unknown]`, `list[str]`, `int` in union `str | dict[Unknown, Unknown] | list[Unknown] | ... omitted 26 union elements`
tests/tools/test_session_search.py:467: [invalid-argument-type] invalid-argument-type: Argument to function `session_search` is incorrect: Expected `int`, found `Literal["2"]`
tests/hermes_cli/test_aux_config.py:46: [invalid-argument-type] invalid-argument-type: Method `__getitem__` of type `Overload[(key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> LiteralString, (key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> str]` cannot be called with key of type `Literal["session_search"]` on object of type `str`
cli.py:11025: [unresolved-attribute] unresolved-attribute: Attribute `_active_children` is not defined on `None` in union `AIAgent | None`
run_agent.py:13259: [invalid-argument-type] invalid-argument-type: Argument to function `estimate_usage_cost` is incorrect: Expected `str | None`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
tools/session_search_tool.py:375: [invalid-argument-type] invalid-argument-type: Argument to bound method `SessionDB.search_messages` is incorrect: Expected `list[str]`, found `None | list[str]`
run_agent.py:6022: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["/"]` and `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
run_agent.py:2692: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `(Unknown & ~AlwaysFalsy) | (str & ~AlwaysFalsy) | (dict[str, str] & ~AlwaysFalsy) | ... omitted 4 union elements`
tests/tools/test_browser_console.py:265: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["record_sessions"]` and `str | dict[Unknown, Unknown] | list[Unknown] | ... omitted 26 union elements`
tests/run_agent/test_anthropic_error_handling.py:18: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
run_agent.py:1743: [invalid-argument-type] invalid-argument-type: Argument to function `resolve_provider_client` is incorrect: Expected `str`, found `Unknown | None`
tests/agent/test_curator.py:855: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["curator"]` and `str | dict[Unknown, Unknown] | list[Unknown] | ... omitted 26 union elements`
tests/run_agent/test_fallback_model.py:11: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
run_agent.py:11665: [invalid-argument-type] invalid-argument-type: Argument to function `_fixed_temperature_for_model` is incorrect: Expected `str | None`, found `str | Unknown | dict[Unknown | str, Unknown | str | dict[str, str]] | int | dict[Unknown, Unknown]`
hermes_cli/auth.py:4114: [invalid-argument-type] invalid-argument-type: Argument to function `get_api_key_provider_status` is incorrect: Expected `str`, found `(str & ~Literal["spotify"] & ~Literal["nous"] & ~Literal["openai-codex"] & ~Literal["qwen-oauth"] & ~Literal["google-gemini-cli"] & ~Literal["minimax-oauth"] & ~Literal["copilot-acp"]) | None`
run_agent.py:8135: [unresolved-attribute] unresolved-attribute: Attribute `append` is not defined on `None` in union `None | list[Unknown]`
run_agent.py:10688: [invalid-argument-type] invalid-argument-type: Argument to function `session_search` is incorrect: Expected `str`, found `Unknown | None`
tests/cron/test_codex_execution_paths.py:77: [invalid-assignment] invalid-assignment: Object of type `(messages) -> None` is not assignable to attribute `_save_session_log` of type `def _save_session_log(self, messages: list[dict[str, Any]] = None) -> Unknown`
run_agent.py:11737: [unresolved-attribute] unresolved-attribute: Attribute `strip` is not defined on `dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy`, `int & ~AlwaysFalsy`, `dict[Unknown, Unknown] & ~AlwaysFalsy` in union `(str & ~AlwaysFalsy) | (Unknown & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | ... omitted 3 union elements`
tools/environments/local.py:488: [unresolved-attribute] unresolved-attribute: Unresolved attribute `_hermes_pgid` on type `Popen[str]`
run_agent.py:7886: [invalid-assignment] invalid-assignment: Invalid subscript assignment with key of type `Literal["response"]` and value of type `SimpleNamespace` on object of type `dict[str, None | list[Unknown]]`
tests/tools/test_session_search.py:327: [unresolved-attribute] unresolved-attribute: Unresolved attribute `SessionDB` on type `ModuleType`
tests/agent/test_codex_cloudflare_headers.py:181: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["originator"]` and `(Unknown & ~AlwaysFalsy) | (str & ~AlwaysFalsy) | (dict[str, str] & ~AlwaysFalsy) | ... omitted 3 union elements`
tests/hermes_cli/test_destructive_slash_confirm_gate.py:32: [unresolved-attribute] unresolved-attribute: Attribute `get` is not defined on `str`, `list[Unknown]`, `list[str]`, `int` in union `str | dict[Unknown, Unknown] | list[Unknown] | ... omitted 26 union elements`
... and 172 more
Unchanged: 4129 pre-existing issues carried over.
Diagnostics are surfaced as warnings — this check never fails the build.
Summary
Routine upstream sync. Brings in 727 commits from NousResearch/hermes-agent (since our last merge at c44309230 on 2026-05-13). Includes the v0.14.0 "Foundation Release" (808 commits / 633 PRs / 545 issues closed from the upstream tag).
Conflicts (1 trivial)
- plugins/observability/langfuse/plugin.yaml: version: "1.1.0" + Wizarck-fork description (the cost-attribution metadata patch from PR #2). Upstream bumped to 1.0.0 with the plain description; discarded that.
Auto-merged (no manual touch)
- agent/prompt_builder.py, gateway/{config,run}.py, tools/cronjob_tools.py, toolsets.py
- tests/run_agent/test_provider_parity.py, tests/test_hermes_state.py
- ui-tui/src/components/thinking.tsx
- website/docs/
Untouched by upstream (our patches survive byte-identical)
- plugins/observability/langfuse/__init__.py (v1.1.0 cost-by-tag patch)
- gateway/platforms/{whatsapp_via_mcp_meta_business_api, web_via_http_sse, base, telegram}.py
- tools/send_message_tool.py
- hermes_cli/{status, gateway, platforms}.py
- cron/scheduler.py
- deploy/eligia-vps/*
- plugins/model-providers/anthropic/ plugin (unchanged upstream; still supports CLAUDE_CODE_OAUTH_TOKEN)
Notable upstream changes
- agent/anthropic_adapter.py)
- code_verifier (fcd9011f8)
- hermes migrate xai
- state.db is now canonical; legacy JSONL/session_log_file removed
Verification
- api.anthropic.com: issue "[Bug]: On a Claude Max 20x subscription with a valid OAuth access token from ~/.claude/.credentials.json, every Hermes request to native Anthropic (provider: anthropic, https://api.anthropic.com/v1/messages) is rejected with HTTP 400" (NousResearch/hermes-agent#15080) does not reproduce in Haiku 4.5 on Max 20x as of 2026-05-21. Tested with both Claude Code mimicry headers and plain headers; both routed to the Max lane (confirmed via the anthropic-ratelimit-unified-5h-utilization response header).
- test + nix-ubuntu pre-existing failures).
Test plan
- CLAUDE_CODE_OAUTH_TOKEN (separate follow-up to migrate the env var on the VPS).
🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>