Skip to content

chore: sync with upstream main (2026-05-18)#36

Merged
bot-ted merged 1777 commits into
mainfrom
sync/upstream-20260518
May 18, 2026
Merged

chore: sync with upstream main (2026-05-18)#36
bot-ted merged 1777 commits into
mainfrom
sync/upstream-20260518

Conversation

@bot-ted

@bot-ted bot-ted commented May 18, 2026

Copy link
Copy Markdown
Owner

Daily sync with upstream. Auto-created by cron job.

Commits synced: 8718 commits

Recent commits (first 20):

609c485 Merge pull request NousResearch#27971 from NousResearch/austin/fix/goal-statusbar
d9b6f75 refactor(bootstrap): consolidate ACP browser bootstrap into install.{sh,ps1} (NousResearch#27851)
e3a254d feat(dep_ensure): complete Windows bootstrap — dep_ensure + install.ps1 + detection (NousResearch#27845)
6f5ec92 feat(config): add install-method stamping + Docker detection (NousResearch#27843)
f2fdb9a feat(gateway): deliverable mode — ship artifacts as native uploads from any agent surface (NousResearch#27813)
dadc8aa fix(kanban): surface unusable triage auxiliary model (auto-decompose aware) (NousResearch#27871)
d9fef0c fix(kanban): align failure diagnostics with retry limit
6e60a8a feat(kanban): make worker log retention configurable
8831eb5 fix(kanban): align worker terminal timeout with task runtime
0292398 fix(acp): use modes for edit auto-approval
f70e0b8 feat(acp): add session-scoped edit auto-approval
49b28d1 fix(acp): avoid duplicate edit approval diffs
9592e59 feat(acp): require approval for editor file edits
060ec02 docs: add ACP Zed edit approval diffs plan
0fa46c6 fix(yuanbao): persist message_id on @bot user transcript writes
41f1edd refactor(doctor): extract section banner + fail-and-issue helpers (NousResearch#27830)
94c523f docs(session_search): update all docs for the single-shape rewrite (NousResearch#27840)
ff07873 fix(skills): load symlinked skill slash commands
abf1af5 feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (NousResearch#27590)
4a3f13b perf(prompt-cache): date-only timestamp + loud gateway-DB roundtrip logging

teknium1 and others added 30 commits May 16, 2026 01:24
…earch#26829)

Port three hardening patches from Claude Code 2.1.113's expanded deny
rules to hermes' detect_dangerous_command() pattern list.

1. macOS /private/{etc,var,tmp,home} system paths
   /etc, /var, /tmp, /home are symlinks to /private/<name> on macOS.
   A write to /private/etc/sudoers works identically to /etc/sudoers
   but bypassed the plain /etc/ pattern check. Extracted a shared
   _SYSTEM_CONFIG_PATH fragment so /etc/ and the /private/ mirror
   stay in sync across redirect / tee / cp / mv / install / sed -i
   patterns.

2. killall -9 / -KILL / -SIGKILL / -s KILL / -r <regex>
   Parallel to the existing pkill -9 pattern. killall -9 against
   non-hermes processes was previously unprotected, and killall -r
   can sweep unrelated processes matching a regex.

3. find -execdir rm
   Same destructive effect as find -exec rm but ran in each match's
   directory. The previous pattern required a literal '-exec ' so
   -execdir slipped through.

Guarded by 32 new test cases in 4 test classes:
  - TestMacOSPrivateSystemPaths  (11 cases)
  - TestKillallKillSignals       (9 cases)
  - TestFindExecdir              (4 cases)
  - TestEtcPatternsUnaffectedByRefactor  (6 regression guards on
    the existing /etc/ coverage after the _SYSTEM_CONFIG_PATH refactor)

Inspiration: https://github.com/anthropics/claude-code/releases
(Claude Code 2.1.113, April 17 2026 - "Enhanced deny rules" and
"Dangerous path protection")
…rsions (NousResearch#26830)

Closes NousResearch#10695. Picks up the still-vulnerable Python pins on current main:

- aiohttp 3.13.3 -> 3.13.4 (messaging, slack, homeassistant, sms extras +
  lazy_deps platform.slack) — CVE-2026-34513 (DNS cache exhaustion),
  CVE-2026-34518 (cookie/proxy-auth leak on cross-origin redirect, relevant
  for the gateway since it handles OAuth tokens), CVE-2026-34519 (response
  reason injection), CVE-2026-34520 (null bytes in headers), CVE-2026-34525
  (multiple Host headers).
- anthropic 0.86.0 -> 0.87.0 (anthropic extra + lazy_deps provider.anthropic)
  — CVE-2026-34450 (memory tool files created mode 0o666),
  CVE-2026-34452 (path-traversal in async local-filesystem memory tool).
  Not directly exploitable since hermes-agent doesn't use the SDK's
  filesystem memory tool, but the SDK is bumped for hygiene.
- cryptography pinned explicitly at 46.0.7 in core dependencies —
  CVE-2026-39892 (buffer overflow on non-contiguous buffers). Previously
  came in transitively via PyJWT[crypto]; the explicit floor keeps the
  WeCom/Weixin crypto paths from drifting below the fix.

curl-cffi from the original issue is no longer in pyproject.toml or uv.lock,
so no action needed there.

uv.lock regenerated cleanly; only aiohttp / anthropic / cryptography moved.

Credit: original issue + scoping by @shaun0927 (NousResearch#10695, NousResearch#10701).
Floor analysis and packaging-surface audit by @gnanirahulnutakki (NousResearch#10784),
adapted to current main's exact-pin style.

Co-authored-by: shaun0927 <shaun0927@users.noreply.github.com>
Co-authored-by: Gnani Rahul Nutakki <gnanirahulnutakki@users.noreply.github.com>
…arch#355) (NousResearch#26729)

* feat(skills): add osint-investigation optional skill (closes NousResearch#355)

Phase-1 public-records OSINT investigation framework adapted from
ShinMegamiBoson/OpenPlanter (MIT). Lives in optional-skills/research/.

Six data-source wiki entries (FEC, SEC EDGAR, USAspending, Senate LD,
OFAC SDN, ICIJ Offshore Leaks), each following the 9-section template:
summary, access, schema, coverage, cross-reference keys, data quality,
acquisition, legal, references.

Six stdlib-only acquisition scripts that emit normalized CSV, plus three
analysis scripts:

  - entity_resolution.py  — three-tier match (exact / fuzzy / token overlap)
                            with explicit confidence per row
  - timing_analysis.py    — permutation test for donation/contract timing
                            correlation, joins through cross-links
  - build_findings.py     — assembles structured findings.json with
                            evidence chains pointing back to source rows

Validation: full pipeline runs end-to-end on synthetic fixtures. Entity
resolution found 24 cross-matches with 0 false positives on a 5-row /
4-row test set. Timing analysis on 5 donations clustered near 3 awards
returned p=0.000, effect size 2.41 SD. Findings JSON correctly tags
HIGH-severity timing pattern. All 9 scripts pass --help and py_compile.

Docs site page auto-generated by website/scripts/generate-skill-docs.py;
sidebar + catalog entries updated by the same generator.

* fix(osint-investigation): live API fixes from end-to-end sweep

Live-tested the skill on a real public-citizen query and found three bugs
the synthetic E2E missed. All three are now fixed and re-verified.

1. FEC fetch hung on contributor name searches.
   The combination of two_year_transaction_period + sort=date +
   contributor_name puts the OpenFEC query plan on a slow path that the
   upstream gateway times out (25s+). Switched to min_date/max_date with no
   explicit sort. Renamed --candidate to --contributor (the original name
   was misleading: FEC searches by donor, not by candidate; --candidate is
   kept as a deprecated alias). Added --state filter for narrowing.

2. ICIJ Offshore Leaks reconcile endpoint returns 404.
   ICIJ removed the Open Refine reconciliation API. Rewrote
   fetch_icij_offshore.py to download the official bulk CSV ZIP (~70 MB,
   public, no auth) and search it locally. Cached under
   $HERMES_OSINT_CACHE/icij/ (default ~/.cache/hermes-osint/icij/) for
   30 days, --force-refresh to refetch. Verified live: 'PUTIN' query
   returns 5 Panama Papers officer matches in 0.5s after first download.

3. SEC EDGAR silently returned 0 when the company-name resolver matched
   an individual Form 3/4/5 filer (insider trading disclosures).
   Now surfaces 'Resolved company X → CIK Y (Z)' on stderr, prints a
   filing-type histogram when the type filter wipes results, and
   explicitly warns when the matched CIK appears to be an individual
   filer rather than a corporate registrant.

Bonus: _http.py was retrying 429 responses with exponential backoff plus
honoring (often-missing) Retry-After headers, which compounded into
multi-second hangs per page when the upstream key was over quota.
Changed to fail-fast on 429 with a clear, actionable error showing the
upstream's quota message. Verified: 0.3s fast-fail vs the previous 60s
hang on DEMO_KEY rate-limit exhaustion.

Updated SKILL.md, fec.md, and icij-offshore.md to match the new CLI
flags and ICIJ bulk-cache flow. Regenerated the docusaurus page via
website/scripts/generate-skill-docs.py.

Live sweep results across all 6 sources for 'Dillon Rolnick, New York':
- OFAC SDN: 0 matches ✓ (correctly not sanctioned)
- USAspending: 0 matches ✓ (correctly not a federal contractor)
- Senate LDA: 0 matches ✓ (correctly not a lobbying client)
- SEC EDGAR: warns it resolved to 'Rolnick Michael' (CIK 0001845264)
    who is an individual Form 3 filer, not a corporate registrant
- ICIJ: 0 matches ✓ (correctly not in any offshore leak)
- FEC: rate-limited (DEMO_KEY); fails fast with clear quota message

* feat(osint-investigation): expand to 12 sources covering identity, property, courts, archives, news

Phase-2 expansion per Teknium feedback that the original 6-source skill
(federal financial/regulatory only) wasn't a complete OSINT toolkit. Adds
6 more sources covering the major omissions a real investigation would
reach for first.

New sources (6 fetch scripts + 6 wiki entries):

1. NYC ACRIS — Real property records (deeds, mortgages, liens) via the
   city's Socrata API. Search by party name or property address. Joins
   Parties to Master to populate doc_type, dates, borough, and amount.
   Coverage: 5 NYC boroughs, ~70M party records, 1966-present.

2. OpenCorporates — Global corporate registry covering 130+ jurisdictions
   (~200M companies). Free API token at
   https://opencorporates.com/api_accounts/new raises the rate limit;
   HTML fallback works without one (limited fields).

3. CourtListener (Free Law Project) — federal + state court opinions
   (~10M back to colonial era) + PACER dockets via RECAP. Anonymous v4
   search works; COURTLISTENER_TOKEN raises rate limits.

4. Wayback Machine CDX — historical web captures (~900B+). Used both for
   surveillance-of-record (when did this site change?) and as a
   content-recovery layer when other sources point to dead URLs.

5. Wikipedia + Wikidata — narrative bio + structured facts. Wikipedia
   OpenSearch for article matching, REST summary for extracts, Wikidata
   Action API (wbgetentities) for claims. Avoids the SPARQL Query
   Service which is aggressively rate-limited.

6. GDELT 2.0 DOC API — global news monitoring in 100+ languages,
   ~2015-present. Auto-retries with 6s backoff on the standard
   1-req-per-5-sec throttle.

Other changes in this commit:

- SEC EDGAR no longer raises SystemExit when the company-name resolver
  finds no CIK; writes an empty CSV with header so the rest of a
  pipeline can keep moving and the warning is just on stderr.

- _http.py User-Agent updated per Wikimedia policy: includes app name,
  version, and a 'set HERMES_OSINT_UA to identify yourself' instruction.

- SKILL.md workflow now groups sources into two clusters (federal
  financial vs identity/property/courts/archives/news) with bash
  examples for each. 'When to use this skill' lists the broader set of
  investigation patterns the expanded sources unlock.

Live sweep results on 'Dillon Rolnick, New York' across all 12 sources:

  ofac           ✓ 0 (correctly clean)
  icij           ✓ 0 (correctly not in any leak)
  usaspending    ✓ 0 (correctly not a federal contractor)
  senate_lda     ✓ 0 (correctly not a lobbying client)
  sec_edgar      ✓ 0, warns: resolved to 'Rolnick Michael' (CIK 0001845264),
                   individual Form 3 filer, NOT a corporate registrant
  fec            — rate-limited (DEMO_KEY exhausted), fails fast with
                   clear quota message
  nyc_acris      ✓ 200 records named Rolnick across NYC; 48 records at
                   571 Hudson (the property the web identifies as his)
  opencorporates ✓ 0 (no API token configured; HTML fallback)
  courtlistener  ✓ 0 for 'Dillon Rolnick'; 20 for 'Rolnick' generally;
                   5 for 'Microsoft' sanity check
  wayback        ✓ 30 captures of nousresearch.com from 2011-present
  wikipedia      ✓ 0 (correctly not notable enough); Bill Gates sanity
                   returns full structured facts (occupation, employer,
                   DOB, place of birth, country)
  gdelt          ✓ 0 for 'Dillon Rolnick'; 5 for 'Nous Research'

All 17 scripts compile clean and pass --help. Synthetic analysis pipeline
regression still passes (entity_resolution 30 matches, timing p=0.000,
findings 2).

* feat(osint-investigation): remove FEC; DEMO_KEY rate-limits make it unreliable

The FEC fetcher consistently failed the live sweep because the OpenFEC
DEMO_KEY tier (40 calls/hour) exhausts on a single investigation, and
the upstream returns slow-path query plans for unindexed contributor-name
searches that the gateway times out. Without a real API key it's not
usable; with one the user has to sign up at api.data.gov first. That's
too much setup friction for a skill that should work out of the box.

Removed:
  - scripts/fetch_fec.py
  - references/sources/fec.md

Updated:
  - SKILL.md frontmatter description + tags
  - 'When NOT to use' now points users at https://www.fec.gov/data/ for
    federal donations
  - entity_resolution example switched from donor↔contractor to
    lobbying-client↔contractor (Senate LDA + USAspending pair)
  - timing_analysis example switched to lobbying-filings vs awards
  - 8 wiki entries had their 'FEC ↔ ...' cross-reference bullets removed

11 sources remain (5 federal financial + 6 identity/property/courts/
archives/news). All scripts compile, pass --help, and the synthetic
analysis pipeline still passes on the new lobbying-shaped regression
fixture (30 matches, p=0.000 on tight clustering, 2 findings).
Fixes NousResearch#26693

`hermes doctor` currently promotes invalid direct API keys into the final
summary even when the matching OAuth path is already healthy. That makes
the setup look more broken than it really is.

This change keeps the failed API Connectivity row visible but stops
treating it as a blocking summary issue when a healthy OAuth fallback
already exists for the same provider family.

Covered cases:
- Gemini OAuth + invalid direct Gemini key
- MiniMax OAuth + invalid direct MiniMax key

Based on NousResearch#26704 by @worlldz.
…rs (NousResearch#10648)

Address two blocking issues when using GitHub Copilot integrations:

1. ACP mode: detect the gh-copilot CLI deprecation error from stderr
   and surface an actionable message with alternatives instead of
   hanging or showing a cryptic error.

2. GitHub Models (Azure) 413: recognize models.inference.ai.azure.com
   as a known GitHub Models URL, and print a targeted hint explaining
   the hard 8K token limit that makes this endpoint incompatible with
   Hermes' system prompt size.
…apping

Cover the deprecation pattern matching against real gh-copilot stderr
output, verify the GitHub Models Azure URL is in _URL_TO_PROVIDER, and
confirm _is_github_models_base_url recognises the Azure endpoint.
…ls 413 hint

Follow-up improvements on top of @konsisumer's cherry-picked fix for NousResearch#10648:

1. Deprecation patterns required BOTH a product fingerprint ('gh-copilot') and
   a deprecation marker. The previous list included 'copilot-cli' and bare
   'deprecation', which would false-positive on stderr from the NEW
   @github/copilot CLI — whose repo is literally github.com/github/copilot-cli
   and which legitimately surfaces those substrings in its own messages.

2. Replace the deprecation hint. The user in NousResearch#10648 installed
   'gh extension install github/gh-copilot' (the deprecated extension)
   thinking that's what ACP mode uses, when ACP actually spawns the new
   'copilot' binary from '@github/copilot'. The hint now points users at the
   correct install command ('npm install -g @github/copilot') with the new
   CLI's repo URL, and demotes provider-switching to a fallback alternative.

3. Change _URL_TO_PROVIDER value for models.inference.ai.azure.com from the
   'github-models' alias to the canonical 'copilot' provider id, matching the
   convention used by every other entry in the table.

4. Sharpen the 413 hint message. The free tier's ~8K cap is below the
   system-prompt floor, so this endpoint is fundamentally incompatible with
   an agentic loop — not a 'use a different URL' problem.

Tests:
- New parametrized false-positive coverage for the new CLI's stderr shape.
- Updated assertion to require canonical 'copilot' provider mapping.
- All 14 deprecation/URL tests pass.
…sResearch#4469) (NousResearch#26822)

When the agent is running and the user sends multiple TEXT messages in
rapid succession, base.py's active-session branch stored the pending
event as a single-slot replacement:

    self._pending_messages[session_key] = event

Three rapid messages A, B, C landed as: A (interrupts), B (replaces A
before consumer reads), C (replaces B). Only C reached the next turn —
A and B were silently dropped. This is the symptom in NousResearch#4469.

Route the follow-up through merge_pending_message_event(..., merge_text=True)
so TEXT events accumulate into the existing pending event's text instead
of clobbering it. Photo and media bursts already merged through the same
helper; this just extends the merge_text path (already used by the
Telegram bursty-grace branch in gateway/run.py) to all platforms.

Test exercises BasePlatformAdapter.handle_message directly with the
session marked active and asserts three rapid TEXT events merge to
'part two\\npart three' rather than dropping the middle message.
Sanity-checked the test would fail without the fix.

Credits @devorun for the original investigation and analysis in NousResearch#4491
that surfaced the underlying queue handling, though their fix targeted
GatewayRunner._pending_messages which is now dead state on main.
The PKCE flow reused the code_verifier as the OAuth state parameter.
Per RFC 6749 §10.12 and RFC 7636, these serve different purposes:
state is an anti-CSRF token visible in the authorization URL; the
code_verifier must remain secret for the token exchange.

Generate an independent secrets.token_urlsafe(32) for state and
validate it on callback to provide actual CSRF protection.

Closes NousResearch#10693
Group the secrets import with time and webbrowser at the top of
run_hermes_oauth_login_pure(), matching the existing pattern.
Drop the _secrets alias — no name conflict in this scope.
…tion

Two unit tests for run_hermes_oauth_login_pure():

1. test_authorization_url_state_is_not_pkce_verifier — asserts state in the
   auth URL is independent from the PKCE code_verifier sent in the token
   exchange, and that the verifier never appears in the URL.

2. test_callback_state_mismatch_aborts — asserts the flow returns None
   (no token exchange) when the callback state does not match the value
   we generated.

Negative control verified: reintroducing the b17e5c1 vulnerable pattern
(state = verifier, no callback validation) makes both tests fail.

Also adds AUTHOR_MAP entry for shaun0927 (contributor of the fix).
The Foundation Release — Hermes installs and runs anywhere now.

Highlights:
- Native Windows support (early beta) — PowerShell installer, native subprocess/PTY paths, ~40 follow-up Windows-only fixes
- pip install hermes-agent — PyPI wheel
- Cold-start wave — ~19s off hermes launch, 180x faster browser_console (CDP WS)
- Supply-chain advisory checker + lazy-deps + tiered install fallback
- OpenAI-compatible local proxy for OAuth providers (Claude Pro, ChatGPT Pro, SuperGrok)
- Cross-session 1h Claude prompt cache (Anthropic / OpenRouter / Nous Portal)
- 2 new platforms: LINE + SimpleX Chat (22 total)
- Microsoft Graph foundation — Teams pipeline + webhook adapter
- /handoff actually transfers sessions live
- x_search first-class tool, vision_analyze pixel passthrough
- LSP semantic diagnostics on every write
- Unified video_generate with pluggable backends
- computer_use cua-driver backend
- 9 new optional skills, OpenRouter Pareto Code router, xAI Grok OAuth
- 12 P0 + 50 P1 closures

808 commits · 633 PRs · 1393 files · 165k insertions · 545 issues closed · 215 contributors
…n/load (NousResearch#12285) (NousResearch#26943)

Persisted assistant `reasoning_content` / `reasoning` fields are now emitted
as ACP `agent_thought_chunk` notifications during `_replay_session_history`,
so editor clients (Zed, etc.) rebuild collapsed Thinking panes when the user
re-opens a session that used a thinking model.

Ordering matches live streaming: thought precedes message text within the
same assistant turn, mirroring how `reasoning_callback` deltas arrive before
`stream_delta_callback` deltas in `events.py::make_thinking_cb` /
`make_message_cb`.

Behavior on non-reasoning histories is unchanged; the replay loop's existing
text / tool_call / tool_call_update / plan emission is preserved bit-for-bit.

Closes NousResearch#12285.

Credit:
- @Yukipukii1 (NousResearch#14691) — original thought-replay design via
  `acp.update_agent_thought_text`; the tool-call portion of that PR has
  since landed via NousResearch#19139, but the reasoning replay is theirs.
- @HenkDz (NousResearch#17652 / NousResearch#18578) — established the `_replay_session_history` and
  `_history_*` helper conventions this builds on.
- @D1zzyDwarf (NousResearch#16531) — also closed by this work.
…ousResearch#12285 follow-up) (NousResearch#26957)

Switches `_replay_session_history` from `loop.call_soon`-deferred (after the
`LoadSessionResponse` is written) to `await`-inline (before the response is
constructed) for both `session/load` and `session/resume`. Adds defensive
try/except around the awaited call so a replay helper crash still yields a
successful load response — partial transcripts are acceptable, total
load failure is not.

The deferral was added on May 2 in commit 19854c7 with the rationale "Zed
only attaches streamed transcript/tool updates once the load/resume response
has completed." That justification was incorrect:

- Zed's current ACP integration (zed-industries/zed
  crates/agent_servers/src/acp.rs) explicitly registers the session-update
  routing entry BEFORE awaiting the loadSession RPC, with the comment:
  "so that any session/update notifications that arrive during the call
  (e.g. history replay during session/load) can find the thread."
- Every other reference ACP server (Codex, Claude Code, OpenCode, Pi, agentao)
  replays history BEFORE responding to the load request.
- The ACP spec wording ("Stream the entire conversation history back to the
  client via notifications") and the natural JSON-RPC reading both mean
  "during the request's lifetime", not "after the response resolves".

Empirical reproduction (reported by Biraj on @agentclientprotocol/sdk
v0.21.1): the same custom ACP client works correctly against Codex /
Claude Code / OpenCode / Pi but receives 0 notifications from Hermes
because it measures the per-call notification count at the moment
`loadSession` resolves — which on Hermes was before the `call_soon`-
scheduled replay coroutine had a chance to run.

Changes:
- `acp_adapter/server.py`: remove `_schedule_history_replay`; both
  `load_session` and `resume_session` now `await self._replay_session_history`
  before returning, wrapped in try/except that logs and continues on
  helper exceptions.
- `tests/acp/test_server.py`: replace the single
  `test_load_session_schedules_history_replay_after_response`
  (which encoded the now-incorrect post-response ordering) with two tests
  asserting `events == ["replay", "returned"]` for load and resume.
  Add two regression tests confirming that a replay helper raising still
  yields a `LoadSessionResponse` / `ResumeSessionResponse` rather than
  propagating the exception out as a JSON-RPC error.

Result: 240 ACP tests pass (was 238), ruff clean. Verified end-to-end:
biraj's synchronous notification-counter pattern now sees 6 notifications
during `loadSession` for a 5-message session, matching all other reference
ACP servers.

The `_fenced_text` change in `acp_adapter/tools.py` from the same May 2
commit is orthogonal and intentionally left intact — it's a separate,
still-valid fix for Zed's pipe-as-table rendering.

Refs NousResearch#12285. Follows up NousResearch#26943 (which added thought-chunk replay but kept
the deferral).
…sResearch#27035)

* chore: release v0.14.0 (2026.5.16)

The Foundation Release — Hermes installs and runs anywhere now.

Highlights:
- Native Windows support (early beta) — PowerShell installer, native subprocess/PTY paths, ~40 follow-up Windows-only fixes
- pip install hermes-agent — PyPI wheel
- Cold-start wave — ~19s off hermes launch, 180x faster browser_console (CDP WS)
- Supply-chain advisory checker + lazy-deps + tiered install fallback
- OpenAI-compatible local proxy for OAuth providers (Claude Pro, ChatGPT Pro, SuperGrok)
- Cross-session 1h Claude prompt cache (Anthropic / OpenRouter / Nous Portal)
- 2 new platforms: LINE + SimpleX Chat (22 total)
- Microsoft Graph foundation — Teams pipeline + webhook adapter
- /handoff actually transfers sessions live
- x_search first-class tool, vision_analyze pixel passthrough
- LSP semantic diagnostics on every write
- Unified video_generate with pluggable backends
- computer_use cua-driver backend
- 9 new optional skills, OpenRouter Pareto Code router, xAI Grok OAuth
- 12 P0 + 50 P1 closures

808 commits · 633 PRs · 1393 files · 165k insertions · 545 issues closed · 215 contributors

* docs(release): rewrite v0.14.0 highlights for excitement framing

Demote Windows beta from headline; lead with SuperGrok / OAuth proxy /
x_search / Microsoft Teams. Frame lazy-deps as a debloating wave that
makes installs dramatically lighter. Add highlights for clickable URLs
in any terminal, dangerous-command detection bypasses, ChatGPT Pro
and SuperGrok via the local proxy. Tighten the summary paragraph.
…ol docs (NousResearch#27050)

The video_gen toolset and its video_generate tool shipped without
user-facing reference docs. toolsets-reference.md and the dev-guide
plugin page were already in, but reference/tools-reference.md had no
video_gen section at all and user-guide/features/tools.md's Media row
didn't list video_generate.

- reference/tools-reference.md: add a video_gen section after video,
  including backend list (xAI Grok-Imagine, FAL.ai Veo/Pixverse/Kling),
  unified text-to-video / image-to-video surface note, link to the
  dev-guide plugin page, and the video_generate tool row. Add
  video_generate to the standalone-tools quick-counts line.
- user-guide/features/tools.md: extend Media row with video_generate
  and video_analyze plus an opt-in caveat.
…fo (NousResearch#27051)

Port from nanocoai/nanoclaw#1962: modern Signal V2-only groups surface on
dataMessage.groupV2.id, not groupInfo.groupId. signal-cli versions differ
in which field they expose for V2 groups — some forward the underlying
libsignal envelope verbatim (groupV2), others normalize everything into
groupInfo. Without a groupV2 read, V2-only groups appear as DMs because
groupInfo is undefined and the adapter misroutes them to the sender's
DM session.

Reads groupV2.id first, falls back to groupInfo.groupId. Also hardens
chat_name extraction against non-dict groupInfo payloads (crashed with
AttributeError under malformed envelopes).

6 new tests cover V2 routing, V1 legacy compatibility, V2-preferred
precedence, no-group DM path, allowlist enforcement, and malformed
payloads.
…xt (NousResearch#27053)

Each highlight now gets 2-3 sentences explaining the user-facing value,
not just the technical change. Targeted at someone discovering Hermes
for the first time who isn't deep in the codebase.
…ch#27055)

The `@askjo/camofox-browser` npm package was a top-level entry in
the root `package.json` `dependencies` block, so `hermes update`
ran its postinstall on every user, every update. That postinstall
calls `npx camoufox-js fetch`, which silently downloads a ~300MB
Firefox-fork browser binary from GitHub Releases — multi-minute on
fast connections, and a hard block for users on slow / restricted
networks (notably users in China running through a VPN).

Camofox is an explicit opt-in browser backend. The runtime check
in `tools/browser_tool.py` only routes through Camofox when the
user has set `CAMOFOX_URL` (selected via `hermes tools` →
Browser Automation → Camofox). Users who never opted in never
touched the package at runtime, yet every `hermes update` paid
for the binary fetch anyway.

This change:

* Removes `@askjo/camofox-browser` from root `package.json`
  dependencies (and the regenerated `package-lock.json` drops
  Camofox's entire transitive tree, ~2.6k lines).
* Updates the Camofox `post_setup` handler in
  `hermes_cli/tools_config.py` to install
  `@askjo/camofox-browser@^1.5.2` explicitly when the user
  selects Camofox, and streams npm output (no `--silent`, no
  `capture_output`) so the ~300MB download is visible rather
  than appearing frozen.
* Adds `tests/test_package_json_lazy_deps.py` as a regression
  guard so future PRs can't silently re-add Camofox (or any
  binary-postinstall package) to eager root dependencies.

`agent-browser` stays eager — it is the default Chromium-driving
backend used by every session that does not have a cloud browser
provider configured, and its postinstall is small.

Validation:

| | Before | After |
|---|---|---|
| `hermes update` time on slow network | multi-minute hang at `→ Updating Node.js dependencies...` | seconds (no binary fetch) |
| Camofox opt-in install visibility | silent, looked frozen | streamed npm output |
| Regression guard against re-adding | none | `test_package_json_lazy_deps.py` |

Tests:
- `tests/test_package_json_lazy_deps.py`: 3/3 pass
- `tests/tools/test_browser_camofox*`: 92/92 pass
- `tests/hermes_cli/test_tools_config.py`: 66/66 pass
- `tests/hermes_cli/test_cmd_update.py` + adjacent: green

Reported by lulu (Discord, May 2026) — `hermes update` hangs at
`→ Updating Node.js dependencies...` in China.
Related: NousResearch#18840, NousResearch#18869.
…ible (NousResearch#18840)

`hermes update` ran the repo-root and ui-tui npm installs with both
`--silent` and `subprocess.run(..., capture_output=True)`, which hides
all output from optional postinstall scripts.  The largest of those —
`@askjo/camofox-browser`'s `npx camoufox-js fetch` — downloads a
Firefox-fork browser binary that can take many minutes on slow
connections.  Because nothing was printed during that wait, the updater
appeared to hang at "Updating Node.js dependencies..." and users
Ctrl-C'd, sometimes leaving `node_modules` partially installed.

Drop `--silent` and pass `capture_output=False` for the repo-root and
ui-tui paths so npm streams its `info run …` postinstall lines straight
to the terminal.  Output is still mirrored to `~/.hermes/logs/update.log`
by the existing `_UpdateOutputStream` wrapper, so SSH-disconnect safety
is preserved.

The `web/` install path is untouched — its build step is fast and does
not run binary-fetching postinstalls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…earch#27101)

Port from google-gemini/gemini-cli#19332.

Users can now exit with '/exit --delete' (or '/quit --delete', '/exit -d')
to permanently remove the current session's SQLite history plus on-disk
transcripts (*.json / *.jsonl / request_dump_*) in one shot. Useful for
privacy-sensitive workflows and one-off interactions where leaving a
session recording behind is undesirable.

Implementation:
- New HermesCLI._delete_session_on_exit one-shot flag (defaults False).
- process_command() parses --delete / -d after /exit or /quit and arms
  the flag. Unknown args print a hint and keep the CLI running (prevents
  typos like '/exit -delete' from accidentally exiting).
- Shutdown path calls SessionDB.delete_session(session_id, sessions_dir=...)
  right after end_session() when the flag is set. That API already
  existed for 'hermes sessions delete' and handles both SQLite removal
  (orphaning child sessions so FK constraints hold) and on-disk file
  cleanup.
- /quit CommandDef now advertises '[--delete]' in args_hint so /help
  and CLI autocomplete surface it.

Tests: tests/cli/test_exit_delete_session.py (12 cases covering both
aliases, case insensitivity, whitespace, short form, unknown-arg
rejection, and registry metadata).

E2E-verified with isolated HERMES_HOME: session row deleted, all three
transcript/request-dump files removed, second delete_session call
correctly returns False.
…earch#27102)

Emit a grep-friendly '[MEMORY] rss=...MB ...' line in agent.log /
gateway.log every N minutes (default 5) so slow leaks in the long-lived
gateway process show up as a time series. Based on
cline/cline#10343
(src/standalone/memory-monitor.ts).

- gateway/memory_monitor.py: new module. Daemon thread, baseline on
  start, final snapshot on stop. Uses resource.getrusage() (stdlib)
  first, falls back to psutil, disables itself with one WARNING if
  neither is available.
- gateway/run.py: start monitor right after setup_logging() in
  start_gateway(); stop it in the shutdown block next to MCP teardown.
- hermes_cli/config.py: logging.memory_monitor { enabled, interval_seconds }
  defaults under the existing logging section.
- tests/gateway/test_memory_monitor.py: 10 unit tests covering format,
  baseline/shutdown snapshots, double-start noop, periodic timer,
  daemon thread invariant, and unavailable-RSS warn-and-skip path.

Adapted from TypeScript/Node to Python (threading.Event-based daemon
thread instead of setInterval/unref), added Python-specific gc + thread
counts to the log line (handier than ext/arrayBuffers for diagnosing
Python gateway leaks), and gated behind a config.yaml toggle so users
can silence the periodic line if they want.

No heap-snapshot-on-OOM equivalent — CPython doesn't have V8's
--heapsnapshot-near-heap-limit; tracemalloc would be the Python
equivalent but adds non-trivial overhead, so leaving that out.
…chemas (NousResearch#27104)

Port from anomalyco/opencode#24730: Moonshot's JSON Schema validator rejects
two shapes that the rest of the JSON Schema ecosystem accepts:

1. $ref nodes with sibling keywords. Moonshot expands the reference before
   validation and then rejects the node if keys like `description`, `type`,
   or `default` appear alongside $ref. MCP-sourced tool schemas commonly
   put a `description` on $ref-typed properties so the model sees the
   field hint — which worked on every provider except Moonshot.

2. Tuple-style `items` arrays (positional element schemas). Moonshot's
   engine requires ONE schema applied to every array element. Common in
   tool schemas generated from Go/Protobuf that model fixed-length arrays
   as `[{type:number}, {type:number}]`.

Repairs applied in `agent/moonshot_schema.py`:

- Rule 3: when a node has `$ref`, return `{"$ref": <value>}` only
  (strip every sibling). The referenced definition still carries its own
  description on the target node, which Moonshot accepts.
- Rule 4: when `items` is a list, collapse to the first element schema
  (falling back to `{}` which is then filled by the generic missing-type
  rule). Preserves `minItems` / `maxItems` / other siblings.

Tests: 10 new cases across TestRefSiblingStripping + TestTupleItems,
plus the existing TestMissingTypeFilled::test_ref_node_is_not_given_synthetic_type
still passes (it asserted plain $ref passes through; now it passes through
as exactly `{"$ref": "..."}` which is strictly compatible).

All 35 tests in test_moonshot_schema.py pass.
…rch#27105)

Port from anomalyco/opencode#25019 ("fix: handle invalid mcp urls").

Previously: a typo in `config.yaml` (missing scheme, wrong scheme,
empty string, non-string value) slipped past `_is_http()` and hit
`httpx.URL(url)` or `streamablehttp_client(url, ...)` deep in the
transport layer. That raised a generic exception which went through
the reconnect-backoff loop, so a bad URL caused _MAX_INITIAL_CONNECT_RETRIES
attempts with doubling backoff — about a minute of pointless retries
plus an opaque error — before the server was marked failed.

Now: we validate the URL once, at the top of `run()`, before
entering the retry loop. A malformed URL raises `InvalidMcpUrlError`
(a `ValueError` subclass) with a message that names the offending
server and explains exactly what was wrong. `_ready` is set and
`_error` is populated, so `start()` re-raises and the server shows
up as failed in `hermes mcp list` without any backoff burn.

Validation rules:
- Must be a string (rejects None, dict, int)
- Must be non-empty (rejects '' and whitespace-only)
- Scheme must be http or https (rejects file://, ws://, stdio://)
- Must have a non-empty host (rejects http:///, http://:8080)

Tests (21 new cases in tests/tools/test_mcp_invalid_url.py):
- TestValidUrlsAccepted: http, https, IPv6, ports, paths, query strings
- TestInvalidUrlsRejected: every rejection path above + clear error text
- TestErrorIsValueError: downstream code catching ValueError still works

E2E verified: a misconfigured server with `url: not-a-valid-url`
now fails in <0.001s with the clear error, instead of minutes of retries.

Doesn't touch stdio servers (they use `command`, not `url`) — the
validator only fires when `_is_http()` returns True.
…ousResearch#27110)

xAI announced on 2026-05-16 (https://x.ai/news/grok-hermes) that X Premium
subscriptions now work in Hermes Agent. The hint we shipped in PR NousResearch#26644
asserted the opposite ("X Premium+ does NOT include xAI API access — only
standalone SuperGrok subscribers can use this provider"), which would now
misdirect Premium+ users who hit any other 403 (no Grok sub at all, wrong
tier, exhausted quota) into thinking they need to switch subscriptions
when their sub is in fact valid.

Remove _decorate_xai_entitlement_error and its two call sites in
_summarize_api_error. xAI's own body text already says "Manage subscriptions
at https://grok.com/?_s=usage" — surface that verbatim and let xAI's wording
do the diagnosis.

The _is_entitlement_failure guard (which prevents credential-pool refresh
loops on entitlement 403s) and the reasoning-replay gating for xai-oauth
are unrelated and untouched.

Update tests to assert the body still surfaces verbatim and that no
Hermes-side editorializing is appended.
…e running (NousResearch#27175)

Surface live background-task count in the prompt_toolkit status bar so users
can see at a glance that a /background task exists and is running — no need
to ask the agent about it (the agent has no visibility into bg sessions by
design).

- _get_status_bar_snapshot now reports active_background_tasks from len()
  of the live _background_tasks dict (entries are removed in the task
  thread's finally block, so this reflects truly-running tasks)
- Indicator shown only on medium (<76) and wide (>=76) tiers; narrow (<52)
  stays minimal since it's already cramped
- No invalidate plumbing needed: status bar fragments are pulled via lambda
  on every redraw, and the bg thread already calls _app.invalidate() on exit

Refs NousResearch#8568
)

Adds a pure-local recap of recent session activity — turn counts,
tools used, files touched, last user ask, last assistant reply —
appended to the existing /status output. Useful when juggling multiple
sessions and you want a one-glance reminder of where this one left off.

Inspired by Claude Code 2.1.114's /recap, but folded into /status so
we don't add a 6th info command. Pure local computation: no LLM call,
no auxiliary model, no prompt-cache invalidation, instant and free.

Salvage of NousResearch#18587 — kept the shared hermes_cli.session_recap.build_recap
helper and its 13 unit tests, dropped the /recap slash command +
ACTIVE_SESSION_BYPASS_COMMANDS entry + Level-2 bypass since /status
already covers both surfaces.

Tailored to hermes-agent's tool vocabulary: file-editing tools
(patch, write_file, read_file, skill_manage, skill_view) surface
touched paths; tool-call counts highlight which classes of work
drove the session.

Source: https://code.claude.com/docs/en/whats-new/2026-w17
teknium1 and others added 24 commits May 17, 2026 17:15
7 new tests:

TestAuxiliaryFallbackLayering (3):
  - configured_chain succeeds → main agent fallback NOT consulted
  - chain returns nothing → main agent fallback runs and succeeds
  - both exhausted → user-visible 'all fallbacks exhausted' warning
    fires before the original error is re-raised

TestTryMainAgentModelFallback (4):
  - returns (None, None, "") when main provider is 'auto'
  - returns (None, None, "") when failed provider == main provider
    (no point retrying the same backend)
  - resolves the main provider's client when configured correctly
  - skips when main provider is marked unhealthy
Adds a new 'Auxiliary Capacity-Error Fallback' section to
website/docs/user-guide/features/fallback-providers.md covering:

- The 4-step ladder (primary → fallback_chain → main agent → warn)
- Which errors trigger fallback (402, 429 quota, connection) vs
  which respect explicit provider choice (transient 429 rate limits)
- Optional fallback_chain config schema with vision + compression examples
- Recognized quota-error phrases (Bedrock, Vertex AI, generic)

Updates the bottom summary table — every auxiliary task now shows
'Layered (see above)' instead of 'Auto-detection chain' since
explicit-provider users also get the main-agent safety net.
…odels (NousResearch#27797)

Grok models hit the same failure modes that OPENAI_MODEL_EXECUTION_GUIDANCE
addresses for GPT/Codex: claiming completion without tool calls
('to be honest, I didn't create the file yet'), suggesting workarounds
instead of using existing tools (proposing a folder-based memory system
when the memory tool exists), replying with plans instead of executing.

TOOL_USE_ENFORCEMENT_GUIDANCE was already injected for any model whose
name contains 'grok' (TOOL_USE_ENFORCEMENT_MODELS). This extends the
follow-on family-specific block — OPENAI_MODEL_EXECUTION_GUIDANCE
(tool_persistence / mandatory_tool_use / act_dont_ask / prerequisite_checks
/ verification / missing_context) — to grok-named models too.

The OPENAI_ prefix is retained for backwards compat with imports/tests;
docstring + inline comment now note that the body is family-agnostic and
the prefix reflects origin, not exclusivity.

Tests cover the OpenRouter slug (x-ai/grok-4.3) and the xai-oauth bare
name (grok-4.3), plus a negative control on claude.

E2E verified against a real AIAgent build of the system prompt for both
xai-oauth and openrouter grok models.
…ogging

The system prompt's 'Conversation started:' line carried minute precision
(%I:%M %p), making it byte-unstable across every rebuild path. Within a
CLI session the in-memory cache held, but on the gateway path (fresh
AIAgent per turn → restore from session DB), any silent failure in the
read or write path dropped the cache stem and forced a full re-prefill
on every subsequent turn. Local prefix-caching backends (llama.cpp /
vLLM) saw this as KV-cache invalidation; remote prefix-caching providers
saw it as an Anthropic-style cache miss.

Three changes:

1. Date-only timestamp ('Sunday, May 17, 2026' instead of '... 03:42 PM').
   System prompt now byte-stable for the full day. The model can still
   query exact time via tools when it actually needs it. Credit:
   @iamfoz (PR NousResearch#20451).

2. Loud logging on session DB write failures. The update_system_prompt
   call used to log at DEBUG, hiding disk-full / locked-database / schema
   drift behind a silent fall-through that forced fresh rebuilds on
   every subsequent turn. Now WARN with the session id and exception so
   persistent issues show up in agent.log without verbose mode.

3. Three-way stored-state distinction on read. The previous
   'session_row.get("system_prompt") or None' collapsed three states
   into one (missing row / null column / empty string). Now we tell them
   apart and WARN when a continuing session lands on null/empty (which
   means the previous turn's write never persisted — every subsequent
   turn rebuilds and the prefix cache misses every time).

The restore block is extracted into _restore_or_build_system_prompt()
so the prefix-cache path can be unit-tested in isolation.

E2E proof: fresh AIAgent constructed for turn 2 across a minute-boundary
sleep restores byte-identical bytes from the session DB. NULL stored
prompt fires the new warning. Date-only timestamp survives the rebuild
path. All on real SessionDB, no mocks.

Tests:
  - tests/agent/test_system_prompt_restore.py (10 new tests)
  - tests/run_agent/test_run_agent.py::TestBuildSystemPrompt::
        test_datetime_is_date_only_not_minute_precision

Closes NousResearch#20451 (date-only), NousResearch#18547 (prefix stabilization),
NousResearch#8689 (stabilize timestamp across compression), NousResearch#15866 (timestamp
caching question), NousResearch#8687 (compression timestamp), NousResearch#27339
(claim #3: live timestamp in cached system prompt).

Co-authored-by: Martyn Forryan <9133432+iamfoz@users.noreply.github.com>
…e — no LLM (NousResearch#27590)

* feat(session_search): single-shape tool with discovery, scroll, browse — no LLM

Replaces the LLM-summarized session_search with a single-shape tool that
returns actual messages from the DB. Three calling shapes inferred from
args (no mode parameter):

  1. Discovery — pass query. FTS5 + anchored ±5 window + bookends per hit,
     all in one call. ~20ms on a real DB instead of ~90s for the previous
     three aux-LLM calls.
  2. Scroll — pass session_id + around_message_id. Returns a window
     centered on the anchor. To paginate, re-anchor on the first/last id
     of the returned window. Boundary message appears in both windows
     as the orientation marker. ~1ms per scroll call.
  3. Browse — no args. Recent sessions chronologically.

Bookend_start (first 3 user+assistant msgs) and bookend_end (last 3) give
the agent goal + resolution on every discovery hit, so a single tool call
reconstructs a long session's arc without loading the whole transcript.

The aux-LLM summary path is gone: it cost ~$0.30/call, took ~30s, and
laundered FTS5 hits through a model that could confabulate when the right
session wasn't in the hit list. The merged shape returns byte-for-byte
content from SQLite.

History:
- PR NousResearch#20238 (JabberELF) seeded the fast/summary dual-mode split.
- PR NousResearch#26419 (yoniebans) expanded to fast/guided/summary with bookends,
  multi-anchor drill-down, default-mode config, and a teaching skill.

This PR collapses that toolkit into one shape with explicit scroll
support, drops the summary path, drops the mode parameter, drops the
config knob, drops the skill. JabberELF's seed work is acknowledged via
the AUTHOR_MAP entry.

Validation:
- 38/38 tool tests pass (tests/tools/test_session_search.py)
- 12/12 get_messages_around tests pass (tests/hermes_state/)
- 11/11 get_anchored_view tests pass (tests/hermes_state/)
- Full tests/tools/ run: 5168 passing, 2 failures pre-exist on main
  (test ordering in test_delegate.py, unrelated)
- E2E against live state DB: discovery 20ms, scroll 1ms, browse 280ms;
  pagination forward+backward works with boundary-message orientation;
  error paths return clean tool_error responses

Co-authored-by: JabberELF <abcdjmm970703@gmail.com>
Co-authored-by: yoniebans <jonny@nousresearch.com>

* chore(session_search): prune dead LLM-summary config and docs

Companion to the single-shape rewrite. The auxiliary.session_search config
block, max_concurrency / extra_body tunables, and matching docs sections
all referenced the removed LLM summarization path. Removing them so users
don't try to tune knobs that nothing reads.

- hermes_cli/config.py: drop dead auxiliary.session_search block from
  DEFAULT_CONFIG. Leftover keys in user config.yaml are harmless and
  ignored.
- hermes_cli/tips.py: drop two tips referencing the removed
  max_concurrency / extra_body knobs.
- website/docs/user-guide/configuration.md: drop 'Session Search Tuning'
  section and the auxiliary.session_search block from the example.
- website/docs/user-guide/features/fallback-providers.md: drop session_search
  rows from the auxiliary-tasks tables and the dedicated tuning subsection.
- website/docs/reference/tools-reference.md: rewrite the session_search
  entry to describe the new three-shape behaviour.
- CONTRIBUTING.md: update the file-tree description.
- tests/tools/test_llm_content_none_guard.py: remove TestSessionSearchContentNone
  class and test_session_search_tool_guarded — both guard against an
  unguarded .content.strip() call site in _summarize_session() that no
  longer exists.

Validation: 97/97 targeted tests still pass (hermes_state + session_search +
llm_content_none_guard). Config tests 55/55.

---------

Co-authored-by: JabberELF <abcdjmm970703@gmail.com>
Co-authored-by: yoniebans <jonny@nousresearch.com>
…ousResearch#27840)

Companion PR to NousResearch#27590. Sweeps remaining stale references to the
LLM-summary path that landed in main with NousResearch#27590 but weren't fully
caught in the followup cleanup commit.

Real rewrites:
- user-guide/sessions.md: 'Session Search Tool' section rewritten to
  describe the three calling shapes (discovery / scroll / browse) with
  worked examples. Adds the 'Optional parameters' subsection covering
  sort and role_filter.
- user-guide/features/memory.md: 'Session Search' overview rewritten,
  comparison table updated (speed: ms instead of LLM summarization,
  added explicit free-cost row, link to sessions.md for details).

Stale-claim sweeps:
- user-guide/configuring-models.md: drop the 'Session Search' row from
  the aux-model override table (no aux model anymore), drop session
  search from the auxiliary-models list.
- user-guide/features/codex-app-server-runtime.md: drop session_search
  from the ChatGPT-subscription cost note, drop the session_search
  block from the per-task override config example.
- developer-guide/provider-runtime.md: drop 'session search
  summarization' from the auxiliary tasks list.
- developer-guide/agent-loop.md: drop session search from the
  auxiliary fallback chain list.
- user-guide/skills/.../autonomous-ai-agents-hermes-agent.md: drop
  session_search from the 'auxiliary models not working' debug step.

Untouched (still accurate as tool-name mentions, not behavioral claims):
- features/tools.md, features/honcho.md, features/acp.md
- cli.md, sessions.md (other sections)
- developer-guide/tools-runtime.md, agent-loop.md (line 157)
- acp-internals.md, adding-tools.md, prompt-assembly.md
- reference/toolsets-reference.md, reference/tools-reference.md
…usResearch#27830)

`hermes_cli/doctor.py` had two recurring patterns:

1. **15 section headers** of the form `print() ; print(color("◆ Name", Colors.CYAN, Colors.BOLD))`
   bracketed by 3-line `# =====` / `# Check: X` / `# =====` comment banners.

2. **Paired `check_fail(...) ; issues.append(...)`** for every diagnostic that emits both a
   user-visible failure and an auto-fix instruction.

Add two helpers and collapse the patterns:

  def _section(title):
      print()
      print(color(f"◆ {title}", Colors.CYAN, Colors.BOLD))

  def _fail_and_issue(text, detail, fix, issues):
      check_fail(text, detail)
      issues.append(fix)

Replacements:
- 15 `# =====/# X/# =====` banner triples + section header pairs compressed to `_section(...)`
- All 18 `check_fail + issues.append` pairs collapsed to `_fail_and_issue(...)` (single-line
  where the call fits under 120 chars, multi-line where it doesn't)
- Net -5 LOC (`+128 / -133`)

The LOC delta is modest after wrapping long calls onto multi-line form for readability — the
real win is uniform call shape and removal of two parallel-pattern footguns. There is now
exactly one way to emit a diagnostic that pairs a user-visible failure with a fix instruction.

Behavior is byte-identical. `_section` produces the same blank line + bold-cyan output the
inline two prints did, and `_fail_and_issue` does the same `check_fail + issues.append`
sequence in the same order. Verified empirically by diffing live `run_doctor()` stdout from
this branch against `origin/main` — `diff -q` reports zero differences.

Test plan:
- All 69 tests across test_doctor.py, test_doctor_command_install.py, and
  test_doctor_dedicated_provider_skip.py pass
- `ruff check hermes_cli/doctor.py` clean
- Live `run_doctor()` output byte-identical to origin/main

Refs NousResearch#23972 (Phase 2 tracker — dedup-only refactor in line with the "net-LOC-negative"
discipline).
Yuanbao's QuoteContextMiddleware has a transcript-lookup fallback for
when quote.desc is empty: it scans the session transcript for the quoted
message_id and pulls ybres anchors out of its content. That fallback
works for observed (silent) group messages because the platform writer
attaches message_id (yuanbao.py:2091).

It silently fails for @bot agent-processed messages because gateway/run.py
wrote them as {role:user, content, timestamp} with no message_id, so
quoting an earlier @bot turn that contained an image/file couldn't be
resolved.

Fix: attach event.message_id to the user transcript entry at all three
write sites in gateway/run.py — the agent_failed_early branch, the
no-new-messages edge case, and the normal agent path (first user-role
entry in new_messages).

Surfaces gap reported in NousResearch#27425 (loongfay) using the existing fallback
already on main; no new caches needed.

Co-authored-by: loongfay <loongfay@users.noreply.github.com>
…aware) (NousResearch#27871)

Adds a 'triage_aux_unavailable' diagnostic for tasks stuck in triage when
neither the active aux helper slot nor the main-model auto fallback is usable.

Auto-decompose aware:
- kanban.auto_decompose=True (default): primary is auxiliary.kanban_decomposer,
  triage_specifier is the fanout=false fallback.
- kanban.auto_decompose=False: primary is auxiliary.triage_specifier (manual
  'hermes kanban specify' path).

Default aux slots use 'provider: auto' which falls back to the main model, so
this rule only fires when both the explicit slot config AND the main-model
auto fallback are absent. Quiet by default; informative when there is a real
config gap.

Also adds kd.config_from_runtime_config() that carries kanban + auxiliary +
model keys through to diagnostics, and updates CLI/dashboard call sites to
use it. config_from_kanban_config() is preserved for back-compat.

Reworks the original PR NousResearch#25640 idea (@qWaitCrypto) to align with the new
auto-decompose dispatcher path landed in NousResearch#27572. The original PR pointed only
at auxiliary.triage_specifier, which is now the fallback rather than the
primary helper.

Co-authored-by: qWaitCrypto <axmaiqiu@gmail.com>
…om any agent surface (NousResearch#27813)

The agent can now produce a chart, PDF, spreadsheet, or any other supported
file type and have it land in Slack / Discord / Telegram / WhatsApp / etc.
as a native attachment, just by mentioning the absolute path in its
response. Same primitive works for kanban-worker completions: workers
attach artifacts via kanban_complete(artifacts=[...]) and the gateway
notifier uploads them alongside the completion message.

Changes:

- gateway/platforms/base.py: extract_local_files now covers PDFs, docx,
  spreadsheets (xlsx/csv/json/yaml), presentations (pptx), archives
  (zip/tar/gz), audio (mp3/wav/...), and html — not just images and video.
  Image/video extensions still embed inline; everything else routes to
  send_document via the existing dispatch partition in gateway/run.py.

- tools/kanban_tools.py + hermes_cli/kanban_db.py: kanban_complete gains
  an explicit ``artifacts`` parameter. The handler stashes it in
  metadata.artifacts (for downstream workers) and the kernel promotes
  it onto the completed-event payload so the notifier can find it
  without a second SQL round-trip.

- gateway/run.py: _kanban_notifier_watcher now calls a new helper
  _deliver_kanban_artifacts after sending the completion text. The
  helper reads payload.artifacts (preferred), falls back to scanning
  the payload summary and task.result with extract_local_files, then
  partitions images / videos / documents and uploads each via
  send_multiple_images / send_video / send_document.

- website/docs/user-guide/features/deliverable-mode.md + sidebars.ts:
  user-facing docs page covering the extension list, the kanban
  artifacts pattern, and the MCP-for-connector-breadth recommendation.

Tests:

- tests/gateway/test_extract_local_files.py: 7 new test cases
  (documents, spreadsheets, presentations, audio, archives, html,
  chart-pdf canonical case). 44 passing, 0 regressions.
- tests/tools/test_kanban_tools.py: 4 new cases covering the artifacts
  arg shape (list / string / merge with existing metadata / type
  rejection). 17 passing.
- tests/hermes_cli/test_kanban_notify.py: 2 new cases covering full
  notifier → artifact-upload path and missing-file silent-skip. 12
  passing.
- E2E (real files, real kanban kernel, real BasePlatformAdapter):
  worker calls kanban_complete(artifacts=[png,pdf,csv]) → metadata +
  event payload land → notifier helper partitions correctly →
  send_multiple_images called once with the PNG, send_document called
  twice with PDF + CSV.

What's NOT in this PR (deferred to follow-ups):

- Ad-hoc "research this for two hours, ping the thread when done"
  slash command — covered today by kanban subscriptions; a dedicated
  slash command can ride a follow-up PR if needed.
- Setup-wizard prompt for recommended MCP servers (Notion, GitHub,
  Linear, etc.) — docs page lists them; UI is a separate change.

Plan and rationale captured in ~/.hermes/docs/perplexity-computer-parity.pdf
(local doc, not shipped).
…earch#27843)

* feat(config): add install-method stamping + Docker detection

Dockerfile stamps "docker", install.sh stamps "git", and cmd_postinstall
stamps "pip" into ~/.hermes/.install_method. detect_install_method() reads
the stamp first, then falls back to managed-system / container / .git
heuristics. Adds Docker upgrade guidance.

Tracking: NousResearch#27826

* fix(stamp): move Docker stamp to entrypoint, install.sh stamp after print_success

The Dockerfile stamp was overwritten by the VOLUME overlay at container
start. Moving it to entrypoint.sh ensures it persists. The install.sh
stamp now writes after print_success so it only lands on full success.
…s1 + detection (NousResearch#27845)

* feat(dep_ensure): complete Windows bootstrap — dep_ensure + install.ps1 + detection

dep_ensure.py gains Windows awareness: PowerShell invocation, platform-
specific browser detection, (path, shell) tuple returns.

install.ps1 gains -Ensure/-PostInstall modes using npm -g --prefix
(aligned with install.sh) and agent-browser install for Chromium.

browser_tool.py gains node/ in candidate dirs for Windows .cmd shims.
Both install scripts bundled in pip wheel.

Tracking: NousResearch#27826

* fix(install.ps1): add --ignore-scripts to npm install for camofox

@askjo/camofox-browser has a dependency (impit) whose postinstall
script runs `npx only-allow pnpm`, which fails under npm. Adding
--ignore-scripts avoids the spurious failure without affecting
functionality.

Tracking: NousResearch#27826

* fix: remove duplicate install scripts from git

CI already copies scripts/install.{sh,ps1} into hermes_cli/scripts/
during wheel build. No need to commit copies — .gitignore keeps them
out, _find_install_script() falls back to scripts/ for git-clone users.

Tracking: NousResearch#27826

* fix: address review — remove env_extra, fix ps1 error handling

- Remove unused env_extra parameter from ensure_dependency()
- Invoke-EnsureMode node case now uses Test-Node consistently
- Install-AgentBrowser uses throw instead of exit 1
…sh,ps1} (NousResearch#27851)

* refactor(bootstrap): consolidate ACP browser bootstrap into install.{sh,ps1}

Delete 687 lines of duplicated browser bootstrap code from
acp_adapter/bootstrap/. All browser installation now routes through
dep_ensure -> install.{sh,ps1} --ensure, using agent-browser install
for Chromium. install.sh gains ensure_browser() with macOS app-bundle
detection and per-distro guidance.

Tracking: NousResearch#27826

* fix(install.sh): add --ignore-scripts to npm install for camofox

@askjo/camofox-browser has a dependency (impit) whose postinstall
script runs `npx only-allow pnpm`, which fails under npm. Adding
--ignore-scripts avoids the spurious failure without affecting
functionality.

Tracking: NousResearch#27826

* fix: add explicit return in ensure_browser, narrow exception in entry.py

ensure_browser() now returns 0 explicitly on all success paths.
_run_setup_browser() catches OSError instead of broad Exception,
letting ImportError propagate as a real packaging bug.
…al-statusbar

fix(tui): keep /goal verdict out of compact status row
Daily sync with upstream. Auto-created by cron job.
@github-actions

github-actions Bot commented May 18, 2026

Copy link
Copy Markdown

🔎 Lint report: sync/upstream-20260518 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8793 on HEAD, 8711 on base (🆕 +82)

🆕 New issues (40):

Rule Count
invalid-argument-type 13
unresolved-import 12
unresolved-attribute 7
unused-type-ignore-comment 3
unknown-argument 2
invalid-parameter-default 1
no-matching-overload 1
invalid-assignment 1
First entries
tests/hermes_cli/test_kanban_decompose.py:144: [unresolved-attribute] unresolved-attribute: Attribute `status` is not defined on `None` in union `Task | None`
agent/auxiliary_client.py:2732: [unknown-argument] unknown-argument: Argument `api_key` does not match any known parameter of function `resolve_provider_client`
agent/auxiliary_client.py:2731: [unknown-argument] unknown-argument: Argument `base_url` does not match any known parameter of function `resolve_provider_client`
tests/hermes_cli/test_kanban_decompose_db.py:9: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/hermes_cli/test_kanban_decompose.py:187: [unresolved-attribute] unresolved-attribute: Attribute `assignee` is not defined on `None` in union `Task | None`
tests/hermes_cli/test_web_oauth_dispatch.py:27: [unresolved-import] unresolved-import: Cannot resolve imported module `httpx`
tests/tools/test_session_search.py:98: [unresolved-attribute] unresolved-attribute: Attribute `lower` is not defined on `dict[str, str | dict[str, dict[str, str] | dict[str, str | int] | dict[str, str | list[str]]] | list[Unknown]]` in union `str | dict[str, str | dict[str, dict[str, str] | dict[str, str | int] | dict[str, str | list[str]]] | list[Unknown]]`
hermes_cli/main.py:9206: [invalid-argument-type] invalid-argument-type: Argument to function `normalize_profile_name` is incorrect: Expected `str`, found `Any | None`
tests/hermes_cli/test_kanban_decompose.py:15: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
hermes_cli/profile_describer.py:214: [unused-type-ignore-comment] unused-type-ignore-comment: Unused blanket `type: ignore` directive
agent/tool_executor.py:626: [invalid-argument-type] invalid-argument-type: Argument to function `session_search` is incorrect: Expected `int`, found `Any | None`
tests/hermes_state/test_get_messages_around.py:8: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/tools/test_session_search.py:198: [invalid-argument-type] invalid-argument-type: Argument to function `session_search` is incorrect: Expected `int`, found `Literal["bogus"]`
tests/hermes_cli/test_profile_describer.py:11: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/tools/test_session_search.py:86: [invalid-argument-type] invalid-argument-type: Method `__getitem__` of type `Overload[(key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> LiteralString, (key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> str]` cannot be called with key of type `Literal["sort"]` on object of type `str`
hermes_cli/main.py:9210: [invalid-argument-type] invalid-argument-type: Argument to function `get_profile_dir` is incorrect: Expected `str`, found `Any | None`
tests/tools/test_session_search.py:85: [invalid-argument-type] invalid-argument-type: Method `__getitem__` of type `Overload[(key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> LiteralString, (key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> str]` cannot be called with key of type `Literal["properties"]` on object of type `str`
tests/hermes_state/test_get_anchored_view.py:7: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/agent/test_system_prompt_restore.py:21: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
hermes_cli/main.py:9236: [invalid-argument-type] invalid-argument-type: Argument to function `describe_profile` is incorrect: Expected `str`, found `str | Any | None`
plugins/kanban/dashboard/plugin_api.py:1597: [unused-type-ignore-comment] unused-type-ignore-comment: Unused blanket `type: ignore` directive
tests/tools/test_session_search.py:346: [invalid-argument-type] invalid-argument-type: Argument to function `session_search` is incorrect: Expected `int`, found `Literal["not-an-int"]`
tests/hermes_cli/test_kanban_decompose_db.py:67: [unresolved-attribute] unresolved-attribute: Attribute `status` is not defined on `None` in union `Task | None`
tools/session_search_tool.py:386: [invalid-parameter-default] invalid-parameter-default: Default value of type `None` is not assignable to annotated parameter type `int`
agent/auxiliary_client.py:2691: [no-matching-overload] no-matching-overload: No overload of bound method `dict.get` matches arguments
... and 15 more

✅ Fixed issues (10):

Rule Count
invalid-argument-type 4
unresolved-attribute 2
invalid-assignment 1
unresolved-import 1
invalid-return-type 1
unresolved-reference 1
First entries
tests/tools/test_session_search.py:327: [unresolved-attribute] unresolved-attribute: Unresolved attribute `SessionDB` on type `ModuleType`
tools/session_search_tool.py:375: [invalid-argument-type] invalid-argument-type: Argument to bound method `SessionDB.search_messages` is incorrect: Expected `list[str]`, found `None | list[str]`
tools/xai_http.py:11: [invalid-assignment] invalid-assignment: Object of type `None` is not assignable to `def get_env_value(key: str) -> str | None`
tests/tools/test_session_search.py:441: [invalid-argument-type] invalid-argument-type: Argument to function `session_search` is incorrect: Expected `int`, found `None`
tests/tools/test_session_search.py:467: [invalid-argument-type] invalid-argument-type: Argument to function `session_search` is incorrect: Expected `int`, found `Literal["2"]`
tests/hermes_cli/test_web_oauth_dispatch.py:26: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tools/session_search_tool.py:474: [invalid-return-type] invalid-return-type: Return type does not match returned value: expected `list[str | Exception]`, found `list[str | None | BaseException]`
run_agent.py:3612: [unresolved-attribute] unresolved-attribute: Object of type `Self@_needs_kimi_tool_reasoning` has no attribute `model`
tests/agent/test_skill_commands.py:47: [unresolved-reference] unresolved-reference: Name `pytest` used when not defined
tests/tools/test_session_search.py:454: [invalid-argument-type] invalid-argument-type: Argument to function `session_search` is incorrect: Expected `int`, found `<class 'int'>`

Unchanged: 4583 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@bot-ted bot-ted merged commit afedefe into main May 18, 2026
19 of 20 checks passed
@bot-ted bot-ted deleted the sync/upstream-20260518 branch May 18, 2026 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.