Skip to content

feat(dashboard-auth): rotate dashboard sessions via refresh token#37247

Merged
benbarclay merged 3 commits into
mainfrom
feat/dashboard-oauth-refresh-client
Jun 2, 2026
Merged

feat(dashboard-auth): rotate dashboard sessions via refresh token#37247
benbarclay merged 3 commits into
mainfrom
feat/dashboard-oauth-refresh-client

Conversation

@benbarclay

Copy link
Copy Markdown
Collaborator

Summary

The dashboard auth-code grant now issues a 24h rotating refresh token (server side: NousResearch/nous-account-service#293). This is the Hermes client half — it wires up transparent access-token refresh so an active dashboard user is no longer bounced to /login every 15 minutes (the access-token TTL).

Previously refresh_session was an unconditional RefreshExpiredError ("no refresh token in V1") and the middleware went straight to re-login when the access token expired.

Depends on: NousResearch/nous-account-service#293 (server-side RT issuance). Merge that first.

What changed

plugins/dashboard_auth/nous/__init__.py

  • refresh_session() POSTs grant_type=refresh_token to Portal's token endpoint and returns a Session carrying the rotated refresh token. The RT is sent in both the request body and the X-Refresh-Token header — Portal's schema requires it in the body (header-only is rejected invalid_request), and the header lets Portal keep it out of body-access-logs. (This body-vs-header detail was caught by live testing against the fix: eliminate shell noise from terminal output and fix 36 test failures #293 preview — see Validation.)
  • 400 (expired / revoked / reuse-detected) → RefreshExpiredError → middleware forces clean re-login. Network error → ProviderError. Empty RT → fast-fail, no network call.
  • complete_login now captures the initial RT (forward-tolerant: "" if a deploy omits it).
  • Extracted shared _token_response_to_session(response, *, bad_request_exc=...) so the auth-code path raises InvalidCodeError and the refresh path raises RefreshExpiredError from one code path.
  • revoke_session stays a best-effort no-op: Portal exposes no public token-endpoint revoke grant (revocation is the authenticated /sessions UI, keyed by sessionId+userId), so logout is cookie-clearing and the 24h session expires on its own. Documented for a future revoke grant.

hermes_cli/dashboard_auth/middleware.py

  • On an expired/invalid access token the gate now attempts refresh via the session RT before forcing re-login. On success it serves the request and re-sets the rotated cookies on the response. This write-back is mandatory: Portal rotates the RT on every refresh and reuse-detects, so a stale RT cookie would replay a rotated token on the next refresh and (outside Portal's 60s grace) revoke the whole session.
  • RefreshExpiredError / no-RT → fall through to clear-and-relogin. ProviderError (Portal unreachable) → clean re-login, not a 500.
  • Uses the existing REFRESH_SUCCESS / REFRESH_FAILURE audit events.

Validation

Unit / integration: 176 dashboard-auth tests pass (provider rotated-RT capture, 400/500/network mapping, middleware near-expiry rotation + cookie write-back, gate semantics).

Live E2E against the #293 preview deploy (…nousresearch.wtf), driving the real provider — 3/3:

  • refresh_session(bad rt) → real POST to live /api/oauth/token → Portal 400 → RefreshExpiredError. This caught a real bug: the first cut sent the RT header-only and the live server rejected it invalid_request; fixed to send it in the body too.
  • ✅ Live JWKS fetch + RS256 verification correctly rejects a forged token.
  • ✅ Empty-RT fast-fail (no network).

Validation boundary (honest): the successful happy-path rotation (real Portal-signed RT → rotate → fresh access token + newly-rotated RT) is covered by unit tests + the server-side #293 tests, but was not exercised against live Portal here — that needs an interactive browser OAuth round-trip with a Nous login + a registered agent:* client on the preview. The failure path, JWKS, and contract shape are all live-verified.

Security notes

  • RT cookie is HttpOnly + Secure (HTTPS) + SameSite=Lax, written only on the token-rotation response.
  • 24h session TTL, rotating RT, reuse-detection with whole-session revocation, 60s dashboard grace — all enforced server-side (fix: eliminate shell noise from terminal output and fix 36 test failures #293); this client just persists the rotated RT and forces re-login on any refresh failure.

The dashboard auth-code grant now issues a 24h rotating refresh token
(server side: NousResearch/nous-account-service#293). This wires up the
Hermes client half so an expired access token is transparently refreshed
instead of bouncing the user to /login every 15 minutes.

plugins/dashboard_auth/nous:
- refresh_session() now POSTs grant_type=refresh_token to Portal's token
  endpoint and returns a Session carrying the ROTATED refresh token (was
  an unconditional RefreshExpiredError under the old "no RT in V1"
  contract). The RT is sent in BOTH the request body (Portal's schema
  requires it there) and the X-Refresh-Token header (log redaction) —
  verified against the #293 preview deploy: header-only is rejected as
  invalid_request, body is accepted.
- A 400 from Portal (expired / revoked / reuse-detected) maps to
  RefreshExpiredError so the middleware forces a clean re-login; network
  errors map to ProviderError; empty RT fast-fails without a network call.
- complete_login now captures the initial refresh token Portal returns
  (forward-tolerant: empty string if a deploy omits it).
- Extracted the shared token-response handling into
  _token_response_to_session, parameterised on the 400 exception type so
  the auth-code path raises InvalidCodeError and the refresh path raises
  RefreshExpiredError.
- revoke_session stays a best-effort no-op: Portal exposes no public
  token-endpoint revocation grant (revocation is the authenticated
  /sessions UI, keyed by sessionId+userId), so logout is cookie-clearing
  and the 24h session expires on its own. Documented for a future
  revoke grant.

hermes_cli/dashboard_auth/middleware:
- On an expired/invalid access token the gate now attempts refresh via
  the session's RT BEFORE forcing re-login. On success it serves the
  request and re-sets the rotated cookies on the response (mandatory:
  Portal rotates the RT every refresh and reuse-detects, so a stale RT
  cookie would revoke the whole session on the next refresh). On
  RefreshExpiredError (or no RT) it falls through to clear-and-relogin.
- ProviderError during refresh (Portal unreachable) forces a clean
  re-login rather than 500-ing the request.
- Uses the existing REFRESH_SUCCESS / REFRESH_FAILURE audit events.

Validation:
- 176 dashboard-auth unit/integration tests pass.
- Live E2E against the #293 preview deploy: refresh_session(bad rt) ->
  RefreshExpiredError through the real token endpoint; live JWKS fetch +
  RS256 verification rejects a forged token; empty-RT fast-fail. The
  successful happy-path rotation is covered by unit tests (a live run
  needs an interactive browser OAuth round trip + registered agent:*
  client).

Depends on: NousResearch/nous-account-service#293 (server-side RT issuance).
@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

🔎 Lint report: feat/dashboard-oauth-refresh-client vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9647 on HEAD, 9647 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4997 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

The refresh-token header must match Portal's REFRESH_TOKEN_HEADER exactly
("x-nous-refresh-token"); the initial cut used "X-Refresh-Token", which
Portal silently ignores (harmless since the RT is also in the body, which
is what the schema requires — but the header redaction was a no-op).
Confirmed against the NAS token route + re-validated live against the
#293 preview deploy.
@benbarclay

Copy link
Copy Markdown
Collaborator Author

✅ Live happy-path validation complete (gap closed)

The one gap noted in the PR description — successful happy-path rotation against live Portal — is now closed. Validated end-to-end against the #293 preview deploy using a real Portal-signed agent:* refresh token minted via an actual browser login (gated dashboard bound 0.0.0.0, localhost redirect_uri which Portal accepts):

1. Happy-path rotationrefresh_session(real RT):

  • Fresh access token returned (real signed JWT, len 1501)
  • Access TTL = 899s (~15 min, as designed)
  • A new, different refresh token came back (rotation confirmed → middleware has a value to persist)
  • Real user_id / org_id resolved from the rotated token

2. Reuse-detection lifecycle — replaying the now-rotated RT:

  • Within Portal's 60s grace: accepted, returns the successor (covers the benign multi-tab / retry race)
  • After 60s grace: 400 invalid_grantRefreshExpiredError → clean re-login (server-side: family revocation)

This also incidentally confirms the DASHBOARD_REFRESH_DUPLICATE_GRACE_SECONDS = 60 constant from #293 is live and behaving exactly as specified (the device-flow 24h grace would not show this transition in ~65s).

Minor non-blocking note: Portal's rotated access tokens omit the oauth_contract_version claim (the provider tolerates this — warns + proceeds, per its documented contract-version handling). Harmless; flagging as a Portal-side observation, not a client defect.

Combined with the earlier live checks (refresh-failure → RefreshExpiredError, JWKS forgery rejection) and 176 passing unit/integration tests, every code path is now verified against the deployed server.

@alt-glitch alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have comp/cli CLI entry point, hermes_cli/, setup wizard comp/plugins Plugin system and bundled plugins area/auth Authentication, OAuth, credential pools labels Jun 2, 2026
…n evicted

The gated middleware bounced users to /login the instant the access-token
cookie was absent, without ever consulting the refresh token:

    at, _rt = read_session_cookies(request)
    if not at:
        return _unauth_response(...)   # bailed here

This made transparent refresh effectively dead for the common case. The
access-token cookie is set with Max-Age = access_token_expires_in (~15 min),
so a real browser EVICTS hermes_session_at the moment the token lapses while
hermes_session_rt persists (30-day Max-Age). From that point the browser
sends only the refresh-token cookie — and the old guard rejected it before
_attempt_refresh could run. The _attempt_refresh path only fired for a
present-but-invalid access token, which never happens in a browser.

Fix: only hard-bounce when NEITHER cookie is present. A request carrying
just the refresh token now skips verification (no AT to verify) and flows
into the existing refresh path, which rotates both cookies and serves the
request transparently. A dead/expired RT still raises RefreshExpiredError
and falls through to clear-and-relogin.

This failure mode escaped the original tests + manual refresh button because
both kept the access-token cookie present; only a real browser evicting the
cookie at Max-Age exposes it. Added 3 regression tests covering: AT-evicted +
RT-present (transparent refresh), no-cookies (still bounces), and RT-only with
a dead RT (clean 401, no 500).
@benbarclay

Copy link
Copy Markdown
Collaborator Author

Fix: transparent refresh was dead for the common expiry case (commit 5e3f230)

While verifying the refresh feature live in a real browser, the session still bounced to /login the moment the access token expired — the exact thing this PR was meant to prevent. Root-caused it to a real logic bug in the gated middleware.

The bug

The gate bailed the instant the access-token cookie was absent, before ever looking at the refresh token:

at, _rt = read_session_cookies(request)
if not at:
    return _unauth_response(request, reason="no_cookie")   # ← bounced here

The _attempt_refresh path only ran for a present-but-invalid access token. But that state never occurs in a browser: the access-token cookie is set with Max-Age = access_token_expires_in (~15 min), so the browser evicts hermes_session_at the instant the token lapses, while hermes_session_rt persists (30-day Max-Age). From that point the browser sends only the refresh-token cookie — and the if not at: guard rejected it outright.

Captured live at the moment of the bounce (debug instrumentation, since removed):

GATE path=/api/sessions/stats  at=MISSING  rt=present  cookies=[..., 'hermes_session_rt']

at=MISSING rt=present → bounced to /login with zero REFRESH_* audit events, i.e. the refresh code never ran.

Why it slipped through

Both the in-process tests and the manual /refresh debug button happened to keep the access-token cookie present (the button POSTs the RT explicitly). Only a real browser evicting the cookie at Max-Age reproduces it — which is why mocked coverage and manual testing both passed while the live flow failed.

The fix

Only hard-bounce when neither cookie is present. A request carrying just the RT skips verification (no AT to verify) and flows into the existing refresh path, which rotates both cookies and serves the request transparently. A dead/expired RT still raises RefreshExpiredError → clear-and-relogin.

Tests

3 regression tests added to test_dashboard_auth_401_reauth.py:

  • AT evicted + RT present → 200, both cookies rotated (the bug)
  • No cookies at all → 401 bounce (guards against over-reach)
  • RT-only with a dead RT → 401 session_expired, no 500

Full dashboard_auth + nous_provider suite: 312 passed, 3 skipped, no regressions.

Live confirmation

Verified end-to-end against the staging Portal (NAS #293 deployed): logged in, let the access token expire, navigated — stayed logged in, REFRESH_SUCCESS landed in ~/.hermes/logs/dashboard-auth.log instead of a bounce.

@benbarclay benbarclay merged commit c10ccaa into main Jun 2, 2026
23 checks passed
@benbarclay benbarclay deleted the feat/dashboard-oauth-refresh-client branch June 2, 2026 11:16
nampv-itomo added a commit to nampv-itomo/hermes-agent that referenced this pull request Jun 4, 2026
* refactor(desktop): consolidate skills + tools management into one pane

The left-nav Skills pane and Settings > Skills & Tools rendered the same
getSkills()/getToolsets() data with the same helpers and toggles — genuine
duplication that drifted (different default category labels, sort orders).

Make the left pane the single home: it keeps its category-tabbed browsing
and now gains the functional bits it lacked — a real toolset enable/disable
switch (was a read-only pill) and the expandable ToolsetConfigPanel for
provider selection + per-key credential config. Remove the Tools section
from Settings (nav item, view branch, query slot, type union entries) and
delete tools-settings.tsx, migrating its toggle coverage into the skills
pane test. Relabel the entry point to 'Skills & Tools' in the sidebar and
command center.

* refactor(desktop): move model management from Command Center into Settings

Command Center's Models section and Settings > Model rendered the same
model state with identical persistence semantics — both write config and
apply to new sessions only (POST /api/model/set). The Command Center UI
was strictly better (provider catalog, curated model lists, friendly
auxiliary-task labels, Nous-gateway auto-routing on main-provider switch),
while Settings > Model was three barebones config fields.

Extract that UI into a shared settings/model-settings.tsx (restyled with
Settings primitives) and render it at the top of Settings > Model: main
model picker via setModelAssignment + the 9 auxiliary task slots with
per-task set-to-main / change / reset-all. model_context_length and
fallback_providers stay as config fields below it; the raw auxiliary.*
keys are dropped from Advanced (now covered by the panel).

Strip the Models section from Command Center entirely (section, state,
handlers, render, nav, search entry) leaving it focused on Sessions /
System / Usage, and move the live store-sync callback (onMainModelChanged)
from CommandCenterView to SettingsView. The composer's per-session model
picker (the only live hot-swap, via /model) is unchanged.

* feat(dashboard-auth): rotate dashboard sessions via refresh token (#37247)

* feat(dashboard-auth): rotate dashboard sessions via refresh token

The dashboard auth-code grant now issues a 24h rotating refresh token
(server side: NousResearch/nous-account-service#293). This wires up the
Hermes client half so an expired access token is transparently refreshed
instead of bouncing the user to /login every 15 minutes.

plugins/dashboard_auth/nous:
- refresh_session() now POSTs grant_type=refresh_token to Portal's token
  endpoint and returns a Session carrying the ROTATED refresh token (was
  an unconditional RefreshExpiredError under the old "no RT in V1"
  contract). The RT is sent in BOTH the request body (Portal's schema
  requires it there) and the X-Refresh-Token header (log redaction) —
  verified against the #293 preview deploy: header-only is rejected as
  invalid_request, body is accepted.
- A 400 from Portal (expired / revoked / reuse-detected) maps to
  RefreshExpiredError so the middleware forces a clean re-login; network
  errors map to ProviderError; empty RT fast-fails without a network call.
- complete_login now captures the initial refresh token Portal returns
  (forward-tolerant: empty string if a deploy omits it).
- Extracted the shared token-response handling into
  _token_response_to_session, parameterised on the 400 exception type so
  the auth-code path raises InvalidCodeError and the refresh path raises
  RefreshExpiredError.
- revoke_session stays a best-effort no-op: Portal exposes no public
  token-endpoint revocation grant (revocation is the authenticated
  /sessions UI, keyed by sessionId+userId), so logout is cookie-clearing
  and the 24h session expires on its own. Documented for a future
  revoke grant.

hermes_cli/dashboard_auth/middleware:
- On an expired/invalid access token the gate now attempts refresh via
  the session's RT BEFORE forcing re-login. On success it serves the
  request and re-sets the rotated cookies on the response (mandatory:
  Portal rotates the RT every refresh and reuse-detects, so a stale RT
  cookie would revoke the whole session on the next refresh). On
  RefreshExpiredError (or no RT) it falls through to clear-and-relogin.
- ProviderError during refresh (Portal unreachable) forces a clean
  re-login rather than 500-ing the request.
- Uses the existing REFRESH_SUCCESS / REFRESH_FAILURE audit events.

Validation:
- 176 dashboard-auth unit/integration tests pass.
- Live E2E against the #293 preview deploy: refresh_session(bad rt) ->
  RefreshExpiredError through the real token endpoint; live JWKS fetch +
  RS256 verification rejects a forged token; empty-RT fast-fail. The
  successful happy-path rotation is covered by unit tests (a live run
  needs an interactive browser OAuth round trip + registered agent:*
  client).

Depends on: NousResearch/nous-account-service#293 (server-side RT issuance).

* fix(dashboard-auth): use Portal's x-nous-refresh-token header name

The refresh-token header must match Portal's REFRESH_TOKEN_HEADER exactly
("x-nous-refresh-token"); the initial cut used "X-Refresh-Token", which
Portal silently ignores (harmless since the RT is also in the body, which
is what the schema requires — but the header redaction was a no-op).
Confirmed against the NAS token route + re-validated live against the
#293 preview deploy.

* fix(dashboard-auth): refresh session when access-token cookie has been evicted

The gated middleware bounced users to /login the instant the access-token
cookie was absent, without ever consulting the refresh token:

    at, _rt = read_session_cookies(request)
    if not at:
        return _unauth_response(...)   # bailed here

This made transparent refresh effectively dead for the common case. The
access-token cookie is set with Max-Age = access_token_expires_in (~15 min),
so a real browser EVICTS hermes_session_at the moment the token lapses while
hermes_session_rt persists (30-day Max-Age). From that point the browser
sends only the refresh-token cookie — and the old guard rejected it before
_attempt_refresh could run. The _attempt_refresh path only fired for a
present-but-invalid access token, which never happens in a browser.

Fix: only hard-bounce when NEITHER cookie is present. A request carrying
just the refresh token now skips verification (no AT to verify) and flows
into the existing refresh path, which rotates both cookies and serves the
request transparently. A dead/expired RT still raises RefreshExpiredError
and falls through to clear-and-relogin.

This failure mode escaped the original tests + manual refresh button because
both kept the access-token cookie present; only a real browser evicting the
cookie at Max-Age exposes it. Added 3 regression tests covering: AT-evicted +
RT-present (transparent refresh), no-cookies (still bounces), and RT-only with
a dead RT (clean 401, no 500).

* fix(desktop): keep pinned + recent sessions visible across compression

Long-running sessions auto-compress: the gateway ends the original session
and surfaces the live continuation under a new id (list_sessions_rich projects
the root forward to its tip). Two symptoms fell out of the id rotation:

- A pinned session "vanished" — the pin is stored as the pre-compression root
  id, but the sidebar only matched on the live id, so it was filtered out.
  Pins now resolve on the durable lineage-root id (`_lineage_root_id`, already
  surfaced by the projection): the sidebar indexes sessions by both ids, pin/
  unpin and reorder operate on the durable id, and `sessionPinId()` is shared
  with the Cmd+P toggle. Existing pins keep working with no migration.

- A freshly-continued session was missing from the list until you ungrouped +
  "load 50 more" — the list paginated by original start time, so an old-but-
  active conversation sat past the first page. The desktop now requests
  `order=recent` (GET /api/sessions gains an `order` param backed by the
  existing recency CTE), surfacing live continuations on the first page.

* feat(desktop): stable in-workspace ordering + No-workspace default

- Sidebar: rows within a workspace group now sort by creation time instead of
  last activity, so they stop reshuffling every time a message lands (muscle
  memory). Groups still float up by recency.
- Sessions only persist a workspace cwd when one was explicitly chosen; an
  auto-detected launch directory is no longer stamped on the row, so untargeted
  sessions group under "No workspace" instead of "desktop". The agent still
  runs in the detected directory.

* feat(desktop): session search in the sidebar

Adds a search box above the session list. Loaded sessions match instantly
client-side; a debounced full-text search (existing /api/sessions/search FTS)
covers the rest so all sessions stay findable at 699+. Results replace the
pinned/agents sections while a query is active and resume on click.

* feat(streaming): per-platform streaming defaults (Telegram on, Discord off) + dashboard toggles (#37303)

Streaming quality differs sharply by platform: Telegram has native animated
draft streaming (sendMessageDraft) which is smooth, while Discord/Slack only
have edit-based streaming (repeated editMessage) which visibly flickers. Ship
defaults that match reality instead of one global flag.

- hermes_cli/config.py: DEFAULT_CONFIG display.platforms now ships
  telegram.streaming=true and discord.streaming=false (was empty {}). These
  are gap-fillers — config deep-merge has user values win, so anyone who
  explicitly sets discord.streaming=true keeps it. The global
  streaming.enabled master switch still gates everything; these per-platform
  flags only take effect once streaming is on.
- Dashboard exposure comes for free: the web settings schema is generated
  from DEFAULT_CONFIG, so display.platforms.telegram.streaming and
  .discord.streaming now surface as editable boolean toggles in the UI with
  no frontend change. (Previously the per-platform tree was {} and invisible.)
- tests: pin the defaults, the resolver outcome (telegram on / discord off /
  unlisted platforms follow global), user-override-wins, and dashboard schema
  exposure.

No _config_version bump: deep-merge fills the gap for existing installs; no
value migration needed.

* fix(model-picker): OpenAI shows curated models; OpenRouter no longer phantom-shows (#37404)

The model picker now matches `hermes model` for OpenAI, and OpenRouter
stops appearing as authenticated when only OPENAI_API_KEY is set.

- models.py: provider_model_ids() for the default api.openai.com endpoint
  intersects the live /v1/models dump (120+ entries incl. embeddings,
  whisper, tts, dall-e, moderation, legacy chat) with the curated agentic
  list, preserving curated order. Custom OpenAI-compatible endpoints keep
  the live list verbatim so discovery still works.
- providers.py: drop extra_env_vars=("OPENAI_API_KEY",) from the openrouter
  overlay. list_authenticated_providers reads extra_env_vars to decide
  whether a provider is authenticated, so any OpenAI user saw a phantom
  OpenRouter row. Runtime OpenRouter credential resolution still falls back
  to OPENAI_API_KEY (runtime_provider.py), independent of the overlay.
- Regression tests for both paths.

* feat(desktop): cancellable first-launch install

The install overlay had no way to stop a running install — the runner already
supported an abortSignal, but nothing drove it. Wire it end to end:

- main.cjs holds an AbortController for the active runBootstrap and aborts it
  on a new hermes:bootstrap:cancel IPC and on app quit, so quitting/cancelling
  mid-install actually kills install.sh/ps1 instead of orphaning it.
- runBootstrap bails before spawning anything if the signal is already aborted.
- Install overlay gains a "Cancel install" button while a bootstrap is active;
  a cancel surfaces the recovery overlay (retry/repair).

Test: electron/bootstrap-runner.test.cjs asserts the already-aborted early
return (no spawn) via `node --test`.

* chore: uptick

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* docs: add Desktop App guide (#37457)

The native Electron desktop app shipped (PR #20059 and follow-ups) but the
docs only told people how to download it, not what it is or how to use it.

Adds website/docs/user-guide/desktop.md covering install (installer +
prebuilt + Windows GUI), the chat-first UI and management panes, the
hermes desktop CLI flag reference, self-update, how-it-works, and
troubleshooting. Sourced from apps/desktop/README.md, routes.ts, and the
real argparse. Wired into sidebars.ts under Interfaces after the TUI.

* Merge pull request #37462 from NousResearch/bb/desktop-update-throttle

fix(desktop): throttle the update-available toast

* fix(docs): update desktop app docs

* feat(dashboard): nous-blue theme, bulk sessions, schedule picker (#37383)

* feat(dashboard): nous-blue theme, bulk sessions, schedule picker

Batch of related dashboard improvements gathered on
austin/fix/dashboard-changes:

* Nous Blue theme — faithful port of the LENS_5I overlay system onto
  the existing DashboardTheme. Lifts the foreground inversion layer to
  z-index 200 to fix the long-standing hover / loading visual artifact,
  adds an explicit swatchColors slot so the theme picker shows the
  post-inversion preview, and migrates the legacy "lens-5i" theme key
  from localStorage / API to "nous-blue" on first read.
* Theme-aware series colors: new --series-input-token /
  --series-output-token CSS vars consumed by Analytics + Models
  charts; ToolCall + ModelInfoCard switched to semantic
  --color-success for diff lines and the Tools capability badge.
* Analytics + Models headers: consolidate period selector + refresh
  next to the page title and drop the redundant period badge.
* Bulk session management — "Delete empty (N)" button + per-row
  checkboxes with shift-click range select and a bulk-delete action
  bar. Backed by SessionDB.delete_sessions() /
  delete_empty_sessions() plus POST /api/sessions/bulk-delete and
  DELETE /api/sessions/empty (registered before the templated
  /api/sessions/{session_id} family so they don't get shadowed).
  Hard cap of 500 IDs per bulk request. Full pytest coverage.
* Cron page — human-readable schedule picker (every-interval / daily
  / weekly / monthly / once / custom) replaces the raw cron
  expression input; the job list now renders "Weekly on Mon, Wed,
  Fri at 14:30" instead of "30 14 * * 1,3,5". English-only ordinals
  for monthly schedules so non-English locales don't get incorrect
  suffixes.
* example-dashboard plugin moved from plugins/ to tests/fixtures/ so
  stock installs no longer ship the demo. Tests install it
  dynamically via a pytest fixture that also reorders the FastAPI
  routes.
* i18n: 40+ new keys for the bulk-select UI and schedule
  picker/describer translated across all 16 locales.

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(dashboard): dedupe memory provider picker

The memory provider <Select> lived on both /system and /plugins,
writing the same config.yaml field through two different endpoints
with no cross-page refresh. Remove the picker from /system in favor
of a read-only status row + link to /plugins, where it pairs with
the context-engine picker under "Plugin providers".

/system retains the destructive admin controls (file sizes, Reset
MEMORY.md / USER.md / all). The api.setMemoryProvider client and
PUT /api/memory/provider backend endpoint are left in place for
CLI / script callers.

Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(dashboard): address Copilot review on PR #37383

- Backdrop layer-stack comment claimed LENS_5I-style themes override
  --component-backdrop-bg-blend-mode to multiply, but our only
  LENS_5I-style theme (nous-blue) keeps the default difference.
  Reword to describe what the code actually does and present the
  var as a forward-looking extension hook.
- /api/sessions/bulk-delete docstring promised the response would
  echo back the list of deleted IDs, but the implementation only
  returns {ok, deleted}. Tighten the docstring to match the wire
  format; the client already knows what it asked to delete, so the
  IDs aren't needed.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(dashboard): address copilot review on cron describe + bulk-select checkbox

- schedule.ts: restrict `describeCronExpression` to strictly 5-field cron
  expressions. The backend `parse_schedule` also accepts the 6-field
  `min hour dom month dow year` form, and humanising those by
  destructuring only the first five fields would silently drop the year
  (e.g. ``0 9 * * * 2099`` rendered as "Daily at 09:00"). 6+ field
  expressions now fall through to the raw-string fallback so the user
  sees what's actually scheduled.

- SessionsPage.tsx (SessionRow): wire the bulk-select Checkbox's
  ``onClick`` directly instead of attaching it to a parent ``<span>``
  with a no-op ``onCheckedChange``. Radix forwards onClick to the
  underlying ``<button role=checkbox>``, so the same handler now drives
  both mouse clicks (preserving shift-key state for range select) and
  keyboard activation (Space on the focused checkbox, which the browser
  synthesises as a click on the <button>). Improves a11y / keyboard UX
  without changing the controlled-selection model.

- SessionsPage.tsx: also extend ``SessionRowProps`` with the new
  ``onRename`` / ``onExport`` props introduced on main so the row's
  destructured prop types resolve after the merge.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>

* Clarify desktop install retry guidance

* fix(auth): align Codex OAuth persistence paths (#37517)

* fix(desktop): codex OAuth onboarding now resolves on fresh install

The desktop codex device-code worker persisted tokens with a hand-rolled
pool.add_entry(), writing only credential_pool.openai-codex. It never set
active_provider, so on a fresh install the onboarding setup.runtime_check
resolved provider "auto", couldn't detect the Codex OAuth session, and raised
"No inference provider configured" — while setup.status (which sniffs the pool)
reported configured. The disagreement surfaced as the onboarding banner
"Connected, but Hermes still cannot resolve a usable provider."

Use the canonical _save_codex_tokens() instead, matching the CLI's
`hermes auth add openai-codex` path and the Nous/MiniMax dashboard workers.
It writes the providers.openai-codex singleton (setting active_provider) and
syncs the pool.

* fix(auth): align Codex OAuth persistence paths

Ensure desktop and CLI Codex OAuth logins both write the canonical provider state so fresh installs resolve a usable runtime provider.

---------

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>

* feat(installer): rename macOS installer to "Hermes" and make it a launcher (#37516)

* feat(installer): rename macOS installer to "Hermes" and make it a launcher

The bootstrap installer was branded "Hermes Setup" and always re-ran the full
install flow on every open — so the /Applications app said "Setup" and couldn't
double as a way to relaunch Hermes (the real desktop app lives in ~/.hermes,
not /Applications, with no Dock/Launchpad entry).

Two changes, macOS-focused:

1. Rename the installer's user-visible name to "Hermes" (productName, window
   title, shortDescription, document title). Bundle id stays
   com.nousresearch.hermes.setup (distinct from the desktop app's
   com.nousresearch.hermes); the on-disk staged updater name (hermes-setup) is
   unchanged, so the desktop's update hand-off still resolves it.

2. Launcher fast path: on a bare ("Install") launch, if Hermes is already
   installed (bootstrap-complete marker + a built desktop app on disk), skip the
   installer UI entirely and relaunch the desktop app, then exit. First run still
   installs; Update mode and fresh/repair installs still show the UI. The window
   now starts hidden ("visible": false) and is revealed only when the UI is
   actually needed, so the launcher path never flashes a window.

Net UX: one "Hermes" in /Applications you can pin to the Dock — first click
installs, every later click opens the app instantly (same icon throughout, so
the Dock stays seamless). Nothing pins to the Dock permanently; the app shows a
normal Dock icon only while running.

Windows naming is intentionally left as-is in this change (scope: macOS).

* fix(installer): gate launcher fast path to macOS + log window-show failures

Address review feedback:
- Gate the already-installed launcher fast path to macOS (cfg!(target_os =
  "macos")). On Windows/Linux the installer keeps its prior behavior, so the
  change is a pure no-op there. This avoids relaunching the desktop app on
  Windows via a spawn that lacks the DETACHED_PROCESS + startup-grace handling
  launch_hermes_desktop uses (which could race the installer's exit).
- Add a brief startup grace before exiting on the mac fast path, mirroring
  launch_hermes_desktop.
- Log (instead of silently ignoring) failures to show the main window, and log
  when the "main" window can't be found, so a no-UI state is diagnosable.

* fix(installer): add --reinstall escape hatch + keep spawn detached on Windows

Address follow-up review:
- Add a `--reinstall`/`--repair` flag that forces the installer UI even when
  Hermes is already installed, so a broken install can be repaired by re-running
  setup instead of the launcher fast path silently relaunching the (possibly
  bad) app.
- Apply DETACHED_PROCESS on Windows in spawn_installed_desktop, mirroring
  launch_hermes_desktop, so the helper stays correct cross-platform even though
  its only caller is macOS-gated today.

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* test(installer): unit-test --reinstall/--repair force-setup parsing

Extract the force-setup flag parsing into a unit-testable
`force_setup_from_args` helper (mirrors `AppMode::from_args`) and add tests:
- --reinstall and --repair are recognized
- bare/unrelated args (incl. --update) do not force setup
- the repair flags never affect Install<->Update mode selection

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* feat(desktop): content-hash build stamp with --build-only and --force-build flags

Add a SHA-256 content-hash based build stamp to `hermes desktop` so
unchanged source trees skip the npm install + build step. Uses pathspec
for .gitignore-aware file matching instead of a hardcoded skip-list.

New CLI flags:
- --build-only: run the build but don't launch the app
- --force-build: rebuild even when the stamp matches

`hermes update` now calls `hermes desktop --build-only` so the
desktop app is rebuilt (if needed) as part of the update flow.

16/16 tests passing.

* fix(models): restore gemini-3-flash-preview to Gemini OAuth picker (#37606)

#37046 swapped gemini-3-flash-preview -> gemini-3.5-flash in the
google-gemini-cli (OAuth/Code Assist) picker on the premise that the
preview slug was renamed. It wasn't. Per gemini-cli's models.ts, Code
Assist serves two distinct flash slugs with different access gates:
gemini-3-flash-preview (PREVIEW_GEMINI_FLASH_MODEL — what subscription/
free-tier OAuth users reach) and gemini-3.5-flash
(DEFAULT_GEMINI_3_5_FLASH_MODEL — GA-channel-gated). The model string is
passed verbatim into the {project, model, ...} envelope sent to
cloudcode-pa.googleapis.com, so non-GA users got a hard error on every
prompt because gemini-3.5-flash 404s for them.

Offer both slugs in the OAuth picker (matching gemini-cli's own /model
list) so non-GA users can select the preview flash that works. The
gemini (API-key), OpenRouter, and Nous lists are untouched —
google/gemini-3.5-flash is a real live model on those surfaces.

* fix(desktop): stabilize project folder sessions (#37586)

* fix(desktop): stabilize project folder sessions

Keep desktop folder selection aligned with new sessions and scope TUI gateway cwd through session context so prompts and tools resolve against the selected workspace.

* fix(desktop): address review feedback on folder sessions

Snapshot sessions before iterating to avoid concurrent-mutation crashes,
optional-chain the revealLogs catch, and read console-message args from
the correct Electron event/messageDetails positions.

* fix(desktop): address second review pass on folder sessions

Sync the remembered workspace key with the cwd atom (clear on empty),
only load tree children for real directory nodes, and throttle renderer
auto-reloads so a deterministic startup crash can't loop forever.

* fix(desktop): inherit parent workspace for ephemeral agent tasks

Background and preview tasks use ephemeral ids absent from the session
map, so pass the parent session cwd into the session context explicitly
instead of clearing it back to the gateway launch dir. Also correct the
set_session_vars docstring about clear_session_vars semantics.

* fix(desktop): validate preview cwd before pinning session context

A non-empty but non-existent client cwd would pin an unusable override
and silently fall back to the launch dir. Validate once, reuse for both
the session context and the terminal override, and fall back to the
parent session workspace when invalid.

* fix(desktop): harden preview cwd normalization and adopt normalized cwd

Guard preview cwd normalization against malformed client paths so a bad
input can't fail the whole restart, and adopt the backend's normalized
config.get cwd in the no-active-session path so the persisted workspace
stays consistent with what the agent uses.

* fix(desktop): triage batch of GUI quality-of-life fixes (#37536)

* fix(desktop): triage 24 GUI quality-of-life fixes across sidebar, composer, tool cards, messaging, and platform plumbing

A grab-bag of high-leverage UX fixes plus a few backend touches that the
GUI needs to behave correctly on Windows.

Sidebar / sessions
- Decrement $sessionsTotal on delete + archive so "Load N more" stops
  claiming removed rows are still on the server.
- Hide the "Group by workspace" toggle when no unpinned sessions exist.
- Accept Cmd/Ctrl+N as a "new session" accelerator (in addition to bare
  Shift+N), and render the kbd hint per-platform.
- Switch the statusbar to overflow-x-clip so untitled sessions don't
  paint a horizontal scrollbar at the bottom of the window.

Messaging + Cron
- Add [-webkit-app-region: no-drag] to the page-search input so clicks
  reach the field instead of routing to the OS window-drag handler.
- Replace single-letter PlatformAvatar with brand glyphs from
  @icons-pack/react-simple-icons (telegram, discord, matrix, signal,
  whatsapp, mattermost, wechat, qq, ...). Letter monogram fallback for
  Slack / Dingtalk / Feishu / WeCom (removed from Simple Icons at brand
  owner request).
- Drop the duplicate "Create first cron" button in the empty state.

Composer
- Dedupe pasted images by (name, size, lastModified, type) instead of
  Blob identity; Chromium hands us the same screenshot via both
  clipboard.items and clipboard.files with fresh File instances.
- Enable spellcheck on the contentEditable, configure Chromium's
  spellchecker with the system locale on whenReady, and add
  replaceMisspelling + "Add to dictionary" entries to the context menu.
- Render user messages through a minimal markdown pipeline (inline
  backtick code + fenced ``` blocks) while keeping @file:/@image:
  directive chips intact.
- max-h-[60vh] overflow-y-auto + collisionPadding on the prompt-snippet
  submenu.
- Bake cursor-pointer into the <Button> primitive (with
  disabled:cursor-default) and into titlebarButtonClass.

Dialogs + tabs + version
- Default DialogContent now has max-h-[85vh] overflow-y-auto so long
  bodies scroll instead of falling off-screen.
- Right-rail preview tabs close on middle-click (button === 1), with an
  onMouseDown swallow to suppress Chromium autoscroll.
- New refreshDesktopVersion() helper called from About mount, after
  every update check, and on throttled window focus so About reflects
  the just-installed binary.

Keys + Artifacts + Terminal
- Drop the global "Show advanced" toggle in KeysSettings. Provider
  groups now default-expand when they have any key set.
- Extend openExternalUrl to handle file:// via shell.openPath, with
  showItemInFolder fallback when the OS can't open the file.
- New lib/ansi.ts SGR parser + <AnsiText> component, applied to
  terminal/execute_code tool output.
- ToolView gained stdout / stderr / rendersAnsi; tool-fallback renders
  the two streams as separate labeled blocks with stderr in a neutral
  tone (not destructive — many CLIs log info on stderr).
- Drop 'stderr' from ERROR_MSG_KEYS in tool-result-summary.

Paths + platform
- resolveHermesCwd skips process.cwd() when packaged and prefers a
  user-configurable default project directory.
- New hermes:setting:defaultProjectDir:{get,set,pick} IPC handlers +
  preload bridge + global.d.ts typing + a "Default project directory"
  row in Sessions settings.
- FileOperations.delete_path(path, recursive=True) on the abstract
  base; ShellFileOperations.delete_file rewritten to run a cross-
  platform python3 -c snippet so deletes work on Windows shells (which
  have no rm/rm -rf). Fallback to `python` when `python3` isn't on PATH.
- README troubleshooting block split into macOS/Linux + Windows
  PowerShell recipes.
- Tightened renderer favicon links in index.html + added color-scheme
  and theme-color meta.

Backend lifecycle (renderer-side mitigation)
- New noteSessionActivity() heartbeat + session.ts watchdog: an
  8-minute silence on the stream auto-clears stuck $workingSessionIds
  entries so "Session Busy" never gets permanently wedged. Wired into
  useSessionStateCache so every state update refreshes the timer.

i18n spike
- docs/desktop-i18n-rfc.md scoping a future language-switcher PR
  (recommends react-intl, audits IME/RTL/CJK in the composer +
  chat bubbles, 4-PR rollout plan, ~3-4 eng-weeks for the first
  non-English locale).

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(desktop): replace native OS scrollbar in portaled dropdown menus

Radix's DropdownMenuPrimitive.Portal renders content under document.body,
outside the `.scrollbar-dt` scope on #root. Whenever a menu's max-height
clipped its content (even by a pixel — common for the composer "+" menu
that opens upward near the bottom of the window), the user saw the OS's
chunky native scrollbar painted across the whole menu.

Bake a thin, slot-styled scrollbar onto DropdownMenuContent and
DropdownMenuSubContent via [scrollbar-width:thin] + WebKit pseudo-element
arbitrary variants. The submenu also gets a max-h tied to
--radix-dropdown-menu-content-available-height so long snippet lists scroll
cleanly instead of running off the bottom of the viewport. Drop the now-
redundant max-h-[60vh] override on the prompt-snippet submenu.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(desktop): unbork dropdown menu — submenu opens, parent isn't a circle

Two regressions from the previous dropdown-scrollbar fix:

- The parent menu rendered as a rounded oval. Long Tailwind v4 arbitrary-
  variant strings like [&::-webkit-scrollbar-thumb]:rounded-full inside a
  cn() call were being mis-resolved so the `rounded-full` leaked onto the
  menu container itself. Replaced the whole tower of arbitrary variants
  with a real `.dt-portal-scrollbar` class in styles.css that mirrors what
  `.scrollbar-dt` already does for #root descendants. Plain CSS, no Tailwind
  parser ambiguity.
- The Prompt snippets submenu didn't open. Radix publishes
  --radix-dropdown-menu-content-available-height on Content but NOT on
  SubContent, so the `max-h` bound to that variable computed to 0 and the
  submenu collapsed to zero height. Switched SubContent to a fixed
  max-h-80 (≈20rem) which is plenty for a snippet list and never collapses.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(desktop): promote prompt snippets from Radix submenu to a real Dialog

The submenu refused to open when the parent dropdown was anchored at the
bottom of the window (composer "+" button) — Radix's collision detection +
SubContent positioning was fighting us. Rather than keep tuning side /
sideOffset / collisionPadding / max-h until something stuck, replace the
DropdownMenuSub with a clicked DropdownMenuItem that opens a proper
Dialog.

Side benefits over the submenu:
- Each snippet gets a description line, so a glance is enough to pick one.
- Focus management is handled by Dialog automatically.
- Easy to grow (search, custom user snippets, categories) without
  another round of Radix positioning bugs.

Also extract types/interfaces to the bottom of the file per workspace
convention.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(desktop): move cron 'New cron' button off the top bar into the body

Reverses the previous direction on cron empty-state dedup. The body
button is more discoverable for first-time users (it's anchored next to
the "No scheduled jobs yet" copy that explains the feature) and frees
the top bar from a global CTA that wasn't pulling its weight.

- Empty (zero jobs): EmptyState renders the "Create first cron" button
  again, like the original design.
- Empty (search filtered out all jobs): no button, just "Try a broader
  search query" copy.
- Has jobs: small inline header above the list shows `N/M active` plus
  a single "New cron" button (right-aligned). The rows themselves
  already cover edit/pause/trigger/delete, so this is the only "create"
  affordance.

Also drop the dead `<div className="hidden">…</div>` enabledCount line
the previous patch left behind; the count is now visible in the new
header instead of hidden.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(desktop): address Copilot review on PR 37536

- sessions-settings: guard the WHOLE bridge call rather than chaining
  `?.settings.foo().then(...)` — the latter throws when
  `window.hermesDesktop` is undefined (non-Electron / Vitest contexts)
  because the chain short-circuits to `undefined.then(...)`.
- file_operations: drop `Path.unlink(missing_ok=True)` (Py>=3.8) so the
  generated delete snippet still works on remote backends running
  Python 3.7. The existing FileNotFoundError handler covers the same
  case and works back to 3.4.
- ansi.test.ts: add focused Vitest coverage for the SGR parser
  (basic/bright colors, bold toggles, default-fg reset, coalescing,
  256-color / truecolor arg consumption, non-SGR CSI drop, empty SGR
  full-reset) so future refactors can't silently regress terminal
  rendering.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(desktop/updates): swallow refreshDesktopVersion bridge errors

`refreshDesktopVersion()` is called best-effort with `void` from
`checkUpdates()`, `startUpdatePoller()`, and the window focus handler.
If the IPC bridge rejects (main process shutting down during reload,
bridge not yet ready on first paint), the rejection surfaces as an
unhandled promise rejection in the renderer. Wrap the call in try/catch
and return null on failure so callers can keep the existing
fire-and-forget pattern safely.

Co-authored-by: Cursor <cursoragent@cursor.com>

* chore(desktop): drop work duplicated by other in-flight PRs

- composer/text-utils.ts: revert paste-image dedupe — PR #37596
  ships the same fix with a cleaner content-key approach and a
  Vitest file (text-utils.test.ts). Letting that PR own the change.
- docs/desktop-i18n-rfc.md: delete the i18n scoping RFC — PR #37568
  has already shipped a working i18n surface (homegrown nanostores
  `t()` helper over en/zh dictionaries), so the RFC's framework
  recommendation (`react-intl`) is now obsolete and would just
  contradict the implementation that's actually landing.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(desktop): make xAI Grok a first-class OAuth provider in the launcher

xAI Grok was only reachable via the "I have an API key" form. xAI's
OAuth (SuperGrok / Premium+) flow already exists in the backend
(`hermes auth add xai-oauth`) but was never surfaced in the desktop
onboarding launcher.

Add a loopback PKCE flow: the local backend binds the 127.0.0.1
callback listener, the client opens the browser, and the redirect lands
back automatically — no code to copy/paste. Reuses the existing xAI
OAuth helpers (discovery, callback server, token exchange, persist)
rather than duplicating them.

- web_server: catalog entry (flow: loopback) + status dispatch +
  _start_xai_loopback_flow + background worker + route branch
- desktop: 'loopback' flow type, awaiting_browser status, xAI Grok card
  (PROVIDER_DISPLAY / FLOW_SUBTITLES / FlowPanel waiting render)
- tests: catalog listing, start authorize-url, worker persist, state
  mismatch rejection

* fix(desktop): order xAI Grok after MiniMax in the OAuth catalog

* fix(desktop): address Copilot review on xAI loopback flow

- web_server: join the callback-server thread in the start error path so a
  failed discovery/URL build doesn't leave a daemon thread running
- web_server: loopback worker now bails if the session was cancelled while
  waiting for the callback or exchanging the code, instead of persisting
  tokens the user no longer wants (+ regression test)
- onboarding: fall back to window.open when the desktop bridge's
  openExternal is unavailable, so the flow never silently stalls

* fix(desktop): address second Copilot pass on xAI loopback flow

- onboarding: openSignInUrl now falls back to window.open when the desktop
  bridge's openExternal throws/rejects (OS handler missing, user denied),
  not just when the bridge is absent
- web_server: cancelling a loopback session shuts down the 127.0.0.1
  callback server + joins its thread immediately, freeing the port instead
  of holding it until the wait times out (+ regression test)
- web_server: document the new "loopback" flow in the /api/providers/oauth
  enum, the poll-endpoint docstring, and the Phase 2 flow comment block

* ci(nix): fold package+devShell builds into flake check

Add build-package and build-devshell as cross-platform check
derivations so nix flake check verifies the default package and
devShell build on every platform (including darwin, which previously
only did eval-only checks).

This lets us drop the separate nix build step from the CI workflow
and removes the macOS-only eval fallback — a single nix flake check
now covers builds + runtime checks on all runners.

* fix(desktop): use auth-store path as xAI OAuth source_label

source_label is meant to be a human-readable origin (file path / source),
not the internal auth_mode string ("oauth_pkce"). Surface the auth-store
path, then the source slug, then a generic label.

* fix(desktop): signal loopback worker to stop on cancel

Shutting down the callback server stopped the serve thread but left the
worker spinning in _xai_wait_for_callback (which polls callback_result)
until the timeout. Flag callback_result as cancelled on DELETE so the
wait returns promptly and the daemon thread exits — avoids thread
buildup on repeated cancel/retry.

* fix(web-server): move event channel state from module globals to app.state (#37683)

Module-level asyncio.Lock() binds to whatever event loop was active at
import time.  When the same web_server module is reused across multiple
TestClient instances (or across uvicorn reloads), the old lock still
references a defunct loop, causing 'attached to a different loop' errors
and flaky subscriber-registration races in CI.

Replace the module-level _event_channels dict + _event_lock with:
  - _lifespan() async context manager that creates both on the running
    event loop during FastAPI startup (guaranteed correct loop binding)
  - _get_event_state() lazy accessor that initialises on app.state when
    TestClient is used without a `with` block (preserves backward compat)

All call sites (_broadcast_event, /api/pub, /api/events) now receive the
app reference and read state via _get_event_state(app) instead of the
module globals.  The test polling loop is updated to check
app.state.event_channels rather than the removed module attribute.

* fix(gateway): route /background result media by type

Background-task (/background, /btw) result media now routes to the
type-specific sender — TTS clip → voice bubble, video → send_video,
image → send_image_file — instead of forcing everything through
send_document. Mirrors the streaming + kanban delivery paths and
reuses base.should_send_media_as_audio for the Telegram OGG nuance.

Co-authored-by: LJ Li <liliangjya@gmail.com>
Co-authored-by: Kolektori <256073454+Kolektori@users.noreply.github.com>

* test(honcho): de-flake prewarm smoke test's thread wait (#37614)

TestDialecticLifecycleSmoke._await_thread did a single join(timeout=3.0) and
then proceeded regardless of whether the background dialectic thread had
finished. On a loaded CI runner (6 parallel test slices) the prewarm thread's
completion can slip past that 3s window, so the join times out silently and the
test reads _prefetch_result before the worker wrote it — the intermittent
'session-start prewarm must land in _prefetch_result' failure.

Join in a loop up to a 30s ceiling and assert the thread is actually dead, so a
genuine hang surfaces as a clear failure instead of a timing race. Reproduced
the old failure deterministically (5/5 fails with a 3.5s prewarm delay) and
confirmed the fix (0/8) before/after.

* feat(desktop): inline model picker in the status bar

Replace the status-bar model chip's modal with a Cursor-style dropdown:
- providers grouped by name in a stable order (no recency reshuffle on select)
- per-model hover-Edit submenu for reasoning effort + fast, gated by per-model
  capabilities now surfaced in the model.options payload
- unified Fast toggle: flips the speed=fast param where supported, else swaps
  to the model's `-fast` variant (base and variant collapse into one row)
- localStorage-backed "Edit Models" dialog to choose which models appear

Adds reusable dropdown primitives (DropdownMenuSearch, shared row/label
tokens, portaled + collision-aware submenus) and reads session state from
nanostores rather than prop-drilling, so editing options doesn't rebuild and
close the menu.

* fix(desktop): adopt existing macOS install + auto-place app

First-launch "already installed?" hinged solely on a marker that only the
desktop's own bootstrap writes, so a runtime from `install.sh --include-desktop`
(or a DMG launch over a prior CLI install) was runnable yet markerless and got
the WHOLE installer re-run on top of it. Detect a runnable ACTIVE_HERMES_ROOT
(valid source + venv), adopt it (stamp the marker, recording HEAD), and forward
straight to the app. Repair keeps forcing a real re-bootstrap.

Also: on first packaged macOS launch relocate the bundle into /Applications
(Electron relaunches from there) and pin the canonical copy to the Dock once,
so users stop re-opening the installer from Downloads/the DMG.

* fix(aux): self-heal Nous-routed calls when a pinned model leaves the catalog (#37732)

A long-lived process (gateway, watcher) caches the Nous Portal's
recommended-models payload and can pin a model for its whole lifetime.
When that model is later dropped from the Nous -> OpenRouter catalog,
every auxiliary call 404s with 'model does not exist in our
configuration or OpenRouter catalog' until the process restarts.

Now such a 404 force-refreshes the Portal recommendation and retries
once with the current pick (or the gemini-3-flash-preview default).
Scoped to Nous-routed calls only.

- _is_model_not_found_error(): 404/400 'not found / does not exist /
  not a valid model' predicate, excludes billing keywords so it never
  overlaps _is_payment_error.
- _refresh_nous_recommended_model(): force-refresh fetch, returns a
  model distinct from the one that failed, else the known-good default.
- Wired into both call_llm and async_call_llm error chains.

* fix(gateway): close ResponseStore + dispose unowned adapter on reconnect failure

Three separate code paths in the gateway's platform reconnect loop
leaked file descriptors every retry, exhausting the default 2560-fd
ulimit in ~12 hours of continuous failure and turning the gateway
into a zombie that raises OSError: [Errno 24] on every open() (#37011).

Root cause:
  * APIServerAdapter.__init__ opens a ResponseStore SQLite connection
    that holds 2 fds (db file + WAL sidecar).
  * APIServerAdapter.disconnect() previously only stopped the aiohttp
    web server — the ResponseStore connection was never closed.
  * The reconnect watcher in _platform_reconnect_watcher constructs a
    fresh adapter on every retry attempt. When the connect call fails
    (3 paths: non-retryable error, retryable error, exception during
    connect) the adapter is dropped without ever being installed on
    self.adapters, so nothing else calls its disconnect(). Result: the
    2 ResponseStore fds stay open until GC sweeps the unreachable
    object, which Python's cyclic GC does not do promptly for
    asyncio-bound native handles.

  2 fds × 1 retry × (3600s / 300s backoff cap) ≈ 12 fds/hour.
  2560 fds / 12 fds/hr ≈ 12h to ulimit exhaustion.

Fix:

  * APIServerAdapter.disconnect() now also calls
    self._response_store.close() (with a try/except so a SQLite
    close failure doesn't abort the aiohttp teardown).
  * New module-level helper _dispose_unused_adapter(adapter) in
    gateway/run.py that calls adapter.disconnect() and swallows
    any exception (so half-constructed adapters whose __init__
    crashed don't kill the watcher loop).
  * _platform_reconnect_watcher calls _dispose_unused_adapter() in
    all three failure paths: non-retryable, retryable, and the
    except Exception arm. adapter = None is initialized
    before the try so the except arm can see the partial
    construction.

Tests:

  * New file tests/gateway/test_platform_reconnect_fd_leak.py with
    7 regression tests covering all three failure paths, the
    _dispose_unused_adapter helper (None + raising-disconnect cases),
    and the APIServerAdapter ResponseStore close behavior (success +
    close-exception cases). The _CountingAdapter fixture tracks
    disconnect() invocations and an _open_fds counter that is
    decremented on dispose, so the assertion is the literal
    observable behavior of the leak.

Refs:
  - Closes #37011 (the original fd-leak report)
  - Supersedes #37018, #37110, #37238, #37260, #37394 (7 competing
    open PRs all addressing the same root cause from different angles;
    none of them rebased cleanly against current main, and none
    covered all three failure paths in one fix with regression tests
    for both the watcher and the platform-level close behavior)

* fix(release): add fearvox1015@gmail.com -> Fearvox to AUTHOR_MAP

The check-attribution CI job on #37679 failed because the commit
author email nolan@0xvox.com (a local git config mistake on this
machine) is not in scripts/release.py AUTHOR_MAP. The commit
itself is now re-authored to fearvox1015@gmail.com, and this
follow-up adds the entry to AUTHOR_MAP so any future commits
authored from this email also pass the check.

* polish(gateway): address Copilot review comments on fd-leak fix

Seven Copilot inline review comments on #37679, four worth landing
in a polish pass before merge:

1. _dispose_unused_adapter signature: 'BasePlatformAdapter' ->
   'BasePlatformAdapter | None'. The function explicitly handles
   None and the reconnect watcher calls it with None in the
   except arm, so the annotation now matches the actual contract.

2. (duplicate of #1 on a different line) — same fix.

3. except Exception in _dispose_unused_adapter — the reviewer
   asked about asyncio.CancelledError swallowing. On Python 3.8+
   (Hermes requires 3.13, see pyproject.toml), CancelledError
   inherits from BaseException, NOT Exception, so the existing
   'except Exception' does NOT swallow task cancellation. Added
   an explicit comment explaining the contract so future readers
   don't repeat the analysis. We don't re-raise because the
   watcher loop intentionally treats dispose failures as
   best-effort: a failed dispose on an unowned adapter should not
   take down the watcher that's keeping the gateway alive.

4. _response_store = None after close in api_server.py — the
   reviewer flagged this for idempotency. Decided to keep the
   non-None state intentionally: setting it to None cascades
   to ~9 callers that access self._response_store without a
   None check, and 'close() is idempotent on a closed sqlite3
   Connection' means the current code is already safe. The
   type stays stable; LSP doesn't flag a cascade of
   reportOptionalMemberAccess errors. (This matches the
   pre-existing pattern in the codebase — e.g.
   _mark_disconnected doesn't reset state to None either.)

5. _build_adapter_with_store: reviewer worried about
   disconnect() failing on the self.name property if
   __init__ wasn't called. Already handled: we set
   'adapter.platform = Platform.API_SERVER' so the
   'self.platform.value.title()' property returns
   'Api_Server' without raising. The exception-swallowing
   branch in disconnect() does call self.name via the
   logger.debug format, so this is a real path that needs
   the platform attribute, and we have it.

6. test_disconnect_closes_response_store: bare 'pytest.raises(Exception)'
   -> 'pytest.raises(sqlite3.ProgrammingError)'. The bare
   Exception matcher would silently accept AttributeError,
   OperationalError, env-related issues, etc. The specific
   exception type ('Cannot operate on a closed database') is
   the actual signal we want — proves the SQLite conn is
   closed, not just that *something* raised.

7. test_nonretryable_failure_disposes_unowned_adapter:
   assertion tightened from '>= 1' to '== 1' on
   adapter._disconnect_calls. The docstring said 'exactly once',
   the assertion now matches. Catches the hypothetical
   'watcher disposes the same adapter twice' regression that
   '>=' would have missed.

* fix(desktop): address Copilot review on model picker

- selectModel reports success; edits bail (and roll back) instead of landing
  on the previously active model when a switch fails
- Fast toggle stays available to turn off a carried-over speed param even when
  the new model has no native fast mechanism
- active row's "Fast" label derives from the same fastControl as the submenu
  toggle, so it's consistent and handles standalone `-fast` model ids

* fix(node/nix): consolidate workspace lockfile + update all consumers

Consolidate per-package package-lock.json files into a single root-level
workspace lockfile.  Update all consumers:

- Nix: shared src/npmDeps/npmDepsHash in lib.nix; devshell hook stamps
  package.json paths then runs npm ci from root; individual .nix files
  use mkNpmPassthru attrs instead of per-package fetchNpmDeps.
- Python CLI: new _workspace_root() helper so _tui_need_npm_install,
  _make_tui_argv, _build_web_ui resolve lockfile/node_modules from the
  workspace root.
- Desktop: replace --force-build/mtime heuristic with content-hash build
  stamp (_compute_desktop_content_hash via pathspec).  Remove --force-build
  flag.
- Dockerfile: single root npm install; no per-directory lockfile copies.
- CI: nix-lockfile-fix and osv-scanner reference root package-lock.json;
  apps/dashboard → apps/desktop.
- Tests: new test_tui_npm_install.py; desktop stamp tests in
  test_gui_command.py; updated assertions in test_cmd_update.py,
  test_web_ui_build.py, test_dockerfile_pid1_reaping.py.
- Docs: remove --force-build from desktop flag table.

Deleted: apps/desktop/package-lock.json, ui-tui/package-lock.json,
ui-tui/packages/hermes-ink/package-lock.json, web/package-lock.json.

* refactor(uv): single managed-uv path, delete fts5 installer escalation

Replace the multi-path UV resolution chain (PATH probing, conda guards,
5-location trust ordering, temp-dir fallback installs) with a single
managed uv binary at $HERMES_HOME/bin/uv. Every code path that needs
uv resolves it from that one location; if missing, ensure_uv()
bootstraps it via the official standalone installer.

Key changes:

- New hermes_cli/managed_uv.py: managed_uv_path(), resolve_uv(),
  ensure_uv() (returns (path, freshly_bootstrapped) tuple),
  update_managed_uv(), rebuild_venv(), installer internals.
- hermes_cli/main.py: replace all shutil.which('uv') with ensure_uv(),
  add venv rebuild on first-time managed uv bootstrap, update_managed_uv
  before dep install on all 3 update paths.
- scripts/install.sh: install_uv() always installs to
  $HERMES_HOME/bin/uv; delete ensure_fts5, _python_has_fts5,
  _reinstall_python_with_fts5, _warn_no_fts5 (61 lines).
  Managed uv always installs current Python with FTS5.
- scripts/install.ps1: Install-Uv always installs to
  $HermesHome\bin\uv.exe; Resolve-UvCmd checks managed location first.
- hermes_state.py: simplified FTS5 warning now suggests 'hermes update'
  as the fix instead of blaming install method.
- tests: 15 tests in test_managed_uv.py, autouse _patch_managed_uv
  fixture in test_cmd_update.py.

Closes #37605, Closes #37622

* fix(tests): add _patch_managed_uv autouse fixture to uv-dependent test files

Production code now uses ensure_uv()/update_managed_uv() from
managed_uv.py instead of shutil.which("uv") directly. Tests that
patched shutil.which to control uv availability no longer controlled
the actual code path, causing CI failures.

Add an autouse _patch_managed_uv fixture to test_update_autostash.py
and test_uv_tool_update.py (matching the existing fixture in
test_cmd_update.py). The fixture makes managed_uv functions delegate
to shutil.which so existing test patches flow through naturally.

* fix(desktop): write Dock tile as a file-reference URL

The Dock stores persistent-apps as type-15 file:// URLs; the type-0/raw-path
tile we wrote was silently dropped on the next Dock restart (so the pin never
took, yet we'd stamped the marker and never retried). Use pathToFileURL + type
15 and flush prefs through cfprefsd before `killall Dock`. Verified end-to-end
on a packaged build: move -> adopt -> Dock tile lands as
file:///Applications/Hermes.app/.

* fix(desktop): configure Linux Electron sandbox helper

Electron's chrome-sandbox helper must be root:root 4755 on Linux or the
sandboxed renderer aborts before the desktop app starts. The existing
installer only searched for macOS .app bundles, so a successful Linux
build was reported as missing.

Changes:
- Add _desktop_linux_sandbox_fixup() to hermes_cli/main.py, called
  before launching a packaged desktop app on Linux.
- Use lstat() + S_ISREG check to reject symlinks — chown/chmod on a
  symlink target would set SUID on an arbitrary path.
- Update install.sh to recognize Linux unpacked artifacts and configure
  chrome-sandbox with proper error handling (the original PR silently
  ignored chown/chmod failures).
- Add regression tests: normal fixup flow, symlink rejection, and
  already-configured skip path.

Closes #37529 (rebased, merge conflicts resolved, copilot review
feedback addressed).

* fix(desktop): inherit microphone entitlement for macOS helpers

Add com.apple.security.device.audio-input to entitlements.mac.inherit.plist.
Under hardenedRuntime the Electron Helper/Setup processes inherit this file,
and the missing entitlement made macOS TCC deny the microphone with no prompt,
breaking voice chat.

Fixes #37718

* test(desktop): assert macOS device entitlements are inherited

Pin #37718: the inherit plist must grant audio-input, every device.*
entitlement on the main app must also be inherited by the Helper/Setup
processes, and both entitlement files must stay valid plists.

* fix(desktop): roll back optimistic model switch on failure

selectModel snapshots the prior model/provider and restores the store +
query cache when the backend switch fails, so the UI never shows a model the
backend didn't actually select.

* docs(desktop): sync marker schema comment + default dock note arg

Address Copilot review: document the `adopted` flag and nullable `pinnedCommit`
in the marker schema comment, and default `done(note = {})` so the dock-pinned
marker write is unambiguous (object spread of undefined was already a no-op, but
explicit is clearer).

* fix(desktop): switch model on keyboard activation of picker rows

The model row is a Radix sub-trigger (no onSelect), so switching was
pointer-only. Wire Enter/Space alongside onClick so keyboard users can switch
models too.

* fix(dashboard): allow desktop websocket origins on remote binds

* chore: add leonardsellem to AUTHOR_MAP for PR #37405

* fix: expand skill bundles in cron jobs

* feat(cli): configurable default interface (cli vs tui)

Add `display.interface` config key so users can make the modern TUI the
default for bare `hermes` / `hermes chat` without exporting HERMES_TUI=1 in
every shell. Default stays "cli" to preserve current behavior.

Add a `--cli` flag (mirrors `--tui`) so an explicit invocation can force the
classic prompt_toolkit REPL even when `display.interface: tui` is configured.

Precedence (highest first): `--cli` > `--tui`/`HERMES_TUI=1` > config
`display.interface` > classic REPL. Two resolvers enforce it:

  * `_resolve_use_tui(args)` — the args-aware resolver used by `cmd_chat`
    and the Termux fast-TUI path (uses full load_config()).
  * `_wants_tui_early(argv)` — a dependency-free early resolver used by
    mouse-residue suppression and the Termux fast paths, which run before
    argparse / hermes_cli.config are importable (minimal cached YAML read).

Both `--cli` and `--tui` are registered via `_inherited_flag`, so they are
carried across self-relaunch automatically.

- config: add display.interface ("cli" default), bump _config_version 25->26.
  The generic missing-field migration + load_config() deep-merge seed the key
  for existing configs; no bespoke migration block needed.
- docs: document --cli flag and display.interface in cli-commands.md and
  the TUI user guide.
- tests: new test_default_interface_resolution.py covering resolver
  precedence at every layer, early resolver edge cases (missing/garbage
  config), parser flags, and relaunch inheritance.

* fix(deps): refresh lockfile to clear 6 npm audit findings (#37752)

* fix(deps): refresh lockfile to clear 6 npm audit findings

Plain `npm audit fix` (no --force, no overrides) — every patched
version was already in-range, so a lockfile refresh clears all
findings without permanent override pins.

Cleared:
- tmp 0.2.5 -> 0.2.7 (path traversal, HIGH — GHSA-ph9p-34f9-6g65)
- brace-expansion 5.0.5 -> 5.0.6 (DoS — GHSA-jxxr-4gwj-5jf2)
- mermaid 11.14.0 -> 11.15.0 (4 advisories: GHSA-6m6c-36f7-fhxh,
  GHSA-xcj9-5m2h-648r, GHSA-87f9-hvmw-gh4p, GHSA-ghcm-xqfw-q4vr)

npm audit: 6 vulnerabilities -> 0. package.json untouched.

* fix(nix): bump npmDepsHash for refreshed lockfile

Uses the hash fetchNpmDeps (the actual build fetcher) produces, which
diverges from prefetch-npm-deps / nix run .#fix-lockfiles output for
this lockfile.

* fix(setup): default browser/TTS picker to free local backend, not paid Nous (#37800)

The Browser Automation and Text-to-Speech provider pickers listed the paid
"Nous Subscription" gateway row first, so on a fresh install the menu cursor
defaulted to index 0 (Nous). Pressing Enter selected it and ran the inline
Nous Portal device-code login — walking users into a paid offering they
never chose.

Reorder both provider lists so the free, no-key local backend is index 0
(Local Browser / Microsoft Edge TTS). Users who already configured Nous are
unaffected: _detect_active_provider_index still resolves their active row
first, so the cursor lands on Nous (now index 1) for them.

Reported by Javier via Kujila.

* fix(tui): clear selection on right-click copy + group transcript blocks

Two TUI polish fixes.

(1) Right-click copy now clears the highlight.
The right-click handler copied an active selection via onCopySelectionNoClear
(the copy-on-select variant that keeps the highlight during a drag) and never
cleared it, so after right-click-to-copy the selection stayed lit with no
confirmation and a follow-up right-click re-copied the stale range instead of
pasting. A successful right-click copy now clears the selection and notifies;
if the copy fails (no clipboard path) the highlight survives and we fall back
to the right-click paste handler, exactly as before.

(2) Group transcript blocks so boundaries read clearly.
Model replies, reasoning/tool trails, and system/error notes rendered with no
vertical separation, so distinct block types butted together and were hard to
scan. Group adjacent blocks by kind: one blank line opens only where the visual
group changes (model prose <-> reasoning/tool trails <-> notes), while a run of
same-kind blocks renders flush. The rule lives in domain/blockLayout.ts
(messageGroup + hasLeadGap) and is applied intrinsically in MessageLine via a
`prev` prop, which fixes the things ad-hoc per-block margins kept breaking:

  - Streaming stability: the gap is derived from the stable predecessor, never
    the live block's own changing text, so the actively-streaming reply computes
    the same gap while it streams as the settled segment does once it flushes.
    No reflow/jump.
  - Transparent empty trails: a trail hidden by /details, or one carrying only a
    token tally (the finalDetails segment message.complete appends), renders
    nothing and is transparent to grouping (prevRenderedMsg skips it), so there
    are no floating gaps, no doubled gap after a prompt, and no padded space
    above the final reply. In the default/collapsed modes content-bearing trails
    always render, so the grouping is a no-op there.

The virtual-height estimator counts the group-boundary line so scroll math
stays accurate before Yoga remeasures.

ui-tui/src/domain/blockLayout.ts (new), components/messageLine.tsx,
components/streamingAssistant.tsx, components/appLayout.tsx,
lib/virtualHeights.ts, app/useMainApp.ts.

Tests: blockLayout.test.ts (grouping + hidden/empty-trail visibility),
virtualHeights leadGap, app-mouse.test.ts copy behavior. Full ui-tui suite
green apart from 3 pre-existing local/env failures (cursorDrift, ink-resize,
virtualHeights user-prompt-width) unchanged from main.

* fix(desktop): stop chat scroll jumping by disabling native scroll anchoring

The thread renders virtualized turns in natural document flow with padding
spacers, and @tanstack/react-virtual already adjusts scrollTop itself when an
off-screen turn is measured and its real height differs from the 220px
estimate. With the browser default `overflow-anchor: auto`, native scroll
anchoring corrects that SAME size delta too, so the two double-correct and the
view lurches — most visibly with Windows mouse wheels, whose coarse notches
mount/measure several under-estimated turns per tick (Mac trackpads scroll
~1-3px/frame, keeping it sub-perceptual).

Set `overflow-anchor: none` on the thread viewport so only the virtualizer
compensates. Also adds `diag-scroll-reset.mjs`, a CDP wheel-up repro that A/B
tests the anchor behavior at runtime to confirm the fix.

* feat(desktop): clamp sticky human messages to ~2 lines until hover/focus

Long user prompts stick to the top of the thread while the response streams
beneath them, so a multi-line prompt could eat most of the viewport. Clamp the
read-only human bubble's text to ~2 lines with a soft bottom fade; the clamp
lifts on hover or keyboard focus, and clicking the bubble still opens the edit
composer (which shows the full text). Short messages are untouched — no clamp,
no fade.

Overflow is measured on an unclamped inner wrapper so the ResizeObserver only
fires on real content/width changes, not every frame while the outer
max-height animates open; the measured height feeds --human-msg-full so
expand/collapse animate to the true height instead of overshooting the cap.

* fix(dashboard): trust non-web WS origins on OAuth-gated binds after ticket auth (#37870)

Generalises #37747. The WS Origin guard (_ws_host_origin_is_allowed) only
trusted the packaged Electron app's non-web origin (file:// / null / app://)
when the bind was NOT OAuth-gated. The packaged Hermes Desktop renderer loads
over file://, so when it drives a remote OAuth-gated gateway its /api/ws
upgrade was rejected with HTTP 403 even though _ws_auth_ok had already
validated the single-use ?ticket= one line earlier.

This guard runs only AFTER _ws_auth_ok has accepted the WS credential, which
is the real auth boundary in every mode:
  * loopback bind          -> legacy dashboard session token
  * non-loopback --insecure -> legacy session token (Tailscale / LAN, #37747)
  * OAuth-gated public bind -> single-use, 30s-TTL, identity-bound ?ticket=
A non-web origin can only come from a native client; a DNS-rebinding attack
always arrives from an http(s) origin and is still match-checked against the
bound host. So once the upstream credential check has…
teknium1 pushed a commit that referenced this pull request Jun 5, 2026
… alone

The desktop OAuth remote-gateway path gated connectivity on
hasOauthSessionCookie(), which checks only the access-token cookie
(hermes_session_at, ~15 min TTL). The moment that cookie's Max-Age
lapsed, Electron's cookie jar dropped it and both resolveRemoteBackend()
and sanitizeDesktopConnectionConfig() reported "not signed in" — forcing
a full IDP re-login every ~15 min — even though a valid 24h refresh-token
cookie (hermes_session_rt) was sitting in the same jar.

The desktop OAuth code (2026-06-04) was written against the obsolete
"contract v1 issues no refresh token" model, two days after #37247
re-introduced server-side transparent refresh: Portal now issues a 24h
rotating, reuse-detected refresh token, and the gateway middleware
(_attempt_refresh) rotates a fresh AT from the RT on the next
authenticated request. So an expired-AT/live-RT session is fully
connectable — the desktop just never let the request through.

Fix:
- connection-config.cjs: add RT_COOKIE_VARIANTS + cookiesHaveLiveSession()
  (true when EITHER a live AT or RT cookie is present). Keep
  cookiesHaveSession() AT-only for callers that need that specific signal.
- main.cjs: add hasLiveOauthSession(); resolveRemoteBackend()'s oauth
  branch now early-outs only when NEITHER cookie is present, otherwise
  uses the ws-ticket mint as the authoritative liveness probe (that POST
  carries the RT cookie and triggers the server-side AT rotation). A real
  401 still surfaces as needsOauthLogin. Settings indicator + oauth-logout
  report against the same AT-or-RT notion.
- Remove the stale "contract v1 / NO refresh token" docstrings in
  cookies.py and the verify_session comments in the Nous provider that
  contradicted #37247.

Tests: +57 lines in connection-config.test.cjs covering the RT-only
"still connectable" case. node --test: 32/32. dashboard-auth +
nous-provider Python suites: 223/223.

Note: server-side files (hermes_cli/dashboard_auth/, plugins/dashboard_auth/)
are comment/docstring-only here, but this touches outside apps/desktop/ so
it needs Teknium review.
zenc-cp added a commit to zenc-cp/hermes-agent that referenced this pull request Jun 7, 2026
* feat(desktop): concurrent multi-profile gateway sockets

Keep one persistent socket per profile with live work instead of closing
the single socket on every profile swap, so background sessions across
profiles keep streaming at once. A gateway registry owns the primary
(window) socket plus lazy secondaries (own backoff/reconnect); all feed
the same session-keyed event handler. Secondaries are pruned to profiles
with a working/needs-input session, the keepalive pings every open
backend, and LRU eviction spares freshly-touched backends so the soft cap
can't abort a running agent. Approval/sudo/secret prompts are parked
per-session (surfaced via the needs-input badge) so a background turn can
block without hijacking the foreground. Single-profile users only ever
have the primary, so their path is unchanged.

* feat(desktop): per-profile "+" to start a session in the all-profiles view

Mirror the workspace-group "+": each profile header in the all-profiles
session list gets a new-session button. Unlike selecting the profile, it
leaves the browse scope untouched (newSessionInProfile keeps
$showAllProfiles), so creating a chat doesn't collapse the unified view.

* fix(cli): use Rich [dim] tag instead of ANSI escape in _restore_session_cwd

Replace [{_DIM}] with [dim] in all _restore_session_cwd and
_preload_resumed_session messages that go through _console_print (Rich
Console.print).  _DIM is an ANSI escape (\x1b[2;3m) that Rich cannot
parse as a markup tag, causing MarkupError on session resume when the
stored cwd is missing or inaccessible.

Also uses [/dim] closing tag for explicit tag matching.

Fixes #39469

* fix(desktop): recover from corrupt Electron cache in bootstrap install (Windows)

Windows counterpart of #39127: scripts/install.ps1 `Install-Desktop` runs
`npm run pack` once and throws on the opaque ENOENT a corrupt cached Electron
download produces, with no recovery. Add `Clear-ElectronBuildCache` plus a
purge-and-retry-once on pack failure, mirroring the install.sh fix: remove the
cached electron-*.zip (%LOCALAPPDATA%\electron\Cache + ELECTRON_CACHE /
electron_config_cache overrides) and stale *-unpacked output, then retry so
@electron/get re-downloads with its own SHASUM verification.

Refs #37544.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* chore(release): map youngstar-eth in AUTHOR_MAP for salvage PR #39134

* fix(approval): check is_approved in execute_code guard (#39275)

check_execute_code_guard() never called is_approved() before entering the
approval flow, and never persisted session/permanent approvals from the
gateway response. This meant 'Approve session' and 'Always' buttons had
no effect — every execute_code call re-prompted the user.

- Add is_approved() check after get_current_session_key(), matching
  check_all_command_guards()
- Persist session ('approve_session') and permanent ('approve_permanent')
  approvals based on the gateway choice, same as terminal command guard
- Add 3 regression tests for session persistence, permanent persistence,
  and short-circuit on pre-existing approval

* fix(mcp): widen shutdown_mcp_servers exception guard to BaseException

* fix(cli): widen _run_cleanup MCP shutdown guard to BaseException

* chore: add HeLLGURD to release AUTHOR_MAP for PR #39453 salvage

* fix(tui): add thread-safety locks for _sessions and prompt dicts

C1: Add _sessions_lock to protect all compound mutations and iterations
    on the global _sessions dict across 5+ concurrent execution contexts
    (main dispatcher, pool workers, daemon threads, notification poller,
    atexit handler).

C2: Add _prompt_lock to protect _pending/_pending_prompt_payloads/_answers
    dicts from races between _block() (agent callback thread) and
    _respond() (pool worker).  Lock scope is kept tight — _block() only
    holds the lock during registration/cleanup, releasing it before
    _emit() and ev.wait() to avoid blocking other prompts for 300s.

All 187 existing TUI tests pass with no regressions.

* feat(desktop): add 'choose provider later' skip to first-run onboarding (#39483)

The first-run provider picker was a hard gate — the only way out was
connecting a provider. Add an 'I'll choose a provider later' link that
dismisses the overlay and persists the skip to localStorage so it never
re-nags on subsequent launches. Users connect a provider any time from
Settings -> Providers (manual onboarding already bypasses the skip gate).

- onboarding.ts: firstRunSkipped state seeded from localStorage
  (hermes-onboarding-skipped-v1) + dismissFirstRunOnboarding() action;
  completeDesktopOnboarding clears the flag once a provider connects.
- overlay: skip gate (firstRunSkipped && !manual returns null); ChooseLaterLink
  rendered in both the OAuth picker footer and the API-key fallback, first-run only.
- tests: skip persists + hidden in manual mode; full-state fixtures updated.

* test(desktop): add injectable gateway WebSocket probe + unit tests

Adds electron/gateway-ws-probe.cjs: a small helper that opens a gateway
WebSocket URL and classifies the handshake (open/frame → ok; error or close
before open → fail; open-then-early-close → credential rejected; never-opens →
timeout). The WebSocket implementation is injected so it can be unit-tested
without a real socket.

Wires gateway-ws-probe.test.cjs into test:desktop:platforms, covering every
handshake outcome plus constructor-throw and missing-impl.

* fix(desktop): validate live WebSocket in remote gateway connection test

The "Test remote" button only checked HTTP GET /api/status, but the chat
surface depends on the renderer opening a live WebSocket to /api/ws — a
separate transport with separate server-side guards (Host/Origin checks,
ws-ticket/token auth, peer-IP checks). A gateway could pass the HTTP check yet
reject the WebSocket, so the test reported "reachable" while boot still failed
with the opaque "Could not connect to Hermes gateway".

testDesktopConnectionConfig now mirrors the renderer's connect: after the
status check it opens the WS URL (token/local) or a freshly minted ws-ticket
(OAuth) and confirms the upgrade is accepted and not immediately torn down by
a post-handshake auth rejection. Failures surface an actionable message instead
of a false-positive. The WS leg is skipped when the runtime lacks a global
WebSocket so it never fails spuriously.

* fix(desktop): fail remote test when OAuth ws-ticket mint fails

Youssef's review caught a residual false-positive: resolveTestWsUrl
swallowed an OAuth ticket-mint failure and returned null, so the caller
skipped the WS probe and reported the remote test as reachable. But the
real boot path (resolveRemoteBackend) treats a mint failure as a hard
'session expired' auth error and refuses to connect — so an expired OAuth
session passed the test then failed boot, the exact false-positive this
PR exists to kill.

Extract resolveTestWsUrl into the electron-free connection-config.cjs
(injectable mintTicket) so it's unit-testable, and make OAuth mint
failure throw an actionable needsOauthLogin error instead of skipping.
Adds the three cases Youssef requested plus a mintTicket-required guard.

* fix(dashboard): reap orphaned embedded-chat sessions to stop slash_worker leak

Since #38591 made the dashboard's embedded chat unconditional, every
browser refresh of /chat spins up a fresh session.create (new sid + a
fresh _SlashWorker via _deferred_build) over /api/ws, but the old tab's
WS disconnect only DETACHES the transport (ws.py) — it never closes the
old session or its slash_worker. The dashboard's in-process gateway is
long-lived, so the detached _SlashWorker subprocess's stdin pipe stays
open forever and the worker never reaches EOF: one leaked python process
per refresh.

Fix at the session-lifecycle layer (not PTY signal timing — verified that
a process whose owning gateway dies is always reaped via stdin-EOF; the
leak is specifically the long-lived dashboard process keeping detached
sessions parked). On WS disconnect, schedule a grace-delayed reap of any
session left orphaned (transport detached to stdio, not mid-turn). A quick
reconnect / session.resume / prompt.submit rebinds a live transport and
cancels the reap, preserving the intentional detach-for-reconnect window.

- server.py: extract _teardown_session() (shared with session.close),
  add _ws_session_is_orphaned() + _schedule_ws_orphan_reap(), gated by
  HERMES_TUI_WS_ORPHAN_REAP_GRACE_S (default 20s, 0 disables).
- ws.py: schedule the reap for each detached session on disconnect.
- tests: reap-closes-worker, spares-reattached/mid-turn/finalized,
  disabled-when-grace-zero.

* fix(install): scrap rebuild venv

* fix(google-workspace): fall back to uv when venv has no pip (#39516)

The Hermes Docker image's venv is built with `uv sync`, which does not
bootstrap pip into the venv. When the google-workspace setup script needs
to install its deps and the running interpreter has no pip,
`sys.executable -m pip install` dead-ends with "No module named pip"
(reported via Discord support).

install_deps() now falls back to `uv pip install --python <interpreter>`
when the pip path fails and uv is on PATH. uv installs into the exact
interpreter the script is running under without needing pip present, so
the pip-less venv self-heals (e.g. a dep evicted on image update, or a
build without the [google]/[all] extra). On environments with neither
pip nor uv, the [google] extra hint is printed as before.

Verified E2E against nousresearch/hermes-agent:latest: under the venv
python with a missing dep, --install-deps now prints "Dependencies
installed." and exits 0 instead of failing.

Adds TestInstallDeps regression coverage: pip path, uv fallback,
uv-not-consulted-when-pip-works control, and both no-installer-available
and uv-also-fails failure cases.

* fix: temp for update

* Clear stale desktop onboarding errors (#38844)

* change(desktop): show up to 50 models in list per provider by default

* fix(deps): bump react-router-dom to 7.17.0 (GHSA-8x6r-g9mw-2r78)

Clears the npm-audit React Router advisory CVE-2026-42342 in the web
and apps/desktop workspaces by bumping react-router-dom 7.14.x -> ^7.17.0
(patched in 7.15.0; both react-router and react-router-dom now resolve
to 7.17.0 in the root lockfile).

Note: the advisory's DoS only affects React Router *Framework Mode*
(the __manifest server endpoint). Both workspaces use Declarative Mode
(web: <BrowserRouter>, desktop: <HashRouter>) as pure client-side SPAs,
so we were never actually exploitable -- this is audit-hygiene only.

npm audit --omit=dev: 0 vulnerabilities. Web + desktop + ui-tui builds
and tsc typecheck all green on 7.17.0.

* fix(nix): bump npmDepsHash for react-router 7.17.0 lockfile

The react-router-dom 7.14->7.17 lockfile change stales the pinned
npm-deps hash in nix/lib.nix, turning the nix flake checks red. Bump
to the hash CI's prefetch diagnostic computed for the new lockfile.

* fix(nix): use fetchNpmDeps hash for npmDepsHash, not prefetch-npm-deps

The previous fix committed the hash from `prefetch-npm-deps`
(sha256-hgnqc...), but the actual `fetchNpmDeps` FOD (fetcherVersion 2)
that `nix flake check` builds wants sha256-cY+gM... . These two tools
disagree for this lockfile, so the build's npm-deps derivation failed
with a hash mismatch even though `fix-lockfiles --check` reported "ok".

Corrected to the build-verified value. Confirmed `nix build .#tui`,
`.#web`, and `.#desktop` all build cleanly with the new hash.

* fix(line): map inbound message types to the correct MessageType

The LINE adapter classified every non-text inbound message as
`MessageType.IMAGE`, which doesn't exist on the enum — so any image,
video, audio, file, sticker, or location message raised AttributeError
the moment it was constructed.

Beyond fixing the crash, every non-text message was being collapsed onto
a single type. The gateway routes on MessageType (voice → STT, files →
document handling, etc.), so misclassification silently mishandled media.
Replace the inline ternary with a `_LINE_MESSAGE_TYPES` lookup that maps
each LINE webhook type to its proper enum member (audio → VOICE to match
how Telegram/WhatsApp treat voice notes), falling back to TEXT for
unknown types. Adds regression tests covering the mapping and the old
AttributeError.

Co-authored-by: Sahibzada Allahyar <94376830+sahibzada-allahyar@users.noreply.github.com>

* fix(gateway): tolerate non-UTF-8 status/pid files in gateway status reads

`_read_json_file` caught OSError but not UnicodeDecodeError, so a status
file holding binary/non-UTF-8 bytes (truncated or clobbered write) would
crash the gateway status path instead of being treated as unreadable.
UnicodeDecodeError is a ValueError subclass, not an OSError, so it
escaped the existing guard.

Widen the catch to (OSError, UnicodeDecodeError) at both read sites in
gateway/status.py — `_read_json_file` and the sibling `_read_pid_record`,
which had the identical gap. Adds tests covering binary input (returns
None) and valid input (still parses) for both.

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>

* fix(desktop): gate OAuth remote connect on AT-or-RT, not access token alone

The desktop OAuth remote-gateway path gated connectivity on
hasOauthSessionCookie(), which checks only the access-token cookie
(hermes_session_at, ~15 min TTL). The moment that cookie's Max-Age
lapsed, Electron's cookie jar dropped it and both resolveRemoteBackend()
and sanitizeDesktopConnectionConfig() reported "not signed in" — forcing
a full IDP re-login every ~15 min — even though a valid 24h refresh-token
cookie (hermes_session_rt) was sitting in the same jar.

The desktop OAuth code (2026-06-04) was written against the obsolete
"contract v1 issues no refresh token" model, two days after #37247
re-introduced server-side transparent refresh: Portal now issues a 24h
rotating, reuse-detected refresh token, and the gateway middleware
(_attempt_refresh) rotates a fresh AT from the RT on the next
authenticated request. So an expired-AT/live-RT session is fully
connectable — the desktop just never let the request through.

Fix:
- connection-config.cjs: add RT_COOKIE_VARIANTS + cookiesHaveLiveSession()
  (true when EITHER a live AT or RT cookie is present). Keep
  cookiesHaveSession() AT-only for callers that need that specific signal.
- main.cjs: add hasLiveOauthSession(); resolveRemoteBackend()'s oauth
  branch now early-outs only when NEITHER cookie is present, otherwise
  uses the ws-ticket mint as the authoritative liveness probe (that POST
  carries the RT cookie and triggers the server-side AT rotation). A real
  401 still surfaces as needsOauthLogin. Settings indicator + oauth-logout
  report against the same AT-or-RT notion.
- Remove the stale "contract v1 / NO refresh token" docstrings in
  cookies.py and the verify_session comments in the Nous provider that
  contradicted #37247.

Tests: +57 lines in connection-config.test.cjs covering the RT-only
"still connectable" case. node --test: 32/32. dashboard-auth +
nous-provider Python suites: 223/223.

Note: server-side files (hermes_cli/dashboard_auth/, plugins/dashboard_auth/)
are comment/docstring-only here, but this touches outside apps/desktop/ so
it needs Teknium review.

* fix(update-check): stop reporting phantom "N commits behind" inside Docker (#39559)

Inside the published Docker image, both the `--tui` banner and the
dashboard-embedded TUI report `1 commit behind — run docker pull
nousresearch/hermes-agent:latest to update` even though the container
has no git repo and no way to compute a commit delta.

Root cause: two independent update-detection paths, only one of which
knows it's running in Docker.

- `recommended_update_command()` → `detect_install_method()` reads the
  `.install_method` stamp that `docker/stage2-hook.sh` writes at boot →
  returns "docker", so the *command string* correctly says `docker pull`.
- `banner.check_for_updates()` (the source of the "N commits behind"
  *count*) has no notion of the docker install method. It only detects a
  build via `HERMES_REVISION` (nix-only, unset in the image) or a `.git`
  dir (excluded from the image by .dockerignore). Neither matches, so it
  silently falls through to `check_via_pypi()`, whose PyPI-version
  mismatch flag (1) is then rendered verbatim by the CLI banner
  (build_welcome_banner), the Ink TUI badge (branding.tsx), and `hermes
  version` as "1 commit behind" — a phantom count, no commit math
  involved. `hermes update` already refuses to run in-place in the
  container.

The dashboard's REST `/api/hermes/update/check` endpoint already
short-circuits docker (returns behind=None + the docker guidance). This
mirrors that guard inside `check_for_updates()` so the banner/TUI/version
surfaces agree: when `detect_install_method() == "docker"`, return None
before any git/pypi probe (and before writing a cache entry). None makes
the render guards (`typeof === 'number' && > 0`, `behind and behind > 0`)
stay false, so the badge/line disappears entirely — matching the System
page.

Fix is in one place (check_for_updates) because all three consumers route
through it via get_update_result()/_update_result.

Tests: test_check_for_updates_docker_returns_none asserts None + no
git/pypi probe + no cache write; test_check_for_updates_non_docker_still_checks
guards against over-broadening (pip still version-checks). Mutation-tested:
removing the guard fails the docker test.

Verified against a real `docker build` of the image — see PR description.

* fix(terminal): guard os.getcwd() against a deleted CWD

`os.getcwd()` raises FileNotFoundError when the process's working
directory was removed out from under it (e.g. a scratch workspace
cleaned up mid-session), crashing terminal env setup.

Extract a `_safe_getcwd()` helper that falls back to TERMINAL_CWD, then
the user's home, on FileNotFoundError, and route all three `os.getcwd()`
call sites in terminal_tool.py through it (local default_cwd, the Docker
cwd-passthrough source, and the debug-config print) so the same crash
can't resurface at a sibling site. Adds unit tests for the real-cwd path
and both fallback branches.

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>

* fix(update): require managed marker before destructive clean

* fix(cron): don't crash on `cron list` when a job's repeat is null

`cron_list` read `job.get("repeat", {})`, but the dict-default only
applies to a MISSING key. A one-shot job persisted with `"repeat": null`
returns None, and the next `.get("times")` raised AttributeError, taking
down the whole `cron list` output. Coalesce with `or {}` so a
present-but-null repeat renders as ∞ like the other cron readers already
do. Adds a regression test.

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>

* docs(dashboard): clarify auth provider suitability + registration across dashboard/Docker/Desktop docs (#39633)

* docs(dashboard): clarify auth provider suitability + document dashboard registration

- Add a 'Registering a dashboard' subsection under the Nous Research
  provider covering both the 'hermes dashboard register' CLI command
  and the Portal /local-dashboards GUI page.
- Note that the Nous provider is the one suitable for public-internet
  exposure (logins verified against your Nous account).
- Add a warning that the username/password provider is for trusted
  networks / VPN only and is not suitable for direct public-internet
  exposure; point readers to the Nous / OIDC / custom OAuth providers.
- Surface the same distinction in the two-provider intro list.

* docs(dashboard): count three bundled auth providers, add self-hosted OIDC to intro

'Two providers ship in the box' undercounted — the bundled
plugins/dashboard_auth/self_hosted (generic OpenID Connect) is a third.
List all three in the gated-mode intro and link each to its section.

* docs(dashboard): extend auth provider updates to Docker and Desktop pages

- docker.md: list all three bundled gate providers (was username/password
  + OAuth only), adding the self-hosted OIDC provider and its env vars,
  and note username/password is not for public-internet exposure.
- desktop.md: reframe the remote-backend connection so OAuth (Nous Portal)
  is the preferred option for any backend reachable beyond the local
  machine, with username/password positioned for local / trusted-network
  use only. Cover the 'Sign in with <provider>' OAuth flow in the in-app
  steps and scope the VPN warning to the password path.

* docs(dashboard): align env-var, CLI, and remote-Desktop recipe with provider changes

- environment-variables.md: reframe the Web Dashboard & Hermes Desktop
  intro (OAuth preferred for remote/public, username/password for
  trusted networks), add the self-hosted OIDC env vars
  (HERMES_DASHBOARD_OIDC_*) that were missing from the table, and note
  hermes dashboard register provisions the OAuth client_id.
- cli-commands.md: document the 'hermes dashboard register' subcommand
  (flags, behavior, /local-dashboards GUI alternative).
- web-dashboard.md: apply the OAuth-preferred reframe to the bottom
  'Connecting Hermes Desktop to a remote backend' recipe and scope its
  VPN warning to the username/password path, matching desktop.md.

* docs(dashboard): move 'recommended remote Desktop path' framing from username/password to OAuth

The gated-mode intro list claimed the username/password provider was the
recommended path for a remote Hermes Desktop connection, contradicting the
OAuth-preferred framing established elsewhere. Move that recommendation onto
the OAuth (Nous Portal) item so the docs are consistent: OAuth is the
recommended provider for any remote/internet-facing backend; username/password
is for trusted networks only.

* docs(dashboard): drop unreleased managed/hosted-install provisioning notes

Remove the 'not available in managed/hosted installs, where the client id is
provisioned by the hosting platform' line from the dashboard register docs
(web-dashboard.md, cli-commands.md) and the 'provisioned by the Nous Portal for
hosted deploys' clause from the HERMES_DASHBOARD_OAUTH_CLIENT_ID env-var row —
that platform-provisioning path is unreleased.

* docs(dashboard): drop --portal-url / HERMES_DASHBOARD_PORTAL_URL from user docs

The portal-URL override targets a non-production Nous Portal and only works
for internal Nous usage — it won't function for end users (the access token
must be issued by the same portal). Remove it from the register CLI flags,
the Nous-provider config/env tables, and the verify-the-gate example so users
aren't pointed at an option that can't work for them.

* docs(dashboard): add worked examples for Nous and username/password providers

The self-hosted OIDC provider already had a full 'Worked example: Keycloak'
walkthrough; the Nous and username/password providers only had scattered
config snippets. Add parallel '#### Worked example' sections for both
(register/run/login + /api/status verification), mirroring the Keycloak
example's structure so all three bundled providers read consistently.

* docs(env): move HERMES_DESKTOP_REMOTE_URL to end of the dashboard auth table

It was sitting between the HERMES_DASHBOARD_BASIC_AUTH_* block and the
HERMES_DASHBOARD_OAUTH/OIDC block, splitting the dashboard-side vars. As the
only desktop-side var in the table, it belongs at the end so the dashboard
provider vars (basic, OAuth, OIDC) stay grouped together.

* docs(dashboard): remove Fly.io references from dashboard auth docs

Fly.io is the internal hosting implementation for hosted Hermes — it shouldn't
leak into user-facing dashboard auth docs. Reword the OAuth provider intro,
the env-var-path rationale, the public-URL-override section, the cookie Secure
note, and the verify-the-gate example to generic 'hosting platform' / 'reverse
proxy' / 'TLS terminator' phrasing.

Left the legitimate user-facing Fly.io mentions in telegram.md (a deliberate
cloud-deployment walkthrough) and work-with-skills.md (a generic example)
untouched.

* fix(dashboard): strip session token from subprocess env

Add HERMES_DASHBOARD_SESSION_TOKEN to the Hermes-managed subprocess environment blocklist so dashboard authorization material does not propagate into shell, PTY, or background process launches.

Extend the local environment blocklist regression coverage to prove the dashboard session token is stripped like other Hermes-managed secrets.

* fix(cli): gitignore Desktop bootstrap marker so hermes update stops autostashing it

The Desktop bootstrap installer writes `.hermes-bootstrap-complete` into the
managed git checkout root. Because it wasn't gitignored, `hermes update`'s
`git stash push --include-untracked` treated it as a local change and created an
autostash on every run — prompting the user to restore "local changes" that were
really Hermes-managed runtime state (and risking the marker getting stranded in a
stash, which re-triggers Desktop bootstrap).

Add the marker to .gitignore; `git stash -u` and `git status --porcelain` both
skip ignored files, so the updater now sees a clean tree.

Fixes #38529

* chore(release): map harjoth.khara@gmail.com → harjothkhara for #38550 salvage

* fix(models): don't silently default Nous to the most expensive flagship

When a provider is configured but no model is selected (e.g. a profile sets
provider: nous with no model), the gateway/CLI fall back to
get_default_model_for_provider(), which returned the first curated catalog
entry. The Nous Portal list is ordered most-capable-first, so entry [0] is
anthropic/claude-opus-4.8 — the single most expensive model ($5/$25 per Mtok).
A misconfigured profile therefore silently routed every call to the flagship
and billed it for traffic the user never opted into.

Pin the silent (non-interactive) default for metered aggregators to the cheapest
curated tier via _PROVIDER_SILENT_DEFAULT_OVERRIDES so a missing model can never
auto-escalate to the flagship. The interactive default (GUI onboarding /
`hermes model`) keeps using the richer free/paid-tier-aware resolver.

Fixes the unexpected anthropic/claude-opus-4.8 charges reported for a
free-tier Nous account whose new profile had no default model.

* test(models): guard Nous silent default against expensive-flagship escalation

Assert get_default_model_for_provider("nous") never returns the priciest
catalog entry (anthropic/claude-opus-4.8) and that an override pointing at a
model absent from the catalog falls back to catalog order. Regression for the
silent flagship-billing footgun.

* fix(models): use deepseek-v4-flash as Nous silent default

Follow-up on the salvaged fix: point the Nous silent-default override at
deepseek/deepseek-v4-flash (a cheap chat model) instead of the nvidia
nemotron entry. Keeps the no-model-configured fallback off the priciest
flagship while landing on a low-cost, broadly-capable default.

* feat(discord): voice-channel mixer — ambient idle bed + verbal acks that overlap TTS (#39659)

* feat(discord): voice-channel mixer — ambient idle bed + verbal acks that overlap TTS

Discord voice mode can now feel conversational: the bot speaks a short
acknowledgement before it starts working, and a subtle ambient 'thinking' bed
plays underneath while tools run, ducking under speech and swelling back — the
Grok-voice-mode feel.

discord.py plays only one audio stream per voice connection, so this adds a
software mixer (VoiceMixer, a discord.AudioSource) installed once per guild on
join. It sums an ambient loop, verbal acks, and TTS replies into that single
20ms/48kHz/stereo stream (numpy int16 add + clip), so they overlap instead of
stop-and-swap. Speech ducks the ambient gain down and releases it smoothly.

- plugins/platforms/discord/voice_mixer.py: VoiceMixer + MixerChild (gain,
  loop, fade, duck/release), decode_to_pcm (ffmpeg), synth_ambient_pcm (no
  asset needed — synthesised pad).
- adapter: install mixer on join, tear down on leave, route
  play_in_voice_channel through the mixer (legacy one-shot path kept as
  fallback), play_ack_in_voice, voice_mixer_active. Defensive getattr for the
  object.__new__ test helpers.
- gateway/run.py: tool_start_callback fires a one-time verbal ack on the first
  tool call of a turn when in a voice channel (independent of the text
  tool-progress gate). No system-prompt or message-flow changes.
- config: discord.voice_fx.* (OFF by default; ambient/duck/speech gains, ack
  phrases). All in config.yaml, not .env.
- docs + tests (mixer unit + adapter integration).

Verified: 19 new tests pass, existing voice suite green (2 pre-existing
davey-module env failures unchanged), and a real-mixer E2E confirms ambient
streams, TTS overlaps it, acks layer in, and teardown is clean.

* fix(discord): make voice mixer numpy import lazy (numpy is voice-extra-only)

numpy ships in the optional 'voice' extra, not [all,dev], so a module-level
'import numpy' broke CI test collection (and would break the always-imported
Discord adapter on any install without the voice extra). Defer numpy to the
functions that actually mix audio via _require_numpy(); guard the test module
with pytest.importorskip('numpy').

* Add CLI Telegram QR onboarding

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>

* fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730)

* fix: respect disabled auto-compaction on context overflow

Port from anomalyco/opencode#30749.

When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.

Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.

Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).

* fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints

Two distinct failures hit users on the gemini provider with only Google
AI Studio keys set.

1. Truncation loop: build_gemini_request() only set maxOutputTokens when
   max_tokens was non-None. Hermes passes None to mean "unlimited", but
   Gemini's native generateContent does NOT treat an absent maxOutputTokens
   as full budget — it applies a low internal default and stops early with
   finishReason=MAX_TOKENS, truncating tool calls. The agent then retries
   3x and refuses the incomplete call. Now default to the published 65,535
   ceiling (shared by all current Gemini text models) when max_tokens=None.

2. HTTP 400 on Gemini endpoint: the chat_completions transport assembles
   profile extra_body (Nous portal 'tags', reasoning, provider prefs) and
   sends it via the OpenAI client to whatever base_url is resolved. When a
   profile that emits extra_body (e.g. Nous) is active but the endpoint is a
   native Gemini base_url — typical when only Google creds exist and a
   fallback/aux call lands on Gemini — Google rejects the unknown 'tags'
   field with a non-retryable 400. Strip all non-thinking_config extra_body
   keys when the resolved endpoint is native Gemini.

Verified E2E against real transport code: tags stripped on native Gemini,
preserved on Nous and the /openai compat endpoint; maxOutputTokens=65535
on None, explicit values respected.

* fix(cli): require Chromium for local browser readiness in setup/status surfaces

* feat(gateway): explain /voice usage when toggled bare (#39766)

A bare /voice silently toggled on/off with a one-line result, leaving
users with no idea what the modes mean or that Discord also supports
TTS-all and live voice-channel join/leave. Bare /voice now still
toggles but appends a usage explainer covering on/off/tts/status, with
the Discord voice-channel lines shown only on adapters that support
them.

Adds gateway.voice.help + gateway.voice.help_channels across all 16
locales (placeholders {toggle}/{channels}).

* feat(delegation): uncap max_spawn_depth (floor 1, no ceiling) (#39772)

* fix: respect disabled auto-compaction on context overflow

Port from anomalyco/opencode#30749.

When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.

Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.

Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).

* feat(delegation): uncap max_spawn_depth to match max_concurrent_children

Removed the hard ceiling of 3 on delegation.max_spawn_depth. Depth now has
a floor of 1 and no upper limit, mirroring max_concurrent_children. Cost
(each level multiplies API spend) is the practical limiter, not a constant.

- delegate_tool.py: drop _MAX_SPAWN_DEPTH_CAP, _get_max_spawn_depth() floors
  at 1 instead of clamping to [1,3]; depth-limit error string reworded
- config.py / cli-config.yaml.example: doc comments say floor 1, no ceiling
- docs (configuration, delegation, delegation-patterns): range 1-3 -> >=1
- tests: convert clamp-above-3 change-detector into a no-ceiling invariant,
  drop the _MAX_SPAWN_DEPTH_CAP==3 snapshot assert, fix warning-text assert

* fix(gateway): log silent file-delivery drops (#39767)

When the agent's reply references a deliverable file path that does not
exist on disk, extract_local_files dropped it from native delivery with
no log line — the most common reason a promised file never arrives over
a messaging platform. Add an INFO log at that drop point so the gap is
visible in gateway.log instead of vanishing.

Also convert the two print() calls in Telegram's send_document /
send_video exception handlers to logger.warning(exc_info=True). print()
writes to stdout, which 'hermes logs' never captures, so outbound upload
failures (oversized files, Bot API rejections) were invisible.

* feat(update): stash/restore by default + settable discard for non-interactive updates (reverts #38542, #39568) (#39645)

* Revert "fix(update): require managed marker before destructive clean"

This reverts commit c8e80cd0bfdbbfa0b14296ef59a1c3d353917add.

* Revert "fix(update): stop stash/restore from clobbering desktop source on managed clones (#38542)"

This reverts commit 8a19884bf3995a8d1144c828582de043abb4331c.

* chore(install): keep npm ci desktop-build fix after stash revert

The destructive-clean reverts (#38542/#39568) pulled the desktop
workspace install back to bare `npm install`. The npm ci -> npm install
fallback is orthogonal build-correctness (avoids the Windows
workspace-hoisting flake where install reports up-to-date against a
stale marker while node_modules is empty, breaking tsc -b). Preserve it.

* feat(update): settable stash-or-discard for non-interactive local changes

Adds updates.non_interactive_local_changes (stash | discard, default
stash). Governs ONLY non-interactive updates (desktop/chat app, gateway,
--yes) — interactive terminal updates always stash-and-ask, unchanged.

- config.py: new key under existing updates section; _config_version 26->27.
- main.py: _cmd_update_impl detects non-interactive (gateway/--yes/no-TTY),
  reads the setting; new _discard_stashed_changes() drops the stash
  (stash-and-drop, never reset --hard/clean -fd, so ignored paths survive).
  Post-pull restore site branches on it; the bail-out and up-to-date
  restores always preserve work.
- web_server.py + apps/desktop settings: exposes it as a stash/discard
  select (Advanced section, In-App Update Local Changes).
- docs + tests (discard drops, stash restores, interactive ignores setting,
  missing section defaults to stash).

* fix(install.ps1): stash/restore instead of reset --hard on Windows update

The PR reverted the destructive update path to stash/restore everywhere
except scripts/install.ps1, whose managed-clone update path still ran
`git reset --hard HEAD` before checkout — silently destroying agent-edited
tracked source on Windows (the same #38542 data-loss class the PR fixes).

- Replace `git reset --hard HEAD` with stash-before-checkout +
  restore-after-checkout, mirroring install.sh. Untracked files are
  included so agent-created dirs (e.g. tinker-atropos/) survive.
- Keep `core.autocrlf false` (it prevents the phantom CRLF dirt that made
  the stash necessary; it's also load-bearing for a clean restore).
- Wrap all three checkout modes (Commit/Tag/Branch); Branch case now uses
  `git pull --ff-only` so local commits are never clobbered.
- Only prompt to restore when a real console is attached (UserInteractive
  + non-redirected stdin/stdout + ConsoleHost); the desktop Update button
  and bootstrap have no usable console, so they default to restore and
  never hang on Read-Host.
- On restore conflict or a failed update, the stash is preserved with
  recovery instructions — work is never silently dropped.

Validated on Windows (PowerShell 5.1, git 2.54): AST parse clean;
E2E non-conflicting restore applies+drops cleanly with ignored paths
(node_modules) untouched; conflicting restore preserves the stash.

---------

Co-authored-by: alt-glitch <balyan.sid@gmail.com>

* fix(update): make ensure_uv() survive the update boundary (no first-run crash) (#39780)

* fix(update): make ensure_uv() survive the update boundary (no first-run crash)

`hermes update` runs the `ensure_uv()` call site from the old, already-imported
`hermes_cli.main` against the *freshly pulled* `managed_uv` (managed_uv is only
ever lazily imported, so it loads from disk post-pull). `ensure_uv()`'s return
arity flipped from a single path string to `(path, fresh_bootstrap)` (4df280d51)
and back to a single string (fb853a178). Installs parked on a 2-tuple release
unpack `uv_bin, fresh_bootstrap = ensure_uv()` against the new single-value
module and crash the first update with
`ValueError: not enough values to unpack (expected 2, got 1)` — inside the
dependency-install step, *before* the PR #39763 subprocess hand-off can run.

Return a `_UvResult` (a `str` subclass) that is usable as the bare path AND
unpackable as `(path|None, fresh_bootstrap)`. Missing uv is `""` (falsy) instead
of `None` so legacy 2-target call sites can unpack a failure without raising,
while `if not uv_bin` keeps working for single-value callers. fresh_bootstrap is
always False (the rebuild-venv path it gated was scrapped in fb853a178).

* docs(update): correct the verified error string + mechanism for ensure_uv()

A hermetic repro (old 2-target call site vs the freshly-pulled single-value
module) shows the first-update crash is exactly the string from PR #39763's
report: `ValueError: too many values to unpack (expected 2)` — not "not enough".
The returned path is a plain `str`, which is iterable, so `uv_bin, fresh =
ensure_uv()` walks its characters; the failure path's `None` return raises
`TypeError: cannot unpack non-iterable NoneType`. Both are fixed by `_UvResult`.
Comment/test wording updated to match; no behavior change.

* feat(desktop): per-profile remote gateway hosts (#39778)

* feat(desktop): per-profile remote gateway hosts

Profile switching silently failed whenever the desktop was connected to a
remote backend: the rail routed non-active profiles to a local pool backend,
but spawnPoolBackend hard-threw "Profiles are unavailable when connected to a
remote Hermes backend", and the renderer swallowed the error into an infinite
reconnect backoff while still marking the profile active. Remote was also a
single app-global setting, so there was no way to give a profile its own host.

Add per-profile remote hosts so each profile can point at its own backend:

- connection.json gains a validated `profiles` map; profileRemoteOverride()
  (pure, unit-tested) selects an explicit per-profile remote.
- resolveRemoteBackend(profile) precedence: per-profile override → env override
  → global remote → local spawn. spawnPoolBackend now connects to a profile's
  remote (no local child) instead of throwing; startHermes resolves the primary
  profile's remote.
- coerce/sanitize connection config are scope-aware (global vs named profile)
  and preserve each other's entries; IPC get/save/apply/test thread an optional
  profile. Per-profile apply drops only that profile's pool backend.
- Settings → Gateway adds an "Applies to" scope selector reusing the existing
  URL/token/OAuth/test UX per profile.

Tests: connection-config pure suite (+6) and desktop platform suite pass;
tsc/eslint/vitest clean.

* refactor(desktop): DRY per-profile remote helpers

Share connectionScopeKey + normAuthMode from connection-config.cjs (drop the
main.cjs copy), collapse the scope/auth ternaries, route the env remote through
buildRemoteConnection, and fold the duplicated remote-block validation into
buildRemoteBlock. No behavior change; pure suite + live E2E still green.

* fix(desktop): heal stale runtime-id cache + model on profile switch (#39819)

Two switch-time regressions from the multi-profile rail work:

- "Session not found" (4007): pruneSecondaryGateways idle-reaps a
  non-active profile's backend; switching back respawns a *fresh*
  backend that mints new runtime ids, but runtimeIdByStoredSessionId is
  never pruned. resumeSession's cache fast-path then makes a dead runtime
  id active and returns, so session.usage + the next prompt 404. Probe
  the cached id; on rejection drop the stale mapping and fall through to
  a full resume that rebinds a live id.

- "Forgets the LLM setting": $currentModel is a nanostore set only by
  refreshCurrentModel (gatewayState->open, etc). A swap fires
  invalidateQueries() (react-query only) and keeps the socket 'open', so
  the model/pill kept showing the previous profile. Re-pull both when
  $activeGatewayProfile changes.

* fix(update/windows): don't return _UvResult on Windows (subprocess argv crash) (#39820)

PR #39780 made ensure_uv() return a _UvResult — a str subclass whose
__iter__ yields (path, fresh_bootstrap) so old `uv_bin, fresh = ensure_uv()`
call sites survive the update boundary. That trick is unsafe on Windows.

The dependency installer passes uv straight into the command list
(`[uv_bin, "pip", "install", ...]`). On Windows, subprocess serializes argv
via subprocess.list2cmdline, which iterates every entry *as a string*
(`for c in arg`). Because _UvResult overrides __iter__, that iteration yields
(path, fresh_bootstrap) instead of characters, injecting the bool into the
command line and crashing the first update with:

    TypeError: sequence item 1: expected str instance, bool found

This bites the common single-assignment caller (`uv_bin = ensure_uv()`) on
its first update after #39780: the freshly pulled _UvResult flows into the
old in-memory call site and into the argv. Reported in the field on a
~10-commits-behind Windows install.

A single return value cannot satisfy both legacy 2-target unpacking and
Windows char-iteration — both use the iterator protocol with contradictory
results. So gate the wrapper to POSIX: Windows returns a plain str/None
(the historical, subprocess-safe contract). POSIX keeps _UvResult and the
#39780 update-boundary fix.

Tests: list2cmdline canary proving _UvResult breaks Windows, plus Windows
returns-plain-str and POSIX dual-contract coverage.

* fix(desktop/windows): stop racing our own backend during in-app update (#39828)

* fix(desktop/windows): stop racing our own backend during in-app update

The Windows in-app update (Update button -> hermes-setup.exe --update handoff)
bricked because it raced a still-locked hermes.exe: the desktop quit
fire-and-forget without reaping its backend child + grandchildren, so when
the updater ran `hermes update`, the venv shim was still open. The quarantine
rename then failed, uv's `pip install -e .` hit "Access is denied", the git
path bailed to a full ZIP re-download, and the deps still couldn't write the
locked shim -- leaving a half-applied install. macOS is fine because it never
blocks REPLACE on a running executable.

Three coordinated fixes restore Mac-style parity (click Update -> progress ->
relaunch, no terminal):

A. Desktop (main.cjs): before spawning the updater, releaseBackendLockForUpdate()
   tree-kills the primary + pool backends (taskkill /T /F on Windows, to catch
   REPL/pty/gateway grandchildren that SIGTERM misses) and polls the venv shim
   until it is actually writable (bounded 15s) -- so the lock is gone before we
   hand off. Also fixes resolveHermesCliBinary to use venv\Scripts\hermes.exe on
   Windows.

B. Updater (update.rs): wait_for_venv_free no longer "proceeds anyway" on
   timeout -- it force-kills any lingering hermes.exe (excluding itself) and
   re-checks, so a straggler can't doom the install.

C. Updater (update.rs): pass --force to `hermes update`. By contract the desktop
   has exited + waited, and the wait force-kills stragglers, so the running-exe
   guard would only produce a false "Hermes is still running" dead-end.

Verified: node --check on main.cjs, cargo check on the updater (clean), and the
Windows-gated taskkill body type-checks standalone. Field repro: ryanc's
update.log (manual + handoff both hit the same lock cascade).

* review: scope backend kill+wait to Windows; drop meaningless POSIX pgid kill

* fix(update/windows): retry handoff `hermes update` once on first-run crash (#39831)

The in-app updater (Hermes-Setup --update) runs `hermes update`, which lazily
imports the freshly-pulled modules — but the dependency-install step runs the
already-in-memory PRE-pull code for one invocation. When a release changes an
updater-path contract across that boundary, the FIRST update on the parked
population crashes even though the fix is already on disk.

Concretely this is #39780's `_UvResult`: its `__iter__` yields (path, bool), so
Windows `subprocess.list2cmdline([uv_bin, "pip", ...])` injects the bool and
dies with `TypeError: sequence item 1: expected str instance, bool found`
(fixed in #39820). A parked Windows user clicking Update pulls #39820 to disk,
then still crashes on the in-memory pre-merge module; only the SECOND click runs
clean. Field repro: ryanc's bootstrap.log (2026-06-05 12:41:41).

Fix: when the first `hermes update` exits non-zero (and it isn't the
concurrent-instance guard, exit 2, which a retry can't fix), retry once
automatically. The retry loads the now-current module from the start and
succeeds — so the parked user gets a working one-click update instead of a
scary crash + manual second attempt.

Verified: cargo check clean.

* fix(completion): remove /model <arg> autocomplete from CLI/TUI (#39727)

* fix: respect disabled auto-compaction on context overflow

Port from anomalyco/opencode#30749.

When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.

Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.

Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).

* fix(completion): remove /model <arg> autocomplete from CLI/TUI

The TUI frontend already suppressed /model argument completion in favor of
the two-step ModelPicker (useCompletion.ts), but the CLI prompt_toolkit
completer and the gateway-backed complete.slash RPC (TUI + desktop) still
emitted model aliases and probed LM Studio on every keystroke.

Drops the /model branch in SlashCommandCompleter.get_completions, the
_model_completions method, and the LM Studio probe/cache helper that only
fed it. Command-name completion (/mod -> model) and sibling arg completers
(/skin, /personality) are untouched. Removes the now-dead TestModelTabCompletion
tests.

* docs: remove --include-desktop install instructions (#39762)

* docs: remove --include-desktop install instructions

Drop the --include-desktop curl one-liner from the desktop app docs.
The flag remains in scripts/install.sh; these docs now point to the
desktop installer / website and the 'hermes desktop' path instead.

* docs: remove --include-desktop from install docs

Drop the redundant 'Hermes Desktop installer on Linux' block (which
used --include-desktop) from quickstart, installation, and index docs.
The website installer covers macOS/Windows desktop; the CLI-only path
covers Linux. Removes the flag from all user-facing docs.

* perf(/model): prewarm picker provider-models cache in background (#39847)

* fix: respect disabled auto-compaction on context overflow

Port from anomalyco/opencode#30749.

When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.

Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.

Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).

* perf(/model): prewarm picker provider-models cache in background

The no-args /model picker calls list_authenticated_providers(), which
fetches each authenticated provider's live /v1/models list serially. On a
cold or stale (>1h TTL) cache that blocks ~1.5s on the user's critical path
the first time /model is opened in a session.

Warm that exact path off-thread during the idle window right after the CLI
banner is shown: a once-per-process daemon thread runs
list_authenticated_providers() to populate provider_models_cache.json for
every authed provider. By the time the user types /model, the picker hits
the warm disk cache (~136ms vs ~1500ms).

Process-level Event guard (mirrors run_agent's _openrouter_prewarm_done)
ensures at most one thread per process; fully exception-isolated so an
offline/no-creds provider can never affect the session.

* fix(termux): scope frontend npm installs

* fix(desktop): route remote-profile session reads to the owning remote backend

Per-profile remote hosts (#39778) wired the chat/resume socket to a profile's
remote backend, but session list + transcript reads still assumed every
profile's state.db is a local file the primary can open. For a remote profile
the local file is absent or stale, so the IDs the sidebar shows 404 the moment
resume runs against the remote -- the "session not found -> new session" bug.

Intercept the three session-read GETs in the hermes:api handler and route them
to the owning remote backend (which serves its own state.db natively):

  GET /api/profiles/sessions        -> splice each remote profile's real rows in
  GET /api/sessions/{id}[/messages] -> read from the remote for remote profiles

No remote profiles configured -> untouched local fast path. A dead remote
contributes nothing rather than breaking the sidebar.

Verified end-to-end against a live remote backend: a remote-profile session
resumes from remote history and continues on the remote across turns (history
grows in place, no new session spawned).

* fix(desktop): route remote-profile session mutations + fix unified-list pagination

Follow-up to the read-routing fix: make remote-profile sessions fully
first-class, not just resumable.

Mutations (rename/archive/delete) went through the same hermes:api handler but
never carried the owning profile, so they hit the local primary's state.db --
which has no row for a remote session. Deleting/archiving/renaming a remote
session silently no-op'd or 404'd, and the row reappeared on next refresh.

- hermes.ts: setSessionArchived/deleteSession/renameSession take the owning
  profile and pass it as request.profile so Electron routes to that profile's
  backend (matching the read path). Callers now forward session.profile.
- main.cjs: generalize the intercept (read -> request) to also reroute
  DELETE/PATCH on /api/sessions/{id} for remote profiles, stripping the profile
  param (the remote serves its own state.db; no cross-profile semantics there).
- web_server.py: DELETE /api/sessions/{id} gains a profile param for parity with
  GET/PATCH (local cross-profile delete).

Also fix the unified-list merge: it concatenated each remote's page onto the
primary's without re-windowing, so a limit=N request could return up to
N*(1+remotes) rows and report the primary's (stale) total. Now it over-fetches
limit+offset from each remote (from offset 0), re-sorts by recency, re-windows
to the page, and recomputes total/profile_totals from the remote counts.

Verified live against a remote backend: rename/archive/delete mutate the remote
db; page 1 windows to limit, profile_totals reflect remote counts, page 2 has no
overlap with page 1. tsc -b clean; connection-config tests pass.

* fix(gateway): propagate max_tokens from config.yaml to AIAgent

max_tokens set under model: in config.yaml was silently ignored.
The value was never read from config, never passed through
_resolve_runtime_agent_kwargs(), _resolve_turn_agent_config(),
or the session override path.  Added it to all three code paths
so custom/Ollama endpoints receive the correct output cap.

Closes #20741

* fix(cli,gateway): complete max_tokens propagation — CLI path + env var override

Previous commit only covered the gateway runtime path. This adds:
- CLI __init__: read max_tokens from model config with HERMES_MAX_TOKENS env override
- CLI AIAgent() calls (interactive + background): pass max_tokens
- Gateway _resolve_runtime_agent_kwargs: add HERMES_MAX_TOKENS env override

All three code paths (CLI, gateway runtime, session override) now
consistently propagate max_tokens to AIAgent.

* fix(gateway): honor per-provider max_output_tokens in max_tokens chain

Widens ViewWay's #20741 fix to the sibling config surface: a
custom_providers entry can pin its own output cap via max_output_tokens
(or max_tokens). _get_named_custom_provider now lifts it onto the
resolved runtime at all three return sites, and the gateway uses it as a
fallback only when the documented global model.max_tokens isn't set, so
the global key always wins.

Precedence: HERMES_MAX_TOKENS > model.max_tokens > provider
max_output_tokens > None. Closes the same #20741 truncation for users who
configure the cap per-provider rather than globally.

Picks up the intent of #19782 (alexcam1901), reimplemented to feed
ViewWay's max_tokens pipeline.

* test(gateway): regression tests for max_tokens propagation chain (#20741)

* chore(release): map ViewWay author email for AUTHOR_MAP

* fix(desktop+gateway): full multi-profile support over one global-remote dashboard (#39921)

* fix(desktop): cross-profile session history in app-global remote mode

#39894 made remote-profile sessions first-class for PER-PROFILE remote
overrides. But the common setup — Settings → Gateway → "All profiles" → Remote
— writes app-GLOBAL remote mode (connection.json top-level mode:'remote', empty
profiles map), which the intercept didn't recognize. Switching to a non-launch
profile then 404'd every session read, so no history showed for it.

In global remote mode a SINGLE backend serves every profile via ?profile= (it
reads each profile's state.db off the remote host's own disk — verified: one
dashboard returns /api/profiles and /api/profiles/sessions?profile=all across
all profiles). The fix: when no per-profile override matches but global remote
mode is active, route per-session reads/mutations to that one backend and KEEP
the ?profile= param so it opens the right state.db (instead of bailing to the
local path and dropping the profile scope).

- new globalRemoteActive() — true for connection.json mode:'remote' or the
  HERMES_DESKTOP_REMOTE_URL env override.
- per-session branch: per-profile override → route sans profile (own db);
  global mode → route to the single backend WITH ?profile= preserved.
- unified list is unchanged in global mode: it already passes through to the one
  backend, which aggregates all profiles natively.

Verified live against a one-dashboard / multi-profile remote (Austin's topology):
cross-profile transcript reads load (was 404), rename/delete route to the right
profile, unified list spans both profiles.

Known limitation (architectural, not fixed here): LIVE chat as a non-launch
profile still needs a per-profile dashboard on the remote — the dashboard binds
HERMES_HOME once at process start, so one global backend can't run an agent
turn as another profile. Session history/read/mutate now work regardless.

* fix(gateway): resume + chat any profile over one global-remote dashboard

The REST half of this branch made cross-profile session history visible in
app-global remote mode, but resume + chat still went over the WebSocket gateway,
which was hard-bound to the dashboard's launch profile. Resuming a non-launch
profile's session 404'd ("session not found") and sending spawned a new session
— because session.resume/prompt.submit had no profile concept and the live
agent + state.db were process-global to the launch profile's HERMES_HOME.

Make the WS gateway per-session profile-aware so ONE dashboard can serve every
local profile on its host (the app-global remote topology):

- session.resume accepts an optional `profile`. _profile_home() resolves that
  profile's home on this host; resume opens THAT profile's state.db, binds its
  HERMES_HOME (ContextVar override) while building the agent so config/skills/
  model resolve to it, and passes the profile db to the agent so turns persist
  to the right state.db. The owning profile_home is stored on the session.
- prompt.submit re-binds the stored profile_home for the turn thread (mid-turn
  home reads — memory, skills — resolve to the resumed profile), reset in finally.
- _make_agent gains an optional session_db param (defaults to _get_db()).
- _load_cfg honors the home override (falls back to _hermes_home) so a resumed
  profile loads its own config; cache keyed on resolved path.
- desktop: session.resume now sends the owning profile.

Omitted/launch profile → unchanged (single-profile and per-profile-remote setups
are byte-for-byte the same path). Verified live against a one-dashboard /
multi-profile remote: resuming a non-launch profile's session loads its history,
runs a real turn against THAT profile's home/env, and persists to its state.db.

tests/tui_gateway/test_protocol.py: _make_agent mocks updated for the new param.

* feat(desktop): add i18n with Simplified Chinese (zh-Hans) support

Introduce a lightweight React context-based i18n layer for the desktop
app and translate the UI into Simplified Chinese.

- New apps/desktop/src/i18n module: typed Translations interface, en + zh
  locale tables, I18nProvider/useI18n, localStorage-persisted locale
  (defaults to English), and language endonym metadata for the picker.
- Wire I18nProvider at the app root in main.tsx.
- Refactor 24 desktop screens/components to read strings from the `t`
  object instead of hard-coded English.
- Add a unit test for the i18n context.

* feat(desktop): persist i18n language in config

* fix(gateway): new chats honor their profile in global-remote mode (#39993)

Follow-up to #39921. That PR scoped session.resume + prompt.submit to a
session's profile, but a BRAND-NEW chat (session.create) under a non-launch
profile was still built and persisted against the dashboard's launch profile.
Two visible symptoms in app-global remote mode (one dashboard, many profiles):

  1. "who are you" in profile S replied as the launch (default) profile/agent —
     the agent was built with the launch HERMES_HOME, so config/SOUL/identity
     came from the wrong profile.
  2. "session not found" on later resume — _ensure_session_db_row persisted the
     row into the launch profile's state.db via _get_db(), so the session lived
     in the wrong db, the unified list mis-tagged it (it showed up under BOTH
     profiles), and resume routed to the wrong one.

Fix — carry the owning profile through the create path too:

- session.create accepts an optional `profile`; resolves its home and stores
  `profile_home` on the session (alongside what resume already set).
- _start_agent_build binds that profile's HERMES_HOME while building the agent
  (config/skills/model/identity resolve to it) and hands the agent the profile's
  state.db so turns persist there.
- _ensure_session_db_row writes the row into the profile's state.db, not the
  launch db — fixing the duplicate row + mis-tag + resume 404.
- desktop sends the new-chat profile on session.create.

None/launch profile → unchanged (single-profile and per-profile-remote setups
take the same path). Verified live against a one-dashboard / multi-profile
remote: a new chat under `work` builds as work's agent (correct SOUL identity),
persists ONLY to work's state.db (launch db stays empty), the unified list tags
it `work` exactly once, and it resumes cleanly.

tests/test_tui_gateway_server.py: _make_agent mocks updated for the session_db
param added in #39921's build path.

* feat(model_switch): honor discover_models in custom_providers section 4

Section 3 (user `providers:`) already honors `discover_models: false` to
skip live /models discovery and keep the explicit `models:` list. Section 4
(`custom_providers:` list) did not — `should_probe` ignored the field, so any
grouped custom provider with an api_key always had its configured subset
replaced by the full live /models catalog.

This adds the same `discover_models` support to section 4:
- Default True — no behaviour change for existing configs.
- `discover_models: false` keeps the explicit `models:` list even when an
  api_key is present.
- String values ("false"/"no"/"0") are normalised to False, matching
  section 3.
- If any entry in a grouped endpoint opts out, the whole group opts out.

Use case: endpoints that expose a full aggregator catalog via /models but
only serve a configured subset.

Salvaged from #29810 — rebased onto current main. The PR's other change
(`key_env` resolution in section 4) landed independently in commit aa283d1e4
(custom provider picker credential isolation), so only the discover_models
portion is carried here.

Co-authored-by: ohMyJason <42903577+ohMyJason@users.noreply.github.com>

* chore: add ohMyJason to AUTHOR_MAP

* feat(model): honor d…
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
…usResearch#37247)

* feat(dashboard-auth): rotate dashboard sessions via refresh token

The dashboard auth-code grant now issues a 24h rotating refresh token
(server side: NousResearch/nous-account-service#293). This wires up the
Hermes client half so an expired access token is transparently refreshed
instead of bouncing the user to /login every 15 minutes.

plugins/dashboard_auth/nous:
- refresh_session() now POSTs grant_type=refresh_token to Portal's token
  endpoint and returns a Session carrying the ROTATED refresh token (was
  an unconditional RefreshExpiredError under the old "no RT in V1"
  contract). The RT is sent in BOTH the request body (Portal's schema
  requires it there) and the X-Refresh-Token header (log redaction) —
  verified against the NousResearch#293 preview deploy: header-only is rejected as
  invalid_request, body is accepted.
- A 400 from Portal (expired / revoked / reuse-detected) maps to
  RefreshExpiredError so the middleware forces a clean re-login; network
  errors map to ProviderError; empty RT fast-fails without a network call.
- complete_login now captures the initial refresh token Portal returns
  (forward-tolerant: empty string if a deploy omits it).
- Extracted the shared token-response handling into
  _token_response_to_session, parameterised on the 400 exception type so
  the auth-code path raises InvalidCodeError and the refresh path raises
  RefreshExpiredError.
- revoke_session stays a best-effort no-op: Portal exposes no public
  token-endpoint revocation grant (revocation is the authenticated
  /sessions UI, keyed by sessionId+userId), so logout is cookie-clearing
  and the 24h session expires on its own. Documented for a future
  revoke grant.

hermes_cli/dashboard_auth/middleware:
- On an expired/invalid access token the gate now attempts refresh via
  the session's RT BEFORE forcing re-login. On success it serves the
  request and re-sets the rotated cookies on the response (mandatory:
  Portal rotates the RT every refresh and reuse-detects, so a stale RT
  cookie would revoke the whole session on the next refresh). On
  RefreshExpiredError (or no RT) it falls through to clear-and-relogin.
- ProviderError during refresh (Portal unreachable) forces a clean
  re-login rather than 500-ing the request.
- Uses the existing REFRESH_SUCCESS / REFRESH_FAILURE audit events.

Validation:
- 176 dashboard-auth unit/integration tests pass.
- Live E2E against the NousResearch#293 preview deploy: refresh_session(bad rt) ->
  RefreshExpiredError through the real token endpoint; live JWKS fetch +
  RS256 verification rejects a forged token; empty-RT fast-fail. The
  successful happy-path rotation is covered by unit tests (a live run
  needs an interactive browser OAuth round trip + registered agent:*
  client).

Depends on: NousResearch/nous-account-service#293 (server-side RT issuance).

* fix(dashboard-auth): use Portal's x-nous-refresh-token header name

The refresh-token header must match Portal's REFRESH_TOKEN_HEADER exactly
("x-nous-refresh-token"); the initial cut used "X-Refresh-Token", which
Portal silently ignores (harmless since the RT is also in the body, which
is what the schema requires — but the header redaction was a no-op).
Confirmed against the NAS token route + re-validated live against the
NousResearch#293 preview deploy.

* fix(dashboard-auth): refresh session when access-token cookie has been evicted

The gated middleware bounced users to /login the instant the access-token
cookie was absent, without ever consulting the refresh token:

    at, _rt = read_session_cookies(request)
    if not at:
        return _unauth_response(...)   # bailed here

This made transparent refresh effectively dead for the common case. The
access-token cookie is set with Max-Age = access_token_expires_in (~15 min),
so a real browser EVICTS hermes_session_at the moment the token lapses while
hermes_session_rt persists (30-day Max-Age). From that point the browser
sends only the refresh-token cookie — and the old guard rejected it before
_attempt_refresh could run. The _attempt_refresh path only fired for a
present-but-invalid access token, which never happens in a browser.

Fix: only hard-bounce when NEITHER cookie is present. A request carrying
just the refresh token now skips verification (no AT to verify) and flows
into the existing refresh path, which rotates both cookies and serves the
request transparently. A dead/expired RT still raises RefreshExpiredError
and falls through to clear-and-relogin.

This failure mode escaped the original tests + manual refresh button because
both kept the access-token cookie present; only a real browser evicting the
cookie at Max-Age exposes it. Added 3 regression tests covering: AT-evicted +
RT-present (transparent refresh), no-cookies (still bounces), and RT-only with
a dead RT (clean 401, no 500).
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
… alone

The desktop OAuth remote-gateway path gated connectivity on
hasOauthSessionCookie(), which checks only the access-token cookie
(hermes_session_at, ~15 min TTL). The moment that cookie's Max-Age
lapsed, Electron's cookie jar dropped it and both resolveRemoteBackend()
and sanitizeDesktopConnectionConfig() reported "not signed in" — forcing
a full IDP re-login every ~15 min — even though a valid 24h refresh-token
cookie (hermes_session_rt) was sitting in the same jar.

The desktop OAuth code (2026-06-04) was written against the obsolete
"contract v1 issues no refresh token" model, two days after NousResearch#37247
re-introduced server-side transparent refresh: Portal now issues a 24h
rotating, reuse-detected refresh token, and the gateway middleware
(_attempt_refresh) rotates a fresh AT from the RT on the next
authenticated request. So an expired-AT/live-RT session is fully
connectable — the desktop just never let the request through.

Fix:
- connection-config.cjs: add RT_COOKIE_VARIANTS + cookiesHaveLiveSession()
  (true when EITHER a live AT or RT cookie is present). Keep
  cookiesHaveSession() AT-only for callers that need that specific signal.
- main.cjs: add hasLiveOauthSession(); resolveRemoteBackend()'s oauth
  branch now early-outs only when NEITHER cookie is present, otherwise
  uses the ws-ticket mint as the authoritative liveness probe (that POST
  carries the RT cookie and triggers the server-side AT rotation). A real
  401 still surfaces as needsOauthLogin. Settings indicator + oauth-logout
  report against the same AT-or-RT notion.
- Remove the stale "contract v1 / NO refresh token" docstrings in
  cookies.py and the verify_session comments in the Nous provider that
  contradicted NousResearch#37247.

Tests: +57 lines in connection-config.test.cjs covering the RT-only
"still connectable" case. node --test: 32/32. dashboard-auth +
nous-provider Python suites: 223/223.

Note: server-side files (hermes_cli/dashboard_auth/, plugins/dashboard_auth/)
are comment/docstring-only here, but this touches outside apps/desktop/ so
it needs Teknium review.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/auth Authentication, OAuth, credential pools comp/cli CLI entry point, hermes_cli/, setup wizard comp/plugins Plugin system and bundled plugins P3 Low — cosmetic, nice to have type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants