fix: enforce memory routing on replace/remove #1
Merged
… from NousResearch#32487) (NousResearch#38756)
* refactor(supermemory): session-level conversation ingest + kebab tool aliases
Salvaged from NousResearch#32487 (by @MaheshtheDev), rebased onto current main.
- sync_turn now buffers cleaned turns; the full session is ingested once at
session end / switch / shutdown via the conversations endpoint
- ingest_conversation() accepts and forwards functional document metadata
(type, session_id, message_count, partial)
- register kebab-case tool aliases (supermemory-save/search/forget/profile)
alongside the snake_case names
- README + docs (EN/zh-Hans) updated for the simplified session model
Source/vendor-attribution removed per project policy (no telemetry): dropped
x-sm-source header, sm_source metadata, and sm_capture_mode tags.
Preserved the post-branch atomic_json_write(mode=0o600) hardening that the
PR's stale base had reverted. Updated provider tests for the new behavior and
added maheshthedev@gmail.com to release.py AUTHOR_MAP.
Co-authored-by: alt-glitch <balyan.sid@gmail.com>
* feat(supermemory): restore x-sm-source for Spaces routing
Reinstates x-sm-source: hermes (SDK default_headers + conversations POST) and
sm_source: hermes document metadata. Per @Dhravya (Supermemory), this is a
functional routing key, not telemetry: it groups Hermes writes into a
dedicated "Hermes" Space in the Supermemory app so users can filter and
bulk-manage memories per source agent.
sm_capture_mode remains dropped (appears analytics-only; Spaces are routed by
sm_source) pending confirmation. Adds README note + a unit test covering
_merge_metadata sm_source stamping and legacy source->type migration.
---------
Co-authored-by: Mahesh Sanikommu <maheshthedev@gmail.com>
…-compat fix(config): align prefill messages key handling
…ient
Adds a CLI command that registers this install as a self-hosted dashboard
with the user's Nous Portal account, automating the manual browser flow on
/local-dashboards.
- New hermes_cli/dashboard_register.py: resolves a fresh Nous access token
from auth.json (fast-fails with a `hermes setup` hint when not logged in),
POSTs to {portal}/api/oauth/self-hosted-client, and writes
HERMES_DASHBOARD_OAUTH_CLIENT_ID into ~/.hermes/.env idempotently.
- Docker-style adjective_noun auto-naming; --name and --redirect-uri overrides.
- Persists HERMES_DASHBOARD_PORTAL_URL only when non-default and unset (so a
Vercel preview / staging portal sticks, prod default stays implicit).
- Refuses in managed/hosted installs (the orchestrator stamps the client_id).
- Post-register hint explains the OAuth gate only engages on a non-loopback bind.
- Nested 'register' subparser leaves bare `hermes dashboard` unchanged.
- 9 unit tests (name gen, fast-fails, POST shape, env writes, redirect URI,
portal-URL persistence, 401/403 mapping); dashboard lifecycle tests still green.
Depends on NousResearch/nous-account-service#324 (the portal endpoint).
…ride in register
The register command resolved the portal base URL purely from the stored
login, ignoring any override. That meant `HERMES_DASHBOARD_PORTAL_URL` (and
the absence of any flag) gave no way to point registration at a staging or
preview portal: the request always hit the login's portal, returning 404
against a branch that wasn't deployed there.
- _resolve_portal_base_url now takes an optional override (precedence:
override > stored login portal > prod default).
- New --portal-url flag; falls back to HERMES_DASHBOARD_PORTAL_URL env.
- Documents that the access token must be valid at the overridden portal
(it's minted by whoever you logged into).
- 3 new tests for override precedence.
Verified live against the PR NousResearch#324 Vercel preview: CLI -> preview
endpoint -> real agent:{id} client_id written to .env.
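The precedence described above (override > stored login portal > prod default, with the flag falling back to the env var) can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name, its parameters, and DEFAULT_PORTAL_URL are assumptions, and the placeholder default is not the real production URL.

```python
import os

# Hypothetical placeholder: the real production portal URL is not stated in the commit.
DEFAULT_PORTAL_URL = "https://portal.example.invalid"


def resolve_portal_base_url(override=None, stored_login_portal=None, env=None):
    """Pick the portal base URL: override > stored login portal > default."""
    env = os.environ if env is None else env
    # An explicit flag value wins; otherwise the env var acts as the override.
    effective_override = override or env.get("HERMES_DASHBOARD_PORTAL_URL")
    chosen = effective_override or stored_login_portal or DEFAULT_PORTAL_URL
    # Normalise so later path joins do not produce a double slash.
    return chosen.rstrip("/")
```

Passing `env` explicitly keeps the helper testable without touching the real process environment.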
…earch#38743)
* Port from google-gemini/gemini-cli#21541: back up corrupted config.yaml
When config.yaml fails to parse, load_config() silently falls back to
DEFAULT_CONFIG and leaves the broken file on disk. If the user then re-runs
the setup wizard or hermes config set (both rewrite config.yaml), their
broken-but-recoverable overrides are lost for good.
Adapts the policy-file recovery from gemini-cli#21541: on the first parse
warning for a given broken file, snapshot it to config.yaml.corrupt.<ts>.bak
(best-effort, symlink-guarded, size-deduped) and tell the user where it
landed. Unlike Gemini's version we deliberately do NOT reset config.yaml to a
clean state: hermes never silently mutates user config, and leaving it means
a hand-fixed file is re-read on the next load.
Tests: 3 new cases (backup created + content preserved + original untouched;
same-size backup dedup; symlink not copied). E2E verified with isolated
HERMES_HOME and a real tab-indented broken config.
* fix(dashboard): explain WHY a chat WS connection was refused
The embedded-chat PTY WebSocket (/api/pty) collapsed every rejection into a
bare close code: 4401 for any auth failure, 4403 for three unrelated failures
(host mismatch, origin mismatch, peer-IP). Neither the server log nor the
browser said which gate fired or why, so a "chat won't connect" report was
undiagnosable without a repro.
Server (web_server.py):
- _ws_auth_reason / _ws_host_origin_reason / _ws_client_reason return a short
machine-parseable reason; old bool wrappers kept for callers/tests.
- pty_ws splits the overloaded 4403 into 4401 (auth), 4403 (host/origin),
4408 (peer not allowed), 4404 (chat disabled), and sends the reason on the
close frame (clamped to the 123-byte RFC6455 limit).
- Each path logs one line: 'pty auth rejected reason=.. mode=.. cred=..
peer=..' / 'pty refused: <reason> ..'. Accepted path logs 'pty accepted
peer=.. mode=.. cred=..' so an audit shows HOW a peer authed, not just that
it did.
tui_gateway/ws.py:
- 'ws send/write failed' now logs error_type=<ExcName> so an exception whose
str() is empty (closed-transport sends) no longer logs 'error='.
web/src/pages/ChatPage.tsx:
- console.warn the real close code + server reason on every close.
- Map 4404/4408 to specific banners; 4401/4403 banners echo the server
reason; [session ended] prints the close code.
E2E verified all five reject paths + accepted path produce matching close
code, wire reason, and server log line.
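A WebSocket close frame carries a 2-byte status code plus a UTF-8 reason in a payload of at most 125 bytes, which is where the 123-byte clamp comes from. A minimal sketch of that clamp and the close-code split follows; the names PTY_CLOSE_CODES and clamp_close_reason are illustrative assumptions, not identifiers from web_server.py.

```python
# Close codes as listed in the commit message; the dict name is hypothetical.
PTY_CLOSE_CODES = {
    "auth": 4401,
    "host_origin": 4403,
    "chat_disabled": 4404,
    "peer_not_allowed": 4408,
}

# 125-byte control-frame payload minus the 2-byte status code.
MAX_REASON_BYTES = 123


def clamp_close_reason(reason: str, limit: int = MAX_REASON_BYTES) -> str:
    """Trim a close reason so its UTF-8 encoding fits in `limit` bytes."""
    encoded = reason.encode("utf-8")
    if len(encoded) <= limit:
        return reason
    # Cut at the byte limit, then drop a partial multi-byte character at the
    # end so the result is still valid UTF-8.
    return encoded[:limit].decode("utf-8", errors="ignore")
```

Clamping by bytes rather than characters matters because a reason containing non-ASCII text can exceed the limit even when it is fewer than 123 characters long.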
…NousResearch#38827)
The Nous dashboard OAuth login rejected any http:// redirect_uri whose host
was not localhost/127.0.0.1, surfacing "redirect_uri may only use http:// for
localhost/127.0.0.1" on the login screen. This broke self-hosted dashboards
reached over plain HTTP: LAN IPs, internal hostnames, and reverse proxies
that terminate TLS upstream.
The Portal-side check (agent-redirect-uri.ts) is authoritative on which
redirect_uris are permitted; this client-side _validate_redirect_uri is only
a fast-fail for obvious operator error and should not second-guess valid
http:// deployments.
Fix: drop the localhost-only branch on the http scheme. Validation now
enforces only that the scheme is http(s) and the path ends with
/auth/callback. Updated the docstring to explain the relaxed contract, and
replaced test_rejects_http_with_non_localhost (which pinned the old behavior)
with test_allows_http_with_arbitrary_host covering a Fly hostname, a LAN IP,
and an internal hostname.
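The relaxed contract (scheme is http(s), path ends with /auth/callback) can be sketched with the standard library alone. The function below is an illustration under those two stated rules; its name, error type, and messages are assumptions and it is not the project's _validate_redirect_uri.

```python
from urllib.parse import urlsplit


def validate_redirect_uri(uri: str) -> str:
    """Fast-fail check for obvious operator error; the Portal stays authoritative."""
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https"):
        raise ValueError("redirect_uri must use http:// or https://")
    if not parts.path.endswith("/auth/callback"):
        raise ValueError("redirect_uri path must end with /auth/callback")
    # No host restriction: plain-http LAN IPs and internal hostnames pass.
    return uri
```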
The dashboard auth gate was OAuth-only: a DashboardAuthProvider could
authenticate only via a redirect to an IDP (start_login -> /auth/callback
-> complete_login). There was no first-class path for username/password
auth, so self-hosters who just want a password on their dashboard had no
clean option short of an external OAuth IDP.
Extend the provider framework with a parallel, non-redirect front door
that converges on the same Session + cookie + refresh machinery:
- base.py: add the optional supports_password flag and
complete_password_login(username, password) -> Session (default
raises NotImplementedError so an OAuth-only provider that forgets the
flag fails loudly). Add InvalidCredentialsError. OAuth providers are
unaffected (flag defaults False; the method is never called).
- routes.py: add POST /auth/password-login, mirroring the cookie-minting
tail of /auth/callback but skipping PKCE/state/code. Returns JSON
{ok, next} (the form POSTs via fetch). Generic 401 for both unknown
user and wrong password (no enumeration oracle); 404 hides whether a
provider exists or supports passwords; per-IP sliding-window rate
limit (10/min -> 429). /api/auth/providers now reports
supports_password so the login page can branch.
- middleware.py: allowlist /auth/password-login (a bootstrap route).
verify/refresh/revoke/ws-tickets/logout need zero changes — a password
session is just a Session with provider-minted opaque tokens.
- login_page.py: render a credential form (instead of a redirect button)
for supports_password providers, wired by a small inline script that
POSTs to /auth/password-login and navigates on success. OAuth-only
pages stay script-free.
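The per-IP sliding-window limit on /auth/password-login (10/min -> 429) can be sketched as below. This is a minimal in-memory illustration; the class name, its parameters, and the injectable clock are assumptions made for the example, not the routes.py implementation.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_attempts` calls per key within `window_seconds`."""

    def __init__(self, max_attempts=10, window_seconds=60.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # key (e.g. client IP) -> attempt timestamps

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Discard attempts that have slid out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_attempts:
            return False  # the caller would answer with HTTP 429
        hits.append(now)
        return True
```

A route handler would call `allow(client_ip)` before checking credentials, so rejected attempts never reach the password hash.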
A bundled, zero-infrastructure 'just put a password on my dashboard'
provider that uses the supports_password extension point. No external IDP,
no database: sessions are stateless HMAC-signed tokens the provider mints
and verifies itself, and passwords are hashed with stdlib scrypt (no
third-party dependency — deliberately avoids bcrypt to keep the dep
surface unchanged).
- plugins/dashboard_auth/basic: BasicAuthProvider (scrypt verify with a
constant-time dummy-hash path for unknown users so the endpoint is not
a username-timing oracle; access/refresh tokens carry a 'kind' claim
that verify/refresh enforce; cross-secret tokens are rejected). The
register() entry point mirrors the Nous plugin's config/env precedence
(env wins; empty treated as unset) and LAST_SKIP_REASON channel.
- config.py: document the canonical dashboard.basic_auth.* surface
(username / password_hash / password / secret / session_ttl_seconds).
Activates only when username + (password or password_hash) are set, so
OAuth users and loopback/--insecure operators are unaffected. Without an
explicit secret a random per-process key is generated (logged): fine for a
single process, but sessions then don't survive restart or span workers.
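The two mechanisms named above, stdlib scrypt password hashing and stateless HMAC-signed tokens with a 'kind' claim, can be sketched as follows. Everything here is an assumption for illustration: the storage format, scrypt parameters, token layout, and function names are not taken from plugins/dashboard_auth/basic, and the dummy-hash path for unknown users is omitted.

```python
import base64
import hashlib
import hmac
import json
import os
import time


def hash_password(password, salt=None):
    """Return 'salt$digest' (both base64) using stdlib scrypt."""
    salt = salt or os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1)
    return base64.b64encode(salt).decode() + "$" + base64.b64encode(digest).decode()


def verify_password(password, stored):
    salt_b64, _, digest_b64 = stored.partition("$")
    salt = base64.b64decode(salt_b64)
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1)
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(candidate, base64.b64decode(digest_b64))


def _b64url(data):
    return base64.urlsafe_b64encode(data).decode().rstrip("=")


def _sign(secret, body):
    return _b64url(hmac.new(secret, body.encode(), hashlib.sha256).digest())


def mint_token(secret, username, kind, ttl_seconds, now=None):
    """Mint 'body.signature' where body carries sub, kind and exp claims."""
    now = time.time() if now is None else now
    claims = {"sub": username, "kind": kind, "exp": now + ttl_seconds}
    body = _b64url(json.dumps(claims).encode())
    return body + "." + _sign(secret, body)


def verify_token(secret, token, expected_kind, now=None):
    """Return the claims, or None if the signature, kind, or expiry is wrong."""
    now = time.time() if now is None else now
    body, _, signature = token.partition(".")
    if not hmac.compare_digest(signature.encode(), _sign(secret, body).encode()):
        return None  # tampered, or minted under a different secret
    claims = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    if claims.get("kind") != expected_kind or claims.get("exp", 0) <= now:
        return None
    return claims
```

Checking the 'kind' claim on every verification is what keeps an access token from being replayed as a refresh token, and signing with a per-deployment secret is what makes tokens from another secret fail.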
- test_dashboard_auth_password_login.py: drives /auth/password-login
end-to-end through the REAL gated_auth_middleware (login -> session
cookie -> authenticated /api/auth/me -> transparent refresh via the RT
cookie), plus protocol-extension checks, the generic-401/404 oracle
properties, the rate limiter, and login-page rendering (form+script
when supports_password, script-free otherwise, both for mixed
providers). Reuses the existing StubAuthProvider harness convention.
- test_basic_provider.py: scrypt hash/verify, login mint, kind-claim
enforcement (access != refresh), cross-secret rejection, and the
register() config/env precedence + skip reasons.
Mutation-tested: dropping the kind-claim check in verify_session makes
test_access_token_not_accepted_as_refresh fail, confirming the test isn't
theater.
Add a 'Username/password provider (no OAuth IDP)' section to the web dashboard guide (config.yaml + env surfaces, the explicit-secret caveat, the rate-limit/generic-401 properties, and a 'write your own password provider' pointer to the supports_password extension point), and list the HERMES_DASHBOARD_BASIC_AUTH_* env vars in the environment-variables reference.
…eys UX (NousResearch#38551)
* feat(desktop): dedicated Providers settings with Accounts/API-keys subnav
Rework provider configuration in the desktop app into its own Providers page
that mirrors the first-run onboarding picker, instead of burying provider
keys in the generic Tools & Keys list.
- Add a Providers settings page (providers-settings.tsx) reusing the
onboarding picker cards/ApiKeyForm so the two surfaces stay identical
- Add a sidebar subnav (Accounts vs API keys) backed by a deep-linkable
`pview` URL param; nested OverlayNavItem variant for a lighter active state
so children don't compete with the parent item
- Scope provider search to the active sub-view in its native card format (no
more accordion fallback); collapse the API-key grid to the top providers
behind a "Show all" toggle to cut scrolling
- Launch real in-app OAuth from settings via startManualProviderOAuth; fix
the misleading red "reason" banner that showed during an active connect
(neutral style, hidden during a flow, omitted for direct per-provider
launches)
- Expand PROVIDER_GROUPS and add longest-prefix matching so providers like
xAI/Ollama group correctly instead of landing under "Other"
- Drop redundant messaging API keys from Tools & Keys (channel_managed)
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(desktop): Cursor-style provider key list with inline inputs
Replace the card-grid API-key form on the Providers page with a per-provider
list (mirrors Cursor's API keys section):
- One row per vendor with its primary key input inline; rows with extra vars
(base URL, region, alt tokens) expand to reveal those on focus
- Set keys show their redacted value as the placeholder; Save appears on
edit, Remove on a set key
- Hide redundant alias key fields (e.g. ANTHROPIC_TOKEN vs ANTHROPIC_API_KEY)
unless already set, and label set aliases by env var name so they're
unambiguous
- Smaller mono input text + compact height
Co-authored-by: Cursor <cursoragent@cursor.com>
* style(desktop): flatten providers settings UI chrome
Tighten the providers settings surface to match the newer desktop style:
remove extra card rails/borders in API-key rows, reduce visual noise in the
providers subnav, replace bespoke link-like controls with shared text-button
variants, and improve key input readability.
* feat(desktop): rework providers settings UI
- Flatten the shared OAuth picker rows (accounts + onboarding): drop the
rounded-2xl/border cards for flat hover-bg rows; Nous hero keeps a subtle
tint plus an animated blue→purple arc border.
- Key fields collapse to a single input: a set key reads read-only (redacted)
and edits in place on focus/click (no Replace/Cancel chrome). Save on type,
Esc cancels (without closing the overlay), "Remove or esc to cancel" hint.
- Non-key overrides render boxless, content-sized (field-sizing) and
right-anchored; advanced fields align under the primary key column.
- Add `xs` control size; size fields via padding (no fixed heights).
- Cards expand on key-input focus; chevron shows on hover/expanded; expanded
state uses a ring + softer bg tier so hover ≠ focus.
- Relocate "Get a key" to the bottom-right of the expanded panel; drop the
redundant provider description.
- Cmd+K: add Providers (accounts) and Provider API keys deep-links.
* fix(desktop): flatten provider fields, drop input shadows, fix Cmd+K provider rank
- KeyField: collapse to one stacked label-above-input form field (drop the
bespoke `naked`/inline/column branches); empty advanced overrides fade until
hover/focus/set
- styles: kill the resting + focus drop shadow on shared input chrome so form
inputs sit flat (composer keeps its own shadow)
- Cmd+K: drop stray `providers` keyword from Skills & Tools so the Providers
settings entry ranks first for "provider"
* fix(desktop): nous portal arc blue → orange
* fix(desktop): rank appearance above settings in Cmd+K
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com>
The desktop remote-gateway settings now auto-detect whether a gateway
authenticates with OAuth or a static session token and present the
matching UI + connection mechanism.
Detection: an unauthenticated GET {base}/api/status reads auth_required
(true => OAuth, false => session token); /api/auth/providers supplies the
provider label. The settings UI debounce-probes the entered URL and shows
either a 'Sign in with <provider>' button or the session-token box.
OAuth connection mechanism:
- REST is authed by the HttpOnly session cookie held in a persistent
Electron session partition (persist:hermes-remote-oauth); main-process
REST routes through electron net bound to that partition so the cookie
attaches automatically.
- Login opens a BrowserWindow on {base}/login in that partition and
resolves once the hermes_session_at cookie lands.
- WebSocket upgrades use a single-use ?ticket= minted at
POST /api/auth/ws-ticket (the gateway rejects ?token= in gated mode);
getGatewayWsUrl() re-mints before every (re)connect since tickets are
single-use and short-lived.
- Missing cookie / 401 surfaces needsOauthLogin to prompt re-sign-in
(Nous Portal contract v1 issues no refresh token).
Local and token modes are unchanged.
Pure helpers (URL normalize, ws-url token/ticket builders, auth-mode
classify/resolve, cookie detector) are extracted to a standalone
connection-config.cjs (no electron import) and unit-tested with
node --test (26 tests), matching the backend-probes.cjs pattern.
The remote-gateway settings rendered the session-token box for every gateway during the idle/probing window before the first /api/status probe lands, because authMode defaults to 'token'. Gate both the OAuth sign-in button and the token box behind an authResolved flag so neither renders until the probe resolves the scheme (or a previously-saved remote config is being re-shown, so re-opening settings doesn't flicker). The gateway-side WS Origin fix that lets the packaged desktop (file:// origin) connect to an OAuth-gated remote gateway landed separately in NousResearch#37870; this branch is now purely the desktop client + this UI fix.
…ilure The reconnect and boot paths resolved the WS URL with `(await getGatewayWsUrl().catch(() => null)) || conn.wsUrl`. For OAuth gateways the cached conn.wsUrl carries a single-use, ~30s-TTL ticket; the desktop connection is memoized for the process lifetime, so on reconnect that ticket is both expired and already consumed. A failed fresh mint therefore fell back to a guaranteed-dead ticket and surfaced as an opaque "connection closed", masking the gateway's actionable "session expired, sign in again" message. Extract resolveGatewayWsUrl() (with unit tests): in OAuth mode a mint failure throws a tagged GatewayReauthRequiredError instead of falling back; token/local modes keep the long-lived-token fallback. Thread that error through the reconnect path so requestGateway surfaces the reauth message rather than the generic transport error that triggered the retry. Co-authored-by: Kenmege <205099287+Kenmege@users.noreply.github.com>
…earch#38841)
PR NousResearch#38743 split the dashboard PTY WebSocket refusal codes (4404 =
chat disabled, 4403 = host/origin mismatch; see the web_server.py refusal
site comment) but left test_rejects_when_embedded_chat_disabled asserting the
old 4403, so the test still expects 4403 while the server sends 4404. Main CI
has been red on the test (2)/(4) shards since that commit. Update the
assertion to 4404 to match the disabled-chat path.
…rch#38851)
Surface the username/password dashboard-auth provider in Hermes Desktop's
remote-gateway connect flow. A password gateway gates the same way an OAuth
one does (auth_required + session cookie + ws-ticket), so the desktop already
drives it through the existing sign-in window; the only gaps were that the
probe dropped supports_password and the UI always said "OAuth".
- main.cjs: capture supports_password from /api/auth/providers in the probe.
- global.d.ts: add optional supportsPassword to DesktopAuthProvider.
- gateway-settings.tsx: derive isPasswordProvider; render a plain "Sign in"
button + "username and password" copy instead of an OAuth provider label when
every advertised provider is password-based.
Login still flows through the gateway's /login credential form (POST
/auth/password-login).
…venv (NousResearch#38885) When 'hermes update' rebuilds the project venv (rmtree + uv venv on the first managed-uv migration), the desktop-rebuild and profile-skills-sync steps that follow both spawn sys.executable. Firing while the venv is mid-rewrite makes the child interpreter abort with the bare stderr line 'No pyvenv.cfg file', surfacing as a spurious 'Desktop build failed' / 'default: sync failed' on an update that actually succeeded. Add _wait_for_interpreter_venv_ready(): resolve the venv hosting sys.executable and poll briefly for pyvenv.cfg to (re)appear before each of those subprocess steps. No-op when the interpreter isn't venv-hosted. The desktop rebuild also retries once after re-waiting, and keeps streaming its output live (no capture). Best-effort throughout — callers proceed regardless, so a genuinely broken venv still surfaces the real error.
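The polling described above can be sketched as a small best-effort helper. This is an assumption-laden illustration rather than the code of _wait_for_interpreter_venv_ready(): the function names, the timeout values, and the use of sys.prefix to locate the venv are choices made for the example.

```python
import sys
import time
from pathlib import Path


def current_venv_root():
    """Return the venv hosting this interpreter, or None when not in a venv."""
    if sys.prefix == sys.base_prefix:
        return None
    return Path(sys.prefix)


def wait_for_pyvenv_cfg(venv_root, timeout=5.0, interval=0.1,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until <venv_root>/pyvenv.cfg exists; True if it appeared in time."""
    if venv_root is None:
        return True  # not venv-hosted: nothing to wait for
    marker = Path(venv_root) / "pyvenv.cfg"
    deadline = clock() + timeout
    while True:
        if marker.is_file():
            return True
        if clock() >= deadline:
            return False  # best-effort: the caller proceeds regardless
        sleep(interval)
```

A caller would run `wait_for_pyvenv_cfg(current_venv_root())` immediately before spawning `sys.executable`, and ignore a False result so that a genuinely broken venv still surfaces its real error.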
…rch#38886) attemptReconnect() connected with the stale cached conn.wsUrl. OAuth WS tickets are single-use with a ~30s TTL, so the first sign-in (which goes through boot() and re-mints via resolveGatewayWsUrl) succeeds, but every reconnect (sleep/wake, network online, window refocus, socket drop, app restart) reused a dead ticket and failed the WS upgrade with an opaque "Could not connect to Hermes gateway" — even though backend resolution (cookie + REST) reported ready. attemptReconnect now mints a fresh ticket before connecting, mirroring use-gateway-request.ts, and surfaces the reauth "sign in again" message once on OAuth expiry instead of silently looping backoff against a dead ticket. Local/token gateways are unaffected (re-mint is a no-op).
…e bundle (NousResearch#38888) The desktop renderer is bundled as one chunk on purpose (codeSplitting: false) because Shiki's many dynamic chunks make electron-builder OOM scanning thousands of files. That makes the ~22 MB bundle expected, but Vite still nags with 'Some chunks are larger than 500 kB' on every build. Raise chunkSizeWarningLimit to 25000 kB so the cosmetic warning stays quiet while still firing as a regression alarm if the bundle grows well past today's size. Config-only; codeSplitting:false is untouched.
…lapsed group (NousResearch#38829) The desktop command-approval ApprovalBar renders inline inside ToolEntry, which lives inside ToolGroupSlot. When 2+ tools group, the group body is hidden until expanded, so an approval raised by a pending terminal/ execute_code call was buried behind "Tool actions · N steps" and required manual expansion to act on (sudo/secret were unaffected — they use modal overlays). ToolGroupSlot now subscribes to $approvalRequest and force-opens its body while an approval targeting one of its pending approval-eligible tools is in flight, so the inline controls surface with nothing expanded. The group reverts to the user's stored collapse state once the approval resolves.
… root installs
Root installs on Linux (FHS layout, NousResearch#15608) put the `hermes`
command in `/usr/local/bin` (on PATH) but symlinked the bundled node/npm/npx
into `~/.local/bin`, which isn't on PATH for a stock root shell. `node`/`npm`
were 'command not found' and `hermes dashboard` failed with 'npm is not
available' because its build-on-demand fallback couldn't find npm.
Fix: `install_node()` now symlinks into `get_command_link_dir()` (the same
helper the `hermes` command link already uses) so node/npm/npx land wherever
the command does (`/usr/local/bin` on FHS root, `~/.local/bin` otherwise,
`$PREFIX/bin` on Termux). Non-root and Termux installs are unchanged.
Also fixes:
- `scripts/lib/node-bootstrap.sh`: adds `_nb_get_link_dir()` mirroring the
same root/Termux/user logic for the standalone bootstrap path (used by
`hermes update`, TUI node bootstrap, etc.)
- `hermes_cli/uninstall.py`: `remove_node_symlinks()` now checks all
candidate directories (`~/.local/bin`, `/usr/local/bin`, `$PREFIX/bin`) so
root FHS uninstalls don't leave orphan symlinks
Regression from NousResearch#15608, which created the FHS path for the
command but left `install_node` pointed at the legacy user-local dir.
…ssion summary Sets erase_when_done=True on the classic CLI's prompt_toolkit Application so the live bottom chrome (status bar, input box, separator rules) is wiped on exit instead of frozen into scrollback. Previously prompt_toolkit's render_as_done teardown repainted the chrome one final time and left it on screen (ESC[J only erases below the cursor, not the chrome above), so a dead status bar + empty prompt + rules were stranded between the conversation transcript and the 'Resume this session' summary, and stacked with the next session's UI on resume. erase_when_done routes teardown through renderer.erase() which wipes exactly the managed chrome region; the conversation transcript prints through patch_stdout into normal scrollback and is untouched. Applies to every exit path (/exit, /quit, EOF, Ctrl+C). Fixes NousResearch#38252.
…flag
The dashboard's embedded Chat surface (/chat, /api/ws, /api/pty) was gated
behind `hermes dashboard --tui` / HERMES_DASHBOARD_TUI=1. The desktop app and
the dashboard's own Chat tab both drive the agent over the /api/ws + /api/pty
WebSockets, so a dashboard started without the flag would pass the
/api/status health check but slam the chat WebSocket shut with WS code 4403:
the app connects, reports "ready", and chat stays dead. This was the root
cause behind multiple user reports of the desktop app failing to connect to a
self-hosted gateway/dashboard, and it bit Docker and host installs alike.
Make the embedded chat unconditional:
- web_server.py: _DASHBOARD_EMBEDDED_CHAT_ENABLED defaults to True; drop the
embedded_chat parameter and the runtime reassignment from start_server(). The
WS gates still read the constant (now always true) so the seam (and its
"rejects when disabled" contract test) stays meaningful.
- main.py: remove the `--tui` argument from the dashboard subparser and the
`embedded_chat = args.tui or HERMES_DASHBOARD_TUI==1` derivation.
- web/: isDashboardEmbeddedChatEnabled() returns true unconditionally; drop
the deprecated __HERMES_DASHBOARD_TUI__ alias and the dead LEGACY_TUI_RE
scrape in the vite dev-token plugin.
- apps/desktop/electron/main.cjs: drop `--tui` from the spawned dashboardArgs
(it would now error with "unrecognized arguments: --tui") and the redundant
HERMES_DASHBOARD_TUI env injection.
- Docker: no s6 run-script change needed; the script never passed --tui, and
the HERMES_DASHBOARD_TUI env var is now simply a no-op, so the image works
out of the box with no extra var.
- Docs: remove every dashboard --tui / HERMES_DASHBOARD_TUI reference across
the CLI reference, env-var reference, docker/desktop/web-dashboard guides,
in-app tips, and the zh-Hans translations. The terminal `hermes --tui` /
HERMES_TUI references are intentionally left untouched.
Tests: 270 passing across web_server, dashboard lifecycle, host-header,
auth-gate, and docker-override-scripts suites.
Adds a bundled dashboard-auth provider plugin that authenticates the
web dashboard against any conformant self-hosted OpenID Connect server
(Authentik, Keycloak, Zitadel, Authelia, Auth0, Okta, Google, …) using
standard OIDC — no per-IDP code.
It's a pure drop-in plugin implementing the DashboardAuthProvider
protocol; it touches no core auth/runtime/login paths. Mechanics:
- OIDC discovery from {issuer}/.well-known/openid-configuration
(cached; issuer pinned; endpoints must use HTTPS, with loopback http
allowed for local-dev IDPs)
- authorization-code + PKCE (S256), public client
- verifies the OIDC ID token (RS256/ES256) against the discovered
jwks_uri with iss/aud pinned to the configured issuer/client_id, and
maps standard claims (sub/email/name/preferred_username, groups→org)
onto a Session
- standard refresh_token grant for silent re-auth; RFC 7009 revocation
on logout when advertised
Verifies the ID token (not the access token) because OIDC guarantees the
ID token is a signed JWT carrying identity, while access-token format is
opaque to the client per spec — the only universally-correct choice
across self-hosted IDPs.
Config via dashboard.oauth.self_hosted.{issuer,client_id,scopes} in
config.yaml or HERMES_DASHBOARD_OIDC_{ISSUER,CLIENT_ID,SCOPES} env vars
(env-wins-config, empty-is-unset — same convention as the nous plugin).
Confidential clients (client_secret) left as a documented TODO seam.
Docs: adds a Self-hosted OIDC section to the web-dashboard guide,
including a copy-paste Keycloak worked example (realm import + docker
run + dashboard wiring + login walkthrough).
Tests: 65 cases covering construction, discovery (incl. issuer
mismatch + https enforcement), start_login/PKCE, complete_login, ID
token verification, refresh/revoke, and env/config precedence.
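Two of the mechanics listed above are small enough to sketch with the standard library: building the discovery URL from the issuer, and deriving the PKCE S256 code challenge (RFC 7636). The function names are illustrative assumptions and this is not the plugin's code; ID-token signature verification against the JWKS is deliberately left out.

```python
import base64
import hashlib
import secrets


def discovery_url(issuer: str) -> str:
    """OIDC discovery document location for a given issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def new_code_verifier() -> str:
    """Random URL-safe verifier; 64 random bytes encode to 86 characters."""
    return secrets.token_urlsafe(64)


def s256_challenge(verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(ASCII(code_verifier))), unpadded."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
```

The verifier stays with the client until the token exchange, while only the challenge travels in the authorization request, which is why a public client can use this flow without a client secret.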
…derError
The gated dashboard verifies a session cookie by trying each registered
DashboardAuthProvider's verify_session in turn (the session cookie stores
only the access token, not which provider issued it). A provider that doesn't
recognise a token returns None; a provider whose IDP/JWKS is unreachable
raises ProviderError. The loop used to return HTTP 503 on the FIRST
ProviderError, before any later provider got a turn.
With multiple providers stacked, that means an unreachable IDP for a session
you didn't even use blocks login through a different, reachable provider.
Concrete repro: a self-hosted-OIDC session hits the 'nous' provider first
(registered earlier); nous tries to reach Nous Portal's JWKS, which is
unreachable in a self-hosted deployment, so it raises, and the gate 503s
before the 'self-hosted' provider can verify the token. Hit live while
testing the new self-hosted OIDC plugin against a local Keycloak.
Fix: a ProviderError from one provider is logged and the loop continues to
the next. A 503 is returned only if NO provider verified the token AND at
least one was unreachable, distinguishing a transient IDP outage (don't force
a needless re-login) from a token that's genuinely invalid (fall through to
refresh/relogin). Single-provider behaviour is unchanged.
Tests: adds an _UnreachableProvider stub and three cases: unreachable
provider first must not block a working second; all-unreachable still 503s;
reachable-but-unrecognised falls through to 401/relogin (not 503).
Mutation-tested: reverting the fix makes the first case fail with the exact
503 bug.
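The fixed loop can be sketched as below. It is a simplified model of the behaviour described in this commit: ProviderError is redefined locally, the function name and the (status, session) return shape are assumptions, and the real middleware falls through to refresh/relogin where this sketch simply reports 401.

```python
import logging

logger = logging.getLogger(__name__)


class ProviderError(Exception):
    """Stand-in for the error raised when a provider's IDP/JWKS is unreachable."""


def verify_with_providers(providers, access_token):
    """Try every provider; return (status, session)."""
    any_unreachable = False
    for provider in providers:
        try:
            session = provider.verify_session(access_token)
        except ProviderError as exc:
            # Log and keep going: a later provider may own this token.
            logger.warning("provider unreachable during verify: %s", exc)
            any_unreachable = True
            continue
        if session is not None:
            return 200, session
    # Nobody verified the token. Only blame an outage if one was observed.
    return (503 if any_unreachable else 401), None
```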
…ch#38918) The hermes tools save summary printed '- kanban' (and would print '+ kanban') for a platform even though kanban is never offered as a checklist option. kanban is a check_fn-gated toolset whose tools are a subset of the platform composite, so _get_platform_tools resolves it as enabled, but _prompt_toolset_checklist only renders CONFIGURABLE_TOOLSETS — so it can never survive into the returned selection. The added/removed diff (current_enabled - new_enabled) then surfaced kanban as removed. Scope the printed diff to the checklist's actual universe via the new _checklist_toolset_keys() helper at all three diff sites (first-install, all-platforms, per-platform). The persisted config is unaffected — _save_platform_tools already preserves non-configurable entries; this was purely a false-signal in the UI.
… session token (NousResearch#38926)
The documented path for connecting Hermes Desktop to a remote backend was
`--insecure` + a pinned HERMES_DASHBOARD_SESSION_TOKEN: an unauthenticated
bind plus a copy-pasted token. Replace it everywhere with the bundled
username/password dashboard-auth provider: set HERMES_DASHBOARD_BASIC_AUTH_*,
run `hermes dashboard --host 0.0.0.0` (the non-loopback bind engages the auth
gate), and Sign in from the app.
- desktop.md: rewrite 'Connecting to a remote backend' for the user/pass +
Sign in flow
- web-dashboard.md: rewrite both remote-backend sections (overview +
dedicated); reframe the auth-gate section so --insecure is a discouraged
escape hatch, not a co-equal use case; drop the removed --tui flag from the
systemd example
- environment-variables.md: lead with HERMES_DASHBOARD_BASIC_AUTH_*; drop the
session-token / HERMES_DESKTOP_REMOTE_TOKEN remote-connect entries
- docker.md: mention the username/password provider as the simplest gate
provider
…lback (NousResearch#38928)
The classic CLI left its live bottom chrome (the status bar, input box, and
separator rules) frozen in terminal scrollback after exit, on every exit path
(/exit, /quit, Ctrl+C, EOF) and on both Linux and Windows. The prior
erase_when_done=True fix (bf82a7f) routes prompt_toolkit's teardown through
renderer.erase(), but that walks back by the renderer's internal cursor model
and does not reliably wipe the chrome in practice: users still saw a dead
status bar + the rest of the session sitting above the resume summary.
Clear the screen + scrollback directly at the single exit funnel instead. All
exit paths converge on _print_exit_summary() (called from the run-loop
finally block after app.run() returns and prompt_toolkit has restored
terminal modes), so a new _clear_terminal_on_exit() helper runs there before
the summary prints. It writes ESC[3J ESC[2J ESC[H (erase scrollback, erase
screen, home cursor) on a real TTY, no-ops silently when stdout is not a
terminal (pipes/redirects), and falls back to the platform clear command if
the escape write fails. Works on Linux, macOS, and modern Windows terminals
(Terminal/conhost with VT processing, already enabled by prompt_toolkit). The
resume/goodbye summary now prints at a clean top-left with nothing stranded
above it.
Fixes NousResearch#38252.
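The helper's three behaviours (escape write on a TTY, silent no-op otherwise, platform clear as a fallback) can be sketched as follows. This is an illustration of the description above, not the project's _clear_terminal_on_exit(); the function name, the `stream` parameter, and the boolean return value are assumptions added to make it testable.

```python
import os
import sys

# ESC[3J erases scrollback, ESC[2J erases the screen, ESC[H homes the cursor.
CLEAR_SEQUENCE = "\x1b[3J\x1b[2J\x1b[H"


def clear_terminal_on_exit(stream=None) -> bool:
    """Clear screen + scrollback on a real TTY; return True if a clear was attempted."""
    stream = sys.stdout if stream is None else stream
    try:
        if not stream.isatty():
            return False  # pipes/redirects: leave the output untouched
    except (AttributeError, ValueError):
        return False  # no isatty(), or the stream is already closed
    try:
        stream.write(CLEAR_SEQUENCE)
        stream.flush()
    except OSError:
        # Fall back to the platform clear command if the escape write fails.
        os.system("cls" if os.name == "nt" else "clear")
    return True
```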
The shared-key bridging loop (allow_from, require_mention, free_response_channels, …) read only the top-level yaml platform block (yaml_cfg.get(plat.value)). When a user configured a platform solely under ``platforms:`` or ``gateway.platforms:`` with no top-level block, the loop skipped that platform entirely and all bridged keys were silently dropped into PlatformConfig.extra — making allow_from, require_mention, etc. ineffective for nested-only configs. The apply_yaml_config_fn dispatch already received this same fallback in 44f3e51 to handle plugin adapters (e.g. Discord allow_from). The shared-key loop now mirrors it: if yaml_cfg.get(plat.value) is absent, fall back to gateway.platforms.<name> then platforms.<name>. The enabled field is deliberately excluded from the nested fallback (guarded by _cfg_toplevel): _merge_platform_map already merged it with the correct precedence, so re-applying it from a single nested source would overwrite the correctly-merged value. Two new regression tests assert that allow_from and require_mention configured under platforms.telegram and gateway.platforms.telegram are bridged into PlatformConfig.extra. All 54 existing config tests pass.
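The fallback order above can be sketched as below. The helper names and the flat return shape are hypothetical; the real loop writes into PlatformConfig, which is not shown here. Returning `enabled` only for a top-level block is a simplification of the _cfg_toplevel guard.

```python
from typing import Any, Dict, Mapping, Optional, Tuple

# Subset of the bridged keys named in the commit message.
SHARED_KEYS = ("allow_from", "require_mention", "free_response_channels")


def _nested(cfg: Mapping[str, Any], *path: str) -> Optional[Mapping[str, Any]]:
    """Walk a key path, returning the mapping at the end or None."""
    cur: Any = cfg
    for key in path:
        if not isinstance(cur, Mapping):
            return None
        cur = cur.get(key)
    return cur if isinstance(cur, Mapping) else None


def resolve_platform_block(
    yaml_cfg: Mapping[str, Any], name: str
) -> Tuple[Mapping[str, Any], bool]:
    """Return (block, is_toplevel): top-level first, then the nested forms."""
    top = yaml_cfg.get(name)
    if isinstance(top, Mapping):
        return top, True
    for path in (("gateway", "platforms", name), ("platforms", name)):
        block = _nested(yaml_cfg, *path)
        if block is not None:
            return block, False
    return {}, False


def bridge_shared_keys(yaml_cfg: Mapping[str, Any], name: str) -> Dict[str, Any]:
    block, is_toplevel = resolve_platform_block(yaml_cfg, name)
    bridged = {k: block[k] for k in SHARED_KEYS if k in block}
    # `enabled` is only taken from a top-level block; the nested value was
    # already merged with the correct precedence elsewhere.
    if is_toplevel and "enabled" in block:
        bridged["enabled"] = block["enabled"]
    return bridged
```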
…orts
User-installed memory providers load under the synthetic
_hermes_user_memory.<name> package, but the loader never registered that
parent namespace in sys.modules (it only registers "plugins" and
"plugins.memory" for bundled providers). As a result any external provider
using a relative import failed to load:
    from . import config
    ModuleNotFoundError: No module named '_hermes_user_memory'
The same gap in discover_plugin_cli_commands() meant an external provider's
cli.py with a relative import could never be discovered, so the documented
"hermes <plugin>" CLI integration did not work for standalone plugins.
Register the synthetic parent namespace before loading user-installed
providers, mirror it for cli.py discovery (including the per-provider parent
package, without executing the plugin's __init__.py), and make
_load_provider_from_dir() reuse only modules actually loaded from disk so a
parent shell registered by CLI discovery is never mistaken for the loaded
provider.
Regressions cover: a flat provider with a sibling relative import, a provider
with its implementation in a nested subpackage (including a namespace
intermediate directory), cli.py discovery with a relative import, and
provider load after CLI discovery ran first.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
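A minimal sketch of the parent-namespace registration described above, assuming a provider laid out as a directory with an __init__.py. The function names are illustrative and the real loader's reuse and CLI-discovery logic is omitted; only the sys.modules mechanics are shown.

```python
import importlib.util
import sys
import types
from pathlib import Path

PARENT = "_hermes_user_memory"


def ensure_parent_namespace(name: str = PARENT) -> types.ModuleType:
    """Register an empty package shell so dotted children can resolve."""
    mod = sys.modules.get(name)
    if mod is None:
        mod = types.ModuleType(name)
        mod.__path__ = []  # a module with __path__ is treated as a package
        sys.modules[name] = mod
    return mod


def load_provider_from_dir(provider_dir: Path) -> types.ModuleType:
    """Load <provider_dir>/__init__.py as _hermes_user_memory.<dirname>."""
    ensure_parent_namespace()
    full_name = f"{PARENT}.{provider_dir.name}"
    spec = importlib.util.spec_from_file_location(
        full_name,
        provider_dir / "__init__.py",
        submodule_search_locations=[str(provider_dir)],
    )
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load provider from {provider_dir}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so `from . import x` can find its package.
    sys.modules[full_name] = module
    spec.loader.exec_module(module)
    return module
```

With both the parent shell and the provider package present in sys.modules, a sibling import such as `from . import config` resolves through the provider's own search path.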
…anese/Korean) DOM repro that drives compositionstart -> input(preedit) -> compositionend with no trailing input event and asserts the composer payload (send button) becomes visible for committed CJK/IME input. Regression guard for NousResearch#39614.
The new IME repro test has two it() blocks but the desktop suite registers
no global testing-library auto-cleanup, so the first render() leaked its
editor into the second test and getByTestId('editor') matched two nodes.
Add afterEach(cleanup) so each case renders into a fresh DOM.
Surfaces Hermes Agent version info on demand without leaving chat; works mid-run like /help and /update.
Reuses format_banner_version_label() so CLI, TUI, gateway, and desktop show upstream/local commit when available.
Slack's native-slash manifest is hard-capped at 50 commands (_SLACK_MAX_SLASH_COMMANDS). Adding the canonical /version command claims a pass-1 slot, so the lowest-priority pass-2 alias (/q for /quit) is clamped off the end. /q stays reachable via /hermes q. The surviving aliases (/btw, /bg, /reset) still prove alias parity.
* fix(desktop): make composer message queue reliable The queue felt 'dumb' because of three real bugs: 1. Drained-after-interrupt sends went silent. cancelRun sets interrupted:true and nothing reset it; submitPromptText's optimistic seed preserved it, and the message stream drops every delta while interrupted. So Send-now-while-busy and any interrupt+drain submitted the next turn into a muted session. Fix: a fresh submit is a new turn — seed interrupted:false. 2. Back-to-back queue drains stalled. The drain fires on the busy->false settle edge, but busyRef (synced from the busy store by a separate effect) can still read true on that same edge, so the drained send hit the busy guard, returned false, and the entry was never removed. Fix: fromQueue sends bypass the busyRef guard (the queue drain lock serializes them); the user path keeps the guard. 3. Double-enter-to-interrupt killed single non-queue turns. The hidden 450ms timer meant a natural double-tap after sending stopped the agent. Fix: empty Enter while busy is a no-op; interrupting is explicit — Stop button or Esc. Also: clean stop (no [interrupted] marker), Send-now works while busy (promote + interrupt + auto-drain), settle on the interrupted completion path. Adds regression tests and unblocks the prompt-actions suite by completing its stale @/hermes mock. * fix(desktop): float the queue panel as an overlay so the chat doesn't resize The queue list rendered in-flow inside the composer root, so its height fed --composer-measured-height (the composer rect drives the thread's bottom padding + last-message clearance). Queuing a message grew that rect and the whole chat visibly resized. Anchor the panel out of flow above the composer (absolute bottom-full, capped at 40vh with internal scroll). It no longer contributes to the measured height, so the thread layout stays put and the list overlays the (already faded) chat. Still collapsible via the panel's own disclosure header. 
* fix(desktop): queue panel collapsed by default + shared border with composer - Default the queue disclosure to collapsed (compact 'N queued' pill) instead of expanded. - Drop the gap and merge the panel into the composer: square bottom corners, no bottom border/radius, and overlap down by the Root's pt-2 (-mb-2) so the panel's borderless bottom lands on the composer surface's top border — one continuous bordered shape. * style(desktop): tighten queue panel padding * style(desktop): trim queue-ux comments to house style * style(desktop): drop 'Cursor' references from comments
Builds on @naqerl's arrow up/down history (previous commit), making ArrowUp do the right thing when a queue exists.
ArrowUp/ArrowDown priority:
1. Editing a queued turn → walk older/newer through queued entries, saving each edit; ArrowDown past the newest exits and restores the pre-edit draft.
2. Empty composer + queued turns → ArrowUp opens the newest queued entry for editing (the row's pencil), so Enter saves it back to the queue instead of firing a new message (the gap the history nav had alone).
3. Otherwise → sent-message history recall (unchanged).
Also: Esc cancels an in-progress queue edit (else interrupts).
Cleanups on the integrated code: fold the browse-state reset into the existing session-change effect (drop the duplicate ref+effect); reuse loadIntoComposer for history recall; sort imports; add curly braces + the runDrain sessionId dep (lint).
…eue-arrow-edit-v2 feat(desktop): arrow-key history + queue editing in composer
Force-sending a queued message (double-empty-enter, or interrupt-mode submit) flipped busy→false optimistically, so the queue drain raced the still-unwinding turn: duplicate user bubble, a stray "queued: …" note, and the cancelled turn's "Operation interrupted…" reply leaking in. interruptTurn gains `keepBusy`: hold busy until the gateway's real settle edge (message.complete, suppressed while interrupted), which drains the queued message exactly once — desktop "send now" parity. The interrupt paths now queue + interrupt instead of optimistically sending.
The desktop app could only queue while busy — `/steer` was in the palette but had no first-class affordance, so the "nudge the agent mid-turn without interrupting" lane was effectively unreachable. Add a steer action to the composer: while busy with a text-only draft, a steering-wheel button (and Cmd/Ctrl+Enter) injects the text into the live turn via the `session.steer` RPC — the gateway folds it into the next tool result so the model reads it on its next iteration. Plain Enter still queues. steerPrompt returns false when the gateway has no live tool window (or the RPC errors), and the composer re-queues the words so nothing is lost — the same safety net as a plain queue.
A steer rides inside a tool result (the only role-alternation-safe slot mid-turn), so a bare "User guidance:" line reads as untrusted tool content — well-behaved models refuse it as suspected prompt injection (observed live: "I only follow instructions from you directly, not ones injected through command results"). - Wrap steers in a bounded, self-describing [OUT-OF-BAND USER MESSAGE] marker (prompt_builder.format_steer_marker), shared by both drain sites. - Add STEER_CHANNEL_NOTE to the core system prompt so the model expects this exact marker and trusts it as a genuine user message — while still ignoring lookalikes buried in tool/web/file output. Static text → byte-stable prompt, no prompt-cache regression; gated on the agent having tools. - Desktop: steer ack is now an inline transcript note (⏩ steered · …) instead of a toast. Marker is intentionally static (not a per-session nonce) to honor the byte-stable system-prompt caching policy; nonce hardening noted as follow-up.
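To illustrate the bounded, static marker idea, here is a hedged sketch. Only the opening label comes from the commit message; the closing delimiter, the neutralising of embedded lookalikes, and the append_steer helper are assumptions and may differ from prompt_builder.format_steer_marker.

```python
STEER_OPEN = "[OUT-OF-BAND USER MESSAGE]"
# Hypothetical closing delimiter; the real marker's shape is not shown above.
STEER_CLOSE = "[/OUT-OF-BAND USER MESSAGE]"


def format_steer_marker(text: str) -> str:
    """Wrap a steer in a bounded marker with no per-session state."""
    body = text.strip()
    # Assumption: neutralise embedded delimiters so the bounds stay unambiguous.
    for token in (STEER_OPEN, STEER_CLOSE):
        body = body.replace(token, "(marker removed)")
    return f"{STEER_OPEN}\n{body}\n{STEER_CLOSE}"


def append_steer(tool_result: str, steer: str) -> str:
    """Fold the wrapped steer into the tail of a tool result."""
    return f"{tool_result}\n\n{format_steer_marker(steer)}"
```

Because the output depends only on the steer text, the matching system-prompt note can stay byte-stable, which is the caching property the commit calls out.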
Cmd/Ctrl+Enter now steers when there's a steerable draft and is a no-op otherwise — it never falls through to a send, so the shortcut can't surprise-send. Plain Enter keeps its role (queue while busy, send when idle).
The inline steer note used a ⏩ emoji. Emit a structured `steer:<text>` system note and render it in SystemMessage as a codicon (compass) row — same style as slash-status output. No emoji in the transcript.
Compute the trimmed draft once and reuse it for hasComposerPayload + canSteer instead of trimming three times per render.
feat: usable mid-turn steer — desktop affordance + trusted injection
…orms (NousResearch#40218) * fix: respect disabled auto-compaction on context overflow Port from anomalyco/opencode#30749. When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop — long-context-tier 429, 413 payload-too-large, and context-overflow — called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice. Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected — it never enters this loop. Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True). * fix(dashboard): populate cron delivery dropdown from configured platforms The dashboard cron-create/edit dropdown hardcoded five delivery options (local, telegram, discord, slack, email), so users on Matrix — or any other backend-supported platform — had no way to pick their channel even though the cron scheduler delivers to all of them. It also offered Telegram/Discord/etc. to users who never set those up. - cron/scheduler.py: add cron_delivery_targets() — the single source of truth. Intersects gateway-configured platforms with cron-deliverable ones and reports whether each platform's home channel is set. 
- web_server.py: GET /api/cron/delivery-targets exposes that list (+ the implicit local option) to the dashboard. - CronPage.tsx: both modals render options from the endpoint. Configured platforms missing a home channel still appear, annotated "set a home channel first" (option B), so the user knows what to fix. Edit modal preserves a job's current target even if it's no longer configured. Local-only state shows a "configure a platform under Channels" hint. Validation: scheduler + endpoint E2E'd with a Matrix gateway (home set and unset); 5 new tests; tests/cron + tests/hermes_cli/test_web_server green (366 passed).
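The intersection behind the delivery dropdown could look roughly like this. The data shapes, the DeliveryTarget type, and delivery_options are assumptions for illustration; only the function name cron_delivery_targets and the "set a home channel first" annotation come from the commit message.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping, Optional


@dataclass(frozen=True)
class DeliveryTarget:
    name: str
    home_channel_set: bool


def cron_delivery_targets(
    configured: Mapping[str, Optional[str]],  # platform -> home channel (assumed shape)
    deliverable: Iterable[str],               # platforms cron can deliver to
) -> List[DeliveryTarget]:
    """Intersect configured platforms with cron-deliverable ones."""
    allowed = set(deliverable)
    return [
        DeliveryTarget(name, bool(configured[name]))
        for name in sorted(configured)
        if name in allowed
    ]


def delivery_options(targets: List[DeliveryTarget]) -> List[str]:
    """Dropdown labels: implicit local option first, then each platform."""
    labels = ["local"]
    for target in targets:
        suffix = "" if target.home_channel_set else " (set a home channel first)"
        labels.append(target.name + suffix)
    return labels
```

Keeping the intersection in one function lets the endpoint and any other caller agree on what counts as a valid target.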
…e.json
The native macOS About panel showed the Electron package.json version (e.g. 0.15.1) while the status bar showed the real Hermes version (0.16.0). setAboutPanelOptions() set applicationName + copyright but omitted applicationVersion, so macOS fell back to app.getVersion() = package.json, which drifts (release.py's desktop lockstep bump didn't land for 0.16.0). resolveHermesVersion() already reads the live version from hermes_cli/__init__.py and was built 'so the desktop About panel shows the real Hermes version' per its own comment, but was never wired in.
- Seed applicationVersion: resolveHermesVersion() at module load.
- Replace the macOS About menu item's role:'about' with a click handler (showAboutPanelFresh) that re-resolves the version on every open, so an in-place `hermes update` is reflected without an app restart.
Follow-up on the cherry-picked content-block fix. _extract_output_tail
(the live subagent overlay) still used crude str(content), which renders
a "[{'type': 'text'...}]" blob and — worse — mislabels a block-wrapped
"Error: ..." result as is_error=False. Route it through the same
_stringify_tool_content helper so error detection and previews work at
both consumer sites.
- delegate_tool.py: _extract_output_tail uses _stringify_tool_content
- tests: add _extract_output_tail content-block test (error detection +
clean preview)
- release.py: AUTHOR_MAP entry for randomsnowflake (CI gate)
…ned to another provider (NousResearch#40286) Switching the main model never touches auxiliary slot pins (they're independent, sticky per-task overrides). A user who switches main away from a now-unpaid provider keeps paying 402s on every background aux call until they manually reset those pins — silently, with no UI signal. - /api/model/set scope:'main' now returns stale_aux: slots still pinned to a provider different from the new main (additive field). - Desktop Model Settings shows a switch-time notice after Apply AND a persistent banner when any loaded aux slot mismatches the main provider, both wired to the existing 'Reset all to main' action. - Never auto-clears pins — a dedicated cheaper aux model is a legitimate config; surface-and-offer instead of nuking. - Fixes a stale pre-existing assertion in the panel test (main model now renders via selectors, not a standalone label).
… dev readout (NousResearch#40011) * feat(tui): HERMES_DEV_CREDITS live-spend dev readout (L0 tracer for usage-aware credits) L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that exercises the real header -> CreditsState -> TUI pipe end-to-end behind HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists. - agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are strings -> paid_access via == "true", never bool(); retain-last-known; only subscription_micros may be negative; *_usd kept verbatim). - run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros, session-start baseline latch, + dev-gated "credits" capture log. - agent/chat_completion_helpers.py: capture on the streaming response. - agent/agent_init.py: init _credits_state + _credits_session_start_micros. - tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged. - ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner. Off by default; silent for normal users. Validated live against staging (capture log delta matches the TUI segment). Throwaway consumer (readout/log/ banner); credits_tracker + the capture plumbing are the real feature foundation. * test(credits): lock parser under 9-state matrix + harden validation (L2) Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free, depleted, debt, missing, no_org) plus validation edge cases: version strict==1 with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off == "true"/"false", never bool()), half-pair subscription limit treated as both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros → None, negative non-subscription micros → None, as_of_ms junk → None, zero limit ZeroDivision guard. 
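The validation rules listed above (bool-string trap, strict integer micros, the USD regex) can be sketched as small helpers. The helper names are hypothetical and this is not the real parse_credits_headers; it only shows why string comparison is used instead of bool().

```python
import re
from typing import Optional

USD_RE = re.compile(r"^-?\d+\.\d{2}$")  # regex quoted in the commit message
INT_RE = re.compile(r"-?\d+")


def parse_bool(raw: Optional[str]) -> Optional[bool]:
    """Headers are strings: bool("false") is True, so compare text instead."""
    if raw == "true":
        return True
    if raw == "false":
        return False
    return None


def parse_micros(raw: Optional[str], allow_negative: bool = False) -> Optional[int]:
    """Return an int, or None for non-integers and disallowed negatives."""
    if raw is None or INT_RE.fullmatch(raw) is None:
        return None
    value = int(raw)
    if value < 0 and not allow_negative:
        return None
    return value


def is_valid_usd(raw: Optional[str]) -> bool:
    """True only for strings with exactly two decimal places."""
    return raw is not None and USD_RE.fullmatch(raw) is not None
```

Only the subscription balance would pass allow_negative=True, matching the rule that no other micros field may go below zero.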
Harden agent/credits_tracker.py to match the spec: - Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState - Add depleted property (== not paid_access, never remaining==0) - Change used_fraction guard to key off subscription_limit_micros (the actual denominator) not denominator_kind (metadata) - Replace fail-soft _safe_int with a sentinel-returning variant; full validation now returns None on any malformed field rather than silently defaulting - Add module-level warn-once latch for version > 1 - Add USD regex validation; add denominator_kind allow-list check - Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-*) * feat(credits): notice spine — AgentNotice + notice_callback/notice_clear_callback + TUI binding (L1) L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's policy will fire through and L5's TUI render will consume. - agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id; kind defaults "sticky", kept TTL-expressive for a future config seam). - run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and _emit_notice / _emit_notice_clear emitters (swallow all callback errors — a notice must never break the agent loop; no-op when unbound). - agent/agent_init.py: thread both callbacks through init_agent. - tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear WS events (snake_case payload, matching the existing gateway-event convention). - ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent. - tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op, signature threading, TUI binding payload shape). Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/ decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly. 
* feat(credits): threshold reconciliation policy + tests (L4.1) * feat(credits): wire threshold policy into capture + latch (L4.2) After a fresh header parse, _capture_credits runs evaluate_credits_notices against the agent's _credits_latch and emits the result — clears first, then shows (so a recovered depletion clears before the "restored" success lands, and depleted wins the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks) still caches state for /usage but runs no policy. Parse stays fail-open (miss → keep last-known); the eval/emit path warns on failure rather than swallowing, so a depletion-notice bug can't vanish silently. - run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn); latch lazy-guarded (object.__new__ safety). - agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}. * feat(tui): render credits notices in the status bar (L5, Strategy B) The TUI now renders the notification.show / notification.clear gateway events the agent emits — a level-colored notice overrides the status/verb slot when not busy. - Notice state machine on turnController (pendingNotice + dedicated noticeTimer + show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler decodes the events and delegates. - Render priority busy > notice > status (appChrome StatusRule); notice text rendered verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx; dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire). - Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites (recordMessageComplete / interruptTurn / recordError) — never idle(), which reset() also calls (would leak across sessions); reset() clears instead. 
- Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard; latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak). - 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority). * feat(credits): cold-start seed for new Nous sessions (L3) A genuinely-new Nous session has no inference header yet, so seed credits state from the authoritative GET /api/oauth/account snapshot at session start (in the new-session branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin hook gets no agent reference). The seed runs the shared notice policy, so a session that opens already depleted warns IMMEDIATELY rather than only after the first turn. - Maps the nested account fields (paid_service_access → paid_access; total_usable / subscription / purchased on paid_service_access_info; rollover on subscription), each None-guarded; float dollars → micros via round(d*1e6), *_usd left "" (render formats from micros — never synthesize a verbatim usd from a float). - Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset → used_fraction None → no warn90 from the seed (% only once a header lands, per D-E). - Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never blocks startup); paid_access unknown ⇒ True (never falsely depleted). - run_agent.py: extracted the warm-path policy/emit block into a shared _emit_credits_notices() so capture and the seed fire notices identically. * feat(credits): /usage Nous credits magnitudes view + recovery trigger (L6) Add Nous credit dollar magnitudes to /usage (subscription / top-up / total + rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the account endpoint exposes a denominator). 
Reuses the existing account-usage render machinery via a new pure build_nous_credits_snapshot() that maps a NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to fetch_account_usage (keeps the per-provider boundary intact). CLI /usage also doubles as a depletion-recovery trigger: a force_fresh account fetch, kept in a SEPARATE local so it never clobbers the header-sourced agent._credits_state (which alone carries used_fraction). If paid access recovered while credits.depleted is latched and a notice consumer is bound, it reuses agent._emit_credits_notices() to clear it. Gateway /usage displays magnitudes only — messaging binds no notice consumer, so it performs no recovery emit. Fail-open throughout: any portal hiccup leaves /usage unaffected. * refactor(credits): dedupe HERMES_DEV_CREDITS flag parse via shared helpers The dev-flag truthy check was inlined in three places. Replace with the shared utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the env check on every render). Behaviour-preserving; identical truthy set. * fix(credits): cut dead /usage recovery trigger + bound portal fetches (L6 review) Adversarial review found the /usage depletion-recovery trigger dead AND broken: the CLI binds no notice_clear_callback, the TUI runs /usage in a separate slash-worker subprocess (its own agent/latch), and the no-clobber rule made it evaluate stale paid_access anyway. Recovery already happens on the next inference (warm path), so the trigger was redundant — remove it and stop the depleted notice over-promising. - cli.py: remove the dead recovery block; bound the /usage portal fetch with a 10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch — urllib's per-socket timeout is not a wall-clock guarantee. 
- agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance" (no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn). - agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch so a stalled portal can't hang session startup; tidy its time import. * chore(credits): dev notice-state fixtures (HERMES_DEV_CREDITS_FIXTURE) Throwaway dev scaffolding to exercise the notice pipeline without real spend or Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct / grant_exhausted / depleted / clear) or a file path whose contents name a state (re-read each turn → flip states live for recovery testing). _capture_credits injects the chosen CreditsState instead of parsing real headers and runs the shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding. * feat(credits): /usage monthly-grant % gauge The portal /api/oauth/account subscription block now carries monthly_credits (the per-period grant allowance, the % denominator). The consumer parsed monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only. Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload. build_nous_credits_snapshot emits a Subscription usage window (real % used, routed through the existing render machinery) when monthly_credits is a finite positive denominator and credits_remaining is finite and <= cap; otherwise it degrades to magnitudes-only (older portals, rollover-over-cap, or non-finite payloads). Guards (adversarial-review-driven): reject non-finite operands (json.loads parses bare NaN/Infinity by default → would render $nan + a false 100% used), reject bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap (rollover spanning the period makes the cap a nonsensical denominator → the $X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%. 
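The gauge guards above reduce to a small pure function. This is a sketch under the stated rules, not the code in build_nous_credits_snapshot; used_percent is a hypothetical name, and the percentage is computed as (cap - remaining) / cap.

```python
import math
from typing import Optional


def _is_finite_num(value: object) -> bool:
    """Accept real ints/floats only: no bools, NaN, or Infinity."""
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and math.isfinite(value)
    )


def used_percent(monthly_credits: object, credits_remaining: object) -> Optional[float]:
    """Percent of the monthly grant used, or None to fall back to magnitudes."""
    if not (_is_finite_num(monthly_credits) and _is_finite_num(credits_remaining)):
        return None
    cap = float(monthly_credits)
    remaining = float(credits_remaining)
    if cap <= 0:          # guards the division
        return None
    if remaining > cap:   # rollover over cap: the cap is not a sane denominator
        return None
    used = (cap - remaining) / cap
    return min(max(used, 0.0), 1.0) * 100.0  # debt clamps to 100%
```

The non-finite check matters because json.loads accepts bare NaN and Infinity by default, so a malformed payload can reach this arithmetic as a float.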
Money rule preserved: the ratio + magnitudes are computed from numeric float account fields via display formatting, never by parsing a server *_usd string (there are none on these dataclasses). 13 gauge tests added (tests/agent/test_nous_credits_gauge.py). * fix(credits): show /usage Nous block whenever a Nous account is present /usage runs in a slash-worker subprocess whose resolved inference provider is often not "nous" even when the user has a Nous account, so gating the Nous credits block on (provider == "nous") hid it entirely — the account data was fully available but never rendered. Gate instead on "a Nous account is logged in": a cheap local auth-state lookup (get_provider_auth_state('nous') has an access_token) decides whether to attempt the portal fetch, regardless of which provider inference runs on. In the gateway the block is also lifted out of the 'if provider:' scope so a Nous-credentialled user with another (or no) resident inference provider still sees their balance. Fail-open and the per-fetch wall-clock timeout are preserved. * fix(credits): show /usage Nous block when there's no live agent (TUI slash-worker) In the TUI, /usage runs in a slash-worker subprocess that resumes the session WITHOUT building an agent (self.agent is None), so _show_usage early-returned "(._.) No active agent" before ever reaching the Nous credits block — which is agent-independent (a portal fetch gated on Nous auth-state). Extract the block into _print_nous_credits_block() and run it at the no-agent / no-calls early-returns too (returns True if it printed, so the fallback message only shows when there's genuinely nothing). Verified live against staging: the block + monthly-grant gauge now render in the slash-worker /usage path (previously hidden). The plain CLI REPL + messaging paths are unchanged (they have a live agent). 
* feat(credits): escalating 50/75/90 usage bands (single status line) Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn, 90 warn) shown as ONE status-bar line: it displays the highest band the subscription grant has crossed, replaces the line as usage climbs, steps back down on recovery, and clears below 50%. No stacking, no per-turn churn. Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything from it. Single notice key (credits.usage) with a usage_band latch field so the notice only re-emits when the band actually changes. The crossing gate (seen_below_90) is preserved so a fresh live session that opens mid-range stays quiet until it has been observed below the lowest band (cold-start primes it when it wants an open-high warning). Denominator math unchanged: % = subscription grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %. Migrated test_credits_policy.py to the new key + added TestUsageBands (climb, step-down, recovery-clear, idempotent, inclusive boundaries). * feat(credits): hydrate notices at session OPEN via shared seed (TUI + first-turn) Notices previously only fired inside a conversation turn (first message), so a session that opened already depleted / past a usage band showed nothing at 'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start() and call it (a) in the TUI/desktop agent build right after the notice callback is wired (fires at 'ready', before any message) and (b) as the first-turn fallback in conversation_loop. Idempotent (skips once _credits_state exists) and fail-open. The seed now maps monthly_credits -> subscription_limit_micros + denominator_kind='subscription_cap', so used_fraction is computable at seed time and usage-band warnings (not just depletion) hydrate on open. Primes the crossing latch so a session opening already in a band warns immediately. Degrades to depletion-only when monthly_credits is absent (older portals). 
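The escalating-band policy described earlier in this message can be sketched as a latch-driven reconciliation. The band values and the usage_band latch field come from the commit text; the function names and the tuple return shape are illustrative, and the crossing gate (seen_below_90) is deliberately left out.

```python
from typing import Dict, Optional, Tuple

# (threshold percent, notice level), lowest first.
CREDITS_USAGE_BANDS = [(50, "info"), (75, "warn"), (90, "warn")]


def current_band(used_fraction: Optional[float]) -> Optional[Tuple[int, str]]:
    """Highest band whose inclusive threshold has been crossed."""
    if used_fraction is None:
        return None
    band = None
    for threshold, level in CREDITS_USAGE_BANDS:
        if used_fraction >= threshold / 100:
            band = (threshold, level)
    return band


def reconcile(latch: Dict[str, Optional[int]], used_fraction: Optional[float]):
    """Return ('show', pct, level), ('clear',), or None when nothing changed."""
    band = current_band(used_fraction)
    new_threshold = band[0] if band else None
    if new_threshold == latch.get("usage_band"):
        return None  # same band as last time: no re-emit
    latch["usage_band"] = new_threshold
    if band is None:
        return ("clear",)
    return ("show", band[0], band[1])
```

Because a single latch field tracks the active band, climbing replaces the line, recovery steps it down, and repeated turns inside one band emit nothing.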
Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap degradation, and the shared seed (fires/idempotent/skips-non-nous). * feat(credits): /usage monthly-grant % gauge + fixture support + TUI surfacing agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge when the portal supplies a positive, finite monthly_credits denominator with remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise. Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so the CLI and TUI /usage render the same block, and _snapshot_from_credits_state() so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too. TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage panel renders them regardless of API-call count or resume state — previously the TUI's separate /usage implementation only showed token counts. Money rule preserved: %% and magnitudes come from numeric float account fields via display formatting, never by parsing a server *_usd string. * feat(credits): CLI REPL inline notices (parity with TUI) The plain CLI agent bound no notice callbacks, so credit notices were TUI-only. Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders a single level-colored line above the prompt (error red / warn yellow / success green / info dim) via _cprint, and seed credits at session open so a depletion or usage-band warning shows before the first message — the same hydration the TUI got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot). 
* test(credits): add sub_50pct + sub_75pct dev fixtures for the new usage bands The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open seed, /usage gauge). * fix(credits): usage-band notice clears on next prompt (not sticky-forever) A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear the visible credits.usage notice when a new turn starts (startMessage), so it shows until your next prompt then yields. The server latch is unchanged, so it won't re-nag at the same band — it only re-shows when the band actually changes (climb) or clears when usage drops below the lowest band. Depletion stays sticky. * refactor(credits): consolidate the /usage credits block behind nous_credits_lines() The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command) each re-implemented the auth-gate + portal fetch + render, and both bypassed the dev-fixture short-circuit that only the TUI honored — so /usage ignored HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth gate, and the fixture works on every surface (~60 fewer duplicated lines). The gateway usage test recorded only the last asyncio.to_thread call; /usage now dispatches both the account fetch and the credits fetch, so it records every call and matches the account fetch by its provider arg. * fix(credits): keep the /usage gauge type-safe and log its fail-open path _is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge operands (monthly_credits / credits_remaining) and the magnitudes passed to _fmt_usd through it — no more None-operand warnings on the arithmetic. 
Add a debug breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block is diagnosable in agent.log without a dev flag.

* fix(credits): harden the header tracker: prod-leak gate, hot-path probe, fire-and-forget seed

- Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a real account. Matches the documented run workflow (both vars set together).
- Hot-path probe: parse_credits_headers checks for the version sentinel header before allocating a lowercased copy of the response headers, skipping that work on every non-Nous API call. Behaviour-identical and still case-insensitive.
- Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now runs in a daemon thread, so a slow/unreachable portal never delays session "ready" (previously blocked up to 10s). The dev-fixture path stays synchronous; the thread re-checks idempotency before hydrating (a live header may land first).
- Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed parser / dead seed is distinguishable from a legitimate no-headers miss.

Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate.

* test(tui): fix env-timing in the StatusRule dev-credits assertion

DEV_CREDITS_MODE is read once at module load (config/env), so mutating process.env.HERMES_DEV_CREDITS inside the test couldn't flip it; the dev-banner assertion only passed if the env was exported before vitest started, and failed in a normal run. Move that assertion to a sibling file that mocks config/env with DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard).
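The fire-and-forget seed from the header-tracker commit above might look something like the sketch below. `CreditsState`, `hydrate_once`, and `seed_credits` are hypothetical names chosen for illustration; the real entry point is seed_credits_at_session_start, whose signature is not shown in this PR.

```python
import threading
from typing import Callable, Optional


class CreditsState:
    """Hypothetical holder for the credits snapshot; first writer wins."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.snapshot: Optional[dict] = None

    def hydrate_once(self, snapshot: dict) -> bool:
        # Idempotency check under a lock: a live response header may have
        # hydrated the state while the portal fetch was still in flight.
        with self._lock:
            if self.snapshot is not None:
                return False
            self.snapshot = snapshot
            return True


def seed_credits(state: CreditsState,
                 fetch: Callable[[], Optional[dict]],
                 fixture: Optional[dict] = None) -> Optional[threading.Thread]:
    """Seed credits without blocking session start."""
    if fixture is not None:
        # Dev-fixture path stays synchronous: no network involved.
        state.hydrate_once(fixture)
        return None

    def _worker() -> None:
        try:
            snapshot = fetch()  # may be slow or unreachable
        except Exception:
            return  # fail open: no seed, session is unaffected
        if snapshot is not None:
            state.hydrate_once(snapshot)  # re-check before hydrating

    thread = threading.Thread(target=_worker, name="credits-seed", daemon=True)
    thread.start()
    return thread
```

The thread is returned only so a caller or test can join it; the session-start path would normally ignore the return value. Marking the thread as a daemon means a hung fetch cannot keep the process alive at exit.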
* test(credits): cover the dev-fixture /usage render and usage-band clear-on-prompt

- _snapshot_from_credits_state (the offline /usage renderer) had no direct test: lock the gauge math, the verbatim *_usd magnitudes, the depletion line and the fixture marker, plus the no-cap (no gauge) and None-state cases.
- turnController.startMessage had no test for clearing the credits.usage notice on the next prompt while leaving credits.depleted sticky.

* feat(credits): deliver credit notices over messaging gateways

Bind notice_callback/notice_clear_callback on the per-turn gateway agent so usage-band / depletion / restored notices reach Telegram/Discord/Slack/etc. Previously the messaging gateway bound neither callback, so the agent's _emit_credits_notices early-returned and a chat user crossing a band got nothing unless they ran /usage manually.

- render_notice_line(): AgentNotice -> single plaintext line (level glyph + text), plaintext-only so it renders uniformly without per-platform escaping. Fail-soft on malformed/empty notices.
- Standalone push for every notice (messaging has no persistent status bar): route through the shared _deliver_platform_notice rail (honors private/public delivery + thread metadata), scheduled onto the gateway loop via safe_schedule_threadsafe from the agent's sync worker thread, the same pattern as _status_callback_sync.
- The fired-once latch lives on the cached (reused-in-place) agent and persists across turns, so a band crosses once -> one push, no per-turn re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder).
- Recovery ('Credit access restored') rides the show path (emitted as a success notice, not a clear). notice_clear_callback is a no-op: a sent platform message can't be cleanly retracted.

Tests: render glyph/levels/fail-soft + public/private delivery seam through _deliver_platform_notice + no-adapter no-op.
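As a rough illustration of the fired-once latch mentioned above (one push per band crossing, no per-turn re-nag), here is a minimal sketch. The class and function names are hypothetical, the 50/75/90 thresholds come from the commit text, and the choice to stay silent when usage falls to a lower but still active band is an assumption.

```python
from typing import Optional

USAGE_BANDS = (50, 75, 90)  # percent-used thresholds named in the commits


def band_for(used_fraction: float) -> Optional[int]:
    """Highest band reached by this usage fraction, or None below the lowest."""
    hit: Optional[int] = None
    for band in USAGE_BANDS:
        if used_fraction >= band / 100:
            hit = band
    return hit


class UsageBandLatch:
    """Hypothetical latch: announce a band once, re-announce only on a climb."""

    def __init__(self) -> None:
        self.active_band: Optional[int] = None

    def update(self, used_fraction: float) -> Optional[int]:
        """Return the band to announce now, or None to stay quiet."""
        band = band_for(used_fraction)
        if band is None:
            self.active_band = None  # dropped below the lowest band: reset
            return None
        if self.active_band is not None and band <= self.active_band:
            return None  # same (or lower) band: already announced
        self.active_band = band
        return band
```

Because the latch object holds the state, it only suppresses repeats for as long as the owning agent is reused; rebuilding the agent starts from a fresh latch, which matches the "re-fires after idle-eviction" behaviour described above.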
* fix(credits): don't double the glyph on messaging notices

render_notice_line prepended a per-level glyph, but the notice policy already bakes the glyph into the text (and the TUI + CLI render it verbatim), so every credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used", "⛔ ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead level→glyph map.

The render tests fed glyph-less text (and the success case only checked startswith), so the doubling slipped through. Rework them around the verbatim contract and add an end-to-end regression that runs real evaluate_credits_notices output through render_notice_line and asserts the line is returned unchanged.
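The verbatim contract after this fix can be illustrated with a small sketch. The `AgentNotice` dataclass below is a stand-in with assumed fields (key, level, text); only the function name render_notice_line comes from the commit message.

```python
from dataclasses import dataclass


@dataclass
class AgentNotice:
    """Stand-in notice shape for illustration; the real class may differ."""
    key: str
    level: str
    text: str


def render_notice_line(notice: object) -> str:
    """Return the notice text unchanged, or "" for malformed/empty notices.

    The glyph is already part of the text, so nothing is prepended here.
    """
    text = getattr(notice, "text", None)
    if not isinstance(text, str) or not text.strip():
        return ""  # fail-soft: never raise into the delivery path
    return text
```

A regression check in the spirit of the commit is then simply that the rendered line equals the input text, rather than a looser startswith assertion that a doubled glyph would still pass.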
…ousResearch#40367)

credits.grant_spent is a one-time "your monthly grant is used up, you're now on top-up" heads-up, but it was sticky: it camped the TUI status bar until the grant refilled, so a user with healthy top-up saw "Grant spent · $990 top-up left" indefinitely. Treat it like the usage-band notice: flash once, then clear on the next prompt (startMessage). Depletion stays sticky (you actually can't make requests). The Python `active` latch keeps the key, so it won't re-fire next turn.
…ousResearch#40371)

profile list and profile show assumed the wrapper script is always named after the profile (wrapper_dir / name). When a custom alias exists (e.g. `hermes profile alias steve --name qiaobusi` creates ~/.local/bin/qiaobusi pointing at `hermes -p steve`), the display silently showed the profile name (or nothing) instead of the alias the user actually typed. The custom-alias *creation* path (create_wrapper_script(name, target)) was added later; the *display* path was never updated to match.

Add find_alias_for_profile(), a reverse lookup that scans the wrapper dir for our own wrappers (alias-named file containing 'hermes -p <profile>'), prefers a custom alias over the profile-named one, strips .bat on Windows, and sorts for deterministic output. Populate ProfileInfo.alias_name and wire it into the three display sites (profile describe, list, show).

Credit: salvages the intent of NousResearch#11506 by wss434631143, reimplemented on current main against the post-NousResearch#11506 custom-alias (--name/target) mechanism.

Tests: 6 new (profile-named, custom-name, none, unrelated-file rejection, windows .bat strip, list_profiles surfacing). All 123 in test_profiles pass. E2E verified against the real CLI for both custom and profile-named aliases.
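The reverse lookup described above could be sketched as below. The signature (a `wrapper_dir` path plus a `windows` flag) and the regex used to detect a wrapper are assumptions for illustration; the actual find_alias_for_profile() in the repository may differ.

```python
import os
import re
from pathlib import Path
from typing import List, Optional


def find_alias_for_profile(wrapper_dir: Path, profile: str,
                           windows: bool = (os.name == "nt")) -> Optional[str]:
    """Return the alias whose wrapper script launches `hermes -p <profile>`."""
    if not wrapper_dir.is_dir():
        return None
    # Match the exact profile token so "steve" does not match "steve2".
    pattern = re.compile(rf"hermes -p {re.escape(profile)}(?![\w-])")
    names: List[str] = []
    for entry in sorted(wrapper_dir.iterdir()):  # sorted: deterministic output
        if not entry.is_file():
            continue
        try:
            content = entry.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than fail the listing
        if not pattern.search(content):
            continue  # unrelated file in the wrapper dir
        name = entry.name
        if windows and name.lower().endswith(".bat"):
            name = name[:-4]  # strip the .bat suffix on Windows
        names.append(name)
    custom = [n for n in names if n != profile]
    if custom:
        return custom[0]  # prefer a custom alias over the profile-named one
    return profile if profile in names else None
```

Checking file contents rather than file names is what lets a custom alias such as qiaobusi be recognised as pointing at the steve profile, while unrelated executables in the same directory are ignored.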
…, preview + security scan (NousResearch#40384)

The Browse-hub tab was a blank search box with sparse result cards (name + source + one Install button), no way to read a skill before installing, no visual security scan, and no indication it was even connected to any hubs.

Backend (web_server.py):
- GET /api/skills/hub/sources: lists the configured hubs (label + trust tier + GitHub rate-limit + index availability) and featured skills pulled from the centralized index (zero extra API calls), plus installed-skill provenance so the UI can mark already-installed results.
- GET /api/skills/hub/preview: fetches a skill's SKILL.md text + file manifest WITHOUT installing (decodes byte-stored text, masks binaries).
- GET /api/skills/hub/scan: runs the SAME quarantine + scan_skill + should_allow_install pipeline the CLI installer uses, then cleans up quarantine, returning verdict / per-finding detail / severity tally / install-policy decision.
- search now returns per-source counts + timed-out sources + installed map.

Frontend (SkillsPage HubBrowser):
- Landing state: connected-hubs strip + featured skill grid (no more blank page).
- Rich cards: trust-level color coding, source, tags, identifier, Details + Install (or Installed state).
- Detail dialog: read the actual SKILL.md, on-demand visual security scan (verdict pill, severity tally, per-finding list, allow/block policy), GitHub repo link.
- Search meta line: result count + timing + per-source breakdown (the 'feels slow / no feedback' complaint).

Tests: 4 new endpoint test classes (sources/preview/scan + updated search shape) in test_dashboard_admin_endpoints.py.
Summary
- `replace` and `remove` `old_text` through the provider/manager boundary

Verification
- ./venv/bin/python -m pytest -o addopts='' tests/agent/test_memory_provider.py tests/run_agent/test_run_agent.py -q
- 453 passed, 1 warning