fix(xai-oauth): rewrite entitlement-403 hint to not accuse subscribers by teknium1 · Pull Request #26666 · NousResearch/hermes-agent

teknium1 · 2026-05-16T00:15:15Z

Summary

Rewrite the xAI permission-denied 403 hint so it doesn't accuse subscribed users of being unsubscribed. Follow-up to #26644 (added the hint) and #26664 (short-circuited the refresh loop).

Why

PR #26644 confidently told users "xAI OAuth account lacks SuperGrok / X Premium entitlement for this model". But xAI's /v1/responses returns the SAME body for at least four distinct causes we can't distinguish:

* Account has no Grok subscription
* Account has SuperGrok but the tier doesn't include the requested model (grok-4.3 needs SuperGrok Heavy)
* Monthly quota for the subscribed tier is exhausted
* SuperGrok active but the API access add-on isn't enabled

Don Piedro reported he IS subscribed and still hit the 403. Telling him "you're not subscribed" reads as wrong and points him at a fix he has already made. The detection logic and credential-pool short-circuit (#26664) are unchanged; only the user-facing wording changes.

Changes

run_agent.py::_decorate_xai_entitlement_error — new wording lists all 4 possible causes and points at https://grok.com/?_s=usage (the URL xAI itself returns) where the user can verify which one applies. Idempotency check now keys on a hint-unique substring rather than the URL (since xAI's own body contains the URL).
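The idempotency point above can be illustrated with a minimal sketch. This is not the code in run_agent.py: the function name, the `_HINT_MARKER` constant, and the " - " separator are assumptions made for illustration, and only the hint wording is taken from the "After" message in this PR.

```python
# Hypothetical marker: a phrase that appears only in the Hermes hint and never
# in xAI's own 403 body. The URL is a poor key because xAI's body contains it.
_HINT_MARKER = "xAI rejected the request on this OAuth account"

_HINT = (
    "xAI rejected the request on this OAuth account. Could be a missing "
    "subscription, a tier that doesn't include this model, an exhausted quota, "
    "or API access not enabled. Check https://grok.com/?_s=usage to see which, "
    "or run `/model` to switch providers."
)


def decorate_entitlement_error(message: str) -> str:
    """Append the hint once; calling this again leaves the message unchanged."""
    if _HINT_MARKER in message:
        return message  # already decorated
    return f"{message} - {_HINT}"
```

Keying the check on the URL instead would treat an undecorated xAI body as already decorated, so the hint would never be appended.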

Validation

| | Result |
| --- | --- |
| `test_codex_xai_oauth_recovery.py` | 24/24 pass (1 new test ensures hint never says "lacks subscription") |

Before / After

Before:

⚠ Auxiliary title generation failed: HTTP 403: ... — xAI OAuth account lacks SuperGrok / X Premium entitlement for this model. Subscribe at https://grok.com or run `/model` to switch providers.

After:

⚠ Auxiliary title generation failed: HTTP 403: ... — xAI rejected the request on this OAuth account. Could be a missing subscription, a tier that doesn't include this model, an exhausted quota, or API access not enabled. Check https://grok.com/?_s=usage to see which, or run `/model` to switch providers.

PR #26644 confidently told users "xAI OAuth account lacks SuperGrok / X Premium entitlement" on any 403 from xAI's permission-denied surface. But that body is returned for at least four distinct causes that Hermes cannot distinguish from the wire:

* Account has no Grok subscription at all
* Account has SuperGrok but the tier doesn't include the requested model (e.g. grok-4.3 needs SuperGrok Heavy)
* Monthly quota for the subscribed tier is exhausted
* SuperGrok is active but the API access add-on isn't enabled

Don Piedro pushed back that he IS subscribed yet still hit this. Picking the worst-case interpretation ("you're not subscribed") reads as wrong and insulting to subscribers, and points them at a fix they already did.

New wording lists all 4 possibilities and points at https://grok.com/?_s=usage where the user can check which applies. The detection logic and credential-pool short-circuit (PR #26664) are unchanged; only the user-facing wording is rephrased.

github-actions · 2026-05-16T00:16:01Z

🔎 Lint report: `hermes/hermes-cfe77b12` vs `origin/main`

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8304 on HEAD, 8304 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4338 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

The #1 confusing cause of the xAI 403 (per Teknium): X Premium+ subscribers see Grok inside the X app and assume API access is included. It is NOT: only standalone SuperGrok subscribers can use xai-oauth with Hermes today. Without calling this out, every Premium+ user hits the 403 with no idea why.

PR #26666's neutral 4-cause list was correct but buried the most common cause. Lead with the Premium+ gotcha, then list the other possibilities (no subscription, wrong tier, exhausted quota) as fallbacks. Same neutral framing: does not accuse anyone of being unsubscribed.



* fix(tui): restrict fast-echo bypass to ASCII so Vietnamese/CJK/IME input renders correctly (#26011)

* fix(tui): restrict fast-echo bypass to ASCII so Vietnamese/CJK/IME input renders correctly

  The composer's fast-echo path (canFastAppend / canFastBackspace) writes characters straight to stdout to skip an Ink re-render on the hot typing path. The previous guard only checked 'stringWidth(text) === text.length', which lets a lot of non-ASCII through:

  - Vietnamese precomposed letters (ề, ắ, ờ, ự, ...) report width 1 and length 1, but a Vietnamese Telex / IME stack produces them across multiple keystrokes; the intermediate composition state must be drawn by Ink so the rendered cell, the stored value, and the cursor column stay in lockstep when the final commit replaces the preview.
  - NFD combining marks (U+0300..U+036F) are zero-width but length 1, so even a passing equality lets them slip and silently desync the cell column.
  - CJK/East-Asian wide and emoji rejected only because their length differs, but the boundary was shape-shaped, not intent-shaped.

  User-visible bug from the original report:

  Example: eê noiói nge neène -> the bypass committed the IME preview char before the diacritic replaced it, leaving doubled letters on screen.

  Fix: gate fast-echo on pure printable ASCII (0x20-0x7e). The performance-critical English typing path is unchanged; everything else goes through the normal Ink render path so layout stays accurate.

  Also extracts the shape preconditions as pure exported helpers (canFastAppendShape / canFastBackspaceShape) so the regression matrix is testable without spinning up a TextInput.

  Tests: ui-tui/src/__tests__/textInputFastEcho.test.ts adds 20 cases covering ASCII still works, Vietnamese precomposed + NFD, CJK, emoji, NBSP / Latin-1, ANSI / control bytes, multi-line, and end-of-line preconditions. Verified RED on the previous guard (11 of 20 fail) and GREEN on the new guard.

  Refs: #5221, #7443, #17602, #17603 (similar wide-char rendering bugs).
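The printable-ASCII gate described in the fast-echo commit can be sketched as follows. The real helpers (canFastAppendShape / canFastBackspaceShape) are TypeScript and also check cursor and line preconditions; this Python transliteration covers only the character-class test, and `is_fast_echo_safe` is a hypothetical name.

```python
import re

# Printable ASCII only: space (0x20) through tilde (0x7e).
_PRINTABLE_ASCII = re.compile(r"[\x20-\x7e]+")


def is_fast_echo_safe(text: str) -> bool:
    """True only when every character is printable ASCII (and text is non-empty)."""
    return _PRINTABLE_ASCII.fullmatch(text) is not None
```

Unlike a width-equals-length comparison, this rejects precomposed Vietnamese letters, combining marks, NBSP, and control bytes outright, so anything an IME might still be composing falls through to the normal render path.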
* docs(tui): clarify Vietnamese char terminology in regression comment

  Address Copilot review: 'single byte width' implied UTF-8 byte semantics, but the relevant property is JS code units (`text.length === 1`) and display width (`stringWidth === 1`). Reworded to match.

* feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider

  Adds a new authentication provider that lets SuperGrok subscribers sign in to Hermes with their xAI account via the standard OAuth 2.0 PKCE loopback flow, instead of pasting a raw API key from console.x.ai.

  Highlights
  ----------

  * OAuth 2.0 PKCE loopback login against accounts.x.ai with discovery, state/nonce, and a strict CORS-origin allowlist on the callback.
  * Authorize URL carries `plan=generic` (required for non-allowlisted loopback clients) and `referrer=hermes-agent` for best-effort attribution in xAI's OAuth server logs.
  * Token storage in `auth.json` with file-locked atomic writes; JWT `exp`-based expiry detection with skew; refresh-token rotation synced both ways between the singleton store and the credential pool so multi-process / multi-profile setups don't tear each other's refresh tokens.
  * Reactive 401 retry: on a 401 from the xAI Responses API, the agent refreshes the token, swaps it back into `self.api_key`, and retries the call once. Guarded against silent account swaps when the active key was sourced from a different (manual) pool entry.
  * Auxiliary tasks (curator, vision, embeddings, etc.) route through a dedicated xAI Responses-mode auxiliary client instead of falling back to OpenRouter billing.
  * Direct HTTP tools (`tools/xai_http.py`, transcription, TTS, image-gen plugin) resolve credentials through a unified runtime → singleton → env-var fallback chain so xai-oauth users get them for free.
  * `hermes auth add xai-oauth` and `hermes auth remove xai-oauth N` are wired through the standard auth-commands surface; remove cleans up the singleton loopback_pkce entry so it doesn't silently reinstate.
  * `hermes model` provider picker shows "xAI Grok OAuth (SuperGrok Subscription)" and the model-flow falls back to pool credentials when the singleton is missing.

  Hardening
  ---------

  * Discovery and refresh responses validate the returned `token_endpoint` host against the same `*.x.ai` allowlist as the authorization endpoint, blocking MITM persistence of a hostile endpoint.
  * Discovery / refresh / token-exchange `response.json()` calls are wrapped to raise typed `AuthError` on malformed bodies (captive portals, proxy error pages) instead of leaking JSONDecodeError tracebacks.
  * `prompt_cache_key` is routed through `extra_body` on the codex transport (sending it as a top-level kwarg trips xAI's SDK with a TypeError).
  * Credential-pool sync-back preserves `active_provider` so refreshing an OAuth entry doesn't silently flip the active provider out from under the running agent.

  Testing
  -------

  * New `tests/hermes_cli/test_auth_xai_oauth_provider.py` (~63 tests) covers JWT expiry, OAuth URL params (plan + referrer), CORS origins, redirect URI validation, singleton↔pool sync, concurrency races, refresh error paths, runtime resolution, and malformed-JSON guards.
  * Extended `test_credential_pool.py`, `test_codex_transport.py`, and `test_run_agent_codex_responses.py` cover the pool sync-back, `extra_body` routing, and 401 reactive refresh paths.
  * 165 tests passing on this branch via `scripts/run_tests.sh`.

* fix(tools): video_gen picker reflects active xAI selection and runs xai_grok post_setup

  Two bugs in the `hermes tools` reconfigure flow caused picking xAI Grok Imagine for video_gen (or image_gen) to feel like a no-op:

  1. `_is_provider_active()` had a branch for `image_gen_plugin_name` but none for `video_gen_plugin_name`, so a row marked as the active xAI video provider was never recognized as active. The picker fell through to the env-var fallback in `_detect_active_provider_index()`, which matched the FAL row (because `FAL_KEY` is set), so the picker visually defaulted to FAL even though the user had selected xAI.
  2. `_plugin_video_gen_providers()` and `_plugin_image_gen_providers()` built picker rows from the plugin's `get_setup_schema()` but only copied `name`, `badge`, `tag`, `env_vars`. The xAI plugins declare `post_setup: "xai_grok"` so the picker should run the OAuth / API-key prompt hook after selection; that key was silently dropped, so the hook never fired from the picker rows.

  Adds the missing `video_gen_plugin_name` branch (placed before the `managed_nous_feature` block, mirroring the existing image_gen branch) and propagates `post_setup` from the plugin schema into both picker-row builders. Adds focused tests in `test_video_gen_picker.py` and `test_image_gen_picker.py`.

* chore(release): map Jaaneek@users.noreply.github.com to Jaaneek

  The contributor's commit author email is the legacy GitHub noreply form (no leading numeric "id+"), so it doesn't match the check-attribution workflow's auto-resolve regex (\+.*@users\.noreply\.github\.com). Register it explicitly in AUTHOR_MAP so the PR #26457 attribution check passes.

* fix(xai-http): preserve ~/.hermes/.env fallback and XAI_STT_BASE_URL precedence

  The new resolve_xai_http_credentials() resolver was using os.getenv() for the XAI_API_KEY/XAI_BASE_URL fallback path, which dropped the ~/.hermes/.env contract guarded by PR #17140 / #17163. Users with XAI_API_KEY in dotenv only would see "No xAI credentials found" even though the key was configured.

  Separately, _transcribe_xai started consulting creds["base_url"] (which always returns at least the default https://api.x.ai/v1) ahead of the public XAI_STT_BASE_URL env override, so the per-tool override stopped working.

  - tools/xai_http.py: add module-level get_env_value() wrapper that reads ~/.hermes/.env first (via hermes_cli.config.get_env_value), then os.environ. Resolver uses it for the API-key/base-url fallback.
  - tools/transcription_tools.py: restore precedence so XAI_STT_BASE_URL wins over creds["base_url"].
  - tests/tools/test_transcription_dotenv_fallback.py + tests/tools/test_tts_dotenv_fallback.py: repoint the per-call-site patches at the new resolution point (tools.xai_http.get_env_value). The end-to-end regression-guard test (which patches load_env) is unchanged and still passes.

* refactor(transports/codex): trim duplicated cache-key comments

  The xAI prompt_cache_key block carried two long comment paragraphs that either restated setdefault semantics, narrated the SDK type-validation mechanism, or recapped the historical motivation for the extra_body indirection, all already covered by the test docstring at test_xai_responses_sends_cache_key_via_extra_body (which links to the xAI docs). Also restored the truncated link in the body-injection comment. No behavior change.

* docs(xai-oauth): correct logout command (was hermes auth remove)

  The previous "Logging Out" section showed `hermes auth remove xai-oauth` with no positional target; argparse rejects that and the command does not clear the singleton OAuth state anyway. The correct command for the "clear everything" intent is `hermes auth logout xai-oauth`. Also point users at `hermes auth remove xai-oauth <target>` for single-pool-row deletion.

* test(xai-oauth): use grok-4.3 instead of retiring grok-code-fast-1

  Per @mark-xai's review on PR #26457 and the xAI model retirement on 2026-05-15: grok-code-fast-1 is being retired today and aliases redirect to grok-4.3 (already pinned to the top of the xAI model list by this PR). Update the two xAI Responses-API test fixtures Mark flagged plus the picker fallback default in hermes_cli/main.py that uses the same literal.
* chore(xai-oauth): trim CORS allowlist to xAI auth origins

  Drop accounts.mouseion.dev and localhost:20000 / 127.0.0.1:20000 from the loopback callback CORS allowlist (leftover dev origins). The redirect_uri is bound to 127.0.0.1 and gated by PKCE + state, so only xAI's own auth origins are needed.

  Co-Authored-By: Jaaneek <Jaaneek@users.noreply.github.com>

* docs(xai-oauth): add xai-oauth to provider enumeration pages (#26542)

  Follow-up to #26534 (xai-oauth provider). The new guide and integrations page were shipped with the salvage, but four reference/enumeration pages still listed every other OAuth provider without xai-oauth:

  - reference/cli-commands.md: `--provider` choices list
  - reference/environment-variables.md: HERMES_INFERENCE_PROVIDER values
  - user-guide/configuration.md: auxiliary-task provider list, OAuth tip block (mirrored from MiniMax OAuth), and provider table row
  - user-guide/features/fallback-providers.md: provider table

* fix(cronjob): require explicit truthy session env values

* fix(env-flags): widen truthy-only session env checks to sibling sites

  Build on @aydnOktay's cronjob fix by routing the cronjob check through the shared 'env_var_enabled' helper in utils.py (same truthy set: 1/true/yes/on) and applying the same semantics to the 8 sibling call sites that read HERMES_INTERACTIVE / HERMES_GATEWAY_SESSION / HERMES_EXEC_ASK / HERMES_CRON_SESSION with bare os.getenv() truthy checks:

  - tools/approval.py: _is_gateway_approval_context (2), check_command_safety (2), check_all_command_guards (3) -- 7 sites total
  - tools/terminal_tool.py: _handle_sudo_failure, sudo password prompt -- 2 sites
  - tools/skills_tool.py: _is_gateway_surface -- 1 site

  Without this, a user who exports HERMES_INTERACTIVE=0 in their shell still gets interactive sudo prompts, approval prompts, and gateway skill-install paths -- only the cronjob tool was hardened. Now all consumers agree on the same false-like values.
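The difference between a bare truthy check and the shared helper can be shown with a minimal sketch. Only the truthy set (1/true/yes/on) comes from the commit message; the `environ` parameter and the strip/lowercase normalization are assumptions for illustration, not the confirmed signature of `utils.env_var_enabled`.

```python
import os

_TRUTHY = {"1", "true", "yes", "on"}


def env_var_enabled(name: str, environ=None) -> bool:
    """Return True only for an explicit truthy value (1/true/yes/on).

    `environ` is a hypothetical injection point so the helper can be tested
    without touching the real process environment.
    """
    env = os.environ if environ is None else environ
    # Assumed normalization: surrounding whitespace and case are ignored.
    return env.get(name, "").strip().lower() in _TRUTHY
```

A bare `if os.getenv("HERMES_INTERACTIVE"):` treats the string "0" as enabled because any non-empty string is truthy in Python, which is the behavior this commit removes.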
  Also drops the duplicate _is_truthy_env helper from cronjob_tools.py in favour of the existing canonical utils.env_var_enabled.

  Tests: extend the parametrized regression coverage to all three session env vars (HERMES_INTERACTIVE / HERMES_GATEWAY_SESSION / HERMES_EXEC_ASK) symmetrically. tests/tools/test_cronjob_tools.py: 60/60 pass; tests/tools/{approval,terminal_tool,skills_tool, cron_approval_mode,hardline_blocklist}.py: 378/378 pass.

* fix(async): close unscheduled coroutines in all threadsafe bridges (#26584)

  Wraps every sync->async coroutine-scheduling site in the codebase with a new agent.async_utils.safe_schedule_threadsafe() helper that closes the coroutine on scheduling failure (closed loop, shutdown race, etc.) instead of leaking it as 'coroutine was never awaited' RuntimeWarnings plus reference leaks.

  22 production call sites migrated across the codebase:

  - acp_adapter/events.py, acp_adapter/permissions.py
  - agent/lsp/manager.py
  - cron/scheduler.py (media + text delivery paths)
  - gateway/platforms/feishu.py (5 sites, via existing _submit_on_loop helper which now delegates to safe_schedule_threadsafe)
  - gateway/run.py (10 sites: telegram rename, agent:step hook, status callback, interim+bg-review, clarify send, exec-approval button+text, temp-bubble cleanup, channel-directory refresh)
  - plugins/memory/hindsight, plugins/platforms/google_chat
  - tools/browser_supervisor.py (3), browser_cdp_tool.py, computer_use/cua_backend.py, slash_confirm.py
  - tools/environments/modal.py (_AsyncWorker)
  - tools/mcp_tool.py (2 + 8 _run_on_mcp_loop callers converted to factory-style so the coroutine is never constructed on a dead loop)
  - tui_gateway/ws.py

  Tests: new tests/agent/test_async_utils.py covers helper behavior under live loop, dead loop, None loop, and scheduling exceptions. Regression tests added at three PR-original sites (acp events, acp permissions, mcp loop runner) mirroring contributor's intent.
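The close-on-failure pattern behind the helper can be sketched roughly as below. The function name matches the commit message, but the argument order, the `None` return on failure, and the broad `except` are assumptions; the actual agent.async_utils implementation may differ.

```python
import asyncio


def safe_schedule_threadsafe(coro, loop):
    """Schedule `coro` on `loop` from another thread, closing it on failure.

    Returns the concurrent.futures.Future on success, or None when the
    coroutine could not be scheduled (assumed contract for this sketch).
    """
    if loop is None or loop.is_closed():
        coro.close()  # avoids the 'coroutine was never awaited' warning
        return None
    try:
        return asyncio.run_coroutine_threadsafe(coro, loop)
    except Exception:
        # E.g. RuntimeError when the loop closes between the check and the call.
        coro.close()
        return None
```

The factory-style conversion mentioned for the MCP callers goes one step further: the caller passes a function that builds the coroutine, so nothing is constructed at all when the loop is already dead.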
  Live-tested end-to-end:

  - Helper stress test: 1500 schedules across live/dead/race scenarios, zero leaked coroutines
  - Race exercised: 5000 schedules with loop killed mid-flight, 100 ok / 4900 None returns, zero leaks
  - hermes chat -q with terminal tool call (exercises step_callback bridge)
  - MCP probe against failing subprocess servers + factory path
  - Real gateway daemon boot + SIGINT shutdown across multiple platform adapter inits
  - WSTransport 100 live + 50 dead-loop writes
  - Cron delivery path live + dead loop

  Salvages PR #2657: adopts contributor's intent over a much wider site list and a single centralized helper instead of inline try/except at each site. 3 of the original PR's 6 sites no longer exist on main (environments/patches.py deleted, DingTalk refactored to native async); the equivalent fix lives in tools/environments/modal.py instead.

  Co-authored-by: JithendraNara <jithendranaidunara@gmail.com>

* feat(nvidia): add NIM billing origin header

* chore(release): add AUTHOR_MAP entry for kchantharuan@nvidia.com

* fix(acp): emit native plan updates for todo

* fix(acp): replay native todo plans

* fix(install.ps1): restore EAP=Continue around uv python install, skip Store stub (#26586)

  Fresh Windows installs were failing on first run with:

      ⚠ uv python install error: Downloading cpython-3.11.15-windows-x86_64-none (24.5MiB)
      ✗ Installation failed: Python was not found; run without arguments to install from the Microsoft Store...

  Two bugs compounding:

  1) EAP=Stop swallows uv's stderr progress as an exception. uv writes download progress ("Downloading cpython-3.11.15-windows-x86_64-none (24.5MiB)") to stderr. With $ErrorActionPreference = "Stop" set at the top of the script plus 2>&1 capture, PowerShell wraps each stderr line as an ErrorRecord and throws on the first one, even though uv exits 0 and Python was installed successfully. This was previously fixed in commit ec1714e71 (May 8) but lost in the May 12 release squash (413990c94). Reapply the EAP=Continue + verify-via 'uv python find' pattern.

  2) System-python fallback invokes the Microsoft Store stub. When the uv paths fall through, the legacy 'python --version' check invokes %LOCALAPPDATA%\\Microsoft\\WindowsApps\\python.exe, a 0-byte reparse-point stub that prints 'Python was not found...' to stdout and exits non-zero. Get-Command matches it. The resulting error message is what the user sees as the final installer crash. Detect and skip the stub by checking for the \\WindowsApps\\ path component or a 0-byte file size before invoking python.

  Also save/restore EAP defensively in the catch blocks so a throw before the assignment can't leave EAP in 'Continue'.

* fix(auth): point SSH OAuth users at the tunnel they actually need (#26592)

  Two loopback-redirect OAuth flows (xAI Grok, Spotify) silently fail when Hermes runs on a remote host: the auth server redirects to 127.0.0.1:<port> on the user's laptop, not on the remote box. The --no-browser flag only suppresses webbrowser.open(); it doesn't change the bind address. Symptom xAI surfaces is 'Could not establish connection. We couldn't reach your app.', followed by a 'xAI authorization timed out waiting for the local callback' on the CLI side.

  Changes

  - hermes_cli/auth.py: new _print_loopback_ssh_hint() helper, called from _xai_oauth_loopback_login() and _spotify_login() right after they print the redirect URI. Silent off SSH; on SSH prints the exact 'ssh -N -L <port>:127.0.0.1:<port>' command using the actually-bound port (not the hardcoded constant; the listener auto-bumps when the preferred port is busy), a provider-specific docs URL, and a link to the new shared guide.
  - website/docs/guides/oauth-over-ssh.md (new): single source of truth for the tunnel pattern: TL;DR command, jump-box / ProxyJump variant, mosh+tmux+ControlMaster gotchas, troubleshooting.
  - website/docs/guides/xai-grok-oauth.md: fix the two sections that claimed --no-browser alone was enough; link to the shared guide.
  - website/docs/user-guide/features/spotify.md: expand the existing one-liner; link to the shared guide.
  - website/sidebars.ts: register the new page.
  - tests/hermes_cli/test_auth_loopback_ssh_hint.py: 7 unit tests covering SSH-vs-not, loopback-vs-not, malformed URIs, port echo, with and without provider docs URL.

* fix(gateway): keep running when platforms fail; add per-platform circuit breaker + /platform (#26600)

  Stop the gateway from exiting (or systemd-restart-looping) when a single messaging adapter fails at startup or runtime. A misconfigured WhatsApp (npm install timeout, unpaired bridge, missing creds.json) used to take the entire gateway down, killing cron jobs and any other connected platforms with it.

  Changes:

  • Startup (gateway/run.py): when connected_count==0 but the only errors are retryable, log a degraded-state warning and keep the gateway alive instead of returning False. Reconnect watcher then recovers platforms as their underlying problem clears.
  • Runtime (gateway/run.py _handle_adapter_fatal_error): when the last adapter goes down with a retryable error and is queued for reconnection, stay alive instead of exit-with-failure. Previously this triggered systemd Restart=on-failure, which created infinite restart loops on persistent retryable failures (proxy outage, repeated bridge crashes).
  • Reconnect watcher (gateway/run.py _platform_reconnect_watcher): replace the 20-attempt hard drop with a circuit-breaker pause. After _PAUSE_AFTER_FAILURES (10) consecutive retryable failures, the platform stays in _failed_platforms with paused=True so the watcher skips it but the operator can still see and resume it. Non-retryable errors still drop out of the queue immediately. Resolves #17063 (gateway giving up on Telegram after 20 attempts).
  • WhatsApp preflight (gateway/platforms/whatsapp.py): refuse to start the Node bridge when creds.json is missing. Sets a non-retryable whatsapp_not_paired fatal error so the watcher drops it cleanly with a single 'run hermes whatsapp' log line instead of paying the 30s bridge bootstrap timeout on every gateway start.
  • WhatsApp setup ordering (hermes_cli/main.py cmd_whatsapp): only set WHATSAPP_ENABLED=true once pairing actually succeeds. Previously the wizard wrote the env var at step 2 (before npm install and QR pairing), so any Ctrl+C left .env claiming WhatsApp was ready when the bridge had no creds.json. Also propagate the env var when the user keeps an existing pairing on a re-run.
  • /platform slash command (hermes_cli/commands.py + gateway/run.py): new gateway-only command for manual circuit-breaker control.

      /platform list: show connected + failed/paused platforms
      /platform pause <name>: silence a known-broken platform
      /platform resume <name>: re-queue a paused platform

  Tests:

  • New: pause/resume helpers, /platform list|pause|resume command, WhatsApp creds.json preflight, WhatsApp setup ordering.
  • Updated: stale assertions that codified the old 'exit and let systemd restart' behavior in test_runner_fatal_adapter.py, test_runner_startup_failures.py, and test_platform_reconnect.py (the 20-attempt give-up test became a circuit-breaker pause test).

  5488 tests pass in tests/gateway/.

* docs(hermes_tools_mcp_server): align scope docstring with EXPOSED_TOOLS (#26603)

  The top-of-file scope docstring listed delegate_task, memory, and session_search as exposed tools, but EXPOSED_TOOLS deliberately omits them (they're _AGENT_LOOP_TOOLS and require the running AIAgent context to dispatch; the inline comment block already explains this). Kanban tools, which ARE exposed, were missing from the docstring entirely.

  Rewrite the Scope / DO NOT expose sections to match the actual tuple: drop delegate_task/memory/session_search from 'expose', add the kanban_* family, move delegate_task/memory/session_search/todo into 'DO NOT expose' with the agent-loop rationale.
  Fixes #26567 (doc-only fix; option 2, shimming memory/session_search through MemoryStore/SessionDB directly, left for a follow-up issue once the plugin-memory locking story is audited).

* ci(pypi): build web dashboard + TUI bundle before creating wheel

* feat(banner): check PyPI for updates when not a git install

  For pip-installed hermes-agent (no .git directory), fall back to querying PyPI's JSON API to compare __version__ against the latest published release, using stdlib only (urllib + json, no packaging dep).

* feat(install): add --ensure and --postinstall modes for targeted dep bootstrap

  Adds --ensure DEPS for pip-runtime dep installation and --postinstall for pip users who want the full post-install experience without cloning.

* fix(doctor): generate config from defaults when template file is missing

  When cli-config.yaml.example is not present (e.g. pip wheel install), fall back to writing DEFAULT_CONFIG via save_config() instead of warning and requiring a manual fix.

* fix(gateway): build service PATH from existing dirs only, include ~/.hermes/node_modules

  Extract PATH building into _build_service_path_dirs() that skips directories which don't exist on disk (e.g. node_modules/.bin for pip installs) and also includes ~/.hermes/node/bin and ~/.hermes/node_modules/.bin for agent-browser.

* feat(tui): find bundled entry.js from wheel before falling back to npm build

  Add _find_bundled_tui() that checks for hermes_cli/tui_dist/entry.js (present in wheel installs) and wire it into _make_tui_argv() between the HERMES_TUI_DIR prebuilt path and the npm install fallback.

* feat(config): detect pip install method and recommend correct update command

  Adds detect_install_method() to identify nixos/homebrew/git/pip installs, and recommended_update_command_for_method() to return the right upgrade command for each method. Updates recommended_update_command() to use these for pip-installed instances (no .git dir, not managed).

* feat(update): support pip install --upgrade for PyPI installs

  When .git is absent and detect_install_method returns "pip", fork hermes update to run `uv pip install --upgrade hermes-agent` (or `python -m pip install --upgrade hermes-agent` as fallback) instead of hard-exiting with "Not a git repository".

* chore(config): expand ensure_hermes_home to create full directory scaffold

  Match the full set of subdirs created by install.sh: pairing, hooks, image_cache, audio_cache, and skills are now pre-created alongside the existing cron, sessions, logs, logs/curator, and memories dirs. This makes hermes doctor checks cleaner without changing any runtime behaviour.

* feat: add ensure_dependency() wrapper + ship install.sh in wheel

  Includes paired change: browser tool now searches ~/.hermes/node_modules/.bin/ for agent-browser installed via install.sh --ensure browser.

* refactor: fix review findings - remove duplicate imports and deduplicate update command

  - banner.py: remove redundant `import json as _json` (json already at module level)
  - main.py: _cmd_update_pip now delegates to recommended_update_command_for_method instead of duplicating the uv-vs-pip detection logic
  - main.py: remove redundant `import subprocess as _sp` (subprocess already at module level)

* fix(update): handle --check for pip installs (missed code path)

  _cmd_update_check() had its own `.git` gate separate from _cmd_update_impl. For pip installs, fork to _check_via_pypi() and display the result with the correct recommended_update_command().

* chore(ci): pin actions/setup-node to SHA for supply-chain consistency

* feat: wire ensure_dependency into TUI and browser tool call sites

  Before: missing node → hard exit; missing browser → FileNotFoundError. After: both try ensure_dependency() first, which prompts interactively and delegates installation to install.sh --ensure.

  ripgrep and ffmpeg already degrade gracefully (grep fallback, skip conversion) so they don't need wiring.

  Also documents the design rationale in dep_ensure.py: detection and prompting live in Python (portable, instant, UX-integrated); only the actual installation delegates to install.sh (1900 lines of battle-tested OS/package-manager logic).

* chore: gitignore hermes_cli/scripts/ (bundled at wheel build time)

* feat: add `hermes postinstall` command for pip users

  One-shot bootstrap that installs non-Python deps (node, browser, ripgrep, ffmpeg) via ensure_dependency(), then runs setup if no provider is configured. Closes the gap between `pip install` and the full user-facing experience.

  Also fixes 3 pre-existing test regressions caused by earlier commits:

  - test_recommended_update_command: mock detect_install_method for git env
  - test_check_for_updates_no_git_dir: now falls back to PyPI, not None
  - test_plist_path_includes_node_modules_bin: skip when dir absent

* docs: add pip install path to installation, quickstart, updating, and CLI reference

  Document pip install hermes-agent as a first-class install option. Clarify that PyPI releases track tagged versions (major/minor), not every commit on main; git installer is for bleeding-edge.

* refactor: DRY cleanup from code review

  - dep_ensure.py: use get_hermes_home() instead of hand-rolled env var
  - dep_ensure.py: add "chrome" to browser name list (was inconsistent with browser_tool.py)
  - main.py _cmd_update_check: use detect_install_method() directly instead of redundant .git check
  - main.py _cmd_update_pip: build command list directly instead of fragile split() on display string
  - banner.py: rename _check_via_pypi → check_via_pypi (cross-module public API)

* docs: add hermes postinstall to installation + quickstart, fix update --check description

  - installation.md: add tip about `hermes postinstall` for upfront dep install
  - quickstart.md: show `hermes postinstall` in pip install flow
  - updating.md: fix --check description to mention PyPI path for pip installs

* docs(xai): link OAuth-over-SSH guide from xAI provider surfaces (#26610)

  Follow-up to #26592. The new docs/guides/oauth-over-ssh.md page was linked from the two SSH-specific sections of the xAI Grok OAuth guide but was missing from the surfaces a user is more likely to hit first:

  - guides/xai-grok-oauth.md 'See Also': add the SSH guide at the top with a short qualifier so remote users notice it before clicking through.
  - integrations/providers.md xAI Grok OAuth callout: append the SSH guide link alongside the existing xAI OAuth guide link.
  - user-guide/configuration.md xai-oauth tip: same.

  Docs build: zero warnings on touched files.

* ci: reject PRs with no common ancestor on main (#26611)

  Catches the failure mode that produced #25045: a contributor PR whose branch had been disconnected from main's history (likely an accidental 'git checkout --orphan' or '.git/' re-init). GitHub's merge UI does not refuse merges of unrelated histories, so the PR landed cleanly with its intended one-file change but its parent-less root commit (413990c94) got grafted into main as a second root. The merge resolution itself was correct (main's content won for every conflicting file), but ~1500 files' worth of git blame collapsed onto that single commit.

  Implementation: 'git merge-base origin/main HEAD' exits non-zero and prints nothing when the two commits share no ancestor. Check both conditions and fail with a clear message + recovery steps.

  Verified: against the historic state of PR #25045 (base 5d90386ba, head 1149e75db), 'git merge-base' returns empty with exit 1, so the new check would have rejected it.

* feat(skills/notion): overhaul for Notion Developer Platform (May 2026) (#26612)

* feat(skills/notion): overhaul for Notion Developer Platform (May 2026)

  Notion shipped its Developer Platform on May 13, 2026: ntn CLI, Workers, Markdown API, bidirectional webhooks, agent tools. The existing skill only covered curl + integration token CRUD, so it didn't surface any of the new ergonomics, particularly the /markdown endpoints (much easier for agents to consume) and the ntn CLI for headless API + Workers management.

  This rewrite (v1.0.0 -> v2.0.0):

  - Splits setup into Path A (HTTP, cross-platform incl. Windows), Path B (ntn CLI on macOS/Linux, with NOTION_API_TOKEN env var for headless), and Path C (Windows fallback: HTTP API or WSL2; native ntn is 'coming soon').
  - Keeps the full curl reference (still the only Windows-compatible path).
  - Adds /markdown endpoints: GET and PATCH page-as-markdown, plus POST /v1/pages with a markdown body param. Agent-friendly, no CLI required.
  - Adds ntn CLI cheat sheet for raw API shorthand, file uploads, and workspace flags.
  - Adds Notion Workers section: scaffold, tool/webhook capability shapes, lifecycle commands. Gated on Business/Enterprise plans + macOS/Linux.
  - Adds Notion-flavored Markdown reference (callouts, toggles, columns, mentions, colors) for the /markdown endpoints.
  - Adds a 'choose the right path' decision table at the bottom.
  - Notes the new efficient Notion MCP server as an optional wiring path.
Auto-generated docs page regenerated via website/scripts/generate-skill-docs.py. * docs(skills-catalog): update notion description for v2.0.0 * fix(delegate): move heartbeat thread start inside try block to prevent orphan _heartbeat_thread.start() was called before the try/finally block that contains _heartbeat_stop.set(). If _register_subagent() or any code between .start() and try: raised an exception, the finally block would never run — leaving the heartbeat thread as an orphan that continues calling _touch_activity() on the parent agent, incorrectly resetting gateway timeout counters. Move _heartbeat_thread.start() to be the first statement inside the try block so the finally block always reaches _heartbeat_stop.set() regardless of how the child run completes or fails. Root cause: heartbeat start outside try/finally scope Impact: orphan heartbeat thread incorrectly resets parent gateway timeouts * fix(delegate): guard heartbeat join against unstarted thread Pairs with the prior commit (start() now inside the try block). If threading.Thread.start() itself raises (OS thread exhaustion under heavy delegation fanout), the finally would call .join() on a never-started thread, which raises RuntimeError("cannot join thread before it is started") — trading one rare bug for another. Thread.ident is None until start() succeeds, so gate the join on it. * fix(memory): eliminate TOCTOU race in Windows file lock creation On Windows (msvcrt path), _file_lock() first checked if the lock file existed and wrote it with write_text(), then opened it with open('r+'). Between these two calls, another process could delete the file causing open('r+') to raise FileNotFoundError — uncaught, leaving memory writes to proceed without holding the lock, risking data corruption. Replace the three-line sequence with a single open('a+', ...) call which atomically creates the file if missing or opens it if it exists, closing the TOCTOU window entirely. 
The existing fd.seek(0) before msvcrt.locking() is preserved and sufficient for correct lock byte positioning.

Root cause: TOCTOU between lock_path.write_text() and open('r+')
Impact: concurrent memory writes on Windows could corrupt MEMORY.md

* fix(windows): stop spamming cwd-missing + tirith-spawn warnings on every terminal call

Two log-spam fixes surfaced by a Windows user (Git Bash + Python 3.11.9):

1. LocalEnvironment cwd warn spam
============================

Git Bash's `pwd -P` emits paths like `/c/Users/x`. The base-class `_extract_cwd_from_output` was assigning this verbatim to `self.cwd` without validation, then `_resolve_safe_cwd`'s `os.path.isdir(/c/...)` returned False on Windows, triggering:

    LocalEnvironment cwd '/c/Users/NVIDIA' is missing on disk; falling back to '/' so terminal commands keep working.

...on every terminal call. The pre-existing Windows-path translation inside `_run_bash` ran AFTER the safe-cwd check, so it could never prevent the warning.

Fix:
- New `_msys_to_windows_path` helper (idempotent, no-op off Windows).
- `_resolve_safe_cwd` normalizes before `isdir`, so a valid MSYS path is recognized as the real directory it points at.
- `LocalEnvironment._update_cwd` and a new override of `_extract_cwd_from_output` translate + validate before mutating `self.cwd`. Stale / non-existent marker paths roll back to the previous cwd instead of clobbering it.
- The fallback warning still fires when the directory really is gone (deletion-recovery scenario from #17558 still covered).

2. tirith spawn-failed warn spam
=============================

When tirith isn't installed (background install in flight, or marked failed for the day) and the configured path stays as the bare string `tirith`, every `subprocess.run([tirith_path, ...])` raised OSError and logged:

    tirith spawn failed: [WinError 2] The system cannot find the file specified

...on every command. fail_open=True means behaviour is correct, but the log noise is severe.
Fix:
- `_warn_once(key, ...)` thread-safe dedupe helper.
- Three hot-path warnings (`tirith path resolved to None`, `tirith spawn failed: ...`, `tirith timed out after Ns`) now log once per (exception class, errno) / timeout-value / path-none key.
- Dedupe set is cleared on `_clear_install_failed` so a successful install lets a subsequent failure surface again.

Tests
=====
- `tests/tools/test_local_env_windows_msys.py`: 12 tests covering the MSYS→Windows translator, the resolve fast-path, update_cwd validation, and extract_cwd_from_output rollback.
- `tests/tools/test_tirith_security.py`: 4 new dedupe tests (15 spawn failures → 1 log line; distinct exc types → 2 lines; timeout dedupe; path-None dedupe).

Targeted runs:

    test_local_env_windows_msys.py          12 passed
    test_local_env_cwd_recovery.py          7 passed (pre-existing, no regressions)
    test_tirith_security.py                 67 passed (63 pre-existing + 4 new)
    test_base_environment + local_*         37 passed (no regressions)
    test_local_env_blocklist + neighbours   114 passed

Reported via Hermes log capture: 19× cwd warnings + 15× tirith warnings in a single short session.

* fix(xai-oauth): recover from prelude SSE errors, gate reasoning replay, surface entitlement 403s (#26644)

Three fixes for the May 2026 xAI OAuth (SuperGrok / X Premium) rollout failures:

- _run_codex_stream: when openai SDK raises RuntimeError("Expected to have received `response.created` before `<type>`"), retry once then fall back to responses.create(stream=True), the same path used for the missing-response.completed postlude. Fallback surfaces the real provider error with body+status_code intact. Also fixes #8133 (response.in_progress prelude on custom relays) and #14634 (codex.rate_limits prelude on codex-lb).
- _summarize_api_error: when error body matches xAI's entitlement shape, append a one-line hint pointing to https://grok.com and /model. Once-only, applies to both auxiliary warnings and main-loop error surfacing.
- _chat_messages_to_responses_input: new is_xai_responses kwarg drops replayed codex_reasoning_items (encrypted_content) before they reach xAI. Also drops reasoning.encrypted_content from the xAI include array. Native Codex behavior unchanged. Grok still reasons natively each turn; coherence rides on visible message text alone. Closes #8133, #14634. * feat(deepseek): add thinking.type + reasoning_effort mapping for DeepSeek API DeepSeek's thinking mode requires both: - extra_body.thinking.type: "enabled" to activate thinking mode - top-level reasoning_effort: "max" or "high" to control depth Previously, the ChatCompletionsTransport only handled Kimi's thinking mode — DeepSeek was left unmapped, so reasoning_effort config was silently dropped. This patch: 1. Adds is_deepseek: bool to the Params dataclass, detected by base_url matching api.deepseek.com 2. Maps Hermes effort levels (xhigh/max → "max", low/medium/high → themselves) to the top-level reasoning_effort parameter 3. Sets extra_body.thinking.type alongside the effort 4. Strips reasoning_content from assistant messages sent back to DeepSeek, preventing 400 errors when thinking was enabled * fix(deepseek): wire thinking-mode via DeepSeekProfile, not legacy fallback The cherry-picked PR #15251 from @tw2818 correctly identified the DeepSeek 400 root cause but placed the fix in the legacy fallback path of `build_kwargs`, which DeepSeek never reaches — DeepSeek has a registered ProviderProfile and goes through `_build_kwargs_from_profile` instead. The legacy-path block was therefore dead code. This commit pivots the fix to where it actually fires: - New `DeepSeekProfile` in `plugins/model-providers/deepseek/__init__.py` overrides `build_api_kwargs_extras` to emit DeepSeek's expected wire format (mirrors `KimiProfile`): {"reasoning_effort": "<low|medium|high|max>", "extra_body": {"thinking": {"type": "enabled" | "disabled"}}} - Model gating: only `deepseek-v4-*` and `deepseek-reasoner` emit thinking control. 
  `deepseek-chat` (V3) is untouched, keeping current behavior.
- Effort mapping: low/medium/high passthrough, xhigh/max → max, unset → omitted (DeepSeek server applies its own default).
- Revert the legacy-path additions from PR #15251: they were dead code, and the `_copy_reasoning_content_for_api` strip block specifically would have nullified the existing reasoning_content padding machinery (`_needs_deepseek_tool_reasoning` → space-pad on replay) that the active provider already relies on for replay correctness.
- Unit tests pin the wire-shape contract and the model gating rules (26 tests, all passing). Existing transport + provider profile suites (321 tests) continue to pass.
- AUTHOR_MAP: map twebefy@gmail.com → tw2818 for release notes credit.

Closes #15700, #17212, #17825.

Co-authored-by: tw2818 <twebefy@gmail.com>

* feat(docs): show per-skill pages in the left sidebar (#26646)

Individual skill pages (e.g. /docs/user-guide/skills/bundled/productivity/notion) had no sidebar rendered because the sidebar config only listed the two catalog index pages. That was an intentional choice from an earlier 'too many entries would drown product docs' concern, but the effect is that a user landing on any skill page (via search, share link, or the catalog table) loses navigation entirely and can't see related skills.

Wire build_sidebar_items() (which was already computed and discarded) back into the sidebar. Structure:

    Skills
    ├── Bundled skills catalog    (catalog table, was already there)
    ├── Optional skills catalog   (catalog table, was already there)
    ├── Bundled
    │   ├── apple/
    │   │   ├── apple-apple-notes
    │   │   └── ...
    │   └── ... (one collapsed category per skill category)
    └── Optional
        └── ... (same)

Categories are collapsed by default so the top-level Skills entry doesn't explode visually. Users browsing one skill see siblings in the same category; the catalogs remain the at-a-glance entry point.
Also includes drift the regen script naturally produces on top of current main:
- creative-comfyui v5.0.0 → v5.1.0 page (author + new ref file)
- devops-kanban-worker SKILL.md updates
- new pages for optional skills that lacked generated docs: hyperliquid, finance-stocks, software-development/rest-graphql-debug
- updated optional-skills-catalog row for those

Validation:
- npx docusaurus build (en locale) succeeded (only pre-existing warnings)
- inspected built productivity-notion/index.html: sidebar tree present, sibling productivity skills (airtable, linear, etc.) all linked

* fix(xai-oauth): break entitlement-403 credential-refresh loop, bump grok-4.3 context to 1M (#26664)

Don Piedro's 18-minute hang on grok-4.3 traced to two issues PR #26644 didn't cover:

- _recover_with_credential_pool classifies 403 as FailoverReason.auth and calls pool.try_refresh_current(). For xAI OAuth on an unsubscribed account, refresh succeeds (mints a new token from the same account) but the next API call 403s with the same entitlement error. Result: infinite refresh → retry → 403 loop until Ctrl+C (1133s in Don's log). New _is_entitlement_failure(error_context, status_code) detects the subscription-shape body ("do not have an active Grok subscription" / "out of available resources" + grok / "does not have permission" + grok) and short-circuits recovery so _summarize_api_error surfaces PR #26644's friendly hint.
- grok-4.3 resolved to 256k via the grok-4 catch-all in DEFAULT_CONTEXT_LENGTHS. Per docs.x.ai/developers/models/grok-4.3 the model ships with 1M context. Add explicit grok-4.3 entry before the grok-4 fallback (longest-first substring matching ensures grok-4.3 and grok-4.3-latest both land on the new value).

Tests: 8 new (23 total in test_codex_xai_oauth_recovery.py). E2E verified Don's 100-iteration loop bails out with 0 refresh calls while genuine auth failures still refresh once and recover.
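
A minimal sketch of the two #26664 ideas above, for readers who want to see the shape of the checks. It is not the code in run_agent.py: the function names, the matching rules beyond the three quoted phrases, the refresh gate, and the token counts standing in for "256k" and "1M" are all assumptions made for illustration.

```python
from typing import Optional

# Phrases quoted in the commit message above; how they are combined is an assumption.
_SUBSCRIPTION_MARKER = "do not have an active grok subscription"
_GROK_SCOPED_MARKERS = ("out of available resources", "does not have permission")


def is_entitlement_failure(error_body: Optional[str], status_code: Optional[int]) -> bool:
    """Return True when a 403 body looks like xAI's entitlement rejection."""
    if status_code != 403 or not error_body:
        return False
    body = error_body.lower()
    if _SUBSCRIPTION_MARKER in body:
        return True
    # The other two phrases are generic, so also require "grok" in the body.
    return "grok" in body and any(marker in body for marker in _GROK_SCOPED_MARKERS)


def should_refresh_credentials(error_body: Optional[str], status_code: Optional[int]) -> bool:
    """Hypothetical recovery gate: refresh only when a new token could help."""
    if is_entitlement_failure(error_body, status_code):
        return False  # a fresh token from the same account would 403 again
    return status_code in (401, 403)


# Illustrative table only; the numbers are placeholders for "256k" and "1M".
CONTEXT_LENGTHS = {"grok-4": 256_000, "grok-4.3": 1_000_000}


def resolve_context_length(model: str, default: int = 128_000) -> int:
    """Longest-first substring match so 'grok-4.3' wins over the 'grok-4' catch-all."""
    name = model.lower()
    for key in sorted(CONTEXT_LENGTHS, key=len, reverse=True):
        if key in name:
            return CONTEXT_LENGTHS[key]
    return default  # hypothetical fallback for unknown models
```

Sorting keys by length at lookup time means the table's insertion order stops mattering, which is one way to get the "explicit entry before the fallback" behaviour without relying on dict ordering.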
* fix(xai-oauth): rewrite entitlement-403 hint to not accuse subscribers (#26666)

PR #26644 confidently told users "xAI OAuth account lacks SuperGrok / X Premium entitlement" on any 403 from xAI's permission-denied surface. But that body is returned for at least four distinct causes that Hermes cannot distinguish from the wire:

- Account has no Grok subscription at all
- Account has SuperGrok but the tier doesn't include the requested model (e.g. grok-4.3 needs SuperGrok Heavy)
- Monthly quota for the subscribed tier is exhausted
- SuperGrok is active but the API access add-on isn't enabled

Don Piedro pushed back that he IS subscribed yet still hit this. Picking the worst-case interpretation ("you're not subscribed") reads as wrong and insulting to subscribers, and points them at a fix they already did.

New wording lists all 4 possibilities and points at https://grok.com/?_s=usage where the user can check which applies. The detection logic and credential-pool short-circuit (PR #26664) are unchanged; only the user-facing wording is rephrased.

* fix(xai-oauth): lead entitlement-403 hint with X Premium+ gotcha (#26672)

The #1 confusing cause of the xAI 403 (per Teknium): X Premium+ subscribers see Grok inside the X app and assume API access is included. It is NOT: only standalone SuperGrok subscribers can use xai-oauth with Hermes today. Without calling this out, every Premium+ user hits the 403 with no idea why.

PR #26666's neutral 4-cause list was correct but buried the most common cause. Lead with the Premium+ gotcha, then list the other possibilities (no subscription, wrong tier, exhausted quota) as fallbacks. Same neutral framing: it does not accuse anyone of being unsubscribed.

* fix(tui): keep DECSTBM scroll region off bottom row (#26683)

Avoid shifting the terminal's last visible row in the alt-screen DECSTBM fast path, which can leave transient scroll bleed/discoloration artifacts around the status lane until a repaint.
Add regression tests to preserve the fast path when safe and skip it when the hint touches the bottom row. * fix(tui): handle timeout/error subagent statuses in /agents (#26687) Accept delegation timeout/error statuses in the TUI subagent model, normalize unknown status strings defensively, and harden /agents overlay rendering/sorting so unknown statuses cannot crash glyph/color lookup. Add regression tests for live event normalization and disk snapshot replay. * fix(tui): width-aware markdown table rendering with vertical fallback (#26195) * refactor(tui): thread cols through Md/StreamingMd/renderTable, update cache key * feat(tui): three-tier width calc + full-line string rendering in renderTable Replaces the old renderTable (L203-244) with: - Empty table guard - Ragged row normalization - Three-tier column width calculation (ideal → proportional shrink → hard scale) - Rounding remainder distribution - Full-line string rendering (one <Text> per row, not per cell) - wrap=truncate-end on all table lines - All cells rendered as plain text via stripInlineMarkup No wrapping or vertical fallback yet — those come in Phase 3 and 4. 
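
The three-tier width calculation named above (ideal → proportional shrink → hard scale, with rounding-remainder distribution) can be sketched roughly as follows. The real renderTable lives in the TypeScript TUI; this Python version only illustrates the idea, and the MIN_COL_WIDTH value, the tier boundaries, and the remainder ordering are assumptions rather than the shipped logic.

```python
from typing import List

MIN_COL_WIDTH = 3  # assumed floor for tier 2; the real constant may differ


def column_widths(ideal: List[int], available: int) -> List[int]:
    """Pick per-column widths that try to fit `available` cells."""
    if not ideal:
        return []  # empty table guard
    total = sum(ideal)

    # Tier 1: everything fits at its ideal width.
    if total <= available:
        return list(ideal)

    # Tier 2: keep a small floor per column and share the rest proportionally.
    floors = [min(w, MIN_COL_WIDTH) for w in ideal]
    if sum(floors) <= available:
        spare = available - sum(floors)
        shrinkable = [w - f for w, f in zip(ideal, floors)]
        shrink_total = sum(shrinkable)  # > 0 here because total > available
        widths = [f + s * spare // shrink_total for f, s in zip(floors, shrinkable)]
        # Integer division drops a few cells; hand them back to the widest columns.
        remainder = available - sum(widths)
        for i in sorted(range(len(widths)), key=lambda i: shrinkable[i], reverse=True):
            if remainder == 0:
                break
            if widths[i] < ideal[i]:
                widths[i] += 1
                remainder -= 1
        return widths

    # Tier 3: hard scale. The clamp to 1 can overshoot the budget on tiny terminals.
    return [max(1, w * available // total) for w in ideal]
```

For example, `column_widths([10, 20, 30], 30)` takes the tier-2 path and returns widths that sum to exactly 30, while a budget of 6 falls through to the hard-scale tier.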
* feat(tui): wrapCell with grapheme-safe hard-break + multi-line row rendering Adds: - Intl.Segmenter-based grapheme splitting (fallback to [...word]) - wrapCell() for width-correct word wrapping on stripped text - Multi-line row rendering with LineEntry metadata (header/separator/body) - Post-render safety condition (maxLineWidth computed, vertical fallback in Task 4) - Non-wrapping path preserved for tables that fit at ideal widths * feat(tui): vertical key-value fallback with scaled threshold + safety check Wires: - Scaled row-height threshold (numCols<=3: 8, <=6: 5, else: 4) - Post-render safety check (maxLineWidth > available space) - Header-only edge case - Vertical format: bold headers, stripped cell text, clamped separator width - Iterates headers (not rows) for consistent key-value fields on ragged rows * test(tui): pass cols to Md in test helpers, add width-overflow assertions - renderAtWidth now passes cols={columns} to <Md> so width-aware code paths are exercised in tests - tableFuzz: every rendered line must fit within allocated width (stringWidth) - tableRepro: separator regex updated to match truncation ellipsis - stringWidth imported from @hermes/ink for CJK-correct assertions * fix(tui): address adversarial review — comment tier 3 budget overshoot, eliminate redundant wrapCell - Add comment on Tier 3 MIN_COL_WIDTH clamp exceeding budget (self-heals via safetyOverflow) - Track tallestBodyRow during allEntries build pass instead of re-wrapping every cell in a second traversal (eliminates O(cells) of redundant stripInlineMarkup+stringWidth) * fix(tui): pass cols to recursive fenced-markdown Md, fix test frame extraction - Thread cols into <Md> for fenced markdown blocks (L734) so nested tables use the width-aware renderer instead of max-content path - Fix renderAtWidth helpers to extract final Ink repaint frame instead of concatenating all intermediate frames (REPAINT_RE split) - Add fenced-markdown-table fixture to tableFuzz (exercises the nested 
path) * chore: remove repro test suites and tmux driver script These were scaffolding for development/reproduction — not needed in the PR. * remove pip installation method from docs * fix(dashboard): clarify Kanban Ready vs assignment Ready column help and fallbacks now describe dependency-ready work; show a badge on unassigned ready cards and fix the stale unassigned tooltip. Align localized Ready help strings with the new semantics. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(auxiliary): resolve xai oauth compression from pool * fix(tui): allow transcript scroll + Esc during approval/clarify/confirm prompts (#26414) When an approval / clarify / confirm overlay was active, the global input handler in useInputHandlers returned for every key that wasn't Ctrl+C, which silently disabled transcript scrolling. On long threads the context the prompt was asking about often lived above the visible viewport, and being unable to scroll while answering felt like the prompt had locked the UI. ApprovalPrompt also had no Esc handler at all, so the one obvious 'abort' key did nothing during a permission prompt and the user had to memorize Ctrl+C or hunt for the deny number. Fixes: - Extract shouldFallThroughForScroll(key) (pure, exported) covering wheel scrolls, PageUp/PageDown, and Shift+ArrowUp/Down. When a prompt overlay is up and the pressed key is a scroll input, skip the early return so it reaches the existing wheel/PageUp/Shift+arrow handlers below. Plain arrows still drive in-prompt selection — they don't fall through. - ApprovalPrompt now maps Esc to onChoice('deny'), parity with the global Ctrl+C cancellation path that already invokes cancelOverlayFromCtrlC() for approvals. The bottom-of-prompt hint now advertises 'Esc/Ctrl+C deny'. - Extract approvalAction(ch, key, sel) — pure key-dispatch helper for the approval prompt, exported so the regression matrix (Esc, numbers, Enter, arrows, edge clamping, precedence) is testable without mounting Ink. 
Tests: - useInputHandlers.test.ts: 6 cases covering shouldFallThroughForScroll positives (wheel/PageUp/PageDown/Shift+arrows) and negatives (plain arrows, bare shift, no scroll key). - approvalAction.test.ts: 8 cases covering Esc→deny, numeric mapping, Enter, ↑↓ within bounds, edge clamping, Esc-beats-others precedence, unrelated keystrokes. * fix(docs): unique sidebar keys for duplicate skill categories (#26726) The per-skill sidebar tree from PR #26646 emitted category entries with only a label. Docusaurus derives translation keys from the label (sidebar.docs.category.<label>), and categories that exist in both Bundled and Optional (productivity, mcp, mlops, research, email, software-development, dogfood) collided on identical keys — failing i18n extraction and the Deploy Site build. Result: source had the sidebar fix but no per-skill page rendered with a sidebar in production. Add a 'key: skills-<source>-<category>' attribute to each generated category dict so Bundled vs Optional get distinct translation keys. Regenerated sidebars.ts via the script. Local docusaurus build passes. * fix(windows): silence tirith-unavailable banner + skip install/spawn attempts on unsupported platforms (#26718) Tirith ships no Windows binary, so on every Windows CLI startup users saw a scary 'tirith security scanner enabled but not available' banner they could not act on. The banner suggested degraded security; in reality pattern-matching guards still run and the message was pure noise. Fix: - New public is_platform_supported() helper in tools/tirith_security.py that returns False when _detect_target() doesn't resolve (Windows, any non-x86_64/aarch64 arch). 
- ensure_installed(), _resolve_tirith_path(), and check_command_security() short-circuit on unsupported platforms: cache _resolved_path = _INSTALL_FAILED with reason 'unsupported_platform', skip PATH probes, skip the background download thread, skip the disk failure marker, and return allow with an empty summary from check_command_security so the spawn loop never fires. - Explicit user-configured tirith_path is still honored everywhere (a user who built tirith themselves under WSL keeps that path). - CLI banner in cli.py gated on is_platform_supported() — fires only on platforms where tirith *should* work but isn't installed. - Docs note tirith's supported-platform list and point Windows users at WSL. Tests: tests/tools/test_tirith_security.py +8 tests covering Linux x86_64, Darwin arm64, Windows, and unknown-arch verdicts plus the silent ensure_installed / check_command_security / _resolve_tirith_path fast-paths and the explicit-path override. test_tirith_security.py 75 passed (8 new + 67 pre-existing) test_command_guards.py 19 passed * fix(dashboard): align Ukrainian Kanban Ready column help Mirrors the dependency-ready / assign-profile semantics used in other locales; Copilot review noted uk.ts was still on the old dispatcher-tick wording. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cli): tolerate unreadable dirs when building systemd PATH generate_systemd_unit runs _build_service_path_dirs(); tests that mimic sudo (Path.home → /root) caused is_dir() to raise PermissionError for unprivileged users on /root/.hermes/..., failing CI. Treat inaccessible paths like missing. Co-authored-by: Cursor <cursoragent@cursor.com> * Revert "fix(cli): tolerate unreadable dirs when building systemd PATH" This reverts commit 965610f922be5b2afb6fa412205077486734a433. * feat(skill): darwinian-evolver optional skill Thin wrapper around Imbue's darwinian_evolver (AGPL-3.0, subprocess-only). 
Ships a working OpenRouter driver (parrot_openrouter.py), a snapshot inspector (show_snapshot.py), and a custom-problem template. SKILL.md has 58-char description, Pitfalls sourced from actually running the loop: non-viable seed trap, Azure content filter killing runs, loop.run() being a generator, nested-pickle snapshots, and aggressive default concurrency. Salvaged from #12719 by @Bihruze — original PR shipped 12,289 LOC across 61 files (29 Python modules, FastAPI dashboard, VS Code extension, benchmark hub, marketplace, etc.) which was far beyond the scope of the underlying issue (#336). This version stays at the ~700-LOC scope that issue actually asked for. Authorship of the original effort credited via AUTHOR_MAP entry and the SKILL.md author field. Verified end-to-end: seed 'Say {{ phrase }}' (score 0.000) evolved into 'Please repeat the following phrase exactly as it is, without any modifications or additional formatting: {{ phrase }}' (score 0.750) across 3 iterations on gpt-4o-mini via OpenRouter. Co-authored-by: Bihruze <98262967+Bihruze@users.noreply.github.com> * chore(skills/darwinian-evolver): AUTHOR_MAP + docs regen * fix(agent): retry malformed anthropic stream parser errors * feat(plugins): tool override flag for replacing built-in tools (closes #11049) (#26759) Plugins can now replace a built-in tool by passing override=True to ctx.register_tool(). Without it, the registry rejects any registration that would shadow an existing tool from a different toolset (unchanged default behavior). Unlocks the use case from #11049: drop-in replacement of browser/web backends without forking core. Composes with the existing pre_tool_call hook for runtime interception of any implementation. The override is audit-logged at INFO so it surfaces in agent.log. 
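
To make the override rule in #26759 concrete, here is a minimal sketch of a registry that rejects cross-toolset shadowing unless `override=True` is passed. The `ToolRegistry` class, its method signatures, and the logger name are hypothetical; the commit only states that `ctx.register_tool()` accepts `override=True`, that the default rejection is unchanged, and that overrides are logged at INFO.

```python
import logging

logger = logging.getLogger("tool_registry_sketch")  # hypothetical logger name


class ToolRegistry:
    """Toy registry: tool name -> (toolset, handler)."""

    def __init__(self):
        self._tools = {}

    def register_tool(self, name, toolset, handler, override=False):
        existing = self._tools.get(name)
        if existing is not None and existing[0] != toolset:
            if not override:
                # Default behaviour: never let another toolset shadow a tool.
                raise ValueError(
                    f"tool {name!r} already registered by toolset {existing[0]!r}"
                )
            # Audit trail so the replacement shows up in the log.
            logger.info(
                "tool %r from toolset %r overridden by toolset %r",
                name, existing[0], toolset,
            )
        self._tools[name] = (toolset, handler)

    def call(self, name, *args, **kwargs):
        _toolset, handler = self._tools[name]
        return handler(*args, **kwargs)
```

Usage under these assumptions: register a built-in `web_search`, observe that a plugin registration without the flag raises, then pass `override=True` to swap the handler in place.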
* docs: add Programmatic Integration overview (closes #360) Document the three protocols already available for driving hermes-agent from external programs — ACP, the TUI gateway JSON-RPC, and the OpenAI-compatible API server — with a 'which one should I use' guide and a Pi-style RPC command mapping table. Sidebar entry under Developer Guide -> Architecture. * feat(skills): add optional pinggy-tunnel skill Zero-install localhost tunnels over SSH via Pinggy. Covers HTTP/HTTPS, TCP, TLS, access control (basic auth / bearer / IP whitelist), header manipulation (CORS, force-HTTPS), web debugger, Pro token mode, and four composite recipes (webhook receiver, MCP server exposure, local LLM endpoint share, dev-server quick-share with one-shot password). Closes #361 * fix(tui): keep Ink displayCursor in sync with fast-echo writes so cursor stops drifting (#26717) * fix(tui): keep Ink displayCursor in sync with fast-echo writes so cursor stops drifting TextInput's fast-echo bypass writes characters directly to stdout to avoid waiting on a React re-render for each keystroke. The hardware cursor advances by text.length cells, but Ink's cached `displayCursor` (the basis for the next frame's relative cursor-move preamble in log-update) stayed unchanged. When ANY unrelated component re-rendered between the fast-echo write and the deferred composer setCur/setParent flush — status bar timer, streaming reasoning, etc. — the next frame's preamble emitted a relative cursor move from a stale parked position and the hardware cursor parked N cells offset from the actual caret. Visible symptom: extra whitespace between the just-typed character and the cursor block, intermittent, worse on long sessions during streaming. Alt-screen was immune because frames begin with absolute CSI H. 
This adds a small API in @hermes/ink:

- `Ink.noteExternalCursorAdvance(dx, dy?)`: bumps displayCursor if set, otherwise seeds from frontFrame.cursor so the next preamble's relative move correctly cancels the external advance. No-op on alt-screen.
- `CursorAdvanceContext` + `useCursorAdvance()` hook to expose it.

TextInput then calls `noteCursorAdvance(text.length)` after the fast-echo `stdout.write(text)` append, and `noteCursorAdvance(-1)` after the fast-backspace `\b \b` sequence.

Tests: 4 new vitest cases pin the API contract (bumps when set, seeds from frontFrame.cursor when null, alt-screen no-op, zero-delta no-op). All 751 ui-tui tests pass; tests/test_tui_gateway_server.py (177) pass.

* fix(tui): also advance cursorDeclaration so fast-echo survives deferred React state

Copilot review on PR #26717 flagged a gap in the original fix: TextInput's fast-echo path defers the React `cur` state update by 16ms (perf optimization that batches re-renders during heavy typing). Inside that window, `useDeclaredCursor` still publishes a target computed from the PRE-keystroke `cur`, i.e. `cursorLayout(display, cur, columns)`. Advancing only `displayCursor` would let any unrelated re-render in that 16ms window run onRender's cursor-park branch with the stale declaration and visually undo the fast-echo's advance.

The fix is symmetric: `noteExternalCursorAdvance` now bumps BOTH `displayCursor` (the log-update relative-move basis) AND, if non-null, `cursorDeclaration.relativeX/Y` (the target the cursor parks at after every frame). When React finally flushes `setCur`, `useDeclaredCursor` publishes a fresh declaration that supersedes our bumped one, which is exactly what we want.

Adds two new vitest cases covering both halves:

- active declaration advances in lock-step with displayCursor
- null declaration stays null (no spurious bump)

All 753 ui-tui tests pass; tests/test_tui_gateway_server.py (177) pass.

Closes review threads:
PRRT_kwDOPRF1G86ChKtD (textInput.tsx:1016 fast-echo append)
PRRT_kwDOPRF1G86ChKtF (textInput.tsx:924 fast-backspace)
PRRT_kwDOPRF1G86ChKtG (ink-cursor-advance.test.ts:57 missing coverage)

* fix(tui): make fast-echo survive TextInput rerenders + alt-screen (Copilot round 2)

Round 2 of PR #26717 review. Three real holes Copilot flagged after the initial cursorDeclaration bump:

1. alt-screen early-return skipped BOTH halves of the notifier. But the default TUI wraps the composer in <AlternateScreen>, and that IS the production path. CSI H resets log-update's relative-move basis, but the alt-screen park branch uses absolute CUP = `rect.x + decl.relativeX`, so a stale declaration there still parks the cursor at the pre-keystroke caret. Fix: skip ONLY the displayCursor half on alt-screen; still bump cursorDeclaration.

2. TextInput's own rerender could clobber the Ink-level bump. The fast-echo path defers setCur by 16ms; if a parent state change rerenders TextInput in that window, the layout effect inside useDeclaredCursor reads the stale React `cur` state and re-publishes a declaration at the OLD column. Fix: `cursorLayout(display, curRef.current, columns)`, which reads the always-up-to-date ref, not the deferred state. useMemo dropped (compute is cheap, single-line wrap-text in the common case).

3. Tests bypassed the production wiring. Added two structural tests:
   - `still advances cursorDeclaration on alt-screen` in the Ink-level suite, asserting displayCursor stays put but the declaration advances by the delta.
   - `textInputCursorSourceOfTruth.test.ts` pins three structural invariants: layout reads curRef.current, never the bare `cur` state, and the fast-echo stdout.write calls remain paired with noteCursorAdvance(±N). Source-grep invariants > flaky Ink mount tests for this kind of regression.

757/757 ui-tui tests pass (+3 over round 1). type-check clean. lint introduces zero new errors on touched files. tests/test_tui_gateway_server.py (177) pass.

Closes review threads:
PRRT_kwDOPRF1G86ChOG2 (ink.tsx alt-screen guard)
PRRT_kwDOPRF1G86ChOG9 (textInput.tsx fast-backspace rerender window)
PRRT_kwDOPRF1G86ChOHC (textInput.tsx fast-append rerender window)
PRRT_kwDOPRF1G86ChOHJ (alt-screen test asserts wrong invariant)
PRRT_kwDOPRF1G86ChOHP (missing integration-style coverage)

* fix(tui): reject fast-backspace at soft-wrap boundary (Copilot round 3)

PR #26717 round 3. Copilot caught two real things:

1. `\b \b` cannot move the terminal cursor onto the previous visual row across a soft-wrap boundary. When the caret sits at visual column 0 of a wrapped row (e.g. value 'hello ' at width 6 → cursorLayout produces (line 1, col 0)), backspace would leave the physical cursor in place while the logical caret moves up to the end of the previous visual line. `noteCursorAdvance(-1)` would then feed Ink a wrong delta. Fix: `canFastBackspaceShape` now takes the composer width and rejects when `cursorLayout(value, cursor, columns).column === 0`. The fast path falls through to the normal Ink render, which correctly lays out the new caret position.

The PR-description inconsistency about alt-screen is fixed in a separate gh pr edit.

Adds 4 new tests in textInputFastEcho.test.ts pinning the rejection at exact-multiple wrap boundaries plus a positive control inside a wrapped line and a back-compat case where `columns` is omitted.

761/761 ui-tui tests pass. type-check / lint clean. 177/177 Python tests/test_tui_gateway_server.py pass.

Closes review threads:
PRRT_kwDOPRF1G86ChxE5 (textInput.tsx:933 wrap-boundary regression)

* fix(tui): polish doc + tests after Copilot round 4

Three polish points Copilot raised:

1. canFastBackspaceShape doc comment overstated the legacy contract: it said it conservatively rejects potential wrap boundaries when columns is omitted, but the implementation actually skips the wrap-boundary check entirely. Reworded to make the legacy behavior explicit and warn callers not to rely on protection they don't get.

2. ink-cursor-advance.test.ts rationale comment for the 'advances cursorDeclaration in lock-step' case still referenced the pre-fix `cursorLayout(display, cur, columns)` expression. Now accurately describes the current source of truth (`curRef.current` in textInput.tsx) and explains the window the bump is bridging.

3. Removed the three `__get*ForTest` accessors from Ink. The test file already cast the instance to inspect private state in the couple of tests that needed declaration mutation; the rest now use a small `peek(ink)` helper that does the same cast for reads. No test-only API surface ships in production.

761/761 ui-tui tests pass. type-check clean. lint introduces zero new errors on touched files. 177/177 tests/test_tui_gateway_server.py pass.

Closes review threads:
PRRT_kwDOPRF1G86Ch23W (canFastBackspaceShape doc accuracy)
PRRT_kwDOPRF1G86Ch23f (stale test rationale)
PRRT_kwDOPRF1G86Ch23p (test-only API surface in production)

* fix(tui): tighten doc + add dy test coverage (Copilot round 5)

Two polish points from round 5:

1. canFastBackspaceShape doc had two paragraphs that conflicted: the main 'Additionally rejects when the physical cursor sits at visual column 0' was stated unconditionally, then the columns-param paragraph qualified that it only happens when columns is passed. Reworked into clear 'When supplied / When omitted' branches with a concrete example value ('hello ' returns true without columns even though it would be unsafe at width 6). No more inconsistency.

2. Added a test asserting cursorDeclaration.relativeY advances when dy is non-zero. Existing tests exercised dy on displayCursor onl…

NousResearch#26666)

PR NousResearch#26644 confidently told users "xAI OAuth account lacks SuperGrok / X Premium entitlement" on any 403 from xAI's permission-denied surface. But that body is returned for at least four distinct causes that Hermes cannot distinguish from the wire:

* Account has no Grok subscription at all
* Account has SuperGrok but the tier doesn't include the requested model (e.g. grok-4.3 needs SuperGrok Heavy)
* Monthly quota for the subscribed tier is exhausted
* SuperGrok is active but the API access add-on isn't enabled

Don Piedro pushed back that he IS subscribed yet still hit this. Picking the worst-case interpretation ("you're not subscribed") reads as wrong and insulting to subscribers, and points them at a fix they already did.

New wording lists all 4 possibilities and points at https://grok.com/?_s=usage where the user can check which applies. The detection logic and credential-pool short-circuit (PR NousResearch#26664) are unchanged; only the user-facing wording is rephrased.
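The idempotency point is easy to miss: because xAI's own 403 body already contains the usage URL, checking for the URL cannot tell whether the hint was appended. A minimal sketch of the idea follows. The function name, constant names, and separator are assumptions for illustration and are not copied from `run_agent.py`; the hint wording follows the "After" example in the PR description.

```python
# Hypothetical sketch, not the real run_agent.py code.
XAI_USAGE_URL = "https://grok.com/?_s=usage"

# Phrase that only our hint contains. xAI's own 403 body may already include
# the usage URL, so the URL is not a safe "already decorated" marker.
HINT_MARKER = "xAI rejected the request on this OAuth account"

HINT = (
    f"{HINT_MARKER}. Could be a missing subscription, a tier that doesn't "
    "include this model, an exhausted quota, or API access not enabled. "
    f"Check {XAI_USAGE_URL} to see which, or run `/model` to switch providers."
)


def decorate_entitlement_error(message: str) -> str:
    """Append the neutral hint once; leave an already-decorated message alone."""
    if HINT_MARKER in message:
        return message
    return f"{message} - {HINT}"


if __name__ == "__main__":
    raw = f"HTTP 403: permission denied, see {XAI_USAGE_URL}"
    print(decorate_entitlement_error(raw))
```

Calling the function twice on the same message returns the same string, even when the raw body already mentions the usage URL.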

…sResearch#26672)

The #1 confusing cause of the xAI 403 (per Teknium): X Premium+ subscribers see Grok inside the X app and assume API access is included. It is NOT: only standalone SuperGrok subscribers can use xai-oauth with Hermes today. Without calling this out, every Premium+ user hits the 403 with no idea why.

PR NousResearch#26666's neutral 4-cause list was correct but buried the most common cause. Lead with the Premium+ gotcha, then list the other possibilities (no subscription, wrong tier, exhausted quota) as fallbacks. Same neutral framing: does not accuse anyone of being unsubscribed.
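The reordering described above can be sketched as a hint builder that states the Premium+ case first and the remaining causes as fallbacks. The exact wording and the names `PREMIUM_PLUS_NOTE`, `FALLBACK_CAUSES`, and `build_hint` are assumptions for illustration; the merged hint text in #26672 may differ.

```python
# Hypothetical sketch: wording and helper names are illustrative assumptions.
PREMIUM_PLUS_NOTE = (
    "X Premium+ does not include API access; only a standalone SuperGrok "
    "subscription works with xai-oauth today"
)

# Remaining causes, listed after the most common one.
FALLBACK_CAUSES = (
    "no Grok subscription",
    "a tier that doesn't include this model",
    "an exhausted quota",
)


def build_hint(usage_url: str = "https://grok.com/?_s=usage") -> str:
    """Build a neutral hint that leads with the Premium+ case."""
    fallbacks = ", ".join(FALLBACK_CAUSES[:-1]) + f", or {FALLBACK_CAUSES[-1]}"
    return (
        "xAI rejected the request on this OAuth account. "
        f"Most common cause: {PREMIUM_PLUS_NOTE}. "
        f"Other possibilities: {fallbacks}. "
        f"Check {usage_url} to see which applies, "
        "or run `/model` to switch providers."
    )


if __name__ == "__main__":
    print(build_hint())
```

Keeping the causes in a tuple makes the ordering explicit and easy to assert on in a test, in the same spirit as the test that ensures the hint never says "lacks subscription".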

teknium1 merged commit 9818b9a into main May 16, 2026
16 of 17 checks passed

teknium1 deleted the hermes/hermes-cfe77b12 branch May 16, 2026 00:15

teknium1 mentioned this pull request May 16, 2026

fix(xai-oauth): lead entitlement-403 hint with X Premium+ gotcha #26672

Merged

alt-glitch added the type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), provider/xai (xAI (Grok)), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels May 16, 2026

Haderach-Ram mentioned this pull request May 16, 2026

Ecosystem Digest — 2026-05-16 Haderach-Ram/openclaw-radar#9

Open

alt-glitch mentioned this pull request May 17, 2026

High quota burn with xAI Grok-4.3 OAuth on SuperGrok $30 plan #27228

Closed

teknium1 mentioned this pull request May 17, 2026

chore: rebase #25968 (xai-oauth) on latest main + fix review issues #27148

Closed

teknium1 merged 1 commit into main from hermes/hermes-cfe77b12

github-actions Bot commented May 16, 2026

🔎 Lint report: hermes/hermes-cfe77b12 vs origin/main

ruff

ty (type checker)
