Sync upstream main into fork#2
Conversation
Expose skill usage in analytics so the dashboard and insights output can
show which skills the agent loads and manages over time.
This adds skill aggregation to the InsightsEngine by extracting
`skill_view` and `skill_manage` calls from assistant tool_calls,
computing per-skill totals, and including the results in both terminal
and gateway insights formatting. It also extends the dashboard analytics
API and Analytics page to render a Top Skills table.
Terminology is aligned with the skills docs:
- Agent Loaded = `skill_view` events
- Agent Managed = `skill_manage` actions
Architecture:
- agent/insights.py collects and aggregates per-skill usage
- hermes_cli/web_server.py exposes `skills` on `/api/analytics/usage`
- web/src/lib/api.ts adds analytics skill response types
- web/src/pages/AnalyticsPage.tsx renders the Top Skills table
- web/src/i18n/{en,zh}.ts updates user-facing labels
Tests:
- tests/agent/test_insights.py covers skill aggregation and formatting
- tests/hermes_cli/test_web_server.py covers analytics API contract
including the `skills` payload
- verified with `cd web && npm run build`
Files changed:
- agent/insights.py
- hermes_cli/web_server.py
- tests/agent/test_insights.py
- tests/hermes_cli/test_web_server.py
- web/src/i18n/en.ts
- web/src/i18n/types.ts
- web/src/i18n/zh.ts
- web/src/lib/api.ts
- web/src/pages/AnalyticsPage.tsx
…ousResearch#12416) Fixes silent data loss in the TUI when /undo, /compress, /retry, or rollback.restore runs during an in-flight agent turn. The version- guard at prompt.submit:1449 would fail the version check and silently skip writing the agent's result — UI showed the assistant reply but DB / backend history never received it, causing UI↔backend desync that persisted across session resume. Changes (tui_gateway/server.py): - session.undo, session.compress, /retry, rollback.restore (full-history only — file-scoped rollbacks still allowed): reject with 4009 when session.running is True. Users can /interrupt first. - prompt.submit: on history_version mismatch (defensive backstop), attach a 'warning' field to message.complete and log to stderr instead of silently dropping the agent's output. The UI can surface the warning to the user; the operator can spot it in logs. Tests (tests/test_tui_gateway_server.py): 6 new cases. - test_session_undo_rejects_while_running - test_session_undo_allowed_when_idle (regression guard) - test_session_compress_rejects_while_running - test_rollback_restore_rejects_full_history_while_running - test_prompt_submit_history_version_mismatch_surfaces_warning - test_prompt_submit_history_version_match_persists_normally (regression) Validated: against unpatched server.py the three 'rejects_while_running' tests fail and the version-mismatch test fails (no 'warning' field). With the fix, all 6 pass, all 33 tests in the file pass, 74 TUI tests in total pass. Live E2E against the live Python environment confirmed all 5 patches present and guards enforce 4009 exactly as designed.
Several correctness and cost-safety fixes to the Honcho dialectic path after a multi-turn investigation surfaced a chain of silent failures: - dialecticCadence default flipped 3 → 1. PR NousResearch#10619 changed this from 1 to 3 for cost, but existing installs with no explicit config silently went from per-turn dialectic to every-3-turns on upgrade. Restores pre-NousResearch#10619 behavior; 3+ remains available for cost-conscious setups. Docs + wizard + status output updated to match. - Session-start prewarm now consumed. Previously fired a .chat() on init whose result landed in HonchoSessionManager._dialectic_cache and was never read — pop_dialectic_result had zero call sites. Turn 1 paid for a duplicate synchronous dialectic. Prewarm now writes directly to the plugin's _prefetch_result via _prefetch_lock so turn 1 consumes it with no extra call. - Prewarm is now dialecticDepth-aware. A single-pass prewarm can return weak output on cold peers; the multi-pass audit/reconcile cycle is exactly the case dialecticDepth was built for. Prewarm now runs the full configured depth in the background. - Silent dialectic failure no longer burns the cadence window. _last_dialectic_turn now advances only when the result is non-empty. Empty result → next eligible turn retries immediately instead of waiting the full cadence gap. - Thread pile-up guard. queue_prefetch skips when a prior dialectic thread is still in-flight, preventing stacked races on _prefetch_result. - First-turn sync timeout is recoverable. Previously on timeout the background thread's result was stored in a dead local list. Now the thread writes into _prefetch_result under lock so the next turn picks it up. - Cadence gate applies uniformly. At cadence=1 the old "cadence > 1" guard let first-turn sync + same-turn queue_prefetch both fire. Gate now always applies. - Restored query-length reasoning-level scaling, dropped in 9a0ab34c. Scales dialecticReasoningLevel up on longer queries (+1 at ≥120 chars, +2 at ≥400), clamped at reasoningLevelCap. Two new config keys: `reasoningHeuristic` (bool, default true) and `reasoningLevelCap` (string, default "high"; previously parsed but never enforced). Respects dialecticDepthLevels and proportional lighter-early passes. - Restored short-prompt skip, dropped in ef7f315. One-word acknowledgements ("ok", "y", "thanks") and slash commands bypass both injection and dialectic fire. - Purged dead code in session.py: prefetch_dialectic, _dialectic_cache, set_dialectic_result, pop_dialectic_result — all unused after prewarm refactor. Tests: 542 passed across honcho_plugin/, agent/test_memory_provider.py, and run_agent/test_run_agent.py. New coverage: - TestTrivialPromptHeuristic (classifier + prefetch/queue skip) - TestDialecticCadenceAdvancesOnSuccess (empty-result retry, pile-up guard) - TestSessionStartDialecticPrewarm (prewarm consumed, sync fallback) - TestReasoningHeuristic (length bumps, cap clamp, interaction with depth) - TestDialecticLifecycleSmoke (end-to-end 8-turn session walk)
- Revert website/docs and SKILL.md changes; docs unification handled separately - Scrub commit/PR refs and process narration from code comments and test docstrings (no behavior change)
… multi-peer - cli: setup wizard pre-fills dialecticCadence=2 (code default stays 1 so unset → every turn) - honcho.md: fix stale dialecticCadence default in tables, add Session-Start Prewarm subsection (depth runs at init), add Query-Adaptive Reasoning Level subsection, expand Observation section with directional vs unified semantics and per-peer patterns - memory-providers.md: fix stale default, rename Multi-agent/Profiles to Multi-peer setup, add concrete walkthrough for new profiles and sync, document observation toggles + presets, link to honcho.md - SKILL.md: fix stale defaults, add Depth at session start callout
…t discard, empty-streak backoff Hardens the dialectic lifecycle against three failure modes that could leave the prefetch pipeline stuck or injecting stale content: - Stale-thread watchdog: _thread_is_live() treats any prefetch thread older than timeout × 2.0 as dead. A hung Honcho call can no longer block subsequent fires indefinitely. - Stale-result discard: pending _prefetch_result is tagged with its fire turn. prefetch() discards the result if more than cadence × 2 turns passed before a consumer read it (e.g. a run of trivial-prompt turns between fire and read). - Empty-streak backoff: consecutive empty dialectic returns widen the effective cadence (dialectic_cadence + streak, capped at cadence × 8). A healthy fire resets the streak. Prevents the plugin from hammering the backend every turn when the peer graph is cold. - liveness_snapshot() on the provider exposes current turn, last fire, pending fire-at, empty streak, effective cadence, and thread status for in-process diagnostics. - system_prompt_block: nudge the model that honcho_reasoning accepts reasoning_level minimal/low/medium/high/max per call. - hermes honcho status: surface base reasoning level, cap, and heuristic toggle so config drift is visible at a glance. Tests: 550 passed. - TestDialecticLiveness (8 tests): stale-thread recovery, stale-result discard, fresh-result retention, backoff widening, backoff ceiling, streak reset on success, streak increment on empty, snapshot shape. - Existing TestDialecticCadenceAdvancesOnSuccess::test_in_flight_thread_is_not_stacked updated to set _prefetch_thread_started_at so it tests the fresh-thread-blocks branch (stale path covered separately). - test_cli TestCmdStatus fake updated with the new config attrs surfaced in the status block.
…overage - TestDialecticDepth::test_first_turn_runs_dialectic_synchronously: covered by TestSessionStartDialecticPrewarm::test_turn1_falls_back_to_sync_when_prewarm_missing (more realistic — exercises the empty-prewarm → sync-fallback path) - TestDialecticDepth::test_first_turn_dialectic_does_not_double_fire: covered by TestDialecticLifecycleSmoke (turn 1 flow) and TestDialecticCadenceAdvancesOnSuccess::test_empty_dialectic_result_does_not_advance_cadence Both predate the prewarm refactor and test paths that are now fallback behaviors already covered elsewhere.
…wards-compat fallback
Setup wizard now always writes dialecticCadence=2 on new configs and
surfaces the reasoning level as an explicit step with all five options
(minimal / low / medium / high / max), always writing
dialecticReasoningLevel.
Code keeps a backwards-compat fallback of 1 when dialecticCadence is
unset so existing honcho.json configs that predate the setting keep
firing every turn on upgrade. New setups via the wizard get 2
explicitly; docs show 2 as the default.
Also scrubs editorial lines from code and docs ("max is reserved for
explicit tool-path selection", "Unset → every turn; wizard pre-fills 2",
and similar process-exposing phrasing) and adds an inline link to
app.honcho.dev where the server-side observation sync is mentioned in
honcho.md. Recommended cadence range updated to 1-5 across docs and
wizard copy.
The cherry-picked commit from NousResearch#11434 uses the 154585401+ prefixed noreply format. Add it alongside the existing bare entry so the contributor audit passes.
…sResearch#12369) Agents can now send arbitrary CDP commands to the browser. The tool is gated on a reachable CDP endpoint at session start — it only appears in the toolset when BROWSER_CDP_URL is set (from '/browser connect') or 'browser.cdp_url' is configured in config.yaml. Backends that don't currently expose CDP to the Python side (Camofox, default local agent-browser, cloud providers whose per-session cdp_url is not yet surfaced) do not see the tool at all. Tool schema description links to the CDP method reference at https://chromedevtools.github.io/devtools-protocol/ so the agent can web_extract specific method docs on demand. Stateless per call. Browser-level methods (Target.*, Browser.*, Storage.*) omit target_id. Page-level methods attach to the target with flatten=true and dispatch the method on the returned sessionId. Clean errors when the endpoint becomes unreachable mid-session or the URL isn't a WebSocket. Tests: 19 unit (mock CDP server + gate checks) + E2E against real headless Chrome (Target.getTargets, Browser.getVersion, Runtime.evaluate with target_id, Page.navigate + re-eval, bogus method, bogus target_id, missing endpoint) + E2E of the check_fn gate (tool hidden without CDP URL, visible with it, hidden again after unset).
…ng session (NousResearch#12441) session.interrupt on session A was blast-resolving pending clarify/sudo/secret prompts on ALL sessions sharing the same tui_gateway process. Other sessions' agent threads unblocked with empty-string answers as if the user had cancelled — silent cross-session corruption. Root cause: _pending and _answers were globals keyed by random rid with no record of the owning session. _clear_pending() iterated every entry, so the session.interrupt handler had no way to limit the release to its own sid. Fix: - tui_gateway/server.py: _pending now maps rid to (sid, Event) tuples. _clear_pending takes an optional sid argument and filters by owner_sid when provided. session.interrupt passes the calling sid so unrelated sessions are untouched. _clear_pending(None) remains the shutdown path for completeness. - _block and _respond updated to pack/unpack the new tuple format. Tests (tests/test_tui_gateway_server.py): 4 new cases. - test_interrupt_only_clears_own_session_pending: two sessions with pending prompts, interrupting one must not release the other. - test_interrupt_clears_multiple_own_pending: same-sid multi-prompt release works. - test_clear_pending_without_sid_clears_all: shutdown path preserved. - test_respond_unpacks_sid_tuple_correctly: _respond handles the tuple format. Also updated tests/tui_gateway/test_protocol.py to use the new tuple format for test_block_and_respond and test_clear_pending. Live E2E against the live Python environment confirmed cross-session isolation: interrupting sid_a released its own pending prompt without touching sid_b's. All 78 related tests pass.
…arch#12444) When Discord splits a long message at 2000 chars, _enqueue_text_event buffers each chunk and schedules a _flush_text_batch task with a short delay. If another chunk lands while the prior flush task is already inside handle_message, _enqueue_text_event calls prior_task.cancel() — and without asyncio.shield, CancelledError propagates from the flush task into handle_message → the agent's streaming request, aborting the response the user was waiting on. Reproducer: user sends a 3000-char prompt (split by Discord into 2 messages). Chunk 1 lands, flush delay starts, chunk 2 lands during the brief window when chunk 1's flush has already committed to handle_message. Agent's current streaming response is cancelled with CancelledError, user sees a truncated or missing reply. Fix (gateway/platforms/discord.py): - Wrap the handle_message call in asyncio.shield so the inner dispatch is protected from the outer task's cancel. - Add an except asyncio.CancelledError clause so the outer task still exits cleanly when cancel lands during the sleep window (before the pop) — semantics for that path are unchanged. The new flush task spawned by the follow-up chunk still handles its own batch via the normal pending-message / active-session machinery in base.py, so follow-ups are not lost. Tests: tests/gateway/test_text_batching.py — test_shield_protects_handle_message_from_cancel. Tracks a distinct first_handle_cancelled event so the assertion fails cleanly when the shield is missing (verified by stashing the fix and re-running). Live E2E on the live-loaded DiscordAdapter: first_handle_cancelled: False (shield worked) first_handle_completed: True (handle_message ran to completion)
Follow-up for the helix4u easy-fix salvage batch: - route remaining context-engine quiet-mode output through _should_emit_quiet_tool_messages() so non-CLI/library callers stay silent consistently - drop the extra senderAliases computation from WhatsApp allowlist-drop logging and remove the now-unused import This keeps the batch scoped to the intended fixes while avoiding leaked quiet-mode output and unnecessary duplicate work in the bridge.
Extends the existing cron script hook with a wake gate ported from nanoclaw NousResearch#1232. When a cron job's pre-check Python script (already sandboxed to HERMES_HOME/scripts/) writes a JSON line like ```json {"wakeAgent": false} ``` on its last stdout line, `run_job()` returns the SILENT marker and skips the agent entirely — no LLM call, no delivery, no tokens spent. Useful for frequent polls (every 1-5 min) that only need to wake the agent when something has genuinely changed. Any other script output (non-JSON, missing key, non-dict, `wakeAgent: true`, truthy/falsy non-False values) behaves as before: stdout is injected as context and the agent runs normally. Strict `False` is required to skip — avoids accidental gating from arbitrary JSON. Refactor: - New pure helper `_parse_wake_gate(script_output)` in cron/scheduler.py - `_build_job_prompt` accepts optional `prerun_script` tuple so the script runs exactly once per job (run_job runs it for the gate check, reuses the output for prompt injection) - `run_job` short-circuits with SILENT_MARKER when gate fires Script failures (success=False) still cannot trigger the gate — the failure is reported as context to the agent as before. This replaces the approach in closed PR NousResearch#3837, which inlined bash scripts via tempfile and lost the path-traversal/scripts-dir sandbox that main's impl has. The wake-gate idea (the one net-new capability) is ported on top of the existing sandboxed Python-script model. Tests: - 11 pure unit tests for _parse_wake_gate (empty, whitespace, non-JSON, non-dict JSON, missing key, truthy/falsy non-False, multi-line, trailing blanks, non-last-line JSON) - 5 integration tests for run_job wake-gate (skip returns SILENT, wake-true passes through, script-runs-only-once, script failure doesn't gate, no-script regression) - Full tests/cron/ suite: 194/194 pass
… streamed text (NousResearch#8124) When a streaming edit fails mid-stream (flood control, transport error) and a tool boundary arrives before the fallback threshold is reached, the pre-boundary tail in `_accumulated` was silently discarded by `_reset_segment_state`. The user saw a frozen partial message and missing words on the other side of the tool call. Flush the undelivered tail as a continuation message before the reset, computed relative to the last successfully-delivered prefix so we don't duplicate content the user already saw.
…esearch#12471) During gateway shutdown, a message arriving while cancel_background_tasks is mid-await (inside asyncio.gather) spawns a fresh _process_message_background task via handle_message and adds it to self._background_tasks. The original implementation's _background_tasks.clear() at the end of cancel_background_tasks dropped the reference; the task ran untracked against a disconnecting adapter, logged send-failures, and lingered until it completed on its own. Fix: wrap the cancel+gather in a bounded loop (MAX_DRAIN_ROUNDS=5). If new tasks appeared during the gather, cancel them in the next round. The .clear() at the end is preserved as a safety net for any task that appeared after MAX_DRAIN_ROUNDS — but in practice the drain stabilizes in 1-2 rounds. Tests: tests/gateway/test_cancel_background_drain.py — 3 cases. - test_cancel_background_tasks_drains_late_arrivals: spawn M1, start cancel, inject M2 during M1's shielded cleanup, verify M2 is cancelled. - test_cancel_background_tasks_handles_no_tasks: no-op path still terminates cleanly. - test_cancel_background_tasks_bounded_rounds: baseline — single task cancels in one round, loop terminates. Regression-guard validated: against the unpatched implementation, the late-arrival test fails with exactly the expected message ('task leaked'). With the fix it passes. Blast radius is shutdown-only; the audit classified this as MED. Shipping because the fix is small and the hygiene is worth it. While investigating the audit's other MEDs (busy-handler double-ack, Discord ExecApprovalView double-resolve, UpdatePromptView double-resolve), I verified all three were false positives — the check-and-set patterns have no await between them, so they're atomic on single-threaded asyncio. No fix needed for those.
…inuation (NousResearch#7183) When _send_fallback_final() is called with nothing new to deliver (the visible partial already matches final_text), the last edit may still show the cursor character because fallback mode was entered after a failed edit. Before this fix the early-return path left _already_sent = True without attempting to strip the cursor, so the message stayed frozen with a visible ▉ permanently. Adds a best-effort edit inside the empty-continuation branch to clean the cursor off the last-sent text. Harmless when fallback mode wasn't actually armed or when the cursor isn't present. If the strip edit itself fails (flood still active), we return without crashing and without corrupting _last_sent_text. Adapted from PR NousResearch#7429 onto current main — the surrounding fallback block grew the NousResearch#10807 stale-prefix handling since NousResearch#7429 was written, so the cursor strip lives in the new else-branch where we still return early. 3 unit tests covering: cursor stripped on empty continuation, no edit attempted when cursor is not configured, cursor-strip edit failure handled without crash. Originally proposed as PR NousResearch#7429.
Follow-up on top of the helix4u NousResearch#6392 cherry-pick: - reuse one helper for actionable Docker-local file-not-found errors across document/image/video/audio local-media send paths - include /outputs/... alongside /output/... in the container-local path hint - soften the gateway startup warning so it does not imply custom host-visible mounts are broken; the warning now targets the specific risky pattern of emitting container-local MEDIA paths without an explicit export mount - add focused regressions for /outputs/... and non-document media hint coverage This keeps the salvage aligned with the actual MEDIA delivery problem on current main while reducing false-positive operator messaging.
Follow-up on top of the helix4u NousResearch#12388 cherry-picks: - make deferred post-delivery callbacks generation-aware end-to-end so stale runs cannot clear callbacks registered by a fresher run for the same session - bind callback ownership to the active session event at run start and snapshot that generation inside base adapter processing so later event mutation cannot retarget cleanup - pass run_generation through proxy mode and drop stale proxy streams / final results the same way local runs are dropped - centralize stop/new interrupt cleanup into one helper and replace the open-coded branches with shared logic - unify internal control interrupt reason strings via shared constants - remove the return from base.py's finally block so cleanup no longer swallows cancellation/exception flow - add focused regressions for generation forwarding, proxy stale suppression, and newer-callback preservation This addresses all review findings from the initial NousResearch#12388 review while keeping the fix scoped to stale-output/typing-loop interrupt handling.
…ousResearch#12026) Cherry-picked from PR NousResearch#12481 by @Sanjays2402. Reasoning models (GLM-5.1, QwQ, DeepSeek R1) inflate completion_tokens with internal thinking tokens. The compression trigger summed prompt_tokens + completion_tokens, causing premature compression at ~42% actual context usage instead of the configured 50% threshold. Now uses only prompt_tokens — completion tokens don't consume context window space for the next API call. - 3 new regression tests - Added AUTHOR_MAP entry for @Sanjays2402 Closes NousResearch#12026
Cherry-picked from PR NousResearch#12252 by @sirEven. Models like GLM-5.1 via Ollama can produce malformed tool_call arguments (truncated JSON, trailing commas, Python None). The existing except Exception: pass silently passes broken args to the API, which rejects them with HTTP 400, crashing the session. Adds a multi-stage repair pipeline at the pre-send normalization point: 1. Empty/whitespace-only → {} 2. Python None literal → {} 3. Strip trailing commas 4. Auto-close unclosed brackets 5. Remove excess closing delimiters 6. Last resort: replace with {} (logged at WARNING)
Follow-up for PR NousResearch#12252 salvage: - Extract 75-line inline repair block to _repair_tool_call_arguments() module-level helper for testability and readability - Remove redundant 'import re as _re' (re already imported at line 33) - Bound the while-True excess-delimiter removal loop to 50 iterations - Add 17 tests covering all 6 repair stages - Add sirEven to AUTHOR_MAP in release.py
Prefer session_store origin over _parse_session_key() for shutdown notifications. Fixes misrouting when chat identifiers contain colons (e.g. Matrix room IDs like !room123:example.org). Falls back to session-key parsing when no persisted origin exists. Co-authored-by: Ruzzgar <ruzzgarcn@gmail.com> Ref: NousResearch#12766
…hisper (NousResearch#2544) Cherry-picked from PR NousResearch#2545 by @Mibayy. The setup wizard could leave stt.model: "whisper-1" in config.yaml. When using the local faster-whisper provider, this crashed with "Invalid model size 'whisper-1'". Voice messages were silently ignored. _normalize_local_model() now detects cloud-only names (whisper-1, gpt-4o-transcribe, etc.) and maps them to the default local model with a warning. Valid local sizes (tiny, base, small, medium, large-v3) pass through unchanged. - Renamed _normalize_local_command_model -> _normalize_local_model (backward-compat wrapper preserved) - 6 new tests including integration test - Added lowercase AUTHOR_MAP alias for @Mibayy Closes NousResearch#2544
…rd-skill-analytics feat: add skill analytics to the dashboard
Make the Ink TUI match macOS keyboard expectations: Command handles copy and common editor/session shortcuts, while Control remains reserved for interrupt/cancel flows. Update the visible hotkey help to show platform-appropriate labels.
- Fix critical regression: on Linux, Ctrl+C could not interrupt/clear/exit because isAction(key,'c') shadowed the isCtrl block (both resolve to k.ctrl on non-macOS). Restructured: isAction block now falls through to interrupt logic on non-macOS when no selection exists. - Remove double pbcopy: ink's copySelection() already calls setClipboard() which handles pbcopy+tmux+OSC52. The extra writeClipboardText call in useInputHandlers copySelection() was firing pbcopy a second time. - Remove allowClipboardHotkeys prop from TextInput — every caller passed isMac, and TextInput already imports isMac. Eliminated prop-drilling through appLayout, maskedPrompt, and prompts. - Remove dead code: the isCtrl copy paths (lines 277-288) were unreachable on any platform after the isAction block changes. - Simplify textInput Cmd+C: use writeClipboardText directly without the redundant OSC52 fallback (this path is macOS-only where pbcopy works).
…t terminals (NousResearch#11300) - branding.tsx: `color="yellow"` → `t.color.warn` so light-mode users get the burnt-orange warn instead of unreadable bright yellow on white bg. - theme.ts: replace HERMES_TUI_LIGHT regex with `detectLightMode(env)` that also sniffs `COLORFGBG` (XFCE Terminal, rxvt, Terminal.app, iTerm2). Bg slot 7 or 15 → LIGHT_THEME. Explicit HERMES_TUI_LIGHT (on *or* off) still wins. - tests: cover empty env, explicit on/off, COLORFGBG positions, and off-override.
…ousResearch#8541) StatusRule now renders `{sinceLastMsg}/{sinceSession}` (e.g. `12s/3m 45s`) when a user has submitted in the current session; falls back to the total alone otherwise. Wires `lastUserAt` through the state/session lifecycle: - useSubmission stamps `setLastUserAt(Date.now())` on send - useSessionLifecycle nulls it in reset/resetVisibleHistory - /branch slash nulls it on fork
…edge Status bar ticker was too hot in peripheral vision. The moment the elapsed value matters is when the prompt returns — so surface it there. Dim `fmtDuration` next to the GoodVibesHeart, idle-only (hidden while busy), so quick turns and active streaming stay quiet.
…mode-11300 fix(tui): theme-driven update-behind banner + auto-detect light terminals (NousResearch#11300)
…ck-paste fix: enable right click to paste
Drops `lastUserAt` plumbing and the right-edge idle ticker. Matches the
claude-code / opencode convention: elapsed rides with the busy indicator
(spinner verb), nothing at idle.
- `turnStartedAt` driven by a useEffect on `ui.busy` — stamps on rising
edge, clears on falling edge. Covers agent turns and !shell alike.
- FaceTicker renders ` · {fmtDuration}` while busy; 1 s clock for the
counter, existing 2500 ms cycle for face/verb rotation.
- On busy → idle, if the block ran ≥ 1 s, emit a one-shot
`done in {fmtDuration}` sys line (≡ claude-code's `thought for Ns`).
The transcript line was noisy. Keep the one thing the issue really needs: live elapsed next to the busy verb.
…d-lastmsg-8541 feat(tui): turn elapsed in FaceTicker + done-in sys line on turn end (NousResearch#8541)
🚨 CRITICAL Supply Chain Risk DetectedThis PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging. 🚨 CRITICAL: Install-hook file added or modifiedThese files can execute code during package installation or interpreter startup. Files: Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0a94099985
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| _proxy = _get_proxy_from_env() | ||
| return _httpx.Client( | ||
| transport=_httpx.HTTPTransport(socket_options=_sock_opts), | ||
| proxy=_proxy, |
There was a problem hiding this comment.
Honor NO_PROXY when configuring keepalive proxy client
This path converts proxy environment variables into a single explicit proxy= value on the httpx.Client. That forces all traffic through the proxy and bypasses NO_PROXY host exclusions, so users with HTTPS_PROXY set (common in corp environments) will start failing calls to local endpoints like http://127.0.0.1:11434/localhost even when they configured NO_PROXY for them. Keep env-based proxy resolution (trust_env) or implement NO_PROXY-aware routing before setting an explicit proxy URL.
Useful? React with 👍 / 👎.
| user_dir = _plugins_dir() | ||
| if user_dir.is_dir(): | ||
| if (user_dir / name).is_dir(): | ||
| return True | ||
| for child in user_dir.iterdir(): |
There was a problem hiding this comment.
Allow enabling non-directory plugins in opt-in mode
With the new plugins.enabled allowlist gate, cmd_enable is now required to activate any plugin type, but _plugin_exists() only checks user-directory plugins (and bundled dirs later) and does not cover entry-point/pip plugins or project plugins. Those plugins are therefore rejected as “not installed or bundled” and cannot be enabled via CLI, which effectively disables these supported plugin sources unless users manually edit config.
Useful? React with 👍 / 👎.
Sync upstream main into fork
…#13354) Classic-CLI /steer typed during an active agent run was queued through self._pending_input alongside ordinary user input. process_loop, which drains that queue, is blocked inside self.chat() for the entire run, so the queued command was not pulled until AFTER _agent_running had flipped back to False — at which point process_command() took the idle fallback ("No agent running; queued as next turn") and delivered the steer as an ordinary next-turn user message. From Utku's bug report on PR NousResearch#13205: mid-run /steer arrived minutes later at the end of the turn as a /queue-style message, completely defeating its purpose. Fix: add _should_handle_steer_command_inline() gating — when _agent_running is True and the user typed /steer, dispatch process_command(text) directly from the prompt_toolkit Enter handler on the UI thread instead of queueing. This mirrors the existing _should_handle_model_command_inline() pattern for /model and is safe because agent.steer() is thread-safe (uses _pending_steer_lock, no prompt_toolkit state mutation, instant return). No changes to the idle-path behavior: /steer typed with no active agent still takes the normal queue-and-drain route so the fallback "No agent running; queued as next turn" message is preserved. Validation: - 7 new unit tests in tests/cli/test_cli_steer_busy_path.py covering the detector, dispatch path, and idle-path control behavior. - All 21 existing tests in tests/run_agent/test_steer.py still pass. - Live PTY end-to-end test with real agent + real openrouter model: 22:36:22 API call #1 (model requested execute_code) 22:36:26 ENTER FIRED: agent_running=True, text='/steer ...' 22:36:26 INLINE STEER DISPATCH fired 22:36:43 agent.log: 'Delivered /steer to agent after tool batch' 22:36:44 API call #2 included the steer; response contained marker Same test on the tip of main without this fix shows the steer landing as a new user turn ~20s after the run ended.
The MCP circuit breaker previously had no path back to the closed state: once _server_error_counts[srv] reached _CIRCUIT_BREAKER_THRESHOLD the gate short-circuited every subsequent call, so the only reset path (on successful call) was unreachable. A single transient 3-failure blip (bad network, server restart, expired token) permanently disabled every tool on that MCP server for the rest of the agent session. Introduce a classic closed/open/half-open state machine: - Track a per-server breaker-open timestamp in _server_breaker_opened_at alongside the existing failure count. - Add _CIRCUIT_BREAKER_COOLDOWN_SEC (60s). Once the count reaches threshold, calls short-circuit for the cooldown window. - After the cooldown elapses, the *next* call falls through as a half-open probe that actually hits the session. Success resets the breaker via _reset_server_error; failure re-bumps the count via _bump_server_error, which re-stamps the open timestamp and re-arms the cooldown. The error message now includes the live failure count and an "Auto-retry available in ~Ns" hint so the model knows the breaker will self-heal rather than giving up on the tool for the whole session. Covers tests 1 (half-opens after cooldown) and 2 (reopens on probe failure); test 3 (cleared on reconnect) still fails pending fix #2.
Summary
Notes
Validation
29 passed in 0.27s