fix: align threading docstring with implementation #27
Code Review — Related Products, Also-Bought, and Cart Upsell🐛 Bugs
…down docs: update all markdown to v0.18 state
…UX (NousResearch#27) feat: Mobile Optimization — iOS Safari fixes, dashboard polish, chat UX
Summary
Logging additions (end-to-end accept_flow)
Supabase views (anon-friendly)
Verification
Dashboard (anon-only, static)
Rationale
Risks / Mitigations
Roll-back plan
Update: removed synthetic call_key in app model (ReservationCall.callKey now disabled as before). Dashboard updated to fall back to session_id when call_key is absent. Structured logging and Supabase views continue to work; timelines are stitchable by session_id. This keeps payloads/back-compat unchanged while retaining end-to-end visibility.
Added logging guardrails to prevent overload:
- Non-final sampling at 35%
- Per-session budget: max 20 non-final events
- 1.5s dedup window per (monitor_type, stage, call_key/session_id)
- Circuit breaker: pause logs for 60s after 2 consecutive send failures
- Trim details payload keys to essentials

Final and error outcomes are always logged. This reduces write QPS and protects Supabase while keeping end-to-end visibility.
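The guardrails listed above could be combined roughly as in the following sketch. It is illustrative only: the `LogGuard` class, its method names, the order of the checks, and the choice to let final events bypass the circuit breaker are assumptions, not the app's actual code. Only the thresholds (35%, 20 events, 1.5s, 2 failures, 60s) come from the comment above.

```python
import random
import time
from collections import defaultdict


class LogGuard:
    """Hypothetical gate deciding whether a log event is sent."""

    def __init__(self, sample_rate=0.35, budget=20, dedup_s=1.5,
                 max_failures=2, pause_s=60.0,
                 clock=time.monotonic, rng=random.random):
        self.sample_rate, self.budget, self.dedup_s = sample_rate, budget, dedup_s
        self.max_failures, self.pause_s = max_failures, pause_s
        self.clock, self.rng = clock, rng
        self.sent = defaultdict(int)   # session_id -> non-final events sent
        self.last_seen = {}            # dedup key -> time of last send
        self.failures = 0              # consecutive send failures
        self.paused_until = 0.0        # circuit-breaker deadline

    def should_log(self, monitor_type, stage, session_id, call_key=None, final=False):
        if final:                      # final/error outcomes are always logged
            return True
        now = self.clock()
        if now < self.paused_until:    # circuit breaker is open
            return False
        key = (monitor_type, stage, call_key or session_id)
        last = self.last_seen.get(key)
        if last is not None and now - last < self.dedup_s:
            return False               # duplicate inside the dedup window
        if self.sent[session_id] >= self.budget:
            return False               # per-session budget exhausted
        if self.rng() >= self.sample_rate:
            return False               # dropped by sampling
        self.last_seen[key] = now
        self.sent[session_id] += 1
        return True

    def record_send(self, ok):
        """Track send results; open the breaker after repeated failures."""
        if ok:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            self.paused_until = self.clock() + self.pause_s
            self.failures = 0
```

Injecting `clock` and `rng` keeps the gate deterministic under test; a caller would invoke `record_send` after each attempted write.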
…docstring-threading fix: align threading docstring with implementation
)
When the WebUI's model picker sets the active provider to "venice" (or "crof", "bankr", "cometapi"), the agent's resolve_provider_client() looked up the slug in PROVIDER_REGISTRY, failed to find it, and raised:

"Provider 'venice' is set in config.yaml but no API key was found. Set the VENICE_API_KEY environment variable..."

VENICE_API_KEY WAS set in env; the failure was because the agent had no idea which env var or base URL to use for "venice" as a slug. PROVIDER_REGISTRY entries for the other OpenAI-compatible aggregators (Z.AI, MiniMax, Mistral, etc.) already existed; Venice/CrofAI/Bankr/CometAPI just hadn't been added in the upstream code.

Adds four api_key-auth entries mirroring the existing OpenAI-compat ones:
- venice → https://api.venice.ai/api/v1 + VENICE_API_KEY
- crof → https://crof.ai/v1 + CROF_API_KEY
- bankr → https://gateway.bankr.bot/v1 + BANKR_API_KEY
- cometapi → https://api.cometapi.com/v1 + COMETAPI_API_KEY

Companion to the WebUI side (PRs #26/#27/#29/#30/#32/#33) which made these providers discoverable in the model picker with their live /v1/models catalogs.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
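A minimal sketch of how a slug-keyed registry resolves to a base URL and API key, assuming a simple dataclass entry. The real PROVIDER_REGISTRY entry type and resolve_provider_client() signature are not shown in this commit message, so `ProviderEntry` and `resolve_provider` are hypothetical names; the slugs, URLs, and env var names are the ones listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderEntry:
    """Hypothetical registry entry: where to call and which env var holds the key."""
    base_url: str
    api_key_env: str


PROVIDER_REGISTRY = {
    "venice": ProviderEntry("https://api.venice.ai/api/v1", "VENICE_API_KEY"),
    "crof": ProviderEntry("https://crof.ai/v1", "CROF_API_KEY"),
    "bankr": ProviderEntry("https://gateway.bankr.bot/v1", "BANKR_API_KEY"),
    "cometapi": ProviderEntry("https://api.cometapi.com/v1", "COMETAPI_API_KEY"),
}


def resolve_provider(slug, env=os.environ):
    """Return (base_url, api_key) for a slug, or raise with a specific reason."""
    entry = PROVIDER_REGISTRY.get(slug)
    if entry is None:
        # An unknown slug is reported as such, not as a missing API key.
        raise KeyError(f"unknown provider slug: {slug!r}")
    api_key = env.get(entry.api_key_env)
    if not api_key:
        raise RuntimeError(f"Set the {entry.api_key_env} environment variable")
    return entry.base_url, api_key
```

Separating the "unknown slug" error from the "missing key" error avoids the misleading message described above, where the key was set but the slug had no registry entry.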
…pad MCP call — closes D-kr2-st3
Mechanical follow-on swap unlocked by K-8 (substrate `bd165eb2`) +
KR-7a's MCP client transport. Same pattern as KR-7 chain-emit:
write_scratchpad_entry body replaces `raise
ScratchpadWriteNotAvailableError()` with
`await mcp_client.invoke('kora__write_agent_scratchpad', {...})`,
returning the substrate-assigned scratchpad_entry_id. ~40 LOC across
scratchpad.py + provider.py + tools/iso_node.py + tests + docs.
Production-test posture per IsoKron PM #27:
* K-8's kora__write_agent_scratchpad handler is a
notImplementedHandler stub on substrate main; dispatch tier (task
NousResearch#395) un-stubs + bridges Layer-A wsk_* auth → Layer-B actor_kind.
* KR-8 code shape is sound; mock tests verify correctness.
* Production deploys wait on dispatch tier landing — identical
posture to KR-7 / KR-7b / KR-7a.
K-DG note: K-8's internal flow emits the substrate-canonical chain
literal `kronicle.agent_scratchpad.created` (NOT
`kora.scratchpad.entry.created` as some prior specs may suggest).
The runtime doesn't pass an event_type — the substrate emits
internally as part of the SECDEF flow. Verify-at-first-live-emit:
confirm event_log.actor_id resolves to the 0076-seeded canonical
Kora actor.
scratchpad.py:
* write_scratchpad_entry body: live call to mcp_client.invoke. K-8
contract output: {'scratchpad_entry_id': '<uuid>',
'approved_event_id': '<uuid>'} — projected to a plain str
scratchpad_entry_id return.
* BLAKE3 content_hash computed via existing
compute_scratchpad_content_hash() helper; passed to MCP tool.
* Defensive: None mcp_client raises ValueError; unexpected response
shape raises RuntimeError (same pattern as KR-7's emit_kora_event).
* mcp_client param now required (was Optional with deferred-error
default). Forward signature stays caller-stable from KR-2 ST3.
* ScratchpadWriteNotAvailableError class kept exported tagged
[kora.isokron.deprecated] for one release. Original deferred-tag
preserved inside the message for grep stability.
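The call path described for scratchpad.py can be sketched as below. This is an assumption-laden illustration, not the plugin's code: the parameter list and the argument keys `workspace_id`, `content_inline`, and `content_hash` are guesses at the elided `{...}` payload, and `FakeMcpClient` is a test stand-in. The tool name, the two guards, and the response keys come from the commit message.

```python
import asyncio


async def write_scratchpad_entry(mcp_client, workspace_id, content, content_hash):
    """Invoke the scratchpad MCP tool and return the substrate-assigned id."""
    if mcp_client is None:
        raise ValueError("mcp_client is required")
    result = await mcp_client.invoke(
        "kora__write_agent_scratchpad",
        {
            # Argument keys below are assumed for illustration.
            "workspace_id": workspace_id,
            "content_inline": content,
            "content_hash": content_hash,
        },
    )
    entry_id = result.get("scratchpad_entry_id") if isinstance(result, dict) else None
    if not isinstance(entry_id, str):
        raise RuntimeError(f"unexpected response shape: {result!r}")
    return entry_id  # plain str projection of the contract output


class FakeMcpClient:
    """Test double returning the K-8 contract output shape."""

    async def invoke(self, tool_name, args):
        return {"scratchpad_entry_id": "spe-mock-1", "approved_event_id": "evt-mock-1"}


class BadShapeClient:
    """Test double returning a response without scratchpad_entry_id."""

    async def invoke(self, tool_name, args):
        return {"unexpected": True}
```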
provider.py:_attempt_scratchpad_write:
* Stop catching ScratchpadWriteNotAvailableError (no longer raised).
* Fetch IsoKronMCPClient via self._connection.get_mcp_client();
failure logs ERROR [kora.scratchpad.write.failed] + returns.
* Catch IsoKronMCPInvocationError + defensive Exception at the
lifecycle boundary — log at ERROR so failures are operator-visible
(same posture as KR-7's chain emit). Sessions don't crash; writes
drop with a loud log line.
* Successful writes log INFO [kora.scratchpad.write] with the
substrate-assigned entry_id and invalidate the per-workspace
own-scratchpad cache so the next read re-fetches.
tools/iso_node.py:
* _handle_iso_node_create + _handle_iso_node_supersede: dropped
the deferred-envelope return path. On success return
{"ok": True, "entry_id": <substrate-uuid>}. On
IsoKronMCPInvocationError return
{"ok": False, "substrate_error": True, "tool_name": ...,
"message": ...} — structured signal for the model that maps to
substrate-level failures distinct from validation errors.
* iso_node_supersede also passes the new entry_id into the
kora.node.superseded chain event payload so audit consumers can
trace the supersession chain.
* Removed unused ScratchpadWriteNotAvailableError import.
Tests (5 new + 9 updated):
* test_scratchpad.py:
- Replaced test_write_scratchpad_entry_raises_deferred_write_error
with 4 MCP-call-path tests (happy/error/None-client/bad-shape)
+ 1 deprecation-runway test.
- _FakeProviderConnection now exposes get_mcp_client() returning a
_FakeMcpClient; sync_turn/on_memory_write tests flipped from
deferred-WARNING assertions to success-INFO + entry_id checks.
- New test_sync_turn_substrate_error_logs_at_error_session_continues
verifies the IsoKronMCPInvocationError → ERROR-log path.
* test_iso_node_tools.py:
- _FakeProviderConnection gains get_mcp_client(); new _FakeMcpClient
routes by tool_name (kora__write_agent_scratchpad → spe-mock-N;
kora__append_event → evt-mock-N; kora__read_kora_capability_row
→ minimal matrix).
- test_iso_node_create_returns_deferred_payload →
test_iso_node_create_returns_ok_envelope_with_substrate_entry_id;
new test_iso_node_create_substrate_error_surfaces_structured_envelope.
- test_iso_node_supersede_inherits_node_kind_from_original flipped
to assert success + verify the inherited node_kind in the new
content_inline header + verify the kora.node.superseded emit
payload carries the new entry_id.
* test_tool_finalize.py: round-trip tests flipped from deferred to
success-envelope assertions. _FakeProviderConnection gains
get_mcp_client().
* test_provider_end_to_end.py: _FakeMcpClient extended to handle
kora__write_agent_scratchpad (returns spe-mock-N). E2E assertions
flipped from 3 deferred-WARNINGs to 3 [kora.scratchpad.write] INFO
logs + 3 scratchpad_write_calls on the fake + 2 append_event_calls.
Assertion narrowed to NOT count the cap-matrix fetch (KR-7b adds
that; this PR stays narrow per parallel-mergeability).
BUILD_DEVIATIONS:
* D-kr2-st3-no-scratchpad-write-mcp-tool moved Open → Closed with
Rule-5 spec-quote, KR-8's call-site refactor inventory, production-
test posture, deprecation-runway note, verify-at-first-live-emit
step.
* 1 deferral remains open (D-kr3-st2 — KR-9 dispatchable).
README "Operator pitfalls":
* Deferred-surface table trimmed 2 → 1 row (D-kr3-st2 only).
* D-kr2-st3 moves to Recently-closed list.
* Individual "Scratchpad writes are currently deferred" pitfall
rewritten as "Scratchpad writes route via kora__write_agent_
scratchpad (KR-8)" with success/failure log paths + operator
grep pointer.
Local gates:
* ty check — 7,337 diagnostics, zero-delta vs KR-7 baseline.
* pytest tests/plugins/memory/ — 357/357 passing.
* Full suite via xdist (-n auto): 24,654 / 168 failed / 129 skipped.
Δ vs KR-7b (24,652/175/129): +2 passed, -7 failed. Same
tests/tools/* + tests/tui_gateway/* xdist isolation noise; none
touch isokron.
1 deferral still open pending CC#1 substrate:
* D-kr3-st2-no-relationlink-write-mcp-tool ← KR-9 (K-10 merged; PM dispatched)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pad MCP call — closes D-kr2-st3 (#16) Mechanical follow-on swap unlocked by K-8 (substrate bd165eb2) + KR-7a's MCP client transport. Same pattern as KR-7 chain-emit: write_scratchpad_entry body replaces raise ScratchpadWriteNotAvailableError with await mcp_client.invoke('kora__write_agent_scratchpad', {...}), returning the substrate-assigned scratchpad_entry_id. Production-test posture per IsoKron PM #27: K-8 handler is notImplementedHandler stub on substrate main; dispatch tier (task NousResearch#395) un-stubs + bridges Layer-A→Layer-B; KR-8 code-shape sound; mock tests verify correctness; production deploys wait on dispatch tier. K-DG note: substrate-canonical chain literal kronicle.agent_scratchpad.created (NOT kora.scratchpad.entry.created). ScratchpadWriteNotAvailableError class kept exported tagged [kora.isokron.deprecated] for one release. Defensive guards: None mcp_client → ValueError; unexpected response shape → RuntimeError. tools/iso_node.py iso_node_create + iso_node_supersede dropped deferred-envelope return path; on success returns {ok:true, entry_id}; on IsoKronMCPInvocationError returns {ok:false, substrate_error:true, tool_name, message}. 5 new + 9 updated tests; ty zero-delta; 357/357 plugins/memory tests pass. Cherry-pick rebased onto post-KR-7 + KR-7b main.
…ink MCP call, closes D-kr3-st2
Fourth + final mechanical follow-on swap unlocked by K-10 + 0083
(IsoKron PM #32). Same pattern as KR-7 (chain-emit) and KR-8
(scratchpad-write): create_relationlink body replaces `raise
RelationLinkWriteNotAvailableError()` with
`await mcp_client.invoke('kora__create_relationlink', {...})`,
returning the substrate-assigned link_id. ~30 LOC across
relationlink.py + tools/iso_link.py + tests + docs.
All three KR-3 ST2 substrate blockers are now resolved substrate-side:
* actor_kind CHECK extended to include 'kora' via 0083 (prod-applied)
* public.kora_create_relationlink SECDEF function exists
* kora.relationlink.created event literal in
event_log_event_type_check (300th literal in the set post-0083)
* kora__create_relationlink Sea MCP tool registered
Production-test posture per IsoKron PM #27:
* K-10's kora__create_relationlink handler is a
notImplementedHandler stub on substrate main; substrate-team's
dispatch tier (task NousResearch#395) un-stubs + bridges Layer-A
wsk_* auth → Layer-B actor_kind='kora'.
* KR-9 code shape is sound; mock tests verify correctness.
* Production deploys wait on dispatch tier landing; identical
posture to KR-7 / KR-7b / KR-8.
Verify-at-first-live-emit per spec § 3:
* event_log row with event_type = 'kora.relationlink.created'
* event_log.actor_id = 0076-seeded canonical Kora actor (NOT
token-UUIDv5)
* relationlink row with chain_event_id matching the emitted event +
created_by_actor_kind = 'kora'
relationlink.py:
* create_relationlink body: live call to mcp_client.invoke. K-10
contract output: {'link_id': '<uuid>', 'chain_event_id': '<uuid>'},
projected to a plain str link_id return.
* mcp_client param now required (was Optional with deferred-error
default).
* New rationale_block_id + evidence_block_ids params match K-10's
input schema. Legacy `rationale` parameter preserved for one-release
back-compat (silently dropped; superseded by rationale_block_id).
* Defensive: None mcp_client → ValueError; unexpected response shape
→ RuntimeError (mirrors KR-7/KR-8 pattern).
* RelationLinkWriteNotAvailableError class kept exported tagged
[kora.isokron.deprecated] for one release. Original three-blocker
message preserved inside the message body for grep stability.
tools/iso_link.py:_handle_iso_link_create:
* Dropped the deferred-envelope path. Success →
{"ok": True, "link_id": <substrate-uuid>}. Substrate failure →
{"ok": False, "substrate_error": True, "tool_name", "message"}, the
same structured signal shape as KR-8's iso_node_create.
* Fetches IsoKronMCPClient via get_mcp_client(); MCP-client-unavailable
returns {"ok": False, "error": ...} envelope.
* Removed unused RelationLinkWriteNotAvailableError import.
Tests (7 new + 2 updated):
* test_iso_link_tools.py:
- test_create_relationlink_raises_deferred_write_error →
test_create_relationlink_invokes_kora__create_relationlink: happy
path asserts spec-pinned arg shape (workspace_id, from/to entity_id
+ kind, link_type, evidence_block_ids=[]).
- New test_create_relationlink_propagates_mcp_invocation_error
(substrate active-edge uniqueness violation surfaces).
- New test_create_relationlink_rejects_none_mcp_client.
- New test_create_relationlink_rejects_unexpected_response_shape.
- New test_create_relationlink_passes_rationale_block_id_when_present
(optional args pass-through).
- New
test_relationlink_write_not_available_error_still_importable_post_kr9
(deprecation runway).
- test_iso_link_create_handler_returns_deferred_envelope →
_returns_ok_envelope_with_substrate_link_id: success-envelope +
verifies spec-pinned tool name in invoke recording.
- New test_iso_link_create_handler_surfaces_substrate_error_envelope:
IsoKronMCPInvocationError flips to structured envelope.
- _FakeMcpClient routes by tool_name (returns link-N for
kora__create_relationlink). _FakeProviderConnection gains
get_mcp_client().
BUILD_DEVIATIONS:
* D-kr3-st2-no-relationlink-write-mcp-tool moved Open → Closed with
Rule-5 spec-quote, call-site refactor inventory, production-test
posture, all-three-blockers-resolved-substrate-side note,
deprecation-runway note, verify-at-first-live-emit step.
README "Operator pitfalls":
* "Deferred-surface summary" table replaced with "All
BUILD_DEVIATIONS closed code-side as of KR-9 (parallel-merged with
KR-7b + KR-8)" note. All 5 deferral closures listed under Recently
closed.
* Individual "iso_link_create writes are blocked" pitfall rewritten
as "RelationLink writes route via kora__create_relationlink (KR-9)"
with success/failure envelope shapes + operator grep pointer.
Parallel-mergeability note: KR-7b + KR-8 + KR-9 modify disjoint files
(KR-7b: capability_matrix_mirror.py + provider.initialize; KR-8:
scratchpad.py + provider sync_turn/on_memory_write; KR-9:
relationlink.py + tools/iso_link.py). Merge interleaves; each PR
narrows its E2E assertions to its own surface so test fixture counts
don't fight. PM handles README + BUILD_DEVIATIONS final reconciliation
on merge order.
Local gates:
* ty check: 7,337 diagnostics, zero-delta vs KR-7 baseline.
* pytest tests/plugins/memory/: 357/357 passing.
* Full suite via xdist (-n auto): 24,629 / 181 failed / 12 errors /
129 skipped. Same tests/tools/* xdist isolation noise as documented
across prior PRs; none touch isokron.
After KR-9 + KR-7b + KR-8 merge, all four BUILD_DEVIATIONS are closed
code-side. Standing-by state becomes dispatch-tier-gated (waiting on
substrate-team task NousResearch#395 + service-token mint for
production deploys).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the SimpleX Chat platform plugin (NousResearch#27???) with the features we had been carrying out-of-tree against the older core-modifying SimpleX integration. All additions live entirely inside ``plugins/platforms/simplex/`` (no edits to any core file) and pass the existing ``tests/gateway/test_simplex_plugin.py`` suite unchanged.

What's new

- **Group support** via ``SIMPLEX_GROUP_ALLOWED``. Comma-separated group IDs, or ``*`` for any. Omitted (default) keeps the safer behavior of ignoring group messages entirely; a bot in a group otherwise processes every member's traffic. Inbound items from groups are surfaced with ``chat_id="group:<id>"`` and ``chat_type="group"``.
- **Real inbound file/voice ingestion** via the daemon's XFTP flow. On ``rcvFileDescrReady`` we send ``/freceive`` immediately so the download starts; voice notes whose file hasn't landed yet are deferred (``_pending_file_transfers``) and replayed once ``rcvFileComplete`` fires. The previous "sleep 2s and scan ``~/Downloads``" stub was race-prone on large files. Files are classified by extension into image / audio / document and emitted as ``MessageType.PHOTO`` / ``VOICE`` / ``TEXT + document``.
- **Real outbound attachments**: ``send_image`` (including ``file://`` and ``http(s)://`` URLs with Pillow / ImageMagick conversion + inline thumbnail), ``send_image_file``, ``send_voice`` (native voice notes via ``msgContent.type="voice"``), ``send_video``, ``send_document``. All use the structured ``/_send … json`` form addressed by numeric ID so groups with non-unique display names don't get the wrong target. Agent replies may also embed ``MEDIA:<path>`` tags which the adapter strips from the text and dispatches as voice or document depending on extension.
- **Text-burst batching** (``HERMES_SIMPLEX_TEXT_BATCH_DELAY``, default 0.8s). Rapid-fire inbound text messages within the quiet window are concatenated into one ``MessageEvent`` before reaching the agent, the same pattern as Telegram's text batching. Prevents the gateway from dropping all-but-the-first message when a user pastes a multi-line prompt.
- **Auto-accept contact requests** (``SIMPLEX_AUTO_ACCEPT``, default ``true``). Incoming ``contactRequest`` events trigger ``/accept <id>``. Disable explicitly if you want to gate every new contact behind an allowlist.
- **Send / fire-and-forget split**. ``send()`` is now fire-and-forget at the WebSocket level, so text replies don't serialise behind a 30-second timeout when the daemon skips the corr-id reply. The ``_send_command`` path with corr-id tracking is reserved for media commands where we genuinely need acknowledgement. The bracket form ``@[<id>]`` / ``#[<id>]`` is used for single-line text; multi-line content drops to ``/_send … json`` so newlines survive the daemon's command parser.

Plugin manifest

``plugin.yaml`` bumps to ``1.1.0`` and surfaces the new env vars (``SIMPLEX_AUTO_ACCEPT``, ``SIMPLEX_GROUP_ALLOWED``, ``HERMES_SIMPLEX_TEXT_BATCH_DELAY``) in the ``hermes config`` UI. ``_env_enablement`` seeds matching keys into ``PlatformConfig.extra`` so env-only setups show up in ``hermes gateway status`` without instantiating the WebSocket client.

Docs

``website/docs/user-guide/messaging/simplex.md`` documents the new env vars, the group-chat addressing scheme, and the inbound/outbound attachment behaviour.

Compatibility

All 27 existing ``tests/gateway/test_simplex_plugin.py`` tests pass unchanged: ``_make_corr_id`` still tracks ``_pending_corr_ids`` with the same trim invariant, ``send()`` still produces ``@[<id>]`` / ``#[<id>]`` for single-line text, ``_standalone_send`` keeps its text-only contract, and ``register()`` retains the same kwargs.
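The ``SIMPLEX_GROUP_ALLOWED`` semantics described under "Group support" (omitted means ignore groups, ``*`` means any group, otherwise a comma-separated ID list) could be parsed roughly as follows. The function names are hypothetical, and treating an empty or whitespace-only value the same as an omitted one is an assumption.

```python
def parse_group_allowlist(raw):
    """Return None (groups ignored), "*" (any group), or a set of group IDs."""
    if raw is None or not raw.strip():
        return None  # assumed: empty behaves like omitted
    ids = {part.strip() for part in raw.split(",") if part.strip()}
    return "*" if "*" in ids else ids


def group_allowed(allowlist, group_id):
    """Decide whether an inbound group message should be processed."""
    if allowlist is None:
        return False
    return allowlist == "*" or str(group_id) in allowlist
```

A caller would typically evaluate `parse_group_allowlist(os.environ.get("SIMPLEX_GROUP_ALLOWED"))` once at startup and check `group_allowed` per inbound item.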
Fixes 12 remaining MEDIUM issues from the deep audit (19 total, 7 fixed in Round 12):

design_agent:
- NousResearch#15: add asyncio.wait_for(300s) around LLM API call to prevent infinite hangs
- NousResearch#17: replace 2x hardcoded 'claude-opus-4-8' with shared DEFAULT_MODEL constant

qa_agent / validate_agent:
- NousResearch#20,NousResearch#22,NousResearch#23: already fixed in Round 12 (verified: dynamic timeout/threshold values used)

memory.py:
- NousResearch#24: frontmatter parser uses regex r'^---$' instead of str.split('---',2), preventing false splits on content containing '---' (SQL, markdown tables)
- NousResearch#25: parse and preserve 'description' field from frontmatter in metadata, fixing write→load roundtrip data loss

profiles.py:
- NousResearch#26: ProfileConfig now frozen=True (immutable dataclass per coding standards)

deploy_agent:
- NousResearch#31: replace 2x sync subprocess.run with asyncio.create_subprocess_exec
- fix 5x .decode() → .decode('utf-8', errors='replace') for Windows CJK safety
- remove unused import subprocess

db.py:
- NousResearch#27: add class docstring explaining RLock + _unlocked pattern
- NousResearch#28: FK constraints already in DDL (verified PRAGMA foreign_keys=ON active)
- NousResearch#29: add _ensure_connection() with PRAGMA integrity_check(1) + auto-reconnect on 4 critical methods (create_task, get_task, claim_task, submit_result)
- extract _create_connection() static method for reuse by reconnect

Tests: 79 passed, 0 failed
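To illustrate the memory.py frontmatter change, here is a minimal sketch of fence detection with a line-anchored regex. The function name `split_frontmatter` and its return shape are assumptions; only the pattern r'^---$' and the contrast with str.split('---', 2) come from the commit message.

```python
import re

# A fence is a line consisting of exactly '---'.
_FENCE = re.compile(r"^---$", re.MULTILINE)


def split_frontmatter(text):
    """Return (frontmatter, body); frontmatter is None when no fences are found."""
    fences = list(_FENCE.finditer(text))
    if len(fences) < 2 or fences[0].start() != 0:
        return None, text
    front = text[fences[0].end():fences[1].start()].strip("\n")
    body = text[fences[1].end():].lstrip("\n")
    return front, body
```

With an input such as `"---\ndescription: a --- b\n---\nbody text\n"`, a naive `text.split('---', 2)` cuts the description at the inline `---`, whereas the line-anchored pattern only matches the two fence lines.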
- path-to-regexp: 0.1.12 → 0.1.13 (CVE-2026-4867, GHSA-37ch-88jc-xwx2)
- picomatch: 2.3.1 → 2.3.2 (CVE-2026-33671, GHSA-c2c7-rcm5-vvqj)
- undici: 7.23.0 → 7.27.1 (CVE-2026-1526/CVE-2026-2229/CVE-2026-1528)
- serialize-javascript: 6.0.2 → 7.0.3 (GHSA-5c6j-r48x-rmvq, via overrides)

Fixes Dependabot alerts NousResearch#37, NousResearch#34, NousResearch#33, NousResearch#32, NousResearch#29, NousResearch#27
… clamp the cap to the 65k native handle table
Root cause of the bench-suite crash (every otui mem3000/slope cell died at
~3000 lumpy fixture msgs, exit 7, ~880MB RSS — not a cgroup kill):
- @opentui/core 0.4.0 routes EVERY native object through ONE global handle
registry with 16-bit slot indices (core src/zig/handles.zig: INDEX_BITS=16,
MAX_SLOTS=65535, slot 0 reserved). Measured on this install: exactly 65,534
live handles; the next createSyntaxStyle() fails. destroy() DOES recycle
slots — exhaustion means LIVE objects.
- Every TextBufferRenderable burns THREE slots in its constructor
(TextBufferRenderable.ts:77-80: TextBuffer + TextBufferView + SyntaxStyle),
so the mount-everything transcript hits the wall at ~1,400 store rows
(~16 text renderables/row x 3 ~ 47 handles/row): "Failed to create
SyntaxStyle" (zig.ts:4554) throws out of a Solid mount effect.
- The crash was MASKED: CliRenderer's own uncaughtException handler
(handleError -> console.show()) allocates the console-overlay
OptimizedBuffer — another handle — so the handler itself threw "Failed to
create optimized buffer: WxH" and Node died with exit 7 (fatal error in
the uncaughtException handler), hiding the real error.
Why not share one SyntaxStyle (the obvious 3->2): the per-buffer style is
load-bearing — native setStyledText (text-buffer.zig) registers each chunk's
color by NAME ("chunk{i}") into the buffer's OWN style, and registration is
name-keyed-overwrite (syntax-style.zig putStyle), so a shared style would
cross-corrupt chunk colors between every styled <text>. Pooling is unsound
at our layer in core 0.4.0.
The fix, at the seams that are ours:
- boundary/nativeHandles.ts (ffiSafe.ts sibling): SyntaxStyle.create() on a
full table DEGRADES to a detached style (native handle 0) instead of
throwing — JS-side styleDefs/mergeStyles (what markdown/code chunk colors
actually use) keep working; all native calls on handle 0 are inert no-ops.
- boundary/renderer.ts: guard the process error listeners createCliRenderer
installs so an exception INSIDE the handler can never exit-7-mask the
original error again (logged honestly; original error stays the story).
- logic/store.ts: HERMES_TUI_MAX_MESSAGES clamped to a handle-safe ceiling
(1000 rows ~ 47k handles ~ 72% of the table on the realistic fixture).
The old default of 3000 was unreachable — the TUI crashed at ~1,400 rows,
before the cap ever bound. Renderable-weight-aware capping is left to #27
(virtualization) to do properly; until then the degrade shim backstops
pathological rows.
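The handle-budget arithmetic behind the clamp can be checked with a few lines. This is a sketch in Python rather than the TUI's TypeScript; the constant and function names are made up for illustration, and the figures (65,534 live handles, ~47 handles per row, 1000-row ceiling) are the ones quoted in this commit message.

```python
MAX_LIVE_HANDLES = 65_534   # measured limit quoted above
HANDLES_PER_ROW = 47        # ~16 text renderables/row x 3 slots, as quoted above
HANDLE_SAFE_MAX_ROWS = 1000 # ceiling applied to HERMES_TUI_MAX_MESSAGES


def rows_until_exhaustion(handles_per_row=HANDLES_PER_ROW):
    """How many fully mounted rows fit before the handle table is full."""
    return MAX_LIVE_HANDLES // handles_per_row


def table_share(rows, handles_per_row=HANDLES_PER_ROW):
    """Fraction of the handle table consumed by `rows` mounted rows."""
    return rows * handles_per_row / MAX_LIVE_HANDLES


def clamp_max_messages(requested, ceiling=HANDLE_SAFE_MAX_ROWS):
    """Clamp a requested message cap to the handle-safe ceiling."""
    return min(requested, ceiling)
```

Under these figures, exhaustion lands just under 1,400 rows, which matches the observed crash point, and a 1000-row cap uses roughly 72% of the table.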
TODO(upstream) — issue-shaped, for the OpenTUI repo:
(a) a global 64k handle table with a 3-slot cost per text renderable is
too small for transcript-style TUIs (61k renderables ~ 3k messages);
(b) native allocation failures throw out of the render loop with no
degrade path;
(c) handleError allocates (console overlay buffer) and so crashes on the
very condition it is reporting, masking the root cause with exit 7.
Also: eslint now ignores ui-opentui/.bench/** (bench `nodes`-cell build
artifact broke the lint gate) and .gitignore covers it.
Gate: npm run check green, 599 tests (595 baseline + 3 degrade-path tests
+ 1 cap-clamp test).
…RMES_TUI_WINDOWING
Core machinery of docs/plans/opentui-transcript-windowing.md (#27): rows
outside [scrollTop − viewport, scrollTop + 2·viewport) swap to an
exact-height empty <box> (1 yoga node, no text buffers / native handles),
so the mounted set stays ~3 viewports regardless of transcript length.
Flag: HERMES_TUI_WINDOWING (unset → ON; 0/false/no/off → OFF; envFlag
semantics, the bench A/B + one-env escape hatch). OFF renders the exact
legacy tree (no wrapper boxes).
Pieces:
- logic/window.ts (pure, table-tested): computeWindow (viewport ±
1-viewport margin intersection over cumulative exact heights; null
heights fall back to a per-row line-count estimate),
hysteresisFor/shouldRecompute (≥ ¼ viewport between recomputes),
correctionIsLegal (the jank rule: corrections only fully above the
viewport with same-frame scrollTop compensation, or fully below it),
estimateMessageHeight (line-count estimate; wrong values are fixed by
remount only, since S1 never corrects a spacer in place).
- view/transcript.tsx: per-row measuring wrapper records exact heights
via onSizeChange (only while the real row is mounted); window driver is
a renderer frame callback (setFrameCallback; scroll always renders, so
no extra timer) publishing the mounted set through one signal +
createSelector so only flipped rows re-render. Stable row keys via
WeakMap<Message, n> (messages have no id; store proxies are
reference-stable). Solid <Show> unmount destroys the row's renderables
(@opentui/solid _removeNode → destroyRecursively).
Never-window rules:
- streaming rows (remount would restart native markdown streaming),
- the last row while a turn is running (deltas land there),
- the bottom 30 rows (fixed K; sticky-bottom region; rows under
viewport+margin are mounted by the window calc anyway),
- rows the window has never adjudicated default to MOUNTED (new live
rows paint instantly),
- the whole window FREEZES while a mouse selection is active
(renderer.getSelection()?.isActive; a swap would destroy highlighted
renderables under the native selection walk).
Tests: 30 pure window.test.ts cases + 2 headless integration cases
(transcriptWindow.test.tsx) pinning the zero-jank invariant
(scrollHeight identical ON vs OFF), the renderable shedding, and
remount-on-scroll-back.
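The window calculation described for logic/window.ts can be sketched as an interval intersection over cumulative row heights. This is a Python illustration under stated assumptions, not the TypeScript implementation: a single constant `estimate` stands in for the per-row line-count estimate, and the function signatures are invented.

```python
def compute_window(heights, scroll_top, viewport, estimate=1):
    """Return (first, last) inclusive indices of rows to keep mounted, or None.

    Rows stay mounted when they intersect
    [scroll_top - viewport, scroll_top + 2 * viewport).
    """
    lo, hi = scroll_top - viewport, scroll_top + 2 * viewport
    first = last = None
    top = 0
    for i, h in enumerate(heights):
        h = estimate if h is None else h  # unmeasured rows use the estimate
        bottom = top + h
        if bottom > lo and top < hi:      # row overlaps the window band
            if first is None:
                first = i
            last = i
        top = bottom
    return None if first is None else (first, last)


def should_recompute(prev_scroll_top, scroll_top, viewport):
    """Hysteresis: recompute only after at least a quarter viewport of movement."""
    return abs(scroll_top - prev_scroll_top) >= viewport / 4
```

For 100 rows of height 10, a viewport of 50, and scrollTop 500, the band is [450, 600), so 15 rows (three viewports' worth) stay mounted.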
…dowed resume + edge measure
S2 of docs/plans/opentui-transcript-windowing.md (#27), behind
HERMES_TUI_WINDOWING (OFF path renders the byte-identical legacy tree).
Append-time adjudication: the window now recomputes on transcript
GROWTH, not just scroll. A createComputed on messages.length re-windows
synchronously per append, and while pinned at the bottom computeWindow
anchors to the cumulative content BOTTOM (pinnedBottom) instead of the
stale pre-layout scrollTop, so burst-appended rows are spacer-swapped
the moment they pass the margin. The frame driver additionally treats a
≥ ¼-viewport scrollHeight change (streaming growth) like scroll
movement.
Unseen-row default changed from "always mounted" to "mounted iff
created streaming or within the bottom-30". Live rows still paint
instantly with zero added latency; a bulk commitSnapshot (resume)
mounts ONLY the bottom window and everything above starts as
line-count-estimate spacers (chip-and-spacing-aware
estimateMessageHeight).
Spacer corrections (zero-jank rule): when a measure lands a height
different from what the spacer occupied, the wrapper's onSizeChange
fires inside the layout traversal, pre-paint. Pinned at bottom the
scrollbox's own sticky re-pin (content onSizeChange runs before the row
wrappers') already compensated (verified by test); otherwise scrollTop
is compensated same-frame for rows fully above the viewport
(correctionIsLegal). Frames stay byte-stable across corrections in both
pinned and mid-history tests.
Lazy exact-measure (design §4, the simple choice, documented): no true
offscreen layout exists in @opentui/core, so an idle pulse (no appends,
no scroll, no turn, no selection for HERMES_TUI_WINDOW_IDLE_MS≈1s)
mounts MEASURE_BATCH_ROWS=10 never-measured rows nearest the bottom
window edge (edgeMeasureBatch), records exact heights (incl. a direct
post-layout pull for rows whose mount changed nothing, where no
onSizeChange fires), and the next recompute swaps them back to
now-exact spacers. Scrolling itself still measures the margin band.
DEV counter: windowRowStats (current/peak simultaneously-mounted rows),
exposed on globalThis behind HERMES_TUI_WINDOW_STATS; tests assert it.
Measured (this build, 39f9f43+S2):
- check: exit 0 (647 tests / 39 files; +11 pure window cases, +4
headless)
- peak mounted: 31 rows over a 1500-row burst; 30 rows on a 600-row
resume snapshot (bound asserted < 120)
- gate digest: otui-capped d5e9558583159eac…, byte-identical, 2/2 reps
- mem2000 (otui-capped, windowing ON, 8GB heap): vmhwm 300MB (S1
same-heap 518MB; S1 right-sized-heap 427MB; Ink 229-239MB; target ≤
350MB)
- scroll2000 otui-capped: p50 2.0ms / p99 5.0ms / max 18ms (gate ≤
17ms p99; S1 baseline p99 15ms)
Known S2 limits (deferred to S3, design §5): /compact·/details toggles
and width resizes leave out-of-window spacer heights stale until
remount or the idle march; expanded-body state above the window may
re-collapse on remount (S1-accepted).
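The zero-jank correction rule (a spacer may change height only when it lies fully above or fully below the viewport, with scrollTop compensated in the same frame for rows above) can be sketched as follows. The signatures and the inclusive boundary comparisons are assumptions; the real correctionIsLegal lives in logic/window.ts and may differ in detail.

```python
def correction_is_legal(row_top, row_bottom, scroll_top, viewport):
    """A height correction is legal only fully above or fully below the viewport."""
    fully_above = row_bottom <= scroll_top
    fully_below = row_top >= scroll_top + viewport
    return fully_above or fully_below


def apply_correction(row_top, old_height, new_height, scroll_top, viewport):
    """Return the compensated scrollTop, or None when the correction is illegal."""
    row_bottom = row_top + old_height
    if not correction_is_legal(row_top, row_bottom, scroll_top, viewport):
        return None
    if row_bottom <= scroll_top:
        # Rows above the viewport shift the content, so scrollTop moves with them.
        return scroll_top + (new_height - old_height)
    return scroll_top  # rows below the viewport need no compensation
```

Shifting scrollTop by the same delta as the spacer keeps the visible rows at the same screen position, which is why frames can stay byte-stable across corrections.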
…payoff) With transcript windowing (S1+S2) the mounted set no longer scales with the store (peak 31 rows over a 1500-row burst), so the handle-table clamp that forced 1000 rows is unnecessary when windowing is on. The ceiling is now windowing-aware: 3000 rows (the originally-shipped default, regression documented in opentui-fixes-audit.md §2) with windowing, 1000 with HERMES_TUI_WINDOWING=0 (every row mounts again). Measured at the restored cap (full 3000-msg store): mem3000 360MB peak styled end-to-end (pre-campaign: ~870MB + unstyled past ~1,400 rows; before that: crash). scroll3000 p50=2 p90=3 p99=8 max=17ms (Ink same workload: p90=35 p99=96). Gate digest unchanged.
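A small sketch of the windowing-aware ceiling, written in Python for illustration. `env_flag` approximates the envFlag semantics stated in the windowing commits (unset → ON; 0/false/no/off → OFF); trimming whitespace and ignoring case are assumptions, as are the function names.

```python
_OFF_VALUES = {"0", "false", "no", "off"}


def env_flag(env, name, default=True):
    """Unset means `default`; 0/false/no/off means OFF; anything else means ON."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() not in _OFF_VALUES


def max_messages_ceiling(env):
    """3000 rows with windowing on, 1000 when every row mounts again."""
    return 3000 if env_flag(env, "HERMES_TUI_WINDOWING") else 1000
```

Passing the environment as a mapping (for example `os.environ`) keeps the function easy to test without mutating process state.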
…RMES_TUI_WINDOWING Core machinery of docs/plans/opentui-transcript-windowing.md (#27): rows outside [scrollTop − viewport, scrollTop + 2·viewport) swap to an exact-height empty <box> (1 yoga node, no text buffers / native handles), so the mounted set stays ~3 viewports regardless of transcript length. Flag: HERMES_TUI_WINDOWING — unset → ON; 0/false/no/off → OFF (envFlag semantics, the bench A/B + one-env escape hatch). OFF renders the exact legacy tree (no wrapper boxes). Pieces: - logic/window.ts (pure, table-tested): computeWindow (viewport ± 1-viewport margin intersection over cumulative exact heights; null heights fall back to a per-row line-count estimate), hysteresisFor/shouldRecompute (≥ ¼ viewport between recomputes), correctionIsLegal (the jank rule: corrections only fully above the viewport with same-frame scrollTop compensation, or fully below it), estimateMessageHeight (line-count estimate; wrong values are fixed by remount only — S1 never corrects a spacer in place). - view/transcript.tsx: per-row measuring wrapper records exact heights via onSizeChange (only while the real row is mounted); window driver is a renderer frame callback (setFrameCallback — scroll always renders, so no extra timer) publishing the mounted set through one signal + createSelector so only flipped rows re-render. Stable row keys via WeakMap<Message, n> (messages have no id; store proxies are reference-stable). Solid <Show> unmount destroys the row's renderables (@opentui/solid _removeNode → destroyRecursively). 
Never-window rules: - streaming rows (remount would restart native markdown streaming), - the last row while a turn is running (deltas land there), - the bottom 30 rows (fixed K — sticky-bottom region; rows under viewport+margin are mounted by the window calc anyway), - rows the window has never adjudicated default to MOUNTED (new live rows paint instantly), - the whole window FREEZES while a mouse selection is active (renderer.getSelection()?.isActive — a swap would destroy highlighted renderables under the native selection walk). Tests: 30 pure window.test.ts cases + 2 headless integration cases (transcriptWindow.test.tsx) pinning the zero-jank invariant (scrollHeight identical ON vs OFF), the renderable shedding, and remount-on-scroll-back.
…dowed resume + edge measure

S2 of docs/plans/opentui-transcript-windowing.md (#27), behind HERMES_TUI_WINDOWING (OFF path renders the byte-identical legacy tree).

Append-time adjudication: the window now recomputes on transcript GROWTH, not just scroll. A createComputed on messages.length re-windows synchronously per append, and while pinned at the bottom computeWindow anchors to the cumulative content BOTTOM (pinnedBottom) instead of the stale pre-layout scrollTop, so burst-appended rows are spacer-swapped the moment they pass the margin. The frame driver additionally treats a ≥ ¼-viewport scrollHeight change (streaming growth) like scroll movement.

Unseen-row default changed from "always mounted" to "mounted iff created streaming or within the bottom-30": live rows still paint instantly with zero added latency; a bulk commitSnapshot (resume) mounts ONLY the bottom window and everything above starts as line-count-estimate spacers (chip- and spacing-aware estimateMessageHeight).

Spacer corrections (zero-jank rule): when a measure lands a height different from what the spacer occupied, the wrapper's onSizeChange fires inside the layout traversal, pre-paint. When pinned at bottom, the scrollbox's own sticky re-pin (content onSizeChange runs before the row wrappers') has already compensated (verified by test); otherwise scrollTop is compensated same-frame for rows fully above the viewport (correctionIsLegal). Frames stay byte-stable across corrections in both pinned and mid-history tests.

Lazy exact-measure (design §4: the simple choice, documented): no true offscreen layout exists in @opentui/core, so an idle pulse (no appends, no scroll, no turn, no selection for HERMES_TUI_WINDOW_IDLE_MS≈1s) mounts MEASURE_BATCH_ROWS=10 never-measured rows nearest the bottom window edge (edgeMeasureBatch), records exact heights (incl. a direct post-layout pull for rows whose mount changed nothing, where no onSizeChange fires), and the next recompute swaps them back to now-exact spacers.
Scrolling itself still measures the margin band.

DEV counter: windowRowStats (current/peak simultaneously-mounted rows), exposed on globalThis behind HERMES_TUI_WINDOW_STATS; tests assert it.

Measured (this build, 39f9f43+S2):
- check: exit 0 (647 tests / 39 files; +11 pure window cases, +4 headless)
- peak mounted: 31 rows over a 1500-row burst; 30 rows on a 600-row resume snapshot (bound asserted < 120)
- gate digest: otui-capped d5e9558583159eac… (byte-identical, 2/2 reps)
- mem2000 (otui-capped, windowing ON, 8GB heap): vmhwm 300MB (S1 same-heap 518MB; S1 right-sized-heap 427MB; Ink 229-239MB; target ≤ 350MB)
- scroll2000 otui-capped: p50 2.0ms / p99 5.0ms / max 18ms (gate ≤ 17ms p99; S1 baseline p99 15ms)

Known S2 limits (deferred to S3, design §5): /compact·/details toggles and width resizes leave out-of-window spacer heights stale until remount or the idle march; expanded-body state above the window may re-collapse on remount (S1-accepted).
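
The spacer-correction rule from this commit (corrections are only legal fully above the viewport with same-frame scrollTop compensation, or fully below it) can be illustrated with a small sketch. It is Python for illustration only; the signatures are hypothetical, and the pinned-at-bottom case, which the commit says the scrollbox's sticky re-pin handles, is not modeled.

```python
from typing import Optional


def correction_is_legal(row_top: int, old_height: int, scroll_top: int, viewport: int) -> bool:
    """A spacer may change height only if it lies entirely outside the viewport."""
    fully_above = row_top + old_height <= scroll_top
    fully_below = row_top >= scroll_top + viewport
    return fully_above or fully_below


def apply_correction(
    row_top: int,
    old_height: int,
    new_height: int,
    scroll_top: int,
    viewport: int,
) -> Optional[int]:
    """Return the scroll_top to use after the correction, or None if it must wait."""
    if not correction_is_legal(row_top, old_height, scroll_top, viewport):
        return None  # correcting now would shift visible content
    if row_top + old_height <= scroll_top:
        # Row is above the viewport: shift scroll_top by the height delta so
        # the visible rows keep the same on-screen position.
        return scroll_top + (new_height - old_height)
    return scroll_top  # row is below the viewport: nothing visible moves
```

A spacer at offset 0 that grows from 5 to 8 lines while scrollTop is 100 moves scrollTop to 103, so the visible content does not shift; a row that straddles the viewport edge is left alone until it is remounted.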
…payoff)

With transcript windowing (S1+S2) the mounted set no longer scales with the store (peak 31 rows over a 1500-row burst), so the handle-table clamp that forced 1000 rows is unnecessary when windowing is on. The ceiling is now windowing-aware: 3000 rows (the originally-shipped default, regression documented in opentui-fixes-audit.md §2) with windowing, 1000 with HERMES_TUI_WINDOWING=0 (every row mounts again).

Measured at the restored cap (full 3000-msg store):
- mem3000: 360MB peak, styled end-to-end (pre-campaign: ~870MB + unstyled past ~1,400 rows; before that: crash).
- scroll3000: p50=2 p90=3 p99=8 max=17ms (Ink same workload: p90=35 p99=96).

Gate digest unchanged.
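
The flag semantics (unset → ON; 0/false/no/off → OFF) and the windowing-aware ceiling can be sketched as below. This is a Python illustration under stated assumptions: the helper names are hypothetical, and treating values case-insensitively, trimming whitespace, and reading any other value as ON are guesses rather than documented envFlag behavior.

```python
import os
from typing import Mapping, Optional

_OFF_VALUES = {"0", "false", "no", "off"}


def env_flag(name: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Default-on flag: only an explicit off value disables it."""
    source = os.environ if env is None else env
    raw = source.get(name)
    if raw is None:
        return True  # unset means ON
    # Assumption: comparison ignores case and surrounding whitespace.
    return raw.strip().lower() not in _OFF_VALUES


def transcript_row_cap(env: Optional[Mapping[str, str]] = None) -> int:
    """3000 rows with windowing on, 1000 when every row mounts again."""
    return 3000 if env_flag("HERMES_TUI_WINDOWING", env) else 1000
```

Passing an explicit mapping instead of reading os.environ directly keeps the helper easy to exercise in tests, e.g. `transcript_row_cap({"HERMES_TUI_WINDOWING": "0"})` yields the lower cap.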
Before this change, `ToolContext._run_tool_in_thread` docs said it uses `run_in_executor`, but the implementation actually uses a disposable `ThreadPoolExecutor` + blocking `future.result()`. This was misleading during code review and debugging because the documented execution model did not match real behavior.
This change updates the docstring and nearby comment to reflect the real threading path without changing runtime logic.
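
To make the distinction concrete, here is a minimal sketch contrasting the two execution models. It is not the project's `_run_tool_in_thread`; the function names and the `max_workers=1` setting are assumptions chosen for illustration.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def run_tool_in_thread(fn: Callable[..., Any], *args: Any) -> Any:
    """Disposable pool + blocking wait (the model the docstring now describes)."""
    # The pool exists only for this call and is shut down when the with-block exits.
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn, *args)
        return future.result()  # blocks the calling thread until fn finishes


async def run_tool_via_loop(fn: Callable[..., Any], *args: Any) -> Any:
    """Event-loop offload (what the old docstring implied)."""
    loop = asyncio.get_running_loop()
    # Uses the loop's default executor and yields to the loop while waiting.
    return await loop.run_in_executor(None, fn, *args)
```

The practical difference is that the first form blocks its caller for the whole duration of `fn`, while the second suspends the coroutine and leaves the event loop free to run other tasks, which is why a docstring naming the wrong one can mislead debugging.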