feat(kora): KR-REASONING-PANEL — Kora reasoning activity lens (stub) by rafe-walker · Pull Request #132 · rafe-walker/kora

rafe-walker · 2026-05-23T00:17:28Z

Summary

Operator-facing view of Kora's recent ReasoningEngine calls — model used, tokens spent, cost-ladder rung at call time, errors. Pairs with CC#3's KR-FEAT-AI-RESPONSE-LOOP ST2 (in flight). Once ST2 extends slack_dm_log.jsonl outbound entries with the reasoning fields, a small follow-on bucket flips this stub to read from there.

Shape is pinned by the new test suite so the FE shipping in this PR keeps rendering during the cut-over.

What's in here

Backend stub: GET /api/reasoning/recent returning 4 representative calls deliberately spanning ok @ normal on opus + ok @ warn_75 on sonnet (cost-downshift in action) + halted @ hard_stop_100 (budget-locked refusal) + sdk_timeout failure — surfaces the full cost-ladder behaviour + error taxonomy at first look.
ReasoningPanel.tsx (new): title + stub banner + 4-column stats strip (total calls / token spend input → output / model distribution mini-bar / status distribution) + filter pills (all / ok / failed / halted) + newest-first timeline. Per row: status icon, model badge (tier-tinted — opus blue / sonnet purple / haiku gray / null=halted red), cost-rung pill color-mapped per spec (normal gray, warn_75 yellow, downshift_90 orange, hard_stop_100 red), duration with visual bar (warning ≥ 2s), input → output token counts, status badge (non-default only — happy-path rows stay clean), error_code chip, response excerpt (truncated to 80 chars collapsed; expandable to the 200-char API cap).
Dashboard card #12 (Brain icon). Layout keeps lg:grid-cols-3; row 2 grows to 3+3+3+3, the clean 4-row × 3-col closure of the layout. Headline goes destructive when halted > 0 in 24h (Kora was budget-locked; investigate cost rung), warning when failed > 0, and foreground otherwise.
Route /reasoning + nav entry between /agent-activity and /slack-dm.
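The headline rule for the dashboard card can be restated as a small Python sketch. This is not the shipped code (the card lives in the TypeScript FE); `STUB_CALLS`, `count_by_status`, and `headline_tone` are hypothetical names, and the rung and model on the failed row are placeholders because the PR does not state them.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping

# Illustrative stub rows. The status / rung / model pairings for the first
# three rows come from the PR text; the failed row's rung and model are
# placeholders chosen only so the example is complete.
STUB_CALLS = [
    {"status": "ok", "cost_rung": "normal", "model_used": "opus"},
    {"status": "ok", "cost_rung": "warn_75", "model_used": "sonnet"},
    {"status": "halted", "cost_rung": "hard_stop_100", "model_used": None},
    {"status": "failed", "cost_rung": "normal", "model_used": "opus",
     "error_code": "sdk_timeout"},
]


def count_by_status(calls: Iterable[Dict[str, Any]]) -> Counter:
    """Tally calls per status, in the spirit of by_status_24h."""
    return Counter(call["status"] for call in calls)


def headline_tone(by_status_24h: Mapping[str, int]) -> str:
    """Halted outranks failed; anything else keeps the default tone."""
    if by_status_24h.get("halted", 0) > 0:
        return "destructive"
    if by_status_24h.get("failed", 0) > 0:
        return "warning"
    return "foreground"
```

With the four stub rows, `headline_tone(count_by_status(STUB_CALLS))` evaluates to `"destructive"`, which is the tone the manual smoke step expects to see while the stub is live.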

4-layer security contract

Extending the established pattern with reasoning-specific guards:

response_text_truncated_200 rendered as PLAIN TEXT via React's default child escaping. Real responses may contain anything Kora generates (HTML / markdown / script fragments). FE pins via dangerouslySetInnerHTML grep.
NO Anthropic key shapes (sk-ant- prefix + base64-like body) anywhere in payload. Walk-payload regex catches a future error-projection bug or log-entry edit that leaks credential material into the operator's view.
NO PII (email regex / Slack user-ID regex) leaked from the inbound user's message into response_text_truncated_200 — Kora's generated text is operator-visible; user message content lives in SLACK-DM-PANEL, not here. Belt+braces walk-payload sweep covers any field, not just response_text.
TS interface declares all fields with documented contracts; no raw_prompt / auth_token / response_html companion fields exist on ReasoningCall.
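Guards 2 and 3 both rely on a recursive sweep over every string in the payload. A minimal sketch of such a walk-payload check follows; the three regexes and the names `walk_strings` and `find_leaks` are assumptions for illustration, since the real patterns in the test suite are not shown in this PR.

```python
import re
from typing import Any, Iterator, List, Tuple

# Assumed patterns, not the suite's actual regexes.
PATTERNS = {
    "anthropic_key": re.compile(r"sk-ant-[A-Za-z0-9_\-]{20,}"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "slack_user_id": re.compile(r"\b[UW][A-Z0-9]{8,}\b"),
}


def walk_strings(node: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (path, text) for every string value nested inside node."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from walk_strings(value, f"{path}.{key}")
    elif isinstance(node, (list, tuple)):
        for index, item in enumerate(node):
            yield from walk_strings(item, f"{path}[{index}]")


def find_leaks(payload: Any) -> List[Tuple[str, str]]:
    """Return (path, pattern_name) for each string that matches a guard."""
    hits = []
    for path, text in walk_strings(payload):
        for name, pattern in PATTERNS.items():
            if pattern.search(text):
                hits.append((path, name))
    return hits
```

Sweeping every field rather than only `response_text_truncated_200` is what makes the check catch a leak introduced by a future field or an error-projection change.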

K-DG drift caught + handled

Per memory feedback_k_dg_substrate_field_names_in_specs:

Bucket §3(a) example payload used uppercase Enum NAMES (NORMAL / WARN_75 / DOWNSHIFT_90 / HARD_STOP_100), the same convention CC#3 cited in PR #126's K-DG drift note.
BUT the canonical wire format per agent/cost_state_holder.py:114-117 is the lowercase Enum VALUES (NORMAL = "normal" etc), and engine.py:47-49's CostLadderRungName literal enforces lowercase.
Real CC#3 data will emit lowercase via .value. Stub uses lowercase to match — so stub and real data agree at flip time, no FE pill-color map breakage at the cutover. Backend test pins lowercase and flags this as the intentional spec divergence.
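The NAME versus VALUE distinction is easy to see in a few lines of Python. The enum below is a hypothetical stand-in for the one in agent/cost_state_holder.py (only `NORMAL = "normal"` is quoted in this PR; the other members are inferred from the rung names listed above), and `PILL_COLOR` mirrors the FE color map only for illustration.

```python
from enum import Enum
from typing import Literal, get_args


class CostLadderRung(Enum):
    # Stand-in enum: uppercase NAMES, lowercase VALUES.
    NORMAL = "normal"
    WARN_75 = "warn_75"
    DOWNSHIFT_90 = "downshift_90"
    HARD_STOP_100 = "hard_stop_100"


# Stand-in for the lowercase-only literal the PR attributes to engine.py.
CostLadderRungName = Literal["normal", "warn_75", "downshift_90", "hard_stop_100"]

# Illustrative copy of the pill colors; keyed by the lowercase wire value.
PILL_COLOR = {
    "normal": "gray",
    "warn_75": "yellow",
    "downshift_90": "orange",
    "hard_stop_100": "red",
}


def wire_rung(rung: CostLadderRung) -> str:
    """Serialize via .value so the payload carries the lowercase form."""
    return rung.value


def is_valid_wire_rung(text: str) -> bool:
    """True only for the lowercase values the literal allows."""
    return text in get_args(CostLadderRungName)
```

A payload built from `.name` would send `"WARN_75"`, which fails `is_valid_wire_rung` and has no entry in `PILL_COLOR`; that is the breakage the lowercase pin is meant to prevent at cutover.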

Test plan

tests/kora_cli/test_web_server_reasoning.py — 16 tests: shape, 4-call stub pin, status+rung+error diversity, per-entry schema with full ReasoningEngine error taxonomy validation, cost-rung lowercase pin (K-DG catch), all 4 security guards (Anthropic key walk-payload + PII walk-payload both per-field and whole-blob), FE source-pins (no dangerouslySetInnerHTML, response_text as JSX child), 200-char API-edge cap, by_status_24h reconciliation, tokens_total shape, cron-regression sanity.
Full admin-panel regression: 287/287 across 24 suites (with --extra dev only; slowapi is now a runtime dependency per PR #128).
pnpm tsc -b clean.
pnpm build clean.
Manual smoke: load /reasoning, exercise filters, expand rows, verify destructive card tone when halted in stub.
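Two of the pinned invariants, the by_status_24h reconciliation and the 200-char API-edge cap, can be sketched as a single checker. The payload layout assumed here (a `calls` list next to a `by_status_24h` mapping) and the name `shape_problems` are guesses for illustration; the sketch also assumes every call in the stub falls inside the 24h window.

```python
from collections import Counter
from typing import Any, Dict, List

API_TEXT_CAP = 200  # the API-edge cap named in the PR


def shape_problems(payload: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the shape holds."""
    problems: List[str] = []
    calls = payload.get("calls", [])

    # Reconciliation: declared per-status counts must equal a recount of calls.
    counted = dict(Counter(call["status"] for call in calls))
    declared = {k: v for k, v in payload.get("by_status_24h", {}).items() if v}
    if counted != declared:
        problems.append("by_status_24h does not reconcile with calls")

    # Cap: response text is optional, but never longer than the API limit.
    for index, call in enumerate(calls):
        text = call.get("response_text_truncated_200")
        if text is not None and len(text) > API_TEXT_CAP:
            problems.append(f"calls[{index}] exceeds {API_TEXT_CAP} chars")
    return problems
```

Returning a list of problems instead of raising on the first one keeps a failing test report readable when several invariants break at once.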

Refs

rafe-walker/kora-docs → 17_cc_bucket_prompts/KR-REASONING-PANEL_kora_thinking_lens_stub.md
PR #126: KR-FEAT-AI-RESPONSE-LOOP ST1 (ReasoningEngine + error taxonomy + CostRung literal source)
PR #128: slowapi runtime promotion (enables --extra dev only)

🤖 Generated with Claude Code

Operator-facing view of Kora's recent ReasoningEngine calls. Pairs with CC#3's KR-FEAT-AI-RESPONSE-LOOP ST2 (in flight): once ST2 extends slack_dm_log.jsonl outbound entries with the model_used / input_tokens / output_tokens / reasoning_duration_ms / reasoning_error fields, a small follow-on bucket flips this stub to read from there. Shape is pinned by the new test suite.

Single-PR scope:

* GET /api/reasoning/recent stub: 4 representative calls deliberately spanning ok @ normal on opus, ok @ warn_75 on sonnet (cost-downshift), halted at hard_stop_100 (budget-locked refusal), and sdk_timeout failure so the operator's first look surfaces the full cost-ladder behaviour + error taxonomy. stub:true keeps the FE banner visible.
* ReasoningPanel.tsx: title + stub banner + 4-column stats strip (total calls / token spend / model distribution / status distribution) + filter pills (all / ok / failed / halted) + newest-first timeline. Per row: status icon, model badge (tier-tinted: opus blue / sonnet purple / haiku gray / null=halted red), cost-rung pill (color-mapped per spec: normal gray / warn_75 yellow / downshift_90 orange / hard_stop_100 red), duration with visual bar (warning ≥ 2s), input→output token counts, status badge (non-default only), error_code chip, response excerpt (truncated to 80 chars collapsed; expandable to the 200-char API cap).
* Dashboard card #12 with Brain icon. Layout keeps lg:grid-cols-3; row 2 grows to 3+3+3+3 (clean 4-row × 3-col closure of the layout that's been shepherded since DASHBOARD-V2). Headline goes destructive when halted > 0 in 24h (Kora was budget-locked; investigate cost rung); warning when failed > 0; foreground/success otherwise.
* Route /reasoning + nav entry between /agent-activity and /slack-dm (logical grouping with the agent-facing surfaces).

4-layer security contract (extending the established pattern with reasoning-specific guards):

1. response_text_truncated_200 rendered as PLAIN TEXT via React's default child escaping. Real responses may contain anything Kora generates (HTML / markdown / script fragments). FE pins via dangerouslySetInnerHTML grep.
2. NO Anthropic key shapes (sk-ant- prefix + base64-like body) anywhere in payload. Walk-payload regex catches a future error-projection bug or log-entry edit that leaks credential material into the operator's view.
3. NO PII (email regex / Slack user-ID regex) leaked from the inbound user's message into response_text_truncated_200. Kora's generated text is operator-visible; user message content lives in SLACK-DM-PANEL, not here. Belt+braces walk-payload sweep covers any field, not just response_text.
4. TS interface declares all fields with documented contracts; no raw_prompt / auth_token / response_html companion fields exist on ReasoningCall.

K-DG drift catch (per memory `feedback_k_dg_substrate_field_names`):

* Bucket §3(a) example payload used uppercase Enum NAMES (NORMAL / WARN_75 / DOWNSHIFT_90 / HARD_STOP_100), the same convention CC#3 cited in PR #126's K-DG drift note.
* BUT the canonical wire format per agent/cost_state_holder.py:114-117 is the lowercase Enum VALUES (NORMAL = "normal" etc), and engine.py:47-49's CostLadderRungName literal enforces lowercase.
* Real CC#3 data will emit lowercase via .value. Stub uses LOWERCASE to match, so stub and real data agree at flip time. Backend test pins lowercase + flags this as the intentional spec divergence.

Tests:

* tests/kora_cli/test_web_server_reasoning.py: 16 tests covering shape, 4-call stub pin, status+rung+error diversity, per-entry schema with full ReasoningEngine error taxonomy validation, cost-rung lowercase pin (K-DG catch), all 4 security guards (Anthropic key walk-payload + PII walk-payload both per-field and whole-blob), FE source-pins (no dangerouslySetInnerHTML, response_text as JSX child), 200-char API-edge cap, by_status_24h reconciliation, tokens_total shape, cron-regression sanity.
* Full admin-panel regression: 287/287 across 24 suites (with --extra dev only; slowapi now runtime per PR #128).
* tsc -b + vite build both clean.

Refs:

* rafe-walker/kora-docs 17_cc_bucket_prompts/KR-REASONING-PANEL_kora_thinking_lens_stub.md
* PR #126: KR-FEAT-AI-RESPONSE-LOOP ST1 (ReasoningEngine + error taxonomy + CostRung literal source)
* PR #128: slowapi runtime promotion (enables --extra dev only)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…#130) Agent-facing tool surface = 8 read + 5 mutating = 13 tools. Other agents have full operational surface on Kora.

2 new client listeners (slack_client_listener + purelymail_client_listener) + 2 new MCP tools (kora__send_slack_dm + kora__send_email) + Slack handler accessor refactor + JSONL caller_actor_kind + PurelymailClient send_email log + listener package wire-in.

§5 rulings: Q1 listener compat (handler accessor first, lazy fallback) / Q2 single JSONL with optional caller_actor_kind / Q3 D-prefix OR Joshua user-ID; U... C... rejected.

Security: tokens absent from error envelopes after diverse-failure-mode injection. Listener fail-soft contract: capabilities not gates.

36 new tests + 341 cross-bucket regression.

CC#1 rebased onto #131 + #132: combined reasoning-meta JSONL fields with caller_actor_kind audit field; preserved both prose docstrings + entry-building branches. Used explicit-SHA --force-with-lease per feedback_pm_merge_then_delete_race memory.

… JSONL (#141) 3 panels flip at once: AGENT-ACTIVITY + REASONING + WEBHOOK-EVENTS now read from kora_audit_log.jsonl.

- NEW kora_cli/audit/jsonl_reader.py (shared helper) + 3 endpoint flips + 4 test files.

2 K-DG drifts caught + handled:

- §2 Flip 1 spec mismatch with actual mcp_tools.py:714-724 writer: handled by setting duration_ms=0, status=ok, using details.result for result_summary.
- §2 Flip 2 nullable cost_rung: handled by using lowercase unknown enum member to preserve PR #132 K-DG pin.

Reasoning grouping: collapses N tool calls sharing caller_session_id into 1 ReasoningCall with tools_used: [...]. Aggregate counts use individual rows (not groups) so headline reflects volume.

Webhook security: source_ip octet-masked at projection edge (audit writer passes RAW peer_ip; endpoint enforces mask). details sub-set to {reason, header_present}, never full audit details. IPv6/dash → defensive fallback.

Fixture-isolation from #137 applied across all 3 test files + reader tests: monkeypatch get_kora_home in all 3 module namespaces.

357/357 admin-panel tests pass across 27 suites.

Follow-on buckets cited: KR-REASONING-PANEL-MODEL-XREF (model/tokens cross-ref from slack_dm log) + KR-MCP-RUNTIME-SURFACE follow-on (extends mcp.tool_called audit with duration_ms + failure-path status).
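The grouping rule in the #141 commit message (one ReasoningCall per caller_session_id, aggregate counts taken from raw rows) can be sketched as follows. `group_by_session`, the `tool_name` row field, and the `total_calls` key are hypothetical; only `caller_session_id` and `tools_used` appear in the source.

```python
from typing import Any, Dict, Iterable


def group_by_session(rows: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Collapse audit rows into one entry per caller_session_id.

    The total is counted from individual rows, not from groups, so a
    session that made five tool calls still adds five to the headline.
    """
    groups: Dict[str, Dict[str, Any]] = {}
    total_rows = 0
    for row in rows:
        total_rows += 1
        session_id = row["caller_session_id"]
        group = groups.setdefault(
            session_id,
            {"caller_session_id": session_id, "tools_used": []},
        )
        group["tools_used"].append(row["tool_name"])
    # dicts preserve insertion order, so groups come out in first-seen order.
    return {"calls": list(groups.values()), "total_calls": total_rows}
```

For three rows spread over two sessions this yields two grouped calls and a `total_calls` of 3, which is the "headline reflects volume" behaviour the commit describes.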

…ia xref (#143) Cross-references audit JSONL with slack_dm_log outbound entries to populate previously-null model_used / tokens / response_text fields on reasoning panel rows.

- NEW kora_cli/audit/reasoning_xref.py (cross-ref helper with parsing/matching/cost-rung-derivation/text-truncation) + /api/reasoning/recent endpoint update + 27 new xref tests.

K-DG drift caught up-front: spec said verify if outbound JSONL writes caller_session_id; grep found it does NOT. Documented in module header + commit body.

Correlation algorithm: caller_session_id → (channel_id, event_ts) → match outbound with same channel_id where thread_ts == event_ts, fallback to closest sent_at within ±60s.

Cost-rung derivation: substring-match on model name (opus/sonnet/haiku) so future model revs keep mapping correctly; cost_ladder_halted reasoning_error supersedes; preserves lowercase Enum.value pin from #132/#141.

Graceful degradation: when xref fails per-group, row renders with null fields, identical in shape to #141 pre-xref output so FE handles both with no conditional logic.

Security carve-out: response_text_truncated_200 is intentionally Joshua-content (carved out from PII sweep, same pattern as #141 message_id and slack_dm panel text).

384/384 admin-panel tests pass across 28 suites.
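The matching step of the #143 correlation algorithm (exact thread match first, then closest sent_at within ±60s) might look roughly like this. `match_outbound` is a hypothetical name, and the sketch assumes event_ts is a Slack-style epoch-seconds string and sent_at is epoch seconds as a float; the real log may store these differently.

```python
from typing import Any, Dict, List, Optional


def match_outbound(
    channel_id: str,
    event_ts: str,
    outbound: List[Dict[str, Any]],
    window_s: float = 60.0,
) -> Optional[Dict[str, Any]]:
    """Pick the outbound entry that answers the inbound event, or None."""
    same_channel = [e for e in outbound if e["channel_id"] == channel_id]

    # Preferred: the outbound entry threaded directly on the inbound event.
    for entry in same_channel:
        if entry.get("thread_ts") == event_ts:
            return entry

    # Fallback: the closest send time, accepted only inside the window.
    target = float(event_ts)
    in_window = [
        (abs(float(entry["sent_at"]) - target), entry)
        for entry in same_channel
        if abs(float(entry["sent_at"]) - target) <= window_s
    ]
    if not in_window:
        return None  # caller renders the row with null fields
    return min(in_window, key=lambda pair: pair[0])[1]
```

Returning `None` on a miss lines up with the graceful-degradation note: the row keeps its pre-xref shape with null model and token fields.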

rafe-walker merged commit 3ef6b87 into feature/phase2-upgrades May 23, 2026

rafe-walker deleted the feat/kora-KR-REASONING-PANEL branch May 23, 2026 00:25

rafe-walker mentioned this pull request May 23, 2026

feat(kora): KR-AUDIT-PANEL-ENDPOINTS — flip 3 stub panels using audit JSONL #141

Merged

6 tasks

rafe-walker mentioned this pull request May 23, 2026

feat(kora): KR-REASONING-PANEL-MODEL-XREF — populate model + tokens via xref #143

Merged

3 tasks

rafe-walker mentioned this pull request May 23, 2026

chore(kora): KR-FE-PANEL-HELPERS-DRY — extract panel-shared helpers #151

Merged

5 tasks

rafe-walker merged 1 commit into feature/phase2-upgrades from feat/kora-KR-REASONING-PANEL
