This repository was archived by the owner on May 26, 2026. It is now read-only.
feat(kora): KR-REASONING-PANEL — Kora reasoning activity lens (stub) #132
Merged
rafe-walker merged 1 commit on May 23, 2026
Operator-facing view of Kora's recent ReasoningEngine calls. Pairs
with CC#3's KR-FEAT-AI-RESPONSE-LOOP ST2 (in flight) — once ST2
extends slack_dm_log.jsonl outbound entries with the model_used /
input_tokens / output_tokens / reasoning_duration_ms /
reasoning_error fields, a small follow-on bucket flips this stub
to read from there. Shape is pinned by the new test suite.
Single-PR scope:
* GET /api/reasoning/recent stub — 4 representative calls
deliberately spanning ok @ normal on opus, ok @ warn_75 on
sonnet (cost-downshift), halted at hard_stop_100 (budget-
locked refusal), and sdk_timeout failure so the operator's
first look surfaces the full cost-ladder behaviour + error
taxonomy. stub:true keeps the FE banner visible.
* ReasoningPanel.tsx — title + stub banner + 4-column stats
strip (total calls / token spend / model distribution /
status distribution) + filter pills (all / ok / failed /
halted) + newest-first timeline. Per row: status icon,
model badge (tier-tinted: opus blue / sonnet purple / haiku
gray / null=halted red), cost-rung pill (color-mapped per
spec: normal gray / warn_75 yellow / downshift_90 orange /
hard_stop_100 red), duration with visual bar (warning ≥ 2s),
input→output token counts, status badge (non-default only),
error_code chip, response excerpt (truncated to 80 chars
collapsed; expandable to the 200-char API cap).
* Dashboard card #12 with Brain icon. Layout keeps lg:grid-
cols-3 — row 2 grows to 3+3+3+3 (clean 4-row × 3-col closure
of the layout that's been shepherded since DASHBOARD-V2).
Headline goes destructive when halted > 0 in 24h (Kora was
budget-locked; investigate cost rung); warning when failed > 0;
foreground/success otherwise.
* Route /reasoning + nav entry between /agent-activity and
/slack-dm (logical grouping with the agent-facing surfaces).
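A minimal sketch of how the status reconciliation, filter pills, and dashboard headline tone described above could fit together. The row values, the `error_code` on the halted row, and the function names are illustrative assumptions, not the stub's actual payload or the repository's code.

```python
from collections import Counter

# Hypothetical stub rows: field names follow the ones named in this PR,
# but the concrete values are invented for illustration.
STUB_CALLS = [
    {"status": "ok", "model_used": "opus", "cost_rung": "normal", "error_code": None},
    {"status": "ok", "model_used": "sonnet", "cost_rung": "warn_75", "error_code": None},
    {"status": "halted", "model_used": None, "cost_rung": "hard_stop_100",
     "error_code": "cost_ladder_halted"},  # assumed error code
    {"status": "failed", "model_used": "opus", "cost_rung": "normal",
     "error_code": "sdk_timeout"},
]

STATUSES = ("ok", "failed", "halted")


def by_status(calls):
    """Count rows per status, always returning every known status key."""
    counts = Counter(call["status"] for call in calls)
    return {status: counts.get(status, 0) for status in STATUSES}


def filter_calls(calls, pill):
    """Apply a filter pill: 'all' keeps everything, otherwise match status."""
    if pill == "all":
        return list(calls)
    return [call for call in calls if call["status"] == pill]


def headline_tone(counts):
    """Halted outranks failed; anything else stays in the default tone."""
    if counts["halted"] > 0:
        return "destructive"
    if counts["failed"] > 0:
        return "warning"
    return "foreground"
```

Because the counts are derived from the same rows the timeline renders, the headline and the filter pills cannot disagree about how many calls were halted or failed.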
4-layer security contract (extending the established pattern with
reasoning-specific guards):
1. response_text_truncated_200 rendered as PLAIN TEXT via
React's default child escaping. Real responses may contain
anything Kora generates (HTML / markdown / script fragments).
A frontend source-pin test greps for dangerouslySetInnerHTML to keep it out of the component.
2. NO Anthropic key shapes (sk-ant- prefix + base64-like body)
anywhere in payload. Walk-payload regex catches a future
error-projection bug or log-entry edit that leaks credential
material into the operator's view.
3. NO PII (email regex / Slack user-ID regex) leaked from the
inbound user's message into response_text_truncated_200 —
Kora's generated text is operator-visible; user message
content lives in SLACK-DM-PANEL, not here. Belt+braces
walk-payload sweep covers any field, not just response_text.
4. TS interface declares all fields with documented contracts;
no raw_prompt / auth_token / response_html companion fields
exist on ReasoningCall.
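Guards 2 and 3 both rely on a walk-payload sweep. The sketch below shows one way such a sweep could look; the three regexes, the label names, and the helper names are assumptions made for illustration, not the patterns used in tests/kora_cli/test_web_server_reasoning.py.

```python
import re

# Assumed patterns (not the repository's actual regexes).
PATTERNS = {
    "anthropic_key": re.compile(r"sk-ant-[A-Za-z0-9_\-]{16,}"),
    "email": re.compile(r"[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}"),
    "slack_user_id": re.compile(r"\b[UW][A-Z0-9]{8,}\b"),
}


def walk_strings(node):
    """Yield every string in a JSON-like payload, including dict keys."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from walk_strings(key)
            yield from walk_strings(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from walk_strings(item)
    # Numbers, booleans and None carry no text, so they are skipped.


def find_leaks(payload):
    """Return (label, offending_text) pairs for every pattern hit."""
    hits = []
    for text in walk_strings(payload):
        for label, pattern in PATTERNS.items():
            if pattern.search(text):
                hits.append((label, text))
    return hits
```

A test would then assert that `find_leaks(response.json())` is empty. Walking every field rather than only `response_text_truncated_200` is what lets the sweep catch a leak introduced through an error projection or a new field. The Slack user-ID pattern here is deliberately loose and would also flag long uppercase words starting with U or W, so a real guard may need a tighter shape.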
K-DG drift catch (per memory `feedback_k_dg_substrate_field_names`):
* Bucket §3(a) example payload used uppercase Enum NAMES
(NORMAL / WARN_75 / DOWNSHIFT_90 / HARD_STOP_100) — same
convention CC#3 cited in PR #126's K-DG drift note.
* BUT the canonical wire format per
agent/cost_state_holder.py:114-117 is the lowercase Enum
VALUES (NORMAL = "normal" etc), and engine.py:47-49's
CostLadderRungName literal enforces lowercase.
* Real CC#3 data will emit lowercase via .value. Stub uses
LOWERCASE to match — so stub and real data agree at flip
time. Backend test pins lowercase + flags this as the
intentional spec divergence.
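The names-versus-values distinction behind this drift can be shown in a few lines. This is a sketch under assumptions: the Enum class name `CostRung` is hypothetical, and the definitions only mirror what the note above says about agent/cost_state_holder.py and engine.py rather than quoting them.

```python
from enum import Enum
from typing import Literal, get_args


class CostRung(Enum):  # hypothetical class name
    # Uppercase member NAMES, lowercase member VALUES.
    NORMAL = "normal"
    WARN_75 = "warn_75"
    DOWNSHIFT_90 = "downshift_90"
    HARD_STOP_100 = "hard_stop_100"


# Literal type that only admits the lowercase wire values.
CostLadderRungName = Literal["normal", "warn_75", "downshift_90", "hard_stop_100"]


def to_wire(rung: CostRung) -> str:
    """Serialise with .value, so the wire format is lowercase."""
    return rung.value


def is_valid_wire_rung(value: str) -> bool:
    """Runtime check against the Literal's allowed strings."""
    return value in get_args(CostLadderRungName)
```

A spec example written with `WARN_75` is quoting the member name; anything serialised through `.value` produces `warn_75`, which is why the stub pins lowercase.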
Tests:
* tests/kora_cli/test_web_server_reasoning.py — 16 tests:
shape, 4-call stub pin, status+rung+error diversity, per-
entry schema with full ReasoningEngine error taxonomy
validation, cost-rung lowercase pin (K-DG catch), all 4
security guards (Anthropic key walk-payload + PII walk-
payload both per-field and whole-blob), FE source-pins
(no dangerouslySetInnerHTML, response_text as JSX child),
200-char API-edge cap, by_status_24h reconciliation,
tokens_total shape, cron-regression sanity.
* Full admin-panel regression: 287/287 across 24 suites
(with --extra dev only — slowapi now runtime per PR #128).
* tsc -b + vite build both clean.
Refs:
* rafe-walker/kora-docs 17_cc_bucket_prompts/KR-REASONING-PANEL_kora_thinking_lens_stub.md
* PR #126 — KR-FEAT-AI-RESPONSE-LOOP ST1 (ReasoningEngine +
error taxonomy + CostRung literal source)
* PR #128 — slowapi runtime promotion (enables --extra dev only)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rafe-walker added a commit that referenced this pull request on May 23, 2026
…#130)
Agent-facing tool surface = 8 read + 5 mutating = 13 tools. Other agents have full operational surface on Kora.
2 new client listeners (slack_client_listener + purelymail_client_listener) + 2 new MCP tools (kora__send_slack_dm + kora__send_email) + Slack handler accessor refactor + JSONL caller_actor_kind + PurelymailClient send_email log + listener package wire-in.
§5 rulings: Q1 listener compat (handler accessor first, lazy fallback) / Q2 single JSONL with optional caller_actor_kind / Q3 D-prefix OR Joshua user-ID; U... C... rejected.
Security: tokens absent from error envelopes after diverse-failure-mode injection. Listener fail-soft contract: capabilities not gates.
36 new tests + 341 cross-bucket regression.
CC#1 rebased onto #131 + #132: combined reasoning-meta JSONL fields with caller_actor_kind audit field; preserved both prose docstrings + entry-building branches. Used explicit-SHA --force-with-lease per feedback_pm_merge_then_delete_race memory.
6 tasks
rafe-walker added a commit that referenced this pull request on May 23, 2026
… JSONL (#141)
3 panels flip at once: AGENT-ACTIVITY + REASONING + WEBHOOK-EVENTS now read from kora_audit_log.jsonl.
- NEW kora_cli/audit/jsonl_reader.py (shared helper) + 3 endpoint flips + 4 test files.
2 K-DG drifts caught + handled:
- §2 Flip 1 spec mismatch with actual mcp_tools.py:714-724 writer: handled by setting duration_ms=0, status=ok, using details.result for result_summary.
- §2 Flip 2 nullable cost_rung: handled by using lowercase unknown enum member to preserve PR #132 K-DG pin.
Reasoning grouping: collapses N tool calls sharing caller_session_id into 1 ReasoningCall with tools_used: [...]. Aggregate counts use individual rows (not groups) so headline reflects volume.
Webhook security: source_ip octet-masked at projection edge (audit writer passes RAW peer_ip; endpoint enforces mask). details sub-set to {reason, header_present}, never full audit details. IPv6/dash → defensive fallback.
Fixture-isolation from #137 applied across all 3 test files + reader tests: monkeypatch get_kora_home in all 3 module namespaces.
357/357 admin-panel tests pass across 27 suites.
Follow-on buckets cited: KR-REASONING-PANEL-MODEL-XREF (model/tokens cross-ref from slack_dm log) + KR-MCP-RUNTIME-SURFACE follow-on (extends mcp.tool_called audit with duration_ms + failure-path status).
3 tasks
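The reasoning grouping from #141 can be sketched as follows. This is an illustration only: the `tool_name` field and both function names are assumptions, and rows are assumed to always carry a `caller_session_id`.

```python
def group_tool_calls(rows):
    """Collapse audit rows sharing caller_session_id into one group each.

    Groups keep first-seen order, and tools_used keeps call order.
    """
    groups = {}
    for row in rows:
        session_id = row["caller_session_id"]
        group = groups.setdefault(
            session_id, {"caller_session_id": session_id, "tools_used": []}
        )
        group["tools_used"].append(row["tool_name"])  # hypothetical field name
    return list(groups.values())


def summarize(rows):
    """Headline counts come from individual rows, not from the groups."""
    groups = group_tool_calls(rows)
    return {
        "total_calls": len(rows),     # reflects raw volume
        "group_count": len(groups),   # number of timeline entries
        "groups": groups,
    }
```

Counting rows rather than groups means a session that made ten tool calls moves the headline by ten while still occupying a single timeline entry.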
rafe-walker added a commit that referenced this pull request on May 23, 2026
…ia xref (#143)
Cross-references audit JSONL with slack_dm_log outbound entries to populate previously-null model_used / tokens / response_text fields on reasoning panel rows.
- NEW kora_cli/audit/reasoning_xref.py (cross-ref helper with parsing/matching/cost-rung-derivation/text-truncation) + /api/reasoning/recent endpoint update + 27 new xref tests.
K-DG drift caught up-front: spec said verify if outbound JSONL writes caller_session_id; grep found it does NOT. Documented in module header + commit body.
Correlation algorithm: caller_session_id → (channel_id, event_ts) → match outbound with same channel_id where thread_ts == event_ts, fallback to closest sent_at within ±60s.
Cost-rung derivation: substring-match on model name (opus/sonnet/haiku) so future model revs keep mapping correctly; cost_ladder_halted reasoning_error supersedes; preserves lowercase Enum.value pin from #132/#141.
Graceful degradation: when xref fails per-group, row renders with null fields, identical shape to #141 pre-xref output so FE handles both with no conditional logic.
Security carve-out: response_text_truncated_200 is intentionally Joshua-content (carved out from PII sweep, same pattern as #141 message_id and slack_dm panel text).
384/384 admin-panel tests pass across 28 suites.
5 tasks
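The matching step of the correlation algorithm in #143 could look roughly like this. It is a sketch, not the code in kora_cli/audit/reasoning_xref.py: it assumes `event_ts` is a Slack-style timestamp string that parses as a float, that `sent_at` is stored as epoch seconds, and that the function name is free to choose.

```python
FALLBACK_WINDOW_S = 60.0  # the ±60s window from the commit message


def match_outbound(channel_id, event_ts, outbound):
    """Find the outbound log entry that answers a given inbound event.

    1. Prefer an entry in the same channel whose thread_ts equals event_ts.
    2. Otherwise fall back to the same-channel entry whose sent_at is
       closest to the event time, if it lies within the window.
    3. Return None when nothing qualifies, so the row keeps null fields.
    """
    same_channel = [e for e in outbound if e.get("channel_id") == channel_id]

    for entry in same_channel:
        if entry.get("thread_ts") == event_ts:
            return entry

    event_time = float(event_ts)  # assumption: ts parses as epoch seconds
    in_window = [
        e for e in same_channel
        if isinstance(e.get("sent_at"), (int, float))
        and abs(e["sent_at"] - event_time) <= FALLBACK_WINDOW_S
    ]
    if not in_window:
        return None
    return min(in_window, key=lambda e: abs(e["sent_at"] - event_time))
```

Returning None instead of raising keeps the graceful-degradation property described above: an unmatched group renders with the same null-field shape as before the cross-reference existed.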
Summary
Operator-facing view of Kora's recent ReasoningEngine calls: model used, tokens spent, cost-ladder rung at call time, errors. Pairs with CC#3's KR-FEAT-AI-RESPONSE-LOOP ST2 (in flight). Once ST2 extends `slack_dm_log.jsonl` outbound entries with the reasoning fields, a small follow-on bucket flips this stub to read from there.
Shape is pinned by the new test suite so the FE shipping in this PR keeps rendering during the cut-over.

What's in here
* `GET /api/reasoning/recent` stub returning 4 representative calls deliberately spanning `ok @ normal on opus` + `ok @ warn_75 on sonnet` (cost-downshift in action) + `halted @ hard_stop_100` (budget-locked refusal) + `sdk_timeout` failure. This surfaces the full cost-ladder behaviour + error taxonomy at first look.
* `ReasoningPanel.tsx` (new): title + stub banner + 4-column stats strip (total calls / token spend `input → output` / model distribution mini-bar / status distribution) + filter pills (`all` / `ok` / `failed` / `halted`) + newest-first timeline. Per row: status icon, model badge (tier-tinted: opus blue / sonnet purple / haiku gray / `null=halted` red), cost-rung pill color-mapped per spec (normal gray, warn_75 yellow, downshift_90 orange, hard_stop_100 red), duration with visual bar (warning ≥ 2s), `input → output` token counts, status badge (non-default only, so happy-path rows stay clean), `error_code` chip, response excerpt (truncated to 80 chars collapsed; expandable to the 200-char API cap).
* Dashboard card #12 (`Brain` icon). Layout keeps `lg:grid-cols-3`; row 2 grows to 3+3+3+3, the clean 4-row × 3-col closure of the layout. Headline goes destructive when `halted > 0` in 24h (Kora was budget-locked; investigate cost rung); warning when `failed > 0`; foreground otherwise.
* Route `/reasoning` + nav entry between `/agent-activity` and `/slack-dm`.

4-layer security contract
Extending the established pattern with reasoning-specific guards:
1. `response_text_truncated_200` rendered as PLAIN TEXT via React's default child escaping. Real responses may contain anything Kora generates (HTML / markdown / script fragments). FE pins via `dangerouslySetInnerHTML` grep.
2. NO Anthropic key shapes (`sk-ant-` prefix + base64-like body) anywhere in payload. Walk-payload regex catches a future error-projection bug or log-entry edit that leaks credential material into the operator's view.
3. NO PII (email regex / Slack user-ID regex) leaked from the inbound user's message into `response_text_truncated_200`. Kora's generated text is operator-visible; user message content lives in SLACK-DM-PANEL, not here. Belt+braces walk-payload sweep covers any field, not just `response_text`.
4. TS interface declares all fields with documented contracts; no `raw_prompt` / `auth_token` / `response_html` companion fields exist on `ReasoningCall`.

K-DG drift caught + handled
Per memory `feedback_k_dg_substrate_field_names_in_specs`:
* Bucket §3(a) example payload used uppercase Enum NAMES (`NORMAL` / `WARN_75` / `DOWNSHIFT_90` / `HARD_STOP_100`), the same convention CC#3 cited in the K-DG drift note of PR #126 (feat(kora): KR-FEAT-AI-RESPONSE-LOOP ST1, ReasoningEngine + Anthropic impl + context loader).
* BUT the canonical wire format per `agent/cost_state_holder.py:114-117` is the lowercase Enum VALUES (`NORMAL = "normal"` etc), and `engine.py:47-49`'s `CostLadderRungName` literal enforces lowercase.
* Real CC#3 data will emit lowercase via `.value`. Stub uses lowercase to match, so stub and real data agree at flip time, with no FE pill-color map breakage at the cutover. Backend test pins lowercase and flags this as the intentional spec divergence.

Test plan
* `tests/kora_cli/test_web_server_reasoning.py`, 16 tests: shape, 4-call stub pin, status+rung+error diversity, per-entry schema with full ReasoningEngine error taxonomy validation, cost-rung lowercase pin (K-DG catch), all 4 security guards (Anthropic key walk-payload + PII walk-payload both per-field and whole-blob), FE source-pins (no `dangerouslySetInnerHTML`, response_text as JSX child), 200-char API-edge cap, `by_status_24h` reconciliation, `tokens_total` shape, cron-regression sanity.
* Full admin-panel regression: 287/287 across 24 suites (with `--extra dev` only; slowapi now runtime per PR #128, chore(kora): KR-SLOWAPI-DEP-FIX, move slowapi to runtime deps).
* `pnpm tsc -b` clean.
* `pnpm build` clean.
* Manual check on `/reasoning`: exercise filters, expand rows, verify destructive card tone when halted in stub.

Refs
* `rafe-walker/kora-docs` → `17_cc_bucket_prompts/KR-REASONING-PANEL_kora_thinking_lens_stub.md`
* PR #126, KR-FEAT-AI-RESPONSE-LOOP ST1 (ReasoningEngine + error taxonomy + `CostRung` literal source)
* PR #128, slowapi runtime promotion (enables `--extra dev` only)

🤖 Generated with Claude Code