This repository was archived by the owner on May 26, 2026. It is now read-only.
feat(kora): KR-AUDIT-PANEL-ENDPOINTS — flip 3 stub panels using audit JSONL #141
Merged
rafe-walker merged 1 commit on May 23, 2026
Three endpoint flips at once, all reading from CC#3's
${KORA_HOME}/kora_audit_log.jsonl (KR-AUDIT-JSONL-SINK, PR #139).
Backend-only — no FE changes; TS interfaces already match the
projected shapes. Same stub-then-real pattern as PR #137's
slack-dm flip.
Shared reader (NEW):
* kora_cli/audit/jsonl_reader.py with read_audit_entries(
seam, limit, since) — single helper used by all 3 endpoints.
* Per-call get_kora_home() lookup so monkeypatch in tests works
without ContextVar plumbing (per the #137 fixture lesson).
* Tolerates missing file, malformed JSON, non-dict lines,
Pydantic ValidationError on a single row — log + skip.
* Newest-first by emitted_at descending.
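The tolerance rules above can be sketched as follows. This is a minimal illustration, not the code in kora_cli/audit/jsonl_reader.py: the function name, the explicit `path` argument (the real helper resolves the path through get_kora_home()), and the default `limit` are assumptions, and a plain timestamp parse stands in for the Pydantic AuditEntry validation.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def _parse_ts(value):
    """Parse an ISO-8601 timestamp; naive values are assumed to be UTC."""
    ts = datetime.fromisoformat(value)
    return ts if ts.tzinfo else ts.replace(tzinfo=timezone.utc)


def read_audit_entries_sketch(path, seam=None, limit=100, since=None):
    """Return audit rows newest-first, skipping anything unreadable."""
    path = Path(path)
    if not path.exists():
        return []  # a missing file is treated as an empty log
    if since is not None and since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)

    rows = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # blank line
            try:
                row = json.loads(line)
            except json.JSONDecodeError:
                continue  # malformed JSON (the real reader logs, then skips)
            if not isinstance(row, dict):
                continue  # non-dict line such as a bare list
            try:
                emitted = _parse_ts(row["emitted_at"])
            except (KeyError, TypeError, ValueError):
                continue  # stand-in for a per-row ValidationError
            if seam is not None and row.get("seam") != seam:
                continue
            if since is not None and emitted < since:
                continue
            rows.append((emitted, row))

    # Sort on the parsed timestamp only, newest first, then apply the cap.
    rows.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in rows[:limit]]
```

Each bad line is skipped on its own, so one corrupt row never empties a panel.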
Flip 1 — /api/agent-activity/recent (mcp.tool_called):
* Filters audit rows where seam == "mcp.tool_called", projects
to AgentCall shape.
* K-DG drift caught: spec said details has duration_ms /
tool_status / result_summary. Actual writer at
kora_cli/listeners/mcp_tools.py:714-724 only emits tool_name
/ tool_kind / caller_actor_kind / args_keys / result.
Projection: duration_ms=0; status="ok" (audit only fires on
success path today; KR-MCP-RUNTIME-SURFACE follow-on extends);
result_summary=details.result (writer docstring confirms this
is pre-filtered to a short string).
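A rough sketch of the Flip 1 projection, assuming audit rows are plain dicts with a `details` mapping. Only duration_ms, status and result_summary are named in this PR; the other output keys and the function name are illustrative guesses at the AgentCall shape.

```python
def project_agent_call(row):
    """Project one mcp.tool_called audit row to an AgentCall-like dict."""
    details = row.get("details") or {}
    return {
        # Hypothetical passthrough fields; the writer does emit these keys.
        "tool_name": details.get("tool_name"),
        "caller_actor_kind": details.get("caller_actor_kind"),
        # The writer has no timing field yet, so the panel shows 0.
        "duration_ms": 0,
        # Audit only fires on the success path today.
        "status": "ok",
        # The writer pre-filters `result` to a short string.
        "result_summary": details.get("result"),
    }
```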
Flip 2 — /api/reasoning/recent (reasoning.tool_called):
* Filters audit rows where seam == "reasoning.tool_called".
* GROUPS by caller_session_id so a multi-tool reasoning
iteration collapses into ONE ReasoningCall row with
tools_used: [name1, name2, name3]. tools_used is a FE
extension field — existing TS ReasoningCall ignores it;
follow-on FE bucket renders.
* Status derivation per spec: all-ok → ok; any not_allowed →
halted+capability_denied; any execution_error → failed+
handler_error.
* Limitations (per spec §2 Flip 2): model_used / tokens /
response_text_truncated_200 set to null — those fields live
in slack_dm_log.jsonl outbound entries, not audit. The
KR-REASONING-PANEL-MODEL-XREF follow-on bucket cross-
references by timestamp + thread_ts. cost_rung_at_call set
to "unknown" (lowercase Enum.value — preserves the K-DG pin
from PR #132 against engine.py:47-49 CostLadderRungName
literal). Keeping this PR small per spec §4.
* Aggregate counts (total_recent_24h, by_status_24h) use
INDIVIDUAL audit rows — not groups — so headline reflects
total reasoning activity volume.
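The grouping and status derivation for Flip 2 might look roughly like this. The row keys `tool_name` and `tool_status`, the top-level position of `caller_session_id`, and the choice to check not_allowed before execution_error when a group contains both are assumptions; the PR text does not state a precedence.

```python
def derive_status(tool_statuses):
    """Collapse per-tool statuses into (status, reasoning_error)."""
    if "not_allowed" in tool_statuses:
        return "halted", "capability_denied"
    if "execution_error" in tool_statuses:
        return "failed", "handler_error"
    return "ok", None


def group_reasoning_rows(rows):
    """Collapse reasoning.tool_called rows into one entry per session."""
    groups = {}
    for row in rows:
        # dicts keep insertion order, so groups keep the input row order
        groups.setdefault(row["caller_session_id"], []).append(row)

    calls = []
    for session_id, members in groups.items():
        status, error = derive_status([m["tool_status"] for m in members])
        calls.append({
            "caller_session_id": session_id,
            "tools_used": [m["tool_name"] for m in members],
            "status": status,
            "reasoning_error": error,
            "model_used": None,              # lives in slack_dm_log.jsonl
            "cost_rung_at_call": "unknown",  # lowercase Enum.value
        })
    return calls
```

In this sketch, a headline such as total_recent_24h would be `len(rows)` rather than `len(calls)`, since the aggregates count individual audit rows.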
Flip 3 — /api/webhooks/events/recent (webhook.dead_letter):
* Filters audit rows where seam == "webhook.dead_letter".
* status pinned to "dead_letter" (this seam ONLY emits dead-letters;
verified that happy-path events stay in the chain log, and requests
rate-limited by slowapi don't currently call emit_audit).
* SECURITY: source_ip OCTET-MASKED at the projection edge.
Audit writer at webhook_dead_letter.py:142 passes RAW
peer_ip; endpoint enforces _mask_ipv4_last_two_octets
("54.203.99.142" → "54.203.x.x") per the KR-WEBHOOK-EVENTS
#109 contract. IPv6 / unexpected shapes → "—" defensively
(never leaks unmasked).
* SECURITY: details sub-set to {reason, header_present} —
never the full audit details dict (which carries
body_bytes / request_id / headers the panel hasn't vetted).
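The two projection-edge rules above can be illustrated as below. This is a sketch only: the endpoint's real helper is _mask_ipv4_last_two_octets, and the function names and the use of the standard-library ipaddress module here are assumptions about one way to implement it.

```python
import ipaddress

ALLOWED_DETAIL_KEYS = ("reason", "header_present")


def mask_ipv4_last_two_octets(value):
    """Mask the last two octets of an IPv4 string; anything else becomes "—"."""
    if not isinstance(value, str):
        return "—"  # ints or None must never be coerced into an address
    try:
        addr = ipaddress.IPv4Address(value)
    except ValueError:  # AddressValueError subclasses ValueError
        return "—"  # IPv6 or an unexpected shape: never echo the raw input
    first, second, _, _ = str(addr).split(".")
    return f"{first}.{second}.x.x"


def project_details(details):
    """Keep only the vetted keys from the audit details dict."""
    details = details or {}
    return {key: details[key] for key in ALLOWED_DETAIL_KEYS if key in details}
```

Using an allow-list rather than a deny-list means a field added to the audit writer later stays hidden from the panel until someone vets it.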
4-layer SECURITY contract carry-forward (preserved across all 3
endpoints):
* Walk-payload sweeps for Anthropic key shapes, Slack token
shapes, HMAC-secret shapes, full IPv4 leaks.
* Per-field caller_actor_kind label-shape pin (no hash/base64
runs).
* result_summary no-raw-JSON pin.
* cost_rung lowercase literal pin.
* details sub-set enforcement (webhook).
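A walk-payload sweep of the kind listed above could be written along these lines. The regexes are illustrative patterns chosen for this sketch, not the ones used in the repository's tests, and the HMAC-secret shape is left out because its format is not described here.

```python
import re

# Illustrative leak shapes only (assumptions, not the repo's patterns).
LEAK_PATTERNS = {
    "anthropic_key": re.compile(r"sk-ant-[A-Za-z0-9_-]{8,}"),
    "slack_token": re.compile(r"xox[abprs]-[A-Za-z0-9-]{8,}"),
    "full_ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def walk_strings(payload):
    """Yield every string found anywhere in a nested JSON-like payload."""
    if isinstance(payload, str):
        yield payload
    elif isinstance(payload, dict):
        for key, value in payload.items():
            yield from walk_strings(key)
            yield from walk_strings(value)
    elif isinstance(payload, (list, tuple)):
        for item in payload:
            yield from walk_strings(item)


def find_leaks(payload):
    """Return the sorted names of all leak patterns that match any string."""
    return sorted({
        name
        for text in walk_strings(payload)
        for name, pattern in LEAK_PATTERNS.items()
        if pattern.search(text)
    })
```

An endpoint test would then assert that `find_leaks(response_json)` is empty for a log seeded with raw secrets and addresses.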
Fixture-isolation discipline applied per #137 lesson: all 3
endpoint test files monkeypatch get_kora_home in ALL THREE
module namespaces (kora_constants, kora_cli.config,
kora_cli.web_server). Reader tests also use the pattern. The
reader's own resolution uses a local kora_constants import so
the test patch takes effect on a fresh call.
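The reason every namespace needs patching can be shown with stand-in modules. This sketch simulates `from kora_constants import get_kora_home` with two throwaway module objects; all names in it are hypothetical and it does not use the real kora modules or pytest's monkeypatch.

```python
import types

# Stand-in for kora_constants (hypothetical).
constants_mod = types.ModuleType("constants_sketch")
constants_mod.get_kora_home = lambda: "/real/kora_home"

# Stand-in for a module that imported the function by name at import
# time; it now holds its own reference to the original function.
consumer_mod = types.ModuleType("consumer_sketch")
consumer_mod.get_kora_home = constants_mod.get_kora_home


def resolve_via_copy():
    """Uses the consumer's import-time reference."""
    return consumer_mod.get_kora_home()


def resolve_per_call():
    """Looks the function up on the source module at call time."""
    return constants_mod.get_kora_home()


def patch_get_kora_home(fake, modules):
    """Replace get_kora_home in every listed namespace."""
    for module in modules:
        module.get_kora_home = fake
```

Patching only the source module changes `resolve_per_call()` but leaves `resolve_via_copy()` pointing at the original, which is why the tests patch all three namespaces and the reader resolves its path per call.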
Tests:
* tests/kora_cli/audit/test_jsonl_reader.py — 14 reader tests:
missing file, empty file, seam filter, since filter, naive
datetime UTC assumption, limit cap, malformed JSON tolerance,
non-dict line skip, Pydantic ValidationError per-line skip,
blank lines, KORA_AUDIT_LOG_PATH env override.
* test_web_server_agent_activity.py — 11 tests (rewrite):
empty log, shape, mcp.tool_called projection, seam filtering,
?limit cap, newest-first, walk-payload SECURITY, label-shape
caller_actor_kind, no-raw-JSON result_summary, by_caller_24h
reconciliation, cron-regression sanity.
* test_web_server_reasoning.py — 15 tests (rewrite): empty,
shape, single-tool projection, multi-row session grouping,
different sessions → separate rows, duration_ms sum, status
derivation (ok / halted / failed), seam filtering,
walk-payload SECURITY, lowercase cost_rung pin, ?limit on
groups, by_status_24h counts individual rows not groups,
cron-regression sanity.
* test_web_server_webhook_events.py — 15 tests (rewrite):
empty, shape, dead_letter projection, status pinned, source
→ endpoint mapping, IPv4 octet-mask enforcement, walk-payload
no-full-IPv4 sweep, IPv6/dash → "—" fallback, details sub-
set enforcement, seam filtering, newest-first, ?limit cap,
cron-regression sanity.
* Full admin-panel regression: 357/357 across 27 suites.
K-DG drift summary:
* spec §2 Flip 1 said details has duration_ms / tool_status /
result_summary — only `result` is present in the actual
writer. Documented + handled in projection.
* spec §2 Flip 2 said cost_rung could be null — FE
ReasoningCostRung union requires a value; used "unknown"
(which IS in the enum) instead. Preserves the lowercase
Enum.value contract from PR #132.
Follow-on buckets cited:
* KR-REASONING-PANEL-MODEL-XREF — cross-ref to
slack_dm_log.jsonl for model_used / tokens / response_text
* KR-MCP-RUNTIME-SURFACE follow-on — extends audit writer with
duration_ms + failure-path emit (status taxonomy)
Refs:
* rafe-walker/kora-docs 17_cc_bucket_prompts/KR-AUDIT-PANEL-ENDPOINTS_three_flips.md
* PR #139 — KR-AUDIT-JSONL-SINK (writer + AuditEntry shape)
* PR #137 — KR-SLACK-DM-PANEL-FLIP (fixture-isolation pattern)
* PR #114 / #132 / #109 — original stub panels being flipped
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rafe-walker added a commit that referenced this pull request on May 23, 2026:
…ia xref (#143)
Cross-references audit JSONL with slack_dm_log outbound entries to
populate previously-null model_used / tokens / response_text fields on
reasoning panel rows.
* NEW kora_cli/audit/reasoning_xref.py (cross-ref helper with
parsing/matching/cost-rung-derivation/text-truncation) +
/api/reasoning/recent endpoint update + 27 new xref tests.
* K-DG drift caught up-front: spec said verify if outbound JSONL writes
caller_session_id; grep found it does NOT. Documented in module header
+ commit body.
* Correlation algorithm: caller_session_id → (channel_id, event_ts) →
match outbound with same channel_id where thread_ts == event_ts,
fallback to closest sent_at within ±60s.
* Cost-rung derivation: substring-match on model name
(opus/sonnet/haiku) so future model revs keep mapping correctly;
cost_ladder_halted reasoning_error supersedes; preserves lowercase
Enum.value pin from #132/#141.
* Graceful degradation: when xref fails per-group, row renders with
null fields, identical in shape to #141 pre-xref output, so FE handles
both with no conditional logic.
* Security carve-out: response_text_truncated_200 is intentionally
Joshua-content (carved out from PII sweep, same pattern as #141
message_id and slack_dm panel text).
* 384/384 admin-panel tests pass across 28 suites.
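The correlation step described in this commit can be sketched as follows, assuming outbound entries are dicts, `sent_at` is epoch seconds, and `event_ts` is a Slack-style timestamp string that can be read as epoch seconds. How caller_session_id is parsed into (channel_id, event_ts) is not shown in this thread, so the sketch takes the pair as input. `model_family` only returns the matched substring; the mapping to actual CostLadderRungName members is not reproduced here.

```python
def match_outbound(channel_id, event_ts, outbound, window_s=60.0):
    """Exact thread_ts match first, then closest sent_at within the window."""
    same_channel = [o for o in outbound if o.get("channel_id") == channel_id]

    # Primary: an outbound reply threaded on the inbound event.
    for entry in same_channel:
        if entry.get("thread_ts") == event_ts:
            return entry

    # Fallback: the closest send time, but only inside +/- window_s.
    target = float(event_ts)
    best = None
    for entry in same_channel:
        gap = abs(float(entry["sent_at"]) - target)
        if gap <= window_s and (best is None or gap < best[0]):
            best = (gap, entry)
    return best[1] if best else None  # None means the row renders with nulls


def model_family(model_name):
    """Substring-match the model name so new revisions keep mapping."""
    lowered = (model_name or "").lower()
    for family in ("opus", "sonnet", "haiku"):
        if family in lowered:
            return family
    return "unknown"
```

Returning None on a failed match is what lets the endpoint fall back to the same null-field shape as the pre-xref output.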
rafe-walker added a commit that referenced this pull request on May 23, 2026:
…ound (#148)
CC#2 follow-on after CC#1 KR-EMAIL-OUTBOUND-REASONING-META (#146)
unblocked the gap her STOP-ASK caught.
* 2 files, +737/-7: extension to reasoning_xref.py (email path: loader
+ parser + 3-tier matcher) + 20 new email-specific tests.
* All 3 K-DG gates verified before drafting per re-dispatch: send_email
kwargs ✓; opt-in writer ✓; caller_session_id literal format symmetry
between handler + engine ✓.
* 3-tier cascade: PRIMARY caller_session_id literal equality (closed by
#146) → SECONDARY in_reply_to chain → LAST RESORT ±60s timestamp
window.
* Slack-first precedence preserved: existing #141/#143 tests (42/42)
still pass without modification.
* response_text carve-out for email-sourced rows: stays null per #124
design (body never in email JSONL); same shape as slack_dm text + #143
message_id carve-outs. Tracked via xref_source local so the conditional
null-set cannot regress to populating from a future field rename.
* 400/400 admin-panel + audit tests pass across 29 suites.
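One way to picture the 3-tier cascade is the sketch below. The function name, the entry keys, and the returned source labels are hypothetical, `sent_at` is assumed to be epoch seconds, and the in_reply_to tier is simplified to a single hop rather than a full chain walk.

```python
def match_email_outbound(session_id, inbound_message_id, inbound_ts,
                         outbound, window_s=60.0):
    """Return (entry, tier_name), or (None, None) when nothing matches."""
    # PRIMARY: literal caller_session_id equality.
    if session_id is not None:
        for entry in outbound:
            if entry.get("caller_session_id") == session_id:
                return entry, "caller_session_id"

    # SECONDARY: the outbound mail replies to the inbound message.
    if inbound_message_id is not None:
        for entry in outbound:
            if entry.get("in_reply_to") == inbound_message_id:
                return entry, "in_reply_to"

    # LAST RESORT: closest send time inside +/- window_s. The index in
    # each tuple breaks ties so the entry dicts are never compared.
    in_window = [
        (abs(entry["sent_at"] - inbound_ts), index, entry)
        for index, entry in enumerate(outbound)
        if "sent_at" in entry
        and abs(entry["sent_at"] - inbound_ts) <= window_s
    ]
    if in_window:
        return min(in_window)[2], "timestamp_window"
    return None, None
```

Returning the tier name alongside the entry is similar in spirit to tracking an xref_source, so the caller can decide which fields to leave null.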
Summary
Three endpoint flips at once, all reading from CC#3's
${KORA_HOME}/kora_audit_log.jsonl (KR-AUDIT-JSONL-SINK, PR #139).
Backend-only: no FE changes; TS interfaces already match the projected
shapes. Stub-then-real arc closes for agent-activity / reasoning /
webhook-events.

Shared reader (new)
* kora_cli/audit/jsonl_reader.py: read_audit_entries(seam, limit, since)
is the single helper all 3 endpoints use.
* Per-call get_kora_home() lookup so monkeypatch in tests works without
ContextVar plumbing (per #137's fixture lesson).
* Tolerates missing file, malformed JSON, non-dict lines, Pydantic
ValidationError per line.
* Newest-first by emitted_at.

Flip 1: /api/agent-activity/recent (mcp.tool_called)
* K-DG drift caught: spec said details.duration_ms / tool_status /
result_summary. Actual writer at mcp_tools.py:714-724 only emits
tool_name / tool_kind / caller_actor_kind / args_keys / result.
* Projection: duration_ms=0; status="ok" (audit only fires on success
path today; KR-MCP-RUNTIME-SURFACE follow-on extends);
result_summary=details.result (writer docstring confirms pre-filtered
short string).

Flip 2: /api/reasoning/recent (reasoning.tool_called)
* GROUPS by caller_session_id so a multi-tool reasoning iteration
collapses into ONE ReasoningCall row with
tools_used: [name1, name2, name3]. tools_used is a FE extension field;
existing TS ReasoningCall ignores it, and a follow-on FE bucket renders.
* Status derivation: all-ok → ok; any not_allowed → halted +
capability_denied; any execution_error → failed + handler_error.
* Limitations (per spec §2 Flip 2): model_used / input_tokens /
output_tokens / response_text_truncated_200 set to null; those fields
live in slack_dm_log.jsonl outbound entries, not audit. The
KR-REASONING-PANEL-MODEL-XREF follow-on bucket cross-references by
timestamp + thread_ts. cost_rung_at_call set to "unknown" (lowercase
Enum.value; preserves the K-DG pin from PR #132 against engine.py:47-49
CostLadderRungName literal). Keeping this PR small per spec §4.
* Aggregate counts (total_recent_24h, by_status_24h) use individual
audit rows, not groups, so headline reflects total reasoning activity
volume.

Flip 3: /api/webhooks/events/recent (webhook.dead_letter)
* Status pinned to "dead_letter" (this seam ONLY emits dead-letters;
verified happy-path events stay in chain log; rate-limited from slowapi
doesn't currently call emit_audit).
* SECURITY: source_ip OCTET-MASKED at the projection edge. Audit writer
at webhook_dead_letter.py:142 passes RAW peer_ip; endpoint enforces
_mask_ipv4_last_two_octets ("54.203.99.142" → "54.203.x.x") per the
KR-WEBHOOK-EVENTS #109 contract. IPv6 / unexpected shapes → "—"
defensively (never leaks unmasked).
* SECURITY: details sub-set to {reason, header_present}, never the full
audit details dict (which carries body_bytes / request_id / headers the
panel hasn't vetted).

4-layer security carry-forward
Preserved across all 3 endpoints:
* Walk-payload sweeps for Anthropic key shapes, Slack token shapes,
HMAC-secret shapes, full IPv4 leaks.
* caller_actor_kind label-shape pin (no hash/base64 runs).
* result_summary no-raw-JSON pin.
* cost_rung lowercase literal pin.
* details sub-set enforcement (webhook).

Fixture-isolation discipline (per #137 lesson)
All 3 endpoint test files monkeypatch get_kora_home in all 3 module
namespaces (kora_constants, kora_cli.config, kora_cli.web_server).
Reader tests use the pattern too. The reader's own resolution uses a
local kora_constants import so the test patch takes effect on a fresh
call.

Test plan
* tests/kora_cli/audit/test_jsonl_reader.py: 14 reader tests
(missing/empty file, seam filter, since filter, naive datetime UTC,
limit cap, malformed/non-dict/validation-error tolerance, blank lines,
env override).
* test_web_server_agent_activity.py: 11 tests (rewrite): empty, shape,
projection, seam filtering, ?limit, newest-first, walk-payload
SECURITY, label-shape caller_actor_kind, no-raw-JSON result_summary,
by_caller_24h reconciliation, cron-regression sanity.
* test_web_server_reasoning.py: 15 tests (rewrite): empty, shape,
single-tool, multi-row session grouping, separate sessions → separate
rows, duration_ms sum, status derivation (ok/halted/failed), seam
filtering, walk-payload SECURITY, lowercase cost_rung pin, ?limit on
groups, by_status_24h counts individual rows not groups,
cron-regression sanity.
* test_web_server_webhook_events.py: 15 tests (rewrite): empty, shape,
projection, status pin, source → endpoint mapping, IPv4 octet-mask,
walk-payload no-full-IPv4, IPv6/dash → "—" fallback, details sub-set
enforcement, seam filtering, newest-first, ?limit, cron-regression
sanity.
* … stub:false.

K-DG drift summary
* Spec: details.duration_ms / tool_status / result_summary. Actual:
result in writer. Handling: duration_ms=0; status="ok";
result_summary=details.result; flagged in endpoint docstring.
* Spec: cost_rung_at_call could be null. Handling: "unknown" (lowercase
Enum.value member); preserves PR #132 K-DG pin.

Refs
* rafe-walker/kora-docs → 17_cc_bucket_prompts/KR-AUDIT-PANEL-ENDPOINTS_three_flips.md

🤖 Generated with Claude Code