This repository was archived by the owner on May 26, 2026. It is now read-only.

feat(kora): KR-REASONING-PANEL-MODEL-XREF — populate model + tokens via xref #143

Merged
rafe-walker merged 1 commit into feature/phase2-upgrades from feat/kora-KR-REASONING-PANEL-MODEL-XREF on May 23, 2026

Conversation

@rafe-walker

Summary

Cited follow-on from PR #141: cross-references kora_audit_log.jsonl reasoning rows with slack_dm_log.jsonl outbound entries to populate the previously-null model_used / cost_rung_at_call / input_tokens / output_tokens / response_text_truncated_200 fields. Backend-only; TS interface already has the fields.

K-DG verified before drafting

Per PM instruction + the feedback_no_pm_memory_assertions_grep_yourself rule:

  • Audit caller_session_id shape is "{channel_id}:{event_ts}" for slack_dm-sourced groups per anthropic_engine.py:844-876.
  • slack_dm outbound writer at slack_dm_handler.py:753-833 does NOT include caller_session_id in the JSONL entry. The spec flagged this to verify ("PRIMARY: match by caller_session_id if outbound JSONL writes it") — it does NOT, so only the FALLBACK timestamp-window match is workable. Documented in the helper module.

Correlation algorithm:

  1. Parse audit caller_session_id as "{channel_id}:{event_ts}" for slack_dm sources
  2. Match outbound entries with same channel_id where thread_ts == event_ts (natural threading — Kora's reply threads under the user's inbound)
  3. Fallback: closest sent_at to group's latest emitted_at within ±60s window
  4. Other sources (email/mcp/unknown) → no xref attempt; row renders with null fields (graceful degradation)
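Steps 1, 2 and 4 can be sketched as below. This is a minimal illustration, not the code in reasoning_xref.py: the function names, the regex used to recognise a slack_dm session id, and the outbound field names `channel_id` / `thread_ts` as dictionary keys are assumptions made for the example.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Assumption: Slack channel ids are uppercase alphanumerics and Slack
# timestamps look like "1716480000.000100". Anything else (email / mcp /
# unknown session ids) is treated as "not slack_dm" and skips xref.
_SLACK_DM_SESSION_RE = re.compile(r"([A-Z0-9]+):(\d+\.\d+)")


def parse_slack_dm_session_id(session_id: object) -> Optional[Tuple[str, str]]:
    """Return (channel_id, event_ts), or None for non-slack_dm shapes."""
    if not isinstance(session_id, str):
        return None
    match = _SLACK_DM_SESSION_RE.fullmatch(session_id)
    if match is None:
        return None
    return match.group(1), match.group(2)


def find_thread_match(
    session_id: object, outbound_entries: Iterable[Dict[str, object]]
) -> Optional[Dict[str, object]]:
    """Step 2: same channel_id and thread_ts == event_ts."""
    parsed = parse_slack_dm_session_id(session_id)
    if parsed is None:
        return None  # step 4: no xref attempt, caller renders null fields
    channel_id, event_ts = parsed
    for entry in outbound_entries:
        if entry.get("channel_id") == channel_id and entry.get("thread_ts") == event_ts:
            return entry
    return None
```

Returning None for both "not a slack_dm id" and "no threaded reply found" keeps the caller's degradation path identical in the two cases.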

What's in here

Graceful degradation

When xref fails for a group (no slack_dm match, or outside the ±60s window), the row still renders with null model_used / 0 tokens / null response_text — same shape as PR #141's pre-xref output. The FE handles both without conditional logic.
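A rough sketch of why the FE needs no conditional logic: the projected row carries the same keys whether or not a match was found. The helper name and the outbound keys (`model_used`, `input_tokens`, `output_tokens`, `text`) are assumptions for illustration; `cost_rung_at_call` is left out to keep the example short.

```python
from typing import Any, Dict, Optional


def project_reasoning_row(
    audit_row: Dict[str, Any], outbound: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Build the response row; `outbound` is None when xref failed."""
    row = dict(audit_row)
    # Degraded defaults: same shape as the pre-xref output.
    row.update(
        {
            "model_used": None,
            "input_tokens": 0,
            "output_tokens": 0,
            "response_text_truncated_200": None,
        }
    )
    if outbound is not None:
        row["model_used"] = outbound.get("model_used")
        row["input_tokens"] = int(outbound.get("input_tokens") or 0)
        row["output_tokens"] = int(outbound.get("output_tokens") or 0)
        text = outbound.get("text")  # hypothetical key for the reply body
        row["response_text_truncated_200"] = text[:200] if isinstance(text, str) else None
    return row
```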

4-layer security carry-forward + xref-specific

Test plan

  • tests/kora_cli/audit/test_reasoning_xref.py (27 tests): session-id parsing (slack_dm happy path + email/mcp/unknown degrade), cost-rung derivation (5 cases), response-text truncation (4 cases), graceful degradation, thread_ts match, timestamp-window fallback, outside-window degrades, different-channel ignored, non-slack_dm session-ids skip xref, multiple groups each pick own match, long text truncation, cost_ladder_halted xref supersedes status, malformed slack_dm line tolerated, individual-row aggregate count semantics, walk-payload SECURITY, endpoint integration (xref success + degradation paths).
  • tests/kora_cli/test_web_server_reasoning.py (from PR #141, KR-AUDIT-PANEL-ENDPOINTS): 15/15 still pass (graceful degradation preserves old behavior).
  • Full admin-panel regression: 384/384 across 28 suites.

Refs

Follow-on cited

KR-REASONING-PANEL-EMAIL-XREF — when Feature 3 email reasoning replies write to email_outbound_log.jsonl, extend xref to handle email-source session ids.

🤖 Generated with Claude Code

…ia xref

Cited follow-on from PR #141: cross-references kora_audit_log.jsonl
reasoning rows with slack_dm_log.jsonl outbound entries to populate
the previously-null model_used / cost_rung_at_call / input_tokens /
output_tokens / response_text_truncated_200 fields. Backend-only;
TS interface already has the fields.

K-DG verified BEFORE drafting (per PM instruction + the
feedback_no_pm_memory_assertions_grep_yourself rule):

  * Audit caller_session_id shape is
    "{channel_id}:{event_ts}" for slack_dm-sourced groups per
    kora_cli/reasoning/anthropic_engine.py:844-876.

  * slack_dm outbound writer at
    kora_cli/handlers/slack_dm_handler.py:753-833 does NOT
    include caller_session_id in the JSONL entry. The spec
    flagged this to verify ("PRIMARY: match by caller_session_id
    if outbound JSONL writes it") — it does NOT, so only the
    FALLBACK timestamp-window match is workable. Documented in
    the helper module's docstring.

  * Correlation algorithm (the only path given the above):
      1. Parse audit caller_session_id as
         "{channel_id}:{event_ts}" for slack_dm sources
      2. Match outbound entries with same channel_id where
         thread_ts == event_ts (natural threading match —
         Kora's reply threads under the user's inbound)
      3. Fallback: closest sent_at to group's latest emitted_at
         within ±60s window
      4. Other sources (email/mcp/unknown) → no xref attempt;
         row renders with null fields (graceful degradation)
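The step 3 fallback could look roughly like the sketch below. It assumes `sent_at` and `emitted_at` are ISO-8601 strings with matching timezone handling and that the ±60s window is inclusive; neither detail is confirmed by this PR, and the function name is made up for the example.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, Optional

XREF_WINDOW = timedelta(seconds=60)


def closest_within_window(
    channel_id: str,
    latest_emitted_at: str,
    outbound_entries: Iterable[Dict[str, object]],
    window: timedelta = XREF_WINDOW,
) -> Optional[Dict[str, object]]:
    """Pick the same-channel outbound whose sent_at is nearest the group."""
    target = datetime.fromisoformat(latest_emitted_at)
    best, best_delta = None, None
    for entry in outbound_entries:
        if entry.get("channel_id") != channel_id:
            continue  # different-channel outbound is ignored
        try:
            delta = abs(datetime.fromisoformat(entry["sent_at"]) - target)
        except (KeyError, TypeError, ValueError):
            continue  # missing or unparseable timestamp: skip the entry
        if delta <= window and (best_delta is None or delta < best_delta):
            best, best_delta = entry, delta
    return best  # None means "outside window", i.e. degrade to null fields
```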

New module: kora_cli/audit/reasoning_xref.py

  * load_reasoning_calls_with_xref(limit) → (calls, raw_count)
  * _parse_slack_dm_session_id → channel_id, event_ts tuple
  * _find_xref_for_slack_dm_group → matched outbound or None
  * _derive_cost_rung(model_used, reasoning_error):
      - cost_ladder_halted → "hard_stop_100" (supersedes model)
      - "opus" substring → "normal"
      - "sonnet" substring → "warn_75"
      - "haiku" substring → "downshift_90"
      - unmapped / missing → "unknown"
    Lowercase Enum.value strings per PR #132 / #141 K-DG pin.
    Substring match so future model revs (e.g. opus-4-7 → opus-5-0)
    keep mapping correctly without code changes.
  * _truncate_response_text → 200-char cap matching the field
    name's semantic contract; full text stays in slack_dm_log
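For reference, the rung mapping and the truncation rule listed above can be sketched as follows. This is an illustration only: it assumes `reasoning_error` is a plain string, that matching is case-insensitive, and that truncation is a bare slice with no ellipsis appended. The real helpers are underscore-prefixed and may differ.

```python
from typing import Optional


def derive_cost_rung(model_used: Optional[str], reasoning_error: Optional[str]) -> str:
    """Map a model name to a lowercase CostRung value via substring match."""
    if reasoning_error == "cost_ladder_halted":
        return "hard_stop_100"  # supersedes whatever model was recorded
    name = (model_used or "").lower()
    if "opus" in name:
        return "normal"
    if "sonnet" in name:
        return "warn_75"
    if "haiku" in name:
        return "downshift_90"
    return "unknown"  # unmapped or missing model


def truncate_response_text(text: Optional[str], cap: int = 200) -> Optional[str]:
    """Cap response text at `cap` characters; None stays None."""
    if text is None:
        return None
    return text[:cap]
```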

Endpoint update: kora_cli/web_server.py /api/reasoning/recent

  * Replaces the inline projection from PR #141 with a call to
    load_reasoning_calls_with_xref(). Same shape returned; just
    populates the previously-null fields when xref succeeds.
  * by_model_24h + tokens_total_24h aggregates now reflect xref
    enrichment (no longer always {} / {0,0}).
  * Aggregate counts continue to operate on INDIVIDUAL audit rows
    per the PR #141 rationale (headline reflects activity volume).
  * Graceful degradation preserved: when xref fails for a group
    (no slack_dm match), row still renders with null fields,
    matching pre-xref behavior exactly so the FE handles both
    without conditional logic.

4-layer SECURITY contract (carry-forward + xref-specific):
  * Cross-ref doesn't add security risk — model_used + token
    counts are inherently safe metadata.
  * response_text_truncated_200 IS Joshua-content (intentional
    carve-out from PII regex sweep — same shape as PR #141's
    message_id carve-out + slack_dm panel's text carve-out).
    Plain-text rendering already enforced FE-side via
    dangerouslySetInnerHTML ban from PR #132.
  * Walk-payload sweep for Anthropic key shapes + HMAC secrets
    continues; response_text excluded from PII regex (carve-out).
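A sketch of a walk-payload sweep with this carve-out: secret shapes are checked on every string, while the PII pattern is skipped under the carved-out key. Both regexes are stand-ins chosen for the example (an `sk-ant-` prefix check and a simple email pattern), and the function name is hypothetical; the repository's actual patterns, including the HMAC one, are not shown in this PR.

```python
import re
from typing import Any, List, Tuple

# Stand-in patterns, assumed for illustration only.
SECRET_RE = re.compile(r"sk-ant-[A-Za-z0-9_-]{16,}")
PII_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PII_CARVE_OUT_KEYS = {"response_text_truncated_200"}


def sweep_payload(payload: Any, path: str = "$", pii_exempt: bool = False) -> List[Tuple[str, str]]:
    """Recursively collect (json_path, kind) hits from a response payload."""
    hits: List[Tuple[str, str]] = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            exempt = pii_exempt or key in PII_CARVE_OUT_KEYS
            hits += sweep_payload(value, f"{path}.{key}", exempt)
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            hits += sweep_payload(item, f"{path}[{index}]", pii_exempt)
    elif isinstance(payload, str):
        if SECRET_RE.search(payload):
            hits.append((path, "secret"))  # never exempt, even in carve-outs
        if not pii_exempt and PII_RE.search(payload):
            hits.append((path, "pii"))
    return hits
```

A test would then assert that the sweep of the endpoint response returns an empty list.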

Tests (tests/kora_cli/audit/test_reasoning_xref.py — 27 tests):
  * _parse_slack_dm_session_id: happy path + all other-source
    shapes degrade
  * _derive_cost_rung: opus/sonnet/haiku/cost_ladder_halted/
    unmapped/missing
  * _truncate_response_text: under-cap / at-cap / over-cap / None
  * Graceful degradation: audit-only (no slack_dm log) → null fields
  * Successful xref: thread_ts match populates fields
  * Timestamp-window fallback when thread_ts mismatches
  * Outside ±60s window → no xref
  * Different-channel outbound ignored
  * Email/mcp session ids skip xref (no slack_dm target)
  * Multiple groups each pick own match
  * Long response_text truncated to 200 chars
  * cost_ladder_halted xref supersedes audit status
  * Malformed slack_dm line tolerated (log + skip)
  * raw_in_window_count uses individual rows not groups (#141 rule)
  * SECURITY walk-payload (Anthropic/HMAC) with response_text
    carve-out
  * Endpoint integration: xref populates fields end-to-end
  * Endpoint integration: graceful degradation when slack_dm absent
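The "log + skip" behaviour for malformed slack_dm lines can be pictured with a tolerant JSONL parser like the one below. It is a generic sketch under the assumption that each valid line is a JSON object; the function name is hypothetical and this is not the jsonl_reader from PR #141.

```python
import json
import logging
from typing import Any, Dict, Iterable, List

logger = logging.getLogger(__name__)


def parse_jsonl_lines(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse JSONL, logging and skipping lines that are not JSON objects."""
    entries: List[Dict[str, Any]] = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are not an error
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            logger.warning("skipping malformed JSONL line %d", lineno)
            continue
        if not isinstance(obj, dict):
            logger.warning("skipping non-object JSONL line %d", lineno)
            continue
        entries.append(obj)
    return entries
```

Because it takes any iterable of strings, it can be fed an open file handle directly, e.g. `parse_jsonl_lines(open(path, encoding="utf-8"))`.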

Existing test_web_server_reasoning.py (#141) — 15/15 still pass.
Fixture-isolation per #137 + #141 lesson preserved (monkeypatch
get_kora_home in all 3 module namespaces).

Full admin-panel regression: 384/384 across 28 suites.

Refs:
  * rafe-walker/kora-docs 17_cc_bucket_prompts/KR-REASONING-PANEL-MODEL-XREF_cost_visibility.md
  * PR #141 — KR-AUDIT-PANEL-ENDPOINTS (audit endpoint + jsonl_reader)
  * PR #131 — KR-FEAT-AI-RESPONSE-LOOP ST2 (model_used + tokens in
    slack_dm outbound)
  * PR #136 — KR-FEAT-AGENTIC-REASONING ST2 (tools_used)
  * PR #132 — KR-REASONING-PANEL (lowercase CostRung pin)
  * PR #137 — KR-SLACK-DM-PANEL-FLIP (fixture-isolation pattern)

Follow-on cited: KR-REASONING-PANEL-EMAIL-XREF — when Feature 3
email reasoning replies write to email_outbound_log.jsonl, extend
xref to handle email-source session ids.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rafe-walker merged commit f464ff3 into feature/phase2-upgrades on May 23, 2026
rafe-walker deleted the feat/kora-KR-REASONING-PANEL-MODEL-XREF branch on May 23, 2026, 20:04
rafe-walker added a commit that referenced this pull request May 23, 2026
…ound (#148)

CC#2 follow-on after CC#1 KR-EMAIL-OUTBOUND-REASONING-META (#146) unblocked the gap her STOP-ASK caught.

- 2 files, +737/-7: extension to reasoning_xref.py (email path: loader + parser + 3-tier matcher) + 20 new email-specific tests.

All 3 K-DG gates verified before drafting per re-dispatch: send_email kwargs ✓; opt-in writer ✓; caller_session_id literal format symmetry between handler + engine ✓.

3-tier cascade: PRIMARY caller_session_id literal equality (closed by #146) → SECONDARY in_reply_to chain → LAST RESORT ±60s timestamp window.
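A simplified sketch of such a cascade is below. Everything beyond the three tier names is an assumption: the function name, the `message_id` / `emitted_at` keys on the audit group, ISO-8601 timestamps, and a single-hop `in_reply_to` comparison standing in for the full chain walk.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional, Tuple

Match = Tuple[Optional[Dict[str, Any]], Optional[str]]


def match_email_outbound(
    group: Dict[str, Any],
    outbound_entries: List[Dict[str, Any]],
    window: timedelta = timedelta(seconds=60),
) -> Match:
    """Return (entry, tier_name), or (None, None) when nothing matches."""
    # PRIMARY: literal caller_session_id equality.
    session_id = group.get("caller_session_id")
    if session_id:
        for entry in outbound_entries:
            if entry.get("caller_session_id") == session_id:
                return entry, "caller_session_id"
    # SECONDARY: outbound replies to the group's inbound message (one hop).
    message_id = group.get("message_id")
    if message_id:
        for entry in outbound_entries:
            if entry.get("in_reply_to") == message_id:
                return entry, "in_reply_to"
    # LAST RESORT: closest sent_at within the window.
    try:
        target = datetime.fromisoformat(group["emitted_at"])
    except (KeyError, TypeError, ValueError):
        return None, None
    best, best_delta = None, None
    for entry in outbound_entries:
        try:
            delta = abs(datetime.fromisoformat(entry["sent_at"]) - target)
        except (KeyError, TypeError, ValueError):
            continue
        if delta <= window and (best_delta is None or delta < best_delta):
            best, best_delta = entry, delta
    return (best, "timestamp_window") if best is not None else (None, None)
```

Returning the tier name alongside the entry is one way to keep track of which path produced the match.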

Slack-first precedence preserved: existing #141/#143 tests (42/42) still pass without modification.

response_text carve-out for email-sourced rows: stays null per #124 design (body never in email JSONL); same shape as slack_dm text + #143 message_id carve-outs. Tracked via xref_source local so the conditional null-set cannot regress to populating from a future field rename.

400/400 admin-panel + audit tests pass across 29 suites.
rafe-walker added a commit that referenced this pull request May 23, 2026
#151)

Pure refactor. Banked observation from CC#2 #143 ship report. ~300 LOC removal + centralized behavior.

Part A: web/src/lib/panelHelpers.ts with formatRelative / formatTimestamp / formatLatency (all nullable signatures); 7 panels updated. HeartbeatPanel keeps a thin "never checked" wrapper but delegates time-math to the shared helper.

Part B: tests/kora_cli/_panel_test_helpers.py with strip_ts_comments / assert_no_token_shapes / isolated_kora_home; 5 test files swap inline strip_ts_comments, 6 swap inline 3-namespace fixture.

Heartbeat source-pin updated: pinning moved from local function signatures to canonical panelHelpers.ts signatures + verification that HeartbeatPanel imports from shared module.

434/434 admin-panel + audit tests pass; tsc -b + vite build clean.

Other observations (empty-state convergence / TZ rendering / Show More) stay as separate banked micro-buckets per PM default.