KR-PROBE-INVESTIGATION-DATA-COMPLETION — close #171 V1NotesBanner gaps by rafe-walker · Pull Request #184 · rafe-walker/kora

rafe-walker · 2026-05-24T05:04:47Z

Summary

Bundles the three follow-on gaps CC#2 surfaced honestly via V1NotesBanner in PR #171:

- DM events don't appear in slack_dm_log.jsonl: wake_consumer bypassed the SlackDMHandler outbound log
- Per-call cost/model exists only in aggregate CostTelemetry; nothing is recorded per investigation
- Investigation summary text is sent to Slack but not persisted

After this lands, the probe-investigation viewer (#171) can render dm_sent time + investigation cost + summary text per row by joining four audit streams on the same caller_session_id.

Bucket spec: 17_cc_bucket_prompts/KR-PROBE-INVESTIGATION-DATA-COMPLETION_close_171_data_gaps.md.

K-DG findings

_append_outbound_log_entry is safely extractable. The only instance state it touches is self._log_path — pure I/O after that. Extracted to a free function append_outbound_log_entry(log_path, …) in kora_cli/handlers/slack_dm_handler.py; the handler's instance method now delegates. Byte-identical JSONL rows regardless of caller. Added a public resolve_slack_dm_log_path() accessor so non-handler call sites don't depend on a private helper.
caller_session_id derivation matches engine. _derive_caller_session_id in anthropic_engine.py:1546 already emits "probe:{probe}:{category}" for source="probe_investigation"; wake_consumer reuses that exact shape, so the 4 streams join on the same key without coordination.
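
A minimal sketch of the extraction pattern described above, not the code from this PR. The `entry` parameter, the `SlackDMHandler` stand-in, and the `probe_caller_session_id` helper are assumptions for illustration; the real `append_outbound_log_entry(log_path, …)` signature is elided in this description.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def append_outbound_log_entry(log_path: Path, entry: dict[str, Any]) -> None:
    """Append one entry as a single JSONL row. Touches no instance state."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")


class SlackDMHandler:
    """Stand-in for the real handler; only the log path matters here."""

    def __init__(self, log_path: Path) -> None:
        self._log_path = log_path

    def _append_outbound_log_entry(self, entry: dict[str, Any]) -> None:
        # The instance method only supplies self._log_path and delegates.
        append_outbound_log_entry(self._log_path, entry)


def probe_caller_session_id(probe: str, category: str) -> str:
    """Hypothetical helper reproducing the "probe:{probe}:{category}" shape."""
    return f"probe:{probe}:{category}"


# Write the same entry through both call paths and compare the raw bytes.
with tempfile.TemporaryDirectory() as tmp:
    entry = {
        "send_status": "ok",
        "caller_session_id": probe_caller_session_id("fly", "service_unhealthy"),
    }
    handler_path = Path(tmp) / "handler.jsonl"
    free_path = Path(tmp) / "free.jsonl"
    SlackDMHandler(handler_path)._append_outbound_log_entry(entry)
    append_outbound_log_entry(free_path, entry)
    handler_bytes = handler_path.read_bytes()
    free_bytes = free_path.read_bytes()
```

Because the method body is a single delegating call, both paths serialize through the same code, which is what makes the rows byte-identical.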

Cost computation choice

Chose agent.usage_pricing.estimate_usage_cost over the cost_telemetry.snapshot() alternative. Two reasons:

- Per-call precision. The telemetry snapshot is aggregated across windows, so there's no per-investigation row to fetch. Picking "the most recent matching call" is racy under concurrent investigations.
- Accounting lockstep. estimate_usage_cost is the same calculation agent.cost_state_holder.record_inference runs to bill the cost-ladder. Reusing it keeps the audit row in lockstep with the holder's daily-spend rung, so an operator can reconcile SUM(audit_row.total_cost_usd) by day against the cost-ladder without rounding drift.

The cost helper returns None when the model is unknown to the pricing registry (e.g., a custom OpenRouter slug) or when usage_pricing raises. The total_cost_usd field is always present, so the panel can render "(unknown)" without a key error.
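
The None-on-failure behavior can be sketched as below. This is an illustration under assumptions: the estimator is injected as a callable because the real `estimate_usage_cost` signature isn't shown here, and `toy_estimate`, `broken_estimate`, and the rates in `TOY_RATES` are made up.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical estimator signature: (model, usage) -> cost in USD, or None.
Estimator = Callable[[str, Dict[str, int]], Optional[float]]

TOKEN_KEYS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def compute_total_cost_usd(meta: Dict[str, Any], estimate: Estimator) -> Optional[float]:
    """Return a per-investigation cost, or None when it cannot be priced."""
    model = meta.get("model_used")
    if not model:
        return None
    # Missing or None token counts are treated as zero.
    usage = {key: int(meta.get(key) or 0) for key in TOKEN_KEYS}
    try:
        return estimate(model, usage)
    except Exception:
        # Fail soft: a pricing error must not break the audit emit.
        return None


# Made-up per-million-token rates, for illustration only.
TOY_RATES = {"toy-model": {"input_tokens": 1.0, "output_tokens": 5.0}}


def toy_estimate(model: str, usage: Dict[str, int]) -> Optional[float]:
    rates = TOY_RATES.get(model)
    if rates is None:
        return None  # model unknown to the (toy) pricing registry
    return sum(usage[key] * rate for key, rate in rates.items()) / 1_000_000


def broken_estimate(model: str, usage: Dict[str, int]) -> Optional[float]:
    raise RuntimeError("pricing backend unavailable")
```

The caller always writes the returned value into the audit row, so the key exists even when the value is None.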

Sample 4-stream audit trace (one synthetic investigation)

All four entries share caller_session_id="probe:fly:service_unhealthy".

Stream 1 — probe.wake_requested (emitted by probe runner per #163):

{
  "seam": "probe.wake_requested",
  "details": {"probe": "fly", "severity": "critical", "category": "service_unhealthy",
              "title": "Fly app(s) unreachable: HTTP 401",
              "envelope_enabled": true, "envelope_fix_name": "restart_unhealthy_machine"},
  "caller_session_id": "probe:fly:service_unhealthy",
  "source": "cron"
}

Stream 2 — tool.probe_autofix_attempted (emitted by kora__attempt_probe_autofix per #182, during the in-flight investigation):

{
  "seam": "tool.probe_autofix_attempted",
  "details": {"probe": "fly", "action": "restart_machine", "target_id": "1781e9f6c12d83",
              "reason_from_reasoning": "machine state stopped for 3 consecutive probe cycles",
              "status": "attempted", "before_state": {"state": "stopped"},
              "after_state": {"state": "started"}, "executor_duration_ms": 842},
  "caller_session_id": "probe:fly:service_unhealthy",
  "source": "reasoning"
}

Stream 3 — probe.investigation_completed (NEW, this PR):

{
  "seam": "probe.investigation_completed",
  "details": {
    "probe": "fly", "issue_category": "service_unhealthy", "severity": "critical",
    "model_used": "claude-haiku-4-5-20251001",
    "input_tokens": 1200, "output_tokens": 250,
    "cache_creation_input_tokens": 0, "cache_read_input_tokens": 800,
    "total_cost_usd": 0.00123,
    "investigation_duration_ms": 3147,
    "investigation_summary_text": "Fly machine 1781e9f6c12d83 was in 'stopped' state for the last 3 probe cycles. I tried restart_machine via the envelope; before=stopped, after=started in 842ms. Next probe cycle (≤5 min) will confirm health holds. If it flaps back, check fly logs for OOM patterns before another restart.",
    "dm_status": "sent",
    "autofix_attempted": true
  },
  "caller_session_id": "probe:fly:service_unhealthy",
  "source": "reasoning"
}

Stream 4 — slack_dm_log.jsonl outbound entry (NEW path via free function):

{
  "sent_at": "2026-05-24T00:14:09.412+00:00",
  "channel_id": "U01JOSHUA",
  "thread_ts": null,
  "text": "🚨 Probe alert · fly\n[reasoning text body]",
  "slack_message_ts": "1742345059.123456",
  "send_status": "ok",
  "model_used": "claude-haiku-4-5-20251001",
  "input_tokens": 1200,
  "output_tokens": 250,
  "reasoning_duration_ms": 3147,
  "cache_read_input_tokens": 800,
  "caller_session_id": "probe:fly:service_unhealthy"
}

dm_status values in the completed audit row: sent / failed_send / engine_unavailable_fallback / engine_unavailable_failed_send (combined fallback + send-failure path so CC#2 can branch cleanly).
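
One way to read the four values is as the product of two booleans. The sketch below assumes exactly that mapping; the function name and its parameters are hypothetical and not taken from wake_consumer.

```python
from typing import Literal

DmStatus = Literal[
    "sent",
    "failed_send",
    "engine_unavailable_fallback",
    "engine_unavailable_failed_send",
]


def derive_dm_status(engine_available: bool, send_ok: bool) -> DmStatus:
    """Map (engine reachable?, DM delivered?) onto the four dm_status values."""
    if engine_available:
        return "sent" if send_ok else "failed_send"
    # Engine unavailable: a fallback DM was attempted instead.
    if send_ok:
        return "engine_unavailable_fallback"
    return "engine_unavailable_failed_send"
```

Keeping the combined case as its own value lets a consumer branch on a single field instead of inspecting two.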

Privacy posture

investigation_summary_text is recorded verbatim. This follows the #182 precedent (autofix reason_from_reasoning is recorded verbatim because operator triage of "what did Kora decide and why" is the primary use case) and differs from #179 (email body redacted because it is user-supplied text). Kora's investigation summary is Kora-composed, so it doesn't echo back arbitrary unsanitized strings from external sources.

CC#2 follow-on recommendation: `KR-FE-PROBE-INVESTIGATION-VIEWER-V2`

With the BE plumbing in place, CC#2 can:

1. Update /api/probe-investigations to JOIN on caller_session_id:
   - dm_sent time from slack_dm_log.jsonl (sent_at of the send_status="ok" entry)
   - investigation.summary from the probe.investigation_completed audit row
   - investigation.cost_usd + investigation.model_used from same
   - investigation.autofix_attempted boolean for the badge column
2. Remove the three V1NotesBanner entries that were specific to these gaps (Per-call cost not yet displayed, DM sent time not yet populated, Investigation summary not yet shown).
3. Add a dm_status chip filter (operator's primary triage lens; failed_send should be the top filter).

The wire shape is forward-compatible: total_cost_usd may be null when pricing is unknown, and the renderer handles this with "(unknown)". The same applies to model_used on the engine-unavailable path.
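
A rough sketch of the join, assuming the entries are already parsed into dicts shaped like the sample trace. `build_investigation_row` and its output keys are hypothetical. Since the key is `probe:{probe}:{category}`, repeated investigations of the same probe and category share it; this sketch simply takes the latest match, and a real endpoint would presumably also bound the match by time.

```python
from typing import Any, Callable, Dict, List, Optional

Entry = Dict[str, Any]


def _latest(entries: List[Entry], predicate: Callable[[Entry], bool]) -> Optional[Entry]:
    """Return the last entry matching the predicate, or None."""
    matches = [e for e in entries if predicate(e)]
    return matches[-1] if matches else None


def build_investigation_row(
    session_id: str, audit_entries: List[Entry], dm_log_entries: List[Entry]
) -> Entry:
    completed = _latest(
        audit_entries,
        lambda e: e.get("seam") == "probe.investigation_completed"
        and e.get("caller_session_id") == session_id,
    )
    dm = _latest(
        dm_log_entries,
        lambda e: e.get("caller_session_id") == session_id
        and e.get("send_status") == "ok",
    )
    details = completed["details"] if completed else {}
    cost = details.get("total_cost_usd")
    return {
        "caller_session_id": session_id,
        "dm_sent_at": dm["sent_at"] if dm else None,
        "summary": details.get("investigation_summary_text"),
        "cost_usd": cost,
        # A null cost renders as "(unknown)" instead of raising.
        "cost_label": "(unknown)" if cost is None else f"${cost:.5f}",
        "model_used": details.get("model_used"),
        "autofix_attempted": bool(details.get("autofix_attempted")),
    }


SESSION = "probe:fly:service_unhealthy"
audit = [
    {"seam": "probe.wake_requested", "caller_session_id": SESSION, "details": {}},
    {
        "seam": "probe.investigation_completed",
        "caller_session_id": SESSION,
        "details": {
            "model_used": "claude-haiku-4-5-20251001",
            "total_cost_usd": 0.00123,
            "investigation_summary_text": "restart ok",
            "autofix_attempted": True,
        },
    },
]
dm_log = [
    {"caller_session_id": SESSION, "send_status": "ok",
     "sent_at": "2026-05-24T00:14:09.412+00:00"},
]
row = build_investigation_row(SESSION, audit, dm_log)
```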

Files

MOD kora_cli/handlers/slack_dm_handler.py — extracted append_outbound_log_entry free function + resolve_slack_dm_log_path public accessor; the existing instance method delegates
MOD kora_cli/probes/wake_consumer.py — _send_operator_dm_routed writes to slack_dm_log; _emit_investigation_completed writes the new audit seam; module-level helpers for meta projection, cost computation, autofix back-reference
MOD kora_cli/audit/jsonl_sink.py — new probe.investigation_completed SeamName Literal entry
MOD tests/kora_cli/probes/test_wake_consumer.py — 9 new tests covering happy path / engine_unavailable_fallback / failed_send / combined fallback+failed / autofix back-reference (positive + negative) / audit-emit failure swallowed / 4-stream caller_session_id consistency / cost helper (missing model / zero tokens / pricing exception)

Test plan

37 wake_consumer tests pass (28 existing + 9 new)
Regression: 401 passed across probes + handlers + audit + tools + reasoning
ruff check clean on all changed files

🤖 Generated with Claude Code

…esBanner gaps

Bundles the three follow-on gaps CC#2 surfaced honestly via V1NotesBanner in PR #171 so the probe-investigation viewer can render dm_sent / per-investigation cost / summary text per row.

Three changes
-------------

1. Route wake_consumer DM through SlackDMHandler outbound-log

* Extracted `_append_outbound_log_entry` from the handler into a free function `append_outbound_log_entry(log_path, ...)` in `kora_cli/handlers/slack_dm_handler.py`. The handler's instance method now delegates: byte-identical JSONL rows regardless of caller.
* Added `resolve_slack_dm_log_path()` public accessor so non-handler callers don't depend on the private helper.
* `wake_consumer._send_operator_dm_routed()` calls the free function with the probe's `caller_session_id` (`probe:{probe}:{category}`) so CC#2's audit-stream join can light up. Failure paths (slack client unavailable, channel_id unset, post_dm raises) all write a `send_status="failed"` row so the panel renders the failure too.

2. New audit seam `probe.investigation_completed`

* Added to `SeamName` Literal next to `tool.probe_autofix_attempted`.
* Emitted once per dispatched investigation (success + fallback paths alike). Fields per spec:
    probe / issue_category / severity
    model_used / input_tokens / output_tokens / cache_creation_input_tokens / cache_read_input_tokens
    total_cost_usd / investigation_duration_ms
    investigation_summary_text (VERBATIM: operator-decision-relevant per the #182 reason-field precedent; Kora-composed, no external-string leakage)
    dm_status ∈ {sent, failed_send, engine_unavailable_fallback, engine_unavailable_failed_send}
    autofix_attempted (back-reference: did tool.probe_autofix_attempted fire with the same caller_session_id since investigation_started_at?)
    reasoning_error (when set)
* source="reasoning" (probe_wake_consumer isn't in SourceName Literal; reasoning is the closest semantic match + matches the autofix seam's attribution).
* caller_session_id="probe:{probe}:{category}", which matches the engine-side derivation in anthropic_engine._derive_caller_session_id for `probe_investigation` source, so the 4 audit streams (wake / autofix-attempted / completed / slack_dm_log) all join on the same key.

3. Cost computation

* `_compute_total_cost_usd(meta)` uses `agent.usage_pricing.estimate_usage_cost`, the same call `cost_state_holder.record_inference` runs to bill the cost-ladder. Reuse vs. duplication keeps audit-sum-by-day and ladder accounting in lockstep without rounding drift.
* Rejected the cost_telemetry snapshot alternative the spec mentioned: aggregated, no per-investigation row to fetch, racy under concurrent investigations.
* Returns None when model unknown to pricing registry OR usage_pricing raises. The field is always present (panel can render "(unknown)" without key-error).

Helpers (module-level, pure)
----------------------------

* `_reasoning_meta_from_result(result)`: projects ResponseResult into the 5-key meta dict the outbound log + completed audit share. Tolerant of None / partial attribute presence.
* `_autofix_attempted_during(caller_session_id, since)`: reads recent audit JSONL via `read_audit_entries(seam=..., since=...)` and matches caller_session_id. Fail-soft on read error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
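
The fail-soft back-reference described for `_autofix_attempted_during` could look roughly like this. It is a sketch under assumptions: the reader is injected as a callable with the `seam=` and `since=` keywords named above, and everything else about `read_audit_entries` (return type, timestamp handling) is guessed.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Iterable

AUTOFIX_SEAM = "tool.probe_autofix_attempted"

# Hypothetical reader type: called with seam=... and since=..., yields entries.
Reader = Callable[..., Iterable[Dict[str, Any]]]


def autofix_attempted_during(
    caller_session_id: str, since: datetime, read_entries: Reader
) -> bool:
    """True if an autofix attempt with this session id was logged since `since`."""
    try:
        entries = read_entries(seam=AUTOFIX_SEAM, since=since)
        return any(e.get("caller_session_id") == caller_session_id for e in entries)
    except Exception:
        # Fail soft: a read error reports "no autofix" instead of raising.
        return False


def fake_reader(seam: str, since: datetime) -> Iterable[Dict[str, Any]]:
    # Stand-in reader returning one pre-filtered autofix entry.
    return [{"seam": seam, "caller_session_id": "probe:fly:service_unhealthy"}]


def failing_reader(seam: str, since: datetime) -> Iterable[Dict[str, Any]]:
    raise OSError("audit log unreadable")


started_at = datetime(2026, 5, 24, 0, 14, tzinfo=timezone.utc)
```

Swallowing the read error trades accuracy for availability: the completed row is still emitted, with autofix_attempted reported as false.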

rafe-walker merged commit 266f159 into feature/phase2-upgrades May 24, 2026

rafe-walker deleted the feat/kora-KR-PROBE-INVESTIGATION-DATA-COMPLETION branch May 24, 2026 05:10

This was referenced May 24, 2026

feat(kora): KR-FE-PROMOTION-REVIEW-AND-V2-MEGABUCKET — review panel + probe V2 + kora-actions extended #191

Merged

feat(kora): KR-ALERT-WAKE-AND-EMAIL-INTENT-PROMOTION-AND-ROUTER-LOOSEN-MEGABUCKET — 3 deliverables #197

Merged
