This repository was archived by the owner on May 26, 2026. It is now read-only.
KR-PROBE-INVESTIGATION-DATA-COMPLETION - close #171 V1NotesBanner gaps #184
Merged
rafe-walker merged 1 commit on May 24, 2026
…esBanner gaps

Bundles the three follow-on gaps CC#2 surfaced honestly via V1NotesBanner in PR #171 so the probe-investigation viewer can render dm_sent / per-investigation cost / summary text per row.

Three changes
-------------

1. Route wake_consumer DM through SlackDMHandler outbound-log

   * Extracted `_append_outbound_log_entry` from the handler into a free function `append_outbound_log_entry(log_path, ...)` in `kora_cli/handlers/slack_dm_handler.py`. The handler's instance method now delegates, so JSONL rows are byte-identical regardless of caller.
   * Added `resolve_slack_dm_log_path()` public accessor so non-handler callers don't depend on the private helper.
   * `wake_consumer._send_operator_dm_routed()` calls the free function with the probe's `caller_session_id` (`probe:{probe}:{category}`) so CC#2's audit-stream join can light up. Failure paths (slack client unavailable, channel_id unset, post_dm raises) all write a `send_status="failed"` row so the panel renders the failure too.

2. New audit seam `probe.investigation_completed`

   * Added to `SeamName` Literal next to `tool.probe_autofix_attempted`.
   * Emitted once per dispatched investigation (success + fallback paths alike). Fields per spec:
       probe / issue_category / severity
       model_used / input_tokens / output_tokens
       cache_creation_input_tokens / cache_read_input_tokens
       total_cost_usd / investigation_duration_ms
       investigation_summary_text (VERBATIM: operator-decision-relevant per the #182 reason-field precedent; Kora-composed, no external-string leakage)
       dm_status ∈ {sent, failed_send, engine_unavailable_fallback, engine_unavailable_failed_send}
       autofix_attempted (back-reference: did tool.probe_autofix_attempted fire with the same caller_session_id since investigation_started_at?)
       reasoning_error (when set)
   * source="reasoning" (probe_wake_consumer isn't in SourceName Literal; reasoning is the closest semantic match + matches the autofix seam's attribution).
   * caller_session_id="probe:{probe}:{category}" matches the engine-side derivation in anthropic_engine._derive_caller_session_id for `probe_investigation` source, so the 4 audit streams (wake / autofix-attempted / completed / slack_dm_log) all join on the same key.

3. Cost computation

   * `_compute_total_cost_usd(meta)` uses `agent.usage_pricing.estimate_usage_cost`, the same call `cost_state_holder.record_inference` runs to bill the cost-ladder. Reuse vs. duplication keeps audit-sum-by-day and ladder accounting in lockstep without rounding drift.
   * Rejected the cost_telemetry snapshot alternative the spec mentioned: aggregated, no per-investigation row to fetch, racy under concurrent investigations.
   * Returns None when model unknown to pricing registry OR usage_pricing raises. The field is always present (panel can render "(unknown)" without key-error).

Helpers (module-level, pure)
----------------------------

* `_reasoning_meta_from_result(result)`: projects ResponseResult into the 5-key meta dict the outbound log + completed audit share. Tolerant of None / partial attribute presence.
* `_autofix_attempted_during(caller_session_id, since)`: reads recent audit JSONL via `read_audit_entries(seam=..., since=...)` and matches caller_session_id. Fail-soft on read error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
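
The extraction described in change 1 can be sketched as follows. This is only an illustration of the delegate-to-a-free-function pattern, not the repository's code: the single `entry` dict parameter, the `sort_keys` serialization, and the class body are assumptions, since the PR shows the real signature only as `append_outbound_log_entry(log_path, ...)`.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def append_outbound_log_entry(log_path: Path, entry: dict[str, Any]) -> None:
    """Append one JSONL row. The target path is the only state required.

    Hypothetical signature: the real function takes more arguments.
    """
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        # One serialization code path means one row format for every caller.
        fh.write(json.dumps(entry, sort_keys=True, ensure_ascii=False) + "\n")


class SlackDMHandler:
    """Stand-in for the real handler; only the logging delegation is shown."""

    def __init__(self, log_path: Path) -> None:
        self._log_path = log_path

    def _append_outbound_log_entry(self, entry: dict[str, Any]) -> None:
        # The instance method keeps its name but delegates to the free function.
        append_outbound_log_entry(self._log_path, entry)


def demo() -> list[str]:
    """Write the same entry via the handler and via the free function."""
    entry = {
        "send_status": "failed",
        "caller_session_id": "probe:fly:service_unhealthy",
    }
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "slack_dm_log.jsonl"
        SlackDMHandler(log_path)._append_outbound_log_entry(entry)  # handler path
        append_outbound_log_entry(log_path, entry)  # non-handler caller path
        return log_path.read_text(encoding="utf-8").splitlines()
```

Because both callers end up in the same function, the two rows returned by `demo()` are identical strings, which is the property the commit message calls byte-identical.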
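
The fail-soft behaviour of the cost helper from change 3 can be sketched like this. The real `agent.usage_pricing.estimate_usage_cost` signature is not shown in this PR, so the sketch injects a hypothetical `estimate` callable instead of calling it; the rate table and `demo_estimate` are invented purely for illustration.

```python
from typing import Any, Callable, Mapping, Optional

# Hypothetical estimator shape: (model, meta) -> cost in USD.
Estimator = Callable[[str, Mapping[str, Any]], float]

# Made-up (input, output) USD rates per million tokens, for the demo only.
_DEMO_RATES_PER_MTOK = {"demo-model": (1.0, 2.0)}


def demo_estimate(model: str, meta: Mapping[str, Any]) -> float:
    """Stand-in pricing lookup; raises KeyError for unknown models."""
    in_rate, out_rate = _DEMO_RATES_PER_MTOK[model]
    tokens_in = meta.get("input_tokens", 0)
    tokens_out = meta.get("output_tokens", 0)
    return (tokens_in * in_rate + tokens_out * out_rate) / 1_000_000


def compute_total_cost_usd(
    meta: Mapping[str, Any], estimate: Estimator = demo_estimate
) -> Optional[float]:
    """Return the cost, or None if the model is missing or pricing fails."""
    model = meta.get("model_used")
    if not model:
        return None
    try:
        return float(estimate(model, meta))
    except Exception:
        # Fail soft: a pricing problem must not break the audit emit.
        return None


def build_details(meta: Mapping[str, Any]) -> dict[str, Any]:
    """The key is always present, so a renderer can show "(unknown)" for None."""
    return {
        "model_used": meta.get("model_used"),
        "total_cost_usd": compute_total_cost_usd(meta),
    }
```

With the made-up rates above, one million input tokens plus half a million output tokens on `demo-model` cost 2.0, while an unknown model slug or an empty meta dict yields `None` rather than raising.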
Summary

Bundles the three follow-on gaps CC#2 surfaced honestly via V1NotesBanner in PR #171:

* `slack_dm_log.jsonl`: wake_consumer bypassed SlackDMHandler outbound-log

After this lands, the probe-investigation viewer (#171) can render `dm_sent` time + investigation cost + summary text per row by joining four audit streams on the same `caller_session_id`.

Bucket spec: `17_cc_bucket_prompts/KR-PROBE-INVESTIGATION-DATA-COMPLETION_close_171_data_gaps.md`.

K-DG findings

* `_append_outbound_log_entry` is safely extractable. The only instance state it touches is `self._log_path`; it is pure I/O after that. Extracted to a free function `append_outbound_log_entry(log_path, …)` in `kora_cli/handlers/slack_dm_handler.py`; the handler's instance method now delegates. Byte-identical JSONL rows regardless of caller. Added a public `resolve_slack_dm_log_path()` accessor so non-handler call sites don't depend on a private helper.
* `_derive_caller_session_id` in `anthropic_engine.py:1546` already emits `"probe:{probe}:{category}"` for `source="probe_investigation"`; wake_consumer reuses that exact shape, so the 4 streams join on the same key without coordination.

Cost computation choice
Chose `agent.usage_pricing.estimate_usage_cost` over the `cost_telemetry.snapshot()` alternative. Two reasons:

* `estimate_usage_cost` is the same calculation `agent.cost_state_holder.record_inference` runs to bill the cost-ladder. Reusing it keeps the audit row in lockstep with the holder's daily-spend rung, so the operator can reconcile `SUM(audit_row.total_cost_usd) by day` against the cost-ladder without rounding drift.
* `cost_telemetry.snapshot()` is aggregated: there is no per-investigation row to fetch, and it is racy under concurrent investigations.

Returns `None` when the model is unknown to the pricing registry (e.g., custom OpenRouter slug) OR `usage_pricing` raises. The field is always present so the panel can render "(unknown)" without a key-error.

Sample 4-stream audit trace (one synthetic investigation)
All four entries share `caller_session_id="probe:fly:service_unhealthy"`.

Stream 1 - `probe.wake_requested` (emitted by probe runner per #163):

    {
      "seam": "probe.wake_requested",
      "details": {
        "probe": "fly",
        "severity": "critical",
        "category": "service_unhealthy",
        "title": "Fly app(s) unreachable: HTTP 401",
        "envelope_enabled": true,
        "envelope_fix_name": "restart_unhealthy_machine"
      },
      "caller_session_id": "probe:fly:service_unhealthy",
      "source": "cron"
    }

Stream 2 - `tool.probe_autofix_attempted` (emitted by `kora__attempt_probe_autofix` per #182, during the in-flight investigation):

    {
      "seam": "tool.probe_autofix_attempted",
      "details": {
        "probe": "fly",
        "action": "restart_machine",
        "target_id": "1781e9f6c12d83",
        "reason_from_reasoning": "machine state stopped for 3 consecutive probe cycles",
        "status": "attempted",
        "before_state": {"state": "stopped"},
        "after_state": {"state": "started"},
        "executor_duration_ms": 842
      },
      "caller_session_id": "probe:fly:service_unhealthy",
      "source": "reasoning"
    }

Stream 3 - `probe.investigation_completed` (NEW, this PR):

    {
      "seam": "probe.investigation_completed",
      "details": {
        "probe": "fly",
        "issue_category": "service_unhealthy",
        "severity": "critical",
        "model_used": "claude-haiku-4-5-20251001",
        "input_tokens": 1200,
        "output_tokens": 250,
        "cache_creation_input_tokens": 0,
        "cache_read_input_tokens": 800,
        "total_cost_usd": 0.00123,
        "investigation_duration_ms": 3147,
        "investigation_summary_text": "Fly machine 1781e9f6c12d83 was in 'stopped' state for the last 3 probe cycles. I tried restart_machine via the envelope; before=stopped, after=started in 842ms. Next probe cycle (≤5 min) will confirm health holds. If it flaps back, check fly logs for OOM patterns before another restart.",
        "dm_status": "sent",
        "autofix_attempted": true
      },
      "caller_session_id": "probe:fly:service_unhealthy",
      "source": "reasoning"
    }

Stream 4 - `slack_dm_log.jsonl` outbound entry (NEW path via free function):

    {
      "sent_at": "2026-05-24T00:14:09.412+00:00",
      "channel_id": "U01JOSHUA",
      "thread_ts": null,
      "text": "🚨 Probe alert · fly\n[reasoning text body]",
      "slack_message_ts": "1742345059.123456",
      "send_status": "ok",
      "model_used": "claude-haiku-4-5-20251001",
      "input_tokens": 1200,
      "output_tokens": 250,
      "reasoning_duration_ms": 3147,
      "cache_read_input_tokens": 800,
      "caller_session_id": "probe:fly:service_unhealthy"
    }

`dm_status` values in the completed audit row: `sent` / `failed_send` / `engine_unavailable_fallback` / `engine_unavailable_failed_send` (combined fallback + send-failure path so CC#2 can branch cleanly).

Privacy posture
`investigation_summary_text` IS recorded verbatim. This follows the #182 precedent (autofix `reason_from_reasoning` recorded verbatim because operator triage of "what did Kora decide and why" is the primary use case) and differs from #179 (email body redacted because it is user-supplied text): Kora's investigation summary is Kora-composed and doesn't echo back arbitrary unsanitized strings from external sources.

CC#2 follow-on recommendation: KR-FE-PROBE-INVESTIGATION-VIEWER-V2

With the BE plumbing in place, CC#2 can:

* … `/api/probe-investigations` to JOIN on `caller_session_id`:
  * `dm_sent` time from `slack_dm_log.jsonl` (`sent_at` of the `send_status="ok"` entry)
  * `investigation.summary` from the `probe.investigation_completed` audit row
  * `investigation.cost_usd` + `investigation.model_used` from same
  * `investigation.autofix_attempted` boolean for the badge column
* … (`Per-call cost not yet displayed`, `DM sent time not yet populated`, `Investigation summary not yet shown`).
* `dm_status` chip filter (operator's primary triage lens: `failed_send` should be the top filter).

The wire shape is forward-compatible: `total_cost_usd` may be `null` when pricing is unknown; the renderer handles this with "(unknown)". Same for `model_used` on the engine-unavailable path.

Files
* `kora_cli/handlers/slack_dm_handler.py`: extracted `append_outbound_log_entry` free function + `resolve_slack_dm_log_path` public accessor; the existing instance method delegates
* `kora_cli/probes/wake_consumer.py`: `_send_operator_dm_routed` writes to slack_dm_log; `_emit_investigation_completed` writes the new audit seam; module-level helpers for meta projection, cost computation, autofix back-reference
* `kora_cli/audit/jsonl_sink.py`: new `probe.investigation_completed` SeamName Literal entry
* `tests/kora_cli/probes/test_wake_consumer.py`: 9 new tests covering happy path / engine_unavailable_fallback / failed_send / combined fallback+failed / autofix back-reference (positive + negative) / audit-emit failure swallowed / 4-stream caller_session_id consistency / cost helper (missing model / zero tokens / pricing exception)

Test plan

* `ruff check` clean on all changed files

🤖 Generated with Claude Code
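
For the follow-on viewer work, the join on `caller_session_id` might look roughly like the sketch below. It works on in-memory rows trimmed from the synthetic trace in this PR; the function name `build_investigation_row` and the output key names are hypothetical, and a real join would presumably also need a time window, since the key identifies a probe and category rather than a single investigation.

```python
from typing import Any, Optional

KEY = "probe:fly:service_unhealthy"

# Trimmed copies of the synthetic audit rows from the sample trace.
AUDIT_ROWS: list[dict[str, Any]] = [
    {"seam": "probe.wake_requested", "caller_session_id": KEY,
     "details": {"probe": "fly", "severity": "critical"}},
    {"seam": "tool.probe_autofix_attempted", "caller_session_id": KEY,
     "details": {"probe": "fly", "action": "restart_machine"}},
    {"seam": "probe.investigation_completed", "caller_session_id": KEY,
     "details": {"model_used": "claude-haiku-4-5-20251001",
                 "total_cost_usd": 0.00123,
                 "investigation_summary_text": "Fly machine 1781e9f6c12d83 was in 'stopped' state.",
                 "dm_status": "sent",
                 "autofix_attempted": True}},
]

# Trimmed copy of the synthetic slack_dm_log.jsonl row.
DM_LOG_ROWS: list[dict[str, Any]] = [
    {"sent_at": "2026-05-24T00:14:09.412+00:00", "send_status": "ok",
     "caller_session_id": KEY},
]


def build_investigation_row(
    caller_session_id: str,
    audit_rows: list[dict[str, Any]],
    dm_log_rows: list[dict[str, Any]],
) -> dict[str, Optional[Any]]:
    """Combine the completed-audit row and the DM log row for one key."""
    completed = next(
        (r for r in audit_rows
         if r.get("seam") == "probe.investigation_completed"
         and r.get("caller_session_id") == caller_session_id),
        None,
    )
    details = (completed or {}).get("details", {})
    # dm_sent time comes from the first send_status="ok" entry for this key.
    dm_sent_at = next(
        (r.get("sent_at") for r in dm_log_rows
         if r.get("caller_session_id") == caller_session_id
         and r.get("send_status") == "ok"),
        None,
    )
    return {
        "dm_sent_at": dm_sent_at,
        "summary": details.get("investigation_summary_text"),
        "cost_usd": details.get("total_cost_usd"),  # may be None: "(unknown)"
        "model_used": details.get("model_used"),
        "autofix_attempted": details.get("autofix_attempted"),
        "dm_status": details.get("dm_status"),
    }
```

Every field falls back to `None` when a stream has no matching row, which lines up with the nullable `total_cost_usd` and `model_used` described under the follow-on recommendation.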