This repository was archived by the owner on May 26, 2026. It is now read-only.
feat(kora): KR-ALERT-NOTIFY ST1 — push alerts to Joshua via Slack DM + email #149
Merged
rafe-walker merged 1 commit on May 23, 2026
…+ email

Closes the operator-feedback loop. The cockpit-complete observation from #145: every panel now reads real data, but Joshua still has to LOOK at the cockpit to see alerts. This bucket flips that — when a critical/warning alert fires, Kora pings Joshua via Slack DM (immediate); info alerts go to email.

New module: ``kora_cli/alerts/notifier.py``
* ``AlertNotifier`` class — periodic-task entry point ``run_notification_cycle`` calls #145's aggregator, computes set-diff against last cycle's alert IDs, dispatches only newly-firing alerts.
* Channel routing (PM §4 Q2 default): critical/warning → Slack DM, info → email.
* Dedup state in-memory only (PM §4 Q3 default): empty on first cycle → all currently-active alerts fire as "new" (one-time re-ping on restart; persistence is a future bucket).
* Failed dispatch still adds alert ID to last_alert_ids (no spam on transient SMTP/Slack errors).
* Audit emit ``notification.dispatched`` per dispatch attempt (success OR failure) — operator triages via audit panel if expecting a notification that didn't arrive.
* Pure formatting helpers (Slack DM text, email subject, email body, relative-time) extracted for isolated unit-testing.

New listener: ``kora_cli/listeners/alert_notifier_listener.py``
* Constructs the notifier bound to ``current_slack_client`` + ``current_purelymail_client`` lazy factories (matches the accessor pattern from KR-MCP-SEND-TOOLS).
* Periodic task ``alerts.notify`` registered at import time @ 180s default (PM §4 Q1); ``KORA_ALERT_NOTIFY_INTERVAL_SEC`` env override.
* Shutdown resets dedup state so a subsequent listener start sees a clean slate.
* Defense-in-depth outer try/except in run_notification_cycle so any path that bypasses the notifier's inner catch can't crash the heartbeat scheduler.

Audit seam extension: ``notification.dispatched`` added to the SeamName Literal in ``kora_cli/audit/jsonl_sink.py`` (small + additive; existing seams unchanged).

K-DG verification at HEAD ``2345d51`` (matches the spec's cited SHA): all 5 accessors confirmed in place + interfaces matched (compute_active_alerts ✓ in #145 / current_slack_client / current_purelymail_client / register_periodic_task ✓ in heartbeat scheduler / emit_audit ✓ in audit/jsonl_sink).

50 new tests pass:
* 36 notifier tests (formatting + routing + dedup + failure isolation + audit shape + telemetry + dispatch outcome)
* 14 listener tests (registration + cadence resolution + lifecycle + run_notification_cycle short-circuits + defense-in-depth + factory wiring)

Cross-bucket regression: 600/600 when run serially. With pytest-xdist parallelism, 5-ish flaky failures appear in test_email_inbound_handler.py — VERIFIED pre-existing on bare HEAD without these changes (3 runs of bare HEAD reproduced 4/6/0 failures, same email_inbound tests). Not introduced by this PR; filed as a separate xdist-ordering pollution issue for the follow-on bucket queue. Ruff clean.

§4 PM-open status — all DEFAULTS applied in ST1:
Q1 cadence 180s (3 min) — accepted
Q2 routing critical/warning→Slack, info→email — accepted
Q3 fire on first cycle (no persistence) — accepted
Q4 kora__send_test_alert MCP tool — DEFERRED to ST2

After ST2 lands (per-category cooldown + burst dampening + digest mode + operator runbook + kora__send_test_alert MCP tool), the operator-feedback loop closes fully.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
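
For reviewers, the cadence rule in the commit message above (180s default, overridable through `KORA_ALERT_NOTIFY_INTERVAL_SEC`) can be sketched as below. This is an illustration, not the listener's actual code: the function name `resolve_interval_sec`, the injectable `env` mapping, and the fallback to the default on malformed or non-positive values are all assumptions.

```python
import os

DEFAULT_INTERVAL_SEC = 180.0  # PM §4 Q1 default cited in this PR


def resolve_interval_sec(env=None):
    """Return the notify cadence in seconds, honoring the env override.

    Hypothetical helper: the real listener may resolve cadence differently.
    """
    env = os.environ if env is None else env
    raw = env.get("KORA_ALERT_NOTIFY_INTERVAL_SEC")
    if raw is None or not raw.strip():
        return DEFAULT_INTERVAL_SEC
    try:
        value = float(raw)
    except ValueError:
        # Assumption: a malformed override falls back to the default.
        return DEFAULT_INTERVAL_SEC
    # Assumption: zero or negative values are treated as invalid.
    return value if value > 0 else DEFAULT_INTERVAL_SEC
```

Passing the environment as a mapping keeps the helper testable without mutating `os.environ`.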
This was referenced May 23, 2026
rafe-walker added a commit that referenced this pull request May 24, 2026
…or (#166)

Closes the unified-operator-interface loop. Tails audit JSONL for probe.wake_requested events (PR #163 emits); per (probe, issue_category) inline debounce; invokes engine.respond() with structured probe context (issue + recent observations + envelope status); DMs operator via existing client.post_dm path.

Activates route='probe_investigation' telemetry literal (PR #161 reserved). Engine reads message.source to derive route through existing record_inference site — no telemetry-side changes needed.

Env vars added:
- KORA_PROBE_DEBOUNCE_SECONDS=600 (10 min default; 0 disables)
- KORA_PROBE_DEBOUNCE_BYPASS_CRITICAL=false (fail-closed; opt-in even for critical)
- KORA_PROBE_WAKE_POLL_SEC=30 (listener tail cadence)

KORA_SLACK_JOSHUA_USER_ID reused from PR #149.

All 4 STOP-ASK conditions resolved inline:
- MessageSource Literal extended (1-line) with 'probe_investigation' + _derive_caller_session_id returns 'probe:{probe}:{category}' for future panel xref
- Listener-coordinator wire uniform across 9 listeners (register_daemon_listener pattern)
- Operator channel canonicalized at KORA_SLACK_JOSHUA_USER_ID (PR #149 precedent)
- Tail-position stamping at first-tick (don't replay history at boot) — inverse of AlertNotifier's set-diff semantic; documented

Wake-to-DM latency ~30s worst case (poll cadence), tunable to 5s. 42 new tests + 634/634 cross-bucket regression + ruff clean.
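
The per-(probe, issue_category) debounce described in the #166 commit message can be pictured roughly as below. This is a sketch under stated assumptions, not the code from that PR: the class name `ProbeDebouncer`, the `severity` argument, and the injectable clock are hypothetical. Only the 600s default, the "0 disables" rule, and the opt-in critical bypass come from the message.

```python
import time


class ProbeDebouncer:
    """Suppress repeat wakes for the same (probe, category) within a window."""

    def __init__(self, window_sec=600.0, bypass_critical=False,
                 clock=time.monotonic):
        self.window_sec = window_sec            # KORA_PROBE_DEBOUNCE_SECONDS
        self.bypass_critical = bypass_critical  # KORA_PROBE_DEBOUNCE_BYPASS_CRITICAL
        self._clock = clock                     # injectable for tests (assumption)
        self._last_wake = {}                    # (probe, category) -> timestamp

    def should_wake(self, probe, category, severity="warning"):
        if self.window_sec <= 0:
            return True  # 0 disables debouncing
        if self.bypass_critical and severity == "critical":
            return True  # bypass applies only when explicitly enabled
        key = (probe, category)
        now = self._clock()
        last = self._last_wake.get(key)
        if last is not None and now - last < self.window_sec:
            return False  # still inside the debounce window
        self._last_wake[key] = now
        return True
```

With the default `bypass_critical=False`, a critical event is debounced like any other, which matches the fail-closed wording in the commit message.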
Summary
Closes the operator-feedback loop. Cockpit-complete observation from #145: every panel reads real data, but Joshua still has to LOOK at the cockpit. With this bucket, when a critical/warning alert fires Kora pings Joshua via Slack DM; info alerts go to email.
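
The routing rule in the summary above is small enough to state as code. The sketch below is illustrative only: the function name `route_channel` is hypothetical, and raising on an unknown severity is an assumption. The channel names reuse the `"slack_dm"` / `"email"` values from the audit details schema in this PR.

```python
from typing import Literal

Severity = Literal["critical", "warning", "info"]
Channel = Literal["slack_dm", "email"]


def route_channel(severity: Severity) -> Channel:
    """Map an alert severity to its notification channel (PM §4 Q2 default)."""
    if severity in ("critical", "warning"):
        return "slack_dm"  # immediate ping
    if severity == "info":
        return "email"
    # Assumption: unknown severities are rejected rather than silently routed.
    raise ValueError(f"unknown severity: {severity!r}")
```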
Bucket spec: `17_cc_bucket_prompts/KR-ALERT-NOTIFY_push_to_slack_email.md`
Source PRs cited:
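§4 PM-open status — ST1 defaults applied
- Q1 cadence 180s (3 min): accepted
- Q2 routing critical/warning → Slack, info → email: accepted
- Q3 fire on first cycle (no persistence): accepted
- Q4 `kora__send_test_alert` MCP tool: DEFERRED to ST2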
Surface
Failure semantics — fail-soft + no spam
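Each dispatch attempt is independent. If SlackClient is unavailable / SMTP rejects / Joshua envs are unset, the attempt is recorded as a `notification.dispatched` audit event with `status: "failed"`, the alert ID is still added to `last_alert_ids` so the same alert is not re-sent on the next cycle, and the heartbeat scheduler keeps running.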
This trades a single missed ping (transient Slack 429) for clean operator UX. ST2's per-category cooldown + the runbook addendum will cover the gap with a different mechanism.
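
A minimal sketch of the set-diff dedup with fail-soft dispatch described above. The class name `DedupDispatcher`, the dict-shaped alerts with an `"id"` key, and the `send` callable are assumptions made for illustration; the real `AlertNotifier` interface may differ.

```python
from typing import Callable, Iterable


class DedupDispatcher:
    """Dispatch only alerts that were not active in the previous cycle."""

    def __init__(self, send: Callable[[dict], None]):
        self._send = send                     # hypothetical dispatch callable
        self.last_alert_ids: set[str] = set()  # in-memory only; empty at start

    def run_cycle(self, active_alerts: Iterable[dict]) -> list[tuple[str, str]]:
        alerts = {a["id"]: a for a in active_alerts}
        new_ids = alerts.keys() - self.last_alert_ids  # set-diff vs last cycle
        outcomes = []
        for alert_id in sorted(new_ids):
            try:
                self._send(alerts[alert_id])
                outcomes.append((alert_id, "ok"))
            except Exception:
                # Fail-soft: one failed dispatch does not block the others.
                outcomes.append((alert_id, "failed"))
        # Failed IDs are remembered too, so a transient error never re-pings.
        self.last_alert_ids = set(alerts)
        return outcomes
```

Because the state starts empty, the first cycle after a restart treats every active alert as new, which is the one-time re-ping accepted under Q3.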
Audit seam extension
Adds `notification.dispatched` to the `SeamName` Literal in `kora_cli/audit/jsonl_sink.py`. Small + additive — existing seams unchanged. Details schema:
```
{
channel: "slack_dm" | "email",
alert_id: <alert ID>,
severity: "critical" | "warning" | "info",
category: <alert category>,
status: "ok" | "failed",
error?: <exception type name when failed; omitted on ok>
}
```
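
One way to picture the details payload in typed Python is shown below. The `TypedDict` names and the `build_details` helper are hypothetical, and typing `alert_id` and `category` as `str` is an assumption. The field names and allowed values are taken from the schema above.

```python
from typing import Literal, Optional, TypedDict


class _RequiredDetails(TypedDict):
    channel: Literal["slack_dm", "email"]
    alert_id: str   # assumption: IDs are strings
    severity: Literal["critical", "warning", "info"]
    category: str   # assumption: categories are strings
    status: Literal["ok", "failed"]


class NotificationDetails(_RequiredDetails, total=False):
    error: str  # exception type name; present only when status is "failed"


def build_details(channel, alert_id, severity, category,
                  exc: Optional[BaseException] = None) -> NotificationDetails:
    """Build the details dict for a `notification.dispatched` audit event."""
    details: NotificationDetails = {
        "channel": channel,
        "alert_id": alert_id,
        "severity": severity,
        "category": category,
        "status": "ok" if exc is None else "failed",
    }
    if exc is not None:
        details["error"] = type(exc).__name__  # type name only, no message
    return details
```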
Test plan
Pre-existing xdist flake (NOT introduced by this PR)
When run under pytest-xdist with the full `tests/kora_cli/alerts/ + test_listeners/ + handlers/ + audit/ + clients/` suite, ~4-6 tests in `test_email_inbound_handler.py` flake intermittently (`assert result.status == HANDLED_RECEIVED` failing with `'filtered_paused'`). Verified pre-existing on bare HEAD without these changes — ran 3 times on `git stash`d HEAD: 4 failed / 0 failed / 6 failed. Filed as separate concern for a follow-on xdist-ordering / state-pollution bucket; not blocking this PR.
Cascade
ST2: per-category cooldown + burst dampening + daily-digest mode + operator runbook addendum + `kora__send_test_alert` MCP tool (Q4).
After ST2 merges, the operator-feedback loop closes fully — Joshua doesn't have to look at the cockpit; Kora pings him when something matters.
🤖 Generated with Claude Code