feat(kora): KR-PROMOTE-LOOPS-COMPLETION-MEGABUCKET — 3 loops + debounce + audit batching by rafe-walker · Pull Request #193 · rafe-walker/kora

rafe-walker · 2026-05-24T07:18:17Z

Summary

CC#1 megabucket completing the promotion-loops product direction. Lands the 3 remaining promotion loops (router-tuning / tool-trimming / probe-fix-envelopes), upgrades probe debounce with consecutive-failure buffering (closes the queued #163 follow-on), and ships R3-4 #9 audit batching — the final cheap-substrate item.

After this lands: all 5 promotion loops live; R3-4 complete.

Per-deliverable status

ID	Deliverable	Status	Notes
A	KR-PROMOTE-ROUTER-TUNING (loop 3)	✅ shipped	Observer reads cost_telemetry rolling_24h; proposer emits `tighten_review` when escalation_rate ≥ threshold. `loosen_review` deferred — needs a new audit row for operator `/opus` overrides. Endpoints + audit seam `promotion.router_trigger_proposed`.
B	KR-PROMOTE-TOOL-TRIMMING (loop 4)	✅ shipped	Observer tallies `reasoning.tool_called` per (route, tool_name); proposer identifies, for each route, the tools that route never invoked. v1 uses union-across-routes as the "available tools" set (documented swap-in seam). Endpoints + audit seam `promotion.tool_trim_proposed`.
C	KR-PROMOTE-PROBE-FIX-ENVELOPES (loop 5)	✅ shipped	Observer clusters `probe.investigation_completed` by (probe, issue_category); proposer emits envelope-action suggestions with conservative "operator must review" blast-radius. Auto-apply HARDCODED FALSE per spec §2. STOP-ASK §4 mitigation: approve endpoint transitions status + emits audit ONLY; operator manually scaffolds approved envelopes into `probes/fix_envelopes.py` (proposal payload carries the suggested `FixEnvelope` shape for mechanical copy-paste).
D	KR-PROBE-DEBOUNCE consecutive-failure	✅ shipped	New env `KORA_PROBE_DEBOUNCE_CONSECUTIVE_REQUIRED` (default 2). Buffer state (probe, category) → (count, first_seen_at). Critical+bypass still skips. `required=1` restores PR #166 behavior.
E	KR-CHEAP-AUDIT-BATCHING (R3-4 #9)	✅ shipped	Queue + background daemon thread; flush at size (default 100) or time (default 5s); atexit drain on shutdown. Per-emit interface unchanged. `BATCH_SIZE=0` preserves legacy sync path (used by the test conftest globally to keep existing emit-then-read suites sync).

Plus a shared store helper at kora_cli/promote/_shared/proposal_store.py so the 3 new loops don't each duplicate the phrasebook store's per-status-dir layout.
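A per-status-dir store of the kind described above could look roughly like the sketch below. It is illustrative only and is not the contents of `proposal_store.py`: the class name `ProposalStore`, its method names, and the one-JSON-file-per-proposal layout are assumptions; only the four statuses (pending / approved / rejected / expired) and the file-backed design come from this PR.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

# Status names taken from the PR text; everything else here is hypothetical.
STATUSES = ("pending", "approved", "rejected", "expired")


class ProposalStore:
    """Hypothetical file-backed store: one directory per status, one JSON file per proposal."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        for status in STATUSES:
            (self.root / status).mkdir(parents=True, exist_ok=True)

    def _path(self, status: str, proposal_id: str) -> Path:
        return self.root / status / f"{proposal_id}.json"

    def save_pending(self, proposal: Dict[str, Any]) -> Path:
        """Write a new proposal into the pending directory."""
        record = dict(proposal, status="pending")
        path = self._path("pending", record["proposal_id"])
        path.write_text(json.dumps(record, indent=2))
        return path

    def list_by_status(self, status: str) -> List[Dict[str, Any]]:
        """Return every proposal currently stored under the given status."""
        files = sorted((self.root / status).glob("*.json"))
        return [json.loads(f.read_text()) for f in files]

    def transition(self, proposal_id: str, new_status: str,
                   review_notes: str = "") -> Dict[str, Any]:
        """Move a pending proposal to another status directory."""
        if new_status not in STATUSES:
            raise ValueError(f"unknown status: {new_status}")
        src = self._path("pending", proposal_id)
        record = json.loads(src.read_text())
        record.update(status=new_status, review_notes=review_notes)
        self._path(new_status, proposal_id).write_text(json.dumps(record, indent=2))
        src.unlink()
        return record
```

Under these assumptions an approve endpoint would reduce to `store.transition(proposal_id, "approved", notes)` followed by the audit emit, which matches the store-only approval path described for loop 5.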

Sample proposal JSON — one per new loop

Loop 3 (router-tuning) — `promotion.router_trigger_proposed`

{
  "proposal_id": "p-router-001",
  "route": "slack_dm",
  "calls_count": 100,
  "escalation_count": 60,
  "escalation_rate": 0.6,
  "cost_estimate_usd_total": 1.2,
  "recommendation_kind": "tighten_review",
  "rationale": "Route 'slack_dm' escalated to Opus on 60/100 calls (60.0%) in the rolling 24h window. Per-call escalations cost a full Opus turn on top of the original Haiku turn. Operator review of the escalation trigger pattern for this route is recommended; spend so far: $1.2000.",
  "confidence": 1.0,
  "created_at": "2026-05-24T08:00:00Z",
  "status": "pending",
  "review_notes": ""
}
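The counter math behind a proposal like the one above might be sketched as follows. This is not the proposer shipped in this PR: the function name `propose_tighten_review`, the input shape, the `threshold` default of 0.5, and the `min_calls` guard are all assumptions. Only the rule "emit `tighten_review` when escalation_rate ≥ threshold" comes from the PR text.

```python
from typing import Dict, List


def propose_tighten_review(route_counters: Dict[str, Dict[str, int]],
                           threshold: float = 0.5,
                           min_calls: int = 20) -> List[dict]:
    """Hypothetical proposer: flag routes whose escalation rate meets the threshold.

    route_counters maps route -> {"calls_count": int, "escalation_count": int}.
    """
    proposals = []
    for route, counters in sorted(route_counters.items()):
        calls = counters.get("calls_count", 0)
        escalations = counters.get("escalation_count", 0)
        # Skip idle or low-traffic routes (assumed guard; also avoids division by zero).
        if calls == 0 or calls < min_calls:
            continue
        rate = escalations / calls
        if rate >= threshold:
            proposals.append({
                "route": route,
                "calls_count": calls,
                "escalation_count": escalations,
                "escalation_rate": round(rate, 4),
                "recommendation_kind": "tighten_review",
                "status": "pending",
            })
    return proposals
```

With the sample's counters (100 calls, 60 escalations) and the assumed 0.5 threshold, `slack_dm` would be flagged at a rate of 0.6.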

Loop 4 (tool-trimming) — `promotion.tool_trim_proposed`

{
  "proposal_id": "p-tool-001",
  "route": "email_inbound",
  "unused_tools": ["kora__attempt_probe_autofix", "kora__send_email_to_operator"],
  "total_calls_for_route": 42,
  "observation_window_days": 30,
  "confidence": 1.0,
  "created_at": "2026-05-24T09:00:00Z",
  "status": "pending",
  "review_notes": ""
}
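The v1 "union-across-routes" rule can be illustrated with a small set difference. The sketch below is an assumption-laden illustration, not the PR's proposer: the function name `propose_tool_trims` and the `(route, tool_name)` input shape are hypothetical, and counting `total_calls_for_route` as the number of tool-call rows for the route is a guess at what that field means.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def propose_tool_trims(tool_calls: Iterable[Tuple[str, str]]) -> List[dict]:
    """Hypothetical proposer: per route, list tools seen on other routes but never on this one.

    tool_calls is an iterable of (route, tool_name) pairs, one per
    reasoning.tool_called audit row in the observation window.
    """
    used: Dict[str, Set[str]] = defaultdict(set)
    totals: Dict[str, int] = defaultdict(int)
    for route, tool_name in tool_calls:
        used[route].add(tool_name)
        totals[route] += 1

    # v1 approximation from the PR: "available tools" = union of tools seen on any route.
    available: Set[str] = set().union(*used.values())

    proposals = []
    for route in sorted(used):
        unused = sorted(available - used[route])
        if unused:
            proposals.append({
                "route": route,
                "unused_tools": unused,
                "total_calls_for_route": totals[route],
                "status": "pending",
            })
    return proposals
```

One consequence of this approximation is that a tool no route ever called never shows up in `unused_tools`, and a route with no tool calls at all gets no proposal; a per-route registered-tool projection would close both gaps.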

Loop 5 (probe-fix-envelopes) — `promotion.probe_envelope_action_proposed`

{
  "proposal_id": "p-envelope-001",
  "probe": "fly",
  "issue_category": "machine_down",
  "fix_name_suggestion": "proposed_fly_machine_down",
  "cluster_size": 4,
  "sample_caller_session_ids": ["probe:fly:machine_down:1", "probe:fly:machine_down:2", "probe:fly:machine_down:3"],
  "recurring_recommendation_text": "Restart the fly machine to recover.",
  "blast_radius_summary": "operator must review — proposed envelope action has not been classified for production-mutation risk; treat as broad-impact by default until operator narrows the scope",
  "confidence": 0.6667,
  "created_at": "2026-05-24T10:00:00Z",
  "status": "pending",
  "review_notes": ""
}
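A rough sketch of the clustering step that could yield a proposal of this shape is given below. It is not the observer or proposer from this PR. The row field names (`probe`, `issue_category`, `caller_session_id`, `recommendation`, `autofix_attempted`), the `min_cluster_size` and `max_samples` parameters, and the "most common recommendation text" rule are all assumptions. The sketch deliberately does not compute `confidence`, since the PR does not say how that value is derived.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Shortened stand-in for the conservative default quoted in the sample payload.
BLAST_RADIUS_DEFAULT = "operator must review"


def cluster_investigations(rows: Iterable[dict],
                           min_cluster_size: int = 3,
                           max_samples: int = 3) -> List[dict]:
    """Hypothetical clustering of probe.investigation_completed rows by (probe, issue_category)."""
    clusters: Dict[Tuple[str, str], List[dict]] = defaultdict(list)
    for row in rows:
        # Rows that already triggered an autofix attempt are skipped, per the PR.
        if row.get("autofix_attempted"):
            continue
        clusters[(row["probe"], row["issue_category"])].append(row)

    proposals = []
    for (probe, category), members in sorted(clusters.items()):
        if len(members) < min_cluster_size:
            continue
        # Assumed rule: surface the recommendation text that recurs most often.
        text, _count = Counter(m["recommendation"] for m in members).most_common(1)[0]
        proposals.append({
            "probe": probe,
            "issue_category": category,
            "fix_name_suggestion": f"proposed_{probe}_{category}",
            "cluster_size": len(members),
            "sample_caller_session_ids": [m["caller_session_id"] for m in members[:max_samples]],
            "recurring_recommendation_text": text,
            "blast_radius_summary": BLAST_RADIUS_DEFAULT,
            "status": "pending",
        })
    return proposals
```

Nothing in the sketch writes to `probes/fix_envelopes.py`; it only produces proposal dicts, in keeping with the store-only posture of loop 5.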

Combined daily cost — all 5 promotion loops

Loop	Per-cycle cost	Daily
1 — phrasebook (#186)	≤$0.005 (Haiku synthesis, ≤5 proposals × ~$0.001)	≤$0.005
2 — snapshot-expand (#190)	$0 (fully lexical)	$0
3 — router-tuning (this PR)	$0 (telemetry counter math)	$0
4 — tool-trimming (this PR)	$0 (audit-log scan + set diff)	$0
5 — probe-fix-envelopes (this PR)	$0 (audit-log scan + lexical cluster)	$0
Combined		≤$0.005/day

Well under the [[feedback-promotion-loops-self-improving-subsystems]] $0.01-0.05/day target. The three loops added in this PR are all $0, as is snapshot-expand (#190); phrasebook is the only loop with any LLM cost, capped at $0.005/day.

STOP-ASK posture

✅ Router-tuning quality-diff data — not available today (documented in observer); v1 ships with the data we DO have (calls + escalation counts) and surfaces "review this route" recommendations rather than auto-tuning. No PM ask needed.
✅ Probe-fix-envelope codegen fragility — mitigated via store-only + manual-scaffold approval path; fix_envelopes.py is NEVER mutated by the loop or its endpoints.
✅ Audit batching lifecycle — background daemon thread starts lazily on first emit; atexit drains on shutdown; works for both daemon (long-lived) and CLI (short-lived) processes.
✅ Daily cost — all 3 new loops are $0; combined daily ceiling still ≤$0.005.
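The audit batching lifecycle noted above (lazy thread start, size or time flush, atexit drain, `BATCH_SIZE=0` sync path) might be sketched like this. It is an illustration under assumptions, not the `emit_audit` implementation: the class `AuditBatcher`, its `write_batch` callback, and the choice to run the size-triggered flush on the calling thread are hypothetical simplifications.

```python
import atexit
import threading
import time
from typing import Callable, List, Optional


class AuditBatcher:
    """Hypothetical batcher: queue events, flush on size or on a timer, drain at exit."""

    def __init__(self, write_batch: Callable[[List[dict]], None],
                 batch_size: int = 100, flush_interval: float = 5.0) -> None:
        self._write_batch = write_batch
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._pending: List[dict] = []
        self._lock = threading.Lock()
        self._thread: Optional[threading.Thread] = None
        # Drain whatever is still queued when the process shuts down.
        atexit.register(self.flush)

    def emit(self, event: dict) -> None:
        # batch_size <= 0 is the opt-out: write synchronously, as before batching.
        if self._batch_size <= 0:
            self._write_batch([event])
            return
        with self._lock:
            self._pending.append(event)
            full = len(self._pending) >= self._batch_size
            if self._thread is None:
                # Lazy start: no background thread exists until the first emit.
                self._thread = threading.Thread(target=self._run, daemon=True)
                self._thread.start()
        if full:
            self.flush()

    def flush(self) -> None:
        # Swap the queue out under the lock, then write outside it.
        with self._lock:
            batch, self._pending = self._pending, []
        if batch:
            self._write_batch(batch)

    def _run(self) -> None:
        # Time trigger: flush whatever has accumulated every flush_interval seconds.
        while True:
            time.sleep(self._flush_interval)
            self.flush()
```

Because the flusher is a daemon thread, it does not keep a short-lived CLI process alive; the atexit hook is what prevents queued rows from being lost in that case.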

CC#2 follow-on recommendation

CC#2's PromotionReviewPage (delivered alongside #186/#190) handles the phrasebook + snapshot-expand loop shapes. The 3 new loops add 3 distinct proposal shapes that the panel needs to render. Recommended follow-on:

Extend the panel's loop-discriminator to handle the 3 new endpoint groups (/api/promotions/{router-tuning,tool-trimming,probe-envelopes}/...) alongside the existing phrasebook group. A loop-type tab strip is cleaner than nested tabs per-loop.
Per-loop card renderers — each loop's proposal shape needs its own card layout:
- Router-tuning: route + escalation_rate badge + rationale + Approve/Reject
- Tool-trimming: route + collapsible list of unused_tools + total_calls + Approve/Reject
- Probe-envelopes: probe + category + fix_name_suggestion + blast_radius_summary rendered as a prominent warning + recurring_recommendation_text + Approve/Reject. Bigger visual treatment matches the HIGH-RISK posture.
Status enum drift-guard pin — the panel's PROMOTION_STATUS_VALUES constant should pin against _PROMOTION_STATUS_VALUES in kora_cli/web_server.py via a snapshot test; same drift-guard rule the phrasebook panel established.
Pending count rollup — sidebar nav badge can sum pending counts across all 4 endpoint groups so operator sees "8 proposals waiting" without drilling into per-loop tabs.

Recommended bucket title: KR-FE-PROMOTION-REVIEW-PANEL-EXTEND — small to medium bucket; the loop-discriminator + 3 card renderers are mostly mechanical given the existing panel scaffolding.

Test plan

32 new tests across the 3 new promote loops + shared store
6 new tests for KR-PROBE-DEBOUNCE consecutive-failure paths (first-failure-buffered, second-failure-dispatches, critical-bypass, independent-pairs, required=1 backward-compat, reset clears buffer)
6 new tests for KR-CHEAP-AUDIT-BATCHING (size-trigger, time-trigger via background thread, per-path grouping, shutdown-drain, opt-out path)
tests/kora_cli/conftest.py autouse fixture defaults to BATCH_SIZE=0 so legacy suites keep sync semantics
582 pass across snapshot / promote / reasoning / probes / audit / hermes plugin suites locally. Remaining failures across the full suite are missing local dev-deps (prompt_toolkit, aiosmtplib, etc.) unrelated to this PR.
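The debounce cases listed in the test plan (first failure buffered, second failure dispatches, critical bypass, independent pairs, `required=1`, reset) can be illustrated with a small sketch. This is not the code under test: the class `ConsecutiveFailureBuffer`, the `observe` method, the `window_seconds` default, the injected `clock`, and the `"dispatched"` return string are assumptions; `buffered_skipped` and the (probe, category) → (count, first_seen_at) state come from the PR. Clearing the buffer entry after a dispatch is also an assumption.

```python
import time
from typing import Callable, Dict, Tuple


class ConsecutiveFailureBuffer:
    """Hypothetical buffer: dispatch only after `required` failures inside one window."""

    def __init__(self, required: int = 2, window_seconds: float = 300.0,
                 bypass_critical: bool = False,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.required = required
        self.window_seconds = window_seconds
        self.bypass_critical = bypass_critical
        self._clock = clock
        # (probe, category) -> (count, first_seen_at)
        self._state: Dict[Tuple[str, str], Tuple[int, float]] = {}

    def observe(self, probe: str, category: str, severity: str = "warning") -> str:
        # Critical wakes skip the buffer only when the operator opted in.
        if severity == "critical" and self.bypass_critical:
            return "dispatched"
        now = self._clock()
        key = (probe, category)
        count, first_seen = self._state.get(key, (0, now))
        # Window expiry resets the count and restarts the window.
        if now - first_seen > self.window_seconds:
            count, first_seen = 0, now
        count += 1
        if count >= self.required:
            self._state.pop(key, None)  # assumed: a dispatch clears the buffer entry
            return "dispatched"
        self._state[key] = (count, first_seen)
        return "buffered_skipped"

    def reset(self) -> None:
        """Drop all buffered counts."""
        self._state.clear()
```

With `required=1` the first failure reaches the threshold immediately, so every wake dispatches and the buffer is effectively disabled, which is how the sketch models the backward-compatible setting.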

🤖 Generated with Claude Code

…ce + audit batching

Deliverable A — KR-PROMOTE-ROUTER-TUNING (loop 3):
* ``kora_cli/promote/router_tuning/`` — observer reads cost_telemetry.snapshot()'s rolling_24h per-route counters; proposer surfaces routes whose Haiku→Opus escalation rate crosses a tunable threshold for operator-review. v1 ships ``tighten_review`` recommendations only — the ``loosen_review`` signal needs a new audit row for operator /opus overrides that doesn't exist yet (documented in observer + proposer).
* Audit seam ``promotion.router_trigger_proposed``.
* Endpoints ``GET /api/promotions/router-tuning/pending`` + ``POST .../{id}/approve|reject``.
* Default auto-apply OFF — operator scaffolds trigger-pattern changes manually from the proposal rationale.

Deliverable B — KR-PROMOTE-TOOL-TRIMMING (loop 4):
* ``kora_cli/promote/tool_trimming/`` — observer tallies ``reasoning.tool_called`` audit rows by (route, tool_name) over a 30-day window; proposer identifies tools that any route never invoked + proposes adding them to that route's drop-list.
* v1 uses union-across-routes as the "available tools" set (cheaper than wiring a per-route registered-tool projection; documented as a clean future seam).
* Audit seam ``promotion.tool_trim_proposed``.
* Endpoints ``/api/promotions/tool-trimming/...``.
* Default auto-apply OFF — enforcement of the drop-list is deferred to the future KR-PLUGIN-TOOL-DESC-TRIM bucket which will wire the ``pre_tool_list_finalized`` hook to honor it.

Deliverable C — KR-PROMOTE-PROBE-FIX-ENVELOPES (loop 5):
* ``kora_cli/promote/probe_fix_envelopes/`` — observer reads ``probe.investigation_completed`` rows, skips those that already triggered an autofix attempt; clusters by (probe, issue_category); proposer emits new envelope-action proposals with conservative "operator must review" blast-radius default.
* Audit seam ``promotion.probe_envelope_action_proposed``.
* Endpoints ``/api/promotions/probe-envelopes/...``.
* Auto-apply HARDCODED FALSE (no env to flip) per spec §2 deliverable C — high-risk; never auto-apply. STOP-ASK §4 mitigation: approve endpoint transitions status + emits audit ONLY; operator manually scaffolds approved envelopes into ``probes/fix_envelopes.py``. Proposal payload carries the suggested FixEnvelope shape verbatim for mechanical copy-paste.

Shared store helper:
* ``kora_cli/promote/_shared/proposal_store.py`` — generic pending/approved/rejected/expired file-backed store factored out so the 3 new loops don't each duplicate the phrasebook store's per-status-dir layout. Snapshot_expand + phrasebook keep their bespoke stores unchanged.

Deliverable D — KR-PROBE-DEBOUNCE consecutive-failure buffering:
* Closes the queued #163 follow-on. New env ``KORA_PROBE_DEBOUNCE_CONSECUTIVE_REQUIRED`` (default 2).
* Buffer state (probe, category) → (count, first_seen_at) alongside the existing flat-window debounce map. Single wakes return buffered_skipped; second wake within the existing debounce window dispatches; window-expiry resets count.
* Critical severity + ``KORA_PROBE_DEBOUNCE_BYPASS_CRITICAL=true`` still bypasses the buffer (same operator opt-in env as the flat-window bypass).
* Backward compat: ``required=1`` restores PR #166 behavior.

Deliverable E — KR-CHEAP-AUDIT-BATCHING (R3-4 #9):
* ``emit_audit`` now queues to a module-level batch; background daemon thread flushes every ``KORA_AUDIT_FLUSH_INTERVAL_SECONDS`` (default 5s) and on size hit ``KORA_AUDIT_BATCH_SIZE`` (default 100), whichever first.
* Per-emit interface UNCHANGED — callers still call ``emit_audit`` synchronously; queue + flusher are internal.
* atexit handler drains pending events on process shutdown so short-lived CLI invocations don't lose audit rows.
* BATCH_SIZE=0 explicit opt-out preserves the legacy per-emit write path — used by the test conftest globally so existing emit-then-read tests stay sync without per-file fixtures.
* Tests cover: size-trigger, time-trigger (background thread), per-path grouping, shutdown-drain, opt-out path.

Tests:
* 32 new tests across the 3 new promote loops + shared store.
* 6 new tests for KR-PROBE-DEBOUNCE consecutive-failure paths.
* 6 new tests for KR-CHEAP-AUDIT-BATCHING.
* ``tests/kora_cli/conftest.py`` adds an autouse fixture that defaults to BATCH_SIZE=0 so the existing emit-then-read suites (wake_consumer, phrasebook, snapshot_expand, hermes_plugin audit) keep sync semantics. Dedicated batching tests opt back in via their own fixture.
* 582 pass across snapshot / promote / reasoning / probes / audit / hermes plugin suites; remaining failures are missing local dev-deps (prompt_toolkit, aiosmtplib, etc.) unrelated to this PR.

After this lands: all 5 promotion loops live; KR-CHEAP-AUDIT-BATCHING completes R3-4 cheap-substrate items; KR-PROBE-DEBOUNCE upgrade closes the queued #163 follow-on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rafe-walker merged commit eeb724d into feature/phase2-upgrades May 24, 2026

rafe-walker deleted the feat/kora-KR-PROMOTE-LOOPS-COMPLETION-MEGABUCKET branch May 24, 2026 07:20

This was referenced May 24, 2026

feat(kora): KR-REASONING-ROUTE-THROUGH-GATEWAY-ST3 — flip KORA_REASONING_USE_GATEWAY default to gateway (DRAFT) #195

Merged

feat(kora): KR-ALERT-WAKE-AND-EMAIL-INTENT-PROMOTION-AND-ROUTER-LOOSEN-MEGABUCKET — 3 deliverables #197

Merged
