This repository was archived by the owner on May 26, 2026. It is now read-only.
feat(kora): KR-PROMOTE-LOOPS-COMPLETION-MEGABUCKET — 3 loops + debounce + audit batching #193
Merged
rafe-walker merged 1 commit on May 24, 2026
Deliverable A — KR-PROMOTE-ROUTER-TUNING (loop 3):
* ``kora_cli/promote/router_tuning/`` — observer reads
cost_telemetry.snapshot()'s rolling_24h per-route counters;
proposer surfaces routes whose Haiku→Opus escalation rate
crosses a tunable threshold for operator-review. v1 ships
``tighten_review`` recommendations only — the ``loosen_review``
signal needs a new audit row for operator /opus overrides that
doesn't exist yet (documented in observer + proposer).
* Audit seam ``promotion.router_trigger_proposed``.
* Endpoints ``GET /api/promotions/router-tuning/pending`` +
``POST .../{id}/approve|reject``.
* Default auto-apply OFF — operator scaffolds trigger-pattern
changes manually from the proposal rationale.
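
A minimal sketch of the proposer step described above, assuming the observer hands over per-route call and escalation counts. The names `RouteCounters` and `propose_tighten_review`, the 0.5 threshold, and the `min_calls` floor are hypothetical and are not taken from the PR:

```python
from dataclasses import dataclass


@dataclass
class RouteCounters:
    """Hypothetical per-route counters for a rolling 24h window."""
    calls_count: int
    escalation_count: int


def propose_tighten_review(counters, threshold=0.5, min_calls=20):
    """Return tighten_review proposals for routes at or above the threshold.

    `threshold` and `min_calls` are illustrative defaults, not the PR's values.
    """
    proposals = []
    for route, c in sorted(counters.items()):
        if c.calls_count < min_calls:
            continue  # too little traffic to judge the route
        rate = c.escalation_count / c.calls_count
        if rate >= threshold:
            proposals.append({
                "route": route,
                "calls_count": c.calls_count,
                "escalation_count": c.escalation_count,
                "escalation_rate": round(rate, 4),
                "recommendation_kind": "tighten_review",
                "status": "pending",
            })
    return proposals


snapshot = {
    "slack_dm": RouteCounters(calls_count=100, escalation_count=60),
    "email_inbound": RouteCounters(calls_count=200, escalation_count=10),
}
proposals = propose_tighten_review(snapshot)
```

Only `slack_dm` crosses the assumed threshold here, which mirrors the 60/100 figures in the sample proposal. Nothing is applied automatically; the output is a list for operator review.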
Deliverable B — KR-PROMOTE-TOOL-TRIMMING (loop 4):
* ``kora_cli/promote/tool_trimming/`` — observer tallies
``reasoning.tool_called`` audit rows by (route, tool_name) over
a 30-day window; proposer identifies tools that a given route never
invoked and proposes adding them to that route's drop-list.
* v1 uses union-across-routes as the "available tools" set
(cheaper than wiring a per-route registered-tool projection;
documented as a clean future seam).
* Audit seam ``promotion.tool_trim_proposed``.
* Endpoints ``/api/promotions/tool-trimming/...``.
* Default auto-apply OFF — enforcement of the drop-list is
deferred to the future KR-PLUGIN-TOOL-DESC-TRIM bucket which
will wire the ``pre_tool_list_finalized`` hook to honor it.
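
The union-across-routes approximation can be sketched as follows. This is an illustration only: `propose_tool_trims` and the tool name `kora__search_mail` are hypothetical, and the input is assumed to be a flat list of (route, tool_name) pairs already extracted from the audit rows:

```python
from collections import Counter, defaultdict


def propose_tool_trims(tool_called_rows):
    """Map each route to the tools it never invoked in the window.

    `tool_called_rows` is an iterable of (route, tool_name) pairs, a
    hypothetical projection of ``reasoning.tool_called`` audit rows.
    """
    tally = Counter(tool_called_rows)
    used_by_route = defaultdict(set)
    for (route, tool_name) in tally:
        used_by_route[route].add(tool_name)

    # v1 approximation: "available tools" is the union across all routes.
    available = set().union(*used_by_route.values())

    proposals = {}
    for route, used in used_by_route.items():
        unused = sorted(available - used)
        if unused:
            proposals[route] = {
                "unused_tools": unused,
                "total_calls_for_route": sum(
                    n for (r, _), n in tally.items() if r == route
                ),
            }
    return proposals


rows = [
    ("email_inbound", "kora__search_mail"),
    ("email_inbound", "kora__search_mail"),
    ("slack_dm", "kora__search_mail"),
    ("slack_dm", "kora__send_email_to_operator"),
]
trims = propose_tool_trims(rows)
```

One consequence of the union approximation is visible in the sketch: a tool that no route called in the window never enters `available`, so it can never be proposed for trimming, and a route with zero calls produces no proposal at all.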
Deliverable C — KR-PROMOTE-PROBE-FIX-ENVELOPES (loop 5):
* ``kora_cli/promote/probe_fix_envelopes/`` — observer reads
``probe.investigation_completed`` rows, skips those that already
triggered an autofix attempt; clusters by (probe, issue_category);
proposer emits new envelope-action proposals with conservative
"operator must review" blast-radius default.
* Audit seam ``promotion.probe_envelope_action_proposed``.
* Endpoints ``/api/promotions/probe-envelopes/...``.
* Auto-apply HARDCODED FALSE (no env to flip) per spec §2
deliverable C — high-risk; never auto-apply. STOP-ASK §4
mitigation: approve endpoint transitions status + emits audit
ONLY; operator manually scaffolds approved envelopes into
``probes/fix_envelopes.py``. Proposal payload carries the
suggested FixEnvelope shape verbatim for mechanical copy-paste.
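
A rough sketch of the skip-then-cluster step, under stated assumptions: the row keys, the `cluster_investigations` name, the `min_cluster_size` cutoff, and the `dns`/`timeout` row are all hypothetical, and the `proposed_{probe}_{category}` naming is inferred from the sample proposal rather than confirmed by the code:

```python
from collections import defaultdict


def cluster_investigations(rows, min_cluster_size=3):
    """Group completed investigations by (probe, issue_category).

    Each row is a dict with hypothetical keys: probe, issue_category,
    caller_session_id, autofix_attempted.
    """
    clusters = defaultdict(list)
    for row in rows:
        if row["autofix_attempted"]:
            continue  # skip rows that already triggered an autofix attempt
        clusters[(row["probe"], row["issue_category"])].append(row)

    proposals = []
    for (probe, category), members in sorted(clusters.items()):
        if len(members) < min_cluster_size:
            continue  # assumed cutoff: too few occurrences to propose
        proposals.append({
            "probe": probe,
            "issue_category": category,
            "fix_name_suggestion": f"proposed_{probe}_{category}",
            "cluster_size": len(members),
            "sample_caller_session_ids": [
                m["caller_session_id"] for m in members[:3]
            ],
            "blast_radius_summary": "operator must review",
            "status": "pending",
        })
    return proposals


rows = [
    {"probe": "fly", "issue_category": "machine_down",
     "caller_session_id": f"probe:fly:machine_down:{i}",
     "autofix_attempted": False}
    for i in range(1, 5)
]
rows.append({"probe": "fly", "issue_category": "machine_down",
             "caller_session_id": "probe:fly:machine_down:5",
             "autofix_attempted": True})
rows.append({"probe": "dns", "issue_category": "timeout",
             "caller_session_id": "probe:dns:timeout:1",
             "autofix_attempted": False})
envelope_proposals = cluster_investigations(rows)
```

The sketch deliberately stops at producing pending proposals. Consistent with the hardcoded no-auto-apply posture, nothing here writes to ``probes/fix_envelopes.py``.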
Shared store helper:
* ``kora_cli/promote/_shared/proposal_store.py`` — generic
pending/approved/rejected/expired file-backed store factored
out so the 3 new loops don't each duplicate the phrasebook
store's per-status-dir layout. Snapshot_expand + phrasebook
keep their bespoke stores unchanged.
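
For orientation, a per-status-dir store of this kind might look like the sketch below. It is not the contents of ``proposal_store.py``; the `ProposalStore` class, its method names, and the one-JSON-file-per-proposal layout are assumptions made for illustration:

```python
import json
import tempfile
from pathlib import Path

STATUSES = ("pending", "approved", "rejected", "expired")


class ProposalStore:
    """File-backed store with one directory per status (illustrative only)."""

    def __init__(self, root):
        self.root = Path(root)
        for status in STATUSES:
            (self.root / status).mkdir(parents=True, exist_ok=True)

    def _path(self, status, proposal_id):
        return self.root / status / f"{proposal_id}.json"

    def add(self, proposal):
        proposal = {**proposal, "status": "pending"}
        path = self._path("pending", proposal["proposal_id"])
        path.write_text(json.dumps(proposal))

    def list_by_status(self, status):
        files = sorted((self.root / status).glob("*.json"))
        return [json.loads(p.read_text()) for p in files]

    def transition(self, proposal_id, new_status, review_notes=""):
        """Move a pending proposal into another status directory."""
        if new_status not in STATUSES:
            raise ValueError(f"unknown status: {new_status}")
        src = self._path("pending", proposal_id)
        proposal = json.loads(src.read_text())
        proposal.update(status=new_status, review_notes=review_notes)
        self._path(new_status, proposal_id).write_text(json.dumps(proposal))
        src.unlink()
        return proposal


with tempfile.TemporaryDirectory() as tmp:
    store = ProposalStore(tmp)
    store.add({"proposal_id": "p-router-001", "route": "slack_dm"})
    approved = store.transition("p-router-001", "approved", review_notes="ok")
    pending_after = store.list_by_status("pending")
    approved_after = store.list_by_status("approved")
```

Because the store is generic over the proposal payload, each loop would only need to supply its own dict shape, which is presumably the duplication the shared helper avoids.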
Deliverable D — KR-PROBE-DEBOUNCE consecutive-failure buffering:
* Closes the queued #163 follow-on. New env
``KORA_PROBE_DEBOUNCE_CONSECUTIVE_REQUIRED`` (default 2).
* Buffer state (probe, category) → (count, first_seen_at)
alongside the existing flat-window debounce map. Single wakes
return buffered_skipped; second wake within the existing
debounce window dispatches; window-expiry resets count.
* Critical severity + ``KORA_PROBE_DEBOUNCE_BYPASS_CRITICAL=true``
still bypasses the buffer (same operator opt-in env as the
flat-window bypass).
* Backward compat: ``required=1`` restores PR #166 behavior.
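
The buffering rules above can be modelled roughly as below. This sketch covers only the consecutive-failure buffer, not the existing flat-window debounce map it sits beside; the class name, the `"dispatched"` return value, the severity strings, and the 300-second window are hypothetical:

```python
class ConsecutiveFailureBuffer:
    """Dispatch only after `required` wakes for one key inside the window."""

    def __init__(self, required=2, window_seconds=300.0, bypass_critical=False):
        self.required = required
        self.window_seconds = window_seconds
        self.bypass_critical = bypass_critical
        self._state = {}  # (probe, category) -> (count, first_seen_at)

    def on_wake(self, probe, category, now, severity="warning"):
        key = (probe, category)
        if severity == "critical" and self.bypass_critical:
            self._state.pop(key, None)
            return "dispatched"  # operator opted in to skipping the buffer

        count, first_seen_at = self._state.get(key, (0, now))
        if now - first_seen_at > self.window_seconds:
            count, first_seen_at = 0, now  # window expired: start over
        count += 1

        if count >= self.required:
            self._state.pop(key, None)
            return "dispatched"
        self._state[key] = (count, first_seen_at)
        return "buffered_skipped"


buf = ConsecutiveFailureBuffer(required=2, window_seconds=300.0)
first = buf.on_wake("fly", "machine_down", now=0.0)
second = buf.on_wake("fly", "machine_down", now=60.0)

# A second wake that arrives after the window has expired counts as a first.
buf.on_wake("fly", "disk_full", now=0.0)
expired = buf.on_wake("fly", "disk_full", now=400.0)

bypass = ConsecutiveFailureBuffer(required=2, bypass_critical=True)
critical = bypass.on_wake("fly", "machine_down", now=0.0, severity="critical")

legacy = ConsecutiveFailureBuffer(required=1)
legacy_first = legacy.on_wake("fly", "machine_down", now=0.0)
```

With `required=1` the buffer never holds anything back, which is how the sketch reflects the backward-compatibility note: every wake falls straight through to whatever debounce logic already existed.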
Deliverable E — KR-CHEAP-AUDIT-BATCHING (R3-4 #9):
* ``emit_audit`` now queues to a module-level batch; a background
daemon thread flushes every ``KORA_AUDIT_FLUSH_INTERVAL_SECONDS``
(default 5s) or once the batch reaches ``KORA_AUDIT_BATCH_SIZE``
(default 100), whichever comes first.
* Per-emit interface UNCHANGED — callers still call ``emit_audit``
synchronously; queue + flusher are internal.
* atexit handler drains pending events on process shutdown so
short-lived CLI invocations don't lose audit rows.
* BATCH_SIZE=0 explicit opt-out preserves the legacy per-emit
write path — used by the test conftest globally so existing
emit-then-read tests stay sync without per-file fixtures.
* Tests cover: size-trigger, time-trigger (background thread),
per-path grouping, shutdown-drain, opt-out path.
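
The queue-plus-flusher arrangement can be illustrated with a small self-contained sketch. It is not the PR's module-level implementation: `AuditBatcher`, its `sink` callable, and the `close` method are hypothetical, and per-path grouping is left out for brevity:

```python
import atexit
import threading


class AuditBatcher:
    """Queue audit events and flush them in batches (illustrative sketch)."""

    def __init__(self, sink, batch_size=100, flush_interval=5.0):
        self._sink = sink  # callable that receives a list of events
        self._batch_size = batch_size
        self._interval = flush_interval
        self._pending = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        if batch_size > 0:
            thread = threading.Thread(target=self._run, daemon=True)
            thread.start()
            atexit.register(self.flush)  # drain on interpreter shutdown

    def emit(self, event):
        if self._batch_size <= 0:
            self._sink([event])  # opt-out: legacy per-emit write path
            return
        with self._lock:
            self._pending.append(event)
            full = len(self._pending) >= self._batch_size
        if full:
            self.flush()  # size trigger

    def flush(self):
        with self._lock:
            batch, self._pending = self._pending, []
        if batch:
            self._sink(batch)

    def _run(self):
        # Event.wait returns False on timeout and True once stop is set.
        while not self._stop.wait(self._interval):
            self.flush()  # time trigger

    def close(self):
        self._stop.set()
        self.flush()


written = []
batcher = AuditBatcher(written.append, batch_size=3, flush_interval=60.0)
for i in range(7):
    batcher.emit({"seq": i})
sizes_before_close = [len(b) for b in written]
batcher.close()
sizes_after_close = [len(b) for b in written]

sync_written = []
sync = AuditBatcher(sync_written.append, batch_size=0)
sync.emit({"seq": 0})
```

Callers only ever touch `emit`, so the batching stays an internal detail, and a batch size of 0 skips the thread and the queue entirely, which is the property the test conftest relies on for emit-then-read suites.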
Tests:
* 32 new tests across the 3 new promote loops + shared store.
* 6 new tests for KR-PROBE-DEBOUNCE consecutive-failure paths.
* 6 new tests for KR-CHEAP-AUDIT-BATCHING.
* ``tests/kora_cli/conftest.py`` adds an autouse fixture that
defaults to BATCH_SIZE=0 so the existing emit-then-read suites
(wake_consumer, phrasebook, snapshot_expand, hermes_plugin
audit) keep sync semantics. Dedicated batching tests opt back
in via their own fixture.
* 582 pass across snapshot / promote / reasoning / probes /
audit / hermes plugin suites; remaining failures are missing
local dev-deps (prompt_toolkit, aiosmtplib, etc.) unrelated
to this PR.
After this lands: all 5 promotion loops live;
KR-CHEAP-AUDIT-BATCHING completes R3-4 cheap-substrate items;
KR-PROBE-DEBOUNCE upgrade closes the queued #163 follow-on.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
CC#1 megabucket completing the promotion-loops product direction. Lands the 3 remaining promotion loops (router-tuning / tool-trimming / probe-fix-envelopes), upgrades probe debounce with consecutive-failure buffering (closes the queued #163 follow-on), and ships R3-4 #9 audit batching — the final cheap-substrate item.
After this lands: all 5 promotion loops live; R3-4 complete.
Per-deliverable status
* Deliverable A, KR-PROMOTE-ROUTER-TUNING (loop 3): proposes ``tighten_review`` when escalation_rate ≥ threshold. ``loosen_review`` deferred — needs a new audit row for operator ``/opus`` overrides. Endpoints + audit seam ``promotion.router_trigger_proposed``.
* Deliverable B, KR-PROMOTE-TOOL-TRIMMING (loop 4): tallies ``reasoning.tool_called`` per (route, tool_name); proposer identifies tools a given route never invoked. v1 uses union-across-routes as the "available tools" set (documented swap-in seam). Endpoints + audit seam ``promotion.tool_trim_proposed``.
* Deliverable C, KR-PROMOTE-PROBE-FIX-ENVELOPES (loop 5): clusters ``probe.investigation_completed`` by (probe, issue_category); proposer emits envelope-action suggestions with conservative "operator must review" blast-radius. Auto-apply HARDCODED FALSE per spec §2. STOP-ASK §4 mitigation: approve endpoint transitions status + emits audit ONLY; operator manually scaffolds approved envelopes into ``probes/fix_envelopes.py`` (proposal payload carries the suggested ``FixEnvelope`` shape for mechanical copy-paste).
* Deliverable D, KR-PROBE-DEBOUNCE: ``KORA_PROBE_DEBOUNCE_CONSECUTIVE_REQUIRED`` (default 2). Buffer state (probe, category) → (count, first_seen_at). Critical+bypass still skips. ``required=1`` restores PR #166 behavior.
* Deliverable E, KR-CHEAP-AUDIT-BATCHING: ``BATCH_SIZE=0`` preserves legacy sync path (used by the test conftest globally to keep existing emit-then-read suites sync).
Plus a shared store helper at ``kora_cli/promote/_shared/proposal_store.py`` so the 3 new loops don't each duplicate the phrasebook store's per-status-dir layout.
Sample proposal JSON — one per new loop
Loop 3 (router-tuning) — ``promotion.router_trigger_proposed``

    {
      "proposal_id": "p-router-001",
      "route": "slack_dm",
      "calls_count": 100,
      "escalation_count": 60,
      "escalation_rate": 0.6,
      "cost_estimate_usd_total": 1.2,
      "recommendation_kind": "tighten_review",
      "rationale": "Route 'slack_dm' escalated to Opus on 60/100 calls (60.0%) in the rolling 24h window. Per-call escalations cost a full Opus turn on top of the original Haiku turn. Operator review of the escalation trigger pattern for this route is recommended; spend so far: $1.2000.",
      "confidence": 1.0,
      "created_at": "2026-05-24T08:00:00Z",
      "status": "pending",
      "review_notes": ""
    }

Loop 4 (tool-trimming) — ``promotion.tool_trim_proposed``

    {
      "proposal_id": "p-tool-001",
      "route": "email_inbound",
      "unused_tools": ["kora__attempt_probe_autofix", "kora__send_email_to_operator"],
      "total_calls_for_route": 42,
      "observation_window_days": 30,
      "confidence": 1.0,
      "created_at": "2026-05-24T09:00:00Z",
      "status": "pending",
      "review_notes": ""
    }

Loop 5 (probe-fix-envelopes) — ``promotion.probe_envelope_action_proposed``

    {
      "proposal_id": "p-envelope-001",
      "probe": "fly",
      "issue_category": "machine_down",
      "fix_name_suggestion": "proposed_fly_machine_down",
      "cluster_size": 4,
      "sample_caller_session_ids": ["probe:fly:machine_down:1", "probe:fly:machine_down:2", "probe:fly:machine_down:3"],
      "recurring_recommendation_text": "Restart the fly machine to recover.",
      "blast_radius_summary": "operator must review — proposed envelope action has not been classified for production-mutation risk; treat as broad-impact by default until operator narrows the scope",
      "confidence": 0.6667,
      "created_at": "2026-05-24T10:00:00Z",
      "status": "pending",
      "review_notes": ""
    }

Combined daily cost — all 5 promotion loops
Well inside the [[feedback-promotion-loops-self-improving-subsystems]] $0.01-0.05/day target. The other four loops are all $0 — phrasebook is the only one with any LLM cost, capped at $0.005/day.
STOP-ASK posture
* ``fix_envelopes.py`` is NEVER mutated by the loop or its endpoints.
CC#2 follow-on recommendation
CC#2's PromotionReviewPage (delivered alongside #186/#190) handles the phrasebook + snapshot-expand loop shapes. The 3 new loops add 3 distinct proposal shapes that the panel needs to render. Recommended follow-on:
* (… ``/api/promotions/{router-tuning,tool-trimming,probe-envelopes}/...``) alongside the existing ``phrasebook`` group. A loop-type tab strip is cleaner than nested tabs per-loop.
* ``fix_name_suggestion`` + ``blast_radius_summary`` rendered as a prominent warning + ``recurring_recommendation_text`` + Approve/Reject. Bigger visual treatment matches the HIGH-RISK posture.
* ``PROMOTION_STATUS_VALUES`` constant should pin against ``_PROMOTION_STATUS_VALUES`` in ``kora_cli/web_server.py`` via a snapshot test; same drift-guard rule the phrasebook panel established.
* ``pending`` counts across all 4 endpoint groups so operator sees "8 proposals waiting" without drilling into per-loop tabs.
Recommended bucket title: ``KR-FE-PROMOTION-REVIEW-PANEL-EXTEND`` — small to medium bucket; the loop-discriminator + 3 card renderers are mostly mechanical given the existing panel scaffolding.
Test plan
* ``tests/kora_cli/conftest.py`` autouse fixture defaults to BATCH_SIZE=0 so legacy suites keep sync semantics
🤖 Generated with Claude Code