This repository was archived by the owner on May 26, 2026. It is now read-only.
KR-PROMOTE-PHRASEBOOK-FOUNDATION-MEGABUCKET — first promotion loop end-to-end #186
Merged
rafe-walker merged 1 commit on May 24, 2026
First concrete implementation of the promotion-loop pattern per
``feedback-promotion-loops-self-improving-subsystems``. After this
lands: Kora observes recurring DM patterns, drafts phrasebook
proposals at idle cadence, operator approves via cockpit (CC#2
follow-on KR-FE-PROMOTION-REVIEW-PANEL), audit captures the loop
end-to-end. Six deliverables in one batched bucket — single PR.
Six deliverables
----------------
A. Shared Haiku-clustering utility — ``kora_cli/clustering/``:
* ``text_similarity.embed_texts`` — async embedder with disk
cache. Lexical (token + char 3-gram, L2-normalized,
deterministic, $0) rather than Haiku-based. Decision was
anticipated by the bucket's STOP-ASK §4 alternative; the
proposer still uses Haiku where natural-language synthesis
genuinely helps (reply-template generation, one call per
proposal).
* ``cosine_similarity`` + ``cluster_by_similarity`` (greedy
agglomerative).
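A minimal sketch of the lexical approach described in deliverable A, assuming a sparse dict representation. The real ``embed_texts`` is async with a disk cache; this sketch is synchronous and uncached. The single-pass, seed-based grouping is a simplification of the "greedy agglomerative" step, and ``embed_text``, the ``tok:``/``tri:`` feature prefixes, and the ``0.6`` threshold are hypothetical names and values, not taken from the PR.

```python
import math
import re
from collections import Counter


def embed_text(text: str) -> dict[str, float]:
    """Sparse lexical vector: word tokens plus character 3-grams, L2-normalized."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts: Counter[str] = Counter(f"tok:{t}" for t in tokens)
    compact = " ".join(tokens)
    counts.update(f"tri:{compact[i:i + 3]}" for i in range(len(compact) - 2))
    norm = math.sqrt(sum(v * v for v in counts.values()))
    if norm == 0.0:
        return {}
    return {key: value / norm for key, value in counts.items()}


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Dot product of two L2-normalized sparse vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(weight * b.get(key, 0.0) for key, weight in a.items())


def cluster_by_similarity(texts: list[str], threshold: float = 0.6) -> list[list[int]]:
    """Greedy pass: each text joins the first cluster whose seed is similar enough."""
    vectors = [embed_text(t) for t in texts]
    clusters: list[list[int]] = []
    for index, vector in enumerate(vectors):
        for cluster in clusters:
            seed = vectors[cluster[0]]
            if cosine_similarity(vector, seed) >= threshold:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters
```

Because the vectors are deterministic functions of the text, the same inputs always produce the same clusters, which is what makes the $0 lexical default easy to test.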
B. Reasoning DM observation collector —
``kora_cli/promote/phrasebook/observer.py``:
* Reads ``slack_dm_log.jsonl`` (handler-driven replies +
wake-consumer replies both write here post-#184) →
``ReasoningObservation`` records.
* Filters: route allow-list, drop short-circuit hits, drop
missing-engine canned fallbacks, drop failed sends, time
window via ``since``.
* Per-call cost reused from
``agent.usage_pricing.estimate_usage_cost`` (same canonical
pricing as ``cost_state_holder.record_inference``).
C. Promotion proposal generator —
``kora_cli/promote/phrasebook/proposer.py``:
* Clusters observations, applies size + cohesion +
answer-consistency thresholds.
* Derives a conservative regex pattern (escaped alternation
of top tokens — protects against the STOP-ASK §4 regex-
pathology concern; phrasebook editor's validator catches
anything that slips through at approve-time).
* Reply template: dominant verbatim response OR optional
Haiku synthesis (injectable for tests).
* Confidence = (cohesion + consistency) / 2.
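The pattern derivation and confidence formula in deliverable C can be sketched as follows. This is an illustration under assumptions, not the proposer's actual code: ``derive_pattern``, ``STOPWORDS``, the ``top_n`` default, and the ranking by document frequency are all hypothetical; only the escaped-alternation shape and the averaging formula come from the description above.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the real proposer's token filter is not shown in the PR.
STOPWORDS = frozenset({"a", "is", "of", "the", "to", "what"})


def derive_pattern(texts: list[str], top_n: int = 3) -> str:
    """Build a case-insensitive alternation of the most widely shared tokens."""
    doc_freq: Counter[str] = Counter()
    for text in texts:
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        doc_freq.update(tokens - STOPWORDS)
    # Rank by how many texts contain the token; break ties alphabetically.
    ranked = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    top = sorted(token for token, _ in ranked[:top_n])
    # re.escape keeps every alternative a literal, so no token can inject regex syntax.
    return "(?i)(" + "|".join(re.escape(token) for token in top) + ")"


def confidence(cohesion: float, consistency: float) -> float:
    """Confidence = (cohesion + consistency) / 2, as stated in the PR."""
    return (cohesion + consistency) / 2
```

Restricting the pattern to an alternation of escaped literals rules out nested quantifiers, which is the usual source of catastrophic backtracking.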
D. Three new audit seams in ``SeamName`` Literal:
* ``promotion.proposed`` — per proposal at proposer time.
``synth_cost_usd`` field per row for cost telemetry.
* ``promotion.approved`` — at endpoint time. Also emits
``phrasebook.updated`` with ``actor="kora_proposal_approved"``
per PR #177 forward-compat.
* ``promotion.rejected`` — with operator-supplied
``review_notes`` recorded verbatim (#182 precedent).
E. Three new backend endpoints in ``web_server.py``:
* ``GET /api/promotions/phrasebook/pending``
* ``POST /api/promotions/phrasebook/{id}/approve`` —
optional pattern_override / reply_template_override /
category_override / review_notes payload.
* ``POST /api/promotions/phrasebook/{id}/reject`` —
``{review_notes}`` payload.
* Drift-guard pin ``_PROMOTION_STATUS_VALUES``; CC#2's
KR-FE-PROMOTION-REVIEW-PANEL adds the symmetric FE
constant.
F. Cron registration —
``kora_cli/listeners/promote_phrasebook_listener.py``:
* Registers ``run_phrasebook_promotion_cycle`` via the
heartbeat scheduler (interval-based, 86400s default).
* Cron-string suggestion ("0 6 * * *") superseded by the
interval shape since the heartbeat scheduler is
interval-based; documented in the listener docstring.
* Env tunables: ``KORA_PROMOTE_PHRASEBOOK_ENABLED`` (default
true), ``KORA_PROMOTE_PHRASEBOOK_INTERVAL_SEC`` (86400),
``KORA_PROMOTE_PHRASEBOOK_OBSERVATION_WINDOW_DAYS`` (7),
``KORA_PROMOTE_PHRASEBOOK_MIN_CLUSTER_SIZE`` (5),
``KORA_PROMOTE_PHRASEBOOK_EXPIRY_DAYS`` (14).
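One way the listener might read these tunables is sketched below. The variable names and defaults are the ones listed above; ``PromoteConfig``, ``load_config``, and the set of strings treated as "false" are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping

PREFIX = "KORA_PROMOTE_PHRASEBOOK_"


@dataclass(frozen=True)
class PromoteConfig:
    """Hypothetical container; defaults mirror the values listed in the PR."""
    enabled: bool = True
    interval_sec: int = 86400
    observation_window_days: int = 7
    min_cluster_size: int = 5
    expiry_days: int = 14


def load_config(env: Mapping[str, str] = os.environ) -> PromoteConfig:
    """Read tunables from the environment, falling back to the defaults."""
    defaults = PromoteConfig()

    def read_int(name: str, default: int) -> int:
        raw = env.get(PREFIX + name, "").strip()
        return int(raw) if raw else default

    # Assumption: anything other than an explicit "off" value keeps the loop enabled.
    raw_enabled = env.get(PREFIX + "ENABLED", "true").strip().lower()
    return PromoteConfig(
        enabled=raw_enabled not in {"0", "false", "no", "off"},
        interval_sec=read_int("INTERVAL_SEC", defaults.interval_sec),
        observation_window_days=read_int(
            "OBSERVATION_WINDOW_DAYS", defaults.observation_window_days
        ),
        min_cluster_size=read_int("MIN_CLUSTER_SIZE", defaults.min_cluster_size),
        expiry_days=read_int("EXPIRY_DAYS", defaults.expiry_days),
    )
```

Passing the mapping explicitly (rather than reading ``os.environ`` inside the function body) keeps the loader testable without patching the process environment.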
Persistence
-----------
Proposals live at
``${KORA_HOME}/promotions/phrasebook/{pending,approved,rejected,expired}/<uuid>.json``.
Status transitions move the file atomically via os.replace. Files
+ audit JSONL together are the forensic-truth stream.
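The status transition described above can be sketched with ``os.replace``, which renames atomically when source and destination are on the same filesystem. ``move_proposal`` and its signature are hypothetical; the directory names follow the layout given above, and updating the ``status`` field inside the JSON is deliberately left out of this sketch.

```python
import os
from pathlib import Path

STATUSES = ("pending", "approved", "rejected", "expired")


def move_proposal(root: Path, proposal_id: str, src: str, dst: str) -> Path:
    """Move <root>/<src>/<id>.json to <root>/<dst>/<id>.json in one rename."""
    if src not in STATUSES or dst not in STATUSES:
        raise ValueError(f"unknown status: {src!r} -> {dst!r}")
    source = root / src / f"{proposal_id}.json"
    target_dir = root / dst
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    # Raises FileNotFoundError if the proposal is not in the expected state,
    # so a double approve/reject fails loudly instead of silently succeeding.
    os.replace(source, target)
    return target
```

Since the file exists in exactly one status directory at any moment, a reader listing ``pending/`` never sees a half-moved proposal.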
Tests
-----
52 new tests across 4 files. ``ruff check`` clean. 432-test
regression set (clustering + promote + audit + handlers +
short_circuit + reasoning + probes) all pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
First concrete implementation of the promotion-loop pattern per
``feedback-promotion-loops-self-improving-subsystems``. Six deliverables shipped as a single PR (batched per ``feedback-batch-bigger-buckets``):
* ``kora_cli/clustering/text_similarity.py``
* ``kora_cli/promote/phrasebook/observer.py``
* ``kora_cli/promote/phrasebook/proposer.py``
* ``kora_cli/audit/jsonl_sink.py``
* ``kora_cli/web_server.py``
* ``kora_cli/listeners/promote_phrasebook_listener.py``
After this lands: Kora observes recurring DM patterns at idle cadence (daily), drafts phrasebook proposals via clustering, operator reviews via cockpit (CC#2 follow-on KR-FE-PROMOTION-REVIEW-PANEL), audit captures the loop end-to-end.
STOP-ASKs resolved inline
``"0 6 * * *"`` suggestion was superseded because the heartbeat scheduler is interval-based; a "fire at 6am UTC" anchor would need a separate watch-and-act task (modeled on ``cost_telemetry_listener``'s daily-reset) which is appropriate for v2 if the operator wants a deterministic UTC anchor. Promotion outputs go to a review queue, not real-time alerts, so the wall-clock anchor doesn't matter operationally for v1.
Sample proposal JSON
Per the proposer projection (``/api/promotions/phrasebook/pending`` returns a list of these):

    {
      "proposal_id": "4f3a7c2b-9d12-4e8f-a1b2-c3d4e5f6a7b8",
      "cluster_size": 6,
      "sample_questions": [
        "Burn is $42 today; 75% of budget used.",
        "Burn is $42 today; 75% of budget used.",
        "Burn is $42 today; 75% of budget used."
      ],
      "proposed_pattern": "(?i)(budget|burn|used)",
      "proposed_reply_template": "Burn is $42 today; 75% of budget used.",
      "proposed_category": "cost_query",
      "confidence": 0.94,
      "created_at": "2026-05-24T06:00:00Z",
      "status": "pending",
      "review_notes": "",
      "cluster_caller_session_ids": [
        "D1JOSH:1742300000.1",
        "D1JOSH:1742310000.1",
        "D1JOSH:1742320000.1",
        "D1JOSH:1742330000.1",
        "D1JOSH:1742340000.1",
        "D1JOSH:1742350000.1"
      ],
      "haiku_synthesized": false
    }

``sample_questions`` in v1 shows the response excerpts (the bucket spec's reply-only clustering means we don't have the operator question text; future bucket can plumb the inbound JSONL via the observer's ``inbound_lookup`` hook).
Sample cycle log
Synthetic 24-hour run with 200 observations → 4 proposals → 1 expired:
With Haiku reply-template synthesis enabled (operator opt-in for v2):
Estimated $/day cost ceiling
Both well under the $0.01-0.05/day spec target. Plenty of headroom for the future promotion loops the spec called out (snapshot-expand, router-trigger, tool-trimming, probe-fix-envelope) to also share the daily budget.
The lexical-default decision is the load-bearing one — it means the daily cost is essentially zero, leaving the entire budget for future loops that genuinely need LLM judgment (which the phrasebook proposer doesn't, since reply-template uses the dominant verbatim response by default).
Env vars added
* ``KORA_PROMOTE_PHRASEBOOK_ENABLED`` (default ``true``)
* ``KORA_PROMOTE_PHRASEBOOK_INTERVAL_SEC`` (default ``86400``)
* ``KORA_PROMOTE_PHRASEBOOK_OBSERVATION_WINDOW_DAYS`` (default ``7``)
* ``KORA_PROMOTE_PHRASEBOOK_MIN_CLUSTER_SIZE`` (default ``5``)
* ``KORA_PROMOTE_PHRASEBOOK_EXPIRY_DAYS`` (default ``14``)
* ``KORA_PROMOTIONS_DIR`` (default ``${KORA_HOME}/promotions/``).
CC#2 follow-on recommendation: KR-FE-PROMOTION-REVIEW-PANEL (refined)
The endpoints + persisted proposal shape are stable. CC#2 builds:
* ``GET /api/promotions/phrasebook/pending`` length is the badge.
* ``sample_questions`` excerpts (operator recognizes the cluster)
* ``proposed_pattern`` (operator-editable inline)
* ``proposed_reply_template`` (operator-editable inline; ``haiku_synthesized=true`` rows get a "Kora wrote this" badge for extra-careful review)
* ``proposed_category`` (operator-editable; dropdown of existing categories + free text)
* ``_PROMOTION_STATUS_VALUES`` from BE endpoint response (``status_values`` field of the list endpoint) into a FE constant; snapshot-test fails CI if the two drift.
* ``promotion.proposed`` / ``promotion.approved`` / ``promotion.rejected`` audit JSONL rows in a sidebar log so operator can see the history (the in-flight #408 AuditPanelKit makes this nearly free once landed).
The wire shape is forward-compatible: future fields (e.g., ``synth_cost_usd``, ``cluster_caller_session_ids``) are additive and the cockpit can ignore unknown keys.
Test plan
* ``ruff check`` clean on all changed files
🤖 Generated with Claude Code