Routing: add durable decision-feedback review loop primitives #128
dgarson left a comment
Summary
This PR adds durable router decision/feedback/review-loop primitives: JSONL-backed store, duplicate detection, calibration summary metrics, gateway API handlers for decision/feedback/review lifecycle, and Slack-driven implicit/reaction ingestion to feed the loop.
What looks good
- Nice end-to-end path from runtime events → store → gateway query/update surfaces.
- Review queue model (open/resolved/dismissed + update log) is practical and auditable.
- Added tests cover core linking, summary metrics, and gateway handler behavior.
Concerns
- Implicit feedback parser is fairly permissive and may create false positives (`src/routing/feedback-loop.ts`, `src/auto-reply/reply/dispatch-from-config.ts`)
  - Regex path `\bt([1-4])\b` will classify any generic T1/T2/T3/T4 mention as routing feedback.
  - Since this runs in live inbound Slack message flow, noisy capture can inflate mismatch/review metrics.
- JSONL read path is brittle to single-line corruption (`src/routing/feedback-loop.ts`)
  - `readJsonl()` blindly `JSON.parse`s every line; one malformed line can break all reads and disable summary/review APIs.
  - Consider resilient parsing (skip+log bad lines) or atomic write/rotate strategy.
- Duplicate feedback still creates review items
  - Duplicates are identified (`duplicateOfFeedbackId`) but still participate in review queue creation and can bias open-review counts.
  - If intentional, worth documenting; otherwise gate review creation for duplicates.
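To make the false-positive concern concrete, here is a minimal sketch, not the PR's actual code. It assumes the tier regex runs case-insensitively, and the names `parseLoose`, `parseStrict`, and the list of correction cues are hypothetical. The strict variant only accepts a tier that directly follows a correction phrase.

```typescript
// Hypothetical sketch: none of these names come from the PR.
type ImplicitFeedback = { expectedTier: number };

// The pattern quoted in the review; the `i` flag is an assumption.
const LOOSE_TIER = /\bt([1-4])\b/i;

// Assumed correction cues; a real list would need tuning against actual messages.
const CUE_THEN_TIER =
  /\b(?:should be|should have been|should've been|belongs in)\s+t([1-4])\b/i;

// Loose: any standalone T1..T4 token counts as feedback (first match wins).
function parseLoose(text: string): ImplicitFeedback | null {
  const match = LOOSE_TIER.exec(text);
  return match ? { expectedTier: Number(match[1]) } : null;
}

// Strict: the tier must directly follow an explicit correction phrase.
function parseStrict(text: string): ImplicitFeedback | null {
  const match = CUE_THEN_TIER.exec(text);
  return match ? { expectedTier: Number(match[1]) } : null;
}

const chatter = "The T1 line at the airport was long";
const correction = "This was T3 but should be T1";

console.log(parseLoose(chatter));     // treated as feedback for tier 1
console.log(parseStrict(chatter));    // null: no correction cue
console.log(parseLoose(correction));  // picks up T3, the first mention
console.log(parseStrict(correction)); // picks up T1, the tier after the cue
```

The second pair of calls shows a related issue: without anchoring to a cue, the first tier mentioned wins even when it describes the current routing rather than the expected one.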
Suggestions
- Tighten implicit parser with stronger intent cues (e.g., require explicit correction phrases when tier is inferred).
- Harden JSONL ingestion against partial/corrupt lines.
- Add tests for duplicate-handling policy in review queue generation.
- Consider normalizing reaction values in one place before `shouldReview()` checks.
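For the JSONL hardening suggestion, a skip-and-report parser could look roughly like the sketch below. It is an illustration only: `parseJsonlLenient` and its result shape are hypothetical, and it works on already-loaded file contents rather than on the PR's `readJsonl()`, whose signature is not shown here.

```typescript
// Hypothetical sketch: name and result shape are assumptions, not the PR's API.
type SkippedLine = { line: number; error: string };
type JsonlResult<T> = { records: T[]; skipped: SkippedLine[] };

// Parse JSONL text line by line; a malformed line is recorded and skipped
// instead of aborting the whole read.
function parseJsonlLenient<T = unknown>(text: string): JsonlResult<T> {
  const records: T[] = [];
  const skipped: SkippedLine[] = [];
  text.split(/\r?\n/).forEach((raw, index) => {
    const trimmed = raw.trim();
    if (trimmed === "") return; // ignore blank lines and the trailing newline
    try {
      records.push(JSON.parse(trimmed) as T);
    } catch (err) {
      skipped.push({
        line: index + 1, // 1-based line number for log messages
        error: err instanceof Error ? err.message : String(err),
      });
    }
  });
  return { records, skipped };
}

// Usage: the second line is truncated, as after an interrupted append.
const sample = '{"id":"a"}\n{"id":"b"\n{"id":"c"}\n';
const result = parseJsonlLenient<{ id: string }>(sample);
console.log(result.records.map((r) => r.id)); // the two intact records
console.log(result.skipped.map((s) => s.line)); // the line that was skipped
```

Returning the skipped lines to the caller, rather than only logging them, would let the summary API report how much data was dropped. Note that the cast to `T` is unchecked, so schema validation would still be a separate step.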
Overall this is a strong first implementation; with those safeguards, it’ll be much more reliable in production telemetry.
Motivation
Description
- `RouterFeedbackLoopStore` in `src/routing/feedback-loop.ts` that implements durable JSONL persistence for `router-decisions`, `router-feedback`, and `router-review-queue`, plus linking logic, review-item generation, and calibration rollups.
- `parseImplicitFeedback` to detect correction phrases and expected tier/action from free text.
- Tests in `src/routing/feedback-loop.test.ts` that validate thread-based linking, high-severity review creation, false-escalation accounting, and implicit parsing.
- `docs/experiments/router-feedback-loop-analytics.md` that maps dashboard views (routing quality, feedback operations, review queue health, calibration tuning) to the recorded JSONL streams and derived metrics.
Testing
- `pnpm test src/routing/feedback-loop.test.ts`, which executed 3 tests and all passed.
- `pnpm format` to apply formatting; the formatting run completed successfully.