
Routing: add durable decision-feedback review loop primitives #128

Merged
dgarson merged 4 commits into dgarson/fork from codex/implement-analytics-dashboard-framework
Feb 24, 2026


@dgarson dgarson commented Feb 24, 2026

Motivation

  • Introduce a durable, low-friction feedback loop so router misclassifications (T1–T4 / handle vs escalate) are captured for human review and calibration.
  • Provide a first-class, append-only event store and review queue that fits the existing hook/event model without coupling to the metrics dashboard UI.
  • Produce a clear mapping from collected events to the analytics/dashboard views so future UI work can consume the new streams reliably.

Description

  • Add RouterFeedbackLoopStore in src/routing/feedback-loop.ts that implements durable JSONL persistence for router-decisions, router-feedback, and router-review-queue, plus linking logic, review-item generation, and calibration rollups.
  • Add implicit feedback parsing helper parseImplicitFeedback to detect correction phrases and expected tier/action from free text.
  • Add unit tests src/routing/feedback-loop.test.ts that validate thread-based linking, high-severity review creation, false-escalation accounting, and implicit parsing.
  • Add documentation docs/experiments/router-feedback-loop-analytics.md that maps dashboard views (routing quality, feedback operations, review queue health, calibration tuning) to the recorded JSONL streams and derived metrics.
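
As a rough illustration of the JSONL persistence and thread-based linking described above, here is a minimal sketch. The record shapes and the names `toJsonlLine` and `linkFeedbackToDecision` are hypothetical and are not taken from `src/routing/feedback-loop.ts`; the linking rule shown (most recent decision in the same thread at or before the feedback) is an assumption about one reasonable policy, not necessarily the one the store uses.

```typescript
// Hypothetical record shapes; the real types in src/routing/feedback-loop.ts may differ.
interface DecisionRecord {
  decisionId: string;
  threadId: string;
  tier: "T1" | "T2" | "T3" | "T4";
  action: "handle" | "escalate";
  recordedAt: number; // epoch milliseconds
}

interface FeedbackRecord {
  feedbackId: string;
  threadId: string;
  expectedTier?: DecisionRecord["tier"];
  recordedAt: number; // epoch milliseconds
}

// JSONL: one JSON object per newline-terminated line, so a record can be
// appended to a stream without rewriting earlier lines.
function toJsonlLine(record: object): string {
  return JSON.stringify(record) + "\n";
}

// Assumed linking policy: choose the most recent decision in the same thread
// that was recorded at or before the feedback.
function linkFeedbackToDecision(
  feedback: FeedbackRecord,
  decisions: DecisionRecord[],
): DecisionRecord | undefined {
  let best: DecisionRecord | undefined;
  for (const decision of decisions) {
    if (decision.threadId !== feedback.threadId) continue;
    if (decision.recordedAt > feedback.recordedAt) continue;
    if (!best || decision.recordedAt > best.recordedAt) best = decision;
  }
  return best;
}
```

Feedback that arrives in a thread with no earlier decision stays unlinked (`undefined`) in this sketch, which keeps it out of mismatch counts until a human resolves it.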

Testing

  • Ran pnpm test src/routing/feedback-loop.test.ts, which executed 3 tests; all passed.
  • Ran pnpm format to apply formatting; the formatting run completed successfully.

Codex Task


@dgarson dgarson left a comment


⚠️ Overall assessment: Good primitive set, with a few data-quality risks to tighten

Summary

This PR adds durable router decision/feedback/review-loop primitives: JSONL-backed store, duplicate detection, calibration summary metrics, gateway API handlers for decision/feedback/review lifecycle, and Slack-driven implicit/reaction ingestion to feed the loop.

What looks good

  • Nice end-to-end path from runtime events → store → gateway query/update surfaces.
  • Review queue model (open/resolved/dismissed + update log) is practical and auditable.
  • Added tests cover core linking, summary metrics, and gateway handler behavior.

Concerns

  1. Implicit feedback parser is fairly permissive and may create false positives (src/routing/feedback-loop.ts, src/auto-reply/reply/dispatch-from-config.ts)

    • Regex path \bt([1-4])\b will classify any generic T1/T2/T3/T4 mention as routing feedback.
    • Since this runs in live inbound Slack message flow, noisy capture can inflate mismatch/review metrics.
  2. JSONL read path is brittle to single-line corruption (src/routing/feedback-loop.ts)

    • readJsonl() blindly JSON.parses every line; one malformed line can break all reads and disable summary/review APIs.
    • Consider resilient parsing (skip+log bad lines) or atomic write/rotate strategy.
  3. Duplicate feedback still creates review items

    • Duplicates are identified (duplicateOfFeedbackId) but still participate in review queue creation and can bias open-review counts.
    • If intentional, worth documenting; otherwise gate review creation for duplicates.
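
To make concern 2 concrete, here is a minimal sketch of skip-and-report parsing. The name `parseJsonlResilient` and its result shape are hypothetical; the actual `readJsonl()` signature is not shown in this PR, so the sketch works on file content that has already been read into a string.

```typescript
interface JsonlParseResult<T> {
  records: T[];
  // Lines that failed to parse, with 1-based line numbers for logging.
  badLines: { lineNumber: number; error: string }[];
}

// Parse JSONL content line by line. A malformed line is recorded and skipped
// instead of aborting the whole read.
function parseJsonlResilient<T>(content: string): JsonlParseResult<T> {
  const records: T[] = [];
  const badLines: { lineNumber: number; error: string }[] = [];

  content.split("\n").forEach((rawLine, index) => {
    const line = rawLine.trim(); // also drops a trailing "\r" from CRLF files
    if (line === "") return; // blank lines and the final trailing newline

    try {
      records.push(JSON.parse(line) as T);
    } catch (err) {
      badLines.push({
        lineNumber: index + 1,
        error: err instanceof Error ? err.message : String(err),
      });
    }
  });

  return { records, badLines };
}
```

Returning the bad lines rather than logging inside the parser leaves it to the caller to decide whether to log, emit a metric, or fail hard once the count passes a threshold.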

Suggestions

  • Tighten implicit parser with stronger intent cues (e.g., require explicit correction phrases when tier is inferred).
  • Harden JSONL ingestion against partial/corrupt lines.
  • Add tests for duplicate-handling policy in review queue generation.
  • Consider normalizing reaction values in one place before shouldReview() checks.
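
The first suggestion could look roughly like the sketch below, which only accepts a tier mention when the message also contains an explicit correction cue. The function name `parseImplicitFeedbackStrict`, the result shape, and the specific cue phrases are all assumptions for illustration; the phrases the existing `parseImplicitFeedback` recognizes are not listed in this PR.

```typescript
type Tier = "T1" | "T2" | "T3" | "T4";

interface ImplicitFeedback {
  expectedTier?: Tier;
  expectedAction?: "handle" | "escalate";
}

// Hypothetical correction cues; a real list would come from observed messages.
const CORRECTION_CUE =
  /\b(?:should(?:n't| not)?(?: have|'ve)? be(?:en)?|wrong tier|misrouted|misclassified)\b/i;
const TIER_MENTION = /\bt([1-4])\b/i;
const NEGATED_ESCALATION = /\bshould(?:n't| not) (?:have )?be(?:en)? escalated\b/i;
const ESCALATION = /\bshould(?: have|'ve)? be(?:en)? escalated\b/i;

// Returns null unless the text contains a correction cue AND yields at least
// one expected value, so a bare "T2" mention is never treated as feedback.
function parseImplicitFeedbackStrict(text: string): ImplicitFeedback | null {
  if (!CORRECTION_CUE.test(text)) return null;

  const result: ImplicitFeedback = {};

  const tierMatch = TIER_MENTION.exec(text);
  if (tierMatch) result.expectedTier = `T${tierMatch[1]}` as Tier;

  // Check the negated form first so "shouldn't have been escalated"
  // maps to "handle" rather than "escalate".
  if (NEGATED_ESCALATION.test(text)) result.expectedAction = "handle";
  else if (ESCALATION.test(text)) result.expectedAction = "escalate";

  return result.expectedTier || result.expectedAction ? result : null;
}
```

With this gating, a message such as "The T2 outage runbook is linked here" produces no feedback, while "this should have been T3" yields an expected tier. The trade-off is lower recall: corrections phrased without one of the listed cues are missed.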

Overall this is a strong first implementation; with those safeguards, it’ll be much more reliable in production telemetry.

@dgarson dgarson merged commit 02776b6 into dgarson/fork Feb 24, 2026
1 of 3 checks passed