feat: implement quality scoring Layers 2+3 -- LLM judge and human override #230
Labels
prio:low (Nice to have, can defer)
scope:medium (1-3 days of work)
spec:agent-system (DESIGN_SPEC Section 3 - Agent System)
spec:budget (DESIGN_SPEC Section 10 - Cost & Budget Management)
spec:hr (DESIGN_SPEC Section 8 - HR & Workforce Management)
spec:providers (DESIGN_SPEC Section 9 - Model Provider Layer)
type:feature (New feature implementation)
v0.6 (Minor version v0.6)
v0.6.4 (Patch release v0.6.4)
Summary
Implement quality scoring Layers 2 and 3. Layer 1 (CI signals) is already implemented. Consolidates #230 (LLM judge) and #231 (human override).
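One way the two new layers could stack is sketched below. This is only an illustration under assumptions: the `score` method signature, the `[0, 1]` score range, and the class names `LLMJudgeStrategy` and `HumanOverrideStrategy` are hypothetical, and the real `QualityScoringStrategy` protocol may look different.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


class QualityScoringStrategy(Protocol):
    """Assumed shape of the protocol; the real signature may differ."""

    def score(self, task_output: str) -> float: ...


@dataclass
class LLMJudgeStrategy:
    """Layer 2 (hypothetical): delegate scoring to an injected LLM judge."""

    judge: Callable[[str], float]  # assumed to return a score in [0, 1]

    def score(self, task_output: str) -> float:
        raw = self.judge(task_output)
        # Clamp so an out-of-range judge reply cannot escape [0, 1].
        return min(1.0, max(0.0, raw))


@dataclass
class HumanOverrideStrategy:
    """Layer 3 (hypothetical): a human-supplied score wins when present."""

    inner: QualityScoringStrategy
    override: Optional[float] = None

    def score(self, task_output: str) -> float:
        if self.override is not None:
            return self.override
        return self.inner.score(task_output)
```

In this sketch the judge is passed in as a plain callable, so the provider call can be stubbed in tests, and the human override wraps any other strategy rather than replacing it.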
Design Spec Reference
Layer 2: LLM judge (formerly #230)
QualityScoringStrategy protocol
Layer 3: Human override via API (formerly #231)
QualityScoringStrategy protocol and PerformanceTracker
Dependencies