@@ -131,7 +131,10 @@ quality-cost tradeoff. The `DegradationConfig` and quota degradation strategies
131131(Amdahl ceiling, straggler gap) provide efficiency bounds.
132132
133133**Implication**: The existing budget architecture is sound. The missing piece is exposing
134- the quality-cost tradeoffs in the API (see #688 coordination metrics gap).
134+ the quality-cost tradeoffs via the REST API: specifically, the `GET /tasks/{id}` response
135+ and the `CoordinationResult` Python type should surface cost, quality, and efficiency
136+ metadata (estimated cost, actual cost, quality score, Amdahl ceiling, straggler gap).
137+ See the #688 coordination metrics gap (Gap G4) for the full scoping.
135138
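A minimal sketch of what surfacing this metadata could look like, assuming a hypothetical `CoordinationMetrics` block nested in the task response. The class name, the `coordination_metrics` key, the `task_response` helper, and the sample values are illustrative assumptions, not the project's actual schema; stdlib dataclasses stand in for the real Pydantic models.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CoordinationMetrics:
    """Hypothetical metadata block; field names mirror the list in the text."""

    estimated_cost: float
    actual_cost: float
    quality_score: float
    amdahl_ceiling: float
    straggler_gap: float


def task_response(task_id: str, metrics: CoordinationMetrics) -> dict:
    """Build a JSON-serializable body for a GET /tasks/{id} response (assumed shape)."""
    return {"id": task_id, "coordination_metrics": asdict(metrics)}


# Sample values are made up purely for illustration.
metrics = CoordinationMetrics(
    estimated_cost=0.42,
    actual_cost=0.51,
    quality_score=0.87,
    amdahl_ceiling=3.2,
    straggler_gap=0.15,
)
body = task_response("task-123", metrics)
print(json.dumps(body, indent=2))
```

Keeping the five values in one nested object means existing response fields are untouched and clients that ignore unknown keys keep working.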
136139---
137140
@@ -159,7 +162,14 @@ SynthOrg currently attributes all failure information to the executing agent's
159162
160163### Proposed Design
161164
162- **AgentContribution model** -- extend `CoordinationResult`:
165+ **AgentContribution model** -- integrate with `CoordinationResult`:
166+
167+ Note: `CoordinationResult` has `model_config = ConfigDict(frozen=True)`. Adding
168+ `agent_contributions` directly is a breaking change. The recommended approach is a
169+ separate wrapper: `CoordinationResultWithAttribution(result: CoordinationResult,
170+ agent_contributions: tuple[AgentContribution, ...])`, stored and returned in place of
171+ the bare result by `_post_execution_pipeline`. This preserves immutability and avoids
172+ migrating existing persisted `CoordinationResult` records.
163173
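The wrapper approach in the note above can be sketched as follows. This is an illustration only: frozen stdlib dataclasses stand in for the frozen Pydantic models so the snippet runs without dependencies, and every field other than `result` and `agent_contributions` (for example `task_id`, `success`, `contribution_score`) is a hypothetical placeholder.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class CoordinationResult:
    """Stand-in for the real model; these two fields are placeholders."""

    task_id: str
    success: bool


@dataclass(frozen=True)
class AgentContribution:
    """Stand-in; the real fields are defined in the proposed model."""

    agent_id: str
    contribution_score: float


@dataclass(frozen=True)
class CoordinationResultWithAttribution:
    """Wraps an unchanged CoordinationResult together with attribution data."""

    result: CoordinationResult
    agent_contributions: tuple[AgentContribution, ...]


base = CoordinationResult(task_id="task-123", success=True)
wrapped = CoordinationResultWithAttribution(
    result=base,
    agent_contributions=(
        AgentContribution("agent-a", 0.7),
        AgentContribution("agent-b", 0.3),
    ),
)

# The wrapper is as immutable as the result it carries.
try:
    wrapped.result = base
except FrozenInstanceError:
    print("wrapper is immutable")
```

Because the wrapped `CoordinationResult` is stored as-is, previously persisted results can still be loaded with the original type.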
164174```python
165175class AgentContribution(BaseModel):
@@ -231,7 +241,7 @@ Four signal categories that should drive pruning recommendations:
231241
232242### Proposed Protocol
233243
234- ```
244+ ```python
235245PruningEvaluation (new model)
236246 agent_id: str
237247 pruning_score: float # 0.0 = retain, 1.0 = prune
@@ -253,9 +263,19 @@ PruningService (new service)
253263```
254264
255265**Human approval gate**: Any `PruningEvaluation` with `recommendation="PRUNE"` creates an
256- `ApprovalItem` with `action_type="org:prune"` and `ApprovalRiskLevel.MEDIUM`. This follows
257- the same approval pattern used by the hiring and promotion pipelines. Pruning is never
258- fully automated -- it is recommendation + human approval.
266+ `ApprovalItem` following the same approval pattern used by the hiring and promotion
267+ pipelines. Required fields:
268+
269+ - `id`: unique UUID per `PruningEvaluation`
270+ - `title`: short summary, e.g. `"Prune agent {agent_id} ({reason})"`
271+ - `description`: rationale from `PruningEvaluation.signals` (quality decline slope,
272+ utilization, Jaccard overlap), affected team, and safety constraint check results
273+ - `requested_by`: the `PruningService` identifier or calling system
274+ - `action_type`: `"org:prune"`
275+ - `risk_level`: `ApprovalRiskLevel.MEDIUM`
276+ - `created_at`: ISO 8601 timestamp
277+
278+ Pruning is never fully automated -- it is recommendation + human approval.
259279
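A rough sketch of how the approval gate could populate those fields. The `build_prune_approval` helper, its `reason` and `requested_by` parameters, the `"pruning-service"` identifier, and the shape of `signals` are assumptions for illustration; stdlib dataclasses stand in for the real models, and the description here covers only the signals, not the affected team or safety check results.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class ApprovalRiskLevel(Enum):
    """Stand-in enum; only MEDIUM is named in the text."""

    MEDIUM = "medium"


@dataclass(frozen=True)
class PruningEvaluation:
    """Stand-in with a subset of the proposed fields."""

    agent_id: str
    pruning_score: float
    recommendation: str
    signals: dict[str, float]


@dataclass(frozen=True)
class ApprovalItem:
    id: str
    title: str
    description: str
    requested_by: str
    action_type: str
    risk_level: ApprovalRiskLevel
    created_at: str


def build_prune_approval(
    evaluation: PruningEvaluation,
    reason: str,
    requested_by: str = "pruning-service",  # hypothetical identifier
) -> Optional[ApprovalItem]:
    """Return an ApprovalItem for PRUNE recommendations, otherwise None."""
    if evaluation.recommendation != "PRUNE":
        return None
    summary = ", ".join(f"{k}={v}" for k, v in sorted(evaluation.signals.items()))
    return ApprovalItem(
        id=str(uuid.uuid4()),
        title=f"Prune agent {evaluation.agent_id} ({reason})",
        description=f"Signals: {summary}",
        requested_by=requested_by,
        action_type="org:prune",
        risk_level=ApprovalRiskLevel.MEDIUM,
        created_at=datetime.now(timezone.utc).isoformat(),
    )


evaluation = PruningEvaluation(
    agent_id="agent-7",
    pruning_score=0.9,
    recommendation="PRUNE",
    signals={"utilization": 0.05, "jaccard_overlap": 0.8},
)
item = build_prune_approval(evaluation, reason="low utilization")
print(item.title)
```

The helper only creates the approval request; nothing is pruned until a human acts on the item.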
260280### Safety Constraints
261281
@@ -322,6 +342,13 @@ node types executed in that turn would improve execution trace analysis without
322342significant refactoring. This is optional but would directly enable structural credit
323343assignment (knowing which node type failed).
324344
345+ **Backward compatibility**: `TurnRecord` is part of execution traces and may be
346+ persisted. The `node_types` field must be added as **optional with a default** (e.g.,
347+ `node_types: tuple[NodeType, ...] = ()`) so existing records remain valid without
348+ migration. Serialization/deserialization must tolerate the absent field. Consumers
349+ (trace analyzers, evaluation pipelines) should treat an empty tuple as "unknown
350+ composition" rather than erroring.
351+
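The compatibility rule above can be illustrated with a small sketch. The `NodeType` members, the `turn_index` field, and both helper functions are hypothetical; a frozen stdlib dataclass stands in for the real `TurnRecord` model.

```python
from dataclasses import dataclass
from enum import Enum


class NodeType(Enum):
    """Stand-in enum; member names are hypothetical."""

    LLM_CALL = "llm_call"
    TOOL_CALL = "tool_call"


@dataclass(frozen=True)
class TurnRecord:
    """Stand-in with one illustrative field plus the new optional one."""

    turn_index: int
    node_types: tuple[NodeType, ...] = ()


def turn_record_from_dict(data: dict) -> TurnRecord:
    """Deserialize a record, tolerating a missing node_types key."""
    raw = data.get("node_types", ())
    return TurnRecord(
        turn_index=data["turn_index"],
        node_types=tuple(NodeType(value) for value in raw),
    )


def describe_composition(record: TurnRecord) -> str:
    """Treat an empty tuple as unknown composition instead of raising."""
    if not record.node_types:
        return "unknown composition"
    return ", ".join(node.value for node in record.node_types)


legacy = turn_record_from_dict({"turn_index": 0})  # persisted before the change
current = turn_record_from_dict(
    {"turn_index": 1, "node_types": ["llm_call", "tool_call"]}
)
print(describe_composition(legacy))   # unknown composition
print(describe_composition(current))  # llm_call, tool_call
```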
325352---
326353
327354## Summary of Recommendations