feat: structured failure diagnosis in RecoveryResult #706
Closed
Labels
prio:high (Important, should be prioritized), scope:small (Less than 1 day of work), spec:task-workflow (DESIGN_SPEC Section 6 - Task & Workflow Engine), type:feature (New feature implementation), v0.7 (Minor version v0.7), v0.7.0 (Patch release v0.7.0)
Context
A deep dive on Hive (aden-hive) revealed that its self-healing mechanism starts with structured failure diagnosis -- not just "task failed" but which specific node failed, which criteria it fell short on, and the full decision log.
SynthOrg's RecoveryResult currently lacks this structure, making checkpoint recovery and task reassignment routing less informed.
Action Items
- Add failure_category enum to RecoveryResult (e.g., TOOL_FAILURE, STAGNATION, BUDGET_EXCEEDED, QUALITY_GATE_FAILED, TIMEOUT, DELEGATION_FAILED)
- criteria_failed: list[str] -- which specific acceptance criteria were not met
- stagnation_evidence: StagnationEvidence | None -- link stagnation detection data when applicable
- failure_context: dict[str, Any] -- structured bag for domain-specific failure data (tool error messages, provider errors, etc.)
Design Notes
This is a model extension, not a new module. Extends existing RecoveryResult in engine/recovery.py. Low effort, immediate improvement to recovery quality.
References