feat: add intra-loop stagnation detector to execution loops #415
Closed
Labels
- prio:medium (Should do, but not blocking)
- scope:small (Less than 1 day of work)
- spec:task-workflow (DESIGN_SPEC Section 6 - Task & Workflow Engine)
- type:feature (New feature implementation)
Description
Summary
Add stagnation detection to execution loops (ReactLoop, PlanExecuteLoop) that identifies when an agent is repeating the same tool calls without making progress, and intervenes.
Motivation
From the MADQA paper (arXiv:2603.12180, Snowflake/Oxford): agents persist in unproductive loops, achieving ~20% below oracle accuracy via brute-force exhaustive search rather than strategic reasoning. The current execution loops have only a max_turns ceiling and a budget_checker cost gate; there is no mechanism to detect repetitive patterns.
Design
- Analyze TurnRecord.tool_calls_made history across the most recent N turns
- Detect repetitive patterns: same tool + same args repeated, same documents accessed, no new information gained
- On detection, two responses:
  - Corrective prompt injection: "You appear to be repeating the same actions without progress; try a different approach"
  - Early termination: terminate with a STAGNATION termination reason (new enum value)
- Configurable sensitivity: window size (N turns), similarity threshold, max repeated patterns
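
The design above could be sketched roughly as follows. This is a minimal illustration, not the proposed implementation: `ToolCall`, `StagnationConfig`, `StagnationDetector`, `StagnationAction`, and all field names are hypothetical, and `ToolCall` is only a stand-in for whatever `TurnRecord.tool_calls_made` actually holds. It covers only the "same tool + same args" signal (not document access or information gain), and it assumes the two responses are escalated in order: corrective prompt first, termination if repetition persists.

```python
import json
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class StagnationAction(Enum):
    NONE = "none"
    INJECT_PROMPT = "inject_prompt"
    TERMINATE = "terminate"


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical stand-in for an entry of TurnRecord.tool_calls_made."""
    name: str
    args: dict[str, Any]


@dataclass(frozen=True)
class StagnationConfig:
    """Assumed knobs; names and defaults are illustrative only."""
    window_size: int = 5           # N most recent turns to inspect
    repeat_threshold: float = 0.6  # fraction of duplicate calls that counts as stagnation
    max_corrections: int = 1       # corrective prompts allowed before terminating


def fingerprint(call: ToolCall) -> str:
    # Sort keys so argument order does not affect equality.
    return f"{call.name}:{json.dumps(call.args, sort_keys=True, default=str)}"


def repetition_ratio(turns: list[list[ToolCall]], window_size: int) -> float:
    """Fraction of tool calls in the window that duplicate another call in it."""
    prints = [fingerprint(c) for turn in turns[-window_size:] for c in turn]
    if not prints:
        return 0.0
    return (len(prints) - len(Counter(prints))) / len(prints)


@dataclass
class StagnationDetector:
    config: StagnationConfig = field(default_factory=StagnationConfig)
    corrections_issued: int = 0

    def check(self, turns: list[list[ToolCall]]) -> StagnationAction:
        # Wait until a full window of turns is available.
        if len(turns) < self.config.window_size:
            return StagnationAction.NONE
        ratio = repetition_ratio(turns, self.config.window_size)
        if ratio < self.config.repeat_threshold:
            return StagnationAction.NONE
        if self.corrections_issued < self.config.max_corrections:
            self.corrections_issued += 1
            return StagnationAction.INJECT_PROMPT
        return StagnationAction.TERMINATE
```

A loop would call `check` once per turn with the per-turn tool call history, inject the corrective prompt on `INJECT_PROMPT`, and stop with the new termination reason on `TERMINATE`. Using exact fingerprints keeps the check cheap; a fuzzier similarity measure would be needed to catch near-duplicate arguments.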
Affected Files
- src/ai_company/engine/react_loop.py
- src/ai_company/engine/plan_execute_loop.py
- src/ai_company/engine/loop_protocol.py (add STAGNATION to termination reasons)
- src/ai_company/engine/run_result.py
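
For the loop_protocol.py change, the addition might look something like the sketch below. The enum name `TerminationReason` and every member other than `STAGNATION` are assumptions made for illustration; the actual enum in the repository may be named and populated differently.

```python
from enum import Enum


class TerminationReason(Enum):
    # Placeholder members; the real enum in loop_protocol.py may differ.
    COMPLETED = "completed"
    MAX_TURNS = "max_turns"
    BUDGET_EXHAUSTED = "budget_exhausted"
    # New value proposed by this issue.
    STAGNATION = "stagnation"
```

Both loops could then report `TerminationReason.STAGNATION` in their result when the detector asks for early termination.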
Research
- MADQA: Strategic Nav vs Stochastic Search (arXiv:2603.12180) — primary motivation
- Connects to Memex (arXiv:2603.04257) redundant tool call penalty concept
- Raw data already available in TurnRecord.tool_calls_made