feat(engine): implement execution loop auto-selection based on task complexity#567
Dependency Review
✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
Scanned Files: None
Caution: Review failed. The pull request was closed or merged during review.
📝 Walkthrough
Adds automatic per-task execution-loop selection based on task complexity with budget-aware downgrade and hybrid fallback. Introduces a new loop_selector module and re-exports, integrates auto-loop config into AgentEngine (per-task resolution and resume wiring), adds budget utilization querying, new observability events, tests, docs updates, and CI vulnerability ignore entries.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant AgentEngine
participant LoopSelector
participant BudgetEnforcer
participant LoopFactory
participant ExecutionLoop
Client->>AgentEngine: run(task, auto_loop_config)
AgentEngine->>AgentEngine: _resolve_loop(task)
AgentEngine->>LoopSelector: select_loop_type(complexity, rules, budget_utilization=None)
LoopSelector-->>AgentEngine: loop_type
alt loop_type == "hybrid" and BudgetEnforcer present
AgentEngine->>BudgetEnforcer: get_budget_utilization_pct()
BudgetEnforcer-->>AgentEngine: budget_pct or None
AgentEngine->>LoopSelector: select_loop_type(complexity, rules, budget_pct)
LoopSelector-->>AgentEngine: final_loop_type
end
AgentEngine->>LoopFactory: build_execution_loop(final_loop_type, ...)
LoopFactory-->>AgentEngine: ExecutionLoop instance
AgentEngine->>ExecutionLoop: execute(task)
ExecutionLoop-->>AgentEngine: result
AgentEngine-->>Client: AgentRunResult
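The flow in the diagram can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the PR's implementation: the `Rule` class, the `RULES` table, the 80% threshold, and the `resolve_loop_type` helper are hypothetical stand-ins, and the real `select_loop_type` signature may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """Maps one task complexity to a loop type (hypothetical stand-in)."""

    complexity: str
    loop_type: str


# Illustrative rules only; the project's DEFAULT_AUTO_LOOP_RULES may differ.
RULES = (
    Rule("simple", "react"),
    Rule("medium", "plan_execute"),
    Rule("complex", "hybrid"),
)


def select_loop_type(
    complexity: str,
    rules: Sequence[Rule],
    budget_utilization_pct: Optional[float] = None,
    *,
    budget_tight_threshold: float = 80.0,  # assumed value, not from the PR
    hybrid_fallback: Optional[str] = "plan_execute",
    default_loop_type: str = "react",
) -> str:
    # Layer 1: rule matching (complexity -> loop type).
    loop_type = next(
        (r.loop_type for r in rules if r.complexity == complexity),
        default_loop_type,
    )
    if loop_type != "hybrid":
        return loop_type
    # Layer 2: budget-aware downgrade when utilization is known and high.
    if (
        budget_utilization_pct is not None
        and budget_utilization_pct >= budget_tight_threshold
    ):
        return "plan_execute"
    # Layer 3: hybrid fallback until a hybrid loop can actually be built.
    return hybrid_fallback if hybrid_fallback is not None else "hybrid"


def resolve_loop_type(
    complexity: str,
    budget_pct_provider: Optional[Callable[[], Optional[float]]] = None,
) -> str:
    # Query the budget only when the rule match alone would pick "hybrid".
    preliminary = select_loop_type(complexity, RULES, hybrid_fallback=None)
    if preliminary != "hybrid" or budget_pct_provider is None:
        return select_loop_type(complexity, RULES)
    return select_loop_type(complexity, RULES, budget_pct_provider())
```

Keeping the selector a pure function of its inputs and leaving the budget query to the caller mirrors the split between LoopSelector and AgentEngine in the diagram, and makes the precedence of downgrade over fallback easy to test.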
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a sophisticated, automated system for dynamically selecting the most appropriate execution loop for agent tasks. By considering both task complexity and current budget utilization, the system optimizes resource allocation and operational efficiency, ensuring that agents use the most suitable strategy for their given workload while also managing costs. This enhancement significantly improves the adaptability and intelligence of the agent engine.
Code Review
This pull request introduces a well-designed feature for automatically selecting the agent execution loop based on task complexity and budget utilization. The implementation is robust, with new configuration models, a dedicated selection module, and comprehensive tests. My review identifies a critical syntax error and a significant logic bug in the checkpoint resume path that should be addressed. I've also included a couple of medium-severity suggestions to improve logging and future flexibility.
    monthly_cost = await self._cost_tracker.get_total_cost(
        start=period_start,
    )
except MemoryError, RecursionError:
src/synthorg/engine/agent_engine.py
Outdated
    )

-   loop = self._make_loop_with_callback(agent_id, task_id)
+   loop = self._make_loop_with_callback(self._loop, agent_id, task_id)
When resuming from a checkpoint, the execution loop is hardcoded to self._loop, which bypasses the new auto-selection logic. If a task was using an auto-selected loop (e.g., plan_execute for a complex task) and crashed, it would be resumed with the default loop (e.g., ReactLoop), which could lead to incorrect behavior. The loop for a resumed execution should also be resolved dynamically using _resolve_loop based on the task's complexity.
resolved_loop = await self._resolve_loop(checkpoint_ctx.task_execution.task)
loop = self._make_loop_with_callback(resolved_loop, agent_id, task_id)
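To illustrate the concern, here is a minimal sketch of a resume path that avoids falling back to the default loop. All names (`CheckpointSketch`, `EngineSketch`, `loop_type_for_resume`, the persisted `loop_type` field) are hypothetical and only approximate the shape of the engine; the actual `_resolve_loop` takes a task and returns a loop instance.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CheckpointSketch:
    """Hypothetical checkpoint context holding what resume needs."""

    task_complexity: str
    loop_type: Optional[str] = None  # assumed persisted field, may not exist


class EngineSketch:
    def __init__(self, default_loop_type: str = "react") -> None:
        self._default_loop_type = default_loop_type

    async def _resolve_loop(self, complexity: str) -> str:
        # Stand-in for complexity-based auto-selection.
        rules = {"complex": "plan_execute"}
        return rules.get(complexity, self._default_loop_type)

    async def loop_type_for_resume(self, checkpoint: CheckpointSketch) -> str:
        # Prefer the loop type recorded with the checkpoint, if any.
        if checkpoint.loop_type is not None:
            return checkpoint.loop_type
        # Otherwise re-run selection instead of using the default loop.
        return await self._resolve_loop(checkpoint.task_complexity)
```

Re-running selection is the smaller change, but it can pick a different loop than the original run if budget utilization shifted in between; persisting the loop type with the checkpoint avoids that drift.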
src/synthorg/engine/agent_engine.py
Outdated
    note=(
        "budget enforcer present but utilization "
        "unknown; proceeding without budget-aware "
        "loop downgrade"
    ),
src/synthorg/engine/loop_selector.py
Outdated
loop_type = next(
    (r.loop_type for r in rules if r.complexity == complexity),
    "react",
)
The fallback loop type for when no rule matches a task's complexity is hardcoded to "react". This could be made more flexible by adding a default_loop_type field to AutoLoopConfig, similar to hybrid_fallback. This would allow users to configure a different default if desired, for instance, defaulting to plan_execute for any unmapped complexities.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #567      +/-   ##
==========================================
+ Coverage   92.64%   92.66%   +0.01%
==========================================
  Files         544      545       +1
  Lines       26931    27061     +130
  Branches     2582     2603      +21
==========================================
+ Hits        24951    25075     +124
  Misses       1568     1568
- Partials      412      418       +6
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/engine/agent_engine.py`:
- Around line 903-904: The resume path is rebuilding the loop using self._loop
(which may be the default ReactLoop) causing checkpoints created under a
different loop like PlanExecuteLoop to resume under the wrong loop; change the
resume code in the resume path (where self._make_loop_with_callback and
loop.execute are used) to resolve/recreate the original loop type from persisted
checkpoint metadata (preferably stored on checkpoint_ctx, e.g.
checkpoint_ctx.task_execution.task or an explicit loop_type field) before
injecting the checkpoint callback, and pass that reconstructed loop into
self._make_loop_with_callback instead of self._loop so loop-specific state and
callbacks are preserved (also apply same change to the other resume locations
around the 987-1031 range).
In `@src/synthorg/engine/loop_selector.py`:
- Around line 40-41: The AutoLoopConfig should validate provided loop_type and
hybrid_fallback against the allowed set _KNOWN_LOOP_TYPES as soon as the config
is constructed (instead of letting build_execution_loop fail later); add a
validation guard in AutoLoopConfig (e.g., in its __post_init__ or factory
method) that checks loop_type is in _KNOWN_LOOP_TYPES and, if hybrid_fallback is
provided, that it is also in _KNOWN_LOOP_TYPES (and disallow None if "hybrid"
cannot be built), raising a clear ValueError for invalid values; update any
callers that construct AutoLoopConfig to rely on this and remove downstream
assumptions in build_execution_loop or factory code (reference symbols:
AutoLoopConfig, loop_type, hybrid_fallback, _KNOWN_LOOP_TYPES,
build_execution_loop).
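The construction-time guard described above can be sketched with a standard-library dataclass. This is an assumption-laden stand-in: the project's `AutoLoopConfig` is a Pydantic model with different fields, and the contents of `_KNOWN_LOOP_TYPES` here are guessed from the loop names mentioned in this review.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed membership; the real set lives in loop_selector.py.
_KNOWN_LOOP_TYPES = frozenset({"react", "plan_execute", "hybrid"})


@dataclass(frozen=True)
class AutoLoopConfigSketch:
    """Hypothetical config that rejects bad loop names at construction."""

    loop_type: str = "react"
    hybrid_fallback: Optional[str] = "plan_execute"

    def __post_init__(self) -> None:
        if self.loop_type not in _KNOWN_LOOP_TYPES:
            msg = f"Unknown loop_type: {self.loop_type!r}"
            raise ValueError(msg)
        if (
            self.hybrid_fallback is not None
            and self.hybrid_fallback not in _KNOWN_LOOP_TYPES
        ):
            msg = f"Unknown hybrid_fallback: {self.hybrid_fallback!r}"
            raise ValueError(msg)


def is_valid(**kwargs: Optional[str]) -> bool:
    """Report whether the given field values pass validation."""
    try:
        AutoLoopConfigSketch(**kwargs)
    except ValueError:
        return False
    return True
```

Failing in the constructor means a misspelled loop name surfaces when the configuration is loaded rather than when the first matching task runs.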
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0665d678-2aab-4e6a-b3c7-1f434f03681f
📒 Files selected for processing (12)
- CLAUDE.md
- README.md
- docs/design/engine.md
- src/synthorg/budget/enforcer.py
- src/synthorg/engine/__init__.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/engine/loop_selector.py
- src/synthorg/observability/events/budget.py
- src/synthorg/observability/events/execution.py
- tests/unit/budget/test_enforcer.py
- tests/unit/engine/test_agent_engine_auto_loop.py
- tests/unit/engine/test_loop_selector.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web
- GitHub Check: Build Sandbox
- GitHub Check: Build Backend
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (10)
{src/synthorg/**/*.py,tests/**/*.py,**/*.md,web/**/{*.ts,*.js,*.vue}}
📄 CodeRabbit inference engine (CLAUDE.md)
Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases. Vendor names only in: (1) Operations design page, (2) `.claude/` files, (3) third-party import paths/modules
Files:
README.md, docs/design/engine.md, src/synthorg/budget/enforcer.py, CLAUDE.md, src/synthorg/observability/events/execution.py, tests/unit/engine/test_loop_selector.py, src/synthorg/engine/loop_selector.py, src/synthorg/observability/events/budget.py, tests/unit/engine/test_agent_engine_auto_loop.py, src/synthorg/engine/agent_engine.py, src/synthorg/engine/__init__.py, tests/unit/budget/test_enforcer.py
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Markdown: use for all documentation files (`docs/`, `site/`, README, etc.)
Files:
README.md, docs/design/engine.md, CLAUDE.md
docs/design/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
docs/design/*.md: When approved deviations occur, update the relevant `docs/design/` page to reflect the new reality
Design spec pages: 7 pages in `docs/design/`: index, agents, organization, communication, engine, memory, operations
Files:
docs/design/engine.md
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations`; Python 3.14 has PEP 649 native lazy annotations
Use PEP 758 except syntax: use `except A, B:` (no parentheses); ruff enforces this on Python 3.14
Type hints required on all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Create new objects instead of mutating existing ones; for non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement
For `dict`/`list` fields in frozen Pydantic models, use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`); use `@computed_field` for derived values instead of storing + validating redundant fields; use `NotBlankStr` for all identifier/name fields including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants instead of manual whitespace validators
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`
Line length: 88 characters (ruff)
Functions should be < 50 lines, files < 800 lines
Validate at system boundaries (user input, external APIs, config files)
Files:
src/synthorg/budget/enforcer.py, src/synthorg/observability/events/execution.py, tests/unit/engine/test_loop_selector.py, src/synthorg/engine/loop_selector.py, src/synthorg/observability/events/budget.py, tests/unit/engine/test_agent_engine_auto_loop.py, src/synthorg/engine/agent_engine.py, src/synthorg/engine/__init__.py, tests/unit/budget/test_enforcer.py
**/{*.py,*.go}
📄 CodeRabbit inference engine (CLAUDE.md)
Handle errors explicitly, never silently swallow them
Files:
src/synthorg/budget/enforcer.py, src/synthorg/observability/events/execution.py, tests/unit/engine/test_loop_selector.py, src/synthorg/engine/loop_selector.py, src/synthorg/observability/events/budget.py, tests/unit/engine/test_agent_engine_auto_loop.py, src/synthorg/engine/agent_engine.py, src/synthorg/engine/__init__.py, tests/unit/budget/test_enforcer.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
Never use `import logging` / `logging.getLogger()` / `print()` in application code
Variable name for logger: always `logger` (not `_logger`, not `log`)
Event names: always use constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Use structured logging: always `logger.info(EVENT, key=value)`, never `logger.info("msg %s", val)`
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
DEBUG level logging for object creation, internal flow, entry/exit of key functions
Library reference: auto-generated from docstrings via mkdocstrings + Griffe (AST-based, no imports) in `docs/api/`
Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP, tool factory, sandbox factory)
Files:
src/synthorg/budget/enforcer.py, src/synthorg/observability/events/execution.py, src/synthorg/engine/loop_selector.py, src/synthorg/observability/events/budget.py, src/synthorg/engine/agent_engine.py, src/synthorg/engine/__init__.py
src/synthorg/budget/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
Files:
src/synthorg/budget/enforcer.py
src/synthorg/observability/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Files:
src/synthorg/observability/events/execution.py, src/synthorg/observability/events/budget.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Test markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Async testing: `asyncio_mode = "auto"`; no manual `@pytest.mark.asyncio` needed
Prefer `@pytest.mark.parametrize` for testing similar cases
Never skip, dismiss, or ignore flaky tests; always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Files:
tests/unit/engine/test_loop_selector.py, tests/unit/engine/test_agent_engine_auto_loop.py, tests/unit/budget/test_enforcer.py
src/synthorg/engine/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Files:
src/synthorg/engine/loop_selector.py, src/synthorg/engine/agent_engine.py, src/synthorg/engine/__init__.py
🧠 Learnings (14)
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Applied to files:
README.md, docs/design/engine.md, CLAUDE.md, src/synthorg/engine/loop_selector.py, tests/unit/engine/test_agent_engine_auto_loop.py, src/synthorg/engine/agent_engine.py, src/synthorg/engine/__init__.py
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
Applied to files:
src/synthorg/budget/enforcer.py, src/synthorg/observability/events/budget.py, tests/unit/budget/test_enforcer.py
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...
Applied to files:
CLAUDE.md, src/synthorg/engine/__init__.py
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/communication/**/*.py : Communication package (communication/): message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution; meeting/ subpackage for meeting protocol (round-robin, position papers, structured phases), scheduler (frequency, participant resolver), orchestrator
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue — DESIGN_SPEC.md is a pointer file linking to 7 design pages (Agents, Organization, Communication, Engine, Memory, Operations)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/backup/**/*.py : Backup package (backup/): scheduled/manual/lifecycle backups of persistence DB, agent memory, company config. BackupService orchestrator, BackupScheduler (periodic asyncio task), RetentionManager (count + age pruning), tar.gz compression, SHA-256 checksums, manifest tracking, validated restore with atomic rollback and safety backup. handlers/ subpackage: ComponentHandler protocol + concrete handlers (PersistenceComponentHandler, MemoryComponentHandler, ConfigComponentHandler)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/execution.py, src/synthorg/observability/events/budget.py
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/execution.py, src/synthorg/observability/events/budget.py
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/execution.py, src/synthorg/observability/events/budget.py
🧬 Code graph analysis (5)
tests/unit/engine/test_loop_selector.py (1)
- src/synthorg/engine/loop_selector.py (4): AutoLoopConfig (76-116), AutoLoopRule (44-56), build_execution_loop (194-236), select_loop_type (119-191)
src/synthorg/engine/loop_selector.py (4)
- src/synthorg/engine/plan_execute_loop.py (1): PlanExecuteLoop (84-930)
- src/synthorg/engine/react_loop.py (1): ReactLoop (61-352)
- src/synthorg/engine/loop_protocol.py (1): ExecutionLoop (158-196)
- src/synthorg/engine/stagnation/protocol.py (1): StagnationDetector (15-46)
src/synthorg/engine/agent_engine.py (6)
- src/synthorg/engine/loop_selector.py (3): AutoLoopConfig (76-116), build_execution_loop (194-236), select_loop_type (119-191)
- src/synthorg/engine/react_loop.py (3): stagnation_detector (104-106), get_loop_type (113-115), approval_gate (99-101)
- src/synthorg/engine/plan_execute_loop.py (3): stagnation_detector (131-133), get_loop_type (140-142), approval_gate (126-128)
- src/synthorg/engine/loop_protocol.py (2): get_loop_type (194-196), ExecutionLoop (158-196)
- src/synthorg/engine/checkpoint/resume.py (1): make_loop_with_callback (99-144)
- src/synthorg/budget/enforcer.py (1): get_budget_utilization_pct (96-130)
src/synthorg/engine/__init__.py (1)
- src/synthorg/engine/loop_selector.py (4): AutoLoopConfig (76-116), AutoLoopRule (44-56), build_execution_loop (194-236), select_loop_type (119-191)
tests/unit/budget/test_enforcer.py (3)
- tests/unit/budget/test_enforcer_quota.py (1): _patch_periods (61-75)
- tests/unit/budget/conftest.py (1): make_cost_record (288-309)
- src/synthorg/budget/enforcer.py (2): cost_tracker (92-94), get_budget_utilization_pct (96-130)
🪛 markdownlint-cli2 (0.21.0)
docs/design/engine.md
[warning] 421-421: Code block style
Expected: fenced; Actual: indented
(MD046, code-block-style)
🔇 Additional comments (11)
tests/unit/budget/test_enforcer.py (1)
1088-1161: LGTM! Comprehensive test coverage for `get_budget_utilization_pct`.
The test suite covers all key scenarios: correct percentage calculation, disabled budget (returns `None`), zero spend, over-budget (>100%), graceful degradation on tracker failure, and `MemoryError` propagation. Good use of `pytest.approx` for float comparisons and proper isolation via `_patch_periods`.
CLAUDE.md (1)
127-127: LGTM! Documentation accurately reflects the new auto-loop selection feature.
The engine package description now includes the loop selector module and its public API surface (`AutoLoopConfig`, `AutoLoopRule`, `select_loop_type`, `build_execution_loop`), aligning with the implementation in `loop_selector.py`.
README.md (1)
35-35: LGTM! README accurately reflects the new auto-selection capability.
The concise addition of "auto-selection by complexity" correctly summarizes the feature for end users.
src/synthorg/observability/events/budget.py (1)
34-36: LGTM! New budget utilization event constants follow established patterns.
The constants `BUDGET_UTILIZATION_QUERIED` and `BUDGET_UTILIZATION_ERROR` are correctly typed with `Final[str]` and follow the existing `budget.<domain>.<action>` naming convention.
src/synthorg/engine/__init__.py (1)
121-127: LGTM! Public API surface correctly expanded for auto-loop selection.
The engine package properly re-exports the new loop selector symbols (`AutoLoopConfig`, `AutoLoopRule`, `DEFAULT_AUTO_LOOP_RULES`, `build_execution_loop`, `select_loop_type`) with alphabetically sorted `__all__` entries.
Also applies to: 213-213, 237-238, 375-375, 382-382
src/synthorg/budget/enforcer.py (1)
96-130: LGTM! Well-implemented budget utilization query with proper error handling.
The method follows established patterns in the class:
- Graceful degradation on tracker failures (returns `None`)
- `MemoryError`/`RecursionError` properly re-raised
- Structured logging with domain-specific event constants
- Clear docstring explaining return semantics
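The error-handling pattern listed above can be sketched as follows. This is a synchronous, simplified stand-in rather than the reviewed method: the function name, the `get_total_cost` callable, and treating a non-positive budget as "disabled" are assumptions, and the real method is async and emits structured log events.

```python
from typing import Callable, Optional


def budget_utilization_pct(
    total_monthly: float,
    get_total_cost: Callable[[], float],
) -> Optional[float]:
    """Return spend as a percentage of the budget, or None if unknown."""
    # Assumed meaning of "disabled budget": a non-positive monthly total.
    if total_monthly <= 0:
        return None
    try:
        spent = get_total_cost()
    except (MemoryError, RecursionError):
        # Non-recoverable errors propagate to the caller unchanged.
        raise
    except Exception:
        # Any other tracker failure degrades gracefully to "unknown".
        return None
    # Values above 100.0 are possible when spend exceeds the budget.
    return spent / total_monthly * 100.0
```

The sketch uses the parenthesized `except (A, B):` form so it also runs on Python versions that predate PEP 758.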
src/synthorg/observability/events/execution.py (1)
75-82: LGTM! New auto-selection event constants follow established patterns.
The six new `EXECUTION_LOOP_*` constants are consistently named, properly typed with `Final[str]`, and cover the key decision points in the auto-loop selection flow (auto-selected, budget downgrade, hybrid fallback, no rule match, unknown type, budget unavailable).
docs/design/engine.md (2)
417-428: LGTM! Auto-selection documentation clearly explains the three-layer decision process.
The tip block now accurately describes:
- Rule matching (complexity → loop type)
- Budget-aware downgrade (hybrid → plan_execute when utilization ≥ threshold)
- Hybrid fallback (until HybridLoop is implemented)
This aligns with the implementation in `select_loop_type()`.
475-493: LGTM! Pipeline steps updated to reflect auto-loop resolution.
The new step 8 "Resolve execution loop" clearly documents the dynamic loop selection via `select_loop_type()` with budget utilization query, and subsequent steps are properly renumbered.
tests/unit/engine/test_loop_selector.py (1)
168-193: Good precedence coverage.
These cases pin the `budget downgrade -> hybrid fallback` ordering, which is exactly the kind of selector behavior that tends to regress when new branches get added.
tests/unit/engine/test_agent_engine_auto_loop.py (1)
150-253: Useful end-to-end budget-selection coverage.
Driving the real engine and asserting `EXECUTION_LOOP_AUTO_SELECTED` gives good protection against selector/engine wiring regressions.
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/synthorg/engine/loop_selector.py (1)
122-143: ⚠️ Potential issue | 🟠 Major
The validator still allows runtime-only `hybrid` failures.
It now rejects unknown loop names, but configs like the default rules plus `hybrid_fallback=None` or `hybrid_fallback="hybrid"` still pass construction even though `build_execution_loop()` cannot instantiate `"hybrid"`. That means the first complex/epic task fails only at execution time. Reject any config whose reachable output set still includes `"hybrid"` until `HybridLoop` exists.
Suggested validation guard
```diff
+_BUILDABLE_LOOP_TYPES: frozenset[str] = frozenset({"react", "plan_execute"})
+
 class AutoLoopConfig(BaseModel):
@@
     @model_validator(mode="after")
     def _validate_rules_and_fallbacks(self) -> Self:
         """Validate unique complexities and known loop types."""
         seen: set[Complexity] = set()
         for rule in self.rules:
@@
                 msg = f"Unknown loop type in rules: {rule.loop_type!r}"
                 raise ValueError(msg)
             seen.add(rule.complexity)
+        hybrid_reachable = (
+            self.default_loop_type == "hybrid"
+            or any(rule.loop_type == "hybrid" for rule in self.rules)
+        )
         if (
             self.hybrid_fallback is not None
-            and self.hybrid_fallback not in _KNOWN_LOOP_TYPES
+            and self.hybrid_fallback not in _BUILDABLE_LOOP_TYPES
         ):
-            msg = f"Unknown hybrid_fallback: {self.hybrid_fallback!r}"
+            msg = f"Unsupported hybrid_fallback: {self.hybrid_fallback!r}"
             raise ValueError(msg)
+        if hybrid_reachable and self.hybrid_fallback is None:
+            msg = "hybrid_fallback=None is unsupported until HybridLoop exists"
+            raise ValueError(msg)
         if self.default_loop_type not in _KNOWN_LOOP_TYPES:
             msg = f"Unknown default_loop_type: {self.default_loop_type!r}"
             raise ValueError(msg)
         return self
```
As per coding guidelines, "Validate at system boundaries (user input, external APIs, config files)."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/loop_selector.py` around lines 122 - 143, The validator _validate_rules_and_fallbacks currently only checks membership in _KNOWN_LOOP_TYPES, allowing configurations (rules, hybrid_fallback, default_loop_type) that include the "hybrid" name to pass even though build_execution_loop cannot instantiate a HybridLoop; update the validator to treat "hybrid" as currently unavailable by rejecting any configuration where any reachable loop name (from each rule.loop_type, hybrid_fallback if not None, and default_loop_type) equals "hybrid" (or more generally any name in a new UNAVAILABLE_LOOP_TYPES set) and raise a ValueError with a clear message; refer to symbols _validate_rules_and_fallbacks, rules, rule.loop_type, hybrid_fallback, default_loop_type, build_execution_loop, and HybridLoop so the check prevents configs that would fail at execution time.
src/synthorg/engine/agent_engine.py (1)
909-914: ⚠️ Potential issue | 🟠 Major
Resume still recomputes the loop from live state.
This uses the current budget/config instead of the loop that produced the checkpoint. With a custom `hybrid_fallback="react"`, a complex task can checkpoint under `ReactLoop` and resume under `PlanExecuteLoop` once budget crosses the threshold, or the reverse. That replays checkpoint state under a different loop family. Persist the selected loop type with the checkpoint and rebuild from that value on resume.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/agent_engine.py` around lines 909 - 914, Resume currently recomputes the loop from live config via _resolve_loop, which can replay a checkpoint under a different loop family (e.g., ReactLoop vs PlanExecuteLoop); persist the loop identity with the checkpoint (e.g., checkpoint_ctx.loop_type or similar) when creating a checkpoint, and on resume use that persisted loop_type to reconstruct the original base loop instead of calling _resolve_loop. Update the checkpoint creation path to record the loop type and update the resume path around checkpoint_ctx.task_execution / base_loop selection to rebuild the exact loop class (via the existing loop factory/mapping used by _resolve_loop or a small switch handling ReactLoop/PlanExecuteLoop/hybrid_fallback) before calling _make_loop_with_callback(agent_id, task_id).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/design/engine.md`:
- Around line 417-433: The admonition under "Auto-selection" is being parsed as
an indented code block because the numbered list and subsequent lines are
indented; remove the leading indentation so the content is normal prose.
Specifically, unindent the numbered list and the paragraphs referencing
execution_loop: "auto", AutoLoopConfig.rules, AutoLoopRule, hybrid_fallback, and
default_loop_type so they align directly under the admonition header (ensure
there's a blank line after the tip header), use standard Markdown list
indentation (no 4-space block), and verify the three numbered items render as a
normal ordered list rather than a code fence.
In `@tests/unit/engine/test_agent_engine_auto_loop.py`:
- Around line 331-358: The test currently calls _resolve_loop() directly instead
of exercising the resume path; update the test to call
AgentEngine._execute_resumed_loop() and ensure the resume path awaits
_resolve_loop by either (a) providing a minimal checkpoint-like context so
_execute_resumed_loop runs through the resume branch, or (b)
patching/monkeypatching AgentEngine._resolve_loop with an async mock/coroutine
that records when it's awaited and returns a loop, then call
engine._execute_resumed_loop(task, ...) and assert the mock was awaited and
returned value used (e.g., loop.get_loop_type() or log event); reference the
methods _execute_resumed_loop and _resolve_loop and keep the existing assertions
about the EXECUTION_LOOP_AUTO_SELECTED log entry.
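The patching approach in option (b) might look like the sketch below. `FakeEngine` and `FakeLoop` are hypothetical stand-ins so the example runs on its own; the real `AgentEngine._execute_resumed_loop()` takes a checkpoint context and IDs rather than a bare task string.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class FakeLoop:
    """Stand-in for an execution loop."""

    async def execute(self, task: str) -> str:
        return f"ran:{task}"


class FakeEngine:
    """Stand-in engine whose resume path awaits _resolve_loop()."""

    def __init__(self) -> None:
        self._loop = FakeLoop()

    async def _resolve_loop(self, task: str) -> FakeLoop:
        return self._loop

    async def _execute_resumed_loop(self, task: str) -> str:
        loop = await self._resolve_loop(task)
        return await loop.execute(task)


async def check_resume_uses_resolved_loop() -> str:
    engine = FakeEngine()
    resolved = FakeLoop()
    # Replace execute on this exact instance so the assertion below proves
    # that the resolved loop (not engine._loop) was the one that ran.
    resolved.execute = AsyncMock(return_value="resolved-result")
    resolve_mock = AsyncMock(return_value=resolved)
    with patch.object(engine, "_resolve_loop", resolve_mock):
        result = await engine._execute_resumed_loop("task-1")
    resolve_mock.assert_awaited_once_with("task-1")
    resolved.execute.assert_awaited_once_with("task-1")
    return result


print(asyncio.run(check_resume_uses_resolved_loop()))
```

Asserting on the instance-level mock rather than a class-level patch is what distinguishes "a loop was looked up" from "the looked-up loop was executed".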
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3e2b2f75-d27f-48c4-9a7f-875dddb2bfae
📒 Files selected for processing (7)
- CLAUDE.md
- docs/design/engine.md
- src/synthorg/budget/enforcer.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/engine/loop_selector.py
- tests/unit/engine/test_agent_engine_auto_loop.py
- tests/unit/engine/test_loop_selector.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build Sandbox
- GitHub Check: Build Backend
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (9)
{src/synthorg/**/*.py,tests/**/*.py,**/*.md,web/**/{*.ts,*.js,*.vue}}
📄 CodeRabbit inference engine (CLAUDE.md)
Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases. Vendor names only in: (1) Operations design page, (2) `.claude/` files, (3) third-party import paths/modules
Files:
`CLAUDE.md`, `src/synthorg/budget/enforcer.py`, `tests/unit/engine/test_loop_selector.py`, `tests/unit/engine/test_agent_engine_auto_loop.py`, `src/synthorg/engine/agent_engine.py`, `src/synthorg/engine/loop_selector.py`, `docs/design/engine.md`
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Markdown: use for all documentation files (`docs/`, `site/`, README, etc.)
Files:
`CLAUDE.md`, `docs/design/engine.md`
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`**/*.py`: No `from __future__ import annotations` (Python 3.14 has PEP 649 native lazy annotations)
Use PEP 758 except syntax: use `except A, B:` (no parentheses); ruff enforces this on Python 3.14
Type hints required on all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Create new objects instead of mutating existing ones; for non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement
For `dict`/`list` fields in frozen Pydantic models, use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`); use `@computed_field` for derived values instead of storing + validating redundant fields; use `NotBlankStr` for all identifier/name fields including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants instead of manual whitespace validators
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`
Line length: 88 characters (ruff)
Functions should be < 50 lines, files < 800 lines
Validate at system boundaries (user input, external APIs, config files)
Files:
`src/synthorg/budget/enforcer.py`, `tests/unit/engine/test_loop_selector.py`, `tests/unit/engine/test_agent_engine_auto_loop.py`, `src/synthorg/engine/agent_engine.py`, `src/synthorg/engine/loop_selector.py`
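The immutability guideline for non-Pydantic collections (deep copy at construction, then a read-only `MappingProxyType` view) could be applied along these lines. `ToolRegistry` and `with_tool` are hypothetical names for illustration, not the project's registry API.

```python
import copy
from collections.abc import Mapping
from types import MappingProxyType


class ToolRegistry:
    """Hypothetical read-only registry built from a caller-supplied dict."""

    def __init__(self, tools: Mapping[str, dict[str, object]]) -> None:
        # Deep copy so later mutation of the caller's dict cannot leak in,
        # then wrap in a proxy so the top-level mapping rejects assignment.
        self._tools = MappingProxyType(copy.deepcopy(dict(tools)))

    @property
    def tools(self) -> Mapping[str, dict[str, object]]:
        return self._tools

    def with_tool(self, name: str, spec: dict[str, object]) -> "ToolRegistry":
        """Return a new registry instead of mutating this one."""
        merged = dict(self._tools)
        merged[name] = spec
        return ToolRegistry(merged)
```

Note that `MappingProxyType` is a shallow read-only view: the nested spec dicts remain mutable, which is why the guideline pairs it with `copy.deepcopy()` at the boundaries.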
**/{*.py,*.go}
📄 CodeRabbit inference engine (CLAUDE.md)
Handle errors explicitly, never silently swallow them
Files:
`src/synthorg/budget/enforcer.py`, `tests/unit/engine/test_loop_selector.py`, `tests/unit/engine/test_agent_engine_auto_loop.py`, `src/synthorg/engine/agent_engine.py`, `src/synthorg/engine/loop_selector.py`
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`src/synthorg/**/*.py`: Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
Never use `import logging` / `logging.getLogger()` / `print()` in application code
Variable name for logger: always `logger` (not `_logger`, not `log`)
Event names: always use constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Use structured logging: always `logger.info(EVENT, key=value)`, never `logger.info("msg %s", val)`
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
DEBUG level logging for object creation, internal flow, entry/exit of key functions
Library reference: auto-generated from docstrings via mkdocstrings + Griffe (AST-based, no imports) in `docs/api/`
Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP, tool factory, sandbox factory)
Files:
`src/synthorg/budget/enforcer.py`, `src/synthorg/engine/agent_engine.py`, `src/synthorg/engine/loop_selector.py`
src/synthorg/budget/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
Files:
src/synthorg/budget/enforcer.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`tests/**/*.py`: Test markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Async testing: `asyncio_mode = "auto"`; no manual `@pytest.mark.asyncio` needed
Prefer `@pytest.mark.parametrize` for testing similar cases
Never skip, dismiss, or ignore flaky tests; always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Files:
`tests/unit/engine/test_loop_selector.py`, `tests/unit/engine/test_agent_engine_auto_loop.py`
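As a small illustration of the timing guideline, the sketch below makes a timeout check deterministic by patching `time.monotonic()` instead of sleeping. `wait_exceeded` is a hypothetical helper invented for the example.

```python
import time
from unittest.mock import patch


def wait_exceeded(start: float, timeout_s: float) -> bool:
    """Hypothetical helper: has timeout_s elapsed since start?"""
    return time.monotonic() - start >= timeout_s


def check_timeout_deterministically() -> tuple[bool, bool]:
    # Each call to time.monotonic() returns the next scripted value, so the
    # outcome does not depend on real wall-clock timing.
    with patch("time.monotonic", side_effect=[100.0, 100.5, 102.0]):
        start = time.monotonic()          # 100.0
        early = wait_exceeded(start, 1.0)  # 100.5 - 100.0 = 0.5 s elapsed
        late = wait_exceeded(start, 1.0)   # 102.0 - 100.0 = 2.0 s elapsed
    return early, late
```

The scripted clock gives the same result on a loaded CI runner as on a fast laptop, which is the point of mocking rather than widening margins.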
src/synthorg/engine/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Files:
`src/synthorg/engine/agent_engine.py`, `src/synthorg/engine/loop_selector.py`
docs/design/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
`docs/design/*.md`: When approved deviations occur, update the relevant `docs/design/` page to reflect the new reality
Design spec pages: 7 pages in `docs/design/` (index, agents, organization, communication, engine, memory, operations)
Files:
docs/design/engine.md
🧠 Learnings (13)
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Applied to files:
`CLAUDE.md`, `tests/unit/engine/test_agent_engine_auto_loop.py`, `src/synthorg/engine/agent_engine.py`, `src/synthorg/engine/loop_selector.py`, `docs/design/engine.md`
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/communication/**/*.py : Communication package (communication/): message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution; meeting/ subpackage for meeting protocol (round-robin, position papers, structured phases), scheduler (frequency, participant resolver), orchestrator
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue — DESIGN_SPEC.md is a pointer file linking to 7 design pages (Agents, Organization, Communication, Engine, Memory, Operations)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/backup/**/*.py : Backup package (backup/): scheduled/manual/lifecycle backups of persistence DB, agent memory, company config. BackupService orchestrator, BackupScheduler (periodic asyncio task), RetentionManager (count + age pruning), tar.gz compression, SHA-256 checksums, manifest tracking, validated restore with atomic rollback and safety backup. handlers/ subpackage: ComponentHandler protocol + concrete handlers (PersistenceComponentHandler, MemoryComponentHandler, ConfigComponentHandler)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
Applied to files:
`src/synthorg/budget/enforcer.py`, `src/synthorg/engine/agent_engine.py`
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).
Applied to files:
src/synthorg/engine/loop_selector.py
📚 Learning: 2026-03-18T21:35:45.198Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:35:45.198Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files)
Applied to files:
src/synthorg/engine/loop_selector.py
🧬 Code graph analysis (3)
tests/unit/engine/test_agent_engine_auto_loop.py (4)
- src/synthorg/budget/enforcer.py (2): `BudgetEnforcer` (56-473), `cost_tracker` (89-91)
- tests/unit/engine/conftest.py (4): `engine` (449-460), `MockCompletionProvider` (207-289), `make_completion_response` (292-310), `mock_provider_factory` (314-316)
- src/synthorg/engine/agent_engine.py (3): `AgentEngine` (130-1296), `run` (287-397), `_resolve_loop` (998-1044)
- src/synthorg/engine/loop_selector.py (1): `AutoLoopConfig` (83-143)
src/synthorg/engine/agent_engine.py (3)
- src/synthorg/engine/loop_selector.py (3): `AutoLoopConfig` (83-143), `build_execution_loop` (228-272), `select_loop_type` (146-225)
- src/synthorg/engine/checkpoint/resume.py (1): `make_loop_with_callback` (99-144)
- src/synthorg/budget/enforcer.py (1): `get_budget_utilization_pct` (93-123)
src/synthorg/engine/loop_selector.py (5)
- tests/unit/engine/conftest.py (1): `engine` (449-460)
- src/synthorg/engine/plan_execute_loop.py (1): `PlanExecuteLoop` (84-930)
- src/synthorg/engine/react_loop.py (1): `ReactLoop` (61-352)
- src/synthorg/engine/loop_protocol.py (1): `ExecutionLoop` (158-196)
- src/synthorg/engine/stagnation/protocol.py (1): `StagnationDetector` (15-46)
🪛 markdownlint-cli2 (0.21.0)
docs/design/engine.md
[warning] 421-421: Code block style
Expected: fenced; Actual: indented
(MD046, code-block-style)
Actionable comments posted: 3
♻️ Duplicate comments (1)
tests/unit/engine/test_agent_engine_auto_loop.py (1)
357-385: ⚠️ Potential issue | 🟡 Minor
Assert that the resolved loop is the one that actually runs.
`resolve_mock.assert_awaited_once()` only proves the resume path looked up a loop. If `_execute_resumed_loop()` regresses to `self._loop` after that await, this test can still pass. Patch `resolved_loop.execute` (or return a loop double) and assert that exact mock was awaited.
🧪 Tighten the assertion
```diff
     resolved_loop: ExecutionLoop = PlanExecuteLoop()
     resolve_mock = AsyncMock(return_value=resolved_loop)
@@
     with (
         patch.object(engine, "_resolve_loop", resolve_mock),
         patch.object(
-            PlanExecuteLoop,
+            resolved_loop,
             "execute",
             new_callable=AsyncMock,
             return_value=exec_result,
-        ),
+        ) as execute_mock,
     ):
         await engine._execute_resumed_loop(
             checkpoint_ctx,
             str(sample_agent_with_personality.id),
             str(task.id),
         )
@@
     resolve_mock.assert_awaited_once()
+    execute_mock.assert_awaited_once()
     call_task = resolve_mock.call_args[0][0]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/engine/test_agent_engine_auto_loop.py` around lines 357 - 385, The test currently only asserts resolve_mock.assert_awaited_once(), which doesn't prove the returned resolved_loop actually executed; update the test to patch or replace resolved_loop.execute with an AsyncMock (instead of patching PlanExecuteLoop.execute) and have resolve_mock return that resolved_loop; then after awaiting engine._execute_resumed_loop(...) assert that resolved_loop.execute (the AsyncMock) was awaited exactly once to ensure the specific resolved loop instance was run. Use the existing resolved_loop and resolve_mock identifiers and assert on resolved_loop.execute.await_count or resolved_loop.execute.assert_awaited_once().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/.trivyignore.yaml:
- Around line 13-19: Update the suppression for CVE-2026-32767 in
.trivyignore.yaml to include package scope and an expiration: add a purls entry
that targets the specific package (e.g., pkg:deb/debian/libexpat@2.7.4) and add
an expired_at field set ~90 days out so the suppression is time-limited; ensure
the suppression still contains the original statement text and the CVE id
(CVE-2026-32767) so future scans only ignore this exact CVE for that specific
package until the expiration.
In `@src/synthorg/engine/loop_selector.py`:
- Around line 167-246: The select_loop_type function is too large and should be
split into small helpers: extract the rule lookup, budget downgrade, and hybrid
fallback into three private functions (e.g., _match_loop_type(rules, complexity,
default_loop_type), _downgrade_for_budget(loop_type, budget_utilization_pct,
budget_tight_threshold), and _apply_hybrid_fallback(loop_type, hybrid_fallback))
and have select_loop_type call them in order; ensure each helper preserves the
existing logging calls and semantics (use the same log events
EXECUTION_LOOP_NO_RULE_MATCH, EXECUTION_LOOP_BUDGET_DOWNGRADE,
EXECUTION_LOOP_HYBRID_FALLBACK and the same parameter names), accept and return
the same values (strings or None where applicable), and keep select_loop_type
under 50 lines by delegating rule matching, budget-aware downgrade, and hybrid
fallback to these helpers.
- Around line 148-157: The current validation in loop_selector.py incorrectly
rejects configurations where default_loop_type == "hybrid" even when
self.hybrid_fallback is provided and would redirect to a buildable loop type;
update the validation logic so that the ValueError for an unbuildable
default_loop_type is only raised if the default remains unbuildable after
applying hybrid_fallback (i.e., if self.default_loop_type not in
_BUILDABLE_LOOP_TYPES AND not (self.default_loop_type == "hybrid" and
self.hybrid_fallback in _BUILDABLE_LOOP_TYPES)). Keep the existing
has_hybrid_rule check that requires hybrid_fallback when unbuildable loop types
exist, and ensure select_loop_type and AutoLoopConfig behavior remains
consistent with this adjusted validation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: fcd9ff83-0145-4eee-9e0e-7f7428653746
📒 Files selected for processing (5)
- .github/.grype.yaml
- .github/.trivyignore.yaml
- src/synthorg/engine/loop_selector.py
- tests/unit/engine/test_agent_engine_auto_loop.py
- tests/unit/engine/test_loop_selector.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`**/*.py`: No `from __future__ import annotations` (Python 3.14 has PEP 649 native lazy annotations).
Use `except A, B:` syntax (no parentheses) for exception handling: PEP 758 except syntax, enforced by ruff on Python 3.14.
All public functions require type hints; mypy strict mode is enforced.
Docstrings must use Google style and are required on all public classes and functions; enforced by ruff D rules.
Files:
`tests/unit/engine/test_agent_engine_auto_loop.py`, `tests/unit/engine/test_loop_selector.py`, `src/synthorg/engine/loop_selector.py`
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`tests/**/*.py`: Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`.
Async tests use asyncio_mode = 'auto'; no manual `@pytest.mark.asyncio` needed.
Test timeout: 30 seconds per test.
Prefer `@pytest.mark.parametrize` for testing similar cases.
Use Hypothesis for property-based testing with `@given` + `@settings`. Hypothesis profiles: ci (50 examples, default) and dev (1000 examples), controlled via the `HYPOTHESIS_PROFILE` env var. Run dev profile: `HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties`.
NEVER skip, dismiss, or ignore flaky tests; always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins.
Files:
`tests/unit/engine/test_agent_engine_auto_loop.py`, `tests/unit/engine/test_loop_selector.py`
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Use immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use `@computed_field` for derived values instead of storing + validating redundant fields. Use NotBlankStr (from core.types) for all identifier/name fields.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff).
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly, never silently swallow them.
Validate at system boundaries (user input, external APIs, config files).
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Tests must use test-provider, test-small-001, etc.
Files:
src/synthorg/engine/loop_selector.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`.
Never use `import logging` / `logging.getLogger()` / `print()` in application code.
Always use `logger` as the variable name (not `_logger`, not `log`).
Event names must always use constants from the domain-specific module under synthorg.observability.events. Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Use structured kwargs in logger calls: always `logger.info(EVENT, key=value)`, never `logger.info("msg %s", val)`.
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO level.
DEBUG logging for object creation, internal flow, entry/exit of key functions.
All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig.
Retryable errors (is_retryable=True) include: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately.
RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains.
Files:
src/synthorg/engine/loop_selector.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/**/*.py : Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to **/*.py : Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Security scanning: pip-audit (Python), npm audit (web dependencies), Trivy + Grype (Docker images), govulncheck (Go), gitleaks (secrets), zizmor (GitHub Actions workflows), OSSF Scorecard (supply chain), ZAP DAST (API), Socket.dev (typosquatting/malware detection).
Applied to files:
.github/.grype.yaml
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).
Applied to files:
src/synthorg/engine/loop_selector.py
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/**/*.py : Validate at system boundaries (user input, external APIs, config files).
Applied to files:
src/synthorg/engine/loop_selector.py
🧬 Code graph analysis (2)
tests/unit/engine/test_agent_engine_auto_loop.py (6)
  src/synthorg/budget/enforcer.py (1)
    BudgetEnforcer(56-473)
  tests/unit/engine/conftest.py (3)
    engine(449-460)
    MockCompletionProvider(207-289)
    make_completion_response(292-310)
  src/synthorg/engine/agent_engine.py (1)
    _execute_resumed_loop(886-926)
  src/synthorg/engine/loop_selector.py (1)
    AutoLoopConfig(85-164)
  src/synthorg/engine/plan_execute_loop.py (1)
    PlanExecuteLoop(84-930)
  src/synthorg/engine/react_loop.py (1)
    ReactLoop(61-352)
tests/unit/engine/test_loop_selector.py (2)
  src/synthorg/engine/loop_selector.py (4)
    AutoLoopConfig(85-164)
    AutoLoopRule(53-65)
    build_execution_loop(249-293)
    select_loop_type(167-246)
  src/synthorg/engine/react_loop.py (1)
    ReactLoop(61-352)
🔇 Additional comments (1)
.github/.grype.yaml (1)
10-14: Duplicate of the same suppression-risk pattern: avoid CVE-only global ignore.
Mirror the Trivy fix here: constrain by package/artifact where possible and track an expiry/revalidation date so this exception doesn’t become permanent technical risk.
In the current Grype configuration format, what ignore-rule fields can constrain a CVE suppression to a specific package/image/namespace, and is there built-in support for expiration metadata?
Based on learnings, this repository treats Trivy and Grype as core security scanners and should keep suppressions tightly controlled.
Actionable comments posted: 2
♻️ Duplicate comments (1)
.github/.trivyignore.yaml (1)
14-16: ⚠️ Potential issue | 🟠 Major
Use Trivy package scoping key `purls` (not `paths`) and verify `expired_at` schema format.
At Line 14, this looks like a package URL scope but it is under `paths`. In Trivy YAML ignores, package scoping is handled via `purls`; using `paths` here can make the suppression mis-scoped or ineffective. Also confirm whether Line 16 accepts RFC3339 timestamp or requires date-only format in your Trivy version.
Suggested fix
```diff
   - id: CVE-2026-32767
-    paths:
+    purls:
       - "pkg:apk/alpine/libexpat"
-    expired_at: "2026-06-17T00:00:00Z"
+    expired_at: "2026-06-17"
```
```shell
#!/bin/bash
# Read-only verification for Trivy ignore schema usage in repo.
# 1) Confirm current key usage in ignore file.
# 2) Check Trivy docs for supported fields and date format.
set -euo pipefail

echo "== Current ignore entry =="
cat -n .github/.trivyignore.yaml | sed -n '10,30p'
echo
echo "== Search for purls/paths usage =="
rg -n '^\s*(purls|paths|expired_at)\s*:' .github/.trivyignore.yaml
echo
echo "== Fetch Trivy docs snippets (public) =="
curl -fsSL https://trivy.dev/latest/docs/configuration/filtering/ | rg -n 'trivyignore|purls|paths|expired_at' -C 2 || true
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/.trivyignore.yaml around lines 14 - 16, Replace the incorrect "paths" key with the Trivy package scoping key "purls" for the entry that currently reads a package URL (i.e., change the mapping that uses "paths: - \"pkg:apk/alpine/libexpat\"" to use "purls: - \"pkg:apk/alpine/libexpat\""), and validate/adjust the "expired_at" value to the schema your Trivy version expects (confirm whether it requires full RFC3339 timestamp or a date-only string and update the "expired_at" value accordingly); look for the keys "paths", "purls", and "expired_at" to locate and fix the entry.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/engine/loop_selector.py`:
- Around line 62-65: The Pydantic models AutoLoopRule and AutoLoopConfig
currently use model_config = ConfigDict(frozen=True) which leaves the default
extra behavior (ignore); update both to forbid unknown fields by changing their
model_config to ConfigDict(frozen=True, extra="forbid") so typos in config keys
raise errors instead of being dropped.
- Around line 53-65: AutoLoopRule currently allows any non-blank string for
loop_type though the docstring lists allowed values; add a pydantic
field_validator on AutoLoopRule.loop_type that checks the value is one of the
known options ("react","plan_execute","hybrid") and raises a ValidationError for
anything else so invalid types are rejected at construction; update error text
to mention the allowed values and ensure this prevents typos from propagating to
select_loop_type() and build_execution_loop() (AutoLoopConfig can still perform
its own validation but the model must enforce the contract).
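
The two `loop_selector.py` findings above can be sketched together as follows. This is a minimal illustration, assuming Pydantic v2 is installed; the `complexity` field and the `try_build` helper are hypothetical stand-ins and not taken from the PR, and only the three allowed values come from the review comment.

```python
from pydantic import BaseModel, ConfigDict, ValidationError, field_validator

# Allowed values as listed in the review comment.
KNOWN_LOOP_TYPES = frozenset({"react", "plan_execute", "hybrid"})


class AutoLoopRule(BaseModel):
    """Hypothetical stand-in for the real rule model (fields are assumed)."""

    # frozen=True blocks reassignment; extra="forbid" rejects unknown keys.
    model_config = ConfigDict(frozen=True, extra="forbid")

    complexity: str  # assumed field name, for illustration only
    loop_type: str

    @field_validator("loop_type")
    @classmethod
    def _check_loop_type(cls, value: str) -> str:
        """Reject loop types outside the known set at construction time."""
        if value not in KNOWN_LOOP_TYPES:
            allowed = ", ".join(sorted(KNOWN_LOOP_TYPES))
            msg = f"Unknown loop_type {value!r}; allowed values: {allowed}"
            raise ValueError(msg)
        return value


def try_build(**fields: str) -> str:
    """Return 'ok', or the Pydantic error type of the first failure."""
    try:
        AutoLoopRule(**fields)
    except ValidationError as exc:
        return exc.errors()[0]["type"]
    return "ok"
```

With this shape, a misspelled key such as `lop_type` fails with an `extra_forbidden` error instead of being silently dropped, and a misspelled value such as `"reactt"` fails before it can reach `select_loop_type()` or `build_execution_loop()`.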
---
Duplicate comments:
In @.github/.trivyignore.yaml:
- Around line 14-16: Replace the incorrect "paths" key with the Trivy package
scoping key "purls" for the entry that currently reads a package URL (i.e.,
change the mapping that uses "paths: - \"pkg:apk/alpine/libexpat\"" to use
"purls: - \"pkg:apk/alpine/libexpat\""), and validate/adjust the "expired_at"
value to the schema your Trivy version expects (confirm whether it requires full
RFC3339 timestamp or a date-only string and update the "expired_at" value
accordingly); look for the keys "paths", "purls", and "expired_at" to locate and
fix the entry.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: dce009fd-f218-4410-bd7c-16c08c1532df
📒 Files selected for processing (4)
.github/.trivyignore.yaml
src/synthorg/engine/loop_selector.py
tests/unit/engine/test_agent_engine_auto_loop.py
tests/unit/engine/test_loop_selector.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build Backend
- GitHub Check: Build Sandbox
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations`; Python 3.14 has PEP 649 native lazy annotations.
Use `except A, B:` syntax (no parentheses) for exception handling (PEP 758 except syntax, enforced by ruff on Python 3.14).
All public functions require type hints — mypy strict mode enforced.
Docstrings must use Google style and are required on all public classes and functions — enforced by ruff D rules.
Files:
src/synthorg/engine/loop_selector.py
tests/unit/engine/test_loop_selector.py
tests/unit/engine/test_agent_engine_auto_loop.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:
src/synthorg/engine/loop_selector.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:
src/synthorg/engine/loop_selector.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:
tests/unit/engine/test_loop_selector.py
tests/unit/engine/test_agent_engine_auto_loop.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/**/*.py : Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).
Applied to files:
src/synthorg/engine/loop_selector.py
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/**/*.py : Validate at system boundaries (user input, external APIs, config files).
Applied to files:
src/synthorg/engine/loop_selector.py
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/**/*.py : Functions must be less than 50 lines; files must be less than 800 lines.
Applied to files:
src/synthorg/engine/loop_selector.py
🧬 Code graph analysis (2)
tests/unit/engine/test_loop_selector.py (1)
  src/synthorg/engine/loop_selector.py (4)
    AutoLoopConfig(85-171)
    AutoLoopRule(53-65)
    build_execution_loop(270-314)
    select_loop_type(231-267)
tests/unit/engine/test_agent_engine_auto_loop.py (4)
  src/synthorg/budget/enforcer.py (1)
    cost_tracker(89-91)
  src/synthorg/engine/agent_engine.py (1)
    _execute_resumed_loop(886-926)
  src/synthorg/engine/loop_selector.py (1)
    AutoLoopConfig(85-171)
  src/synthorg/engine/react_loop.py (1)
    ReactLoop(61-352)
93f65f4 to e891cb7
Actionable comments posted: 3
♻️ Duplicate comments (2)
.github/.trivyignore.yaml (1)
14-15: ⚠️ Potential issue | 🟠 Major
Narrow `purls` to the concrete vulnerable package version.
Good improvement adding `purls`, but Line 15 still suppresses all Alpine `libexpat` versions. Use a versioned PURL so the ignore remains minimal and doesn’t hide future unrelated findings.
🔧 Proposed tightening
```diff
-    purls:
-      - "pkg:apk/alpine/libexpat"
+    purls:
+      - "pkg:apk/alpine/libexpat@2.7.4"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/.trivyignore.yaml around lines 14 - 15, The purls entry currently suppresses all Alpine libexpat versions by using "pkg:apk/alpine/libexpat"; narrow this to the concrete vulnerable package version by replacing that string with a versioned PURL (e.g., "pkg:apk/alpine/libexpat@<version>" or include the revision like "@<version>-r<rev>") that matches the exact version reported by the scanner; ensure you follow the purl format used elsewhere so only the specific vulnerable release is ignored rather than all libexpat packages.
src/synthorg/engine/agent_engine.py (1)
909-914: ⚠️ Potential issue | 🟠 Major
Persist the selected loop in checkpoint metadata.
This resume path re-runs auto-selection against the current budget/config state. With a valid config like `hybrid_fallback="react"`, a task can checkpoint under `ReactLoop` and resume under `PlanExecuteLoop` after the budget threshold flips, which changes loop semantics mid-execution. Resume should rebuild from the loop type stored with the checkpoint instead of calling `_resolve_loop(...)` again here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/agent_engine.py` around lines 909 - 914, The resume path currently re-runs loop auto-selection by calling _resolve_loop and then _make_loop_with_callback, which allows the loop type to change between checkpoint/save and resume; instead, persist the resolved loop identifier/type into the checkpoint metadata when creating/saving a checkpoint and, in the resume code path where checkpoint_ctx.task_execution is present, read that stored loop type and reconstruct the same loop instance (use the stored loop id to choose the loop factory instead of calling _resolve_loop); ensure _make_loop_with_callback is invoked with the loop built from the stored loop type so resumed executions retain the original loop semantics.
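
The resume behavior requested above could look roughly like this. It is a sketch only: `Checkpoint`, `save_checkpoint`, and `loop_type_for_resume` are hypothetical names, and the real checkpoint model in the repository may store metadata differently.

```python
from collections.abc import Callable
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical checkpoint record carrying free-form metadata."""

    task_id: str
    metadata: dict[str, str] = field(default_factory=dict)


def save_checkpoint(task_id: str, loop_type: str) -> Checkpoint:
    """Record the loop type that was resolved when execution started."""
    return Checkpoint(task_id=task_id, metadata={"loop_type": loop_type})


def loop_type_for_resume(
    checkpoint: Checkpoint,
    auto_select: Callable[[], str],
) -> str:
    """Prefer the stored loop type; auto-select only for old checkpoints."""
    stored = checkpoint.metadata.get("loop_type")
    if stored is not None:
        return stored
    # Checkpoints written before the field existed fall back to selection.
    return auto_select()
```

Because the stored value wins, a task that checkpointed under `react` resumes under `react` even if the budget or config state would now select a different loop.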
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/.grype.yaml:
- Around line 10-14: The suppression for CVE-2026-32767 is currently global;
narrow it to the affected package by adding a package scope (e.g., set
package.name: libexpat, and optionally package.version/type/language) under the
CVE-2026-32767 entry so only libexpat matches, and consider adding namespace or
match-type/fix-state if needed; also add a nearby comment with an audit date to
remind reviewers there is no native expiration.
In @.github/.trivyignore.yaml:
- Line 16: The expired_at value is using an RFC3339 timestamp instead of the
date-only format Trivy expects; update the expired_at key value (expired_at)
from the timestamp string to a date-only string in YYYY-MM-DD format (e.g.,
"2026-06-17") so the .trivyignore rule is recognized correctly.
In `@src/synthorg/engine/agent_engine.py`:
- Around line 1013-1031: Determine the candidate loop type first without hitting
the budget API by calling select_loop_type with budget_utilization_pct=None
(using the same complexity, rules, budget_tight_threshold, hybrid_fallback and
default_loop_type from _auto_loop_config and task.estimated_complexity) to get a
preliminary_loop_type; only if preliminary_loop_type == "hybrid" and
self._budget_enforcer is not None, call await
self._budget_enforcer.get_budget_utilization_pct(), log
EXECUTION_LOOP_BUDGET_UNAVAILABLE if it returns None, then call select_loop_type
again with the obtained budget_utilization_pct to get the final loop_type;
reference functions/fields: select_loop_type, get_budget_utilization_pct,
self._budget_enforcer, EXECUTION_LOOP_BUDGET_UNAVAILABLE, _auto_loop_config,
task.estimated_complexity.
---
Duplicate comments:
In @.github/.trivyignore.yaml:
- Around line 14-15: The purls entry currently suppresses all Alpine libexpat
versions by using "pkg:apk/alpine/libexpat"; narrow this to the concrete
vulnerable package version by replacing that string with a versioned PURL (e.g.,
"pkg:apk/alpine/libexpat@<version>" or include the revision like
"@<version>-r<rev>") that matches the exact version reported by the scanner;
ensure you follow the purl format used elsewhere so only the specific vulnerable
release is ignored rather than all libexpat packages.
In `@src/synthorg/engine/agent_engine.py`:
- Around line 909-914: The resume path currently re-runs loop auto-selection by
calling _resolve_loop and then _make_loop_with_callback, which allows the loop
type to change between checkpoint/save and resume; instead, persist the resolved
loop identifier/type into the checkpoint metadata when creating/saving a
checkpoint and, in the resume code path where checkpoint_ctx.task_execution is
present, read that stored loop type and reconstruct the same loop instance (use
the stored loop id to choose the loop factory instead of calling _resolve_loop);
ensure _make_loop_with_callback is invoked with the loop built from the stored
loop type so resumed executions retain the original loop semantics.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9ecf9e7a-ad8b-4d3b-8ecd-1176eeee76d9
📒 Files selected for processing (14)
.github/.grype.yaml
.github/.trivyignore.yaml
CLAUDE.md
README.md
docs/design/engine.md
src/synthorg/budget/enforcer.py
src/synthorg/engine/__init__.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/loop_selector.py
src/synthorg/observability/events/budget.py
src/synthorg/observability/events/execution.py
tests/unit/budget/test_enforcer.py
tests/unit/engine/test_agent_engine_auto_loop.py
tests/unit/engine/test_loop_selector.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web
- GitHub Check: Build Sandbox
- GitHub Check: Build Backend
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:
src/synthorg/observability/events/budget.py
src/synthorg/engine/__init__.py
tests/unit/budget/test_enforcer.py
src/synthorg/budget/enforcer.py
src/synthorg/observability/events/execution.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/test_loop_selector.py
tests/unit/engine/test_agent_engine_auto_loop.py
src/synthorg/engine/loop_selector.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:
src/synthorg/observability/events/budget.py
src/synthorg/engine/__init__.py
src/synthorg/budget/enforcer.py
src/synthorg/observability/events/execution.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/loop_selector.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:
src/synthorg/observability/events/budget.py
src/synthorg/engine/__init__.py
src/synthorg/budget/enforcer.py
src/synthorg/observability/events/execution.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/loop_selector.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:
tests/unit/budget/test_enforcer.py
tests/unit/engine/test_loop_selector.py
tests/unit/engine/test_agent_engine_auto_loop.py
🧠 Learnings (15)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/**/*.py : Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Security scanning: pip-audit (Python), npm audit (web dependencies), Trivy + Grype (Docker images), govulncheck (Go), gitleaks (secrets), zizmor (GitHub Actions workflows), OSSF Scorecard (supply chain), ZAP DAST (API), Socket.dev (typosquatting/malware detection).
Applied to files:
.github/.grype.yaml
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Docs source in docs/ (Markdown, built with Zensical). Design spec: docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture: docs/architecture/. Roadmap: docs/roadmap/. Security: docs/security.md. Licensing: docs/licensing.md. Reference: docs/reference/. Custom templates: docs/overrides/.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/budget.py
src/synthorg/observability/events/execution.py
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/synthorg/**/*.py : Event names must always use constants from the domain-specific module under synthorg.observability.events. Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/budget.py
src/synthorg/observability/events/execution.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).
Applied to files:
tests/unit/budget/test_enforcer.py
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 except syntax, enforced by ruff on Python 3.14.
Applied to files:
src/synthorg/budget/enforcer.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to tests/**/*.py : Test timeout: 30 seconds per test.
Applied to files:
tests/unit/engine/test_loop_selector.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).
Applied to files:
src/synthorg/engine/loop_selector.py
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/**/*.py : Validate at system boundaries (user input, external APIs, config files).
Applied to files:
src/synthorg/engine/loop_selector.py
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/**/*.py : Functions must be less than 50 lines; files must be less than 800 lines.
Applied to files:
src/synthorg/engine/loop_selector.py
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/**/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Applied to files:
src/synthorg/engine/loop_selector.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to **/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Applied to files:
src/synthorg/engine/loop_selector.py
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Applies to src/**/*.py : Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/engine/loop_selector.py
🧬 Code graph analysis (5)
src/synthorg/engine/__init__.py (2)
- tests/unit/engine/conftest.py (1): engine(449-460)
- src/synthorg/engine/loop_selector.py (4): AutoLoopConfig(94-180), AutoLoopRule(53-74), build_execution_loop(279-323), select_loop_type(240-276)
tests/unit/budget/test_enforcer.py (4)
- tests/unit/budget/test_enforcer_quota.py (2): _make_budget_config(33-42), _patch_periods(61-75)
- tests/unit/engine/test_agent_engine_budget.py (1): _make_budget_config(34-46)
- tests/unit/budget/conftest.py (1): make_cost_record(288-309)
- src/synthorg/budget/enforcer.py (2): cost_tracker(89-91), get_budget_utilization_pct(93-123)
src/synthorg/engine/agent_engine.py (1)
- src/synthorg/engine/loop_selector.py (3): AutoLoopConfig(94-180), build_execution_loop(279-323), select_loop_type(240-276)
tests/unit/engine/test_agent_engine_auto_loop.py (4)
- src/synthorg/budget/enforcer.py (2): BudgetEnforcer(56-473), cost_tracker(89-91)
- src/synthorg/engine/agent_engine.py (3): AgentEngine(130-1296), run(287-397), _execute_resumed_loop(886-926)
- src/synthorg/engine/loop_selector.py (1): AutoLoopConfig(94-180)
- src/synthorg/engine/react_loop.py (1): ReactLoop(61-352)
src/synthorg/engine/loop_selector.py (4)
- src/synthorg/engine/plan_execute_loop.py (1): PlanExecuteLoop(84-930)
- src/synthorg/engine/react_loop.py (1): ReactLoop(61-352)
- src/synthorg/engine/loop_protocol.py (1): ExecutionLoop(158-196)
- src/synthorg/engine/stagnation/protocol.py (1): StagnationDetector(15-46)
🪛 markdownlint-cli2 (0.21.0)
docs/design/engine.md
[warning] 421-421: Code block style
Expected: fenced; Actual: indented
(MD046, code-block-style)
🔇 Additional comments (12)
src/synthorg/observability/events/execution.py (1)
75-82: LGTM! The new execution loop auto-selection event constants follow the established naming convention (`execution.loop.*` namespace) and are properly typed with `Final[str]`. The grouping comment clearly identifies their purpose.
README.md (1)
35-35: LGTM! The README update accurately reflects the new auto-selection capability alongside the existing ReAct and Plan-and-Execute loops.
CLAUDE.md (1)
127-127: LGTM! The package structure documentation accurately captures all new loop selector components and their behavior (complexity-based routing, budget-aware downgrade, hybrid fallback, and configurable defaults).
src/synthorg/observability/events/budget.py (1)
34-36: LGTM! The new budget utilization event constants follow the established naming convention and are consistent with the existing budget event patterns.
src/synthorg/engine/__init__.py (2)
121-127: LGTM! The new loop selector API is properly imported and re-exported, maintaining the package's public interface consistency.
213-238: LGTM! The `__all__` list is correctly updated with the new exports in alphabetical order.
tests/unit/budget/test_enforcer.py (1)
1088-1161: LGTM! Comprehensive test coverage for `get_budget_utilization_pct()`, including:
- Normal percentage calculation
- Disabled budget (`total_monthly=0`) returning `None`
- Zero spend returning `0.0%`
- Over-budget scenario returning `>100%`
- Graceful degradation on tracker failure
- `MemoryError` propagation (correctly not swallowed)
The tests follow project conventions with proper markers and use `pytest.approx` for float comparisons.
docs/design/engine.md (2)
417-433: LGTM! The "Auto-selection" documentation clearly explains the three-layer selection flow:
- Rule matching with `default_loop_type` fallback
- Budget-aware downgrade when utilization ≥ threshold
- Hybrid fallback when hybrid loop is unavailable
This matches the implementation in `select_loop_type()`.
479-486: LGTM! The new "Resolve execution loop" step (step 8) accurately documents the per-task loop resolution flow, including budget utilization query and the various fallback behaviors.
src/synthorg/budget/enforcer.py (2)
93-123: LGTM! The new `get_budget_utilization_pct()` method is well-implemented:
- Correctly guards against division by zero with the `total_monthly <= 0` early return
- Uses PEP 758 except syntax (`except MemoryError, RecursionError:`)
- Implements graceful degradation (returns `None` on failure, logs exception)
- Properly propagates critical errors (`MemoryError`, `RecursionError`)
- Structured logging with domain-specific event constants
34-36: LGTM! New budget utilization event constant imports properly added alongside existing budget event imports.
src/synthorg/engine/loop_selector.py (1)
133-180: Good fail-fast validation boundary. Rejecting unknown and unbuildable loop configs in `AutoLoopConfig` keeps bad selector setups from escaping into runtime failures in `build_execution_loop()`.
Based on learnings, "Validate at system boundaries (user input, external APIs, config files)."
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/engine/agent_engine.py`:
- Around line 318-325: The auto-selection event currently omits agent/task
identifiers making it hard to correlate under concurrency; either (A) add
agent_id and task_id to the EXECUTION_LOOP_AUTO_SELECTED log by passing them
into _resolve_loop() so that _resolve_loop() includes agent_id and task_id in
its event emission (update the _resolve_loop signature and any callers
accordingly), or (B) emit a separate logger.info immediately after determining
loop_mode (the block using loop_mode and EXECUTION_ENGINE_START) that logs
EXECUTION_LOOP_AUTO_SELECTED with agent_id and task_id and the resolved loop
value; update references to _auto_loop_config, loop_mode,
EXECUTION_LOOP_AUTO_SELECTED and _resolve_loop() to ensure the identifiers are
included in the emitted event.
- Around line 998-1057: The _resolve_loop method mixes loop-selection logic,
budget I/O, logging, and loop construction and should be extracted into a
standalone resolver function in loop_selector.py; move the core logic that calls
select_loop_type (both preliminary and final), awaits
self._budget_enforcer.get_budget_utilization_pct(), logs the
EXECUTION_LOOP_AUTO_SELECTED/EXECUTION_LOOP_BUDGET_UNAVAILABLE events, and
returns the result of build_execution_loop into a new function (e.g.,
resolve_execution_loop(cfg, task, budget_enforcer, approval_gate,
stagnation_detector)) and have AgentEngine._resolve_loop delegate to it while
preserving use of self._auto_loop_config, self._budget_enforcer,
self._approval_gate, and self._stagnation_detector so tests can import and
exercise the selector in isolation.
- Around line 184-197: In AgentEngine.__init__, besides the mutual-exclusion
check for execution_loop and auto_loop_config, validate the provided
auto_loop_config (self._auto_loop_config / auto_loop_config) immediately to
ensure it cannot resolve to an unbuildable loop type (specifically avoid any
reachable "hybrid" selection); if validation fails, log via
EXECUTION_ENGINE_ERROR with a clear reason and raise ValueError so invalid
configs fail fast (this is the same validation that must prevent the later
failure when the engine attempts to build the loop around lines ~1053-1057).
Implement or call a helper validator that enumerates reachable auto-selected
loop types from auto_loop_config, rejects any disallowed type (e.g., "hybrid"),
and surface a descriptive error during construction.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: fbfc3ef4-983e-48b4-a606-7aa2fa3d67c7
📒 Files selected for processing (3)
.github/.grype.yaml
.github/.trivyignore.yaml
src/synthorg/engine/agent_engine.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations` (Python 3.14 has PEP 649 native lazy annotations).
Use `except A, B:` syntax (no parentheses) for exception handling: PEP 758 except syntax, enforced by ruff on Python 3.14.
All public functions require type hints — mypy strict mode enforced.
Docstrings must use Google style and are required on all public classes and functions — enforced by ruff D rules.
Files:
src/synthorg/engine/agent_engine.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Use immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use `@computed_field` for derived values instead of storing + validating redundant fields. Use NotBlankStr (from core.types) for all identifier/name fields.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff).
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly, never silently swallow them.
Validate at system boundaries (user input, external APIs, config files).
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Tests must use test-provider, test-small-001, etc.
Files:
src/synthorg/engine/agent_engine.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`.
Never use `import logging` / `logging.getLogger()` / `print()` in application code.
Always use `logger` as the variable name (not `_logger`, not `log`).
Event names must always use constants from the domain-specific module under synthorg.observability.events. Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Use structured kwargs in logger calls: always `logger.info(EVENT, key=value)`, never `logger.info("msg %s", val)`.
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO level.
DEBUG logging for object creation, internal flow, entry/exit of key functions.
All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig.
Retryable errors (is_retryable=True) include: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately.
RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains.
Files:
src/synthorg/engine/agent_engine.py
🧠 Learnings (1)
📚 Learning: 2026-03-19T07:09:59.660Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:09:59.660Z
Learning: Security scanning: pip-audit (Python), npm audit (web dependencies), Trivy + Grype (Docker images), govulncheck (Go), gitleaks (secrets), zizmor (GitHub Actions workflows), OSSF Scorecard (supply chain), ZAP DAST (API), Socket.dev (typosquatting/malware detection).
Applied to files:
.github/.grype.yaml
🧬 Code graph analysis (1)
src/synthorg/engine/agent_engine.py (2)
- src/synthorg/engine/loop_selector.py (3): AutoLoopConfig(94-180), build_execution_loop(279-323), select_loop_type(240-276)
- src/synthorg/budget/enforcer.py (1): get_budget_utilization_pct(93-123)
🔇 Additional comments (2)
.github/.trivyignore.yaml (1)
13-22: Scoped, time-bounded Trivy suppression looks good. This ignore rule is now properly constrained with `purls` and `expired_at`, and the rationale is clear.
.github/.grype.yaml (1)
10-18: Grype suppression scoping and audit note are well implemented. Package-level scoping plus the explicit review reminder is a solid, low-risk configuration update.
        auto_loop_config: AutoLoopConfig | None = None,
    ) -> None:
        if execution_loop is not None and auto_loop_config is not None:
            msg = "execution_loop and auto_loop_config are mutually exclusive"
            logger.warning(
                EXECUTION_ENGINE_ERROR,
                reason=msg,
            )
            raise ValueError(msg)
        self._provider = provider
        self._approval_store = approval_store
        self._parked_context_repo = parked_context_repo
        self._stagnation_detector = stagnation_detector
        self._auto_loop_config = auto_loop_config
Reject unbuildable auto-loop configs at construction.
Right now __init__ only enforces mutual exclusivity. A config that can still resolve to hybrid will instantiate successfully and then fail later when Line 1053 tries to build that loop, so the engine accepts bad configuration and surfaces it only on live traffic. Fail fast here by validating the reachable auto-selected loop types during initialization.
As per coding guidelines, Validate at system boundaries (user input, external APIs, config files).
Also applies to: 1053-1057
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/engine/agent_engine.py` around lines 184 - 197, In
AgentEngine.__init__, besides the mutual-exclusion check for execution_loop and
auto_loop_config, validate the provided auto_loop_config (self._auto_loop_config
/ auto_loop_config) immediately to ensure it cannot resolve to an unbuildable
loop type (specifically avoid any reachable "hybrid" selection); if validation
fails, log via EXECUTION_ENGINE_ERROR with a clear reason and raise ValueError
so invalid configs fail fast (this is the same validation that must prevent the
later failure when the engine attempts to build the loop around lines
~1053-1057). Implement or call a helper validator that enumerates reachable
auto-selected loop types from auto_loop_config, rejects any disallowed type
(e.g., "hybrid"), and surface a descriptive error during construction.
src/synthorg/engine/agent_engine.py
Outdated
    async def _resolve_loop(self, task: Task) -> ExecutionLoop:
        """Select the execution loop for a task.

        When ``auto_loop_config`` is set, selects the loop based on
        task complexity and optional budget state. Otherwise returns
        the statically configured loop (``self._loop``).

        Note: auto-selected loops use default ``PlanExecuteConfig``
        and do not receive a compaction callback. Provide an
        ``execution_loop`` directly for custom plan-execute config
        or compaction.
        """
        if self._auto_loop_config is None:
            return self._loop

        cfg = self._auto_loop_config
        # Dry-run without budget and without hybrid fallback to see the
        # raw rule result. Only query budget when "hybrid" is the raw
        # match (budget downgrade applies before hybrid fallback).
        preliminary = select_loop_type(
            complexity=task.estimated_complexity,
            rules=cfg.rules,
            budget_utilization_pct=None,
            budget_tight_threshold=cfg.budget_tight_threshold,
            hybrid_fallback=None,
            default_loop_type=cfg.default_loop_type,
        )

        budget_utilization_pct: float | None = None
        if preliminary == "hybrid" and self._budget_enforcer is not None:
            budget_utilization_pct = (
                await self._budget_enforcer.get_budget_utilization_pct()
            )
            if budget_utilization_pct is None:
                logger.debug(
                    EXECUTION_LOOP_BUDGET_UNAVAILABLE,
                    note="budget utilization unknown; skipping budget-aware downgrade",
                )

        loop_type = select_loop_type(
            complexity=task.estimated_complexity,
            rules=cfg.rules,
            budget_utilization_pct=budget_utilization_pct,
            budget_tight_threshold=cfg.budget_tight_threshold,
            hybrid_fallback=cfg.hybrid_fallback,
            default_loop_type=cfg.default_loop_type,
        )

        logger.info(
            EXECUTION_LOOP_AUTO_SELECTED,
            complexity=task.estimated_complexity.value,
            selected_loop=loop_type,
            budget_utilization_pct=budget_utilization_pct,
        )

        return build_execution_loop(
            loop_type,
            approval_gate=self._approval_gate,
            stagnation_detector=self._stagnation_detector,
        )
🛠️ Refactor suggestion | 🟠 Major
Extract _resolve_loop() into a dedicated resolver.
This helper now mixes rule evaluation, budget I/O, observability, and loop construction inside a ~60-line method on a ~1.3k-line class. Moving it beside src/synthorg/engine/loop_selector.py would keep AgentEngine within the repo’s size limits and make the auto-selection path easier to test in isolation.
As per coding guidelines, Functions must be less than 50 lines; files must be less than 800 lines.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/engine/agent_engine.py` around lines 998 - 1057, The
_resolve_loop method mixes loop-selection logic, budget I/O, logging, and loop
construction and should be extracted into a standalone resolver function in
loop_selector.py; move the core logic that calls select_loop_type (both
preliminary and final), awaits
self._budget_enforcer.get_budget_utilization_pct(), logs the
EXECUTION_LOOP_AUTO_SELECTED/EXECUTION_LOOP_BUDGET_UNAVAILABLE events, and
returns the result of build_execution_loop into a new function (e.g.,
resolve_execution_loop(cfg, task, budget_enforcer, approval_gate,
stagnation_detector)) and have AgentEngine._resolve_loop delegate to it while
preserving use of self._auto_loop_config, self._budget_enforcer,
self._approval_gate, and self._stagnation_detector so tests can import and
exercise the selector in isolation.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/synthorg/engine/agent_engine.py (1)
1-10: 🧹 Nitpick | 🔵 Trivial
File exceeds 800-line limit (currently ~1318 lines).
The file significantly exceeds the 800-line limit per coding guidelines. Consider extracting cohesive groups of functionality:
- Auto-loop resolution logic → `loop_selector.py` (as previously suggested)
- Checkpoint resume methods (`_resume_from_checkpoint`, `_reconstruct_and_run_resume`, `_execute_resumed_loop`, `_finalize_resume`) → a dedicated `resume.py` or similar
- Error handling methods (`_handle_budget_error`, `_handle_fatal_error`, `_build_error_execution`) → `error_handlers.py`
This would improve maintainability and testability while bringing the file within guidelines.
As per coding guidelines, Functions must be less than 50 lines; files must be less than 800 lines.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/engine/agent_engine.py` around lines 1 - 10, The file src/synthorg/engine/agent_engine.py exceeds the 800-line limit and must be split into smaller modules; extract the auto-loop resolution logic into a new loop_selector.py, move checkpoint resume methods (_resume_from_checkpoint, _reconstruct_and_run_resume, _execute_resumed_loop, _finalize_resume) into a resume.py, and move error handling methods (_handle_budget_error, _handle_fatal_error, _build_error_execution) into error_handlers.py; update imports in agent_engine.py to reference the new modules and ensure all moved functions keep their signatures and any shared helper dependencies are either moved or imported so agent_engine.run() and its callers continue to work without behavior changes.
♻️ Duplicate comments (1)
src/synthorg/engine/agent_engine.py (1)
1000-1066: 🧹 Nitpick | 🔵 Trivial
Function exceeds 50-line limit; consider extraction to `loop_selector.py`.
`_resolve_loop` is approximately 55-60 lines of code (excluding docstring), slightly exceeding the 50-line function limit. A past review suggested extracting this logic into `loop_selector.py` alongside the other selection functions. This would:
- Keep `AgentEngine` within file size limits.
- Make the auto-selection logic testable in isolation.
- Improve cohesion by co-locating all loop selection logic.
The implementation itself is correct: the preliminary check optimizes budget lookups, and the logging includes the required identifiers.
As per coding guidelines, Functions must be less than 50 lines; files must be less than 800 lines.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/engine/agent_engine.py` around lines 1000 - 1066, extract the auto-selection logic from AgentEngine._resolve_loop into a new helper module loop_selector.py: move the preliminary select_loop_type budget-dry-run, the conditional budget_utilization_pct fetch (using self._budget_enforcer), the second select_loop_type call, and the logger.info(EXECUTION_LOOP_AUTO_SELECTED) into a single exported function (e.g., select_and_build_execution_loop) that accepts the task (or task.estimated_complexity), cfg (self._auto_loop_config), agent_id, task_id, budget_enforcer, approval_gate, and stagnation_detector and returns the built ExecutionLoop via build_execution_loop; then simplify AgentEngine._resolve_loop to return self._loop when _auto_loop_config is None or delegate to the new helper when present. Ensure you reference and reuse existing symbols: select_loop_type, build_execution_loop, EXECUTION_LOOP_AUTO_SELECTED, self._auto_loop_config, self._budget_enforcer, self._approval_gate, and self._stagnation_detector.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/engine/agent_engine.py`:
- Around line 909-916: The resume logic currently calls _resolve_loop during
resume (using checkpoint_ctx.task_execution.task, agent_id, task_id) which can
yield a different loop type if budget/utilization changed; instead persist the
originally selected loop type in the checkpoint metadata when creating/saving a
checkpoint and use that stored loop type on resume to reconstruct base_loop
(fall back to calling _resolve_loop only if the persisted loop type is missing
or invalid). Update the checkpoint save/load paths to include the loop type,
then modify the resume path that computes base_loop (checkpoint_ctx,
_resolve_loop, _make_loop_with_callback, base_loop, task_execution, task,
agent_id, task_id) to prefer the persisted loop type and only recompute as a
last resort.
- Around line 630-633: The parameter named loop is reassigned with
self._make_loop_with_callback(loop, agent_id, task_id) which shadows the
original parameter; rename the reassigned variable (e.g., wrapped_loop) to avoid
confusion and improve readability. Update the assignment where loop =
self._make_loop_with_callback(...) to wrapped_loop =
self._make_loop_with_callback(...), and then update subsequent uses (such as
wrapped_loop.execute(...) that currently use coro = loop.execute(...)) to
reference wrapped_loop; ensure references to the original parameter name are
preserved if needed elsewhere in the function.
---
Outside diff comments:
In `@src/synthorg/engine/agent_engine.py`:
- Around line 1-10: The file src/synthorg/engine/agent_engine.py exceeds the
800-line limit and must be split into smaller modules; extract the auto-loop
resolution logic into a new loop_selector.py, move checkpoint resume methods
(_resume_from_checkpoint, _reconstruct_and_run_resume, _execute_resumed_loop,
_finalize_resume) into a resume.py, and move error handling methods
(_handle_budget_error, _handle_fatal_error, _build_error_execution) into
error_handlers.py; update imports in agent_engine.py to reference the new
modules and ensure all moved functions keep their signatures and any shared
helper dependencies are either moved or imported so agent_engine.run() and its
callers continue to work without behavior changes.
---
Duplicate comments:
In `@src/synthorg/engine/agent_engine.py`:
- Around line 1000-1066: Extract the auto-selection logic from
AgentEngine._resolve_loop into a new helper module loop_selector.py: move the
preliminary select_loop_type budget-dry-run, the conditional
budget_utilization_pct fetch (using self._budget_enforcer), the second
select_loop_type call, and the logger.info(EXECUTION_LOOP_AUTO_SELECTED) into a
single exported function (e.g., select_and_build_execution_loop) that accepts
the task (or task.estimated_complexity), cfg (self._auto_loop_config), agent_id,
task_id, budget_enforcer, approval_gate, and stagnation_detector and returns the
built ExecutionLoop via build_execution_loop; then simplify
AgentEngine._resolve_loop to return self._loop when _auto_loop_config is None or
delegate to the new helper when present. Ensure you reference and reuse existing
symbols: select_loop_type, build_execution_loop, EXECUTION_LOOP_AUTO_SELECTED,
self._auto_loop_config, self._budget_enforcer, self._approval_gate, and
self._stagnation_detector.
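The resume finding in the prompt above (prefer the loop type stored with the checkpoint, and recompute only as a last resort) might be sketched as follows. The `"loop_type"` metadata key, the `KNOWN_LOOP_TYPES` set, and the function name are assumptions for illustration; the review does not specify how checkpoint metadata is structured.

```python
from collections.abc import Callable, Mapping

# Assumption: the loop types that can be rebuilt on resume.
KNOWN_LOOP_TYPES = frozenset({"react", "plan_execute"})


def loop_type_for_resume(
    checkpoint_meta: Mapping[str, object],
    recompute: Callable[[], str],
) -> str:
    """Pick the loop type for a resumed task.

    Prefers the value persisted when the checkpoint was written, so a change
    in budget utilization between runs cannot silently switch loop types.
    """
    stored = checkpoint_meta.get("loop_type")  # hypothetical metadata key
    if isinstance(stored, str) and stored in KNOWN_LOOP_TYPES:
        return stored
    # Missing or invalid metadata: fall back to fresh resolution.
    return recompute()
```

Older checkpoints written before the field existed would take the `recompute()` path, which matches the "last resort" wording of the finding.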
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7a5d58c1-deb6-4c41-8eb1-21a827103f90
📒 Files selected for processing (1)
src/synthorg/engine/agent_engine.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
- GitHub Check: Analyze (python)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).
Applied to files:
src/synthorg/engine/agent_engine.py
🔇 Additional comments (6)
src/synthorg/engine/agent_engine.py (6)
38-42: LGTM! Imports for the new auto-loop selection components are correctly added.
244-255: LGTM! The `loop_type` logging correctly distinguishes between auto mode and static loop configuration. DEBUG level is appropriate for engine creation events.
318-327: LGTM! Loop mode determination and logging at execution start is correctly implemented with proper INFO-level observability.
431-443: LGTM! Per-task loop resolution is correctly integrated into the execution flow, enabling task complexity-based loop selection.
597-611: LGTM! The explicit `loop` parameter enables both static and auto-selected loops to receive checkpoint callbacks.
184-197: AutoLoopConfig already validates unbuildable loop types at construction; no additional validation needed.
AutoLoopConfig's `@model_validator` (loop_selector.py:133–187) comprehensively validates that configs cannot resolve to unbuildable loop types:
- If rules contain unbuildable types (e.g., "hybrid"), `hybrid_fallback` must be non-None and buildable
- If `default_loop_type="hybrid"`, it must have a buildable `hybrid_fallback`
- These constraints ensure `_match_loop_type()` never returns an unbuildable type
Validation occurs at construction, so any invalid config fails immediately. The code at lines 184–197 simply stores an already-validated instance.
    base_loop = self._loop
    if checkpoint_ctx.task_execution is not None:
        base_loop = await self._resolve_loop(
            checkpoint_ctx.task_execution.task,
            agent_id,
            task_id,
        )
    loop = self._make_loop_with_callback(base_loop, agent_id, task_id)
🧹 Nitpick | 🔵 Trivial
Resume path now resolves loop from task — note potential inconsistency.
This correctly addresses the past review comment about resume bypassing auto-loop resolution. However, if budget utilization changes between the original execution and resume, the re-resolved loop type may differ from what was originally selected. For deterministic resume behavior, consider persisting the originally selected loop type in checkpoint metadata rather than recomputing it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/engine/agent_engine.py` around lines 909 - 916, The resume logic
currently calls _resolve_loop during resume (using
checkpoint_ctx.task_execution.task, agent_id, task_id) which can yield a
different loop type if budget/utilization changed; instead persist the
originally selected loop type in the checkpoint metadata when creating/saving a
checkpoint and use that stored loop type on resume to reconstruct base_loop
(fall back to calling _resolve_loop only if the persisted loop type is missing
or invalid). Update the checkpoint save/load paths to include the loop type,
then modify the resume path that computes base_loop (checkpoint_ctx,
_resolve_loop, _make_loop_with_callback, base_loop, task_execution, task,
agent_id, task_id) to prefer the persisted loop type and only recompute as a
last resort.
…omplexity

Add automatic execution loop selection that inspects task estimated_complexity and optional budget state to choose the optimal loop per task:
- simple -> ReAct
- medium -> Plan-and-Execute
- complex/epic -> Hybrid (falls back to Plan-and-Execute until HybridLoop is implemented)

Budget-aware: when monthly utilization >= threshold, complex tasks are downgraded from Hybrid to Plan-and-Execute to conserve budget.

New modules:
- loop_selector.py: AutoLoopConfig, AutoLoopRule, select_loop_type(), build_execution_loop()
- BudgetEnforcer.get_budget_utilization_pct() for budget state queries

AgentEngine accepts auto_loop_config (mutually exclusive with execution_loop) and resolves the loop per-task in _execute().

Closes #200

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ction

Pre-reviewed by 11 agents, 16 findings addressed:
- Add NotBlankStr for loop_type and hybrid_fallback fields
- Add uniqueness validator for complexities in AutoLoopConfig.rules
- Add import-time completeness guard on DEFAULT_AUTO_LOOP_RULES
- Add warning log before ValueError raises in build_execution_loop and AgentEngine.__init__
- Fix EXECUTION_ENGINE_CREATED log to show "auto" when auto_loop_config set
- Add budget-unavailable warning in _resolve_loop
- Add no-rule-match warning in select_loop_type
- Use next() idiom instead of for-loop + break
- Update module docstring to describe budget-downgrade layer
- Add MemoryError re-raise test for get_budget_utilization_pct
- Add validation boundary tests for AutoLoopConfig
- Update CLAUDE.md Package Structure with loop_selector.py
- Update docs/design/engine.md auto-selection tip with 3-layer logic
- Add loop resolution step to AgentEngine pipeline docs
- Update README.md with auto-selection mention

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… and Gemini

- Fix resume path to call _resolve_loop instead of using static self._loop (#1)
- Validate loop_type/hybrid_fallback against _KNOWN_LOOP_TYPES at config time (#3)
- Fix redundant any() scan producing false-positive NO_RULE_MATCH warning (#4)
- Downgrade EXECUTION_LOOP_BUDGET_UNAVAILABLE to DEBUG to avoid log noise (#5)
- Add auto_loop_config to AgentEngine class docstring (#6)
- Reduce enforcer.py to 799 lines (was 806, limit 800) (#7)
- Fix select_loop_type Returns docstring accuracy (#8)
- Fix build_execution_loop docstring to mention hybrid (#9)
- Add EXECUTION_LOOP_BUDGET_UNAVAILABLE assertion in budget-error test (#10)
- Add resume path test for _resolve_loop (#11)
- Add test: rule mapping to react does not trigger NO_RULE_MATCH (#12)
- Add _resolve_loop docstring note about compaction/plan_execute_config (#13)
- Update module docstring to mention AutoLoopConfig/AutoLoopRule (#14)
- Simplify verbose log note string (#15)
- Add configurable default_loop_type to AutoLoopConfig (Gemini enhancement)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…election

- Reject unbuildable loop configs at construction: hybrid_fallback=None with hybrid rules, unbuildable default_loop_type/hybrid_fallback
- Add _BUILDABLE_LOOP_TYPES set (react, plan_execute) for validation
- Rewrite resume path test to exercise _execute_resumed_loop via mocked _resolve_loop (verifies wiring, not just direct call)
- 5 new tests for buildability validation

Assessed and skipped (not needed):
- engine.md admonition formatting: 4-space indent is correct MkDocs admonition syntax; markdownlint MD046 is a false positive
- Persist loop type in checkpoint: requires AgentContext schema change; current _resolve_loop approach gives consistent results for same task complexity (acceptable interim)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

CVE-2026-32767 is a SiYuan Note authorization bypass (CWE-863, arbitrary SQL execution), not a libexpat vulnerability. Trivy incorrectly maps this CVE to libexpat 2.7.4 in the web image. Our nginx-unprivileged image serves static files and does not run SiYuan or any SQL database.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…election

- Split select_loop_type into three private helpers (_match_loop_type, _downgrade_for_budget, _apply_hybrid_fallback) to satisfy <50 line limit
- Fix validation: default_loop_type="hybrid" is now accepted when hybrid_fallback redirects to a buildable type (was incorrectly rejected)
- Add _BUILDABLE_LOOP_TYPES validation for hybrid_fallback (must be buildable since it is the redirect target, not the source)
- Resume test now verifies resolved_loop.execute was actually awaited, not just that _resolve_loop was called
- Trivyignore: add paths scope (pkg:apk/alpine/libexpat) and expired_at (90 days) for CVE-2026-32767 suppression

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ore purls

- AutoLoopRule and AutoLoopConfig: add extra="forbid" to reject typos
- AutoLoopRule: field_validator on loop_type checks _KNOWN_LOOP_TYPES at rule construction (catches typos before reaching AutoLoopConfig)
- .trivyignore.yaml: fix paths -> purls for PURL-scoped suppression

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- _resolve_loop: skip budget API call unless raw rule match is "hybrid" (dry-run with hybrid_fallback=None to see pre-fallback result)
- .grype.yaml: scope CVE-2026-32767 to package libexpat/apk + audit date
- .trivyignore.yaml: fix expired_at from RFC3339 to YYYY-MM-DD date format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

EXECUTION_LOOP_AUTO_SELECTED now includes agent_id and task_id for log correlation under concurrency. _resolve_loop accepts optional agent_id/task_id params, threaded from _execute and resume path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…h_timeout

Avoids parameter shadowing where the loop param was reassigned after wrapping with checkpoint callback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Trivy's Go time parser requires full RFC3339 timestamps (2026-06-17T00:00:00Z), not date-only strings. The previous round's change to YYYY-MM-DD broke all three Docker image scans.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
♻️ Duplicate comments (2)
src/synthorg/engine/agent_engine.py (2)
909-916: ⚠️ Potential issue | 🟠 Major
Persist the selected loop type in checkpoint metadata.
Line 911 re-runs auto-selection from the current task/budget state. Because `select_loop_type()` downgrades for budget before applying `hybrid_fallback`, a checkpoint can resume under a different concrete loop than the one that created it. Prefer rebuilding `base_loop` from a loop type stored with the checkpoint, and only fall back to `_resolve_loop()` for older checkpoints.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/agent_engine.py` around lines 909 - 916, The checkpoint resume logic should use the loop type that was recorded when the checkpoint was created instead of re-running select_loop_type() (which can downgrade due to budget/hybrid_fallback); update the checkpoint writer to persist the chosen loop type (e.g., a checkpoint_ctx.loop_type or similar metadata field) when creating checkpoints, and modify the resume path in agent_engine.py (the block that sets base_loop and calls _resolve_loop/_make_loop_with_callback) to reconstruct base_loop from that persisted loop type first, only calling _resolve_loop(agent_task, agent_id, task_id) for older checkpoints that lack the stored loop_type; ensure the code references the persisted field name consistently and retains the existing _make_loop_with_callback usage.
1000-1066: 🛠️ Refactor suggestion | 🟠 Major
Extract `_resolve_loop()` into a dedicated resolver.
This helper is already 60+ lines inside a 1.3k-line class, and it now mixes rule evaluation, budget I/O, observability, and loop construction. Moving it beside `loop_selector.py` would bring this change back inside the repo's size limits and make the auto-selection path easier to test in isolation. As per coding guidelines, "Functions must be less than 50 lines; files must be less than 800 lines."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/agent_engine.py` around lines 1000 - 1066, The _resolve_loop method mixes rule evaluation, budget I/O, logging, and loop construction inside a large class; extract it into a dedicated resolver function (e.g., resolve_execution_loop) placed next to loop_selector.py. Move the logic that calls select_loop_type twice, queries self._budget_enforcer.get_budget_utilization_pct(), logs using EXECUTION_LOOP_AUTO_SELECTED/EXECUTION_LOOP_BUDGET_UNAVAILABLE, and returns build_execution_loop into the new resolver; make it accept the Task (or its estimated_complexity), the auto-loop config (cfg), budget_enforcer, approval_gate, stagnation_detector, and agent_id/task_id as parameters so no class state is referenced directly. Replace the original _resolve_loop body with a thin wrapper that forwards the right attributes to the new resolver. Ensure you preserve behavior (including hybrid/dry-run semantics, budget None handling, and the same log fields) and add unit tests for select_loop_type interactions and budget-unavailable branching.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: a5f71b15-2ccf-4a5e-bb6c-ffe2c68f8955
📒 Files selected for processing (1)
src/synthorg/engine/agent_engine.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
- GitHub Check: Analyze (python)
🧬 Code graph analysis (1)
src/synthorg/engine/agent_engine.py (3)
- src/synthorg/engine/loop_selector.py (3): AutoLoopConfig (94-180), build_execution_loop (279-323), select_loop_type (240-276)
- src/synthorg/engine/loop_protocol.py (3): get_loop_type (194-196), ExecutionLoop (158-196), execute (166-192)
- src/synthorg/budget/enforcer.py (1): get_budget_utilization_pct (93-123)
🔇 Additional comments (1)
src/synthorg/engine/agent_engine.py (1)
431-434: Per-task loop resolution is wired correctly. Resolving the loop inside `_execute()` and threading it through `_run_loop_with_timeout()` keeps auto mode on the selected loop instead of falling back to `self._loop`.
cf33c78 to
2c232a3
🤖 I have created a release *beep* *boop*

---

## [0.3.6](v0.3.5...v0.3.6) (2026-03-19)

### Features

* **cli:** add backup subcommands (backup, backup list, backup restore) ([#568](#568)) ([4c06b1d](4c06b1d))
* **engine:** implement execution loop auto-selection based on task complexity ([#567](#567)) ([5bfc2c6](5bfc2c6))

### Bug Fixes

* activate structured logging pipeline -- wire 8-sink system, integrate Uvicorn, suppress spam ([#572](#572)) ([9b6bf33](9b6bf33))
* **cli:** bump grpc-go v1.79.3 -- CVE-2026-33186 auth bypass ([#574](#574)) ([f0171c9](f0171c9))
* resolve OpenAPI schema validation warnings for union/optional fields ([#558](#558)) ([5d96b2b](5d96b2b))

### CI/CD

* bump codecov/codecov-action from 5.5.2 to 5.5.3 in the minor-and-patch group ([#571](#571)) ([267f685](267f685))
* ignore chainguard/python in Dependabot docker updates ([#575](#575)) ([1935eaa](1935eaa))

### Maintenance

* bump the major group across 1 directory with 2 updates ([#570](#570)) ([b98f82c](b98f82c))
* bump the minor-and-patch group across 2 directories with 4 updates ([#569](#569)) ([3295168](3295168))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
- Maps task `estimated_complexity` to the optimal loop: simple -> ReAct, medium -> Plan-and-Execute, complex/epic -> Hybrid (falls back to Plan-and-Execute until HybridLoop is implemented)
- `AutoLoopConfig` (frozen Pydantic model) with configurable rules, budget threshold, and hybrid fallback
- `BudgetEnforcer.get_budget_utilization_pct()` for querying current monthly budget state
- `AgentEngine` accepts `auto_loop_config` (mutually exclusive with `execution_loop`) and resolves the loop per-task in `_execute()`

Test plan
- `loop_selector.py` (all complexity mappings, budget downgrade, hybrid fallback, interaction priority, model validation, factory, logging)
- `AgentEngine` auto-loop (simple->react, medium->plan_execute, mutual exclusivity, budget-aware tight/ok, budget error fallback)
- `BudgetEnforcer.get_budget_utilization_pct` (correct %, disabled, zero, over-budget, tracker failure, MemoryError propagation)

Review coverage
Pre-reviewed by 11 agents (docs-consistency, code-reviewer, python-reviewer, test-analyzer, silent-failure-hunter, type-design-analyzer, logging-audit, conventions-enforcer, resilience-audit, async-concurrency-reviewer, issue-resolution-verifier). 16 findings addressed in second commit.
Closes #200
🤖 Generated with Claude Code