feat: wire MultiAgentCoordinator into runtime #396
Conversation
Dependency Review
✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
Caution: Review failed — pull request was closed or merged during review.
Note: Reviews paused — it looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough (Summary by CodeRabbit)

Wires MultiAgentCoordinator into the runtime: adds an API endpoint to trigger coordination, a coordinator factory and config, AgentEngine/AppState wiring, coordination DTOs/events, dispatcher/service error-path adjustments, a channels helper, and comprehensive unit and integration tests.
Sequence diagram:

```mermaid
sequenceDiagram
    actor Client
    participant API as API Controller
    participant Registry as Agent Registry
    participant Coordinator as MultiAgent<br/>Coordinator
    participant Engine as Agent Engine
    participant MB as Message Bus
    Client->>API: POST /api/v1/tasks/{id}/coordinate
    API->>API: Validate access & fetch task
    API->>Registry: Resolve agents (explicit or registry)
    Registry-->>API: Agent list
    API->>API: Build CoordinationContext
    API->>MB: Publish api.coordination.started
    API->>Coordinator: coordinate(context)
    Coordinator->>Coordinator: Decompose → Route → Dispatch → Rollup → Update parent
    Coordinator->>Engine: Dispatch waves (execute)
    Engine-->>Coordinator: Wave results
    alt success
        Coordinator-->>API: CoordinationResult
        API->>MB: Publish api.coordination.completed
        API-->>Client: 200 + result
    else phase failure
        Coordinator-->>API: CoordinationResult (phase error)
        API->>MB: Publish api.coordination.phase_completed / failed
        API-->>Client: 200 + result (is_success false)
    else no coordinator
        API-->>Client: 503 Service Unavailable
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly advances the multi-agent capabilities of the system by integrating the previously developed MultiAgentCoordinator into the runtime.

Highlights
Changelog
Activity
Code Review
This pull request introduces a new multi-agent coordination feature, including a dedicated API endpoint (/tasks/{task_id}/coordinate), new data transfer objects (DTOs) for coordination requests and responses, and WebSocket event types for real-time updates. It integrates a MultiAgentCoordinator service into the application's state and AgentEngine, allowing for dynamic agent selection and task decomposition. Configuration for this feature is managed through a new CoordinationSectionConfig in the RootConfig, and a factory (build_coordinator) is provided to wire up the coordination services. The changes also include comprehensive unit and integration tests for the new API and coordination logic. A review comment suggests improving observability by using a more specific logging event (COORDINATION_FAILED) instead of COORDINATION_STARTED when decomposition fails due to a missing LLM provider.
```python
logger.warning(
    COORDINATION_STARTED,
    note="Decomposition attempted without LLM provider",
)
```
The log event COORDINATION_STARTED is misleading here as this code path represents an immediate failure. Using a more specific failure event like COORDINATION_FAILED with additional context (phase) would improve observability and make it easier to track down this specific failure mode. You'll also need to update the imports to include COORDINATION_FAILED from ai_company.observability.events.coordination.
```diff
-logger.warning(
-    COORDINATION_STARTED,
-    note="Decomposition attempted without LLM provider",
-)
+logger.warning(
+    COORDINATION_FAILED,
+    phase="decompose",
+    note="Decomposition attempted without LLM provider",
+)
```
Greptile Summary

This PR wires the pre-existing MultiAgentCoordinator into the runtime.

Key findings:
Confidence Score: 3/5
Important Files Changed
Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/ai_company/engine/coordination/section_config.py
Line: 57-83
Comment:
**`topology` field silently dropped — YAML setting has no runtime effect**
`CoordinationSectionConfig.topology` is documented as "Default coordination topology" and exposed in the YAML `coordination:` block, but `to_coordination_config()` never forwards it to `CoordinationConfig`. `CoordinationConfig` has no `topology` field, and the factory only uses it for a `logger.debug` call. A user who sets `topology: CENTRALIZED` in their YAML gets no error and no effect — the `TopologySelector` still resolves topology dynamically from routing decisions.
Either:
- Add a `topology` field to `CoordinationConfig` and pass it through here so `TopologySelector` can use it as a default/override, **or**
- Remove the `topology` field from `CoordinationSectionConfig` (since it's already controlled through `auto_topology_rules`), **or**
- Update the field description to make clear it is purely informational / a logging hint.
As-is, this is a broken configuration contract: operators configure a value that the system silently ignores.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/ai_company/engine/coordination/factory.py
Line: 147-153
Comment:
**Wrong event constant on error path**
`COORDINATION_FACTORY_BUILT` semantically means "factory construction succeeded", so using it here on the mismatched-deps error path is misleading. Log aggregators or dashboards that filter on this event name will silently include failed construction attempts alongside successful ones.
Compare with `_build_decomposition_strategy` directly above, which correctly uses `DECOMPOSITION_FAILED` for its analogous error path.
```suggestion
logger.warning(
COORDINATION_FACTORY_BUILT,
```
should be replaced with a dedicated failure event (e.g. `COORDINATION_FACTORY_BUILD_FAILED`) or reuse an existing error-scoped constant.
How can I resolve this? If you propose a fix, please make it concise.
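The dedicated-failure-event approach can be sketched as follows; the constant names and event strings here are assumptions, not the project's actual values:

```python
# Hypothetical event constants, in the style of
# ai_company.observability.events.coordination — names are assumed.
COORDINATION_FACTORY_BUILT = "coordination_factory_built"
COORDINATION_FACTORY_BUILD_FAILED = "coordination_factory_build_failed"


def factory_event(build_succeeded: bool) -> str:
    """Pick a distinct event per outcome so dashboards filtering on
    COORDINATION_FACTORY_BUILT no longer include failed construction attempts."""
    if build_succeeded:
        return COORDINATION_FACTORY_BUILT
    return COORDINATION_FACTORY_BUILD_FAILED
```

The error path in `build_coordinator` would then log `factory_event(False)` (or the failure constant directly), mirroring how `_build_decomposition_strategy` uses `DECOMPOSITION_FAILED`.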
---
This is a comment left during a code review.
Path: src/ai_company/api/dto.py
Line: 302-313
Comment:
**Case-insensitive dedup may reject valid distinct agents**
`_validate_unique_agent_names` normalises names to lowercase before checking for duplicates (`lower = name.lower()`), but `registry.get_by_name(name)` is called with the original, un-normalised string. If the registry is case-sensitive, `"Alice"` and `"alice"` are two different agents, yet this validator would reject them as duplicates at the request boundary with a confusing `Duplicate agent name: 'alice'` error.
If the registry is case-insensitive, the dedup is correct but should document that assumption. If it is case-sensitive (or the behaviour is unspecified), the comparison should match registry semantics:
```python
seen: set[str] = set()
for name in self.agent_names:
    if name in seen:  # exact-match — mirrors registry lookup
        msg = f"Duplicate agent name: {name!r}"
        raise ValueError(msg)
    seen.add(name)
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 1edd0b4
Pull request overview
Adds first-class multi-agent coordination support to the backend by introducing a company-level coordination config section, a coordinator factory/service wiring path, and a new API endpoint to trigger coordination (with WebSocket event publishing). This is accompanied by unit + integration tests and new observability event constants.
Changes:
- Introduce a `coordination:` config section (`CoordinationSectionConfig`) and wire it into `RootConfig` defaults.
- Add a `build_coordinator()` factory and expose coordinator access via `AgentEngine` and `AppState`.
- Implement a `/api/v1/tasks/{task_id}/coordinate` controller + DTOs + WS event types, with tests covering success/error paths.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/engine/test_coordination_section_config.py | Unit tests for coordination section defaults, validation, and conversion to runtime config. |
| tests/unit/engine/test_coordination_factory.py | Unit tests for build_coordinator() construction paths and placeholder decomposition strategy behavior. |
| tests/unit/engine/test_agent_engine.py | Adds unit tests for new AgentEngine.coordinator and AgentEngine.coordinate() delegation/error behavior. |
| tests/unit/config/conftest.py | Updates RootConfig factory used in config tests to include coordination section. |
| tests/unit/api/test_state.py | Adds coverage for AppState.coordinator/has_coordinator and AppState.agent_registry accessors. |
| tests/unit/api/controllers/test_coordination.py | New controller tests for coordination endpoint (happy path + validation/error cases). |
| tests/integration/engine/test_coordination_wiring.py | New integration-style test exercising API flow with coordinator + agent registry wired into the app. |
| src/ai_company/observability/events/api.py | Adds API observability event constants for coordination lifecycle. |
| src/ai_company/engine/coordination/service.py | Improves exception passthrough for Memory/Recursion errors; adjusts update-parent phase skip behavior. |
| src/ai_company/engine/coordination/section_config.py | New Pydantic model bridging YAML coordination: section to runtime CoordinationConfig. |
| src/ai_company/engine/coordination/factory.py | New coordinator factory wiring decomposition/routing/execution/workspace dependencies. |
| src/ai_company/engine/coordination/dispatchers.py | Ensures Memory/Recursion errors are not swallowed by broad exception handlers. |
| src/ai_company/engine/coordination/__init__.py | Exposes build_coordinator and CoordinationSectionConfig at package level. |
| src/ai_company/engine/agent_engine.py | Adds optional coordinator injection + coordinate() helper with clear error when unset. |
| src/ai_company/config/schema.py | Adds coordination to RootConfig. |
| src/ai_company/config/defaults.py | Adds empty coordination section to default config dict. |
| src/ai_company/api/ws_models.py | Adds WS event types for coordination lifecycle. |
| src/ai_company/api/state.py | Adds coordinator + agent registry services to AppState with typed accessors and has_coordinator. |
| src/ai_company/api/dto.py | Adds request/response DTOs for coordination endpoint. |
| src/ai_company/api/controllers/coordination.py | New controller implementing coordinate endpoint, agent resolution, and WS event publishing. |
| src/ai_company/api/controllers/__init__.py | Registers the new CoordinationController in ALL_CONTROLLERS. |
| src/ai_company/api/app.py | Allows injecting coordinator + agent registry into create_app() / AppState. |
| CLAUDE.md | Updates repo documentation to mention coordination endpoint/config/factory and lists new API coordination event constants. |
```python
logger.warning(
    COORDINATION_STARTED,
    note="Decomposition attempted without LLM provider",
)
```
```python
logger.debug(
    COORDINATION_STARTED,
    note="Building coordinator from config",
    topology=config.topology.value,
    has_provider=provider is not None,
    has_workspace=workspace_strategy is not None,
)
```
```python
if self._task_engine is None:
    return
if rollup is None:
    logger.info(
        COORDINATION_PHASE_STARTED,
        phase="update_parent",
        note="Skipped — rollup is None (rollup phase failed)",
    )
    return
```
```python
ws_event_type = (
    WsEventType.COORDINATION_COMPLETED
    if result.is_success
    else WsEventType.COORDINATION_FAILED
)
_publish_ws_event(
    request,
    ws_event_type,
    {
        "task_id": task_id,
        "topology": result.topology.value,
        "is_success": result.is_success,
        "total_duration_seconds": result.total_duration_seconds,
    },
)
logger.info(
    API_COORDINATION_COMPLETED,
    task_id=task_id,
    topology=result.topology.value,
    is_success=result.is_success,
    total_duration_seconds=result.total_duration_seconds,
)
```
```
Validates the full bootstrap-to-API path:
1. Create RootConfig with coordination section
2. Build coordinator via build_coordinator() with ManualDecompositionStrategy
3. Create TaskEngine with mock persistence
4. Create AgentRegistryService and register test agents
5. Build app via create_app() with coordinator + agent_registry
6. Use TestClient to create a task and trigger coordination
```
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/ai_company/api/app.py (1)
351-415: ⚠️ Potential issue | 🟡 Minor

Warn when coordination services are absent at startup.

The new coordination route can now fail through AppState.coordinator / AppState.agent_registry, but the startup warning still only tracks persistence/message_bus/cost_tracker/task_engine. If `coordinator` or `agent_registry` are the only missing dependencies, `/coordinate` will 503 and startup stays silent.

💡 Suggested update:
```diff
 def create_app(  # noqa: PLR0913
 @@
-    if (
-        persistence is None
-        or message_bus is None
-        or cost_tracker is None
-        or task_engine is None
-    ):
-        msg = (
-            "create_app called without persistence, message_bus, "
-            "cost_tracker, and/or task_engine — controllers accessing "
-            "missing services will return 503. Use test fakes for testing."
-        )
-        logger.warning(API_APP_STARTUP, note=msg)
+    missing_services: list[str] = []
+    if persistence is None:
+        missing_services.append("persistence")
+    if message_bus is None:
+        missing_services.append("message_bus")
+    if cost_tracker is None:
+        missing_services.append("cost_tracker")
+    if task_engine is None:
+        missing_services.append("task_engine")
+    if coordinator is None:
+        missing_services.append("coordinator")
+    if agent_registry is None:
+        missing_services.append("agent_registry")
+    if missing_services:
+        logger.warning(
+            API_APP_STARTUP,
+            missing_services=tuple(missing_services),
+            note=(
+                "Controllers accessing missing services will return 503. "
+                "Use test fakes for testing."
+            ),
+        )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/api/app.py` around lines 351 - 415, The startup warning in create_app does not include coordinator and agent_registry, so missing coordination services can silently cause /coordinate to 503; update the conditional that currently checks persistence/message_bus/cost_tracker/task_engine in create_app to also test for coordinator is None or agent_registry is None, and expand the msg passed to logger.warning(API_APP_STARTUP, ...) to mention coordinator and agent_registry (or a general “coordination services”) so the warning reflects these missing dependencies when constructing AppState.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/api/controllers/coordination.py`:
- Around line 174-189: The _get_task method calls app_state.task_engine.get_task
which can raise ServiceUnavailableError (503) and mask a more relevant
coordination error; update handling in _get_task (or in the coordinator
preflight) to first validate both services or to catch ServiceUnavailableError
from app_state.task_engine.get_task and, if the coordinator is missing/invalid,
raise the coordinator-specific error instead (or include both statuses in the
response); specifically reference app_state.task_engine, the _get_task function,
and the ServiceUnavailableError so the change either pre-validates the
coordinator state before calling get_task or wraps get_task in a try/except that
re-raises a coordinator-specific message when appropriate.
In `@src/ai_company/api/dto.py`:
- Around line 315-341: The CoordinationPhaseResponse model's
_validate_success_error_consistency currently allows empty strings for error on
failed phases; update that validator so when success is False it requires error
to be a non-empty, non-whitespace string (e.g., check error is not None and
error.strip() != "") and raise a ValueError with a clear message like "failed
phase must have a non-empty error description"; keep the existing check that a
successful phase must not have an error (error must be None) and return self at
the end.
In `@src/ai_company/api/state.py`:
- Around line 161-164: Add a boolean accessor for agent_registry to mirror other
optional services: implement a has_agent_registry property that returns whether
self._agent_registry is set (e.g., bool(self._agent_registry)) so callers can
conditionally use the service instead of triggering the 503 from agent_registry;
follow the pattern used by has_coordinator / has_task_engine / has_auth_service
and keep naming consistent with agent_registry and _require_service.
In `@src/ai_company/engine/coordination/dispatchers.py`:
- Around line 163-164: Before re-raising MemoryError or RecursionError in these
except blocks, emit a minimal low-allocation log entry with static context so
the fatal coordination failure is recorded; specifically, in each except
(MemoryError, RecursionError): block that currently just does "raise" (handling
the MemoryError/RecursionError cases at the noted locations), call the module's
existing logger (e.g., logger or process_logger) with logger.error or
logger.warning and a short static message like "Fatal coordination failure in
dispatchers: preserving MemoryError/RecursionError in <function_name_or_phase>"
(include the current function/phase/wave identifier available in scope), then
re-raise; apply the same change to all similar except blocks handling
MemoryError/RecursionError.
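The log-then-re-raise pattern can be sketched like this; stdlib `logging` stands in for the project's structured logger, and the function name is illustrative:

```python
import logging

logger = logging.getLogger(__name__)


def run_dispatch(step):
    """Invoke one dispatch step, recording fatal errors before propagating."""
    try:
        return step()
    except (MemoryError, RecursionError):
        # Static string, minimal allocation — safe even under memory pressure,
        # and the fatal failure is no longer invisible in the logs.
        logger.error("Fatal coordination failure in dispatchers: run_dispatch")
        raise
```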
In `@src/ai_company/engine/coordination/factory.py`:
- Around line 67-71: The logger.warning call that currently uses
COORDINATION_STARTED when raising DecompositionError is wrong; update the event
constant to DECOMPOSITION_FAILED and change the import to pull
DECOMPOSITION_FAILED from ai_company.observability.events.decomposition (not the
coordination module). Locate the logger.warning(...) + raise
DecompositionError(msg) sequence in factory.py (the block that logs
"Decomposition attempted without LLM provider") and replace the event constant
and import so the warning emits DECOMPOSITION_FAILED.
In `@src/ai_company/engine/coordination/service.py`:
- Around line 517-523: The log uses the wrong event constant — when rollup is
None the code in the update_parent branch logs COORDINATION_PHASE_STARTED even
though the phase is being skipped; change the logger.info call to emit a
semantically correct event (e.g. COORDINATION_PHASE_SKIPPED or
COORDINATION_PHASE_COMPLETED) and keep the same phase="update_parent" and note
explaining the skip (or add a dedicated skip constant if one exists); update any
imports/usages of COORDINATION_PHASE_* in the file to use the chosen
skip/completed constant so logs accurately reflect the skipped state for rollup
being None.
In `@tests/integration/engine/test_coordination_wiring.py`:
- Around line 144-180: The test currently injects an AsyncMock coordinator
(coordinator = AsyncMock()) instead of assembling the real coordinator, so it
never exercises RootConfig.coordination or build_coordinator; replace the
AsyncMock injection in test_build_coordinator_and_coordinate_via_api with a call
to the real factory (e.g. call build_coordinator(...) or use
RootConfig.coordination to build the coordinator instance) and, if determinism
is required, configure the RootConfig or coordination strategy to a
task-id-aware manual strategy (rather than mocking the coordinator.coordinate
method); remove the AsyncMock coordinator and ensure the test calls the real
coordinator.coordinate to validate wiring and multi-agent execution.
In `@tests/unit/api/controllers/test_coordination.py`:
- Around line 129-166: The test function test_coordinate_task_success is
declared async but uses the synchronous TestClient (coordination_client.post);
make the test fully synchronous by changing async def
test_coordinate_task_success to def test_coordinate_task_success and replace the
await agent_registry.register(agent) call with a synchronous run of the
coroutine (e.g., use asyncio.get_event_loop().run_until_complete or asyncio.run
to execute agent_registry.register(agent)); keep the rest of the assertions and
mocked mock_coordinator.coordinate behavior unchanged and import asyncio if not
already present.
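The conversion that comment describes can be sketched with a fake registry; the real test would keep its TestClient post and assertions unchanged:

```python
import asyncio


class FakeAgentRegistry:
    """Stand-in for AgentRegistryService with an async register()."""

    def __init__(self) -> None:
        self.names: list[str] = []

    async def register(self, name: str) -> None:
        self.names.append(name)


def test_coordinate_task_success() -> None:
    # Plain `def`: the coroutine is driven synchronously with asyncio.run,
    # so the synchronous TestClient can be used for the POST afterwards.
    registry = FakeAgentRegistry()
    asyncio.run(registry.register("test-agent"))
    assert registry.names == ["test-agent"]


test_coordinate_task_success()
```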
---
Outside diff comments:
In `@src/ai_company/api/app.py`:
- Around line 351-415: The startup warning in create_app does not include
coordinator and agent_registry, so missing coordination services can silently
cause /coordinate to 503; update the conditional that currently checks
persistence/message_bus/cost_tracker/task_engine in create_app to also test for
coordinator is None or agent_registry is None, and expand the msg passed to
logger.warning(API_APP_STARTUP, ...) to mention coordinator and agent_registry
(or a general “coordination services”) so the warning reflects these missing
dependencies when constructing AppState.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9281b6b1-453f-4f2d-8fe3-ec165972f98e
📒 Files selected for processing (23)
- CLAUDE.md
- src/ai_company/api/app.py
- src/ai_company/api/controllers/__init__.py
- src/ai_company/api/controllers/coordination.py
- src/ai_company/api/dto.py
- src/ai_company/api/state.py
- src/ai_company/api/ws_models.py
- src/ai_company/config/defaults.py
- src/ai_company/config/schema.py
- src/ai_company/engine/agent_engine.py
- src/ai_company/engine/coordination/__init__.py
- src/ai_company/engine/coordination/dispatchers.py
- src/ai_company/engine/coordination/factory.py
- src/ai_company/engine/coordination/section_config.py
- src/ai_company/engine/coordination/service.py
- src/ai_company/observability/events/api.py
- tests/integration/engine/test_coordination_wiring.py
- tests/unit/api/controllers/test_coordination.py
- tests/unit/api/test_state.py
- tests/unit/config/conftest.py
- tests/unit/engine/test_agent_engine.py
- tests/unit/engine/test_coordination_factory.py
- tests/unit/engine/test_coordination_section_config.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Agent
- GitHub Check: Build Web
- GitHub Check: Build Backend
- GitHub Check: Test (Python 3.14)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Python 3.14+ required; use PEP 649 native lazy annotations (no `from __future__ import annotations`)
Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14
Add type hints to all public functions; enforce mypy strict mode
Use Google style docstrings required on all public classes and functions — enforced by ruff D rules
Create new objects instead of mutating existing ones; for non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves — never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use `@computed_field` for derived values instead of storing redundant fields; use NotBlankStr for all identifier/name fields (including optional and tuple variants)
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls) — prefer structured concurrency over bare create_task; existing code is being migrated incrementally
Enforce 88 character line length — ruff enforces this
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Files:
src/ai_company/api/ws_models.py, src/ai_company/observability/events/api.py, tests/unit/config/conftest.py, src/ai_company/api/controllers/__init__.py, src/ai_company/api/state.py, src/ai_company/engine/coordination/service.py, src/ai_company/config/schema.py, src/ai_company/config/defaults.py, tests/unit/engine/test_coordination_factory.py, src/ai_company/engine/agent_engine.py, tests/integration/engine/test_coordination_wiring.py, src/ai_company/api/controllers/coordination.py, src/ai_company/engine/coordination/__init__.py, tests/unit/engine/test_coordination_section_config.py, tests/unit/api/test_state.py, tests/unit/api/controllers/test_coordination.py, src/ai_company/api/app.py, src/ai_company/engine/coordination/section_config.py, src/ai_company/engine/coordination/factory.py, src/ai_company/api/dto.py, src/ai_company/engine/coordination/dispatchers.py, tests/unit/engine/test_agent_engine.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Every module with business logic must import get_logger from ai_company.observability and define `logger = get_logger(__name__)`; never use import logging or logging.getLogger() or print() in application code
Always use logger variable name (not _logger or log)
Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Use structured kwargs in logging: logger.info(EVENT, key=value) — never logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging
Maintain 80% minimum code coverage — enforced in CI
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small aliases
Lint Python code with: uv run ruff check src/ tests/
Auto-fix lint issues with: uv run ruff check src/ tests/ --fix
Format code with: uv run ruff format src/ tests/
Type-check with strict mode: uv run mypy src/ tests/
Files:
src/ai_company/api/ws_models.py, src/ai_company/observability/events/api.py, src/ai_company/api/controllers/__init__.py, src/ai_company/api/state.py, src/ai_company/engine/coordination/service.py, src/ai_company/config/schema.py, src/ai_company/config/defaults.py, src/ai_company/engine/agent_engine.py, src/ai_company/api/controllers/coordination.py, src/ai_company/engine/coordination/__init__.py, src/ai_company/api/app.py, src/ai_company/engine/coordination/section_config.py, src/ai_company/engine/coordination/factory.py, src/ai_company/api/dto.py, src/ai_company/engine/coordination/dispatchers.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Retryable errors (is_retryable=True) include: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError; non-retryable errors raise immediately without retry; RetryExhaustedError signals all retries failed
Rate limiter respects RateLimitError.retry_after from providers — automatically pauses future requests
Files:
src/ai_company/api/ws_models.py, src/ai_company/observability/events/api.py, src/ai_company/api/controllers/__init__.py, src/ai_company/api/state.py, src/ai_company/engine/coordination/service.py, src/ai_company/config/schema.py, src/ai_company/config/defaults.py, src/ai_company/engine/agent_engine.py, src/ai_company/api/controllers/coordination.py, src/ai_company/engine/coordination/__init__.py, src/ai_company/api/app.py, src/ai_company/engine/coordination/section_config.py, src/ai_company/engine/coordination/factory.py, src/ai_company/api/dto.py, src/ai_company/engine/coordination/dispatchers.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, and `@pytest.mark.slow`
Use pytest-xdist via -n auto for parallel test execution — ALWAYS include -n auto when running pytest, never run tests sequentially
Prefer `@pytest.mark.parametrize` for testing similar cases
Use test-provider, test-small-001, etc. in tests instead of real vendor names
Run unit tests with: uv run pytest tests/ -m unit -n auto
Run integration tests with: uv run pytest tests/ -m integration -n auto
Run e2e tests with: uv run pytest tests/ -m e2e -n auto
Run full test suite with coverage: uv run pytest tests/ -n auto --cov=ai_company --cov-fail-under=80
Files:
tests/unit/config/conftest.py, tests/unit/engine/test_coordination_factory.py, tests/integration/engine/test_coordination_wiring.py, tests/unit/engine/test_coordination_section_config.py, tests/unit/api/test_state.py, tests/unit/api/controllers/test_coordination.py, tests/unit/engine/test_agent_engine.py
🧠 Learnings (11)
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Applied to files:
src/ai_company/observability/events/api.py, CLAUDE.md, src/ai_company/api/app.py
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/ai_company/providers/**/*.py : RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig
Applied to files:
tests/unit/config/conftest.py, src/ai_company/config/schema.py
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Every module with business logic must import get_logger from ai_company.observability and define logger = get_logger(__name__); never use import logging or logging.getLogger() or print() in application code
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use structured kwargs in logging: logger.info(EVENT, key=value) — never logger.info('msg %s', val)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Always use logger variable name (not _logger or log)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use DEBUG logging for object creation, internal flow, and entry/exit of key functions
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Pure data models, enums, and re-exports do NOT need logging
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/ai_company/**/*.py : Retryable errors (is_retryable=True) include: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError; non-retryable errors raise immediately without retry; RetryExhaustedError signals all retries failed
Applied to files:
src/ai_company/engine/agent_engine.py
🧬 Code graph analysis (13)
tests/unit/config/conftest.py (1)
  src/ai_company/engine/coordination/section_config.py (1): CoordinationSectionConfig (15-83)
src/ai_company/api/controllers/__init__.py (1)
  src/ai_company/api/controllers/coordination.py (1): CoordinationController (111-313)
src/ai_company/api/state.py (3)
  src/ai_company/engine/coordination/service.py (1): MultiAgentCoordinator (50-580)
  src/ai_company/hr/registry.py (1): AgentRegistryService (32-262)
  src/ai_company/engine/agent_engine.py (1): coordinator (224-226)
src/ai_company/engine/coordination/service.py (1)
  src/ai_company/memory/errors.py (1): MemoryError (13-14)
src/ai_company/config/schema.py (1)
  src/ai_company/engine/coordination/section_config.py (1): CoordinationSectionConfig (15-83)
src/ai_company/engine/agent_engine.py (3)
  src/ai_company/engine/errors.py (1): ExecutionStateError (17-18)
  src/ai_company/engine/coordination/models.py (2): CoordinationContext (30-61), CoordinationResult (128-198)
  src/ai_company/engine/coordination/service.py (2): MultiAgentCoordinator (50-580), coordinate (92-196)
tests/integration/engine/test_coordination_wiring.py (5)
  src/ai_company/api/app.py (1): create_app (351-491)
  src/ai_company/api/auth/config.py (1): AuthConfig (28-110)
  src/ai_company/hr/registry.py (1): AgentRegistryService (32-262)
  src/ai_company/providers/enums.py (1): FinishReason (15-22)
  src/ai_company/engine/coordination/models.py (2): CoordinationPhaseResult (64-96), CoordinationResult (128-198)
src/ai_company/engine/coordination/__init__.py (2)
  src/ai_company/engine/coordination/factory.py (1): build_coordinator (74-176)
  src/ai_company/engine/coordination/section_config.py (1): CoordinationSectionConfig (15-83)
tests/unit/api/controllers/test_coordination.py (3)
  src/ai_company/engine/coordination/models.py (1): CoordinationResult (128-198)
  src/ai_company/api/state.py (1): coordinator (152-154)
  src/ai_company/engine/coordination/service.py (1): coordinate (92-196)
src/ai_company/api/app.py (4)
  src/ai_company/engine/coordination/service.py (1): MultiAgentCoordinator (50-580)
  src/ai_company/api/state.py (3): task_engine (119-121), coordinator (152-154), agent_registry (162-164)
  src/ai_company/hr/registry.py (1): AgentRegistryService (32-262)
  tests/unit/api/controllers/test_coordination.py (1): agent_registry (80-81)
src/ai_company/engine/coordination/section_config.py (2)
  src/ai_company/engine/coordination/config.py (1): CoordinationConfig (8-38)
  src/ai_company/engine/routing/models.py (1): AutoTopologyConfig (160-207)
src/ai_company/api/dto.py (1)
  src/ai_company/config/schema.py (1): _validate_unique_agent_names (538-550)
src/ai_company/engine/coordination/dispatchers.py (1)
  src/ai_company/memory/errors.py (1): MemoryError (13-14)
🪛 LanguageTool
CLAUDE.md
[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...
(EG_NO_COMMA)
🔇 Additional comments (22)
src/ai_company/api/controllers/__init__.py (1)

13-13: Coordination controller wiring is consistent and complete. Import, controller registry, and public export were updated together, so route discovery and module exports stay aligned.
Also applies to: 39-39, 53-53

src/ai_company/observability/events/api.py (1)

44-49: New coordination API event constants look good. These additions follow the existing event taxonomy and support consistent structured logging/event emission.

src/ai_company/api/dto.py (2)

274-313: `CoordinateTaskRequest` validation is solid. The bounds, optional overrides, and case-insensitive duplicate-agent guard are well implemented.

344-372: `CoordinationResultResponse` aggregation design is clean. Using a computed `is_success` over phase outcomes avoids redundant state and keeps the model consistent.

src/ai_company/api/ws_models.py (1)

46-49: Coordination WebSocket event taxonomy is well added. The lifecycle coverage (`started`, `phase_completed`, `completed`, `failed`) is clear and consistent.

src/ai_company/config/defaults.py (1)

42-42: Default config now correctly includes coordination. Good addition to keep baseline config hydration aligned with the new schema.

tests/unit/config/conftest.py (1)

25-25: Test config factory wiring is correct. Including `CoordinationSectionConfig` in `RootConfigFactory` keeps fixtures aligned with the updated root schema.
Also applies to: 100-100

src/ai_company/engine/coordination/__init__.py (1)

18-18: Coordination package exports are updated appropriately. Re-exporting both factory and section config improves discoverability for runtime bootstrap and tests.
Also applies to: 26-26, 36-36, 43-43

tests/unit/engine/test_coordination_section_config.py (1)

15-122: Strong unit coverage for `CoordinationSectionConfig`. The tests exercise defaults, immutability, override precedence, and validation/error paths thoroughly.

src/ai_company/engine/coordination/section_config.py (1)

57-83: The `topology` and `auto_topology_rules` fields are not dead configuration; they are actively used during app startup. In src/ai_company/engine/coordination/factory.py, the `build_coordinator()` function receives the full `CoordinationSectionConfig` and uses `config.topology` (line 117) for logging and `config.auto_topology_rules` (line 146) to construct `TopologySelector`. The `TopologySelector` is then wired into the `TaskRoutingService` at build time. The coordinator singleton is created once at app startup with these topology settings baked in. The method `to_coordination_config()` is for per-request runtime overrides only (max concurrency, fail-fast, workspace settings). The topology selection happens during the routing phase of coordination, not via the per-request `CoordinationConfig`.
> Likely an incorrect or invalid review comment.

tests/unit/api/controllers/test_coordination.py (2)

37-48: LGTM! Clean helper function with appropriate test data. Using `date(2026, 1, 1)` as a future date for test fixtures is fine.

325-342: LGTM! Good coverage of the 503 scenario when coordinator is not configured. The test correctly uses the shared `test_client` fixture (without coordinator) and verifies the expected status code.

src/ai_company/api/state.py (1)

55-80: LGTM! The new `coordinator` and `agent_registry` parameters follow the established pattern for optional service injection. The `__slots__` ordering is maintained alphabetically.

tests/unit/engine/test_coordination_factory.py (2)

26-36: LGTM! Good basic test for the factory. The minimal configuration test ensures the factory produces a valid coordinator with default settings.

126-136: LGTM! Good test for the placeholder strategy's error behavior. The async test correctly exercises the `decompose` method and verifies the expected `DecompositionError` with a clear message.

src/ai_company/engine/agent_engine.py (2)

223-251: LGTM! Clean implementation of the coordinator property and `coordinate()` method. The error handling follows the "log-before-raise" pattern specified in the PR objectives, and the docstring documents all expected exceptions.

166-166: LGTM! The coordinator parameter and initialization follow the established pattern for optional dependencies. Including `has_coordinator` in the creation log provides useful debugging information.
Also applies to: 212-212, 220-220

src/ai_company/engine/coordination/factory.py (1)

74-176: LGTM! Well-structured factory with clear numbered steps matching the docstring. The deferred imports for heavy dependencies and the optional workspace service construction are appropriate design choices.

src/ai_company/api/controllers/coordination.py (3)

268-313: LGTM! Clean agent resolution logic with proper error handling and logging. The sequential lookup by name is appropriate for typical use cases.

216-266: LGTM! The `_execute` method properly handles both success and failure paths, publishing appropriate WebSocket events and logging outcomes. The mapping of `CoordinationPhaseError` to `ApiValidationError` (422) aligns with the PR's documented behavior.

89-108: LGTM! Clean mapping from domain model to API response DTO. The comment about `is_success` being a `@computed_field` is helpful documentation.

src/ai_company/engine/coordination/service.py (1)

177-178: No changes needed; `MemoryError` correctly refers to the builtin class. The custom `MemoryError` from src/ai_company/memory/errors.py is not imported in this file, so all references to `MemoryError` in the except clauses (lines 177, 212, 268, 409, 474, 564) correctly resolve to the builtin `MemoryError`. The syntax follows PEP 758 correctly with `except MemoryError, RecursionError:` and the guards properly re-raise non-recoverable system errors as intended.
```python
async def _get_task(
    self,
    app_state: AppState,
    task_id: str,
) -> Task:
    """Fetch task or raise 404."""
    task = await app_state.task_engine.get_task(task_id)
    if task is None:
        logger.warning(
            API_RESOURCE_NOT_FOUND,
            resource="task",
            id=task_id,
        )
        msg = f"Task {task_id!r} not found"
        raise NotFoundError(msg)
    return task
```
🧹 Nitpick | 🔵 Trivial
Task engine access may raise 503 before coordinator check is meaningful.
_get_task accesses app_state.task_engine, which raises ServiceUnavailableError (503) if not configured. This happens after the coordinator check (line 143), so if task_engine is missing but coordinator is present, users get a generic "Task Engine not configured" 503 instead of a more specific error about the coordination setup.
Consider whether the error messaging is acceptable or if you want to validate both services upfront.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/api/controllers/coordination.py` around lines 174 - 189, The
_get_task method calls app_state.task_engine.get_task which can raise
ServiceUnavailableError (503) and mask a more relevant coordination error;
update handling in _get_task (or in the coordinator preflight) to first validate
both services or to catch ServiceUnavailableError from
app_state.task_engine.get_task and, if the coordinator is missing/invalid, raise
the coordinator-specific error instead (or include both statuses in the
response); specifically reference app_state.task_engine, the _get_task function,
and the ServiceUnavailableError so the change either pre-validates the
coordinator state before calling get_task or wraps get_task in a try/except that
re-raises a coordinator-specific message when appropriate.
```python
async def test_build_coordinator_and_coordinate_via_api(self) -> None:
    """End-to-end: build coordinator, create app, coordinate task."""
    # 1. Create config
    config = RootConfig(company_name="test-corp")

    # 2. Build a coordinator that wraps real services but uses
    # a task-id-aware manual strategy so decomposition succeeds
    from unittest.mock import AsyncMock

    from ai_company.engine.coordination.models import (
        CoordinationPhaseResult,
        CoordinationResult,
    )

    async def _mock_coordinate(context):  # type: ignore[no-untyped-def]
        """Return a realistic result keyed to the actual task."""
        return CoordinationResult(
            parent_task_id=context.task.id,
            topology=CoordinationTopology.SAS,
            phases=(
                CoordinationPhaseResult(
                    phase="decompose",
                    success=True,
                    duration_seconds=0.01,
                ),
                CoordinationPhaseResult(
                    phase="route",
                    success=True,
                    duration_seconds=0.01,
                ),
            ),
            total_duration_seconds=0.05,
            total_cost_usd=0.001,
        )

    coordinator = AsyncMock()
    coordinator.coordinate.side_effect = _mock_coordinate
```
This test still bypasses the runtime wiring it claims to validate.
Lines 179-180 inject an AsyncMock coordinator instead of calling build_coordinator(). That means the test never exercises RootConfig.coordination, factory assembly, or real multi-agent execution, so the main coordinator bootstrap path can regress while this still passes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/integration/engine/test_coordination_wiring.py` around lines 144 - 180,
The test currently injects an AsyncMock coordinator (coordinator = AsyncMock())
instead of assembling the real coordinator, so it never exercises
RootConfig.coordination or build_coordinator; replace the AsyncMock injection in
test_build_coordinator_and_coordinate_via_api with a call to the real factory
(e.g. call build_coordinator(...) or use RootConfig.coordination to build the
coordinator instance) and, if determinism is required, configure the RootConfig
or coordination strategy to a task-id-aware manual strategy (rather than mocking
the coordinator.coordinate method); remove the AsyncMock coordinator and ensure
the test calls the real coordinator.coordinate to validate wiring and
multi-agent execution.
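The shape of the suggested replacement test can be sketched with stubs; `RootConfig`, `CoordinationSettings`, `Coordinator`, and `build_coordinator` below are simplified stand-ins for the project's real types, not its implementation:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CoordinationSettings:
    topology: str = "sas"


@dataclass(frozen=True)
class RootConfig:
    company_name: str
    coordination: CoordinationSettings = field(default_factory=CoordinationSettings)


@dataclass
class Coordinator:
    topology: str


def build_coordinator(config: RootConfig) -> Coordinator:
    """Stand-in factory: the real one wires strategies, routing, and dispatch."""
    return Coordinator(topology=config.coordination.topology)


# The test then exercises the real assembly path instead of injecting
# an AsyncMock, so factory regressions fail this test.
config = RootConfig(company_name="test-corp")
coordinator = build_coordinator(config)
assert coordinator.topology == "sas"
```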
```python
@pytest.mark.unit
class TestCoordinationControllerHappyPath:
    async def test_coordinate_task_success(
        self,
        coordination_client: TestClient[Any],
        mock_coordinator: AsyncMock,
        agent_registry: AgentRegistryService,
    ) -> None:
        agent = _make_agent()
        await agent_registry.register(agent)

        resp = coordination_client.post(
            "/api/v1/tasks",
            json={
                "title": "Test task",
                "description": "A test task for coordination",
                "type": "development",
                "project": "proj-1",
                "created_by": "api",
            },
        )
        assert resp.status_code == 201
        task_id = resp.json()["data"]["id"]

        mock_coordinator.coordinate.return_value = _make_coordination_result(task_id)

        resp = coordination_client.post(
            f"/api/v1/tasks/{task_id}/coordinate",
            json={},
        )
        assert resp.status_code == 200
        body = resp.json()
        assert body["success"] is True
        assert body["data"]["parent_task_id"] == task_id
        assert body["data"]["topology"] == "sas"
        assert body["data"]["is_success"] is True
        assert body["data"]["wave_count"] == 0
        assert len(body["data"]["phases"]) == 1
```
🧹 Nitpick | 🔵 Trivial
Test methods are async but TestClient is synchronous.
The test methods are declared async (e.g., async def test_coordinate_task_success), but Litestar's TestClient is synchronous. While pytest-asyncio may handle this, the await agent_registry.register(agent) call requires async context, so this works. However, the HTTP calls (coordination_client.post(...)) are synchronous.
For consistency and clarity, consider either:
- Making the tests fully synchronous and using a sync registry setup, or
- Using `AsyncTestClient` if available for true async testing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/api/controllers/test_coordination.py` around lines 129 - 166, The
test function test_coordinate_task_success is declared async but uses the
synchronous TestClient (coordination_client.post); make the test fully
synchronous by changing async def test_coordinate_task_success to def
test_coordinate_task_success and replace the await
agent_registry.register(agent) call with a synchronous run of the coroutine
(e.g., use asyncio.get_event_loop().run_until_complete or asyncio.run to execute
agent_registry.register(agent)); keep the rest of the assertions and mocked
mock_coordinator.coordinate behavior unchanged and import asyncio if not already
present.
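The sync-test shape described in the prompt can be sketched like this, with `Registry` as a hypothetical stand-in for `AgentRegistryService`:

```python
import asyncio


class Registry:
    """Stand-in for AgentRegistryService with an async register method."""

    def __init__(self) -> None:
        self.agents: list[str] = []

    async def register(self, agent: str) -> None:
        self.agents.append(agent)


def test_register_agent_sync() -> None:
    """A plain sync test: the lone async setup call runs via asyncio.run,
    so the synchronous TestClient calls need no event loop."""
    registry = Registry()
    asyncio.run(registry.register("test-agent-001"))
    assert registry.agents == ["test-agent-001"]
```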
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #396      +/-   ##
==========================================
- Coverage   93.84%   93.79%   -0.06%
==========================================
  Files         463      466       +3
  Lines       21691    21944     +253
  Branches     2086     2103      +17
==========================================
+ Hits        20357    20582     +225
- Misses       1032     1058      +26
- Partials      302      304       +2
```

☔ View full report in Codecov by Sentry.
|
Actionable comments posted: 5
♻️ Duplicate comments (3)
tests/integration/engine/test_coordination_wiring.py (1)
144-180: 🧹 Nitpick | 🔵 Trivial

Test uses mock coordinator; intentional per docstring but limits integration coverage.

The past review noted this test bypasses `build_coordinator()`. The docstring (lines 5-6) indicates this is intentional: "Create a mock coordinator (real build_coordinator() is unit-tested separately)." This design trades full end-to-end coverage for test isolation. Consider whether a separate integration test that uses `build_coordinator()` would catch wiring regressions between `RootConfig.coordination` and the factory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/engine/test_coordination_wiring.py` around lines 144 - 180, The test replaces the real coordinator with an AsyncMock (coordinator.coordinate side_effect) which intentionally isolates behavior but misses wiring between RootConfig.coordination and the build_coordinator factory; add a separate integration test that constructs a RootConfig with real coordination settings and calls build_coordinator() (referencing build_coordinator and RootConfig) and then invokes the coordinator.coordinate flow to ensure the factory wiring and configuration are exercised end-to-end rather than mocked.src/ai_company/engine/coordination/dispatchers.py (1)
163-164: ⚠️ Potential issue | 🟡 Minor

Add minimal logging before re-raising MemoryError/RecursionError.

These `except MemoryError, RecursionError: raise` blocks skip logging entirely. While allocating during `MemoryError` is risky, a static low-allocation log helps trace fatal failures. Note: the relevant snippet shows `ai_company.memory.errors.MemoryError` shadows the builtin; verify which exception this catches.

```bash
#!/bin/bash
# Check if the builtin MemoryError is intended or the custom one from memory.errors
rg -n "from ai_company.memory.errors import" src/ai_company/engine/coordination/dispatchers.py
rg -n "^import " src/ai_company/engine/coordination/dispatchers.py | head -20
```
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/coordination/dispatchers.py` around lines 163 - 164, Before re-raising the caught MemoryError/RecursionError in the except block that currently reads "except MemoryError, RecursionError: raise", add a minimal, low-allocation log statement (use a module-level logger created once, e.g. logger = logging.getLogger(__name__)) to record the exception type and a short message, then re-raise; ensure the log call is simple (logger.critical or logger.error) to avoid heavy allocations. Also verify which MemoryError is being caught (the custom ai_company.memory.errors.MemoryError vs the built-in) by checking imports in this module and, if the custom type is intended, consider logging its module name in the message for clarity. Finally, keep the log content minimal so it is safe during low-memory conditions and preserve the existing re-raise behavior.tests/unit/api/controllers/test_coordination.py (1)
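A minimal sketch of the requested pattern, assuming a plain stdlib logger (the project's `get_logger` and event-name constants are not reproduced here):

```python
import logging

logger = logging.getLogger(__name__)


def dispatch(step):
    """Run one dispatch step; on fatal errors, log a static line and re-raise."""
    try:
        return step()
    # Repo style is PEP 758 `except MemoryError, RecursionError:`; the
    # parenthesized tuple below is the pre-3.14 equivalent.
    except (MemoryError, RecursionError):
        # Static message only: avoid f-strings and allocation under pressure.
        logger.critical("coordination.dispatch.fatal_error")
        raise
```

The static string keeps the log call cheap enough to survive low-memory conditions while still marking where the fatal error surfaced.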
129-166: 🧹 Nitpick | 🔵 Trivial

Async test methods work but TestClient is synchronous.

The test methods are `async def` while `TestClient` is synchronous. This works because pytest-asyncio handles the coroutine, and `await agent_registry.register(agent)` requires async context. The pattern is functional but slightly inconsistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/api/controllers/test_coordination.py` around lines 129 - 166, The test mixes async setup (await agent_registry.register) with synchronous TestClient usage (coordination_client.post). Make the test fully async: in TestCoordinationControllerHappyPath.test_coordinate_task_success, switch to using an async HTTP client (await coordination_client.post(...)) or ensure coordination_client is an AsyncClient fixture, and await both POST calls to "/api/v1/tasks" and f"/api/v1/tasks/{task_id}/coordinate"; keep the async agent_registry.register and leave mock_coordinator.coordinate.return_value assignment as-is so the flow is consistently awaited.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CLAUDE.md`:
- Line 154: The sentence listing event name constants uses "e.g." without a
following comma; update the paragraph that starts with "**Event names**: always
use constants..." (the list including PROVIDER_CALL_START, BUDGET_RECORD_ADDED,
etc.) to use "e.g.," (add a comma after "e.g.") so it reads "(e.g.,
`PROVIDER_CALL_START`...)" throughout that line.
In `@src/ai_company/api/controllers/coordination.py`:
- Around line 267-276: The current block always calls logger.info for
coordination outcomes; change it to log at a higher level when coordination
failed by using logger.warning (or logger.error if preferred) when
result.is_success is False: keep the same log_event selection
(API_COORDINATION_COMPLETED vs API_COORDINATION_FAILED) and the same structured
fields (task_id, topology=result.topology.value, is_success,
total_duration_seconds) but call logger.warning for failures and logger.info for
successes (adjust the call site where log_event and logger.info are used).
In `@src/ai_company/api/dto.py`:
- Line 360: The parent_task_id field is missing a max_length constraint which
makes it inconsistent with other task ID fields like
CreateApprovalRequest.task_id; update the parent_task_id declaration to include
the same max_length=128 constraint (i.e., adjust the parent_task_id field
definition to mirror the other task_id fields so it enforces max_length=128).
In `@src/ai_company/engine/coordination/factory.py`:
- Around line 77-91: The factory returns a _NoProviderDecompositionStrategy
instance that currently relies on structural typing; make its implementation
explicit by annotating/declaring _NoProviderDecompositionStrategy as
implementing the DecompositionStrategy protocol (e.g., inherit from
DecompositionStrategy or add an explicit typing annotation) so missing methods
are caught at definition time; to do this, import DecompositionStrategy at
runtime (move the DecompositionStrategy import out of the TYPE_CHECKING block)
and update the class declaration for _NoProviderDecompositionStrategy to
reference DecompositionStrategy, ensuring method signatures match the protocol.
In `@src/ai_company/engine/coordination/service.py`:
- Around line 179-180: The except clause currently written as "except
MemoryError, RecursionError:" should log a minimal static error before
re-raising; change it to the tuple form "except (MemoryError, RecursionError):",
call the module's logger (e.g. logger.error or process_logger.error) with a
concise static message like "Fatal coordination failure:
MemoryError/RecursionError encountered" (no mutable state), then re-raise the
exception; update the block around that except in service.py to mirror the
logging pattern used in dispatchers.py.
---
Duplicate comments:
In `@src/ai_company/engine/coordination/dispatchers.py`:
- Around line 163-164: Before re-raising the caught MemoryError/RecursionError
in the except block that currently reads "except MemoryError, RecursionError:
raise", add a minimal, low-allocation log statement (use a module-level logger
created once, e.g. logger = logging.getLogger(__name__)) to record the exception
type and a short message, then re-raise; ensure the log call is simple
(logger.critical or logger.error) to avoid heavy allocations. Also verify which
MemoryError is being caught (the custom ai_company.memory.errors.MemoryError vs
the built-in) by checking imports in this module and, if the custom type is
intended, consider logging its module name in the message for clarity. Finally,
keep the log content minimal so it is safe during low-memory conditions and
preserve the existing re-raise behavior.
In `@tests/integration/engine/test_coordination_wiring.py`:
- Around line 144-180: The test replaces the real coordinator with an AsyncMock
(coordinator.coordinate side_effect) which intentionally isolates behavior but
misses wiring between RootConfig.coordination and the build_coordinator factory;
add a separate integration test that constructs a RootConfig with real
coordination settings and calls build_coordinator() (referencing
build_coordinator and RootConfig) and then invokes the coordinator.coordinate
flow to ensure the factory wiring and configuration are exercised end-to-end
rather than mocked.
In `@tests/unit/api/controllers/test_coordination.py`:
- Around line 129-166: The test mixes async setup (await
agent_registry.register) with synchronous TestClient usage
(coordination_client.post). Make the test fully async: in
TestCoordinationControllerHappyPath.test_coordinate_task_success, switch to
using an async HTTP client (await coordination_client.post(...)) or ensure
coordination_client is an AsyncClient fixture, and await both POST calls to
"/api/v1/tasks" and f"/api/v1/tasks/{task_id}/coordinate"; keep the async
agent_registry.register and leave mock_coordinator.coordinate.return_value
assignment as-is so the flow is consistently awaited.
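As one example of the level-selection fix from the coordination.py prompt above, a small helper can pick both event and level in one place; the constants here are assumed stand-ins for the project's real event names:

```python
import logging

# Assumed stand-ins for the project's event constants.
API_COORDINATION_COMPLETED = "api.coordination.completed"
API_COORDINATION_FAILED = "api.coordination.failed"


def outcome_event_and_level(is_success: bool) -> tuple[str, int]:
    """Choose the event name and log level for a coordination outcome.

    Successes stay at INFO; failures are promoted to WARNING so they
    stand out in log aggregation without changing the structured fields.
    """
    if is_success:
        return API_COORDINATION_COMPLETED, logging.INFO
    return API_COORDINATION_FAILED, logging.WARNING
```

The controller would then call `logger.log(level, event, ...)` (or pick `logger.info`/`logger.warning`) with the same structured kwargs in both branches.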
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 328e4540-4814-4449-8e9e-90d114832482
📒 Files selected for processing (19)
CLAUDE.md
docs/design/engine.md
docs/design/operations.md
docs/roadmap/index.md
src/ai_company/api/channels.py
src/ai_company/api/controllers/approvals.py
src/ai_company/api/controllers/coordination.py
src/ai_company/api/dto.py
src/ai_company/api/state.py
src/ai_company/api/ws_models.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/coordination/dispatchers.py
src/ai_company/engine/coordination/factory.py
src/ai_company/engine/coordination/service.py
src/ai_company/observability/events/coordination.py
tests/integration/engine/test_coordination_wiring.py
tests/unit/api/controllers/test_coordination.py
tests/unit/api/test_dto.py
tests/unit/api/test_state.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (7)
docs/**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Build docs with: uv run zensical build (output: _site/docs/) — use without --strict until zensical/backlog#72 is resolved
Files:
docs/roadmap/index.md, docs/design/operations.md, docs/design/engine.md
docs/**
📄 CodeRabbit inference engine (CLAUDE.md)
Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)
Files:
docs/roadmap/index.md, docs/design/operations.md, docs/design/engine.md
docs/roadmap/**
📄 CodeRabbit inference engine (CLAUDE.md)
Roadmap in docs/roadmap/ (status, open questions, future vision)
Files:
docs/roadmap/index.md
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Python 3.14+ required; use PEP 649 native lazy annotations (nofrom __future__ import annotations)
Use PEP 758 except syntax:except A, B:(no parentheses) — enforced by ruff on Python 3.14
Add type hints to all public functions; enforce mypy strict mode
Use Google style docstrings required on all public classes and functions — enforced by ruff D rules
Create new objects instead of mutating existing ones; for non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves — never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use@computed_fieldfor derived values instead of storing redundant fields; use NotBlankStr for all identifier/name fields (including optional and tuple variants)
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls) — prefer structured concurrency over bare create_task; existing code is being migrated incrementally
Enforce 88 character line length — ruff enforces this
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Files:
tests/unit/api/controllers/test_coordination.py
src/ai_company/api/channels.py
src/ai_company/engine/coordination/service.py
src/ai_company/observability/events/coordination.py
src/ai_company/api/dto.py
src/ai_company/api/ws_models.py
tests/unit/api/test_dto.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/coordination/factory.py
tests/unit/api/test_state.py
src/ai_company/api/state.py
tests/integration/engine/test_coordination_wiring.py
src/ai_company/engine/coordination/dispatchers.py
src/ai_company/api/controllers/approvals.py
src/ai_company/api/controllers/coordination.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, and `@pytest.mark.slow`
Use pytest-xdist via -n auto for parallel test execution — ALWAYS include -n auto when running pytest, never run tests sequentially
Prefer `@pytest.mark.parametrize` for testing similar cases
Use test-provider, test-small-001, etc. in tests instead of real vendor names
Run unit tests with: uv run pytest tests/ -m unit -n auto
Run integration tests with: uv run pytest tests/ -m integration -n auto
Run e2e tests with: uv run pytest tests/ -m e2e -n auto
Run full test suite with coverage: uv run pytest tests/ -n auto --cov=ai_company --cov-fail-under=80
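As a minimal illustration of the parametrize preference, a single bounds-checking helper with one table of cases; the helper and its values are hypothetical, not taken from the repo:

```python
import pytest


def within_bounds(value: int, low: int, high: int) -> bool:
    """True when value falls inside the inclusive [low, high] range."""
    return low <= value <= high


@pytest.mark.unit
@pytest.mark.parametrize(
    ("value", "expected"),
    [(0, False), (1, True), (50, True), (51, False)],
)
def test_within_bounds(value: int, expected: bool) -> None:
    # One test body, four cases: boundary values on both sides.
    assert within_bounds(value, 1, 50) is expected
```

Run with the unit-test command above (`uv run pytest tests/ -m unit -n auto`); each parameter tuple reports as its own test case.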
Files:
tests/unit/api/controllers/test_coordination.py
tests/unit/api/test_dto.py
tests/unit/api/test_state.py
tests/integration/engine/test_coordination_wiring.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Every module with business logic must import get_logger from ai_company.observability and define `logger = get_logger(__name__)`; never use `import logging` or `logging.getLogger()` or `print()` in application code
Always use logger variable name (not _logger or log)
Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Use structured kwargs in logging: logger.info(EVENT, key=value) — never logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging
Maintain 80% minimum code coverage — enforced in CI
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small aliases
Lint Python code with: uv run ruff check src/ tests/
Auto-fix lint issues with: uv run ruff check src/ tests/ --fix
Format code with: uv run ruff format src/ tests/
Type-check with strict mode: uv run mypy src/ tests/
Files:
src/ai_company/api/channels.py
src/ai_company/engine/coordination/service.py
src/ai_company/observability/events/coordination.py
src/ai_company/api/dto.py
src/ai_company/api/ws_models.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/coordination/factory.py
src/ai_company/api/state.py
src/ai_company/engine/coordination/dispatchers.py
src/ai_company/api/controllers/approvals.py
src/ai_company/api/controllers/coordination.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Retryable errors (is_retryable=True) include: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError; non-retryable errors raise immediately without retry; RetryExhaustedError signals all retries failed
Rate limiter respects RateLimitError.retry_after from providers — automatically pauses future requests
Files:
src/ai_company/api/channels.py
src/ai_company/engine/coordination/service.py
src/ai_company/observability/events/coordination.py
src/ai_company/api/dto.py
src/ai_company/api/ws_models.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/coordination/factory.py
src/ai_company/api/state.py
src/ai_company/engine/coordination/dispatchers.py
src/ai_company/api/controllers/approvals.py
src/ai_company/api/controllers/coordination.py
🧠 Learnings (11)
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to **/*.py : Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls) — prefer structured concurrency over bare create_task; existing code is being migrated incrementally
Applied to files:
tests/unit/api/controllers/test_coordination.py
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Applied to files:
src/ai_company/observability/events/coordination.py
src/ai_company/engine/coordination/factory.py
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/ai_company/**/*.py : Retryable errors (is_retryable=True) include: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError; non-retryable errors raise immediately without retry; RetryExhaustedError signals all retries failed
Applied to files:
src/ai_company/engine/agent_engine.py
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Every module with business logic must import get_logger from ai_company.observability and define logger = get_logger(__name__); never use import logging or logging.getLogger() or print() in application code
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use structured kwargs in logging: logger.info(EVENT, key=value) — never logger.info('msg %s', val)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
CLAUDE.md
src/ai_company/engine/coordination/dispatchers.py
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Always use logger variable name (not _logger or log)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use DEBUG logging for object creation, internal flow, and entry/exit of key functions
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Pure data models, enums, and re-exports do NOT need logging
Applied to files:
CLAUDE.md
🧬 Code graph analysis (7)
src/ai_company/engine/coordination/service.py (1)
src/ai_company/memory/errors.py (1)
MemoryError(13-14)
src/ai_company/api/dto.py (1)
src/ai_company/config/schema.py (1)
_validate_unique_agent_names(538-550)
tests/unit/api/test_dto.py (1)
src/ai_company/api/dto.py (8)
ApiResponse (39-57), ApproveRequest (247-256), CoordinateTaskRequest (274-313), CoordinationPhaseResponse (316-342), CoordinationResultResponse (345-373), success (55-57), success (94-96), is_success (371-373)
tests/unit/api/test_state.py (3)
src/ai_company/api/state.py (4)
coordinator (153-155), has_coordinator (158-160), agent_registry (163-165), has_agent_registry (168-170)
src/ai_company/api/errors.py (1)
ServiceUnavailableError (68-74)
src/ai_company/hr/registry.py (1)
AgentRegistryService(32-262)
tests/integration/engine/test_coordination_wiring.py (7)
src/ai_company/api/app.py (1)
create_app (351-491)
src/ai_company/core/agent.py (1)
AgentIdentity (266-342)
src/ai_company/core/role.py (1)
Authority (33-61)
src/ai_company/hr/registry.py (1)
AgentRegistryService (32-262)
src/ai_company/providers/enums.py (1)
FinishReason (15-22)
src/ai_company/engine/coordination/models.py (2)
CoordinationPhaseResult (64-96), CoordinationResult (128-198)
src/ai_company/engine/coordination/service.py (1)
coordinate(94-198)
src/ai_company/engine/coordination/dispatchers.py (1)
src/ai_company/memory/errors.py (1)
MemoryError(13-14)
src/ai_company/api/controllers/approvals.py (1)
src/ai_company/api/channels.py (1)
get_channels_plugin(33-40)
🪛 LanguageTool
CLAUDE.md
[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...
(EG_NO_COMMA)
🔇 Additional comments (33)
docs/roadmap/index.md (1)
9-9: LGTM! Documentation accurately reflects the new capability. The addition of "multi-agent coordination" to the Agent engine's feature list correctly documents the MultiAgentCoordinator integration completed in this PR. The placement in "Current Status" is appropriate given the comprehensive testing and integration work described in the PR objectives.
docs/design/operations.md (1)
958-958: Coordination endpoint documentation looks correct. Line 958 cleanly documents the new coordination trigger endpoint in the API surface.
src/ai_company/api/ws_models.py (1)
46-50: Coordination WebSocket event types are well integrated. The new lifecycle event variants are clear and follow the existing naming pattern.
src/ai_company/engine/agent_engine.py (1)
225-254: Coordinator delegation API is correctly implemented. The optional coordinator accessor plus guarded `coordinate()` path (warning + `ExecutionStateError` when absent) is robust and matches runtime wiring goals.
src/ai_company/api/state.py (1)
152-170: New AppState service accessors are consistent and safe. The coordinator and agent-registry properties correctly mirror the existing `_require_service`/`has_*` pattern.
tests/unit/api/test_state.py (1)
139-192: Good unit coverage for new AppState properties. These tests correctly exercise both success and `ServiceUnavailableError` paths for coordinator and agent registry.
src/ai_company/observability/events/coordination.py (1)
18-18: Event constant addition is clean and aligned. `COORDINATION_FACTORY_BUILT` extends the coordination observability namespace in the expected way.
src/ai_company/api/channels.py (1)
33-40: Shared ChannelsPlugin lookup helper is a solid addition. This centralizes plugin resolution logic and keeps controller-side usage cleaner.
docs/design/engine.md (1)
411-413: Engine design docs are updated in the right places. The new coordinator delegation note and added coordination config keys are clearly reflected.
Also applies to: 793-796
src/ai_company/engine/coordination/dispatchers.py (2)
173-178: LGTM: Added `exc_info=True` to warning logs. The addition of `exc_info=True` improves observability by including stack traces in error logs, aligning with best practices for debugging coordination failures.
210-211: Same pattern: MemoryError/RecursionError re-raises without logging. These blocks follow the same pattern as lines 163-164. The past review comment applies to all these locations.
Also applies to: 253-254, 336-337, 665-666, 769-770
tests/unit/api/test_dto.py (3)
186-236: LGTM: Comprehensive validation tests for CoordinateTaskRequest. Tests cover defaults, non-empty agent_names, max_length constraints, duplicate name rejection (case-insensitive), and parametrized bounds testing for `max_subtasks` and `max_concurrency_per_wave`. Good use of `@pytest.mark.parametrize` for edge cases.
239-272: LGTM: CoordinationPhaseResponse consistency validation tests. Tests properly verify the model validator rejects inconsistent success/error combinations and accepts valid configurations.
275-318: LGTM: CoordinationResultResponse computed field tests. Tests validate the `is_success` computed field behavior (all pass vs. any failure) and the `min_length=1` constraint on phases.
CLAUDE.md (1)
101-107: LGTM: Package structure documentation updated. The `api/` and `engine/` descriptions now accurately reflect the coordination endpoint and multi-agent coordination pipeline additions.
src/ai_company/engine/coordination/service.py (1)
517-527: LGTM: Skip handling for None rollup uses appropriate event. The code now logs `COORDINATION_PHASE_COMPLETED` with a descriptive note when skipping due to rollup being None, which is semantically correct: the phase didn't fail, it was skipped.
tests/integration/engine/test_coordination_wiring.py (1)
211-245: LGTM: API interaction validates response structure. The test properly validates response status codes, JSON structure, and field types. Assertions cover `parent_task_id`, `topology` (excluding AUTO), `total_duration_seconds`, and `phases` presence.
tests/unit/api/controllers/test_coordination.py (2)
238-296: LGTM: Error scenario coverage is thorough. Tests cover task not found (404), unknown agent name (422), no active agents (422), and coordination phase error (422). Assertions validate both status codes and error message content.
331-348: LGTM: 503 test validates missing coordinator handling. This test correctly uses the shared `test_client` fixture (which lacks a coordinator) to verify the 503 response when coordination is attempted without a configured coordinator.
src/ai_company/api/controllers/approvals.py (2)
47-66: LGTM: Refactored `_require_channels_plugin` with proper error handling. The helper now delegates to `get_channels_plugin()` and logs before raising `RuntimeError` when the plugin is missing. This aligns with the coding guideline requiring error paths to log before raising.
95-109: LGTM: Best-effort publish with proper exception handling. The updated `_publish_approval_event` correctly:
- Calls `_require_channels_plugin`, which logs if the plugin is missing
- Re-raises `MemoryError`/`RecursionError` immediately (PEP 758 syntax)
- Logs other failures with `exc_info=True` for debugging
src/ai_company/api/controllers/coordination.py (5)
45-52: LGTM: Task ID length validation at API boundary. The `_validate_task_id` helper rejects oversized task IDs early, preventing potential issues with downstream storage or processing. The 128-character limit is reasonable.
55-84: LGTM: Best-effort WebSocket publishing with proper exception handling. The function correctly returns early if the channels plugin is unavailable, re-raises fatal exceptions (`MemoryError`, `RecursionError`), and logs other failures with `exc_info=True`.
139-156: LGTM: Service availability checks before proceeding. The endpoint validates both `coordinator` and `agent_registry` availability upfront with appropriate logging and 503 responses. This prevents confusing errors later in the flow.
181-196: Task engine access order is acceptable given upfront checks. The past review noted that `app_state.task_engine` could raise 503 before a meaningful coordinator error. However, lines 141-155 now validate both `has_coordinator` and `has_agent_registry` upfront, so users get coordinator-specific 503s first. The `task_engine` access here is guarded by those checks in practice.
279-324: LGTM: Agent resolution with clear error handling. The `_resolve_agents` method handles both explicit agent name lookup (with 422 for unknown names) and fallback to all active agents (with 422 if none available). Logging covers both failure paths.
src/ai_company/api/dto.py (3)
274-313: LGTM — well-structured request DTO with proper constraints. The `CoordinateTaskRequest` correctly uses:
- `NotBlankStr` tuple with bounds (min 1, max 50 agents)
- Case-insensitive duplicate detection in `_validate_unique_agent_names`
- Sensible limits for `max_subtasks` and `max_concurrency_per_wave`
316-342: LGTM — consistency validator is correct and past feedback addressed. The `error: NotBlankStr | None` change ensures failed phases cannot have blank error messages, and the validator properly enforces mutual exclusivity between `success=True` (no error) and `success=False` (error required).
345-373: LGTM — computed `is_success` cleanly derives from phase results. The `@computed_field` pattern aligns with guidelines, and the implementation correctly returns `True` only when all phases succeed.
src/ai_company/engine/coordination/factory.py (4)
1-24: LGTM — module structure and imports follow guidelines. Event constants are correctly imported from domain-specific modules under `ai_company.observability.events.*`, matching the project convention. The TYPE_CHECKING guard keeps type-only imports clean.
47-74: LGTM — placeholder strategy correctly logs and raises on decomposition attempt. The `_NoProviderDecompositionStrategy` provides clear error messaging and uses the correct `DECOMPOSITION_FAILED` event constant. The `# noqa: ARG002` annotations correctly document that unused parameters are intentional for protocol conformance.
94-108: LGTM — workspace service construction follows conditional pattern correctly. Returns `None` when either dependency is missing, allowing the coordinator to function without workspace isolation when not configured.
111-180: LGTM — well-documented factory with clear wiring sequence. The `build_coordinator` function:
- Uses keyword-only arguments for clarity on a complex signature
- Has a comprehensive Google-style docstring documenting the dependency tree
- Logs at DEBUG level with structured kwargs before construction
- Follows a deterministic wiring order matching the documented steps
The `# noqa: PLR0913` is justified for a factory function that assembles multiple collaborating services.
```python
except MemoryError, RecursionError:
    raise
```
MemoryError/RecursionError re-raised without logging.
Same pattern as dispatchers.py—consider a minimal static log before re-raising to aid debugging fatal coordination failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/engine/coordination/service.py` around lines 179 - 180, The
except clause currently written as "except MemoryError, RecursionError:" should
log a minimal static error before re-raising; change it to the tuple form
"except (MemoryError, RecursionError):", call the module's logger (e.g.
logger.error or process_logger.error) with a concise static message like "Fatal
coordination failure: MemoryError/RecursionError encountered" (no mutable
state), then re-raise the exception; update the block around that except in
service.py to mirror the logging pattern used in dispatchers.py.
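A minimal sketch of the log-before-re-raise pattern the prompt asks for; the recording logger and the event-constant value are stand-ins for the project's real observability API, and the portable parenthesized `except` form is used here (the repo's ruff setup allows the PEP 758 unparenthesized spelling on Python 3.14):

```python
from collections.abc import Callable


class _Logger:
    """Recording stand-in for the project's structlog-style logger."""

    def __init__(self) -> None:
        self.errors: list[tuple[str, dict[str, str]]] = []

    def error(self, event: str, **kwargs: str) -> None:
        self.errors.append((event, kwargs))


logger = _Logger()
COORDINATION_PHASE_FAILED = "coordination.phase_failed"  # assumed value


def guarded_phase(phase: str, work: Callable[[], None]) -> None:
    try:
        work()
    except (MemoryError, RecursionError):
        # Static message plus context, logged before the fatal
        # exception propagates unchanged.
        logger.error(
            COORDINATION_PHASE_FAILED,
            phase=phase,
            error="Fatal: MemoryError or RecursionError",
        )
        raise
```

The key point is that the handler adds a breadcrumb without swallowing or wrapping the fatal exception.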
Connect the existing MultiAgentCoordinator to the config schema,
AppState, API layer, AgentEngine, and WebSocket events so it becomes
a usable runtime feature.
- Add CoordinationSectionConfig bridging YAML coordination section
to per-run CoordinationConfig
- Add build_coordinator() factory for constructing the full
dependency tree from config + AgentEngine
- Add coordinator param, property, and coordinate() method to
AgentEngine
- Extend AppState and create_app() with coordinator +
agent_registry slots
- Add coordination WsEventType entries and API event constants
- Add CoordinateTaskRequest/CoordinationResultResponse DTOs
- Add CoordinationController with POST /tasks/{id}/coordinate
- Export new public API from coordination package
- Add coordination field to RootConfig and config defaults
Pre-reviewed by 11 agents, 30 findings addressed:
- Add get_strategy_name() to _NoProviderDecompositionStrategy (bug fix)
- Add MemoryError/RecursionError guards to service.py (4 locations) and dispatchers.py (6 locations) — pre-existing safety gaps
- Split coordinate_task into helper methods (<50 lines each)
- Fix fail_fast: bool → bool | None to preserve section config defaults
- Use API_WS_SEND_FAILED for WebSocket publish errors
- Add log-before-raise on 3 error paths (controller, engine, factory)
- Add extra="forbid" to CoordinationSectionConfig
- DTOs: NotBlankStr for topology, computed_field for is_success, success/error validator, ge=0.0 constraints, min_length=1 on phases
- Add min_length=1 + uniqueness validator on agent_names
- Add le=50 upper bound on max_concurrency_per_wave
- Update CLAUDE.md: api/ + engine/ descriptions, 4 new event constants
- Add RootConfig docstring for coordination field
- Add test: is_success=False path, provider/model-only factory edges, extra="forbid" validation, replace broad pytest.raises(Exception)
- Log _phase_update_parent skip when rollup is None
Critical (6):
- Fix misused COORDINATION_STARTED event in factory.py (→ DECOMPOSITION_FAILED and new COORDINATION_FACTORY_BUILT)
- Branch log event based on result.is_success in controller _execute
- Fix _phase_update_parent skip event (COORDINATION_PHASE_STARTED → COORDINATION_PHASE_COMPLETED)
- Sanitize internal CoordinationPhaseError messages in API responses and WS payloads to prevent info leakage
- Add exc_info=True to 6 warning log calls in dispatchers.py
- Change CoordinationPhaseResponse.error to NotBlankStr | None

Major (14):
- Add has_agent_registry property + early-exit guard in controller
- Add task_id length validation (_MAX_TASK_ID_LEN = 128)
- Add max_length=50 to agent_names tuple in CoordinateTaskRequest
- Add DTO validation tests (CoordinateTaskRequest, CoordinationPhaseResponse, CoordinationResultResponse)
- Strengthen test_coordinate_with_specific_agents assertion
- Update AppState docstring with coordinator/agent_registry
- Fix MultiAgentCoordinator docstring (peer → dual-access)
- Fix integration test docstring (build_coordinator → mock)
- Update design spec: API Surface table, YAML config, AgentEngine section
- Update roadmap with multi-agent coordination
- Extract helpers from build_coordinator (under 50-line convention)

Medium (4):
- Add comment for reserved COORDINATION_PHASE_COMPLETED WsEventType
- Extract shared get_channels_plugin to channels.py
- Update CLAUDE.md with COORDINATION_FACTORY_BUILT event
- Add coordinator param to AgentEngine docstring Args
- Fix mypy errors in test_dto.py (type: ignore for dict unpacking)
- Add WS COORDINATION_FAILED event for unexpected exceptions in _execute
- Log at WARNING level (not INFO) when coordination result is_success=False
- Add max_length=128 to CoordinationResultResponse.parent_task_id
- Make _NoProviderDecompositionStrategy explicitly implement DecompositionStrategy
- Add logging to MemoryError/RecursionError handlers in service.py and dispatchers.py
- Add build_coordinator() factory wiring integration tests
- Fix CLAUDE.md "e.g." → "e.g.," and merge conflict resolution
Force-pushed from 3ab7f2d to e801bea.
Pull request overview
Adds first-class multi-agent coordination wiring from config → engine services → API endpoint, including DTOs/events and test coverage for the new coordination surface area.
Changes:
- Introduce `CoordinationSectionConfig` (RootConfig YAML section) and a `build_coordinator()` factory to wire `MultiAgentCoordinator`.
- Add API endpoint `POST /api/v1/tasks/{task_id}/coordinate` with request/response DTOs and WebSocket event publishing.
- Extend `AppState`/`AgentEngine` to optionally expose coordinator access and convenience delegation, plus unit/integration tests and docs updates.
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/engine/test_coordination_section_config.py | Unit tests for coordination YAML section defaults, validation, and conversion into runtime config. |
| tests/unit/engine/test_coordination_factory.py | Unit tests for coordinator factory wiring and placeholder decomposition strategy behavior. |
| tests/unit/engine/test_agent_engine.py | Adds tests for AgentEngine.coordinator and AgentEngine.coordinate() delegation/error path. |
| tests/unit/config/conftest.py | Ensures RootConfig factory includes the new coordination section for config tests. |
| tests/unit/api/test_state.py | Adds tests for new AppState coordinator and has_coordinator helpers. |
| tests/unit/api/test_dto.py | Adds validation tests for new coordination request/response DTOs. |
| tests/unit/api/controllers/test_coordination.py | Controller-level tests for the new coordination endpoint (happy path + error cases). |
| tests/integration/engine/test_coordination_wiring.py | Integration test covering bootstrap → app wiring → API call path for coordination. |
| src/ai_company/observability/events/coordination.py | Adds coordination factory-built event constant. |
| src/ai_company/observability/events/api.py | Adds API coordination lifecycle event constants. |
| src/ai_company/engine/coordination/service.py | Updates coordinator docs and adds explicit handling for MemoryError/RecursionError; adjusts parent-update phase behavior. |
| src/ai_company/engine/coordination/section_config.py | New YAML section model bridging config → runtime CoordinationConfig. |
| src/ai_company/engine/coordination/factory.py | New factory to construct a fully wired MultiAgentCoordinator from runtime services + config. |
| src/ai_company/engine/coordination/dispatchers.py | Improves fatal/system-error handling and exception logging detail during dispatcher phases. |
| src/ai_company/engine/coordination/__init__.py | Exports new coordination factory + section config for public module surface. |
| src/ai_company/engine/agent_engine.py | Adds optional coordinator dependency and coordinate() convenience method. |
| src/ai_company/config/schema.py | Adds coordination section to RootConfig schema. |
| src/ai_company/config/defaults.py | Adds empty coordination section to default config dict. |
| src/ai_company/api/ws_models.py | Adds websocket event types for coordination lifecycle. |
| src/ai_company/api/state.py | Adds coordinator + agent registry storage/accessors to AppState. |
| src/ai_company/api/dto.py | Adds coordination request/response DTOs. |
| src/ai_company/api/controllers/coordination.py | New controller implementing POST /tasks/{task_id}/coordinate. |
| src/ai_company/api/controllers/approvals.py | Refactors channels plugin lookup to use shared get_channels_plugin(). |
| src/ai_company/api/controllers/__init__.py | Registers CoordinationController in ALL_CONTROLLERS. |
| src/ai_company/api/channels.py | Adds shared get_channels_plugin() helper. |
| src/ai_company/api/app.py | Extends app factory to accept coordinator + agent registry in state wiring. |
| docs/roadmap/index.md | Updates roadmap to reflect coordination support in engine. |
| docs/design/operations.md | Documents new coordination endpoint in API surface table. |
| docs/design/engine.md | Documents AgentEngine.coordinate() and coordination YAML config fields. |
| CLAUDE.md | Updates repo structure notes and logging event guidance to include new coordination events. |
```python
logger.info(
    COORDINATION_PHASE_COMPLETED,
    phase="update_parent",
    note="Skipped — rollup is None (rollup phase failed)",
    success=True,
    duration_seconds=0.0,
)
```
```python
except Exception:
    _publish_ws_event(
        request,
        WsEventType.COORDINATION_FAILED,
        {"task_id": task_id, "error": "Unexpected coordination error"},
    )
    raise
```
Actionable comments posted: 5
♻️ Duplicate comments (1)
src/ai_company/engine/coordination/dispatchers.py (1)
686-687: ⚠️ Potential issue | 🟠 Major: Log context before fatal re-raise in per-wave paths.
Line 686 and Line 790 re-raise fatal exceptions without any phase/wave log, so crash localization is lost for context-dependent execution.
💡 Suggested patch
```diff
@@
-        except MemoryError, RecursionError:
-            raise
+        except MemoryError, RecursionError:
+            logger.error(
+                COORDINATION_PHASE_FAILED,
+                phase=f"workspace_setup_wave_{wave_idx}",
+                wave_index=wave_idx,
+                error=(
+                    "Fatal: MemoryError or RecursionError during "
+                    "workspace setup"
+                ),
+            )
+            raise
@@
-        except MemoryError, RecursionError:
-            raise
+        except MemoryError, RecursionError:
+            logger.error(
+                COORDINATION_PHASE_FAILED,
+                phase=f"execute_wave_{wave_idx}",
+                wave_index=wave_idx,
+                error=(
+                    "Fatal: MemoryError or RecursionError during "
+                    "wave execution"
+                ),
+            )
+            raise
```

As per coding guidelines, "All error paths must log at WARNING or ERROR with context before raising".
Also applies to: 790-791
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/coordination/dispatchers.py` around lines 686 - 687, The except handlers that re-raise fatal exceptions (the "except MemoryError, RecursionError:" blocks in the per-wave dispatch paths) must log contextual information before re-raising; modify those handlers to call the module's logger (e.g., logger or process_logger used elsewhere in dispatchers.py) to emit an ERROR/WARNING containing the current phase/wave identifier variables (e.g., wave, phase, wave_id or similar local context available in the surrounding function) and the exception details (exc_info=True or str(exception)) and then re-raise; update both occurrences that match this pattern so every fatal raise is preceded by a contextual log message naming the function and wave.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/design/engine.md`:
- Around line 793-796: The YAML snippet has fields (max_concurrency_per_wave,
fail_fast, enable_workspace_isolation, base_branch) incorrectly indented under
auto_topology_rules; move these keys to the same indentation level as
auto_topology_rules so they become top-level keys of the coordination section
per CoordinationSectionConfig, i.e., detach them from auto_topology_rules and
place them alongside it to match the expected schema used by the code that reads
CoordinationSectionConfig.
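Under the reviewer's reading of CoordinationSectionConfig, the flattened layout would look roughly like this; only the key names come from the review, and the values shown are illustrative placeholders, not the repo's defaults:

```yaml
coordination:
  auto_topology_rules: {}          # top-level key of the section
  max_concurrency_per_wave: 4      # siblings of auto_topology_rules,
  fail_fast: true                  # not nested beneath it
  enable_workspace_isolation: false
  base_branch: main
```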
In `@src/ai_company/api/channels.py`:
- Around line 33-40: Update the get_channels_plugin docstring to Google-style
with Args and Returns: describe the request parameter type and purpose in an
"Args:" section (Request[Any, Any, Any] representing the incoming
Starlette/FastAPI request) and add a "Returns:" section that documents it
returns a ChannelsPlugin instance or None; keep the existing one-line summary
and ensure the docstring mentions the ChannelsPlugin class and the function name
get_channels_plugin for clarity.
In `@src/ai_company/api/controllers/coordination.py`:
- Around line 261-275: The endpoint only publishes terminal WS events but never
emits the per-phase event "coordination.phase_completed", so add a
_publish_ws_event call for WsEventType.COORDINATION_PHASE_COMPLETED (or the
string "coordination.phase_completed") before emitting the terminal event; use
the existing request, include task_id, phase identifier/name (from result.phase
or result.completed_phases), topology (result.topology.value), is_success, and
duration so dashboards can observe intermediate phase completions, and if result
contains a list of completed phases loop and publish one event per phase prior
to the final _publish_ws_event call.
- Around line 253-259: The except Exception handler that calls _publish_ws_event
with WsEventType.COORDINATION_FAILED should also log the caught exception and
contextual data before re-raising; modify the except block around
_publish_ws_event (referencing _publish_ws_event,
WsEventType.COORDINATION_FAILED and task_id/request) to call the module logger
(e.g., logger.error or logger.exception) including the exception info and
task_id/request context, then publish the WS event and re-raise.
In `@src/ai_company/engine/coordination/factory.py`:
- Around line 77-109: The factory currently silently degrades when only one of
the paired deps is provided; update _build_decomposition_strategy to raise a
clear ValueError if exactly one of provider or decomposition_model is non-None
(instead of returning _NoProviderDecompositionStrategy), and similarly update
_build_workspace_service to raise a ValueError if exactly one of
workspace_strategy or workspace_config is non-None (instead of returning None);
reference the existing symbols LlmDecompositionStrategy,
_NoProviderDecompositionStrategy, WorkspaceIsolationService,
_build_decomposition_strategy and _build_workspace_service when making the
change so callers fail fast at construction with an explicit error message
indicating the missing partner dependency.
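The fail-fast check the comment requests boils down to rejecting an XOR of a dependency pair. A hedged sketch (the helper name is invented; the real factory applies this to the provider/decomposition_model and workspace_strategy/workspace_config pairs):

```python
def check_paired_dependency(name_a, value_a, name_b, value_b):
    """Raise ValueError if exactly one of two co-dependent deps is set.

    Sketch of the fail-fast-at-construction behavior: providing only one
    half of a pair is treated as a caller error instead of a silent
    degradation to a placeholder strategy or None.
    """
    if (value_a is None) != (value_b is None):  # XOR: exactly one is set
        missing = name_b if value_b is None else name_a
        provided = name_a if value_b is None else name_b
        raise ValueError(
            f"{provided} was provided without its partner dependency "
            f"{missing}; supply both or neither."
        )
```

Both-present and both-absent configurations pass unchanged, so the existing "fully configured" and "fully disabled" paths keep working.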
---
Duplicate comments:
In `@src/ai_company/engine/coordination/dispatchers.py`:
- Around line 686-687: The except handlers that re-raise fatal exceptions (the
"except MemoryError, RecursionError:" blocks in the per-wave dispatch paths)
must log contextual information before re-raising; modify those handlers to call
the module's logger (e.g., logger or process_logger used elsewhere in
dispatchers.py) to emit an ERROR/WARNING containing the current phase/wave
identifier variables (e.g., wave, phase, wave_id or similar local context
available in the surrounding function) and the exception details (exc_info=True
or str(exception)) and then re-raise; update both occurrences that match this
pattern so every fatal raise is preceded by a contextual log message naming the
function and wave.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: e3045e4a-819e-4900-97a9-6871dfd97bdd
📒 Files selected for processing (30)
- CLAUDE.md
- docs/design/engine.md
- docs/design/operations.md
- docs/roadmap/index.md
- src/ai_company/api/app.py
- src/ai_company/api/channels.py
- src/ai_company/api/controllers/__init__.py
- src/ai_company/api/controllers/approvals.py
- src/ai_company/api/controllers/coordination.py
- src/ai_company/api/dto.py
- src/ai_company/api/state.py
- src/ai_company/api/ws_models.py
- src/ai_company/config/defaults.py
- src/ai_company/config/schema.py
- src/ai_company/engine/agent_engine.py
- src/ai_company/engine/coordination/__init__.py
- src/ai_company/engine/coordination/dispatchers.py
- src/ai_company/engine/coordination/factory.py
- src/ai_company/engine/coordination/section_config.py
- src/ai_company/engine/coordination/service.py
- src/ai_company/observability/events/api.py
- src/ai_company/observability/events/coordination.py
- tests/integration/engine/test_coordination_wiring.py
- tests/unit/api/controllers/test_coordination.py
- tests/unit/api/test_dto.py
- tests/unit/api/test_state.py
- tests/unit/config/conftest.py
- tests/unit/engine/test_agent_engine.py
- tests/unit/engine/test_coordination_factory.py
- tests/unit/engine/test_coordination_section_config.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Agent
- GitHub Check: Build Web
- GitHub Check: Build Backend
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not use `from __future__ import annotations` in Python code—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.
Add type hints to all public functions. Enforce strict mypy mode.
Maintain line length of 88 characters—enforced by ruff.
Handle errors explicitly—never silently swallow exceptions.
Files:
src/ai_company/api/app.py, src/ai_company/api/controllers/coordination.py, src/ai_company/engine/coordination/section_config.py, src/ai_company/api/controllers/approvals.py, src/ai_company/api/dto.py, src/ai_company/api/ws_models.py, src/ai_company/api/channels.py, src/ai_company/api/controllers/__init__.py, tests/unit/api/test_state.py, tests/unit/api/test_dto.py, tests/unit/engine/test_coordination_section_config.py, src/ai_company/observability/events/coordination.py, src/ai_company/engine/coordination/dispatchers.py, src/ai_company/api/state.py, src/ai_company/engine/coordination/factory.py, tests/unit/config/conftest.py, tests/integration/engine/test_coordination_wiring.py, src/ai_company/config/schema.py, tests/unit/engine/test_coordination_factory.py, src/ai_company/engine/agent_engine.py, src/ai_company/observability/events/api.py, tests/unit/api/controllers/test_coordination.py, src/ai_company/engine/coordination/service.py, tests/unit/engine/test_agent_engine.py, src/ai_company/config/defaults.py, src/ai_company/engine/coordination/__init__.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Use Google-style docstrings on all public classes and functions—required and enforced by ruff D rules.
Immutability: Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement.
For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Distinguish between config and runtime state: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `@computed_field` for derived values instead of storing redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.
Keep functions under 50 lines and files under 800 lines.
Validate at system boundaries: user input, external APIs, config files.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases. In tests, use `test-provider`, `test-small-001`, etc. Vendor names may only appear in: (1) Operations design page provider list, (2) .claude/skill/agent files, (3) thir...
Files:
src/ai_company/api/app.py, src/ai_company/api/controllers/coordination.py, src/ai_company/engine/coordination/section_config.py, src/ai_company/api/controllers/approvals.py, src/ai_company/api/dto.py, src/ai_company/api/ws_models.py, src/ai_company/api/channels.py, src/ai_company/api/controllers/__init__.py, src/ai_company/observability/events/coordination.py, src/ai_company/engine/coordination/dispatchers.py, src/ai_company/api/state.py, src/ai_company/engine/coordination/factory.py, src/ai_company/config/schema.py, src/ai_company/engine/agent_engine.py, src/ai_company/observability/events/api.py, src/ai_company/engine/coordination/service.py, src/ai_company/config/defaults.py, src/ai_company/engine/coordination/__init__.py
src/ai_company/!(observability)/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/!(observability)/**/*.py: Every module with business logic must include `from ai_company.observability import get_logger` and set `logger = get_logger(__name__)`. Never use `import logging`, `logging.getLogger()`, or `print()` in application code. Use the variable name `logger` (not `_logger` or `log`).
Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Use structured logging kwargs: `logger.info(EVENT, key=value)`—never use format strings like `logger.info("msg %s", val)`.
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO level.
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions.
Files:
src/ai_company/api/app.py, src/ai_company/api/controllers/coordination.py, src/ai_company/engine/coordination/section_config.py, src/ai_company/api/controllers/approvals.py, src/ai_company/api/dto.py, src/ai_company/api/ws_models.py, src/ai_company/api/channels.py, src/ai_company/api/controllers/__init__.py, src/ai_company/engine/coordination/dispatchers.py, src/ai_company/api/state.py, src/ai_company/engine/coordination/factory.py, src/ai_company/config/schema.py, src/ai_company/engine/agent_engine.py, src/ai_company/engine/coordination/service.py, src/ai_company/config/defaults.py, src/ai_company/engine/coordination/__init__.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Apply pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow` to categorize tests.
Maintain 80% minimum code coverage—enforced in CI. Use `@pytest.mark.parametrize` for testing similar cases.
Files:
tests/unit/api/test_state.py, tests/unit/api/test_dto.py, tests/unit/engine/test_coordination_section_config.py, tests/unit/config/conftest.py, tests/integration/engine/test_coordination_wiring.py, tests/unit/engine/test_coordination_factory.py, tests/unit/api/controllers/test_coordination.py, tests/unit/engine/test_agent_engine.py
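The marker and `@pytest.mark.parametrize` guidelines can be sketched together like this (the `normalize_topology` function is an invented stand-in unit under test, not project code):

```python
import pytest


def normalize_topology(value: str) -> str:
    # Stand-in unit under test: canonicalize a topology string.
    return value.strip().lower()


@pytest.mark.unit
@pytest.mark.parametrize(
    ("raw", "expected"),
    [
        ("AUTO", "auto"),
        ("  Centralized ", "centralized"),
        ("sas", "sas"),
    ],
)
def test_normalize_topology(raw: str, expected: str) -> None:
    assert normalize_topology(raw) == expected
```

Each tuple becomes its own test case in the report, which keeps similar cases in one place without copy-pasted test functions.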
🧠 Learnings (10)
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/ai_company/api/app.py, src/ai_company/engine/coordination/factory.py, src/ai_company/observability/events/api.py, CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/ai_company/api/controllers/coordination.py, src/ai_company/engine/coordination/dispatchers.py, src/ai_company/engine/coordination/factory.py, src/ai_company/engine/coordination/service.py, CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All state transitions must log at INFO level.
Applied to files:
src/ai_company/api/controllers/coordination.py, src/ai_company/engine/coordination/service.py, CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/**/*.py : Distinguish between config and runtime state: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/ai_company/engine/coordination/section_config.py
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/providers/**/*.py : Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`. Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately. `RetryExhaustedError` signals all retries failed—catch at engine layer for fallback chains.
Applied to files:
tests/unit/config/conftest.py, src/ai_company/config/schema.py
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.
Applied to files:
tests/unit/api/controllers/test_coordination.py
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Every module with business logic must include `from ai_company.observability import get_logger` and set `logger = get_logger(__name__)`. Never use `import logging`, `logging.getLogger()`, or `print()` in application code. Use the variable name `logger` (not `_logger` or `log`).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use structured logging kwargs: `logger.info(EVENT, key=value)`—never use format strings like `logger.info("msg %s", val)`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use DEBUG logging for object creation, internal flow, and entry/exit of key functions.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.
Applied to files:
CLAUDE.md
🧬 Code graph analysis (19)
src/ai_company/api/app.py (3)
- src/ai_company/engine/coordination/service.py (1): MultiAgentCoordinator (50-589)
- src/ai_company/api/state.py (3): task_engine (120-122), coordinator (153-155), agent_registry (163-165)
- src/ai_company/hr/registry.py (1): AgentRegistryService (32-262)
src/ai_company/api/controllers/coordination.py (9)
- src/ai_company/api/channels.py (1): get_channels_plugin (33-40)
- src/ai_company/api/dto.py (4): CoordinateTaskRequest (274-313), success (55-57), success (94-96), is_success (371-373)
- src/ai_company/api/errors.py (3): ApiValidationError (32-38), NotFoundError (23-29), ServiceUnavailableError (68-74)
- src/ai_company/api/ws_models.py (2): WsEvent (53-78), WsEventType (20-50)
- src/ai_company/engine/coordination/models.py (2): CoordinationContext (30-61), CoordinationResult (128-198)
- src/ai_company/engine/errors.py (1): CoordinationPhaseError (132-153)
- src/ai_company/observability/_logger.py (1): get_logger (8-28)
- src/ai_company/api/state.py (4): has_coordinator (158-160), has_agent_registry (168-170), coordinator (153-155), agent_registry (163-165)
- src/ai_company/engine/coordination/service.py (1): coordinate (94-203)
src/ai_company/engine/coordination/section_config.py (2)
- src/ai_company/engine/coordination/config.py (1): CoordinationConfig (8-38)
- src/ai_company/engine/routing/models.py (1): AutoTopologyConfig (160-207)
src/ai_company/api/controllers/approvals.py (1)
- src/ai_company/api/channels.py (1): get_channels_plugin (33-40)
src/ai_company/api/dto.py (1)
- src/ai_company/config/schema.py (1): _validate_unique_agent_names (538-550)
src/ai_company/api/controllers/__init__.py (1)
- src/ai_company/api/controllers/coordination.py (1): CoordinationController (109-334)
tests/unit/api/test_state.py (2)
- src/ai_company/api/state.py (2): coordinator (153-155), has_coordinator (158-160)
- src/ai_company/api/errors.py (1): ServiceUnavailableError (68-74)
tests/unit/api/test_dto.py (1)
- src/ai_company/api/dto.py (6): CoordinateTaskRequest (274-313), CoordinationPhaseResponse (316-342), CoordinationResultResponse (345-373), success (55-57), success (94-96), is_success (371-373)
tests/unit/engine/test_coordination_section_config.py (3)
- src/ai_company/engine/coordination/config.py (1): CoordinationConfig (8-38)
- src/ai_company/engine/coordination/section_config.py (2): CoordinationSectionConfig (15-83), to_coordination_config (57-83)
- src/ai_company/engine/routing/models.py (1): AutoTopologyConfig (160-207)
src/ai_company/engine/coordination/dispatchers.py (1)
- src/ai_company/memory/errors.py (1): MemoryError (13-14)
src/ai_company/api/state.py (4)
- src/ai_company/engine/coordination/service.py (1): MultiAgentCoordinator (50-589)
- src/ai_company/hr/registry.py (1): AgentRegistryService (32-262)
- src/ai_company/engine/agent_engine.py (1): coordinator (226-228)
- tests/unit/api/controllers/test_coordination.py (1): agent_registry (80-81)
src/ai_company/engine/coordination/factory.py (6)
- src/ai_company/engine/decomposition/protocol.py (1): DecompositionStrategy (14-40)
- src/ai_company/engine/errors.py (1): DecompositionError (46-47)
- src/ai_company/config/schema.py (1): TaskAssignmentConfig (319-383)
- src/ai_company/engine/decomposition/models.py (2): DecompositionContext (262-287), DecompositionPlan (66-122)
- src/ai_company/engine/workspace/protocol.py (1): WorkspaceIsolationStrategy (14-90)
- src/ai_company/providers/protocol.py (1): CompletionProvider (21-80)
tests/unit/config/conftest.py (1)
- src/ai_company/engine/coordination/section_config.py (1): CoordinationSectionConfig (15-83)
tests/integration/engine/test_coordination_wiring.py (4)
- src/ai_company/api/app.py (1): create_app (351-491)
- src/ai_company/hr/registry.py (1): AgentRegistryService (32-262)
- src/ai_company/engine/coordination/factory.py (1): build_coordinator (111-180)
- src/ai_company/engine/coordination/models.py (2): CoordinationPhaseResult (64-96), CoordinationResult (128-198)
src/ai_company/config/schema.py (1)
- src/ai_company/engine/coordination/section_config.py (1): CoordinationSectionConfig (15-83)
tests/unit/engine/test_coordination_factory.py (1)
- src/ai_company/engine/errors.py (1): DecompositionError (46-47)
src/ai_company/engine/coordination/service.py (1)
- src/ai_company/memory/errors.py (1): MemoryError (13-14)
tests/unit/engine/test_agent_engine.py (3)
- src/ai_company/engine/agent_engine.py (2): coordinator (226-228), coordinate (230-253)
- src/ai_company/api/state.py (1): coordinator (153-155)
- src/ai_company/engine/errors.py (1): ExecutionStateError (17-18)
src/ai_company/engine/coordination/__init__.py (2)
- src/ai_company/engine/coordination/factory.py (1): build_coordinator (111-180)
- src/ai_company/engine/coordination/section_config.py (1): CoordinationSectionConfig (15-83)
🔇 Additional comments (35)
docs/roadmap/index.md (1)
9-9: Roadmap status update is accurate and consistent with the shipped feature set. This documentation change correctly reflects the new runtime capability for multi-agent coordination.
src/ai_company/config/defaults.py (1)
42-42: LGTM! The addition of the `"coordination": {}` default follows the established pattern for other config sections. An empty dict correctly allows `CoordinationSectionConfig`'s Pydantic field defaults to take effect.
docs/design/engine.md (1)
411-413: LGTM! The documentation for the optional `coordinate()` method delegation is clear and appropriately references the coordination documentation.
CLAUDE.md (3)
101-101: LGTM! Package description updated to reflect the new coordination endpoint.
107-107: LGTM! Engine package description accurately documents the new `CoordinationSectionConfig` bridge and `build_coordinator` factory.
154-154: LGTM! Event constants list expanded with the new coordination events (`API_COORDINATION_STARTED`, `API_COORDINATION_COMPLETED`, `API_COORDINATION_FAILED`, `API_COORDINATION_AGENT_RESOLVE_FAILED`, `COORDINATION_FACTORY_BUILT`). The previously flagged "e.g." comma issue has been addressed.
src/ai_company/api/controllers/approvals.py (2)
47-66: LGTM! The refactor to `_require_channels_plugin` correctly delegates to the shared `get_channels_plugin` helper and adds appropriate error handling with logging before raising. The docstring follows Google style with Args/Returns/Raises sections.
96-96: LGTM! Call site updated to use the renamed helper function.
docs/design/operations.md (1)
967-967: LGTM! API endpoint documentation added for the new coordination trigger endpoint. The description is accurate and placement in the table is appropriate.
tests/unit/config/conftest.py (2)
25-25: LGTM! Import added for the new `CoordinationSectionConfig` model.
100-100: LGTM! `RootConfigFactory` extended with the `coordination` field using default `CoordinationSectionConfig()`, following the established pattern for other config sections.
tests/unit/api/test_state.py (2)
139-164: LGTM! Comprehensive test coverage for the new `coordinator` property and `has_coordinator` flag. Tests follow the established pattern from `TestAppStateTaskEngine` and cover both the error case (None) and success case (set).
167-192: LGTM! Comprehensive test coverage for the new `agent_registry` property and `has_agent_registry` flag. Uses the actual `AgentRegistryService` class rather than a mock, which provides stronger type guarantees.
src/ai_company/engine/coordination/dispatchers.py (1)
163-169: Good fatal-error hardening in shared dispatch paths. These branches now preserve fatal exceptions and add contextual phase logs (`exc_info=True` on non-fatal paths), which improves diagnosability without downcasting critical failures.
Also applies to: 182-183, 215-221, 234-235, 263-269, 275-276, 351-358, 366-367
src/ai_company/api/controllers/__init__.py (1)
13-13: Controller registry/export wiring looks correct. `CoordinationController` is consistently imported, included in `ALL_CONTROLLERS`, and exposed via `__all__`.
Also applies to: 39-40, 53-53
tests/unit/engine/test_coordination_section_config.py (1)
15-122: Solid config test coverage. The new tests cover defaults, frozen behavior, override precedence, and validation boundaries comprehensively.
src/ai_company/observability/events/coordination.py (1)
18-18: Event constant addition is clean and consistent. `COORDINATION_FACTORY_BUILT` follows the existing coordination event naming scheme.
src/ai_company/config/schema.py (1)
24-24: Root config coordination wiring is correct. The new `coordination` field is properly typed, documented, and initialized with a safe default factory.
Also applies to: 420-420, 532-535
src/ai_company/engine/coordination/section_config.py (1)
15-83: Section config design is well-structured. This keeps company-level defaults immutable and cleanly derives per-run `CoordinationConfig` with override precedence.
Based on learnings, "Distinguish between config and runtime state: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model."
tests/unit/api/test_dto.py (1)
4-13: DTO validation tests are comprehensive. The new cases correctly exercise uniqueness, bounds, consistency validators, and computed success behavior.
Also applies to: 186-318
src/ai_company/api/app.py (1)
38-40: App wiring for coordination dependencies looks correct. `create_app()` and `AppState` integration for `coordinator` and `agent_registry` are consistent and safely optional.
Also applies to: 360-361, 376-377, 412-413
tests/unit/api/controllers/test_coordination.py (1)
1-348: LGTM! The test file provides comprehensive coverage of the coordination controller, including:
- Happy path scenarios (task coordination, specific agents, failed phases)
- Error scenarios (404, 422, 503)
- Proper use of `@pytest.mark.unit` markers on test classes
- Clean helper functions for test data creation
The async/sync pattern with `TestClient` works because pytest-asyncio handles the coroutine registration calls while HTTP assertions are synchronous.
src/ai_company/engine/coordination/__init__.py (1)
18-46: LGTM! The new public exports for `build_coordinator` and `CoordinationSectionConfig` are correctly added, maintaining alphabetical ordering in `__all__`.
tests/unit/engine/test_coordination_factory.py (1)
1-136: LGTM! Comprehensive unit tests for the `build_coordinator` factory covering:
- Basic coordinator creation
- Various configuration combinations (provider/model, workspace deps, custom min_score)
- Placeholder strategy behavior when provider is missing
- Clear error message verification for `_NoProviderDecompositionStrategy`
src/ai_company/api/ws_models.py (1)
46-51: LGTM! The new coordination WebSocket event types follow the established naming convention and are appropriately documented. The inline comment clarifying that `COORDINATION_PHASE_COMPLETED` is reserved for future use is helpful.
src/ai_company/observability/events/api.py (1)
44-49: LGTM! The new API coordination event constants follow the established naming pattern and are correctly typed with `Final[str]`.
tests/integration/engine/test_coordination_wiring.py (2)
218-254: Intentional mock usage documented. The past review comment noted that `AsyncMock` is used instead of `build_coordinator()`. However, the module docstring (lines 3-10) explicitly explains this design decision: the real `build_coordinator()` is unit-tested separately (in `TestBuildCoordinatorFactory`), while this test focuses on the API-to-coordinator wiring path.
This separation of concerns is valid—the integration test verifies that:
- `create_app` correctly wires the coordinator
- The `/coordinate` endpoint correctly resolves agents and invokes the coordinator
- Response structure matches DTOs
The `_mock_coordinate` function (lines 232-251) provides task-id-aware responses, enabling realistic assertions without requiring LLM provider dependencies.
1-319: LGTM! The integration test file provides solid coverage:
- Factory verification (`TestBuildCoordinatorFactory`) tests real `build_coordinator()` output
- Full wiring test (`TestCoordinationWiring`) validates bootstrap → API → coordination flow
- Proper `@pytest.mark.integration` markers
- Clean helper utilities for agents and mock provider
src/ai_company/api/state.py (1)
152-170: LGTM! The new `coordinator` and `agent_registry` accessors follow the established `_require_service` pattern, and both include corresponding `has_*` properties for conditional access. The past review comment requesting `has_agent_registry` has been addressed.
src/ai_company/engine/coordination/service.py (3)
219-220: Consistent `MemoryError, RecursionError` handling across all phases. The same PEP 758 `except MemoryError, RecursionError:` pattern is applied consistently in `_phase_decompose`, `_phase_route`, `_phase_dispatch`, `_phase_rollup`, and `_phase_update_parent`. This ensures fatal errors propagate without being wrapped in `CoordinationPhaseError`.
Assuming the builtin `MemoryError` concern above is verified, this pattern is correct.
Also applies to: 275-276, 416-417, 481-482, 573-574
522-532: Skip handling for `_phase_update_parent` correctly addressed. The past review comment about semantic mismatch has been resolved:
- Early return when `self._task_engine is None` (lines 522-523)
- When `rollup is None`, logs `COORDINATION_PHASE_COMPLETED` with `note="Skipped — rollup is None"` and `success=True` (lines 524-531)
This accurately reflects that the phase completed (by skipping) rather than failed.
179-185: The shadowing concern is not applicable here. `src/ai_company/engine/coordination/service.py` does not import `ai_company.memory.errors.MemoryError`, and no transitively-imported module brings it into scope. The `except MemoryError, RecursionError:` clause correctly catches the Python builtin `MemoryError`. The PEP 758 except syntax is correct per coding guidelines.
src/ai_company/engine/agent_engine.py (1)
225-253: Coordinator integration in `AgentEngine` is clean and fail-fast. The new `coordinator` property plus `coordinate()` delegation is straightforward, typed, and handles missing wiring explicitly.
tests/unit/engine/test_agent_engine.py (1)
1320-1369: Coordinator test coverage is solid for the new engine surface. This block validates both failure and delegation paths for `AgentEngine.coordinate()` and protects the new wiring contract.
src/ai_company/api/dto.py (1)
274-373: DTO validation and computed response semantics look correct. The new coordination DTOs enforce strong request/response invariants and keep `is_success` derived from `phases`, which avoids redundant state.
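The derived-not-stored idea behind `is_success` can be sketched with plain dataclasses (the real DTOs are Pydantic models with a computed field; the field names and the empty-phases behavior here are illustrative assumptions):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseResponse:
    phase: str
    success: bool


@dataclass(frozen=True)
class CoordinationResultResponse:
    phases: tuple[PhaseResponse, ...] = ()

    @property
    def is_success(self) -> bool:
        # Derived, never stored: success iff there is at least one phase
        # and every phase succeeded. Storing this as a field would let it
        # drift out of sync with `phases`.
        return bool(self.phases) and all(p.success for p in self.phases)
```

Because the value is always recomputed from `phases`, there is no redundant state to keep consistent across updates or deserialization.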
docs/design/engine.md
Outdated
max_concurrency_per_wave: null # None = unlimited
fail_fast: false
enable_workspace_isolation: true
base_branch: main
YAML structure appears incorrect: these fields belong at coordination level, not under auto_topology_rules.
Based on CoordinationSectionConfig (see section_config.py), max_concurrency_per_wave, fail_fast, enable_workspace_isolation, and base_branch are top-level fields of the coordination section, not nested under auto_topology_rules. The current indentation suggests they're children of auto_topology_rules.
📝 Suggested fix
coordination:
topology: "auto" # auto, sas, centralized, decentralized, context_dependent
auto_topology_rules:
# sequential tasks -> always single-agent
sequential_override: "sas"
# parallel tasks -> select based on domain structure
parallel_default: "centralized"
# mixed tasks -> SAS backbone for sequential phases, delegates parallel sub-tasks
mixed_default: "context_dependent" # hybrid: not a single topology -- engine selects per-phase
- max_concurrency_per_wave: null # None = unlimited
- fail_fast: false
- enable_workspace_isolation: true
- base_branch: main
+ max_concurrency_per_wave: null # None = unlimited
+ fail_fast: false
+ enable_workspace_isolation: true
+ base_branch: main
Note: If the current indentation is intentional (2-space indent at coordination level), then the YAML is correct but the visual alignment is misleading. Consider adjusting for clarity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/design/engine.md` around lines 793 - 796, The YAML snippet has fields
(max_concurrency_per_wave, fail_fast, enable_workspace_isolation, base_branch)
incorrectly indented under auto_topology_rules; move these keys to the same
indentation level as auto_topology_rules so they become top-level keys of the
coordination section per CoordinationSectionConfig, i.e., detach them from
auto_topology_rules and place them alongside it to match the expected schema
used by the code that reads CoordinationSectionConfig.
except Exception:
    _publish_ws_event(
        request,
        WsEventType.COORDINATION_FAILED,
        {"task_id": task_id, "error": "Unexpected coordination error"},
    )
    raise
Log unexpected coordination exceptions before re-raising.
This broad error path publishes a WS failure event but does not log the exception context before raise, which weakens incident triage.
Suggested fix
- except Exception:
+ except Exception as exc:
+ logger.exception(
+ API_COORDINATION_FAILED,
+ task_id=task_id,
+ error=f"{type(exc).__name__}: {exc}",
+ )
_publish_ws_event(
request,
WsEventType.COORDINATION_FAILED,
{"task_id": task_id, "error": "Unexpected coordination error"},
)
raise
Based on learnings: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/api/controllers/coordination.py` around lines 253 - 259, The
except Exception handler that calls _publish_ws_event with
WsEventType.COORDINATION_FAILED should also log the caught exception and
contextual data before re-raising; modify the except block around
_publish_ws_event (referencing _publish_ws_event,
WsEventType.COORDINATION_FAILED and task_id/request) to call the module logger
(e.g., logger.error or logger.exception) including the exception info and
task_id/request context, then publish the WS event and re-raise.
```python
ws_event_type = (
    WsEventType.COORDINATION_COMPLETED
    if result.is_success
    else WsEventType.COORDINATION_FAILED
)
_publish_ws_event(
    request,
    ws_event_type,
    {
        "task_id": task_id,
        "topology": result.topology.value,
        "is_success": result.is_success,
        "total_duration_seconds": result.total_duration_seconds,
    },
)
```
coordination.phase_completed is defined but never emitted.
The endpoint currently emits started + terminal events only. Per-phase progress events are missing, so dashboards cannot observe intermediate coordination phase completion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/api/controllers/coordination.py` around lines 261 - 275, The
endpoint only publishes terminal WS events but never emits the per-phase event
"coordination.phase_completed", so add a _publish_ws_event call for
WsEventType.COORDINATION_PHASE_COMPLETED (or the string
"coordination.phase_completed") before emitting the terminal event; use the
existing request, include task_id, phase identifier/name (from result.phase or
result.completed_phases), topology (result.topology.value), is_success, and
duration so dashboards can observe intermediate phase completions, and if result
contains a list of completed phases loop and publish one event per phase prior
to the final _publish_ws_event call.
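The per-phase emission the prompt asks for can be sketched as follows. This is an illustrative sketch only: `_publish_ws_event` is replaced by a recording stub, and the shapes of `result.phases` and the event-type strings are assumptions inferred from the review comments, not the project's actual API.

```python
# Sketch: emit one phase-progress event per completed phase, then the
# terminal event. All names here are stand-ins for the controller's helpers.
from dataclasses import dataclass, field


@dataclass
class PhaseResult:
    name: str
    success: bool


@dataclass
class CoordinationResult:
    phases: list[PhaseResult] = field(default_factory=list)
    is_success: bool = True


published: list[tuple[str, dict]] = []


def _publish_ws_event(event_type: str, payload: dict) -> None:
    # Stand-in for the real WS publisher; records events for inspection.
    published.append((event_type, payload))


def publish_progress(task_id: str, result: CoordinationResult) -> None:
    # One progress event per phase, published before the terminal event so
    # dashboards can observe intermediate coordination progress.
    for phase in result.phases:
        _publish_ws_event(
            "coordination.phase_completed",
            {"task_id": task_id, "phase": phase.name, "success": phase.success},
        )
    terminal = (
        "coordination.completed" if result.is_success else "coordination.failed"
    )
    _publish_ws_event(terminal, {"task_id": task_id})


publish_progress(
    "t-1",
    CoordinationResult(
        phases=[PhaseResult("decompose", True), PhaseResult("route", True)]
    ),
)
# published now holds two phase events followed by one terminal event
```

The same loop applies on the failure path, fed from `exc.partial_phases` instead of `result.phases`.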
- Revert MemoryError/RecursionError logging in dispatchers.py back to bare raise (logging itself can trigger MemoryError during low-memory conditions)
- Same revert in service.py coordinate() method
- Add logger.exception() with structured context before WS publish in coordination controller's generic except Exception handler
- Fix misleading success=True log when rollup is None — now logs as COORDINATION_PHASE_FAILED and appends failed phase to results
- Add fail-fast ValueError in factory when mismatched deps are provided (e.g. provider without decomposition_model or vice versa)
- Improve get_channels_plugin docstring to Google-style (Args/Returns)
- Fix engine.md YAML coordination config comment alignment
- Update factory tests to expect ValueError for mismatched deps
Actionable comments posted: 5
♻️ Duplicate comments (1)
src/ai_company/api/controllers/coordination.py (1)
233-250: ⚠️ Potential issue | 🟠 Major

`coordination.phase_completed` is still never emitted. Neither `exc.partial_phases` on failure nor `result.phases` on success is published as `WsEventType.COORDINATION_PHASE_COMPLETED`, so subscribers only ever see `started` and terminal events. That leaves the new phase-progress event type effectively dead and dashboards still miss intermediate coordination progress.

Also applies to: 266-280
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/api/controllers/coordination.py` around lines 233 - 250, When handling coordination success and failure in coordination.py, emit WsEventType.COORDINATION_PHASE_COMPLETED with the intermediate phase list: in the except CoordinationPhaseError block (where exc is available) call _publish_ws_event(request, WsEventType.COORDINATION_PHASE_COMPLETED, {"task_id": task_id, "phases": exc.partial_phases}) before publishing COORDINATION_FAILED/raising ApiValidationError, and in the success path (where result is returned) call _publish_ws_event(request, WsEventType.COORDINATION_PHASE_COMPLETED, {"task_id": task_id, "phases": result.phases}) before publishing the terminal success event; ensure you reference the existing _publish_ws_event helper and include task_id and the phase list in the payload so subscribers see intermediate progress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/api/controllers/coordination.py`:
- Around line 233-250: The current except block turns every
CoordinationPhaseError into an ApiValidationError (4xx); instead, update the
except in coordination.py to distinguish user-correctable phases from
internal/server faults: inspect exc.phase (or an explicit flag on
CoordinationPhaseError if available) and only raise ApiValidationError for known
client-fixable phases (e.g., validation/input-related phases); for internal
phases (e.g., decomposition/provider resolution, runtime MultiAgentCoordinator
failures) log the error, publish the same WsEvent via _publish_ws_event with
WsEventType.COORDINATION_FAILED, and raise an appropriate server-side error (or
re-raise the original exception) so it surfaces as a 5xx; keep the existing
logger.warning and _publish_ws_event behavior but change the exception type
raised based on the phase/flag in the CoordinationPhaseError.
In `@src/ai_company/engine/coordination/dispatchers.py`:
- Around line 163-164: Add a brief inline comment above the except MemoryError,
RecursionError: block explaining that logging is intentionally omitted because
emitting logs may itself raise MemoryError or RecursionError and that these are
re-raised immediately per the fatal-error policy; explicitly note that these are
the built-in exceptions (not ai_company.memory.errors.MemoryError) to avoid
confusion and reference the coding guideline requiring documentation when
skipping logging in fatal error paths.
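The pattern this prompt documents can be illustrated in isolation. A sketch using the built-in `MemoryError`/`RecursionError` (not `ai_company.memory.errors.MemoryError`); the parenthesized `except` form is used here for portability, though the repo's ruff config prefers the PEP 758 unparenthesized syntax on Python 3.14:

```python
# Fatal errors are re-raised bare; ordinary errors get context recorded first.
caught: list[str] = []


def handle(fn) -> None:
    try:
        fn()
    except (MemoryError, RecursionError):
        # Intentionally no logging: emitting a log record can allocate memory
        # or recurse, re-triggering the very condition being handled.
        raise
    except Exception as exc:
        # Ordinary errors get context captured (logged, in real code) before
        # propagating, per the error-path logging guideline.
        caught.append(f"{type(exc).__name__}: {exc}")
        raise


def fatal() -> None:
    raise RecursionError("simulated")


def ordinary() -> None:
    raise ValueError("bad input")


for fn in (fatal, ordinary):
    try:
        handle(fn)
    except BaseException:
        pass

print(caught)  # → ['ValueError: bad input']
```

The fatal path leaves `caught` untouched: the re-raise happens with no side effects at all.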
In `@src/ai_company/engine/coordination/factory.py`:
- Around line 185-191: The COORDINATION_FACTORY_BUILT telemetry is emitted too
early; move the logger.debug call so it runs only after both
_build_decomposition_strategy() and _build_workspace_service() complete
successfully (or alternatively emit a distinct "starting_build" event before
construction and keep COORDINATION_FACTORY_BUILT for post-success), i.e.,
relocate the logger.debug(COORDINATION_FACTORY_BUILT, ...) currently before the
calls to _build_decomposition_strategy and _build_workspace_service to after the
construction code path that produces the coordinator (or add a new pre-build
constant/event name and use that before starting the two build calls).
- Around line 96-103: Before raising the ValueError when exactly one of provider
or decomposition_model is supplied, log a structured warning or error that
includes which of the two was given and which is missing (e.g.,
logger.error("Invalid decomposition wiring: given=%s missing=%s", given,
missing) or logger.warning(...)), then raise; apply the same change to the other
fail-fast branch around the 126-141 region so every error path logs context (use
the module logger variable name `logger` and include the `provider` and
`decomposition_model` values/state in the log).
In `@tests/unit/engine/test_coordination_factory.py`:
- Around line 38-123: The tests currently only assert build_coordinator doesn't
raise; update the relevant tests (those using build_coordinator, e.g.,
test_with_provider_and_model, test_with_task_engine, test_with_workspace_deps,
test_custom_min_score, test_shutdown_manager_passed_to_executor) to also assert
that the provided inputs are propagated to collaborators: verify the provider
and decomposition_model are passed into the created coordinator/executor
(reference build_coordinator, provider, decomposition_model), that task_engine
is attached or injected (reference task_engine), that workspace_strategy and
workspace_config are wired into the workspace component created by
build_coordinator (reference workspace_strategy, workspace_config), that
TaskAssignmentConfig.min_score is used by the scorer (reference
TaskAssignmentConfig(min_score)), and that shutdown_manager is forwarded to the
executor (reference shutdown_manager); additionally add tests asserting that
passing only workspace_strategy or only workspace_config raises the expected
failure (mirror existing provider/model missing tests). Ensure assertions
inspect collaborator state or call args on the coordinator/MultiAgentCoordinator
or its created children rather than only isinstance checks.
---
Duplicate comments:
In `@src/ai_company/api/controllers/coordination.py`:
- Around line 233-250: When handling coordination success and failure in
coordination.py, emit WsEventType.COORDINATION_PHASE_COMPLETED with the
intermediate phase list: in the except CoordinationPhaseError block (where exc
is available) call _publish_ws_event(request,
WsEventType.COORDINATION_PHASE_COMPLETED, {"task_id": task_id, "phases":
exc.partial_phases}) before publishing COORDINATION_FAILED/raising
ApiValidationError, and in the success path (where result is returned) call
_publish_ws_event(request, WsEventType.COORDINATION_PHASE_COMPLETED, {"task_id":
task_id, "phases": result.phases}) before publishing the terminal success event;
ensure you reference the existing _publish_ws_event helper and include task_id
and the phase list in the payload so subscribers see intermediate progress.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: cfe1a0cc-daad-4a98-bf2b-3845472e0461
📒 Files selected for processing (7)
- docs/design/engine.md
- src/ai_company/api/channels.py
- src/ai_company/api/controllers/coordination.py
- src/ai_company/engine/coordination/dispatchers.py
- src/ai_company/engine/coordination/factory.py
- src/ai_company/engine/coordination/service.py
- tests/unit/engine/test_coordination_factory.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`**/*.py`: Do not use `from __future__ import annotations` in Python code—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.
Add type hints to all public functions. Enforce strict mypy mode.
Maintain line length of 88 characters—enforced by ruff.
Handle errors explicitly—never silently swallow exceptions.
Files:
- tests/unit/engine/test_coordination_factory.py
- src/ai_company/api/channels.py
- src/ai_company/engine/coordination/service.py
- src/ai_company/engine/coordination/factory.py
- src/ai_company/engine/coordination/dispatchers.py
- src/ai_company/api/controllers/coordination.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`tests/**/*.py`: Apply pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow` to categorize tests.
Maintain 80% minimum code coverage—enforced in CI. Use `@pytest.mark.parametrize` for testing similar cases.
Files:
- tests/unit/engine/test_coordination_factory.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`src/**/*.py`: Use Google-style docstrings on all public classes and functions—required and enforced by ruff D rules.
Immutability: Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement.
For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Distinguish between config and runtime state: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `@computed_field` for derived values instead of storing redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.
Keep functions under 50 lines and files under 800 lines.
Validate at system boundaries: user input, external APIs, config files.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases. In tests, use `test-provider`, `test-small-001`, etc. Vendor names may only appear in: (1) Operations design page provider list, (2) .claude/skill/agent files, (3) thir...
Files:
- src/ai_company/api/channels.py
- src/ai_company/engine/coordination/service.py
- src/ai_company/engine/coordination/factory.py
- src/ai_company/engine/coordination/dispatchers.py
- src/ai_company/api/controllers/coordination.py
src/ai_company/!(observability)/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`src/ai_company/!(observability)/**/*.py`: Every module with business logic must include `from ai_company.observability import get_logger` and set `logger = get_logger(__name__)`. Never use `import logging`, `logging.getLogger()`, or `print()` in application code. Use the variable name `logger` (not `_logger` or `log`).
Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Use structured logging kwargs: `logger.info(EVENT, key=value)`—never use format strings like `logger.info("msg %s", val)`.
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO level.
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions.
Files:
- src/ai_company/api/channels.py
- src/ai_company/engine/coordination/service.py
- src/ai_company/engine/coordination/factory.py
- src/ai_company/engine/coordination/dispatchers.py
- src/ai_company/api/controllers/coordination.py
🧠 Learnings (3)
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
- src/ai_company/engine/coordination/service.py
- src/ai_company/engine/coordination/factory.py
- src/ai_company/engine/coordination/dispatchers.py
- src/ai_company/api/controllers/coordination.py
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All state transitions must log at INFO level.
Applied to files:
- src/ai_company/engine/coordination/service.py
- src/ai_company/api/controllers/coordination.py
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/ai_company/engine/coordination/factory.py
🧬 Code graph analysis (3)
tests/unit/engine/test_coordination_factory.py (3)
- src/ai_company/config/schema.py (1): `TaskAssignmentConfig` (319-383)
- src/ai_company/engine/coordination/factory.py (3): `_NoProviderDecompositionStrategy` (47-74), `build_coordinator` (145-214), `decompose` (59-74)
- src/ai_company/engine/errors.py (1): `DecompositionError` (46-47)

src/ai_company/engine/coordination/service.py (3)
- src/ai_company/memory/errors.py (1): `MemoryError` (13-14)
- src/ai_company/engine/coordination/models.py (1): `CoordinationPhaseResult` (64-96)
- src/ai_company/api/dto.py (2): `success` (55-57), `success` (94-96)

src/ai_company/engine/coordination/dispatchers.py (1)
- src/ai_company/memory/errors.py (1): `MemoryError` (13-14)
🔇 Additional comments (4)
src/ai_company/api/channels.py (2)
7-30: Clean typing-only import setup. Using `TYPE_CHECKING` for `Request` keeps runtime imports lean while preserving precise type hints.

33-47: `get_channels_plugin()` looks correct and well-documented. The helper is concise, has a clear return contract, and the docstring now follows the expected public-function format.

As per coding guidelines, `src/**/*.py`: "Use Google-style docstrings on all public classes and functions—required and enforced by ruff D rules."

src/ai_company/engine/coordination/dispatchers.py (2)

173-178: LGTM! Good addition of `exc_info=True` for enhanced observability. The `exc_info=True` parameter ensures tracebacks are captured in error logs, improving debuggability. This pattern is consistently applied across all exception handlers in the file.

210-211: Consistent application of the fatal error handling pattern. These blocks follow the same pattern established at lines 163-164: immediate re-raise of `MemoryError`/`RecursionError` without logging, with `exc_info=True` added to the subsequent general exception handlers. The consistency is good. Also applies to: 253-254, 336-337, 665-666, 769-770
```python
except CoordinationPhaseError as exc:
    logger.warning(
        API_COORDINATION_FAILED,
        task_id=task_id,
        phase=exc.phase,
        error=str(exc),
    )
    client_msg = f"Coordination failed at phase {exc.phase!r}"
    _publish_ws_event(
        request,
        WsEventType.COORDINATION_FAILED,
        {
            "task_id": task_id,
            "phase": exc.phase,
            "error": client_msg,
        },
    )
    raise ApiValidationError(client_msg) from exc
```
Don't collapse internal coordination failures into ApiValidationError.
CoordinationPhaseError also wraps server-side faults from the coordinator stack (for example the no-provider decomposition path in src/ai_company/engine/coordination/factory.py and runtime failures wrapped by MultiAgentCoordinator). Turning the whole class into a 4xx response tells clients their request was invalid when the real problem is a 5xx/misconfiguration path. Reserve ApiValidationError for genuinely user-correctable phases and let internal failures surface as server errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/api/controllers/coordination.py` around lines 233 - 250, The
current except block turns every CoordinationPhaseError into an
ApiValidationError (4xx); instead, update the except in coordination.py to
distinguish user-correctable phases from internal/server faults: inspect
exc.phase (or an explicit flag on CoordinationPhaseError if available) and only
raise ApiValidationError for known client-fixable phases (e.g.,
validation/input-related phases); for internal phases (e.g.,
decomposition/provider resolution, runtime MultiAgentCoordinator failures) log
the error, publish the same WsEvent via _publish_ws_event with
WsEventType.COORDINATION_FAILED, and raise an appropriate server-side error (or
re-raise the original exception) so it surfaces as a 5xx; keep the existing
logger.warning and _publish_ws_event behavior but change the exception type
raised based on the phase/flag in the CoordinationPhaseError.
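The phase-based split the prompt describes can be sketched in isolation. This is an illustrative sketch, not the project's code: the phase names, the `CLIENT_FIXABLE_PHASES` set, and `ApiInternalError` are assumptions standing in for whatever error taxonomy the project actually uses.

```python
# Only user-correctable phases become 4xx; everything else stays server-side.
CLIENT_FIXABLE_PHASES = {"validation", "agent_resolution"}


class CoordinationPhaseError(Exception):
    def __init__(self, phase: str, message: str) -> None:
        super().__init__(message)
        self.phase = phase


class ApiValidationError(Exception):
    """Surfaces as a 4xx response."""


class ApiInternalError(Exception):
    """Surfaces as a 5xx response (hypothetical name)."""


def map_phase_error(exc: CoordinationPhaseError) -> Exception:
    # Internal phases (decomposition, provider resolution, runtime coordinator
    # failures) map to a server error instead of blaming the client's request.
    if exc.phase in CLIENT_FIXABLE_PHASES:
        return ApiValidationError(f"Coordination failed at phase {exc.phase!r}")
    return ApiInternalError(f"Internal failure in phase {exc.phase!r}")


print(type(map_phase_error(CoordinationPhaseError("validation", "x"))).__name__)
# → ApiValidationError
print(type(map_phase_error(CoordinationPhaseError("decomposition", "x"))).__name__)
# → ApiInternalError
```

An explicit `client_fixable: bool` flag on `CoordinationPhaseError` would be more robust than a phase-name allowlist, since new phases then default to the safe 5xx path.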
- Add inline comment to dispatchers.py explaining bare re-raise pattern for MemoryError/RecursionError (logging omitted intentionally as it may itself trigger MemoryError; these are built-in exceptions)
- Move COORDINATION_FACTORY_BUILT log to after successful construction
- Add structured logging before ValueError raises in factory mismatched deps paths (decomposition and workspace)
- Deepen factory tests: verify provider/model wiring into decomposition strategy, task_engine injection, workspace service creation, min_score forwarding to scorer, shutdown_manager/engine forwarding to executor
- Add workspace mismatched-deps tests (strategy-only, config-only)
```python
def to_coordination_config(
    self,
    *,
    max_concurrency_per_wave: int | None = None,
    fail_fast: bool | None = None,
) -> CoordinationConfig:
    """Convert to a per-run ``CoordinationConfig``.

    Request-level overrides take precedence over section defaults.

    Args:
        max_concurrency_per_wave: Override for max concurrency.
        fail_fast: Override for fail-fast behaviour.

    Returns:
        A ``CoordinationConfig`` with merged values.
    """
    return CoordinationConfig(
        max_concurrency_per_wave=(
            max_concurrency_per_wave
            if max_concurrency_per_wave is not None
            else self.max_concurrency_per_wave
        ),
        fail_fast=fail_fast if fail_fast is not None else self.fail_fast,
        enable_workspace_isolation=self.enable_workspace_isolation,
        base_branch=self.base_branch,
    )
```
topology field silently dropped — YAML setting has no runtime effect
CoordinationSectionConfig.topology is documented as "Default coordination topology" and exposed in the YAML coordination: block, but to_coordination_config() never forwards it to CoordinationConfig. CoordinationConfig has no topology field, and the factory only uses it for a logger.debug call. A user who sets topology: CENTRALIZED in their YAML gets no error and no effect — the TopologySelector still resolves topology dynamically from routing decisions.
Either:
- Add a `topology` field to `CoordinationConfig` and pass it through here so `TopologySelector` can use it as a default/override, **or**
- Remove the `topology` field from `CoordinationSectionConfig` (since it's already controlled through `auto_topology_rules`), **or**
- Update the field description to make clear it is purely informational / a logging hint.
As-is, this is a broken configuration contract: operators configure a value that the system silently ignores.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/engine/coordination/section_config.py
Line: 57-83
Comment:
**`topology` field silently dropped — YAML setting has no runtime effect**
`CoordinationSectionConfig.topology` is documented as "Default coordination topology" and exposed in the YAML `coordination:` block, but `to_coordination_config()` never forwards it to `CoordinationConfig`. `CoordinationConfig` has no `topology` field, and the factory only uses it for a `logger.debug` call. A user who sets `topology: CENTRALIZED` in their YAML gets no error and no effect — the `TopologySelector` still resolves topology dynamically from routing decisions.
Either:
- Add a `topology` field to `CoordinationConfig` and pass it through here so `TopologySelector` can use it as a default/override, **or**
- Remove the `topology` field from `CoordinationSectionConfig` (since it's already controlled through `auto_topology_rules`), **or**
- Update the field description to make clear it is purely informational / a logging hint.
As-is, this is a broken configuration contract: operators configure a value that the system silently ignores.
How can I resolve this? If you propose a fix, please make it concise.

```python
logger.warning(
    COORDINATION_FACTORY_BUILT,
    note="Mismatched workspace dependencies",
    given=given,
    missing=missing,
)
raise ValueError(msg)
```
Wrong event constant on error path
COORDINATION_FACTORY_BUILT semantically means "factory construction succeeded", so using it here on the mismatched-deps error path is misleading. Log aggregators or dashboards that filter on this event name will silently include failed construction attempts alongside successful ones.
Compare with _build_decomposition_strategy directly above, which correctly uses DECOMPOSITION_FAILED for its analogous error path.
```python
logger.warning(
    COORDINATION_FACTORY_BUILT,
    note="Mismatched workspace dependencies",
    given=given,
    missing=missing,
)
raise ValueError(msg)
```

The `logger.warning(COORDINATION_FACTORY_BUILT, ...)` call here
should be replaced with a dedicated failure event (e.g. COORDINATION_FACTORY_BUILD_FAILED) or reuse an existing error-scoped constant.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/engine/coordination/factory.py
Line: 147-153
Comment:
**Wrong event constant on error path**
`COORDINATION_FACTORY_BUILT` semantically means "factory construction succeeded", so using it here on the mismatched-deps error path is misleading. Log aggregators or dashboards that filter on this event name will silently include failed construction attempts alongside successful ones.
Compare with `_build_decomposition_strategy` directly above, which correctly uses `DECOMPOSITION_FAILED` for its analogous error path.
```suggestion
logger.warning(
COORDINATION_FACTORY_BUILT,
```
should be replaced with a dedicated failure event (e.g. `COORDINATION_FACTORY_BUILD_FAILED`) or reuse an existing error-scoped constant.
How can I resolve this? If you propose a fix, please make it concise.

```python
@model_validator(mode="after")
def _validate_unique_agent_names(self) -> Self:
    """Reject duplicate agent names."""
    if self.agent_names is not None:
        seen: set[str] = set()
        for name in self.agent_names:
            lower = name.lower()
            if lower in seen:
                msg = f"Duplicate agent name: {name!r}"
                raise ValueError(msg)
            seen.add(lower)
    return self
```
Case-insensitive dedup may reject valid distinct agents
_validate_unique_agent_names normalises names to lowercase before checking for duplicates (lower = name.lower()), but registry.get_by_name(name) is called with the original, un-normalised string. If the registry is case-sensitive, "Alice" and "alice" are two different agents, yet this validator would reject them as duplicates at the request boundary with a confusing Duplicate agent name: 'alice' error.
If the registry is case-insensitive, the dedup is correct but should document that assumption. If it is case-sensitive (or the behaviour is unspecified), the comparison should match registry semantics:
```python
seen: set[str] = set()
for name in self.agent_names:
    if name in seen:  # exact-match — mirrors registry lookup
        msg = f"Duplicate agent name: {name!r}"
        raise ValueError(msg)
    seen.add(name)
```

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/api/dto.py
Line: 302-313
Comment:
**Case-insensitive dedup may reject valid distinct agents**
`_validate_unique_agent_names` normalises names to lowercase before checking for duplicates (`lower = name.lower()`), but `registry.get_by_name(name)` is called with the original, un-normalised string. If the registry is case-sensitive, `"Alice"` and `"alice"` are two different agents, yet this validator would reject them as duplicates at the request boundary with a confusing `Duplicate agent name: 'alice'` error.
If the registry is case-insensitive, the dedup is correct but should document that assumption. If it is case-sensitive (or the behaviour is unspecified), the comparison should match registry semantics:
```python
seen: set[str] = set()
for name in self.agent_names:
if name in seen: # exact-match — mirrors registry lookup
msg = f"Duplicate agent name: {name!r}"
raise ValueError(msg)
seen.add(name)
```
How can I resolve this? If you propose a fix, please make it concise.
Pull request overview
Adds first-class multi-agent coordination support to the API/runtime by wiring a MultiAgentCoordinator through config, app state, and a new /tasks/{task_id}/coordinate endpoint, plus extensive unit/integration coverage and documentation updates.
Changes:
- Introduces coordination config bridge (`CoordinationSectionConfig`) and a `build_coordinator()` factory for fully wiring coordinator dependencies.
- Adds coordination API endpoint + DTOs + WS event publication, and extends `AppState`/`create_app()` to carry coordinator + agent registry.
- Improves coordination error handling/logging (incl. MemoryError/RecursionError re-raise guards) and expands observability event constants.
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/engine/test_coordination_section_config.py | Unit tests for new YAML coordination section config defaults/validation/merging. |
| tests/unit/engine/test_coordination_factory.py | Unit tests for build_coordinator() wiring and placeholder decomposition strategy behavior. |
| tests/unit/engine/test_agent_engine.py | Tests for new optional coordinator plumbing on AgentEngine. |
| tests/unit/config/conftest.py | Updates RootConfig test factory to include coordination section. |
| tests/unit/api/test_state.py | Adds coverage for AppState coordinator + agent registry accessors. |
| tests/unit/api/test_dto.py | Adds validation tests for coordination request/response DTOs. |
| tests/unit/api/controllers/test_coordination.py | Controller tests for happy-path and failure modes of the new endpoint. |
| tests/integration/engine/test_coordination_wiring.py | End-to-end wiring test ensuring bootstrap → API → coordination path works. |
| src/ai_company/observability/events/coordination.py | Adds factory event constant for coordination build observability. |
| src/ai_company/observability/events/api.py | Adds API-level coordination event constants. |
| src/ai_company/engine/coordination/service.py | Enhances coordinator pipeline error handling and update-parent phase behavior. |
| src/ai_company/engine/coordination/section_config.py | New company-level coordination section config model + merge to per-run config. |
| src/ai_company/engine/coordination/factory.py | New coordinator factory + placeholder decomposition strategy + workspace wiring logic. |
| src/ai_company/engine/coordination/dispatchers.py | Adds MemoryError/RecursionError re-raise guards + richer exception logging. |
| src/ai_company/engine/coordination/init.py | Re-exports build_coordinator and CoordinationSectionConfig. |
| src/ai_company/engine/agent_engine.py | Adds optional coordinator dependency + coordinate() convenience method. |
| src/ai_company/config/schema.py | Extends RootConfig schema with coordination section. |
| src/ai_company/config/defaults.py | Adds default empty coordination config section. |
| src/ai_company/api/ws_models.py | Adds WS event types for coordination lifecycle events. |
| src/ai_company/api/state.py | Extends AppState with coordinator + agent registry services and accessors. |
| src/ai_company/api/dto.py | Adds coordination request/response DTO models. |
| src/ai_company/api/controllers/coordination.py | New REST endpoint to trigger coordination and publish WS events. |
| src/ai_company/api/controllers/approvals.py | Refactors ChannelsPlugin lookup to use shared helper. |
| src/ai_company/api/controllers/init.py | Registers the new coordination controller in ALL_CONTROLLERS. |
| src/ai_company/api/channels.py | Adds shared get_channels_plugin() helper. |
| src/ai_company/api/app.py | Plumbs coordinator + agent registry through create_app() into AppState. |
| docs/roadmap/index.md | Updates roadmap to reflect multi-agent coordination as part of agent engine. |
| docs/design/operations.md | Documents the new coordination endpoint in the API surface table. |
| docs/design/engine.md | Documents engine-level coordinate() and coordination section config keys. |
| CLAUDE.md | Updates repo module overview + event-constant guidance to include coordination additions. |
```python
logger.warning(
    DECOMPOSITION_FAILED,
    note="Decomposition attempted without LLM provider",
)
```
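The warning above guards the factory's placeholder decomposition path. A minimal stdlib sketch of that guard, in which `PassthroughDecomposition` and the plain-logging form of `DECOMPOSITION_FAILED` are assumptions rather than the project's actual API:

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("coordination.factory")

# Hypothetical stand-in for the event constant kept in
# observability/events/coordination.py.
DECOMPOSITION_FAILED = "coordination.decomposition_failed"


class PassthroughDecomposition:
    """Placeholder strategy (illustrative): with no LLM provider the
    task is not split; it comes back as a single-subtask plan."""

    def __init__(self, llm_provider=None):
        self._llm = llm_provider

    def decompose(self, task: str) -> list[str]:
        if self._llm is None:
            # Log and degrade rather than raise: coordination can still
            # proceed with the whole task as one unit.
            logger.warning(
                "%s note=%s",
                DECOMPOSITION_FAILED,
                "Decomposition attempted without LLM provider",
            )
            return [task]
        raise NotImplementedError("LLM-backed decomposition not sketched here")
```

With this shape the factory can always hand the coordinator a decomposition strategy, and the warning keeps the degraded path visible in logs.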
```python
msg = (
    f"Workspace isolation requires both workspace_strategy and "
    f"workspace_config, but only {given} was supplied (missing {missing})"
)
logger.warning(
    COORDINATION_FACTORY_BUILT,
    note="Mismatched workspace dependencies",
    given=given,
    missing=missing,
)
raise ValueError(msg)
```
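A hedged sketch of the both-or-neither check that produces the message above; the helper name and standalone form are assumptions (the real factory validates inline and also logs the mismatch):

```python
def validate_workspace_deps(workspace_strategy, workspace_config):
    """Both-or-neither check (illustrative): workspace isolation needs
    workspace_strategy and workspace_config together, or neither.

    Returns None when consistent; raises ValueError when exactly one
    of the two was supplied.
    """
    supplied = {
        "workspace_strategy": workspace_strategy is not None,
        "workspace_config": workspace_config is not None,
    }
    if all(supplied.values()) or not any(supplied.values()):
        return None  # consistent: both given, or isolation disabled
    given = next(name for name, ok in supplied.items() if ok)
    missing = next(name for name, ok in supplied.items() if not ok)
    raise ValueError(
        f"Workspace isolation requires both workspace_strategy and "
        f"workspace_config, but only {given} was supplied (missing {missing})"
    )
```

Failing fast here keeps a half-configured workspace from surfacing later as a confusing runtime error inside a coordination run.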
🤖 I have created a release *beep* *boop*

---

## [0.2.0](v0.1.4...v0.2.0) (2026-03-15)

First probably usable release? Most likely not, and everything will break.

### Features

* add /get/ installation page for CLI installer ([#413](#413)) ([6a47e4a](6a47e4a))
* add cross-platform Go CLI for container lifecycle management ([#401](#401)) ([0353d9e](0353d9e)), closes [#392](#392)
* add explicit ScanOutcome signal to OutputScanResult ([#394](#394)) ([be33414](be33414)), closes [#284](#284)
* add meeting scheduler, event-triggered meetings, and Go CLI lint fixes ([#407](#407)) ([5550fa1](5550fa1))
* wire MultiAgentCoordinator into runtime ([#396](#396)) ([7a9e516](7a9e516))

### Bug Fixes

* CLA signatures branch + declutter repo root ([#409](#409)) ([cabe953](cabe953))
* correct Release Please branch name in release workflow ([#410](#410)) ([515d816](515d816))
* replace slsa-github-generator with attest-build-provenance, fix DAST ([#424](#424)) ([eeaadff](eeaadff))
* resolve CodeQL path-injection alerts in Go CLI ([#412](#412)) ([f41bf16](f41bf16))

### Refactoring

* rename package from ai_company to synthorg ([#422](#422)) ([df27c6e](df27c6e)), closes [#398](#398)

### Tests

* add fuzz and property-based testing across all layers ([#421](#421)) ([115a742](115a742))

### CI/CD

* add SLSA L3 provenance for CLI binaries and container images ([#423](#423)) ([d3dc75d](d3dc75d))
* bump the major group with 4 updates ([#405](#405)) ([20c7a04](20c7a04))

### Maintenance

* bump github.com/spf13/cobra from 1.9.1 to 1.10.2 in /cli in the minor-and-patch group ([#402](#402)) ([e31edbb](e31edbb))
* narrow BSL Additional Use Grant and add CLA ([#408](#408)) ([5ab15bd](5ab15bd)), closes [#406](#406)

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
Closes #393
Wire the existing `MultiAgentCoordinator` (built in PR #329) into production code paths — config schema, AppState, API layer, AgentEngine, and WebSocket events.

- `CoordinationSectionConfig` bridges the YAML `coordination:` section to the per-run `CoordinationConfig`, with `extra="forbid"` and topology/concurrency/workspace fields. Added to `RootConfig`.
- `build_coordinator()` constructs the full dependency tree (classifier → strategy → decomposition → scorer → topology → routing → executor → coordinator) from config + AgentEngine.
- `AgentEngine` gains a `coordinator` parameter, property, and `coordinate()` convenience method with log-before-raise on a missing coordinator.
- `POST /api/v1/tasks/{task_id}/coordinate` on `CoordinationController` — resolves agents (by name or all active), builds context, delegates to the coordinator, publishes WS events, maps to response DTOs.
- `AppState` gains `coordinator` + `agent_registry` slots with the `_require_service` pattern (503 when missing).
- New `WsEventType` entries (`coordination.started` / `phase_completed` / `completed` / `failed`).
- DTOs: `CoordinateTaskRequest` (`min_length=1` plus uniqueness on `agent_names`, `le=50` on `max_concurrency`, `fail_fast: bool | None` to preserve the section default), `CoordinationPhaseResponse` (success/error consistency validator), `CoordinationResultResponse` (`@computed_field` for `is_success`).
- `MemoryError`/`RecursionError` guards added to 10 pre-existing `except Exception` blocks in `service.py` (4) and `dispatchers.py` (6).
Test plan

- `CoordinationSectionConfig` (12 tests)
- `build_coordinator` factory (9 tests)
- `CoordinationController` (8 tests, including the `is_success=False` path)
- `AppState` coordinator/`agent_registry` (8 tests)
- `AgentEngine.coordinate()` (4 tests)

Pre-reviewed by 11 agents; 30 findings addressed.