Skip to content

feat: wire MultiAgentCoordinator into runtime#396

Merged
Aureliolo merged 6 commits intomainfrom
feat/coordinator-runtime-wiring
Mar 14, 2026
Merged

feat: wire MultiAgentCoordinator into runtime#396
Aureliolo merged 6 commits intomainfrom
feat/coordinator-runtime-wiring

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

Closes #393

Wire the existing MultiAgentCoordinator (built in PR #329) into production code paths — config schema, AppState, API layer, AgentEngine, and WebSocket events.

  • Config: Add CoordinationSectionConfig bridging the YAML coordination: section to per-run CoordinationConfig, with extra="forbid" and topology/concurrency/workspace fields. Added to RootConfig.
  • Factory: build_coordinator() constructs the full dependency tree (classifier → strategy → decomposition → scorer → topology → routing → executor → coordinator) from config + AgentEngine.
  • AgentEngine: Add optional coordinator parameter, property, and coordinate() convenience method with log-before-raise on missing coordinator.
  • API: POST /api/v1/tasks/{task_id}/coordinate on CoordinationController — resolves agents (by name or all active), builds context, delegates to coordinator, publishes WS events, maps to response DTOs.
  • AppState/create_app: Add coordinator + agent_registry slots with _require_service pattern (503 when missing).
  • WebSocket: 4 new WsEventType entries (coordination.started/phase_completed/completed/failed).
  • DTOs: CoordinateTaskRequest (with min_length=1 + uniqueness on agent_names, le=50 on max_concurrency, fail_fast: bool | None for section default preservation), CoordinationPhaseResponse (with success/error consistency validator), CoordinationResultResponse (with @computed_field for is_success).
  • Safety: Added MemoryError/RecursionError guards to 10 pre-existing except Exception blocks in service.py (4) and dispatchers.py (6).
  • Docs: Updated CLAUDE.md package structure (api/, engine/) and logging event constants.

Test plan

  • Unit tests: CoordinationSectionConfig (12 tests), build_coordinator factory (9 tests), CoordinationController (8 tests including is_success=False path), AppState coordinator/agent_registry (8 tests), AgentEngine.coordinate() (4 tests)
  • Integration test: full bootstrap → create_app → TestClient → create task → coordinate
  • All 7805 tests pass, 0 failures
  • Coverage above 80%
  • ruff lint + format clean
  • mypy strict passes (956 files)
  • Pre-commit hooks pass

Pre-reviewed by 11 agents, 30 findings addressed.

Copilot AI review requested due to automatic review settings March 14, 2026 14:37
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 14, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 14, 2026

Caution

Review failed

Pull request was closed or merged during review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • POST /tasks/{task_id}/coordinate to run multi‑agent coordination; returns phases, wave count, durations, cost and overall success; app can be wired with a coordinator and agent registry; coordination config and workspace isolation supported.
  • Documentation

    • Added coordination docs and updated API/event/logging guidance for coordination lifecycle and factory build.
  • Tests

    • Many new unit and integration tests covering API, DTOs, factory, config, engine wiring and error paths.

Walkthrough

Wires MultiAgentCoordinator into runtime: adds API endpoint to trigger coordination, coordinator factory and config, AgentEngine/AppState wiring, coordination DTOs/events, dispatcher/service error-path adjustments, channels helper, and comprehensive unit and integration tests.

Changes

Cohort / File(s) Summary
API bootstrap & state
src/ai_company/api/app.py, src/ai_company/api/state.py
create_app accepts optional coordinator and agent_registry; AppState stores/accesses them and exposes has_ helpers that raise 503 when missing.
API controllers
src/ai_company/api/controllers/coordination.py, src/ai_company/api/controllers/__init__.py
New CoordinationController with POST /tasks/{task_id}/coordinate endpoint: access checks, task lookup, agent resolution, context build, WS event publishing, coordination execution, result mapping, and error handling; controller exported.
API DTOs & WS models
src/ai_company/api/dto.py, src/ai_company/api/ws_models.py
Adds CoordinateTaskRequest, CoordinationPhaseResponse, CoordinationResultResponse (validation + computed is_success) and four coordination WS event types.
Channels & approvals
src/ai_company/api/channels.py, src/ai_company/api/controllers/approvals.py
Adds get_channels_plugin() helper; approvals helper renamed to _require_channels_plugin() and now uses the helper and raises when missing.
Engine integration
src/ai_company/engine/agent_engine.py
AgentEngine gets optional coordinator ctor param, coordinator property, and async coordinate() delegating to coordinator (raises ExecutionStateError if not configured).
Coordination factory & config
src/ai_company/engine/coordination/factory.py, src/ai_company/engine/coordination/section_config.py, src/ai_company/engine/coordination/__init__.py
Adds build_coordinator factory wiring decomposition, routing, parallel executor, optional workspace isolation and task engine; introduces CoordinationSectionConfig and to_coordination_config(); re-exports added.
Coordination runtime behavior
src/ai_company/engine/coordination/service.py, src/ai_company/engine/coordination/dispatchers.py
Explicitly re-raise MemoryError/RecursionError in multiple phases/dispatchers; add exc_info=True to logs and minor rollup/control-flow tweaks.
Config & schema defaults
src/ai_company/config/defaults.py, src/ai_company/config/schema.py
Adds top-level coordination section to defaults and coordination: CoordinationSectionConfig to RootConfig schema.
Observability events
src/ai_company/observability/events/coordination.py, src/ai_company/observability/events/api.py
Adds COORDINATION_FACTORY_BUILT and API coordination event constants (api.coordination.*, including agent_resolve_failed).
Tests — integration & unit
tests/integration/engine/test_coordination_wiring.py, tests/unit/api/controllers/test_coordination.py, tests/unit/**/test_coordination_*.py, tests/unit/api/test_state.py, tests/unit/config/conftest.py
Adds integration test covering bootstrap→API→coordination and many unit tests: DTOs, AppState accessors, AgentEngine delegation, factory, section config, and controller behaviors.
Docs & guidance
docs/**, CLAUDE.md, docs/design/operations.md
Documents engine.coordinate(), coordination config examples, new API endpoint in operations table, and updated event/logging guidance.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant API as API Controller
    participant Registry as Agent Registry
    participant Coordinator as MultiAgent<br/>Coordinator
    participant Engine as Agent Engine
    participant MB as Message Bus

    Client->>API: POST /api/v1/tasks/{id}/coordinate
    API->>API: Validate access & fetch task
    API->>Registry: Resolve agents (explicit or registry)
    Registry-->>API: Agent list
    API->>API: Build CoordinationContext
    API->>MB: Publish api.coordination.started
    API->>Coordinator: coordinate(context)
    Coordinator->>Coordinator: Decompose → Route → Dispatch → Rollup → Update parent
    Coordinator->>Engine: Dispatch waves (execute)
    Engine-->>Coordinator: Wave results
    alt success
        Coordinator-->>API: CoordinationResult
        API->>MB: Publish api.coordination.completed
        API-->>Client: 200 + result
    else phase failure
        Coordinator-->>API: CoordinationResult (phase error)
        API->>MB: Publish api.coordination.phase_completed / failed
        API-->>Client: 200 + result (is_success false)
    else no coordinator
        API-->>Client: 503 Service Unavailable
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 51.97% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: wire MultiAgentCoordinator into runtime' accurately summarizes the main change: integrating the existing MultiAgentCoordinator into production code paths.
Description check ✅ Passed The description comprehensively details all changes made: config integration, factory, API endpoint, AppState, WebSocket events, DTOs, safety improvements, and tests, clearly relating to the changeset.
Linked Issues check ✅ Passed The PR successfully addresses all requirements from #393: AgentEngine coordinator integration, REST API endpoint, coordinator bootstrap factory, RootConfig coordination schema, WebSocket event publishing, and integration tests.
Out of Scope Changes check ✅ Passed All code changes are directly scoped to wiring MultiAgentCoordinator into runtime. Minor refactoring in channels.py (_get_channels_plugin extraction) is a supportive change enabling the coordination controller's WebSocket publishing.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/coordinator-runtime-wiring
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch feat/coordinator-runtime-wiring
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly advances the multi-agent capabilities of the system by integrating the previously developed MultiAgentCoordinator into the core application flow. It enables users to initiate and monitor complex multi-agent coordination tasks directly through a new API endpoint, backed by a robust configuration system and enhanced error handling. The changes ensure that coordination processes are seamlessly woven into the application's architecture, from configuration and service instantiation to API exposure and real-time feedback.

Highlights

  • Multi-Agent Coordination Integration: The existing MultiAgentCoordinator has been fully integrated into the application's runtime, including the configuration schema, AppState, API layer, AgentEngine, and WebSocket event system.
  • New API Endpoint: A new API endpoint, POST /api/v1/tasks/{task_id}/coordinate, has been added to trigger multi-agent coordination for a given task, allowing for agent selection and configuration overrides.
  • Configuration and Factory: A new CoordinationSectionConfig has been introduced to bridge YAML configuration to per-run CoordinationConfig, and a build_coordinator() factory function now constructs the full dependency tree for the coordinator.
  • Enhanced Robustness: Error handling has been improved by adding MemoryError and RecursionError guards to several existing except Exception blocks in core coordination and service logic.
  • WebSocket Events and DTOs: New WebSocket event types (coordination.started, coordination.phase_completed, coordination.completed, coordination.failed) and data transfer objects (CoordinateTaskRequest, CoordinationPhaseResponse, CoordinationResultResponse) have been added to support real-time updates and structured communication for coordination.
Changelog
  • CLAUDE.md
    • Updated package structure documentation to include the new coordination endpoint in the API and the coordination config/factory in the engine.
    • Added new API coordination event constants to the logging event guidelines.
  • src/ai_company/api/app.py
    • Imported MultiAgentCoordinator and AgentRegistryService.
    • Added coordinator and agent_registry as optional parameters to create_app.
    • Passed coordinator and agent_registry to the AppState constructor.
  • src/ai_company/api/controllers/init.py
    • Imported CoordinationController.
    • Added CoordinationController to the list of exported controllers.
  • src/ai_company/api/controllers/coordination.py
    • Added new file coordination.py.
    • Implemented CoordinationController to handle POST /api/v1/tasks/{task_id}/coordinate requests.
    • Included logic for resolving agents, building coordination context, executing coordination, and publishing WebSocket events.
    • Incorporated error handling for task not found, agent resolution failures, and service unavailability.
  • src/ai_company/api/dto.py
    • Added CoordinateTaskRequest DTO for coordination request payloads, including validation for agent names, subtasks, and concurrency.
    • Added CoordinationPhaseResponse DTO for individual coordination phase results, with consistency validation.
    • Added CoordinationResultResponse DTO for the overall coordination result, including a computed is_success field.
  • src/ai_company/api/state.py
    • Imported MultiAgentCoordinator and AgentRegistryService.
    • Added _coordinator and _agent_registry to AppState's __slots__.
    • Included coordinator and agent_registry as optional parameters in AppState.__init__.
    • Added properties coordinator, has_coordinator, and agent_registry with service availability checks.
  • src/ai_company/api/ws_models.py
    • Added new WsEventType entries: COORDINATION_STARTED, COORDINATION_PHASE_COMPLETED, COORDINATION_COMPLETED, and COORDINATION_FAILED.
  • src/ai_company/config/defaults.py
    • Added an empty dictionary for the coordination section to the default configuration.
  • src/ai_company/config/schema.py
    • Imported CoordinationSectionConfig.
    • Added coordination field to RootConfig with CoordinationSectionConfig as its type and default factory.
  • src/ai_company/engine/agent_engine.py
    • Imported ExecutionStateError, CoordinationContext, CoordinationResult, and MultiAgentCoordinator.
    • Added an optional coordinator parameter to the AgentEngine constructor.
    • Added a coordinator property to access the MultiAgentCoordinator.
    • Implemented an asynchronous coordinate method that delegates to the MultiAgentCoordinator or raises an ExecutionStateError if not configured.
  • src/ai_company/engine/coordination/init.py
    • Imported build_coordinator and CoordinationSectionConfig.
    • Added build_coordinator and CoordinationSectionConfig to the module's __all__ export list.
  • src/ai_company/engine/coordination/dispatchers.py
    • Added MemoryError and RecursionError guards to exception handling blocks in _setup_workspaces, _merge_workspaces, _teardown_workspaces, _execute_waves, _setup_wave, and _execute_wave functions.
  • src/ai_company/engine/coordination/factory.py
    • Added new file factory.py.
    • Implemented build_coordinator function to construct a fully wired MultiAgentCoordinator from various service dependencies and configuration.
    • Included _NoProviderDecompositionStrategy as a placeholder when no LLM provider is configured for decomposition.
  • src/ai_company/engine/coordination/section_config.py
    • Added new file section_config.py.
    • Defined CoordinationSectionConfig for company-level coordination settings, including topology, concurrency, fail-fast behavior, and workspace isolation.
    • Provided a to_coordination_config method to convert section config to a per-run CoordinationConfig with request-level overrides.
  • src/ai_company/engine/coordination/service.py
    • Added MemoryError and RecursionError guards to exception handling blocks in coordinate, _phase_dispatch, _phase_rollup, and _phase_update_parent methods.
    • Modified _phase_update_parent to log when parent task status update is skipped due to a failed rollup phase.
  • src/ai_company/observability/events/api.py
    • Added new constants for API coordination events: API_COORDINATION_STARTED, API_COORDINATION_COMPLETED, API_COORDINATION_FAILED, and API_COORDINATION_AGENT_RESOLVE_FAILED.
  • tests/integration/engine/test_coordination_wiring.py
    • Added new file test_coordination_wiring.py.
    • Implemented an integration test to validate the end-to-end wiring of the coordinator from configuration to API interaction, including task creation and coordination triggering.
  • tests/unit/api/controllers/test_coordination.py
    • Added new file test_coordination.py.
    • Implemented unit tests for the CoordinationController, covering happy path scenarios, specific agent coordination, failed phases, and various error conditions like task not found, unknown agents, no active agents, and coordination phase errors.
  • tests/unit/api/test_state.py
    • Added unit tests for AppState's coordinator property, verifying its behavior when set or not set.
    • Added unit tests for AppState's agent_registry property, verifying its behavior when set or not set.
  • tests/unit/config/conftest.py
    • Added CoordinationSectionConfig to the RootConfigFactory for testing purposes.
  • tests/unit/engine/test_agent_engine.py
    • Added unit tests for AgentEngine's coordinator property, confirming its default None value and correct return when set.
    • Added unit tests for AgentEngine's coordinate method, verifying it raises an error when no coordinator is configured and delegates correctly when one is present.
  • tests/unit/engine/test_coordination_factory.py
    • Added new file test_coordination_factory.py.
    • Implemented unit tests for the build_coordinator factory function, covering various scenarios including presence/absence of LLM provider, task engine, workspace dependencies, and custom configurations.
    • Included tests for the _NoProviderDecompositionStrategy to ensure it raises the expected error.
  • tests/unit/engine/test_coordination_section_config.py
    • Added new file test_coordination_section_config.py.
    • Implemented unit tests for CoordinationSectionConfig, verifying default values, custom value propagation, request-level overrides, and validation constraints.
Activity
  • The pull request author, Aureliolo, addressed 30 findings identified during a pre-review by 11 agents, indicating a thorough review and refinement process before submission.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new multi-agent coordination feature, including a dedicated API endpoint (/tasks/{task_id}/coordinate), new data transfer objects (DTOs) for coordination requests and responses, and WebSocket event types for real-time updates. It integrates a MultiAgentCoordinator service into the application's state and AgentEngine, allowing for dynamic agent selection and task decomposition. Configuration for this feature is managed through a new CoordinationSectionConfig in the RootConfig, and a factory (build_coordinator) is provided to wire up the coordination services. The changes also include comprehensive unit and integration tests for the new API and coordination logic. A review comment suggests improving observability by using a more specific logging event (COORDINATION_FAILED) instead of COORDINATION_STARTED when decomposition fails due to a missing LLM provider.

Comment on lines +67 to +70
logger.warning(
COORDINATION_STARTED,
note="Decomposition attempted without LLM provider",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The log event COORDINATION_STARTED is misleading here as this code path represents an immediate failure. Using a more specific failure event like COORDINATION_FAILED with additional context (phase) would improve observability and make it easier to track down this specific failure mode. You'll also need to update the imports to include COORDINATION_FAILED from ai_company.observability.events.coordination.

Suggested change
logger.warning(
COORDINATION_STARTED,
note="Decomposition attempted without LLM provider",
)
logger.warning(
COORDINATION_FAILED,
phase="decompose",
note="Decomposition attempted without LLM provider",
)

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Mar 14, 2026

Greptile Summary

This PR wires the pre-existing MultiAgentCoordinator into the full production stack — config schema, AppState, create_app, AgentEngine, a new REST controller, WebSocket events, and response DTOs. The overall design is solid: symmetric service-availability guards, safe MemoryError/RecursionError handling, parallelised agent resolution, and well-tested layering. One broken configuration contract and one semantic mismatch in the DTO layer need attention before this is production-safe.

Key findings:

  • CoordinationSectionConfig.topology is silently dropped — the field is documented as "Default coordination topology" and surfaces in YAML, but to_coordination_config() never forwards it to CoordinationConfig (which has no topology field). The TopologySelector always resolves topology dynamically, so any explicit YAML value is ignored without error.
  • Case-insensitive deduplication in CoordinateTaskRequest._validate_unique_agent_names compares names via name.lower() but registry.get_by_name is called with the original casing. If the registry is case-sensitive, valid requests containing agents like "Alice" and "alice" will be incorrectly rejected.
  • COORDINATION_FACTORY_BUILT (a success-scoped event constant) is used on the error/warning path in _build_workspace_service, inconsistent with how the adjacent _build_decomposition_strategy uses DECOMPOSITION_FAILED for its analogous error path.

Confidence Score: 3/5

  • Not safe to merge until the topology configuration contract and the case-sensitivity of agent-name deduplication are resolved.
  • The topology field in CoordinationSectionConfig is a broken configuration contract: operators can set it in YAML and see it silently ignored, leading to unexpected runtime behaviour. The case-insensitive dedup in CoordinateTaskRequest may incorrectly reject valid requests if the registry is case-sensitive. Both are logic-level issues in newly introduced code paths that are otherwise well-structured and well-tested.
  • src/ai_company/engine/coordination/section_config.py (topology drop) and src/ai_company/api/dto.py (dedup case-sensitivity).

Important Files Changed

Filename Overview
src/ai_company/engine/coordination/section_config.py New config bridge class — but the topology field is never forwarded to CoordinationConfig, meaning the YAML coordination.topology setting is silently dropped and has no runtime effect.
src/ai_company/engine/coordination/factory.py Clean factory wiring the full coordinator dependency tree; uses the wrong event constant (COORDINATION_FACTORY_BUILT) on the error path of _build_workspace_service.
src/ai_company/api/controllers/coordination.py Well-structured controller with symmetric service guards, asyncio.gather for parallel agent lookup, and WS failure events for both known and unexpected exceptions; no new issues found.
src/ai_company/api/dto.py New coordination DTOs are well-validated; _validate_unique_agent_names uses case-insensitive deduplication which may not match the registry's case-sensitivity semantics.
src/ai_company/api/state.py Clean addition of coordinator and agent_registry slots with symmetric has_* guard properties following the existing _require_service pattern.
src/ai_company/engine/coordination/service.py Safety additions (MemoryError/RecursionError guards) are correct; bare re-raise pattern applied consistently throughout the pipeline phases.
src/ai_company/engine/coordination/dispatchers.py MemoryError/RecursionError guards added with bare re-raise (safer than logging during OOM), plus exc_info=True added to all error logger calls for better diagnostics.
src/ai_company/engine/agent_engine.py Minimal, clean addition of optional coordinator parameter with a coordinate() convenience method that logs before raising ExecutionStateError when not configured.
src/ai_company/config/schema.py Straightforward addition of coordination: CoordinationSectionConfig field with default_factory to RootConfig.
src/ai_company/api/ws_models.py Four new WsEventType entries added; COORDINATION_PHASE_COMPLETED is reserved but never published, which may surprise WebSocket clients expecting per-phase progress events.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/ai_company/engine/coordination/section_config.py
Line: 57-83

Comment:
**`topology` field silently dropped — YAML setting has no runtime effect**

`CoordinationSectionConfig.topology` is documented as "Default coordination topology" and exposed in the YAML `coordination:` block, but `to_coordination_config()` never forwards it to `CoordinationConfig`. `CoordinationConfig` has no `topology` field, and the factory only uses it for a `logger.debug` call. A user who sets `topology: CENTRALIZED` in their YAML gets no error and no effect — the `TopologySelector` still resolves topology dynamically from routing decisions.

Either:
- Add a `topology` field to `CoordinationConfig` and pass it through here so `TopologySelector` can use it as a default/override, **or**
- Remove the `topology` field from `CoordinationSectionConfig` (since it's already controlled through `auto_topology_rules`), **or**
- Update the field description to make clear it is purely informational / a logging hint.

As-is, this is a broken configuration contract: operators configure a value that the system silently ignores.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/ai_company/engine/coordination/factory.py
Line: 147-153

Comment:
**Wrong event constant on error path**

`COORDINATION_FACTORY_BUILT` semantically means "factory construction succeeded", so using it here on the mismatched-deps error path is misleading. Log aggregators or dashboards that filter on this event name will silently include failed construction attempts alongside successful ones.

Compare with `_build_decomposition_strategy` directly above, which correctly uses `DECOMPOSITION_FAILED` for its analogous error path.

```suggestion
        logger.warning(
            COORDINATION_FACTORY_BUILT,
```
should be replaced with a dedicated failure event (e.g. `COORDINATION_FACTORY_BUILD_FAILED`) or reuse an existing error-scoped constant.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/ai_company/api/dto.py
Line: 302-313

Comment:
**Case-insensitive dedup may reject valid distinct agents**

`_validate_unique_agent_names` normalises names to lowercase before checking for duplicates (`lower = name.lower()`), but `registry.get_by_name(name)` is called with the original, un-normalised string. If the registry is case-sensitive, `"Alice"` and `"alice"` are two different agents, yet this validator would reject them as duplicates at the request boundary with a confusing `Duplicate agent name: 'alice'` error.

If the registry is case-insensitive, the dedup is correct but should document that assumption. If it is case-sensitive (or the behaviour is unspecified), the comparison should match registry semantics:

```python
seen: set[str] = set()
for name in self.agent_names:
    if name in seen:           # exact-match — mirrors registry lookup
        msg = f"Duplicate agent name: {name!r}"
        raise ValueError(msg)
    seen.add(name)
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 1edd0b4

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds first-class multi-agent coordination support to the backend by introducing a company-level coordination config section, a coordinator factory/service wiring path, and a new API endpoint to trigger coordination (with WebSocket event publishing). This is accompanied by unit + integration tests and new observability event constants.

Changes:

  • Introduce coordination: config section (CoordinationSectionConfig) and wire it into RootConfig defaults.
  • Add a build_coordinator() factory and expose coordinator access via AgentEngine and AppState.
  • Implement /api/v1/tasks/{task_id}/coordinate controller + DTOs + WS event types, with tests covering success/error paths.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/unit/engine/test_coordination_section_config.py Unit tests for coordination section defaults, validation, and conversion to runtime config.
tests/unit/engine/test_coordination_factory.py Unit tests for build_coordinator() construction paths and placeholder decomposition strategy behavior.
tests/unit/engine/test_agent_engine.py Adds unit tests for new AgentEngine.coordinator and AgentEngine.coordinate() delegation/error behavior.
tests/unit/config/conftest.py Updates RootConfig factory used in config tests to include coordination section.
tests/unit/api/test_state.py Adds coverage for AppState.coordinator/has_coordinator and AppState.agent_registry accessors.
tests/unit/api/controllers/test_coordination.py New controller tests for coordination endpoint (happy path + validation/error cases).
tests/integration/engine/test_coordination_wiring.py New integration-style test exercising API flow with coordinator + agent registry wired into the app.
src/ai_company/observability/events/api.py Adds API observability event constants for coordination lifecycle.
src/ai_company/engine/coordination/service.py Improves exception passthrough for Memory/Recursion errors; adjusts update-parent phase skip behavior.
src/ai_company/engine/coordination/section_config.py New Pydantic model bridging YAML coordination: section to runtime CoordinationConfig.
src/ai_company/engine/coordination/factory.py New coordinator factory wiring decomposition/routing/execution/workspace dependencies.
src/ai_company/engine/coordination/dispatchers.py Ensures Memory/Recursion errors are not swallowed by broad exception handlers.
src/ai_company/engine/coordination/init.py Exposes build_coordinator and CoordinationSectionConfig at package level.
src/ai_company/engine/agent_engine.py Adds optional coordinator injection + coordinate() helper with clear error when unset.
src/ai_company/config/schema.py Adds coordination to RootConfig.
src/ai_company/config/defaults.py Adds empty coordination section to default config dict.
src/ai_company/api/ws_models.py Adds WS event types for coordination lifecycle.
src/ai_company/api/state.py Adds coordinator + agent registry services to AppState with typed accessors and has_coordinator.
src/ai_company/api/dto.py Adds request/response DTOs for coordination endpoint.
src/ai_company/api/controllers/coordination.py New controller implementing coordinate endpoint, agent resolution, and WS event publishing.
src/ai_company/api/controllers/init.py Registers the new CoordinationController in ALL_CONTROLLERS.
src/ai_company/api/app.py Allows injecting coordinator + agent registry into create_app() / AppState.
CLAUDE.md Updates repo documentation to mention coordination endpoint/config/factory and lists new API coordination event constants.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +67 to +70
logger.warning(
COORDINATION_STARTED,
note="Decomposition attempted without LLM provider",
)
Comment on lines +114 to +120
logger.debug(
COORDINATION_STARTED,
note="Building coordinator from config",
topology=config.topology.value,
has_provider=provider is not None,
has_workspace=workspace_strategy is not None,
)
Comment on lines +515 to 523
if self._task_engine is None:
return
if rollup is None:
logger.info(
COORDINATION_PHASE_STARTED,
phase="update_parent",
note="Skipped — rollup is None (rollup phase failed)",
)
return
Comment on lines +244 to +265
ws_event_type = (
WsEventType.COORDINATION_COMPLETED
if result.is_success
else WsEventType.COORDINATION_FAILED
)
_publish_ws_event(
request,
ws_event_type,
{
"task_id": task_id,
"topology": result.topology.value,
"is_success": result.is_success,
"total_duration_seconds": result.total_duration_seconds,
},
)
logger.info(
API_COORDINATION_COMPLETED,
task_id=task_id,
topology=result.topology.value,
is_success=result.is_success,
total_duration_seconds=result.total_duration_seconds,
)
Comment on lines +3 to +9
Validates the full bootstrap-to-API path:
1. Create RootConfig with coordination section
2. Build coordinator via build_coordinator() with ManualDecompositionStrategy
3. Create TaskEngine with mock persistence
4. Create AgentRegistryService and register test agents
5. Build app via create_app() with coordinator + agent_registry
6. Use TestClient to create a task and trigger coordination
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/ai_company/api/app.py (1)

351-415: ⚠️ Potential issue | 🟡 Minor

Warn when coordination services are absent at startup.

The new coordination route can now fail through AppState.coordinator / AppState.agent_registry, but the startup warning still only tracks persistence/message_bus/cost_tracker/task_engine. If coordinator or agent_registry are the only missing dependencies, /coordinate will 503 and startup stays silent.

💡 Suggested update
 def create_app(  # noqa: PLR0913
@@
-    if (
-        persistence is None
-        or message_bus is None
-        or cost_tracker is None
-        or task_engine is None
-    ):
-        msg = (
-            "create_app called without persistence, message_bus, "
-            "cost_tracker, and/or task_engine — controllers accessing "
-            "missing services will return 503.  Use test fakes for testing."
-        )
-        logger.warning(API_APP_STARTUP, note=msg)
+    missing_services: list[str] = []
+    if persistence is None:
+        missing_services.append("persistence")
+    if message_bus is None:
+        missing_services.append("message_bus")
+    if cost_tracker is None:
+        missing_services.append("cost_tracker")
+    if task_engine is None:
+        missing_services.append("task_engine")
+    if coordinator is None:
+        missing_services.append("coordinator")
+    if agent_registry is None:
+        missing_services.append("agent_registry")
+    if missing_services:
+        logger.warning(
+            API_APP_STARTUP,
+            missing_services=tuple(missing_services),
+            note=(
+                "Controllers accessing missing services will return 503. "
+                "Use test fakes for testing."
+            ),
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/api/app.py` around lines 351 - 415, The startup warning in
create_app does not include coordinator and agent_registry, so missing
coordination services can silently cause /coordinate to 503; update the
conditional that currently checks
persistence/message_bus/cost_tracker/task_engine in create_app to also test for
coordinator is None or agent_registry is None, and expand the msg passed to
logger.warning(API_APP_STARTUP, ...) to mention coordinator and agent_registry
(or a general “coordination services”) so the warning reflects these missing
dependencies when constructing AppState.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/api/controllers/coordination.py`:
- Around line 174-189: The _get_task method calls app_state.task_engine.get_task
which can raise ServiceUnavailableError (503) and mask a more relevant
coordination error; update handling in _get_task (or in the coordinator
preflight) to first validate both services or to catch ServiceUnavailableError
from app_state.task_engine.get_task and, if the coordinator is missing/invalid,
raise the coordinator-specific error instead (or include both statuses in the
response); specifically reference app_state.task_engine, the _get_task function,
and the ServiceUnavailableError so the change either pre-validates the
coordinator state before calling get_task or wraps get_task in a try/except that
re-raises a coordinator-specific message when appropriate.

In `@src/ai_company/api/dto.py`:
- Around line 315-341: The CoordinationPhaseResponse model's
_validate_success_error_consistency currently allows empty strings for error on
failed phases; update that validator so when success is False it requires error
to be a non-empty, non-whitespace string (e.g., check error is not None and
error.strip() != "") and raise a ValueError with a clear message like "failed
phase must have a non-empty error description"; keep the existing check that a
successful phase must not have an error (error must be None) and return self at
the end.

In `@src/ai_company/api/state.py`:
- Around line 161-164: Add a boolean accessor for agent_registry to mirror other
optional services: implement a has_agent_registry property that returns whether
self._agent_registry is set (e.g., bool(self._agent_registry)) so callers can
conditionally use the service instead of triggering the 503 from agent_registry;
follow the pattern used by has_coordinator / has_task_engine / has_auth_service
and keep naming consistent with agent_registry and _require_service.

In `@src/ai_company/engine/coordination/dispatchers.py`:
- Around line 163-164: Before re-raising MemoryError or RecursionError in these
except blocks, emit a minimal low-allocation log entry with static context so
the fatal coordination failure is recorded; specifically, in each except
(MemoryError, RecursionError): block that currently just does "raise" (handling
the MemoryError/RecursionError cases at the noted locations), call the module's
existing logger (e.g., logger or process_logger) with logger.error or
logger.warning and a short static message like "Fatal coordination failure in
dispatchers: preserving MemoryError/RecursionError in <function_name_or_phase>"
(include the current function/phase/wave identifier available in scope), then
re-raise; apply the same change to all similar except blocks handling
MemoryError/RecursionError.

In `@src/ai_company/engine/coordination/factory.py`:
- Around line 67-71: The logger.warning call that currently uses
COORDINATION_STARTED when raising DecompositionError is wrong; update the event
constant to DECOMPOSITION_FAILED and change the import to pull
DECOMPOSITION_FAILED from ai_company.observability.events.decomposition (not the
coordination module). Locate the logger.warning(...) + raise
DecompositionError(msg) sequence in factory.py (the block that logs
"Decomposition attempted without LLM provider") and replace the event constant
and import so the warning emits DECOMPOSITION_FAILED.

In `@src/ai_company/engine/coordination/service.py`:
- Around line 517-523: The log uses the wrong event constant — when rollup is
None the code in the update_parent branch logs COORDINATION_PHASE_STARTED even
though the phase is being skipped; change the logger.info call to emit a
semantically correct event (e.g. COORDINATION_PHASE_SKIPPED or
COORDINATION_PHASE_COMPLETED) and keep the same phase="update_parent" and note
explaining the skip (or add a dedicated skip constant if one exists); update any
imports/usages of COORDINATION_PHASE_* in the file to use the chosen
skip/completed constant so logs accurately reflect the skipped state for rollup
being None.

In `@tests/integration/engine/test_coordination_wiring.py`:
- Around line 144-180: The test currently injects an AsyncMock coordinator
(coordinator = AsyncMock()) instead of assembling the real coordinator, so it
never exercises RootConfig.coordination or build_coordinator; replace the
AsyncMock injection in test_build_coordinator_and_coordinate_via_api with a call
to the real factory (e.g. call build_coordinator(...) or use
RootConfig.coordination to build the coordinator instance) and, if determinism
is required, configure the RootConfig or coordination strategy to a
task-id-aware manual strategy (rather than mocking the coordinator.coordinate
method); remove the AsyncMock coordinator and ensure the test calls the real
coordinator.coordinate to validate wiring and multi-agent execution.

In `@tests/unit/api/controllers/test_coordination.py`:
- Around line 129-166: The test function test_coordinate_task_success is
declared async but uses the synchronous TestClient (coordination_client.post);
make the test fully synchronous by changing async def
test_coordinate_task_success to def test_coordinate_task_success and replace the
await agent_registry.register(agent) call with a synchronous run of the
coroutine (e.g., use asyncio.get_event_loop().run_until_complete or asyncio.run
to execute agent_registry.register(agent)); keep the rest of the assertions and
mocked mock_coordinator.coordinate behavior unchanged and import asyncio if not
already present.

---

Outside diff comments:
In `@src/ai_company/api/app.py`:
- Around line 351-415: The startup warning in create_app does not include
coordinator and agent_registry, so missing coordination services can silently
cause /coordinate to 503; update the conditional that currently checks
persistence/message_bus/cost_tracker/task_engine in create_app to also test for
coordinator is None or agent_registry is None, and expand the msg passed to
logger.warning(API_APP_STARTUP, ...) to mention coordinator and agent_registry
(or a general “coordination services”) so the warning reflects these missing
dependencies when constructing AppState.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9281b6b1-453f-4f2d-8fe3-ec165972f98e

📥 Commits

Reviewing files that changed from the base of the PR and between 27a55d2 and 1490ce7.

📒 Files selected for processing (23)
  • CLAUDE.md
  • src/ai_company/api/app.py
  • src/ai_company/api/controllers/__init__.py
  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/api/dto.py
  • src/ai_company/api/state.py
  • src/ai_company/api/ws_models.py
  • src/ai_company/config/defaults.py
  • src/ai_company/config/schema.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/engine/coordination/__init__.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/engine/coordination/section_config.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/observability/events/api.py
  • tests/integration/engine/test_coordination_wiring.py
  • tests/unit/api/controllers/test_coordination.py
  • tests/unit/api/test_state.py
  • tests/unit/config/conftest.py
  • tests/unit/engine/test_agent_engine.py
  • tests/unit/engine/test_coordination_factory.py
  • tests/unit/engine/test_coordination_section_config.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Agent
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Python 3.14+ required; use PEP 649 native lazy annotations (no from __future__ import annotations)
Use PEP 758 except syntax: except A, B: (no parentheses) — enforced by ruff on Python 3.14
Add type hints to all public functions; enforce mypy strict mode
Use Google style docstrings required on all public classes and functions — enforced by ruff D rules
Create new objects instead of mutating existing ones; for non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves — never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use @computed_field for derived values instead of storing redundant fields; use NotBlankStr for all identifier/name fields (including optional and tuple variants)
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls) — prefer structured concurrency over bare create_task; existing code is being migrated incrementally
Enforce 88 character line length — ruff enforces this
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)

Files:

  • src/ai_company/api/ws_models.py
  • src/ai_company/observability/events/api.py
  • tests/unit/config/conftest.py
  • src/ai_company/api/controllers/__init__.py
  • src/ai_company/api/state.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/config/schema.py
  • src/ai_company/config/defaults.py
  • tests/unit/engine/test_coordination_factory.py
  • src/ai_company/engine/agent_engine.py
  • tests/integration/engine/test_coordination_wiring.py
  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/engine/coordination/__init__.py
  • tests/unit/engine/test_coordination_section_config.py
  • tests/unit/api/test_state.py
  • tests/unit/api/controllers/test_coordination.py
  • src/ai_company/api/app.py
  • src/ai_company/engine/coordination/section_config.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/api/dto.py
  • src/ai_company/engine/coordination/dispatchers.py
  • tests/unit/engine/test_agent_engine.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Every module with business logic must import get_logger from ai_company.observability and define logger = get_logger(name); never use import logging or logging.getLogger() or print() in application code
Always use logger variable name (not _logger or log)
Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Use structured kwargs in logging: logger.info(EVENT, key=value) — never logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging
Maintain 80% minimum code coverage — enforced in CI
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small aliases
Lint Python code with: uv run ruff check src/ tests/
Auto-fix lint issues with: uv run ruff check src/ tests/ --fix
Format code with: uv run ruff format src/ tests/
Type-check with strict mode: uv run mypy src/ tests/

Files:

  • src/ai_company/api/ws_models.py
  • src/ai_company/observability/events/api.py
  • src/ai_company/api/controllers/__init__.py
  • src/ai_company/api/state.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/config/schema.py
  • src/ai_company/config/defaults.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/engine/coordination/__init__.py
  • src/ai_company/api/app.py
  • src/ai_company/engine/coordination/section_config.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/api/dto.py
  • src/ai_company/engine/coordination/dispatchers.py
src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Retryable errors (is_retryable=True) include: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError; non-retryable errors raise immediately without retry; RetryExhaustedError signals all retries failed
Rate limiter respects RateLimitError.retry_after from providers — automatically pauses future requests

Files:

  • src/ai_company/api/ws_models.py
  • src/ai_company/observability/events/api.py
  • src/ai_company/api/controllers/__init__.py
  • src/ai_company/api/state.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/config/schema.py
  • src/ai_company/config/defaults.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/engine/coordination/__init__.py
  • src/ai_company/api/app.py
  • src/ai_company/engine/coordination/section_config.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/api/dto.py
  • src/ai_company/engine/coordination/dispatchers.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark tests with @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, and @pytest.mark.slow
Use pytest-xdist via -n auto for parallel test execution — ALWAYS include -n auto when running pytest, never run tests sequentially
Prefer @pytest.mark.parametrize for testing similar cases
Use test-provider, test-small-001, etc. in tests instead of real vendor names
Run unit tests with: uv run pytest tests/ -m unit -n auto
Run integration tests with: uv run pytest tests/ -m integration -n auto
Run e2e tests with: uv run pytest tests/ -m e2e -n auto
Run full test suite with coverage: uv run pytest tests/ -n auto --cov=ai_company --cov-fail-under=80

Files:

  • tests/unit/config/conftest.py
  • tests/unit/engine/test_coordination_factory.py
  • tests/integration/engine/test_coordination_wiring.py
  • tests/unit/engine/test_coordination_section_config.py
  • tests/unit/api/test_state.py
  • tests/unit/api/controllers/test_coordination.py
  • tests/unit/engine/test_agent_engine.py
🧠 Learnings (11)
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly

Applied to files:

  • src/ai_company/observability/events/api.py
  • CLAUDE.md
  • src/ai_company/api/app.py
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/ai_company/providers/**/*.py : RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig

Applied to files:

  • tests/unit/config/conftest.py
  • src/ai_company/config/schema.py
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Every module with business logic must import get_logger from ai_company.observability and define logger = get_logger(__name__); never use import logging or logging.getLogger() or print() in application code

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use structured kwargs in logging: logger.info(EVENT, key=value) — never logger.info('msg %s', val)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Always use logger variable name (not _logger or log)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use DEBUG logging for object creation, internal flow, and entry/exit of key functions

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Pure data models, enums, and re-exports do NOT need logging

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/ai_company/**/*.py : Retryable errors (is_retryable=True) include: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError; non-retryable errors raise immediately without retry; RetryExhaustedError signals all retries failed

Applied to files:

  • src/ai_company/engine/agent_engine.py
🧬 Code graph analysis (13)
tests/unit/config/conftest.py (1)
src/ai_company/engine/coordination/section_config.py (1)
  • CoordinationSectionConfig (15-83)
src/ai_company/api/controllers/__init__.py (1)
src/ai_company/api/controllers/coordination.py (1)
  • CoordinationController (111-313)
src/ai_company/api/state.py (3)
src/ai_company/engine/coordination/service.py (1)
  • MultiAgentCoordinator (50-580)
src/ai_company/hr/registry.py (1)
  • AgentRegistryService (32-262)
src/ai_company/engine/agent_engine.py (1)
  • coordinator (224-226)
src/ai_company/engine/coordination/service.py (1)
src/ai_company/memory/errors.py (1)
  • MemoryError (13-14)
src/ai_company/config/schema.py (1)
src/ai_company/engine/coordination/section_config.py (1)
  • CoordinationSectionConfig (15-83)
src/ai_company/engine/agent_engine.py (3)
src/ai_company/engine/errors.py (1)
  • ExecutionStateError (17-18)
src/ai_company/engine/coordination/models.py (2)
  • CoordinationContext (30-61)
  • CoordinationResult (128-198)
src/ai_company/engine/coordination/service.py (2)
  • MultiAgentCoordinator (50-580)
  • coordinate (92-196)
tests/integration/engine/test_coordination_wiring.py (5)
src/ai_company/api/app.py (1)
  • create_app (351-491)
src/ai_company/api/auth/config.py (1)
  • AuthConfig (28-110)
src/ai_company/hr/registry.py (1)
  • AgentRegistryService (32-262)
src/ai_company/providers/enums.py (1)
  • FinishReason (15-22)
src/ai_company/engine/coordination/models.py (2)
  • CoordinationPhaseResult (64-96)
  • CoordinationResult (128-198)
src/ai_company/engine/coordination/__init__.py (2)
src/ai_company/engine/coordination/factory.py (1)
  • build_coordinator (74-176)
src/ai_company/engine/coordination/section_config.py (1)
  • CoordinationSectionConfig (15-83)
tests/unit/api/controllers/test_coordination.py (3)
src/ai_company/engine/coordination/models.py (1)
  • CoordinationResult (128-198)
src/ai_company/api/state.py (1)
  • coordinator (152-154)
src/ai_company/engine/coordination/service.py (1)
  • coordinate (92-196)
src/ai_company/api/app.py (4)
src/ai_company/engine/coordination/service.py (1)
  • MultiAgentCoordinator (50-580)
src/ai_company/api/state.py (3)
  • task_engine (119-121)
  • coordinator (152-154)
  • agent_registry (162-164)
src/ai_company/hr/registry.py (1)
  • AgentRegistryService (32-262)
tests/unit/api/controllers/test_coordination.py (1)
  • agent_registry (80-81)
src/ai_company/engine/coordination/section_config.py (2)
src/ai_company/engine/coordination/config.py (1)
  • CoordinationConfig (8-38)
src/ai_company/engine/routing/models.py (1)
  • AutoTopologyConfig (160-207)
src/ai_company/api/dto.py (1)
src/ai_company/config/schema.py (1)
  • _validate_unique_agent_names (538-550)
src/ai_company/engine/coordination/dispatchers.py (1)
src/ai_company/memory/errors.py (1)
  • MemoryError (13-14)
🪛 LanguageTool
CLAUDE.md

[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...

(EG_NO_COMMA)

🔇 Additional comments (22)
src/ai_company/api/controllers/__init__.py (1)

13-13: Coordination controller wiring is consistent and complete.

Import, controller registry, and public export were updated together, so route discovery and module exports stay aligned.

Also applies to: 39-39, 53-53

src/ai_company/observability/events/api.py (1)

44-49: New coordination API event constants look good.

These additions follow the existing event taxonomy and support consistent structured logging/event emission.

src/ai_company/api/dto.py (2)

274-313: CoordinateTaskRequest validation is solid.

The bounds, optional overrides, and case-insensitive duplicate-agent guard are well implemented.


344-372: CoordinationResultResponse aggregation design is clean.

Using a computed is_success over phase outcomes avoids redundant state and keeps the model consistent.

src/ai_company/api/ws_models.py (1)

46-49: Coordination WebSocket event taxonomy is well added.

The lifecycle coverage (started, phase_completed, completed, failed) is clear and consistent.

src/ai_company/config/defaults.py (1)

42-42: Default config now correctly includes coordination.

Good addition to keep baseline config hydration aligned with the new schema.

tests/unit/config/conftest.py (1)

25-25: Test config factory wiring is correct.

Including CoordinationSectionConfig in RootConfigFactory keeps fixtures aligned with the updated root schema.

Also applies to: 100-100

src/ai_company/engine/coordination/__init__.py (1)

18-18: Coordination package exports are updated appropriately.

Re-exporting both factory and section config improves discoverability for runtime bootstrap and tests.

Also applies to: 26-26, 36-36, 43-43

tests/unit/engine/test_coordination_section_config.py (1)

15-122: Strong unit coverage for CoordinationSectionConfig.

The tests exercise defaults, immutability, override precedence, and validation/error paths thoroughly.

src/ai_company/engine/coordination/section_config.py (1)

57-83: The topology and auto_topology_rules fields are not dead configuration—they are actively used during app startup.

In src/ai_company/engine/coordination/factory.py, the build_coordinator() function receives the full CoordinationSectionConfig and uses:

  • config.topology (line 117) for logging
  • config.auto_topology_rules (line 146) to construct TopologySelector

The TopologySelector is then wired into the TaskRoutingService at build time. The coordinator singleton is created once at app startup with these topology settings baked in.

The method to_coordination_config() is for per-request runtime overrides only (max concurrency, fail-fast, workspace settings). The topology selection happens during the routing phase of coordination, not via the per-request CoordinationConfig.

			> Likely an incorrect or invalid review comment.
tests/unit/api/controllers/test_coordination.py (2)

37-48: LGTM!

Clean helper function with appropriate test data. Using date(2026, 1, 1) as a future date for test fixtures is fine.


325-342: LGTM!

Good coverage of the 503 scenario when coordinator is not configured. The test correctly uses the shared test_client fixture (without coordinator) and verifies the expected status code.

src/ai_company/api/state.py (1)

55-80: LGTM!

The new coordinator and agent_registry parameters follow the established pattern for optional service injection. The __slots__ ordering is maintained alphabetically.

tests/unit/engine/test_coordination_factory.py (2)

26-36: LGTM!

Good basic test for the factory. The minimal configuration test ensures the factory produces a valid coordinator with default settings.


126-136: LGTM!

Good test for the placeholder strategy's error behavior. The async test correctly exercises the decompose method and verifies the expected DecompositionError with a clear message.

src/ai_company/engine/agent_engine.py (2)

223-251: LGTM!

Clean implementation of the coordinator property and coordinate() method. The error handling follows the "log-before-raise" pattern specified in the PR objectives, and the docstring documents all expected exceptions.


166-166: LGTM!

The coordinator parameter and initialization follow the established pattern for optional dependencies. Including has_coordinator in the creation log provides useful debugging information.

Also applies to: 212-212, 220-220

src/ai_company/engine/coordination/factory.py (1)

74-176: LGTM!

Well-structured factory with clear numbered steps matching the docstring. The deferred imports for heavy dependencies and the optional workspace service construction are appropriate design choices.

src/ai_company/api/controllers/coordination.py (3)

268-313: LGTM!

Clean agent resolution logic with proper error handling and logging. The sequential lookup by name is appropriate for typical use cases.


216-266: LGTM!

The _execute method properly handles both success and failure paths, publishing appropriate WebSocket events and logging outcomes. The mapping of CoordinationPhaseError to ApiValidationError (422) aligns with the PR's documented behavior.


89-108: LGTM!

Clean mapping from domain model to API response DTO. The comment about is_success being a @computed_field is helpful documentation.

src/ai_company/engine/coordination/service.py (1)

177-178: No changes needed—MemoryError correctly refers to the builtin class.

The custom MemoryError from src/ai_company/memory/errors.py is not imported in this file, so all references to MemoryError in the except clauses (lines 177, 212, 268, 409, 474, 564) correctly resolve to the builtin MemoryError. The syntax follows PEP 758 correctly with except MemoryError, RecursionError: and the guards properly re-raise non-recoverable system errors as intended.

Comment on lines +174 to +189
async def _get_task(
self,
app_state: AppState,
task_id: str,
) -> Task:
"""Fetch task or raise 404."""
task = await app_state.task_engine.get_task(task_id)
if task is None:
logger.warning(
API_RESOURCE_NOT_FOUND,
resource="task",
id=task_id,
)
msg = f"Task {task_id!r} not found"
raise NotFoundError(msg)
return task
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Task engine access may raise 503 before coordinator check is meaningful.

_get_task accesses app_state.task_engine, which raises ServiceUnavailableError (503) if not configured. This happens after the coordinator check (line 143), so if task_engine is missing but coordinator is present, users get a generic "Task Engine not configured" 503 instead of a more specific error about the coordination setup.

Consider whether the error messaging is acceptable or if you want to validate both services upfront.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/api/controllers/coordination.py` around lines 174 - 189, The
_get_task method calls app_state.task_engine.get_task which can raise
ServiceUnavailableError (503) and mask a more relevant coordination error;
update handling in _get_task (or in the coordinator preflight) to first validate
both services or to catch ServiceUnavailableError from
app_state.task_engine.get_task and, if the coordinator is missing/invalid, raise
the coordinator-specific error instead (or include both statuses in the
response); specifically reference app_state.task_engine, the _get_task function,
and the ServiceUnavailableError so the change either pre-validates the
coordinator state before calling get_task or wraps get_task in a try/except that
re-raises a coordinator-specific message when appropriate.

Comment on lines +144 to +180
async def test_build_coordinator_and_coordinate_via_api(self) -> None:
"""End-to-end: build coordinator, create app, coordinate task."""
# 1. Create config
config = RootConfig(company_name="test-corp")

# 2. Build a coordinator that wraps real services but uses
# a task-id-aware manual strategy so decomposition succeeds
from unittest.mock import AsyncMock

from ai_company.engine.coordination.models import (
CoordinationPhaseResult,
CoordinationResult,
)

async def _mock_coordinate(context): # type: ignore[no-untyped-def]
"""Return a realistic result keyed to the actual task."""
return CoordinationResult(
parent_task_id=context.task.id,
topology=CoordinationTopology.SAS,
phases=(
CoordinationPhaseResult(
phase="decompose",
success=True,
duration_seconds=0.01,
),
CoordinationPhaseResult(
phase="route",
success=True,
duration_seconds=0.01,
),
),
total_duration_seconds=0.05,
total_cost_usd=0.001,
)

coordinator = AsyncMock()
coordinator.coordinate.side_effect = _mock_coordinate
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

This test still bypasses the runtime wiring it claims to validate.

Lines 179-180 inject an AsyncMock coordinator instead of calling build_coordinator(). That means the test never exercises RootConfig.coordination, factory assembly, or real multi-agent execution, so the main coordinator bootstrap path can regress while this still passes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/engine/test_coordination_wiring.py` around lines 144 - 180,
The test currently injects an AsyncMock coordinator (coordinator = AsyncMock())
instead of assembling the real coordinator, so it never exercises
RootConfig.coordination or build_coordinator; replace the AsyncMock injection in
test_build_coordinator_and_coordinate_via_api with a call to the real factory
(e.g. call build_coordinator(...) or use RootConfig.coordination to build the
coordinator instance) and, if determinism is required, configure the RootConfig
or coordination strategy to a task-id-aware manual strategy (rather than mocking
the coordinator.coordinate method); remove the AsyncMock coordinator and ensure
the test calls the real coordinator.coordinate to validate wiring and
multi-agent execution.

Comment on lines +129 to +166
@pytest.mark.unit
class TestCoordinationControllerHappyPath:
async def test_coordinate_task_success(
self,
coordination_client: TestClient[Any],
mock_coordinator: AsyncMock,
agent_registry: AgentRegistryService,
) -> None:
agent = _make_agent()
await agent_registry.register(agent)

resp = coordination_client.post(
"/api/v1/tasks",
json={
"title": "Test task",
"description": "A test task for coordination",
"type": "development",
"project": "proj-1",
"created_by": "api",
},
)
assert resp.status_code == 201
task_id = resp.json()["data"]["id"]

mock_coordinator.coordinate.return_value = _make_coordination_result(task_id)

resp = coordination_client.post(
f"/api/v1/tasks/{task_id}/coordinate",
json={},
)
assert resp.status_code == 200
body = resp.json()
assert body["success"] is True
assert body["data"]["parent_task_id"] == task_id
assert body["data"]["topology"] == "sas"
assert body["data"]["is_success"] is True
assert body["data"]["wave_count"] == 0
assert len(body["data"]["phases"]) == 1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Test methods are async but TestClient is synchronous.

The test methods are declared async (e.g., async def test_coordinate_task_success), but Litestar's TestClient is synchronous. While pytest-asyncio may handle this, the await agent_registry.register(agent) call requires async context, so this works. However, the HTTP calls (coordination_client.post(...)) are synchronous.

For consistency and clarity, consider either:

  1. Making the tests fully synchronous and using a sync registry setup, or
  2. Using AsyncTestClient if available for true async testing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/api/controllers/test_coordination.py` around lines 129 - 166, The
test function test_coordinate_task_success is declared async but uses the
synchronous TestClient (coordination_client.post); make the test fully
synchronous by changing async def test_coordinate_task_success to def
test_coordinate_task_success and replace the await
agent_registry.register(agent) call with a synchronous run of the coroutine
(e.g., use asyncio.get_event_loop().run_until_complete or asyncio.run to execute
agent_registry.register(agent)); keep the rest of the assertions and mocked
mock_coordinator.coordinate behavior unchanged and import asyncio if not already
present.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 14, 2026

Codecov Report

❌ Patch coverage is 89.77273% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.79%. Comparing base (be33414) to head (1edd0b4).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/ai_company/api/controllers/coordination.py 76.82% 16 Missing and 3 partials ⚠️
src/ai_company/engine/coordination/service.py 46.66% 7 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #396      +/-   ##
==========================================
- Coverage   93.84%   93.79%   -0.06%     
==========================================
  Files         463      466       +3     
  Lines       21691    21944     +253     
  Branches     2086     2103      +17     
==========================================
+ Hits        20357    20582     +225     
- Misses       1032     1058      +26     
- Partials      302      304       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (3)
tests/integration/engine/test_coordination_wiring.py (1)

144-180: 🧹 Nitpick | 🔵 Trivial

Test uses mock coordinator—intentional per docstring but limits integration coverage.

The past review noted this test bypasses build_coordinator(). The docstring (lines 5-6) indicates this is intentional: "Create a mock coordinator (real build_coordinator() is unit-tested separately)." This design trades full end-to-end coverage for test isolation.

Consider whether a separate integration test that uses build_coordinator() would catch wiring regressions between RootConfig.coordination and the factory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/engine/test_coordination_wiring.py` around lines 144 - 180,
The test replaces the real coordinator with an AsyncMock (coordinator.coordinate
side_effect) which intentionally isolates behavior but misses wiring between
RootConfig.coordination and the build_coordinator factory; add a separate
integration test that constructs a RootConfig with real coordination settings
and calls build_coordinator() (referencing build_coordinator and RootConfig) and
then invokes the coordinator.coordinate flow to ensure the factory wiring and
configuration are exercised end-to-end rather than mocked.
src/ai_company/engine/coordination/dispatchers.py (1)

163-164: ⚠️ Potential issue | 🟡 Minor

Add minimal logging before re-raising MemoryError/RecursionError.

These except MemoryError, RecursionError: raise blocks skip logging entirely. While allocating during MemoryError is risky, a static low-allocation log helps trace fatal failures. Note: the relevant snippet shows ai_company.memory.errors.MemoryError shadows the builtin—verify which exception this catches.

#!/bin/bash
# Check if the builtin MemoryError is intended or the custom one from memory.errors
rg -n "from ai_company.memory.errors import" src/ai_company/engine/coordination/dispatchers.py
rg -n "^import " src/ai_company/engine/coordination/dispatchers.py | head -20
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/engine/coordination/dispatchers.py` around lines 163 - 164,
Before re-raising the caught MemoryError/RecursionError in the except block that
currently reads "except MemoryError, RecursionError: raise", add a minimal,
low-allocation log statement (use a module-level logger created once, e.g.
logger = logging.getLogger(__name__)) to record the exception type and a short
message, then re-raise; ensure the log call is simple (logger.critical or
logger.error) to avoid heavy allocations. Also verify which MemoryError is being
caught (the custom ai_company.memory.errors.MemoryError vs the built-in) by
checking imports in this module and, if the custom type is intended, consider
logging its module name in the message for clarity. Finally, keep the log
content minimal so it is safe during low-memory conditions and preserve the
existing re-raise behavior.
tests/unit/api/controllers/test_coordination.py (1)

129-166: 🧹 Nitpick | 🔵 Trivial

Async test methods work but mix sync/async patterns.

The test methods are async def while TestClient is synchronous. This works because pytest-asyncio handles the coroutine, and await agent_registry.register(agent) requires async context. The pattern is functional but slightly inconsistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/api/controllers/test_coordination.py` around lines 129 - 166, The
test mixes async setup (await agent_registry.register) with synchronous
TestClient usage (coordination_client.post). Make the test fully async: in
TestCoordinationControllerHappyPath.test_coordinate_task_success, switch to
using an async HTTP client (await coordination_client.post(...)) or ensure
coordination_client is an AsyncClient fixture, and await both POST calls to
"/api/v1/tasks" and f"/api/v1/tasks/{task_id}/coordinate"; keep the async
agent_registry.register and leave mock_coordinator.coordinate.return_value
assignment as-is so the flow is consistently awaited.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CLAUDE.md`:
- Line 154: The sentence listing event name constants uses "e.g." without a
following comma; update the paragraph that starts with "**Event names**: always
use constants..." (the list including PROVIDER_CALL_START, BUDGET_RECORD_ADDED,
etc.) to use "e.g.," (add a comma after "e.g.") so it reads "(e.g.,
`PROVIDER_CALL_START`...)" throughout that line.

In `@src/ai_company/api/controllers/coordination.py`:
- Around line 267-276: The current block always calls logger.info for
coordination outcomes; change it to log at a higher level when coordination
failed by using logger.warning (or logger.error if preferred) when
result.is_success is False: keep the same log_event selection
(API_COORDINATION_COMPLETED vs API_COORDINATION_FAILED) and the same structured
fields (task_id, topology=result.topology.value, is_success,
total_duration_seconds) but call logger.warning for failures and logger.info for
successes (adjust the call site where log_event and logger.info are used).

In `@src/ai_company/api/dto.py`:
- Line 360: The parent_task_id field is missing a max_length constraint which
makes it inconsistent with other task ID fields like
CreateApprovalRequest.task_id; update the parent_task_id declaration to include
the same max_length=128 constraint (i.e., adjust the parent_task_id field
definition to mirror the other task_id fields so it enforces max_length=128).

In `@src/ai_company/engine/coordination/factory.py`:
- Around line 77-91: The factory returns a _NoProviderDecompositionStrategy
instance that currently relies on structural typing; make its implementation
explicit by annotating/declaring _NoProviderDecompositionStrategy as
implementing the DecompositionStrategy protocol (e.g., inherit from
DecompositionStrategy or add an explicit typing annotation) so missing methods
are caught at definition time; to do this, import DecompositionStrategy at
runtime (move the DecompositionStrategy import out of the TYPE_CHECKING block)
and update the class declaration for _NoProviderDecompositionStrategy to
reference DecompositionStrategy, ensuring method signatures match the protocol.

In `@src/ai_company/engine/coordination/service.py`:
- Around line 179-180: The except clause currently written as "except
MemoryError, RecursionError:" should log a minimal static error before
re-raising; change it to the tuple form "except (MemoryError, RecursionError):",
call the module's logger (e.g. logger.error or process_logger.error) with a
concise static message like "Fatal coordination failure:
MemoryError/RecursionError encountered" (no mutable state), then re-raise the
exception; update the block around that except in service.py to mirror the
logging pattern used in dispatchers.py.

---

Duplicate comments:
In `@src/ai_company/engine/coordination/dispatchers.py`:
- Around line 163-164: Before re-raising the caught MemoryError/RecursionError
in the except block that currently reads "except MemoryError, RecursionError:
raise", add a minimal, low-allocation log statement (use a module-level logger
created once, e.g. logger = logging.getLogger(__name__)) to record the exception
type and a short message, then re-raise; ensure the log call is simple
(logger.critical or logger.error) to avoid heavy allocations. Also verify which
MemoryError is being caught (the custom ai_company.memory.errors.MemoryError vs
the built-in) by checking imports in this module and, if the custom type is
intended, consider logging its module name in the message for clarity. Finally,
keep the log content minimal so it is safe during low-memory conditions and
preserve the existing re-raise behavior.

In `@tests/integration/engine/test_coordination_wiring.py`:
- Around line 144-180: The test replaces the real coordinator with an AsyncMock
(coordinator.coordinate side_effect) which intentionally isolates behavior but
misses wiring between RootConfig.coordination and the build_coordinator factory;
add a separate integration test that constructs a RootConfig with real
coordination settings and calls build_coordinator() (referencing
build_coordinator and RootConfig) and then invokes the coordinator.coordinate
flow to ensure the factory wiring and configuration are exercised end-to-end
rather than mocked.

In `@tests/unit/api/controllers/test_coordination.py`:
- Around line 129-166: The test mixes async setup (await
agent_registry.register) with synchronous TestClient usage
(coordination_client.post). Make the test fully async: in
TestCoordinationControllerHappyPath.test_coordinate_task_success, switch to
using an async HTTP client (await coordination_client.post(...)) or ensure
coordination_client is an AsyncClient fixture, and await both POST calls to
"/api/v1/tasks" and f"/api/v1/tasks/{task_id}/coordinate"; keep the async
agent_registry.register and leave mock_coordinator.coordinate.return_value
assignment as-is so the flow is consistently awaited.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 328e4540-4814-4449-8e9e-90d114832482

📥 Commits

Reviewing files that changed from the base of the PR and between 1490ce7 and 3ab7f2d.

📒 Files selected for processing (19)
  • CLAUDE.md
  • docs/design/engine.md
  • docs/design/operations.md
  • docs/roadmap/index.md
  • src/ai_company/api/channels.py
  • src/ai_company/api/controllers/approvals.py
  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/api/dto.py
  • src/ai_company/api/state.py
  • src/ai_company/api/ws_models.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/observability/events/coordination.py
  • tests/integration/engine/test_coordination_wiring.py
  • tests/unit/api/controllers/test_coordination.py
  • tests/unit/api/test_dto.py
  • tests/unit/api/test_state.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Greptile Review
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (7)
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Build docs with: uv run zensical build (output: _site/docs/) — use without --strict until zensical/backlog#72 is resolved

Files:

  • docs/roadmap/index.md
  • docs/design/operations.md
  • docs/design/engine.md
docs/**

📄 CodeRabbit inference engine (CLAUDE.md)

Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)

Files:

  • docs/roadmap/index.md
  • docs/design/operations.md
  • docs/design/engine.md
docs/roadmap/**

📄 CodeRabbit inference engine (CLAUDE.md)

Roadmap in docs/roadmap/ (status, open questions, future vision)

Files:

  • docs/roadmap/index.md
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Python 3.14+ required; use PEP 649 native lazy annotations (no from __future__ import annotations)
Use PEP 758 except syntax: except A, B: (no parentheses) — enforced by ruff on Python 3.14
Add type hints to all public functions; enforce mypy strict mode
Use Google style docstrings required on all public classes and functions — enforced by ruff D rules
Create new objects instead of mutating existing ones; for non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves — never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use @computed_field for derived values instead of storing redundant fields; use NotBlankStr for all identifier/name fields (including optional and tuple variants)
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls) — prefer structured concurrency over bare create_task; existing code is being migrated incrementally
Enforce 88 character line length — ruff enforces this
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)

Files:

  • tests/unit/api/controllers/test_coordination.py
  • src/ai_company/api/channels.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/observability/events/coordination.py
  • src/ai_company/api/dto.py
  • src/ai_company/api/ws_models.py
  • tests/unit/api/test_dto.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/engine/coordination/factory.py
  • tests/unit/api/test_state.py
  • src/ai_company/api/state.py
  • tests/integration/engine/test_coordination_wiring.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/api/controllers/approvals.py
  • src/ai_company/api/controllers/coordination.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark tests with @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, and @pytest.mark.slow
Use pytest-xdist via -n auto for parallel test execution — ALWAYS include -n auto when running pytest, never run tests sequentially
Prefer @pytest.mark.parametrize for testing similar cases
Use test-provider, test-small-001, etc. in tests instead of real vendor names
Run unit tests with: uv run pytest tests/ -m unit -n auto
Run integration tests with: uv run pytest tests/ -m integration -n auto
Run e2e tests with: uv run pytest tests/ -m e2e -n auto
Run full test suite with coverage: uv run pytest tests/ -n auto --cov=ai_company --cov-fail-under=80

Files:

  • tests/unit/api/controllers/test_coordination.py
  • tests/unit/api/test_dto.py
  • tests/unit/api/test_state.py
  • tests/integration/engine/test_coordination_wiring.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Every module with business logic must import get_logger from ai_company.observability and define logger = get_logger(name); never use import logging or logging.getLogger() or print() in application code
Always use logger variable name (not _logger or log)
Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Use structured kwargs in logging: logger.info(EVENT, key=value) — never logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging
Maintain 80% minimum code coverage — enforced in CI
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small aliases
Lint Python code with: uv run ruff check src/ tests/
Auto-fix lint issues with: uv run ruff check src/ tests/ --fix
Format code with: uv run ruff format src/ tests/
Type-check with strict mode: uv run mypy src/ tests/

Files:

  • src/ai_company/api/channels.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/observability/events/coordination.py
  • src/ai_company/api/dto.py
  • src/ai_company/api/ws_models.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/api/state.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/api/controllers/approvals.py
  • src/ai_company/api/controllers/coordination.py
src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Retryable errors (is_retryable=True) include: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError; non-retryable errors raise immediately without retry; RetryExhaustedError signals all retries failed
Rate limiter respects RateLimitError.retry_after from providers — automatically pauses future requests

Files:

  • src/ai_company/api/channels.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/observability/events/coordination.py
  • src/ai_company/api/dto.py
  • src/ai_company/api/ws_models.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/api/state.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/api/controllers/approvals.py
  • src/ai_company/api/controllers/coordination.py
🧠 Learnings (11)
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to **/*.py : Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls) — prefer structured concurrency over bare create_task; existing code is being migrated incrementally

Applied to files:

  • tests/unit/api/controllers/test_coordination.py
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly

Applied to files:

  • src/ai_company/observability/events/coordination.py
  • src/ai_company/engine/coordination/factory.py
  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/ai_company/**/*.py : Retryable errors (is_retryable=True) include: RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError; non-retryable errors raise immediately without retry; RetryExhaustedError signals all retries failed

Applied to files:

  • src/ai_company/engine/agent_engine.py
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Every module with business logic must import get_logger from ai_company.observability and define logger = get_logger(__name__); never use import logging or logging.getLogger() or print() in application code

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use structured kwargs in logging: logger.info(EVENT, key=value) — never logger.info('msg %s', val)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising

Applied to files:

  • CLAUDE.md
  • src/ai_company/engine/coordination/dispatchers.py
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Always use logger variable name (not _logger or log)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Use DEBUG logging for object creation, internal flow, and entry/exit of key functions

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T11:20:53.699Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T11:20:53.699Z
Learning: Applies to src/**/*.py : Pure data models, enums, and re-exports do NOT need logging

Applied to files:

  • CLAUDE.md
🧬 Code graph analysis (7)
src/ai_company/engine/coordination/service.py (1)
src/ai_company/memory/errors.py (1)
  • MemoryError (13-14)
src/ai_company/api/dto.py (1)
src/ai_company/config/schema.py (1)
  • _validate_unique_agent_names (538-550)
tests/unit/api/test_dto.py (1)
src/ai_company/api/dto.py (8)
  • ApiResponse (39-57)
  • ApproveRequest (247-256)
  • CoordinateTaskRequest (274-313)
  • CoordinationPhaseResponse (316-342)
  • CoordinationResultResponse (345-373)
  • success (55-57)
  • success (94-96)
  • is_success (371-373)
tests/unit/api/test_state.py (3)
src/ai_company/api/state.py (4)
  • coordinator (153-155)
  • has_coordinator (158-160)
  • agent_registry (163-165)
  • has_agent_registry (168-170)
src/ai_company/api/errors.py (1)
  • ServiceUnavailableError (68-74)
src/ai_company/hr/registry.py (1)
  • AgentRegistryService (32-262)
tests/integration/engine/test_coordination_wiring.py (7)
src/ai_company/api/app.py (1)
  • create_app (351-491)
src/ai_company/core/agent.py (1)
  • AgentIdentity (266-342)
src/ai_company/core/role.py (1)
  • Authority (33-61)
src/ai_company/hr/registry.py (1)
  • AgentRegistryService (32-262)
src/ai_company/providers/enums.py (1)
  • FinishReason (15-22)
src/ai_company/engine/coordination/models.py (2)
  • CoordinationPhaseResult (64-96)
  • CoordinationResult (128-198)
src/ai_company/engine/coordination/service.py (1)
  • coordinate (94-198)
src/ai_company/engine/coordination/dispatchers.py (1)
src/ai_company/memory/errors.py (1)
  • MemoryError (13-14)
src/ai_company/api/controllers/approvals.py (1)
src/ai_company/api/channels.py (1)
  • get_channels_plugin (33-40)
🪛 LanguageTool
CLAUDE.md

[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...

(EG_NO_COMMA)

🔇 Additional comments (33)
docs/roadmap/index.md (1)

9-9: LGTM! Documentation accurately reflects the new capability.

The addition of "multi-agent coordination" to the Agent engine's feature list correctly documents the MultiAgentCoordinator integration completed in this PR. The placement in "Current Status" is appropriate given the comprehensive testing and integration work described in the PR objectives.

docs/design/operations.md (1)

958-958: Coordination endpoint documentation looks correct.

Line 958 cleanly documents the new coordination trigger endpoint in the API surface.

src/ai_company/api/ws_models.py (1)

46-50: Coordination WebSocket event types are well integrated.

The new lifecycle event variants are clear and follow the existing naming pattern.

src/ai_company/engine/agent_engine.py (1)

225-254: Coordinator delegation API is correctly implemented.

The optional coordinator accessor plus guarded coordinate() path (warning + ExecutionStateError when absent) is robust and matches runtime wiring goals.

src/ai_company/api/state.py (1)

152-170: New AppState service accessors are consistent and safe.

The coordinator and agent-registry properties correctly mirror the existing _require_service/has_* pattern.

tests/unit/api/test_state.py (1)

139-192: Good unit coverage for new AppState properties.

These tests correctly exercise both success and ServiceUnavailableError paths for coordinator and agent registry.

src/ai_company/observability/events/coordination.py (1)

18-18: Event constant addition is clean and aligned.

COORDINATION_FACTORY_BUILT extends the coordination observability namespace in the expected way.

src/ai_company/api/channels.py (1)

33-40: Shared ChannelsPlugin lookup helper is a solid addition.

This centralizes plugin resolution logic and keeps controller-side usage cleaner.

docs/design/engine.md (1)

411-413: Engine design docs are updated in the right places.

The new coordinator delegation note and added coordination config keys are clearly reflected.

Also applies to: 793-796

src/ai_company/engine/coordination/dispatchers.py (2)

173-178: LGTM: Added exc_info=True to warning logs.

The addition of exc_info=True improves observability by including stack traces in error logs, aligning with best practices for debugging coordination failures.


210-211: Same pattern: MemoryError/RecursionError re-raises without logging.

These blocks follow the same pattern as lines 163-164. The past review comment applies to all these locations.

Also applies to: 253-254, 336-337, 665-666, 769-770

tests/unit/api/test_dto.py (3)

186-236: LGTM: Comprehensive validation tests for CoordinateTaskRequest.

Tests cover defaults, non-empty agent_names, max_length constraints, duplicate name rejection (case-insensitive), and parametrized bounds testing for max_subtasks and max_concurrency_per_wave. Good use of @pytest.mark.parametrize for edge cases.


239-272: LGTM: CoordinationPhaseResponse consistency validation tests.

Tests properly verify the model validator rejects inconsistent success/error combinations and accepts valid configurations.


275-318: LGTM: CoordinationResultResponse computed field tests.

Tests validate the is_success computed field behavior (all pass vs. any failure) and the min_length=1 constraint on phases.

CLAUDE.md (1)

101-107: LGTM: Package structure documentation updated.

The api/ and engine/ descriptions now accurately reflect the coordination endpoint and multi-agent coordination pipeline additions.

src/ai_company/engine/coordination/service.py (1)

517-527: LGTM: Skip handling for None rollup uses appropriate event.

The code now logs COORDINATION_PHASE_COMPLETED with a descriptive note when skipping due to rollup being None, which is semantically correct—the phase didn't fail, it was skipped.

tests/integration/engine/test_coordination_wiring.py (1)

211-245: LGTM: API interaction validates response structure.

The test properly validates response status codes, JSON structure, and field types. Assertions cover parent_task_id, topology (excluding AUTO), total_duration_seconds, and phases presence.

tests/unit/api/controllers/test_coordination.py (2)

238-296: LGTM: Error scenario coverage is thorough.

Tests cover task not found (404), unknown agent name (422), no active agents (422), and coordination phase error (422). Assertions validate both status codes and error message content.


331-348: LGTM: 503 test validates missing coordinator handling.

This test correctly uses the shared test_client fixture (which lacks a coordinator) to verify the 503 response when coordination is attempted without a configured coordinator.

src/ai_company/api/controllers/approvals.py (2)

47-66: LGTM: Refactored _require_channels_plugin with proper error handling.

The helper now delegates to get_channels_plugin() and logs before raising RuntimeError when the plugin is missing. This aligns with the coding guideline requiring error paths to log before raising.


95-109: LGTM: Best-effort publish with proper exception handling.

The updated _publish_approval_event correctly:

  1. Calls _require_channels_plugin which logs if missing
  2. Re-raises MemoryError/RecursionError immediately (PEP 758 syntax)
  3. Logs other failures with exc_info=True for debugging
src/ai_company/api/controllers/coordination.py (5)

45-52: LGTM: Task ID length validation at API boundary.

The _validate_task_id helper rejects oversized task IDs early, preventing potential issues with downstream storage or processing. The 128-character limit is reasonable.


55-84: LGTM: Best-effort WebSocket publishing with proper exception handling.

The function correctly returns early if channels plugin is unavailable, re-raises fatal exceptions (MemoryError, RecursionError), and logs other failures with exc_info=True.


139-156: LGTM: Service availability checks before proceeding.

The endpoint validates both coordinator and agent_registry availability upfront with appropriate logging and 503 responses. This prevents confusing errors later in the flow.


181-196: Task engine access order is acceptable given upfront checks.

The past review noted that app_state.task_engine could raise 503 before a meaningful coordinator error. However, lines 141-155 now validate both has_coordinator and has_agent_registry upfront, so users get coordinator-specific 503s first. The task_engine access here is guarded by those checks in practice.


279-324: LGTM: Agent resolution with clear error handling.

The _resolve_agents method handles both explicit agent name lookup (with 422 for unknown names) and fallback to all active agents (with 422 if none available). Logging covers both failure paths.

src/ai_company/api/dto.py (3)

274-313: LGTM — well-structured request DTO with proper constraints.

The CoordinateTaskRequest correctly uses:

  • NotBlankStr tuple with bounds (min 1, max 50 agents)
  • Case-insensitive duplicate detection in _validate_unique_agent_names
  • Sensible limits for max_subtasks and max_concurrency_per_wave

316-342: LGTM — consistency validator is correct and past feedback addressed.

The error: NotBlankStr | None change ensures failed phases cannot have blank error messages, and the validator properly enforces mutual exclusivity between success=True (no error) and success=False (error required).


345-373: LGTM — computed is_success cleanly derives from phase results.

The @computed_field pattern aligns with guidelines, and the implementation correctly returns True only when all phases succeed.

src/ai_company/engine/coordination/factory.py (4)

1-24: LGTM — module structure and imports follow guidelines.

Event constants are correctly imported from domain-specific modules under ai_company.observability.events.*, matching the project convention. The TYPE_CHECKING guard keeps type-only imports clean.


47-74: LGTM — placeholder strategy correctly logs and raises on decomposition attempt.

The _NoProviderDecompositionStrategy provides clear error messaging and uses the correct DECOMPOSITION_FAILED event constant. The # noqa: ARG002 annotations correctly document that unused parameters are intentional for protocol conformance.


94-108: LGTM — workspace service construction follows conditional pattern correctly.

Returns None when either dependency is missing, allowing the coordinator to function without workspace isolation when not configured.


111-180: LGTM — well-documented factory with clear wiring sequence.

The build_coordinator function:

  • Uses keyword-only arguments for clarity on a complex signature
  • Has a comprehensive Google-style docstring documenting the dependency tree
  • Logs at DEBUG level with structured kwargs before construction
  • Follows a deterministic wiring order matching the documented steps

The # noqa: PLR0913 is justified for a factory function that assembles multiple collaborating services.

Comment on lines +179 to +180
except MemoryError, RecursionError:
raise
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

MemoryError/RecursionError re-raised without logging.

Same pattern as dispatchers.py—consider a minimal static log before re-raising to aid debugging fatal coordination failures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/engine/coordination/service.py` around lines 179 - 180, The
except clause currently written as "except MemoryError, RecursionError:" should
log a minimal static error before re-raising; change it to the tuple form
"except (MemoryError, RecursionError):", call the module's logger (e.g.
logger.error or process_logger.error) with a concise static message like "Fatal
coordination failure: MemoryError/RecursionError encountered" (no mutable
state), then re-raise the exception; update the block around that except in
service.py to mirror the logging pattern used in dispatchers.py.

Connect the existing MultiAgentCoordinator to the config schema,
AppState, API layer, AgentEngine, and WebSocket events so it becomes
a usable runtime feature.

- Add CoordinationSectionConfig bridging YAML coordination section
  to per-run CoordinationConfig
- Add build_coordinator() factory for constructing the full
  dependency tree from config + AgentEngine
- Add coordinator param, property, and coordinate() method to
  AgentEngine
- Extend AppState and create_app() with coordinator +
  agent_registry slots
- Add coordination WsEventType entries and API event constants
- Add CoordinateTaskRequest/CoordinationResultResponse DTOs
- Add CoordinationController with POST /tasks/{id}/coordinate
- Export new public API from coordination package
- Add coordination field to RootConfig and config defaults
Pre-reviewed by 11 agents, 30 findings addressed:

- Add get_strategy_name() to _NoProviderDecompositionStrategy (bug fix)
- Add MemoryError/RecursionError guards to service.py (4 locations)
  and dispatchers.py (6 locations) — pre-existing safety gaps
- Split coordinate_task into helper methods (<50 lines each)
- Fix fail_fast: bool → bool | None to preserve section config defaults
- Use API_WS_SEND_FAILED for WebSocket publish errors
- Add log-before-raise on 3 error paths (controller, engine, factory)
- Add extra="forbid" to CoordinationSectionConfig
- DTOs: NotBlankStr for topology, computed_field for is_success,
  success/error validator, ge=0.0 constraints, min_length=1 on phases
- Add min_length=1 + uniqueness validator on agent_names
- Add le=50 upper bound on max_concurrency_per_wave
- Update CLAUDE.md: api/ + engine/ descriptions, 4 new event constants
- Add RootConfig docstring for coordination field
- Add test: is_success=False path, provider/model-only factory edges,
  extra="forbid" validation, replace broad pytest.raises(Exception)
- Log _phase_update_parent skip when rollup is None
Critical (6):
- Fix misused COORDINATION_STARTED event in factory.py (→ DECOMPOSITION_FAILED
  and new COORDINATION_FACTORY_BUILT)
- Branch log event based on result.is_success in controller _execute
- Fix _phase_update_parent skip event (COORDINATION_PHASE_STARTED →
  COORDINATION_PHASE_COMPLETED)
- Sanitize internal CoordinationPhaseError messages in API responses and
  WS payloads to prevent info leakage
- Add exc_info=True to 6 warning log calls in dispatchers.py
- Change CoordinationPhaseResponse.error to NotBlankStr | None

Major (14):
- Add has_agent_registry property + early-exit guard in controller
- Add task_id length validation (_MAX_TASK_ID_LEN = 128)
- Add max_length=50 to agent_names tuple in CoordinateTaskRequest
- Add DTO validation tests (CoordinateTaskRequest, CoordinationPhaseResponse,
  CoordinationResultResponse)
- Strengthen test_coordinate_with_specific_agents assertion
- Update AppState docstring with coordinator/agent_registry
- Fix MultiAgentCoordinator docstring (peer → dual-access)
- Fix integration test docstring (build_coordinator → mock)
- Update design spec: API Surface table, YAML config, AgentEngine section
- Update roadmap with multi-agent coordination
- Extract helpers from build_coordinator (under 50-line convention)

Medium (4):
- Add comment for reserved COORDINATION_PHASE_COMPLETED WsEventType
- Extract shared get_channels_plugin to channels.py
- Update CLAUDE.md with COORDINATION_FACTORY_BUILT event
- Add coordinator param to AgentEngine docstring Args
- Fix mypy errors in test_dto.py (type: ignore for dict unpacking)
- Add WS COORDINATION_FAILED event for unexpected exceptions in _execute
- Log at WARNING level (not INFO) when coordination result is_success=False
- Add max_length=128 to CoordinationResultResponse.parent_task_id
- Make _NoProviderDecompositionStrategy explicitly implement DecompositionStrategy
- Add logging to MemoryError/RecursionError handlers in service.py and dispatchers.py
- Add build_coordinator() factory wiring integration tests
- Fix CLAUDE.md "e.g." → "e.g.," and merge conflict resolution
Copilot AI review requested due to automatic review settings March 14, 2026 15:45
@Aureliolo Aureliolo force-pushed the feat/coordinator-runtime-wiring branch from 3ab7f2d to e801bea Compare March 14, 2026 15:45
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 14, 2026 15:46 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds first-class multi-agent coordination wiring from config → engine services → API endpoint, including DTOs/events and test coverage for the new coordination surface area.

Changes:

  • Introduce CoordinationSectionConfig (RootConfig YAML section) and a build_coordinator() factory to wire MultiAgentCoordinator.
  • Add API endpoint POST /api/v1/tasks/{task_id}/coordinate with request/response DTOs and WebSocket event publishing.
  • Extend AppState / AgentEngine to optionally expose coordinator access and convenience delegation, plus unit/integration tests and docs updates.

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/unit/engine/test_coordination_section_config.py Unit tests for coordination YAML section defaults, validation, and conversion into runtime config.
tests/unit/engine/test_coordination_factory.py Unit tests for coordinator factory wiring and placeholder decomposition strategy behavior.
tests/unit/engine/test_agent_engine.py Adds tests for AgentEngine.coordinator and AgentEngine.coordinate() delegation/error path.
tests/unit/config/conftest.py Ensures RootConfig factory includes the new coordination section for config tests.
tests/unit/api/test_state.py Adds tests for new AppState coordinator and has_coordinator helpers.
tests/unit/api/test_dto.py Adds validation tests for new coordination request/response DTOs.
tests/unit/api/controllers/test_coordination.py Controller-level tests for the new coordination endpoint (happy path + error cases).
tests/integration/engine/test_coordination_wiring.py Integration test covering bootstrap → app wiring → API call path for coordination.
src/ai_company/observability/events/coordination.py Adds coordination factory-built event constant.
src/ai_company/observability/events/api.py Adds API coordination lifecycle event constants.
src/ai_company/engine/coordination/service.py Updates coordinator docs and adds explicit handling for MemoryError/RecursionError; adjusts parent-update phase behavior.
src/ai_company/engine/coordination/section_config.py New YAML section model bridging config → runtime CoordinationConfig.
src/ai_company/engine/coordination/factory.py New factory to construct a fully wired MultiAgentCoordinator from runtime services + config.
src/ai_company/engine/coordination/dispatchers.py Improves fatal/system-error handling and exception logging detail during dispatcher phases.
src/ai_company/engine/coordination/init.py Exports new coordination factory + section config for public module surface.
src/ai_company/engine/agent_engine.py Adds optional coordinator dependency and coordinate() convenience method.
src/ai_company/config/schema.py Adds coordination section to RootConfig schema.
src/ai_company/config/defaults.py Adds empty coordination section to default config dict.
src/ai_company/api/ws_models.py Adds websocket event types for coordination lifecycle.
src/ai_company/api/state.py Adds coordinator + agent registry storage/accessors to AppState.
src/ai_company/api/dto.py Adds coordination request/response DTOs.
src/ai_company/api/controllers/coordination.py New controller implementing POST /tasks/{task_id}/coordinate.
src/ai_company/api/controllers/approvals.py Refactors channels plugin lookup to use shared get_channels_plugin().
src/ai_company/api/controllers/init.py Registers CoordinationController in ALL_CONTROLLERS.
src/ai_company/api/channels.py Adds shared get_channels_plugin() helper.
src/ai_company/api/app.py Extends app factory to accept coordinator + agent registry in state wiring.
docs/roadmap/index.md Updates roadmap to reflect coordination support in engine.
docs/design/operations.md Documents new coordination endpoint in API surface table.
docs/design/engine.md Documents AgentEngine.coordinate() and coordination YAML config fields.
CLAUDE.md Updates repo structure notes and logging event guidance to include new coordination events.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +525 to +531
logger.info(
COORDINATION_PHASE_COMPLETED,
phase="update_parent",
note="Skipped — rollup is None (rollup phase failed)",
success=True,
duration_seconds=0.0,
)
Comment on lines +253 to +259
except Exception:
_publish_ws_event(
request,
WsEventType.COORDINATION_FAILED,
{"task_id": task_id, "error": "Unexpected coordination error"},
)
raise
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
src/ai_company/engine/coordination/dispatchers.py (1)

686-687: ⚠️ Potential issue | 🟠 Major

Log context before fatal re-raise in per-wave paths.

Line 686 and Line 790 re-raise fatal exceptions without any phase/wave log, so crash localization is lost for context-dependent execution.

💡 Suggested patch
@@
-        except MemoryError, RecursionError:
-            raise
+        except MemoryError, RecursionError:
+            logger.error(
+                COORDINATION_PHASE_FAILED,
+                phase=f"workspace_setup_wave_{wave_idx}",
+                wave_index=wave_idx,
+                error=(
+                    "Fatal: MemoryError or RecursionError during "
+                    "workspace setup"
+                ),
+            )
+            raise
@@
-        except MemoryError, RecursionError:
-            raise
+        except MemoryError, RecursionError:
+            logger.error(
+                COORDINATION_PHASE_FAILED,
+                phase=f"execute_wave_{wave_idx}",
+                wave_index=wave_idx,
+                error=(
+                    "Fatal: MemoryError or RecursionError during "
+                    "wave execution"
+                ),
+            )
+            raise

As per coding guidelines, "All error paths must log at WARNING or ERROR with context before raising".

Also applies to: 790-791

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/engine/coordination/dispatchers.py` around lines 686 - 687,
The except handlers that re-raise fatal exceptions (the "except MemoryError,
RecursionError:" blocks in the per-wave dispatch paths) must log contextual
information before re-raising; modify those handlers to call the module's logger
(e.g., logger or process_logger used elsewhere in dispatchers.py) to emit an
ERROR/WARNING containing the current phase/wave identifier variables (e.g.,
wave, phase, wave_id or similar local context available in the surrounding
function) and the exception details (exc_info=True or str(exception)) and then
re-raise; update both occurrences that match this pattern so every fatal raise
is preceded by a contextual log message naming the function and wave.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/design/engine.md`:
- Around line 793-796: The YAML snippet has fields (max_concurrency_per_wave,
fail_fast, enable_workspace_isolation, base_branch) incorrectly indented under
auto_topology_rules; move these keys to the same indentation level as
auto_topology_rules so they become top-level keys of the coordination section
per CoordinationSectionConfig, i.e., detach them from auto_topology_rules and
place them alongside it to match the expected schema used by the code that reads
CoordinationSectionConfig.

In `@src/ai_company/api/channels.py`:
- Around line 33-40: Update the get_channels_plugin docstring to Google-style
with Args and Returns: describe the request parameter type and purpose in an
"Args:" section (Request[Any, Any, Any] representing the incoming
Starlette/FastAPI request) and add a "Returns:" section that documents it
returns a ChannelsPlugin instance or None; keep the existing one-line summary
and ensure the docstring mentions the ChannelsPlugin class and the function name
get_channels_plugin for clarity.

In `@src/ai_company/api/controllers/coordination.py`:
- Around line 261-275: The endpoint only publishes terminal WS events but never
emits the per-phase event "coordination.phase_completed", so add a
_publish_ws_event call for WsEventType.COORDINATION_PHASE_COMPLETED (or the
string "coordination.phase_completed") before emitting the terminal event; use
the existing request, include task_id, phase identifier/name (from result.phase
or result.completed_phases), topology (result.topology.value), is_success, and
duration so dashboards can observe intermediate phase completions, and if result
contains a list of completed phases loop and publish one event per phase prior
to the final _publish_ws_event call.
- Around line 253-259: The except Exception handler that calls _publish_ws_event
with WsEventType.COORDINATION_FAILED should also log the caught exception and
contextual data before re-raising; modify the except block around
_publish_ws_event (referencing _publish_ws_event,
WsEventType.COORDINATION_FAILED and task_id/request) to call the module logger
(e.g., logger.error or logger.exception) including the exception info and
task_id/request context, then publish the WS event and re-raise.

In `@src/ai_company/engine/coordination/factory.py`:
- Around line 77-109: The factory currently silently degrades when only one of
the paired deps is provided; update _build_decomposition_strategy to raise a
clear ValueError if exactly one of provider or decomposition_model is non-None
(instead of returning _NoProviderDecompositionStrategy), and similarly update
_build_workspace_service to raise a ValueError if exactly one of
workspace_strategy or workspace_config is non-None (instead of returning None);
reference the existing symbols LlmDecompositionStrategy,
_NoProviderDecompositionStrategy, WorkspaceIsolationService,
_build_decomposition_strategy and _build_workspace_service when making the
change so callers fail fast at construction with an explicit error message
indicating the missing partner dependency.

---

Duplicate comments:
In `@src/ai_company/engine/coordination/dispatchers.py`:
- Around line 686-687: The except handlers that re-raise fatal exceptions (the
"except MemoryError, RecursionError:" blocks in the per-wave dispatch paths)
must log contextual information before re-raising; modify those handlers to call
the module's logger (e.g., logger or process_logger used elsewhere in
dispatchers.py) to emit an ERROR/WARNING containing the current phase/wave
identifier variables (e.g., wave, phase, wave_id or similar local context
available in the surrounding function) and the exception details (exc_info=True
or str(exception)) and then re-raise; update both occurrences that match this
pattern so every fatal raise is preceded by a contextual log message naming the
function and wave.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e3045e4a-819e-4900-97a9-6871dfd97bdd

📥 Commits

Reviewing files that changed from the base of the PR and between 3ab7f2d and e801bea.

📒 Files selected for processing (30)
  • CLAUDE.md
  • docs/design/engine.md
  • docs/design/operations.md
  • docs/roadmap/index.md
  • src/ai_company/api/app.py
  • src/ai_company/api/channels.py
  • src/ai_company/api/controllers/__init__.py
  • src/ai_company/api/controllers/approvals.py
  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/api/dto.py
  • src/ai_company/api/state.py
  • src/ai_company/api/ws_models.py
  • src/ai_company/config/defaults.py
  • src/ai_company/config/schema.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/engine/coordination/__init__.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/engine/coordination/section_config.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/observability/events/api.py
  • src/ai_company/observability/events/coordination.py
  • tests/integration/engine/test_coordination_wiring.py
  • tests/unit/api/controllers/test_coordination.py
  • tests/unit/api/test_dto.py
  • tests/unit/api/test_state.py
  • tests/unit/config/conftest.py
  • tests/unit/engine/test_agent_engine.py
  • tests/unit/engine/test_coordination_factory.py
  • tests/unit/engine/test_coordination_section_config.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Agent
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Greptile Review
  • GitHub Check: Test (Python 3.14)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations in Python code—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax with except A, B: (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.
Add type hints to all public functions. Enforce strict mypy mode.
Maintain line length of 88 characters—enforced by ruff.
Handle errors explicitly—never silently swallow exceptions.

Files:

  • src/ai_company/api/app.py
  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/engine/coordination/section_config.py
  • src/ai_company/api/controllers/approvals.py
  • src/ai_company/api/dto.py
  • src/ai_company/api/ws_models.py
  • src/ai_company/api/channels.py
  • src/ai_company/api/controllers/__init__.py
  • tests/unit/api/test_state.py
  • tests/unit/api/test_dto.py
  • tests/unit/engine/test_coordination_section_config.py
  • src/ai_company/observability/events/coordination.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/api/state.py
  • src/ai_company/engine/coordination/factory.py
  • tests/unit/config/conftest.py
  • tests/integration/engine/test_coordination_wiring.py
  • src/ai_company/config/schema.py
  • tests/unit/engine/test_coordination_factory.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/observability/events/api.py
  • tests/unit/api/controllers/test_coordination.py
  • src/ai_company/engine/coordination/service.py
  • tests/unit/engine/test_agent_engine.py
  • src/ai_company/config/defaults.py
  • src/ai_company/engine/coordination/__init__.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Use Google-style docstrings on all public classes and functions—required and enforced by ruff D rules.
Immutability: Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Distinguish between config and runtime state: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 conventions: BaseModel, model_validator, computed_field, ConfigDict. Use @computed_field for derived values instead of storing redundant fields. Use NotBlankStr (from core.types) for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task. Existing code is being migrated incrementally.
Keep functions under 50 lines and files under 800 lines.
Validate at system boundaries: user input, external APIs, config files.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. In tests, use test-provider, test-small-001, etc. Vendor names may only appear in: (1) Operations design page provider list, (2) .claude/ skill/agent files, (3) thir...

Files:

  • src/ai_company/api/app.py
  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/engine/coordination/section_config.py
  • src/ai_company/api/controllers/approvals.py
  • src/ai_company/api/dto.py
  • src/ai_company/api/ws_models.py
  • src/ai_company/api/channels.py
  • src/ai_company/api/controllers/__init__.py
  • src/ai_company/observability/events/coordination.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/api/state.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/config/schema.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/observability/events/api.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/config/defaults.py
  • src/ai_company/engine/coordination/__init__.py
src/ai_company/!(observability)/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/!(observability)/**/*.py: Every module with business logic must include from ai_company.observability import get_logger and set logger = get_logger(__name__). Never use import logging, logging.getLogger(), or print() in application code. Use the variable name logger (not _logger or log).
Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider). Import directly: from ai_company.observability.events.<domain> import EVENT_CONSTANT.
Use structured logging kwargs: logger.info(EVENT, key=value)—never use format strings like logger.info("msg %s", val).
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO level.
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions.

Files:

  • src/ai_company/api/app.py
  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/engine/coordination/section_config.py
  • src/ai_company/api/controllers/approvals.py
  • src/ai_company/api/dto.py
  • src/ai_company/api/ws_models.py
  • src/ai_company/api/channels.py
  • src/ai_company/api/controllers/__init__.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/api/state.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/config/schema.py
  • src/ai_company/engine/agent_engine.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/config/defaults.py
  • src/ai_company/engine/coordination/__init__.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Apply pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow to categorize tests.
Maintain 80% minimum code coverage—enforced in CI. Use @pytest.mark.parametrize for testing similar cases.

Files:

  • tests/unit/api/test_state.py
  • tests/unit/api/test_dto.py
  • tests/unit/engine/test_coordination_section_config.py
  • tests/unit/config/conftest.py
  • tests/integration/engine/test_coordination_wiring.py
  • tests/unit/engine/test_coordination_factory.py
  • tests/unit/api/controllers/test_coordination.py
  • tests/unit/engine/test_agent_engine.py
🧠 Learnings (10)
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/ai_company/api/app.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/observability/events/api.py
  • CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/engine/coordination/service.py
  • CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All state transitions must log at INFO level.

Applied to files:

  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/engine/coordination/service.py
  • CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/**/*.py : Distinguish between config and runtime state: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/ai_company/engine/coordination/section_config.py
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/providers/**/*.py : Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`. Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately. `RetryExhaustedError` signals all retries failed—catch at engine layer for fallback chains.

Applied to files:

  • tests/unit/config/conftest.py
  • src/ai_company/config/schema.py
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.

Applied to files:

  • tests/unit/api/controllers/test_coordination.py
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Every module with business logic must include `from ai_company.observability import get_logger` and set `logger = get_logger(__name__)`. Never use `import logging`, `logging.getLogger()`, or `print()` in application code. Use the variable name `logger` (not `_logger` or `log`).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use structured logging kwargs: `logger.info(EVENT, key=value)`—never use format strings like `logger.info("msg %s", val)`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use DEBUG logging for object creation, internal flow, and entry/exit of key functions.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.

Applied to files:

  • CLAUDE.md
🧬 Code graph analysis (19)
src/ai_company/api/app.py (3)
src/ai_company/engine/coordination/service.py (1)
  • MultiAgentCoordinator (50-589)
src/ai_company/api/state.py (3)
  • task_engine (120-122)
  • coordinator (153-155)
  • agent_registry (163-165)
src/ai_company/hr/registry.py (1)
  • AgentRegistryService (32-262)
src/ai_company/api/controllers/coordination.py (9)
src/ai_company/api/channels.py (1)
  • get_channels_plugin (33-40)
src/ai_company/api/dto.py (4)
  • CoordinateTaskRequest (274-313)
  • success (55-57)
  • success (94-96)
  • is_success (371-373)
src/ai_company/api/errors.py (3)
  • ApiValidationError (32-38)
  • NotFoundError (23-29)
  • ServiceUnavailableError (68-74)
src/ai_company/api/ws_models.py (2)
  • WsEvent (53-78)
  • WsEventType (20-50)
src/ai_company/engine/coordination/models.py (2)
  • CoordinationContext (30-61)
  • CoordinationResult (128-198)
src/ai_company/engine/errors.py (1)
  • CoordinationPhaseError (132-153)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/api/state.py (4)
  • has_coordinator (158-160)
  • has_agent_registry (168-170)
  • coordinator (153-155)
  • agent_registry (163-165)
src/ai_company/engine/coordination/service.py (1)
  • coordinate (94-203)
src/ai_company/engine/coordination/section_config.py (2)
src/ai_company/engine/coordination/config.py (1)
  • CoordinationConfig (8-38)
src/ai_company/engine/routing/models.py (1)
  • AutoTopologyConfig (160-207)
src/ai_company/api/controllers/approvals.py (1)
src/ai_company/api/channels.py (1)
  • get_channels_plugin (33-40)
src/ai_company/api/dto.py (1)
src/ai_company/config/schema.py (1)
  • _validate_unique_agent_names (538-550)
src/ai_company/api/controllers/__init__.py (1)
src/ai_company/api/controllers/coordination.py (1)
  • CoordinationController (109-334)
tests/unit/api/test_state.py (2)
src/ai_company/api/state.py (2)
  • coordinator (153-155)
  • has_coordinator (158-160)
src/ai_company/api/errors.py (1)
  • ServiceUnavailableError (68-74)
tests/unit/api/test_dto.py (1)
src/ai_company/api/dto.py (6)
  • CoordinateTaskRequest (274-313)
  • CoordinationPhaseResponse (316-342)
  • CoordinationResultResponse (345-373)
  • success (55-57)
  • success (94-96)
  • is_success (371-373)
tests/unit/engine/test_coordination_section_config.py (3)
src/ai_company/engine/coordination/config.py (1)
  • CoordinationConfig (8-38)
src/ai_company/engine/coordination/section_config.py (2)
  • CoordinationSectionConfig (15-83)
  • to_coordination_config (57-83)
src/ai_company/engine/routing/models.py (1)
  • AutoTopologyConfig (160-207)
src/ai_company/engine/coordination/dispatchers.py (1)
src/ai_company/memory/errors.py (1)
  • MemoryError (13-14)
src/ai_company/api/state.py (4)
src/ai_company/engine/coordination/service.py (1)
  • MultiAgentCoordinator (50-589)
src/ai_company/hr/registry.py (1)
  • AgentRegistryService (32-262)
src/ai_company/engine/agent_engine.py (1)
  • coordinator (226-228)
tests/unit/api/controllers/test_coordination.py (1)
  • agent_registry (80-81)
src/ai_company/engine/coordination/factory.py (6)
src/ai_company/engine/decomposition/protocol.py (1)
  • DecompositionStrategy (14-40)
src/ai_company/engine/errors.py (1)
  • DecompositionError (46-47)
src/ai_company/config/schema.py (1)
  • TaskAssignmentConfig (319-383)
src/ai_company/engine/decomposition/models.py (2)
  • DecompositionContext (262-287)
  • DecompositionPlan (66-122)
src/ai_company/engine/workspace/protocol.py (1)
  • WorkspaceIsolationStrategy (14-90)
src/ai_company/providers/protocol.py (1)
  • CompletionProvider (21-80)
tests/unit/config/conftest.py (1)
src/ai_company/engine/coordination/section_config.py (1)
  • CoordinationSectionConfig (15-83)
tests/integration/engine/test_coordination_wiring.py (4)
src/ai_company/api/app.py (1)
  • create_app (351-491)
src/ai_company/hr/registry.py (1)
  • AgentRegistryService (32-262)
src/ai_company/engine/coordination/factory.py (1)
  • build_coordinator (111-180)
src/ai_company/engine/coordination/models.py (2)
  • CoordinationPhaseResult (64-96)
  • CoordinationResult (128-198)
src/ai_company/config/schema.py (1)
src/ai_company/engine/coordination/section_config.py (1)
  • CoordinationSectionConfig (15-83)
tests/unit/engine/test_coordination_factory.py (1)
src/ai_company/engine/errors.py (1)
  • DecompositionError (46-47)
src/ai_company/engine/coordination/service.py (1)
src/ai_company/memory/errors.py (1)
  • MemoryError (13-14)
tests/unit/engine/test_agent_engine.py (3)
src/ai_company/engine/agent_engine.py (2)
  • coordinator (226-228)
  • coordinate (230-253)
src/ai_company/api/state.py (1)
  • coordinator (153-155)
src/ai_company/engine/errors.py (1)
  • ExecutionStateError (17-18)
src/ai_company/engine/coordination/__init__.py (2)
src/ai_company/engine/coordination/factory.py (1)
  • build_coordinator (111-180)
src/ai_company/engine/coordination/section_config.py (1)
  • CoordinationSectionConfig (15-83)
🔇 Additional comments (35)
docs/roadmap/index.md (1)

9-9: Roadmap status update is accurate and consistent with the shipped feature set.

This documentation change correctly reflects the new runtime capability for multi-agent coordination.

src/ai_company/config/defaults.py (1)

42-42: LGTM!

The addition of the "coordination": {} default follows the established pattern for other config sections. An empty dict correctly allows CoordinationSectionConfig's Pydantic field defaults to take effect.

docs/design/engine.md (1)

411-413: LGTM!

The documentation for the optional coordinate() method delegation is clear and appropriately references the coordination documentation.

CLAUDE.md (3)

101-101: LGTM!

Package description updated to reflect the new coordination endpoint.


107-107: LGTM!

Engine package description accurately documents the new CoordinationSectionConfig bridge and build_coordinator factory.


154-154: LGTM!

Event constants list expanded with the new coordination events (API_COORDINATION_STARTED, API_COORDINATION_COMPLETED, API_COORDINATION_FAILED, API_COORDINATION_AGENT_RESOLVE_FAILED, COORDINATION_FACTORY_BUILT). The previously flagged "e.g." comma issue has been addressed.

src/ai_company/api/controllers/approvals.py (2)

47-66: LGTM!

The refactor to _require_channels_plugin correctly delegates to the shared get_channels_plugin helper and adds appropriate error handling with logging before raising. The docstring follows Google style with Args/Returns/Raises sections.


96-96: LGTM!

Call site updated to use the renamed helper function.

docs/design/operations.md (1)

967-967: LGTM!

API endpoint documentation added for the new coordination trigger endpoint. The description is accurate and placement in the table is appropriate.

tests/unit/config/conftest.py (2)

25-25: LGTM!

Import added for the new CoordinationSectionConfig model.


100-100: LGTM!

RootConfigFactory extended with the coordination field using default CoordinationSectionConfig(), following the established pattern for other config sections.

tests/unit/api/test_state.py (2)

139-164: LGTM!

Comprehensive test coverage for the new coordinator property and has_coordinator flag. Tests follow the established pattern from TestAppStateTaskEngine and cover both the error case (None) and success case (set).


167-192: LGTM!

Comprehensive test coverage for the new agent_registry property and has_agent_registry flag. Uses the actual AgentRegistryService class rather than a mock, which provides stronger type guarantees.

src/ai_company/engine/coordination/dispatchers.py (1)

163-169: Good fatal-error hardening in shared dispatch paths.

These branches now preserve fatal exceptions and add contextual phase logs (exc_info=True on non-fatal paths), which improves diagnosability without downcasting critical failures.

Also applies to: 182-183, 215-221, 234-235, 263-269, 275-276, 351-358, 366-367

src/ai_company/api/controllers/__init__.py (1)

13-13: Controller registry/export wiring looks correct.

CoordinationController is consistently imported, included in ALL_CONTROLLERS, and exposed via __all__.

Also applies to: 39-40, 53-53

tests/unit/engine/test_coordination_section_config.py (1)

15-122: Solid config test coverage.

The new tests cover defaults, frozen behavior, override precedence, and validation boundaries comprehensively.

src/ai_company/observability/events/coordination.py (1)

18-18: Event constant addition is clean and consistent.

COORDINATION_FACTORY_BUILT follows the existing coordination event naming scheme.

src/ai_company/config/schema.py (1)

24-24: Root config coordination wiring is correct.

The new coordination field is properly typed, documented, and initialized with a safe default factory.

Also applies to: 420-420, 532-535

src/ai_company/engine/coordination/section_config.py (1)

15-83: Section config design is well-structured.

This keeps company-level defaults immutable and cleanly derives per-run CoordinationConfig with override precedence.

Based on learnings, "Distinguish between config and runtime state: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model."

tests/unit/api/test_dto.py (1)

4-13: DTO validation tests are comprehensive.

The new cases correctly exercise uniqueness, bounds, consistency validators, and computed success behavior.

Also applies to: 186-318

src/ai_company/api/app.py (1)

38-40: App wiring for coordination dependencies looks correct.

create_app() and AppState integration for coordinator and agent_registry are consistent and safely optional.

Also applies to: 360-361, 376-377, 412-413

tests/unit/api/controllers/test_coordination.py (1)

1-348: LGTM!

The test file provides comprehensive coverage of the coordination controller, including:

  • Happy path scenarios (task coordination, specific agents, failed phases)
  • Error scenarios (404, 422, 503)
  • Proper use of @pytest.mark.unit markers on test classes
  • Clean helper functions for test data creation

The async/sync pattern with TestClient works because pytest-asyncio handles the coroutine registration calls while HTTP assertions are synchronous.

src/ai_company/engine/coordination/__init__.py (1)

18-46: LGTM!

The new public exports for build_coordinator and CoordinationSectionConfig are correctly added, maintaining alphabetical ordering in __all__.

tests/unit/engine/test_coordination_factory.py (1)

1-136: LGTM!

Comprehensive unit tests for the build_coordinator factory covering:

  • Basic coordinator creation
  • Various configuration combinations (provider/model, workspace deps, custom min_score)
  • Placeholder strategy behavior when provider is missing
  • Clear error message verification for _NoProviderDecompositionStrategy
src/ai_company/api/ws_models.py (1)

46-51: LGTM!

The new coordination WebSocket event types follow the established naming convention and are appropriately documented. The inline comment clarifying that COORDINATION_PHASE_COMPLETED is reserved for future use is helpful.

src/ai_company/observability/events/api.py (1)

44-49: LGTM!

The new API coordination event constants follow the established naming pattern and are correctly typed with Final[str].

tests/integration/engine/test_coordination_wiring.py (2)

218-254: Intentional mock usage documented.

The past review comment noted that AsyncMock is used instead of build_coordinator(). However, the module docstring (lines 3-10) explicitly explains this design decision: the real build_coordinator() is unit-tested separately (in TestBuildCoordinatorFactory), while this test focuses on the API-to-coordinator wiring path.

This separation of concerns is valid—the integration test verifies that:

  1. create_app correctly wires the coordinator
  2. The /coordinate endpoint correctly resolves agents and invokes the coordinator
  3. Response structure matches DTOs

The _mock_coordinate function (lines 232-251) provides task-id-aware responses, enabling realistic assertions without requiring LLM provider dependencies.


1-319: LGTM!

The integration test file provides solid coverage:

  • Factory verification (TestBuildCoordinatorFactory) tests real build_coordinator() output
  • Full wiring test (TestCoordinationWiring) validates bootstrap → API → coordination flow
  • Proper @pytest.mark.integration markers
  • Clean helper utilities for agents and mock provider
src/ai_company/api/state.py (1)

152-170: LGTM!

The new coordinator and agent_registry accessors follow the established _require_service pattern, and both include corresponding has_* properties for conditional access. The past review comment requesting has_agent_registry has been addressed.

src/ai_company/engine/coordination/service.py (3)

219-220: Consistent MemoryError, RecursionError handling across all phases.

The same PEP 758 except MemoryError, RecursionError: pattern is applied consistently in _phase_decompose, _phase_route, _phase_dispatch, _phase_rollup, and _phase_update_parent. This ensures fatal errors propagate without being wrapped in CoordinationPhaseError.

Assuming the builtin MemoryError concern above is verified, this pattern is correct.

Also applies to: 275-276, 416-417, 481-482, 573-574


522-532: Skip handling for _phase_update_parent correctly addressed.

The past review comment about semantic mismatch has been resolved:

  • Early return when self._task_engine is None (line 522-523)
  • When rollup is None, logs COORDINATION_PHASE_COMPLETED with note="Skipped — rollup is None" and success=True (lines 524-531)

This accurately reflects that the phase completed (by skipping) rather than failed.


179-185: The shadowing concern is not applicable here. src/ai_company/engine/coordination/service.py does not import ai_company.memory.errors.MemoryError, and no transitively-imported module brings it into scope. The except MemoryError, RecursionError: clause correctly catches the Python builtin MemoryError. The PEP 758 except syntax is correct per coding guidelines.

src/ai_company/engine/agent_engine.py (1)

225-253: Coordinator integration in AgentEngine is clean and fail-fast.

The new coordinator property plus coordinate() delegation is straightforward, typed, and handles missing wiring explicitly.

tests/unit/engine/test_agent_engine.py (1)

1320-1369: Coordinator test coverage is solid for the new engine surface.

This block validates both failure and delegation paths for AgentEngine.coordinate() and protects the new wiring contract.

src/ai_company/api/dto.py (1)

274-373: DTO validation and computed response semantics look correct.

The new coordination DTOs enforce strong request/response invariants and keep is_success derived from phases, which avoids redundant state.

Comment on lines +793 to +796
max_concurrency_per_wave: null # None = unlimited
fail_fast: false
enable_workspace_isolation: true
base_branch: main
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

YAML structure appears incorrect: these fields belong at coordination level, not under auto_topology_rules.

Based on CoordinationSectionConfig (see section_config.py), max_concurrency_per_wave, fail_fast, enable_workspace_isolation, and base_branch are top-level fields of the coordination section, not nested under auto_topology_rules. The current indentation suggests they're children of auto_topology_rules.

📝 Suggested fix
 coordination:
   topology: "auto"                    # auto, sas, centralized, decentralized, context_dependent
   auto_topology_rules:
     # sequential tasks -> always single-agent
     sequential_override: "sas"
     # parallel tasks -> select based on domain structure
     parallel_default: "centralized"
     # mixed tasks -> SAS backbone for sequential phases, delegates parallel sub-tasks
     mixed_default: "context_dependent"  # hybrid: not a single topology -- engine selects per-phase
-  max_concurrency_per_wave: null      # None = unlimited
-  fail_fast: false
-  enable_workspace_isolation: true
-  base_branch: main
+  max_concurrency_per_wave: null        # None = unlimited
+  fail_fast: false
+  enable_workspace_isolation: true
+  base_branch: main

Note: If the current indentation is intentional (2-space indent at coordination level), then the YAML is correct but the visual alignment is misleading. Consider adjusting for clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/engine.md` around lines 793 - 796, The YAML snippet has fields
(max_concurrency_per_wave, fail_fast, enable_workspace_isolation, base_branch)
incorrectly indented under auto_topology_rules; move these keys to the same
indentation level as auto_topology_rules so they become top-level keys of the
coordination section per CoordinationSectionConfig, i.e., detach them from
auto_topology_rules and place them alongside it to match the expected schema
used by the code that reads CoordinationSectionConfig.

Comment on lines +253 to +259
except Exception:
_publish_ws_event(
request,
WsEventType.COORDINATION_FAILED,
{"task_id": task_id, "error": "Unexpected coordination error"},
)
raise
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Log unexpected coordination exceptions before re-raising.

This broad error path publishes a WS failure event but does not log the exception context before raise, which weakens incident triage.

Suggested fix
-        except Exception:
+        except Exception as exc:
+            logger.exception(
+                API_COORDINATION_FAILED,
+                task_id=task_id,
+                error=f"{type(exc).__name__}: {exc}",
+            )
             _publish_ws_event(
                 request,
                 WsEventType.COORDINATION_FAILED,
                 {"task_id": task_id, "error": "Unexpected coordination error"},
             )
             raise

Based on learnings: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/api/controllers/coordination.py` around lines 253 - 259, The
except Exception handler that calls _publish_ws_event with
WsEventType.COORDINATION_FAILED should also log the caught exception and
contextual data before re-raising; modify the except block around
_publish_ws_event (referencing _publish_ws_event,
WsEventType.COORDINATION_FAILED and task_id/request) to call the module logger
(e.g., logger.error or logger.exception) including the exception info and
task_id/request context, then publish the WS event and re-raise.

Comment on lines +261 to +275
ws_event_type = (
WsEventType.COORDINATION_COMPLETED
if result.is_success
else WsEventType.COORDINATION_FAILED
)
_publish_ws_event(
request,
ws_event_type,
{
"task_id": task_id,
"topology": result.topology.value,
"is_success": result.is_success,
"total_duration_seconds": result.total_duration_seconds,
},
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

coordination.phase_completed is defined but never emitted.

The endpoint currently emits started + terminal events only. Per-phase progress events are missing, so dashboards cannot observe intermediate coordination phase completion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/api/controllers/coordination.py` around lines 261 - 275, The
endpoint only publishes terminal WS events but never emits the per-phase event
"coordination.phase_completed", so add a _publish_ws_event call for
WsEventType.COORDINATION_PHASE_COMPLETED (or the string
"coordination.phase_completed") before emitting the terminal event; use the
existing request, include task_id, phase identifier/name (from result.phase or
result.completed_phases), topology (result.topology.value), is_success, and
duration so dashboards can observe intermediate phase completions, and if result
contains a list of completed phases loop and publish one event per phase prior
to the final _publish_ws_event call.

- Revert MemoryError/RecursionError logging in dispatchers.py back to bare
  raise (logging itself can trigger MemoryError during low-memory conditions)
- Same revert in service.py coordinate() method
- Add logger.exception() with structured context before WS publish in
  coordination controller's generic except Exception handler
- Fix misleading success=True log when rollup is None — now logs as
  COORDINATION_PHASE_FAILED and appends failed phase to results
- Add fail-fast ValueError in factory when mismatched deps are provided
  (e.g. provider without decomposition_model or vice versa)
- Improve get_channels_plugin docstring to Google-style (Args/Returns)
- Fix engine.md YAML coordination config comment alignment
- Update factory tests to expect ValueError for mismatched deps
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
src/ai_company/api/controllers/coordination.py (1)

233-250: ⚠️ Potential issue | 🟠 Major

coordination.phase_completed is still never emitted.

Neither exc.partial_phases on failure nor result.phases on success is published as WsEventType.COORDINATION_PHASE_COMPLETED, so subscribers only ever see started and terminal events. That leaves the new phase-progress event type effectively dead and dashboards still miss intermediate coordination progress.

Also applies to: 266-280

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/api/controllers/coordination.py` around lines 233 - 250, When
handling coordination success and failure in coordination.py, emit
WsEventType.COORDINATION_PHASE_COMPLETED with the intermediate phase list: in
the except CoordinationPhaseError block (where exc is available) call
_publish_ws_event(request, WsEventType.COORDINATION_PHASE_COMPLETED, {"task_id":
task_id, "phases": exc.partial_phases}) before publishing
COORDINATION_FAILED/raising ApiValidationError, and in the success path (where
result is returned) call _publish_ws_event(request,
WsEventType.COORDINATION_PHASE_COMPLETED, {"task_id": task_id, "phases":
result.phases}) before publishing the terminal success event; ensure you
reference the existing _publish_ws_event helper and include task_id and the
phase list in the payload so subscribers see intermediate progress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/api/controllers/coordination.py`:
- Around line 233-250: The current except block turns every
CoordinationPhaseError into an ApiValidationError (4xx); instead, update the
except in coordination.py to distinguish user-correctable phases from
internal/server faults: inspect exc.phase (or an explicit flag on
CoordinationPhaseError if available) and only raise ApiValidationError for known
client-fixable phases (e.g., validation/input-related phases); for internal
phases (e.g., decomposition/provider resolution, runtime MultiAgentCoordinator
failures) log the error, publish the same WsEvent via _publish_ws_event with
WsEventType.COORDINATION_FAILED, and raise an appropriate server-side error (or
re-raise the original exception) so it surfaces as a 5xx; keep the existing
logger.warning and _publish_ws_event behavior but change the exception type
raised based on the phase/flag in the CoordinationPhaseError.

In `@src/ai_company/engine/coordination/dispatchers.py`:
- Around line 163-164: Add a brief inline comment above the except MemoryError,
RecursionError: block explaining that logging is intentionally omitted because
emitting logs may itself raise MemoryError or RecursionError and that these are
re-raised immediately per the fatal-error policy; explicitly note that these are
the built-in exceptions (not ai_company.memory.errors.MemoryError) to avoid
confusion and reference the coding guideline requiring documentation when
skipping logging in fatal error paths.

In `@src/ai_company/engine/coordination/factory.py`:
- Around line 185-191: The COORDINATION_FACTORY_BUILT telemetry is emitted too
early; move the logger.debug call so it runs only after both
_build_decomposition_strategy() and _build_workspace_service() complete
successfully (or alternatively emit a distinct "starting_build" event before
construction and keep COORDINATION_FACTORY_BUILT for post-success), i.e.,
relocate the logger.debug(COORDINATION_FACTORY_BUILT, ...) currently before the
calls to _build_decomposition_strategy and _build_workspace_service to after the
construction code path that produces the coordinator (or add a new pre-build
constant/event name and use that before starting the two build calls).
- Around line 96-103: Before raising the ValueError when exactly one of provider
or decomposition_model is supplied, log a structured warning or error that
includes which of the two was given and which is missing (e.g.,
logger.error("Invalid decomposition wiring: given=%s missing=%s", given,
missing) or logger.warning(...)), then raise; apply the same change to the other
fail-fast branch around the 126-141 region so every error path logs context (use
the module logger variable name `logger` and include the `provider` and
`decomposition_model` values/state in the log).

In `@tests/unit/engine/test_coordination_factory.py`:
- Around line 38-123: The tests currently only assert build_coordinator doesn't
raise; update the relevant tests (those using build_coordinator, e.g.,
test_with_provider_and_model, test_with_task_engine, test_with_workspace_deps,
test_custom_min_score, test_shutdown_manager_passed_to_executor) to also assert
that the provided inputs are propagated to collaborators: verify the provider
and decomposition_model are passed into the created coordinator/executor
(reference build_coordinator, provider, decomposition_model), that task_engine
is attached or injected (reference task_engine), that workspace_strategy and
workspace_config are wired into the workspace component created by
build_coordinator (reference workspace_strategy, workspace_config), that
TaskAssignmentConfig.min_score is used by the scorer (reference
TaskAssignmentConfig(min_score)), and that shutdown_manager is forwarded to the
executor (reference shutdown_manager); additionally add tests asserting that
passing only workspace_strategy or only workspace_config raises the expected
failure (mirror existing provider/model missing tests). Ensure assertions
inspect collaborator state or call args on the coordinator/MultiAgentCoordinator
or its created children rather than only isinstance checks.

---

Duplicate comments:
In `@src/ai_company/api/controllers/coordination.py`:
- Around line 233-250: When handling coordination success and failure in
coordination.py, emit WsEventType.COORDINATION_PHASE_COMPLETED with the
intermediate phase list: in the except CoordinationPhaseError block (where exc
is available) call _publish_ws_event(request,
WsEventType.COORDINATION_PHASE_COMPLETED, {"task_id": task_id, "phases":
exc.partial_phases}) before publishing COORDINATION_FAILED/raising
ApiValidationError, and in the success path (where result is returned) call
_publish_ws_event(request, WsEventType.COORDINATION_PHASE_COMPLETED, {"task_id":
task_id, "phases": result.phases}) before publishing the terminal success event;
ensure you reference the existing _publish_ws_event helper and include task_id
and the phase list in the payload so subscribers see intermediate progress.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: cfe1a0cc-daad-4a98-bf2b-3845472e0461

📥 Commits

Reviewing files that changed from the base of the PR and between e801bea and 6d1daea.

📒 Files selected for processing (7)
  • docs/design/engine.md
  • src/ai_company/api/channels.py
  • src/ai_company/api/controllers/coordination.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/engine/coordination/service.py
  • tests/unit/engine/test_coordination_factory.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Greptile Review
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations in Python code—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax with except A, B: (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.
Add type hints to all public functions. Enforce strict mypy mode.
Maintain line length of 88 characters—enforced by ruff.
Handle errors explicitly—never silently swallow exceptions.

Files:

  • tests/unit/engine/test_coordination_factory.py
  • src/ai_company/api/channels.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/api/controllers/coordination.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Apply pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow to categorize tests.
Maintain 80% minimum code coverage—enforced in CI. Use @pytest.mark.parametrize for testing similar cases.

Files:

  • tests/unit/engine/test_coordination_factory.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Use Google-style docstrings on all public classes and functions—required and enforced by ruff D rules.
Immutability: Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Distinguish between config and runtime state: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 conventions: BaseModel, model_validator, computed_field, ConfigDict. Use @computed_field for derived values instead of storing redundant fields. Use NotBlankStr (from core.types) for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task. Existing code is being migrated incrementally.
Keep functions under 50 lines and files under 800 lines.
Validate at system boundaries: user input, external APIs, config files.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. In tests, use test-provider, test-small-001, etc. Vendor names may only appear in: (1) Operations design page provider list, (2) .claude/ skill/agent files, (3) thir...

Files:

  • src/ai_company/api/channels.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/api/controllers/coordination.py
src/ai_company/!(observability)/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/!(observability)/**/*.py: Every module with business logic must include from ai_company.observability import get_logger and set logger = get_logger(__name__). Never use import logging, logging.getLogger(), or print() in application code. Use the variable name logger (not _logger or log).
Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider). Import directly: from ai_company.observability.events.<domain> import EVENT_CONSTANT.
Use structured logging kwargs: logger.info(EVENT, key=value)—never use format strings like logger.info("msg %s", val).
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO level.
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions.

Files:

  • src/ai_company/api/channels.py
  • src/ai_company/engine/coordination/service.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/api/controllers/coordination.py
🧠 Learnings (3)
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • src/ai_company/engine/coordination/service.py
  • src/ai_company/engine/coordination/factory.py
  • src/ai_company/engine/coordination/dispatchers.py
  • src/ai_company/api/controllers/coordination.py
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All state transitions must log at INFO level.

Applied to files:

  • src/ai_company/engine/coordination/service.py
  • src/ai_company/api/controllers/coordination.py
📚 Learning: 2026-03-14T15:34:22.487Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:34:22.487Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/ai_company/engine/coordination/factory.py
🧬 Code graph analysis (3)
tests/unit/engine/test_coordination_factory.py (3)
src/ai_company/config/schema.py (1)
  • TaskAssignmentConfig (319-383)
src/ai_company/engine/coordination/factory.py (3)
  • _NoProviderDecompositionStrategy (47-74)
  • build_coordinator (145-214)
  • decompose (59-74)
src/ai_company/engine/errors.py (1)
  • DecompositionError (46-47)
src/ai_company/engine/coordination/service.py (3)
src/ai_company/memory/errors.py (1)
  • MemoryError (13-14)
src/ai_company/engine/coordination/models.py (1)
  • CoordinationPhaseResult (64-96)
src/ai_company/api/dto.py (2)
  • success (55-57)
  • success (94-96)
src/ai_company/engine/coordination/dispatchers.py (1)
src/ai_company/memory/errors.py (1)
  • MemoryError (13-14)
🔇 Additional comments (4)
src/ai_company/api/channels.py (2)

7-30: Clean typing-only import setup.

Using TYPE_CHECKING for Request keeps runtime imports lean while preserving precise type hints.


33-47: get_channels_plugin() looks correct and well-documented.

The helper is concise, has a clear return contract, and the docstring now follows the expected public-function format.

As per coding guidelines, src/**/*.py: "Use Google-style docstrings on all public classes and functions—required and enforced by ruff D rules."

src/ai_company/engine/coordination/dispatchers.py (2)

173-178: LGTM! Good addition of exc_info=True for enhanced observability.

The exc_info=True parameter ensures tracebacks are captured in error logs, improving debuggability. This pattern is consistently applied across all exception handlers in the file.


210-211: Consistent application of the fatal error handling pattern.

These blocks follow the same pattern established at line 163-164: immediate re-raise of MemoryError/RecursionError without logging, with exc_info=True added to the subsequent general exception handlers. The consistency is good.

Also applies to: 253-254, 336-337, 665-666, 769-770

Comment on lines +233 to +250
except CoordinationPhaseError as exc:
logger.warning(
API_COORDINATION_FAILED,
task_id=task_id,
phase=exc.phase,
error=str(exc),
)
client_msg = f"Coordination failed at phase {exc.phase!r}"
_publish_ws_event(
request,
WsEventType.COORDINATION_FAILED,
{
"task_id": task_id,
"phase": exc.phase,
"error": client_msg,
},
)
raise ApiValidationError(client_msg) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't collapse internal coordination failures into ApiValidationError.

CoordinationPhaseError also wraps server-side faults from the coordinator stack (for example the no-provider decomposition path in src/ai_company/engine/coordination/factory.py and runtime failures wrapped by MultiAgentCoordinator). Turning the whole class into a 4xx response tells clients their request was invalid when the real problem is a 5xx/misconfiguration path. Reserve ApiValidationError for genuinely user-correctable phases and let internal failures surface as server errors.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/api/controllers/coordination.py` around lines 233 - 250, The
current except block turns every CoordinationPhaseError into an
ApiValidationError (4xx); instead, update the except in coordination.py to
distinguish user-correctable phases from internal/server faults: inspect
exc.phase (or an explicit flag on CoordinationPhaseError if available) and only
raise ApiValidationError for known client-fixable phases (e.g.,
validation/input-related phases); for internal phases (e.g.,
decomposition/provider resolution, runtime MultiAgentCoordinator failures) log
the error, publish the same WsEvent via _publish_ws_event with
WsEventType.COORDINATION_FAILED, and raise an appropriate server-side error (or
re-raise the original exception) so it surfaces as a 5xx; keep the existing
logger.warning and _publish_ws_event behavior but change the exception type
raised based on the phase/flag in the CoordinationPhaseError.

- Add inline comment to dispatchers.py explaining bare re-raise pattern
  for MemoryError/RecursionError (logging omitted intentionally as it
  may itself trigger MemoryError; these are built-in exceptions)
- Move COORDINATION_FACTORY_BUILT log to after successful construction
- Add structured logging before ValueError raises in factory mismatched
  deps paths (decomposition and workspace)
- Deepen factory tests: verify provider/model wiring into decomposition
  strategy, task_engine injection, workspace service creation, min_score
  forwarding to scorer, shutdown_manager/engine forwarding to executor
- Add workspace mismatched-deps tests (strategy-only, config-only)
Copilot AI review requested due to automatic review settings March 14, 2026 16:11
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 14, 2026 16:12 — with GitHub Actions Inactive
@Aureliolo Aureliolo merged commit 7a9e516 into main Mar 14, 2026
28 of 29 checks passed
@Aureliolo Aureliolo deleted the feat/coordinator-runtime-wiring branch March 14, 2026 16:14
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 14, 2026 16:14 — with GitHub Actions Inactive
Comment on lines +57 to +83
def to_coordination_config(
self,
*,
max_concurrency_per_wave: int | None = None,
fail_fast: bool | None = None,
) -> CoordinationConfig:
"""Convert to a per-run ``CoordinationConfig``.

Request-level overrides take precedence over section defaults.

Args:
max_concurrency_per_wave: Override for max concurrency.
fail_fast: Override for fail-fast behaviour.

Returns:
A ``CoordinationConfig`` with merged values.
"""
return CoordinationConfig(
max_concurrency_per_wave=(
max_concurrency_per_wave
if max_concurrency_per_wave is not None
else self.max_concurrency_per_wave
),
fail_fast=fail_fast if fail_fast is not None else self.fail_fast,
enable_workspace_isolation=self.enable_workspace_isolation,
base_branch=self.base_branch,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

topology field silently dropped — YAML setting has no runtime effect

CoordinationSectionConfig.topology is documented as "Default coordination topology" and exposed in the YAML coordination: block, but to_coordination_config() never forwards it to CoordinationConfig. CoordinationConfig has no topology field, and the factory only uses it for a logger.debug call. A user who sets topology: CENTRALIZED in their YAML gets no error and no effect — the TopologySelector still resolves topology dynamically from routing decisions.

Either:

  • Add a topology field to CoordinationConfig and pass it through here so TopologySelector can use it as a default/override, or
  • Remove the topology field from CoordinationSectionConfig (since it's already controlled through auto_topology_rules), or
  • Update the field description to make clear it is purely informational / a logging hint.

As-is, this is a broken configuration contract: operators configure a value that the system silently ignores.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/engine/coordination/section_config.py
Line: 57-83

Comment:
**`topology` field silently dropped — YAML setting has no runtime effect**

`CoordinationSectionConfig.topology` is documented as "Default coordination topology" and exposed in the YAML `coordination:` block, but `to_coordination_config()` never forwards it to `CoordinationConfig`. `CoordinationConfig` has no `topology` field, and the factory only uses it for a `logger.debug` call. A user who sets `topology: CENTRALIZED` in their YAML gets no error and no effect — the `TopologySelector` still resolves topology dynamically from routing decisions.

Either:
- Add a `topology` field to `CoordinationConfig` and pass it through here so `TopologySelector` can use it as a default/override, **or**
- Remove the `topology` field from `CoordinationSectionConfig` (since it's already controlled through `auto_topology_rules`), **or**
- Update the field description to make clear it is purely informational / a logging hint.

As-is, this is a broken configuration contract: operators configure a value that the system silently ignores.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +147 to +153
logger.warning(
COORDINATION_FACTORY_BUILT,
note="Mismatched workspace dependencies",
given=given,
missing=missing,
)
raise ValueError(msg)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong event constant on error path

COORDINATION_FACTORY_BUILT semantically means "factory construction succeeded", so using it here on the mismatched-deps error path is misleading. Log aggregators or dashboards that filter on this event name will silently include failed construction attempts alongside successful ones.

Compare with _build_decomposition_strategy directly above, which correctly uses DECOMPOSITION_FAILED for its analogous error path.

Suggested change
logger.warning(
COORDINATION_FACTORY_BUILT,
note="Mismatched workspace dependencies",
given=given,
missing=missing,
)
raise ValueError(msg)
logger.warning(
COORDINATION_FACTORY_BUILT,

should be replaced with a dedicated failure event (e.g. COORDINATION_FACTORY_BUILD_FAILED) or reuse an existing error-scoped constant.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/engine/coordination/factory.py
Line: 147-153

Comment:
**Wrong event constant on error path**

`COORDINATION_FACTORY_BUILT` semantically means "factory construction succeeded", so using it here on the mismatched-deps error path is misleading. Log aggregators or dashboards that filter on this event name will silently include failed construction attempts alongside successful ones.

Compare with `_build_decomposition_strategy` directly above, which correctly uses `DECOMPOSITION_FAILED` for its analogous error path.

```suggestion
        logger.warning(
            COORDINATION_FACTORY_BUILT,
```
should be replaced with a dedicated failure event (e.g. `COORDINATION_FACTORY_BUILD_FAILED`) or reuse an existing error-scoped constant.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +302 to +313
@model_validator(mode="after")
def _validate_unique_agent_names(self) -> Self:
"""Reject duplicate agent names."""
if self.agent_names is not None:
seen: set[str] = set()
for name in self.agent_names:
lower = name.lower()
if lower in seen:
msg = f"Duplicate agent name: {name!r}"
raise ValueError(msg)
seen.add(lower)
return self
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Case-insensitive dedup may reject valid distinct agents

_validate_unique_agent_names normalises names to lowercase before checking for duplicates (lower = name.lower()), but registry.get_by_name(name) is called with the original, un-normalised string. If the registry is case-sensitive, "Alice" and "alice" are two different agents, yet this validator would reject them as duplicates at the request boundary with a confusing Duplicate agent name: 'alice' error.

If the registry is case-insensitive, the dedup is correct but should document that assumption. If it is case-sensitive (or the behaviour is unspecified), the comparison should match registry semantics:

seen: set[str] = set()
for name in self.agent_names:
    if name in seen:           # exact-match — mirrors registry lookup
        msg = f"Duplicate agent name: {name!r}"
        raise ValueError(msg)
    seen.add(name)
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/api/dto.py
Line: 302-313

Comment:
**Case-insensitive dedup may reject valid distinct agents**

`_validate_unique_agent_names` normalises names to lowercase before checking for duplicates (`lower = name.lower()`), but `registry.get_by_name(name)` is called with the original, un-normalised string. If the registry is case-sensitive, `"Alice"` and `"alice"` are two different agents, yet this validator would reject them as duplicates at the request boundary with a confusing `Duplicate agent name: 'alice'` error.

If the registry is case-insensitive, the dedup is correct but should document that assumption. If it is case-sensitive (or the behaviour is unspecified), the comparison should match registry semantics:

```python
seen: set[str] = set()
for name in self.agent_names:
    if name in seen:           # exact-match — mirrors registry lookup
        msg = f"Duplicate agent name: {name!r}"
        raise ValueError(msg)
    seen.add(name)
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds first-class multi-agent coordination support to the API/runtime by wiring a MultiAgentCoordinator through config, app state, and a new /tasks/{task_id}/coordinate endpoint, plus extensive unit/integration coverage and documentation updates.

Changes:

  • Introduces coordination config bridge (CoordinationSectionConfig) and a build_coordinator() factory for fully wiring coordinator dependencies.
  • Adds coordination API endpoint + DTOs + WS event publication, and extends AppState / create_app() to carry coordinator + agent registry.
  • Improves coordination error handling/logging (incl. MemoryError/RecursionError re-raise guards) and expands observability event constants.

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/unit/engine/test_coordination_section_config.py Unit tests for new YAML coordination section config defaults/validation/merging.
tests/unit/engine/test_coordination_factory.py Unit tests for build_coordinator() wiring and placeholder decomposition strategy behavior.
tests/unit/engine/test_agent_engine.py Tests for new optional coordinator plumbing on AgentEngine.
tests/unit/config/conftest.py Updates RootConfig test factory to include coordination section.
tests/unit/api/test_state.py Adds coverage for AppState coordinator + agent registry accessors.
tests/unit/api/test_dto.py Adds validation tests for coordination request/response DTOs.
tests/unit/api/controllers/test_coordination.py Controller tests for happy-path and failure modes of the new endpoint.
tests/integration/engine/test_coordination_wiring.py End-to-end wiring test ensuring bootstrap → API → coordination path works.
src/ai_company/observability/events/coordination.py Adds factory event constant for coordination build observability.
src/ai_company/observability/events/api.py Adds API-level coordination event constants.
src/ai_company/engine/coordination/service.py Enhances coordinator pipeline error handling and update-parent phase behavior.
src/ai_company/engine/coordination/section_config.py New company-level coordination section config model + merge to per-run config.
src/ai_company/engine/coordination/factory.py New coordinator factory + placeholder decomposition strategy + workspace wiring logic.
src/ai_company/engine/coordination/dispatchers.py Adds MemoryError/RecursionError re-raise guards + richer exception logging.
src/ai_company/engine/coordination/init.py Re-exports build_coordinator and CoordinationSectionConfig.
src/ai_company/engine/agent_engine.py Adds optional coordinator dependency + coordinate() convenience method.
src/ai_company/config/schema.py Extends RootConfig schema with coordination section.
src/ai_company/config/defaults.py Adds default empty coordination config section.
src/ai_company/api/ws_models.py Adds WS event types for coordination lifecycle events.
src/ai_company/api/state.py Extends AppState with coordinator + agent registry services and accessors.
src/ai_company/api/dto.py Adds coordination request/response DTO models.
src/ai_company/api/controllers/coordination.py New REST endpoint to trigger coordination and publish WS events.
src/ai_company/api/controllers/approvals.py Refactors ChannelsPlugin lookup to use shared helper.
src/ai_company/api/controllers/init.py Registers the new coordination controller in ALL_CONTROLLERS.
src/ai_company/api/channels.py Adds shared get_channels_plugin() helper.
src/ai_company/api/app.py Plumbs coordinator + agent registry through create_app() into AppState.
docs/roadmap/index.md Updates roadmap to reflect multi-agent coordination as part of agent engine.
docs/design/operations.md Documents the new coordination endpoint in the API surface table.
docs/design/engine.md Documents engine-level coordinate() and coordination section config keys.
CLAUDE.md Updates repo module overview + event-constant guidance to include coordination additions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +70 to +73
logger.warning(
DECOMPOSITION_FAILED,
note="Decomposition attempted without LLM provider",
)
Comment on lines +143 to +153
msg = (
f"Workspace isolation requires both workspace_strategy and "
f"workspace_config, but only {given} was supplied (missing {missing})"
)
logger.warning(
COORDINATION_FACTORY_BUILT,
note="Mismatched workspace dependencies",
given=given,
missing=missing,
)
raise ValueError(msg)
Aureliolo added a commit that referenced this pull request Mar 15, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.2.0](v0.1.4...v0.2.0)
(2026-03-15)

##First probably usable release? Most likely not no and everything will break
### Features

* add /get/ installation page for CLI installer
([#413](#413))
([6a47e4a](6a47e4a))
* add cross-platform Go CLI for container lifecycle management
([#401](#401))
([0353d9e](0353d9e)),
closes [#392](#392)
* add explicit ScanOutcome signal to OutputScanResult
([#394](#394))
([be33414](be33414)),
closes [#284](#284)
* add meeting scheduler, event-triggered meetings, and Go CLI lint fixes
([#407](#407))
([5550fa1](5550fa1))
* wire MultiAgentCoordinator into runtime
([#396](#396))
([7a9e516](7a9e516))


### Bug Fixes

* CLA signatures branch + declutter repo root
([#409](#409))
([cabe953](cabe953))
* correct Release Please branch name in release workflow
([#410](#410))
([515d816](515d816))
* replace slsa-github-generator with attest-build-provenance, fix DAST
([#424](#424))
([eeaadff](eeaadff))
* resolve CodeQL path-injection alerts in Go CLI
([#412](#412))
([f41bf16](f41bf16))


### Refactoring

* rename package from ai_company to synthorg
([#422](#422))
([df27c6e](df27c6e)),
closes [#398](#398)


### Tests

* add fuzz and property-based testing across all layers
([#421](#421))
([115a742](115a742))


### CI/CD

* add SLSA L3 provenance for CLI binaries and container images
([#423](#423))
([d3dc75d](d3dc75d))
* bump the major group with 4 updates
([#405](#405))
([20c7a04](20c7a04))


### Maintenance

* bump github.com/spf13/cobra from 1.9.1 to 1.10.2 in /cli in the
minor-and-patch group
([#402](#402))
([e31edbb](e31edbb))
* narrow BSL Additional Use Grant and add CLA
([#408](#408))
([5ab15bd](5ab15bd)),
closes [#406](#406)

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: wire MultiAgentCoordinator into runtime (AgentEngine, API, config)

2 participants