feat: implement model routing engine by Aureliolo · Pull Request #99 · Aureliolo/synthorg

Aureliolo · 2026-03-01T21:52:05Z

Summary

Implements the model routing engine with 4 strategies: Manual, RoleBasedStrategy, CostAwareStrategy, and SmartStrategy
Adds ModelResolver for O(1) alias/ID-to-model lookup with immutable MappingProxyType index
Adds collision detection, structured logging on all error paths, and NotBlankStr validation
Makes seniority catalog tiers vendor-agnostic (small/medium/large instead of vendor model names)
All test fixtures use generic provider names and model IDs — no vendor coupling

Closes #6

Test plan

🤖 Generated with Claude Code

Add strategy-based LLM model routing with resolver, 4 strategies (manual, role_based, cost_aware, smart), and 83 unit tests at 94%+ coverage. Closes #6.

Apply 23 review items from local agents and external reviewers: - Add NotBlankStr validation on routing model string fields - Add collision detection in ModelResolver.from_config - Add missing logging on all error paths before raising - Freeze STRATEGY_MAP with MappingProxyType - Fix SmartStrategy priority order (global chain before cheapest) - Remove redundant fallback chain walk in RoleBasedStrategy - Use NoReturn type for _raise_exhausted - Rename seniority catalog tiers from vendor-specific (haiku/sonnet/opus) to generic (small/medium/large) - Make all test fixtures vendor-agnostic (no real provider names or model IDs) - Add vendor-agnostic fixture rule to CLAUDE.md

github-actions · 2026-03-01T21:52:15Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-03-01T21:52:21Z

Note

Currently processing new changes in this PR. This may take a few minutes, please wait...

📥 Commits

Reviewing files that changed from the base of the PR and between b17601b and 50c17a3.

📒 Files selected for processing (5)

src/ai_company/config/schema.py
src/ai_company/providers/routing/models.py
src/ai_company/providers/routing/strategies.py
tests/unit/providers/routing/test_models.py
tests/unit/providers/routing/test_strategies.py

_{✏️ Tip: You can disable in-progress messages and the fortune message in your review settings.}

Tip

You can validate your CodeRabbit configuration file in your editor.

If your editor has YAML language server, you can enable auto-completion and validation by adding # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json at the top of your CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Implements a new model routing subsystem (models, resolver, router, strategies, errors), adds routing observability events and config validation, updates seniority→model-tier mappings and template defaults, expands provider public exports, and adds comprehensive unit tests and routing fixtures. Also adds minor docs and testing guidance.

Changes

Cohort / File(s)	Summary
Documentation & Guidance `\.claude/skills/aurelio-review-pr/skill.md`, `CLAUDE.md`	Added logging-audit guidance and vendor-agnostic testing fixture guidance; docs only.
Core role mappings & tests `src/ai_company/core/role.py`, `src/ai_company/core/role_catalog.py`, `tests/unit/core/test_role.py`, `tests/unit/core/test_role_catalog.py`	Updated `typical_model_tier` values across seniority levels (haiku/sonnet/opus → small/medium/large) and added ROLE_LOOKUP_MISS warning; tests updated accordingly.
Observability events `src/ai_company/observability/events.py`	Added routing lifecycle event constants (router/resolver built, model resolved/resolution failed, decision made, fallback attempted/exhausted, no rule matched, budget exceeded, selection failed, strategy unknown).
Providers public API surface `src/ai_company/providers/__init__.py`, `src/ai_company/providers/routing/__init__.py`	Re-exported routing package: strategy names/map, strategy classes, resolver/router, models, and routing errors; expanded module all.
Routing errors & tests `src/ai_company/providers/routing/errors.py`, `tests/unit/providers/routing/test_errors.py`	New routing error hierarchy: `RoutingError`, `ModelResolutionError`, `NoAvailableModelError`, `UnknownStrategyError` (non-retryable); unit tests for inheritance/semantics.
Routing domain models & tests `src/ai_company/providers/routing/models.py`, `tests/unit/providers/routing/test_models.py`	Added immutable Pydantic models: `ResolvedModel`, `RoutingRequest`, `RoutingDecision` with validation and computed total_cost; unit tests covering validation/immutability.
ModelResolver & tests/fixtures `src/ai_company/providers/routing/resolver.py`, `tests/unit/providers/routing/test_resolver.py`, `tests/unit/providers/routing/conftest.py`	New `ModelResolver` with `from_config`, resolve/resolve_safe, deduplication, collision detection, cost-sorting, immutable index, and observability logging; comprehensive tests and test fixtures.
Router & strategies + tests `src/ai_company/providers/routing/router.py`, `src/ai_company/providers/routing/strategies.py`, `tests/unit/providers/routing/test_router.py`, `tests/unit/providers/routing/test_strategies.py`	New `ModelRouter`, four routing strategies (Manual, RoleBased, CostAware, Smart), helper utilities, `STRATEGY_MAP` and name constants, error handling/logging, and extensive strategy/router tests.
Routing config schema & tests `src/ai_company/config/schema.py`, `tests/unit/config/test_schema.py`, `tests/unit/config/conftest.py`	Added logger wiring, strengthened ProviderModelConfig config, introduced `RoutingRuleConfig` validator enforcing at least one matcher (`role_level`/`task_type`/`fallback`); tests updated to include `task_type` where required.
Templates & renderer defaults `src/ai_company/templates/...`, `src/ai_company/templates/renderer.py`, `src/ai_company/templates/schema.py`, `tests/unit/templates/test_schema.py`	Default model tier changed from `sonnet`→`medium` (templates and renderer); several built-in template agent model values updated to new tiers; test expectation updated.
Routing tests suite `tests/unit/providers/routing/*`	Large set of new/updated unit tests covering models, resolver, router, strategies, errors, fixtures and many edge cases.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Router as ModelRouter
    participant Strategy as RoutingStrategy
    participant Resolver as ModelResolver

    Client->>Router: route(RoutingRequest)
    Router->>Strategy: select(request, routing_config, resolver)
    alt override present
        Strategy->>Resolver: resolve(override_ref)
        Resolver-->>Strategy: ResolvedModel
    else task-type rule
        Strategy->>Resolver: resolve(preferred_model)
        Resolver-->>Strategy: ResolvedModel
    else role-based rule
        Strategy->>Resolver: resolve(role_preferred)
        Resolver-->>Strategy: ResolvedModel
    else cheapest/fallback
        Strategy->>Resolver: all_models_sorted_by_cost()
        Resolver-->>Strategy: ResolvedModel(s)
    end
    Strategy-->>Router: RoutingDecision
    Router->>Router: Log ROUTING_DECISION_MADE
    Router-->>Client: RoutingDecision

sequenceDiagram
    participant Config as RoutingConfig
    participant Router as ModelRouter
    participant Resolver as ModelResolver

    Config->>Router: __init__(routing_config, providers)
    Router->>Resolver: ModelResolver.from_config(providers)
    Resolver->>Resolver: Index models by id & alias
    Resolver-->>Router: ModelResolver instance
    Router->>Router: strategy = STRATEGY_MAP[name]
    alt unknown strategy
        Router->>Router: Log ROUTING_STRATEGY_UNKNOWN (rgba(255,0,0,0.5))
        Router-->>Router: Raise UnknownStrategyError
    else
        Router->>Router: Log ROUTING_ROUTER_BUILT (rgba(0,128,0,0.5))
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat: implement YAML config loader with Pydantic validation #75: schema work — directly related to the RoutingRuleConfig validator and schema changes introduced here.
feat: design unified provider interface #86: providers exports — related to changes that re-export routing types from src/ai_company/providers.
feat: implement core entity and role system models #69: role/catalog mappings — related to the seniority→model-tier updates applied in this PR.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 38.82% which is insufficient. The required threshold is 100.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat: implement model routing engine' clearly and concisely summarizes the main change—implementing a model routing system with multiple strategies.
Description check	✅ Passed	The PR description is well-related to the changeset, detailing the routing engine implementation, strategies, resolver, validation, seniority tier changes, and testing status.
Linked Issues check	✅ Passed	The PR implements all core requirements from issue `#6`: Manual, Role-based, Cost-aware, and Smart routing strategies; ModelResolver with O(1) lookup; fallback chain support; collision detection; structured logging; vendor-agnostic tiers; YAML config support; comprehensive unit tests; and routing decision logging.
Out of Scope Changes check	✅ Passed	All changes are directly related to routing engine implementation and supporting infrastructure updates (seniority tiers, templates, config schema validation, event constants) without introducing unrelated functionality.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/model-routing-engine

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2026-03-01T21:52:29Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a foundational model routing engine designed to intelligently select the most appropriate LLM model based on various criteria. It provides a flexible framework with multiple strategies, enabling dynamic model choices driven by factors like user role, task type, cost considerations, or explicit overrides. The changes enhance the system's ability to manage and utilize diverse LLM resources efficiently, while also improving maintainability through structured error handling, robust validation, and vendor-agnostic configurations.

Highlights

Model Routing Engine Implementation: Implemented a comprehensive model routing engine supporting four distinct strategies: Manual, RoleBasedStrategy, CostAwareStrategy, and SmartStrategy, allowing for flexible and intelligent LLM model selection.
Efficient Model Resolution: Introduced a ModelResolver for O(1) lookup of model aliases and IDs, utilizing an immutable MappingProxyType index to ensure data integrity and performance.
Robustness and Observability: Added collision detection for model references, structured logging across all error paths for enhanced debuggability, and NotBlankStr validation for critical string fields.
Vendor-Agnostic Seniority Tiers: Refactored the seniority catalog to use vendor-agnostic model tier names (e.g., 'small', 'medium', 'large') instead of specific vendor model names, improving portability and reducing coupling.
Improved Test Fixtures: Ensured all new and updated test fixtures use generic provider names and model IDs, further decoupling tests from specific vendor implementations.

Changelog

.claude/skills/aurelio-review-pr/skill.md
- Updated logging audit rules to include logging coverage suggestions and exclusions for specific file types.
CLAUDE.md
- Added a new guideline for vendor-agnostic test fixtures, emphasizing the use of fake model IDs/names in tests.
src/ai_company/core/role.py
- Updated the example for typical_model_tier in SeniorityInfo to use 'large' instead of 'opus', reflecting vendor-agnostic naming.
src/ai_company/core/role_catalog.py
- Modified typical_model_tier values for all SeniorityInfo entries to use vendor-agnostic terms like 'small', 'medium', and 'large'.
src/ai_company/observability/events.py
- Added a new section of constants for 'Routing lifecycle' events, including ROUTING_ROUTER_BUILT, ROUTING_MODEL_RESOLVED, and various error events.
src/ai_company/providers/init.py
- Imported and exposed new classes and errors related to the routing engine, such as ModelRouter, ModelResolver, and various routing strategies.
src/ai_company/providers/routing/init.py
- Added a new __init__.py file to the routing subpackage, exporting core routing components like errors, models, resolver, router, and strategies.
src/ai_company/providers/routing/errors.py
- Added new exception classes for routing-specific errors, including RoutingError, ModelResolutionError, NoAvailableModelError, and UnknownStrategyError, all inheriting from ProviderError.
src/ai_company/providers/routing/models.py
- Added new Pydantic models: ResolvedModel to represent a fully resolved LLM, RoutingRequest for routing inputs, and RoutingDecision for routing outputs.
src/ai_company/providers/routing/resolver.py
- Added ModelResolver class to map model aliases and IDs to ResolvedModel instances, supporting O(1) lookup and collision detection during configuration.
src/ai_company/providers/routing/router.py
- Added ModelRouter class, serving as the main entry point for routing decisions, which delegates to specific strategy implementations.
src/ai_company/providers/routing/strategies.py
- Added RoutingStrategy protocol and implemented four concrete strategies: ManualStrategy, RoleBasedStrategy, CostAwareStrategy, and SmartStrategy, along with shared helper functions for model resolution and fallback logic.
tests/unit/core/test_role.py
- Updated typical_model_tier values in SeniorityInfo test cases from vendor-specific names ('sonnet', 'haiku') to generic tiers ('medium', 'small').
tests/unit/core/test_role_catalog.py
- Updated the test method name test_senior_uses_sonnet_tier to test_senior_uses_medium_tier and adjusted the assertion to check for 'medium' as the typical_model_tier.
tests/unit/providers/routing/conftest.py
- Added new test factories (ResolvedModelFactory, RoutingRequestFactory, RoutingDecisionFactory) and fixtures (three_model_provider, resolver, standard_routing_config) to support unit testing of the routing engine.
tests/unit/providers/routing/test_errors.py
- Added unit tests to verify the hierarchy and behavior of routing-specific error classes, ensuring they extend ProviderError and are not retryable.
tests/unit/providers/routing/test_models.py
- Added unit tests for the ResolvedModel, RoutingRequest, and RoutingDecision Pydantic models, covering instantiation, immutability, validation, and default values.
tests/unit/providers/routing/test_resolver.py
- Added unit tests for the ModelResolver class, covering its construction from configuration, model resolution by ID and alias, safe resolution, and handling of duplicate references.
tests/unit/providers/routing/test_router.py
- Added unit tests for the ModelRouter class, verifying its construction with various strategies, accessibility of the resolver, and correct routing behavior for different strategies.
tests/unit/providers/routing/test_strategies.py
- Added unit tests for each routing strategy (ManualStrategy, RoleBasedStrategy, CostAwareStrategy, SmartStrategy), covering their model selection logic, fallback mechanisms, and error handling.

Activity

All 1415 unit tests pass.
Ruff check is clean.
Ruff format is clean.
Mypy strict check is clean.
CI pipeline is pending.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

Copilot

Pull request overview

Implements a new model routing engine under ai_company.providers.routing, including a resolver for model ref lookup and multiple routing strategies, plus associated observability events and unit tests. This supports Issue #6’s goal of configurable routing with role/cost/smart strategies and fallback handling.

Changes:

Added routing subpackage (ModelRouter, ModelResolver, routing domain models/errors, and 4 strategy implementations + registry).
Added routing observability event constants and extensive unit test coverage for resolver/router/strategies.
Updated seniority catalog to use vendor-agnostic model tiers (small/medium/large) and aligned tests/docs.

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
`src/ai_company/providers/routing/strategies.py`	Implements routing strategies + shared fallback/budget helpers and strategy registry.
`src/ai_company/providers/routing/router.py`	Adds `ModelRouter` as the routing entry point + decision logging.
`src/ai_company/providers/routing/resolver.py`	Adds `ModelResolver` with immutable ref→model index and collision detection.
`src/ai_company/providers/routing/models.py`	Adds `ResolvedModel`, `RoutingRequest`, `RoutingDecision` Pydantic models.
`src/ai_company/providers/routing/errors.py`	Introduces routing error hierarchy extending provider errors.
`src/ai_company/providers/routing/__init__.py`	Exposes routing public API exports.
`src/ai_company/providers/__init__.py`	Re-exports routing APIs from the top-level providers package.
`src/ai_company/observability/events.py`	Adds routing lifecycle event constants.
`src/ai_company/core/role_catalog.py`	Switches seniority tiers to vendor-agnostic sizes.
`src/ai_company/core/role.py`	Updates seniority tier docstring example to vendor-agnostic naming.
`tests/unit/providers/routing/conftest.py`	Adds routing-specific fixtures/factories and 3-model provider config.
`tests/unit/providers/routing/test_strategies.py`	Adds unit tests for each routing strategy and registry behavior.
`tests/unit/providers/routing/test_router.py`	Adds unit tests for router construction, routing, and logging.
`tests/unit/providers/routing/test_resolver.py`	Adds unit tests for resolver indexing, immutability, and sorting.
`tests/unit/providers/routing/test_models.py`	Adds unit tests for routing domain model validation/immutability.
`tests/unit/providers/routing/test_errors.py`	Adds unit tests for routing error hierarchy and context/message behavior.
`tests/unit/providers/routing/__init__.py`	Creates routing test package.
`tests/unit/core/test_role_catalog.py`	Updates expectations for vendor-agnostic tier names.
`tests/unit/core/test_role.py`	Updates expectations for vendor-agnostic tier names.
`CLAUDE.md`	Documents vendor-agnostic test fixture convention.
`.claude/skills/aurelio-review-pr/skill.md`	Expands logging-audit guidance text.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-01T21:57:02Z

src/ai_company/providers/routing/strategies.py

+if TYPE_CHECKING:
+    from collections.abc import Mapping
+
+    from ai_company.config.schema import RoutingConfig, RoutingRuleConfig
+
+    from .resolver import ModelResolver
+
+


This module uses if TYPE_CHECKING: imports for RoutingConfig, RoutingRuleConfig, ModelResolver, and Mapping, but those names are referenced in runtime annotations (Python 3.14 PEP 649 evaluates annotations lazily at runtime). This can raise NameError when annotations are accessed (e.g., inspect.get_annotations). Import these types at runtime instead (optionally with # noqa: TC00x) rather than behind TYPE_CHECKING.

Suggested change

if TYPE_CHECKING:

from collections.abc import Mapping

from ai_company.config.schema import RoutingConfig, RoutingRuleConfig

from .resolver import ModelResolver

from collections.abc import Mapping

from ai_company.config.schema import RoutingConfig, RoutingRuleConfig

from .resolver import ModelResolver

Copilot · 2026-03-01T21:57:02Z

src/ai_company/providers/routing/strategies.py

+STRATEGY_MAP: Mapping[str, RoutingStrategy] = MappingProxyType(
+    {


STRATEGY_MAP is annotated as Mapping[...], but Mapping is only imported under TYPE_CHECKING above. With Python 3.14 lazy annotations, evaluating this annotation will fail at runtime. Import Mapping from collections.abc at runtime (or remove the annotation).

Copilot · 2026-03-01T21:57:02Z

src/ai_company/providers/routing/router.py

+if TYPE_CHECKING:
+    from ai_company.config.schema import ProviderConfig, RoutingConfig
+
+    from .models import RoutingDecision, RoutingRequest


ProviderConfig, RoutingConfig, RoutingRequest, and RoutingDecision are imported only under TYPE_CHECKING but used in annotations. Under Python 3.14 (PEP 649 lazy annotations), accessing annotations can raise NameError. Import these types at runtime (with # noqa: TC00x if needed) instead of behind TYPE_CHECKING.

Copilot · 2026-03-01T21:57:03Z

src/ai_company/providers/routing/resolver.py

+if TYPE_CHECKING:
+    from ai_company.config.schema import ProviderConfig
+


ProviderConfig is imported only under TYPE_CHECKING but referenced in the from_config() signature annotation. With Python 3.14 lazy annotations, accessing annotations can raise NameError. Import ProviderConfig at runtime (with # noqa: TC00x if needed) instead of behind TYPE_CHECKING.

Copilot · 2026-03-01T21:57:03Z

src/ai_company/providers/routing/resolver.py

+                    if ref is None:
+                        continue
+                    existing = index.get(ref)
+                    if existing is not None and existing.model_id != resolved.model_id:


Collision detection only raises when existing.model_id != resolved.model_id. If two providers define the same ref (ID/alias) pointing to different providers but with the same model_id string, this will silently overwrite the index and make the ref ambiguous. Consider raising on any ref collision where either provider_name or model_id differ (or more simply: if existing is not None and existing != resolved).

Suggested change

if existing is not None and existing.model_id != resolved.model_id:

if existing is not None and existing != resolved:

Copilot · 2026-03-01T21:57:03Z

src/ai_company/providers/routing/models.py

+    remaining_budget: float | None = Field(
+        default=None,
+        ge=0.0,
+        description="Remaining cost budget in USD",
+    )


remaining_budget is documented as a remaining budget in USD, but the cost-aware selection compares it against cost_per_1k_input + cost_per_1k_output (USD per 1k tokens). This is a unit mismatch and makes the API ambiguous. Either clarify in the field description/name that the budget is per-1k-token (or per-request) threshold, or extend RoutingRequest with enough information (e.g., expected token counts) to compute an actual USD cost against a USD budget.

Copilot · 2026-03-01T21:57:03Z

src/ai_company/providers/routing/strategies.py

+        request: RoutingRequest,
+        resolver: ModelResolver,
+    ) -> RoutingDecision | None:
+        if not resolver.all_models_sorted_by_cost():


_try_cheapest() calls resolver.all_models_sorted_by_cost() just to check emptiness and then _cheapest_within_budget() calls it again. Cache the sorted tuple (or call resolver.all_models() for the emptiness check) to avoid the extra sort/work on the hot path.

Suggested change

if not resolver.all_models_sorted_by_cost():

# Use a cheap existence check to avoid extra sorting work here.

if not resolver.all_models():

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/aurelio-review-pr/skill.md:
- Around line 107-114: The "Logging coverage suggestions" subsection continues
numbering at 7-10 which triggers MD029; update the list under the "Logging
coverage suggestions (soft rules — mark as SUGGESTION, must be validated by user
in triage):" header so numbering restarts at 1 (i.e., change items labeled 7–10
to 1–4), or explicitly suppress the MD029 warning if the cross-section numbering
is intentional for easier triage reference; locate the header text "Logging
coverage suggestions" and the following list items to make the change and ensure
consistency in the document.

In `@src/ai_company/providers/routing/router.py`:
- Around line 109-113: Wrap the call to self._strategy.select(...) inside
route() in a try/except that catches routing exceptions, log a WARNING or ERROR
with structured context including the request object, the strategy identifier
(from self._strategy), and relevant config/resolver context (self._config,
self._resolver) before re-raising the exception; ensure the logged message is
descriptive (e.g., "routing selection failed") and include the caught exception
details, then re-raise the original exception to preserve behavior.

In `@src/ai_company/providers/routing/strategies.py`:
- Around line 620-633: Add a brief inline comment next to the "cheapest" alias
in the STRATEGY_MAP to document that it intentionally aliases "cost_aware"
(e.g., mention _COST_AWARE / "cheapest" maps to the same strategy), keeping the
registry immutable via MappingProxyType; update the comment near STRATEGY_MAP to
be concise and clarify the UX intent of the alias.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0338d0c and 7618335.

📒 Files selected for processing (21)

.claude/skills/aurelio-review-pr/skill.md
CLAUDE.md
src/ai_company/core/role.py
src/ai_company/core/role_catalog.py
src/ai_company/observability/events.py
src/ai_company/providers/__init__.py
src/ai_company/providers/routing/__init__.py
src/ai_company/providers/routing/errors.py
src/ai_company/providers/routing/models.py
src/ai_company/providers/routing/resolver.py
src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/strategies.py
tests/unit/core/test_role.py
tests/unit/core/test_role_catalog.py
tests/unit/providers/routing/__init__.py
tests/unit/providers/routing/conftest.py
tests/unit/providers/routing/test_errors.py
tests/unit/providers/routing/test_models.py
tests/unit/providers/routing/test_resolver.py
tests/unit/providers/routing/test_router.py
tests/unit/providers/routing/test_strategies.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Agent

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations — Python 3.14 has native lazy annotations via PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
All public functions must include type hints; enforce via mypy strict mode
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Use Pydantic v2 models with BaseModel, model_validator, and ConfigDict
Enforce 88-character line length (enforced by ruff)
Keep functions under 50 lines; keep files under 800 lines

Files:

src/ai_company/core/role.py
src/ai_company/providers/routing/router.py
tests/unit/providers/routing/test_router.py
src/ai_company/providers/routing/resolver.py
src/ai_company/providers/routing/__init__.py
tests/unit/providers/routing/test_models.py
src/ai_company/providers/routing/strategies.py
src/ai_company/core/role_catalog.py
src/ai_company/observability/events.py
src/ai_company/providers/routing/errors.py
tests/unit/core/test_role_catalog.py
tests/unit/providers/routing/conftest.py
tests/unit/core/test_role.py
src/ai_company/providers/routing/models.py
tests/unit/providers/routing/test_strategies.py
tests/unit/providers/routing/test_resolver.py
src/ai_company/providers/__init__.py
tests/unit/providers/routing/test_errors.py

src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic must import: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging or logging.getLogger() or print() in application code — use the ai_company logger
Always use logger as the variable name (not _logger or log)
Always use event name constants from ai_company.observability.events in logger calls
Use structured logging format: logger.info(EVENT, key=value) — never use format strings like logger.info("msg %s", val)
All error paths must log at WARNING or ERROR level with context before raising an exception
All state transitions must log at INFO level
Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions

Files:

src/ai_company/core/role.py
src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/resolver.py
src/ai_company/providers/routing/__init__.py
src/ai_company/providers/routing/strategies.py
src/ai_company/core/role_catalog.py
src/ai_company/observability/events.py
src/ai_company/providers/routing/errors.py
src/ai_company/providers/routing/models.py
src/ai_company/providers/__init__.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark all test cases with pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, or @pytest.mark.slow
Configure pytest async mode as asyncio_mode = "auto" — do not manually add @pytest.mark.asyncio to test functions

Files:

tests/unit/providers/routing/test_router.py
tests/unit/providers/routing/test_models.py
tests/unit/core/test_role_catalog.py
tests/unit/providers/routing/conftest.py
tests/unit/core/test_role.py
tests/unit/providers/routing/test_strategies.py
tests/unit/providers/routing/test_resolver.py
tests/unit/providers/routing/test_errors.py

🧠 Learnings (22)

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.

Applied to files:

CLAUDE.md
.claude/skills/aurelio-review-pr/skill.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Tests must use fake model names (e.g., `test-model:8b`, `fake-writer:latest`)—never use real model IDs from `RECOMMENDED_MODELS`.

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to README.md : Update README.md for significant feature changes

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T09:54:56.100Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T09:54:56.100Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use `async def` for async test functions; pytest-asyncio is configured with `asyncio_mode = "auto"`; mark async tests with `pytest.mark.asyncio` if needed

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to tests/**/*.py : Configure pytest async mode as `asyncio_mode = "auto"` — do not manually add `pytest.mark.asyncio` to test functions

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Set test timeout to 30 seconds per test (via pytest configuration)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Use pytest-xdist for parallelism with `-n auto` flag

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Use pytest fixtures for test setup. Shared fixtures should be in `tests/conftest.py`

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T09:54:56.100Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use appropriate fixture scopes (`function`, `class`, `module`, `session`) and document complex fixtures with docstrings

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T09:54:56.100Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Mock external dependencies such as file system operations and network calls in tests

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T09:54:56.100Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/conftest.py : Place shared pytest fixtures in `tests/conftest.py`

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Log all significant operations using Python logging with appropriate levels: debug for detailed info, info for successes, warning for retries, error for failures

Applied to files:

.claude/skills/aurelio-review-pr/skill.md

📚 Learning: 2026-01-24T16:33:29.354Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`), CI enforces this coverage requirement

Applied to files:

.claude/skills/aurelio-review-pr/skill.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions

Applied to files:

.claude/skills/aurelio-review-pr/skill.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit. The CI enforces 100% coverage on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`).

Applied to files:

.claude/skills/aurelio-review-pr/skill.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/**/*.py : All functions and methods should have appropriate logging using `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.

Applied to files:

.claude/skills/aurelio-review-pr/skill.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use logging levels: debug (routine), info (significant), warning (unexpected but recoverable), error (failures)

Applied to files:

.claude/skills/aurelio-review-pr/skill.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : Core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`) must maintain 100% test coverage

Applied to files:

.claude/skills/aurelio-review-pr/skill.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/agents/**/*.py : Agents must extend `BaseAgent`, use retry logic, and implement configurable timeout via settings.

Applied to files:

.claude/skills/aurelio-review-pr/skill.md

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` in logger calls

Applied to files:

src/ai_company/observability/events.py

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising an exception

Applied to files:

src/ai_company/providers/routing/errors.py

🧬 Code graph analysis (12)

src/ai_company/providers/routing/router.py (4)

src/ai_company/providers/routing/errors.py (1)

UnknownStrategyError (28-31)

src/ai_company/providers/routing/resolver.py (2)

ModelResolver (28-181)

from_config (57-115)

src/ai_company/config/schema.py (2)

ProviderConfig (52-95)

RoutingConfig (127-149)

src/ai_company/providers/routing/models.py (2)

RoutingDecision (74-92)

RoutingRequest (43-71)

tests/unit/providers/routing/test_router.py (6)

src/ai_company/config/schema.py (2)

ProviderConfig (52-95)

RoutingConfig (127-149)

src/ai_company/core/enums.py (1)

SeniorityLevel (6-21)

src/ai_company/providers/routing/errors.py (3)

ModelResolutionError (16-19)

NoAvailableModelError (22-25)

UnknownStrategyError (28-31)

src/ai_company/providers/routing/models.py (1)

RoutingRequest (43-71)

src/ai_company/providers/routing/router.py (4)

ModelRouter (28-122)

strategy_name (92-94)

resolver (87-89)

route (96-122)

src/ai_company/providers/routing/resolver.py (1)

all_models (166-169)

src/ai_company/providers/routing/__init__.py (4)

src/ai_company/providers/routing/errors.py (4)

ModelResolutionError (16-19)

NoAvailableModelError (22-25)

RoutingError (10-13)

UnknownStrategyError (28-31)

src/ai_company/providers/routing/router.py (2)

resolver (87-89)

ModelRouter (28-122)

tests/unit/providers/routing/conftest.py (1)

resolver (97-101)

src/ai_company/providers/routing/resolver.py (1)

ModelResolver (28-181)

src/ai_company/providers/routing/strategies.py (5)

src/ai_company/core/role_catalog.py (1)

get_seniority_info (429-445)

src/ai_company/providers/routing/errors.py (2)

ModelResolutionError (16-19)

NoAvailableModelError (22-25)

src/ai_company/providers/routing/models.py (2)

ResolvedModel (9-40)

RoutingDecision (74-92)

src/ai_company/config/schema.py (2)

RoutingConfig (127-149)

RoutingRuleConfig (98-124)

src/ai_company/providers/routing/resolver.py (5)

ModelResolver (28-181)

resolve_safe (146-164)

all_models (166-169)

all_models_sorted_by_cost (171-181)

resolve (117-144)

src/ai_company/core/role_catalog.py (2)

src/ai_company/core/enums.py (2)

CostTier (57-69)

SeniorityLevel (6-21)

src/ai_company/core/role.py (1)

SeniorityInfo (87-120)

src/ai_company/providers/routing/errors.py (1)

src/ai_company/providers/errors.py (1)

ProviderError (22-69)

tests/unit/core/test_role_catalog.py (2)

src/ai_company/core/role_catalog.py (1)

get_seniority_info (429-445)

src/ai_company/core/enums.py (1)

SeniorityLevel (6-21)

tests/unit/core/test_role.py (2)

src/ai_company/core/role.py (1)

SeniorityInfo (87-120)

src/ai_company/core/enums.py (1)

SeniorityLevel (6-21)

src/ai_company/providers/routing/models.py (1)

src/ai_company/core/enums.py (1)

SeniorityLevel (6-21)

tests/unit/providers/routing/test_resolver.py (2)

tests/unit/providers/routing/conftest.py (2)

resolver (97-101)

three_model_provider (85-93)

src/ai_company/providers/routing/resolver.py (6)

ModelResolver (28-181)

from_config (57-115)

resolve (117-144)

all_models (166-169)

resolve_safe (146-164)

all_models_sorted_by_cost (171-181)

src/ai_company/providers/__init__.py (5)

src/ai_company/providers/routing/strategies.py (5)

CostAwareStrategy (450-494)

ManualStrategy (306-343)

RoleBasedStrategy (349-444)

RoutingStrategy (38-66)

SmartStrategy (500-615)

src/ai_company/providers/routing/errors.py (4)

ModelResolutionError (16-19)

NoAvailableModelError (22-25)

RoutingError (10-13)

UnknownStrategyError (28-31)

src/ai_company/providers/routing/resolver.py (1)

ModelResolver (28-181)

src/ai_company/providers/routing/router.py (1)

ModelRouter (28-122)

src/ai_company/providers/routing/models.py (3)

ResolvedModel (9-40)

RoutingDecision (74-92)

RoutingRequest (43-71)

tests/unit/providers/routing/test_errors.py (2)

src/ai_company/providers/errors.py (1)

ProviderError (22-69)

src/ai_company/providers/routing/errors.py (4)

ModelResolutionError (16-19)

NoAvailableModelError (22-25)

RoutingError (10-13)

UnknownStrategyError (28-31)

🪛 markdownlint-cli2 (0.21.0)

.claude/skills/aurelio-review-pr/skill.md

[warning] 111-111: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/2/3

(MD029, ol-prefix)

[warning] 112-112: Ordered list item prefix
Expected: 2; Actual: 8; Style: 1/2/3

(MD029, ol-prefix)

[warning] 113-113: Ordered list item prefix
Expected: 3; Actual: 9; Style: 1/2/3

(MD029, ol-prefix)

[warning] 114-114: Ordered list item prefix
Expected: 4; Actual: 10; Style: 1/2/3

(MD029, ol-prefix)

🔇 Additional comments (28)

src/ai_company/core/role_catalog.py (1)

21-70: LGTM! Vendor-agnostic model tier naming is a solid improvement.

The update from vendor-specific tiers (haiku, sonnet, opus) to generic tiers (small, medium, large) decouples the seniority catalog from any specific LLM provider. The mapping is consistent and covers all 8 seniority levels appropriately.

src/ai_company/core/role.py (1)

93-93: LGTM! Docstring example updated to match new tier naming.

The example value in the docstring now reflects the vendor-agnostic tier naming ("large" instead of "opus"), keeping documentation consistent with the actual values used in SENIORITY_INFO.

tests/unit/core/test_role_catalog.py (1)

60-63: LGTM! Test correctly updated to reflect the new tier value.

The test method name, docstring, and assertion are all consistently updated to verify that SENIOR maps to "medium" tier, matching the updated SENIORITY_INFO mapping in role_catalog.py.

tests/unit/core/test_role.py (1)

123-200: LGTM! Test fixtures updated consistently with vendor-agnostic tier names.

All SeniorityInfo test instances now use the generic tier names ("small", "medium") matching their respective seniority levels, consistent with the updated SENIORITY_INFO mappings:

JUNIOR → "small"

MID → "medium"

SENIOR → "medium"

The test behavior remains unchanged; only the literal values were updated.

src/ai_company/providers/routing/errors.py (1)

10-31: LGTM! Clean error hierarchy.

The routing error classes properly extend ProviderError and follow the established pattern. The docstrings clearly describe each exception's purpose.

Minor observation: The is_retryable = False on subclasses (lines 13, 19, 25, 31) is redundant since RoutingError already sets it to False and it's inherited. You could remove these for DRYness, but keeping them is also fine for explicit documentation.

src/ai_company/providers/routing/__init__.py (1)

1-39: LGTM!

Clean public API surface with comprehensive exports. The __all__ list is alphabetically sorted and complete, matching all imports.

src/ai_company/providers/__init__.py (1)

37-52: LGTM!

Clean integration of the routing module's public API into the providers package. Alphabetical ordering in __all__ is maintained, and all new exports are properly imported.

src/ai_company/observability/events.py (1)

79-91: LGTM!

All new routing event constants follow the established domain.noun.verb naming convention and are properly typed as Final[str]. Good coverage for the routing lifecycle: build, resolution, decision, fallback, budget, and error scenarios.

src/ai_company/providers/routing/models.py (2)

9-40: LGTM! Well-designed immutable domain models.

Good use of ConfigDict(frozen=True) for immutability, NotBlankStr for string validation, and appropriate field constraints (ge=0.0 for costs, gt=0 for context size).

43-92: LGTM!

RoutingRequest and RoutingDecision are well-structured with proper optional field handling and immutability. The fallbacks_tried default of () is appropriate for the tuple type.

src/ai_company/providers/routing/resolver.py (3)

12-22: LGTM! Proper observability setup.

Correct use of get_logger(__name__) with event constants from ai_company.observability.events. The logger variable naming follows conventions.

56-115: LGTM! Robust collision detection with proper logging.

The from_config method correctly:

Detects duplicate refs mapping to different models

Logs at ERROR level with full context before raising

Uses structured logging with kwargs

Raises ModelResolutionError with appropriate context

117-164: LGTM! Good differentiation between resolve and resolve_safe.

Appropriate log levels: resolve logs WARNING on miss (unexpected), while resolve_safe logs DEBUG (expected possibility). Both use structured kwargs.

src/ai_company/providers/routing/strategies.py (7)

37-66: LGTM! Clean protocol definition.

The RoutingStrategy protocol is well-documented with @runtime_checkable for isinstance checks. The docstring clearly specifies the contract and possible exceptions.

72-143: LGTM! Robust fallback resolution helpers.

Good separation between throwing (_try_resolve_with_fallback) and non-throwing (_try_resolve_with_fallback_safe) variants. Proper logging at each fallback attempt and exhaustion.

171-209: LGTM! Budget-aware selection with proper fallback.

The function correctly handles:

No models registered (logs warning + raises)

No budget constraint (returns cheapest)

Budget exceeded by all models (logs warning + returns cheapest anyway with flag)

306-343: LGTM! ManualStrategy implementation.

Correctly validates that model_override is set, logs warning before raising ModelResolutionError, and delegates to resolver for actual lookup.

349-444: LGTM! RoleBasedStrategy with comprehensive fallback handling.

Good flow: rule match → seniority default tier → global fallback chain. All failure paths log at WARNING level before raising NoAvailableModelError.

450-494: LGTM! CostAwareStrategy implementation.

Clean delegation to _try_task_type_rules and _cheapest_within_budget helpers. Budget exceeded case is clearly communicated in the reason string.

500-616: LGTM! SmartStrategy with clear priority order.

The chained or expression at lines 524-546 clearly expresses the priority: override → task_type → role → seniority → global chain → cheapest → exhausted. Helper methods are well-encapsulated. The NoReturn type hint on _raise_exhausted is correct.

src/ai_company/providers/routing/router.py (1)

63-75: Unknown-strategy handling is implemented well.

Structured error logging with contextual fields before raising UnknownStrategyError is solid.

CLAUDE.md (1)

80-80: Good documentation hardening for test portability.

The added vendor-agnostic fixture rule is clear and helps keep tests independent from real provider/model IDs.

tests/unit/providers/routing/conftest.py (1)

59-130: Fixture design is clean and deterministic.

The shared provider/resolver/routing fixtures are reusable, stable, and vendor-agnostic.

tests/unit/providers/routing/test_errors.py (1)

22-61: Error-hierarchy coverage is solid.

These tests correctly validate inheritance, retryability flags, and context/message propagation.

tests/unit/providers/routing/test_models.py (1)

21-125: Model validation tests are comprehensive.

Coverage of defaults, non-negative constraints, and frozen-field behavior is strong.

tests/unit/providers/routing/test_router.py (1)

20-146: Router behavior tests are well-rounded.

Construction, decision paths, logging assertions, and key error scenarios are all covered.

tests/unit/providers/routing/test_strategies.py (1)

32-402: Excellent strategy-path coverage.

The tests exercise precedence logic, fallback behavior, and empty-model failure paths across all strategies.

tests/unit/providers/routing/test_resolver.py (1)

12-114: Resolver test coverage is strong.

Indexing, lookup failure semantics, ordering, deduplication, and immutability checks are all well represented.

coderabbitai · 2026-03-01T21:57:03Z

.claude/skills/aurelio-review-pr/skill.md

+**Logging coverage suggestions (soft rules — mark as SUGGESTION, must be validated by user in triage):**
+
+For every function touched by the PR, analyze its logic and suggest missing logging where appropriate:
+
+7. Error/except paths that don't `logger.warning()` or `logger.error()` with context before raising or returning (SUGGESTION)
+8. State transitions (status changes, lifecycle events, mode switches) that don't `logger.info()` (SUGGESTION)
+9. Object creation, entry/exit of key functions, or important branching decisions that don't `logger.debug()` (SUGGESTION)
+10. Any other code path that would benefit from logging for debuggability or operational visibility — think about what an operator investigating a production issue would want to see (SUGGESTION)


⚠️ Potential issue | 🟡 Minor

Consider restarting list numbering for the soft rules section.

The static analysis tool flagged that items 7-10 should restart from 1 after the new "Logging coverage suggestions" header, per MD029. However, if the cross-section numbering (1-6 for hard rules, 7-10 for soft rules) is intentional for easier reference in triage discussions, you may suppress this warning. Otherwise, consider restarting from 1.

🧰 Tools

🪛 markdownlint-cli2 (0.21.0)

[warning] 111-111: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/2/3

(MD029, ol-prefix)

[warning] 112-112: Ordered list item prefix
Expected: 2; Actual: 8; Style: 1/2/3

(MD029, ol-prefix)

[warning] 113-113: Ordered list item prefix
Expected: 3; Actual: 9; Style: 1/2/3

(MD029, ol-prefix)

[warning] 114-114: Ordered list item prefix
Expected: 4; Actual: 10; Style: 1/2/3

(MD029, ol-prefix)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.claude/skills/aurelio-review-pr/skill.md around lines 107 - 114, The "Logging coverage suggestions" subsection continues numbering at 7-10 which triggers MD029; update the list under the "Logging coverage suggestions (soft rules — mark as SUGGESTION, must be validated by user in triage):" header so numbering restarts at 1 (i.e., change items labeled 7–10 to 1–4), or explicitly suppress the MD029 warning if the cross-section numbering is intentional for easier triage reference; locate the header text "Logging coverage suggestions" and the following list items to make the change and ensure consistency in the document.

src/ai_company/providers/routing/router.py

src/ai_company/providers/routing/strategies.py

gemini-code-assist

Code Review

This pull request introduces a comprehensive model routing engine with multiple strategies, a model resolver for efficient lookups, and extensive test coverage. While the implementation demonstrates high quality and good design principles, a critical security vulnerability was identified: budget constraints specified in the RoutingRequest can be bypassed by manual overrides and routing rules across all strategies. This could lead to unauthorized financial costs if users can influence routing signals such as model_override or task_type. It is crucial to update the routing logic to enforce budget limits as a hard constraint across all selection paths. Additionally, consider improving the fallback logic to prevent redundant model lookups, which will enhance efficiency and clarity.

gemini-code-assist · 2026-03-01T21:58:50Z

src/ai_company/providers/routing/strategies.py

+    def select(
+        self,
+        request: RoutingRequest,
+        config: RoutingConfig,  # noqa: ARG002
+        resolver: ModelResolver,
+    ) -> RoutingDecision:
+        """Select the explicitly requested model.
+
+        Raises:
+            ModelResolutionError: If ``model_override`` is not set or
+                the model cannot be resolved.
+        """
+        if request.model_override is None:
+            logger.warning(
+                ROUTING_NO_RULE_MATCHED,
+                strategy=self.name,
+                reason="model_override not set",
+            )
+            msg = "ManualStrategy requires model_override to be set"
+            raise ModelResolutionError(msg)
+
+        model = resolver.resolve(request.model_override)
+        return RoutingDecision(
+            resolved_model=model,
+            strategy_used=self.name,
+            reason=f"Explicit override: {request.model_override}",
+        )
+
+
+# ── Strategy 2: Role-Based ───────────────────────────────────────
+
+
+class RoleBasedStrategy:
+    """Select model based on agent seniority level.
+
+    Matches the first routing rule where ``rule.role_level`` equals
+    ``request.agent_level``.  If no rule matches, uses the seniority
+    catalog's ``typical_model_tier`` as a fallback lookup.
+    """
+
+    @property
+    def name(self) -> str:
+        """Return strategy name."""
+        return "role_based"
+
+    def select(
+        self,
+        request: RoutingRequest,
+        config: RoutingConfig,
+        resolver: ModelResolver,
+    ) -> RoutingDecision:
+        """Select model based on role level.
+
+        Raises:
+            ModelResolutionError: If no agent_level is set.
+            NoAvailableModelError: If all candidates are exhausted.
+        """
+        if request.agent_level is None:
+            logger.warning(
+                ROUTING_NO_RULE_MATCHED,
+                strategy=self.name,
+                reason="agent_level not set",
+            )
+            msg = "RoleBasedStrategy requires agent_level to be set"
+            raise ModelResolutionError(msg)
+
+        # Try matching rules
+        for rule in config.rules:
+            if rule.role_level == request.agent_level:
+                model, tried = _try_resolve_with_fallback(
+                    rule.preferred_model,
+                    rule,
+                    config,
+                    resolver,
+                )
+                return RoutingDecision(
+                    resolved_model=model,
+                    strategy_used=self.name,
+                    reason=(
+                        f"Role rule match: "
+                        f"level={request.agent_level.value}"
+                        f", model={model.model_id}"
+                    ),
+                    fallbacks_tried=tried,
+                )
+
+        # No rule matched — use seniority default
+        logger.debug(
+            ROUTING_NO_RULE_MATCHED,
+            level=request.agent_level.value,
+            strategy=self.name,
+        )
+        tier = get_seniority_info(
+            request.agent_level,
+        ).typical_model_tier
+        result = _try_resolve_with_fallback_safe(
+            tier,
+            None,
+            config,
+            resolver,
+        )
+        if result is not None:
+            model, tried = result
+            return RoutingDecision(
+                resolved_model=model,
+                strategy_used=self.name,
+                reason=(
+                    f"Seniority default: level={request.agent_level.value}, tier={tier}"
+                ),
+                fallbacks_tried=tried,
+            )
+
+        if config.fallback_chain:
+            chain_detail = f"fallback chain exhausted: {list(config.fallback_chain)}"
+        else:
+            chain_detail = "no fallback chain configured"
+        msg = (
+            f"No model available for "
+            f"level={request.agent_level.value} "
+            f"(tier={tier}, no rules matched, {chain_detail})"
+        )
+        logger.warning(
+            ROUTING_FALLBACK_EXHAUSTED,
+            level=request.agent_level.value,
+            tier=tier,
+            strategy=self.name,
+        )
+        raise NoAvailableModelError(msg)
+
+
+# ── Strategy 3: Cost-Aware ───────────────────────────────────────
+
+
+class CostAwareStrategy:
+    """Select the cheapest model, optionally respecting a budget.
+
+    Matches ``task_type`` rules first, then falls back to the cheapest
+    model from the resolver.
+    """
+
+    @property
+    def name(self) -> str:
+        """Return strategy name."""
+        return "cost_aware"
+
+    def select(
+        self,
+        request: RoutingRequest,
+        config: RoutingConfig,
+        resolver: ModelResolver,
+    ) -> RoutingDecision:
+        """Select the cheapest available model.
+
+        Raises:
+            NoAvailableModelError: If no models are registered.
+        """
+        decision = _try_task_type_rules(
+            request,
+            config,
+            resolver,
+            self.name,
+        )
+        if decision is not None:
+            return decision
+
+        # Pick cheapest
+        model, budget_exceeded = _cheapest_within_budget(
+            resolver,
+            request.remaining_budget,
+        )
+        reason = f"Cheapest available: {model.model_id}"
+        if budget_exceeded:
+            reason += " (all models exceed remaining budget)"
+        return RoutingDecision(
+            resolved_model=model,
+            strategy_used=self.name,
+            reason=reason,
+        )
+
+
+# ── Strategy 4: Smart ────────────────────────────────────────────
+
+
+class SmartStrategy:
+    """Combined strategy with priority-based signal merging.
+
+    Priority order: model_override > task_type rules > role_level
+    rules > seniority default > global fallback_chain > cheapest
+    available.
+    """
+
+    @property
+    def name(self) -> str:
+        """Return strategy name."""
+        return "smart"
+
+    def select(
+        self,
+        request: RoutingRequest,
+        config: RoutingConfig,
+        resolver: ModelResolver,
+    ) -> RoutingDecision:
+        """Select a model using all available signals.
+
+        Raises:
+            NoAvailableModelError: If all candidates are exhausted.
+        """
+        return (
+            self._try_override(request, resolver)
+            or _try_task_type_rules(
+                request,
+                config,
+                resolver,
+                self.name,
+            )
+            or _try_role_rules(
+                request,
+                config,
+                resolver,
+                self.name,
+            )
+            or _try_seniority_default(
+                request,
+                resolver,
+                self.name,
+            )
+            or self._try_global_chain(config, resolver)
+            or self._try_cheapest(request, resolver)
+            or self._raise_exhausted()
+        )
+
+    def _try_override(
+        self,
+        request: RoutingRequest,
+        resolver: ModelResolver,
+    ) -> RoutingDecision | None:
+        if request.model_override is None:
+            return None
+        model = resolver.resolve_safe(request.model_override)
+        if model is None:
+            logger.warning(
+                ROUTING_MODEL_RESOLUTION_FAILED,
+                ref=request.model_override,
+                source="smart_override",
+            )
+            return None
+        return RoutingDecision(
+            resolved_model=model,
+            strategy_used=self.name,
+            reason=f"Explicit override: {request.model_override}",
+        )
+
+    def _try_cheapest(
+        self,
+        request: RoutingRequest,
+        resolver: ModelResolver,
+    ) -> RoutingDecision | None:
+        if not resolver.all_models_sorted_by_cost():
+            return None
+        model, budget_exceeded = _cheapest_within_budget(
+            resolver,
+            request.remaining_budget,
+        )
+        reason = f"Cheapest available: {model.model_id}"
+        if budget_exceeded:
+            reason += " (all models exceed remaining budget)"
+        return RoutingDecision(
+            resolved_model=model,
+            strategy_used=self.name,
+            reason=reason,
+        )
+
+    def _try_global_chain(
+        self,
+        config: RoutingConfig,
+        resolver: ModelResolver,
+    ) -> RoutingDecision | None:
+        chain_result = _walk_fallback_chain(config, resolver)
+        if chain_result is None:
+            return None
+        model, tried = chain_result
+        return RoutingDecision(
+            resolved_model=model,
+            strategy_used=self.name,
+            reason="Global fallback chain",
+            fallbacks_tried=tried,
+        )
+
+    def _raise_exhausted(self) -> NoReturn:
+        logger.warning(
+            ROUTING_FALLBACK_EXHAUSTED,
+            strategy="smart",
+            reason="all signals exhausted",
+        )
+        msg = "SmartStrategy: no model available from any signal"
+        raise NoAvailableModelError(
+            msg,
+            context={"strategy": "smart"},
+        )
+


The model routing engine implements several strategies (ManualStrategy, RoleBasedStrategy, CostAwareStrategy, and SmartStrategy) that prioritize manual overrides and routing rules over budget constraints.

Specifically:

ManualStrategy (lines 317-343) selects the explicitly requested model without any budget check.

RoleBasedStrategy (lines 362-444) selects models based on seniority rules or defaults without checking the remaining_budget.

CostAwareStrategy (lines 450-495) prioritizes task_type rules (which don't check budget) over the cheapest-model selection.

SmartStrategy (lines 500-616) prioritizes overrides, task rules, role rules, and seniority defaults—all of which bypass budget checks—before finally falling back to a budget-aware cheapest-model selection.

Furthermore, even the budget-aware helper _cheapest_within_budget (lines 171-209) is designed to return the cheapest available model even if it exceeds the remaining_budget, rather than raising an error to enforce the limit.

This design allows users to bypass financial controls by triggering rules or providing overrides, potentially leading to significant unauthorized costs.

Recommendation: Modify the select method of each strategy to validate the cost of the chosen model against request.remaining_budget before returning a RoutingDecision. If the cost exceeds the budget, the strategy should either attempt to find a cheaper alternative that fits the budget or raise a NoAvailableModelError. The _cheapest_within_budget helper should also be updated to enforce a hard limit when a budget is provided.

gemini-code-assist · 2026-03-01T21:58:50Z

src/ai_company/providers/routing/strategies.py

+def _try_resolve_with_fallback(
+    ref: str,
+    rule: RoutingRuleConfig | None,
+    config: RoutingConfig,
+    resolver: ModelResolver,
+) -> tuple[ResolvedModel, tuple[str, ...]]:
+    """Try to resolve *ref*, then rule fallback, then global chain.
+
+    Returns:
+        Tuple of (resolved_model, fallbacks_tried).
+
+    Raises:
+        NoAvailableModelError: If all candidates are exhausted.
+    """
+    tried: list[str] = []
+
+    # Try the primary ref
+    model = resolver.resolve_safe(ref)
+    if model is not None:
+        return model, tuple(tried)
+
+    tried.append(ref)
+    logger.debug(ROUTING_FALLBACK_ATTEMPTED, ref=ref, source="primary")
+
+    # Try rule-level fallback
+    if rule is not None and rule.fallback is not None:
+        model = resolver.resolve_safe(rule.fallback)
+        if model is not None:
+            return model, tuple(tried)
+        tried.append(rule.fallback)
+        logger.debug(
+            ROUTING_FALLBACK_ATTEMPTED,
+            ref=rule.fallback,
+            source="rule_fallback",
+        )
+
+    # Walk global fallback chain
+    for chain_ref in config.fallback_chain:
+        model = resolver.resolve_safe(chain_ref)
+        if model is not None:
+            return model, tuple(tried)
+        tried.append(chain_ref)
+        logger.debug(
+            ROUTING_FALLBACK_ATTEMPTED,
+            ref=chain_ref,
+            source="global_chain",
+        )
+
+    logger.warning(ROUTING_FALLBACK_EXHAUSTED, tried=tried)
+    msg = f"All model candidates exhausted: {tried}"
+    raise NoAvailableModelError(msg, context={"tried": tried})


The current implementation of fallback logic may perform redundant model lookups if the same model reference appears in multiple places (e.g., as a primary reference and also in the global fallback chain). This is inefficient and can lead to confusing logs with duplicate ROUTING_FALLBACK_ATTEMPTED events and duplicate entries in the tried list of failed attempts.

To prevent this, you should check if a model reference has already been tried before attempting to resolve it again. This makes the fallback logic more robust and efficient.

def _try_resolve_with_fallback( ref: str, rule: RoutingRuleConfig | None, config: RoutingConfig, resolver: ModelResolver, ) -> tuple[ResolvedModel, tuple[str, ...]]: """Try to resolve *ref*, then rule fallback, then global chain. Returns: Tuple of (resolved_model, fallbacks_tried). Raises: NoAvailableModelError: If all candidates are exhausted. """ tried: list[str] = [] # Try the primary ref model = resolver.resolve_safe(ref) if model is not None: return model, tuple(tried) tried.append(ref) logger.debug(ROUTING_FALLBACK_ATTEMPTED, ref=ref, source="primary") # Try rule-level fallback if rule is not None and rule.fallback is not None: if rule.fallback not in tried: model = resolver.resolve_safe(rule.fallback) if model is not None: return model, tuple(tried) tried.append(rule.fallback) logger.debug( ROUTING_FALLBACK_ATTEMPTED, ref=rule.fallback, source="rule_fallback", ) # Walk global fallback chain for chain_ref in config.fallback_chain: if chain_ref not in tried: model = resolver.resolve_safe(chain_ref) if model is not None: return model, tuple(tried) tried.append(chain_ref) logger.debug( ROUTING_FALLBACK_ATTEMPTED, ref=chain_ref, source="global_chain", ) logger.warning(ROUTING_FALLBACK_EXHAUSTED, tried=tried) msg = f"All model candidates exhausted: {tried}" raise NoAvailableModelError(msg, context={"tried": tried})

…bbit, and Gemini - Log routing failures in route() before re-raising (CodeRabbit) - Clarify remaining_budget as per-request cost ceiling, not total USD budget - Add total_cost_per_1k property to ResolvedModel, use in 3+ call sites - Update docstring examples to vendor-agnostic (no real provider names) - Extract strategy name constants to prevent drift in STRATEGY_MAP - Dedup tried refs in _try_resolve_with_fallback to avoid redundant lookups - Log empty fallback chains in _walk_fallback_chain for diagnostics - Add debug logs for task_type and role rule miss paths - Document SmartStrategy._try_override soft-fail design - Fix _try_cheapest to use all_models() instead of double-sort - Add "cheapest" alias inline comment in STRATEGY_MAP - Tighten collision detection in ModelResolver (compare full object) - Add RoutingRuleConfig model_validator: at least one matcher required - Add warning log before LookupError in get_seniority_info - Add 10 new tests: collision detection, override soft-fail, 3-stage fallback, mid-range budget, blank validation for RoutingRequest fields

Aureliolo · 2026-03-02T05:25:39Z

@coderabbitai review

coderabbitai · 2026-03-02T05:25:51Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Aureliolo · 2026-03-02T05:26:56Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a comprehensive and well-designed model routing engine. The implementation is robust, featuring multiple routing strategies, a fast model resolver using immutable data structures, and thorough validation. The code is well-structured, with excellent separation of concerns, extensive structured logging, and a solid test suite that covers numerous edge cases. Decoupling model tiers from vendor-specific names is a significant improvement for maintainability. I have one suggestion for a minor refactoring in the SmartStrategy to improve its internal logic, but overall, this is a high-quality contribution.

gemini-code-assist · 2026-03-02T05:28:57Z

src/ai_company/providers/routing/strategies.py

+        return (
+            self._try_override(request, resolver)
+            or _try_task_type_rules(
+                request,
+                config,
+                resolver,
+                self.name,
+            )
+            or _try_role_rules(
+                request,
+                config,
+                resolver,
+                self.name,
+            )
+            or _try_seniority_default(
+                request,
+                resolver,
+                self.name,
+            )
+            or self._try_global_chain(config, resolver)
+            or self._try_cheapest(request, resolver)
+            or self._raise_exhausted()
+        )


The current implementation of SmartStrategy.select has some logical redundancy that could make it harder to maintain. Specifically, the global fallback chain can be walked multiple times during a single select call.

For example:

_try_task_type_rules is called. If it matches a rule, it internally calls _try_resolve_with_fallback_safe, which walks the global fallback chain.

If that fails to find a model, the or chain continues.

Eventually, self._try_global_chain(config, resolver) is called, which may walk the same global fallback chain again.

While this is a minor inefficiency, it points to a conceptual complexity where candidate selection (e.g., finding a rule) is mixed with resolution logic (trying the ref and all its fallbacks).

A potential refactoring could separate these concerns. The select method could first identify a prioritized list of candidate model references from all available signals (override, task rule, role rule, seniority default), and then orchestrate the resolution with fallbacks centrally. This would eliminate the redundancy and make the priority logic in SmartStrategy more explicit and easier to follow.

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/config/schema.py`:
- Around line 129-134: The validation in _at_least_one_matcher currently raises
a ValueError without logging; update it to log the failure at WARNING or ERROR
level with contextual details (include the message "Routing rule must specify at
least role_level or task_type" plus relevant self fields such as the rule
id/name, role_level, and task_type) immediately before raising. Use the
module/class logger (create a module-level logger via
logging.getLogger(__name__) if one doesn't exist) and call logger.warning or
logger.error inside _at_least_one_matcher, then raise the ValueError as before
so the exception path is logged with context.

In `@src/ai_company/providers/routing/resolver.py`:
- Around line 57-115: The from_config classmethod on ModelResolver is too large
and should be split into focused helpers: extract a helper to build a
ResolvedModel from (provider_name, model_config) (referencing ResolvedModel and
model_config fields: id, alias, cost_per_1k_input, cost_per_1k_output,
max_context), extract a helper to index a resolved model into the index dict
(handling duplicate detection and raising ModelResolutionError and logging
ROUTING_MODEL_RESOLUTION_FAILED with the same context), and extract a small
helper to compute the final metrics for the logger call (model_count/ref_count).
Replace the current inline loops in from_config with calls to these helpers and
keep from_config as an orchestration layer that builds the index and returns
cls(index).

In `@src/ai_company/providers/routing/router.py`:
- Around line 116-127: The RoutingError except block currently swallows the
exception; change the handler to capture the exception (e.g., except
RoutingError as err) and include its type/message in the warning log for
ROUTING_SELECTION_FAILED; update the logger.warning call near
ROUTING_SELECTION_FAILED (keeping strategy=self._strategy.name and request
fields) to add error details (err or str(err)) or pass exc_info=err so logs
include the exception metadata for easier diagnosis.

In `@src/ai_company/providers/routing/strategies.py`:
- Around line 80-137: The _try_resolve_with_fallback function is overlong; split
its three resolution stages into small helpers to keep functions under 50 lines:
create helper functions like _resolve_primary(ref, resolver, tried, seen),
_resolve_rule_fallback(rule, resolver, tried, seen) and
_resolve_global_chain(config, resolver, tried, seen) that each attempt
resolution with resolver.resolve_safe, update tried/seen, emit
ROUTING_FALLBACK_ATTEMPTED logs, and return a ResolvedModel or None; keep
NoAvailableModelError raising and ROUTING_FALLBACK_EXHAUSTED logging in the
original _try_resolve_with_fallback which now orchestrates the helpers and
preserves types (RoutingRuleConfig, RoutingConfig, ModelResolver) and the final
error context {"tried": tried}.
- Around line 394-477: The select method in RoleBasedStrategy is too large;
split its responsibilities into small helpers: extract validation into a
_validate_request(request) that checks request.agent_level and raises
ModelResolutionError with the same logging (ROUTING_NO_RULE_MATCHED, logger),
extract rule matching into _select_by_rule(request, config, resolver) that
iterates config.rules and calls _try_resolve_with_fallback(rule.preferred_model,
rule, config, resolver) and returns a RoutingDecision when found, extract
seniority fallback into _select_by_seniority(request, config, resolver) that
calls get_seniority_info(...).typical_model_tier and
_try_resolve_with_fallback_safe(tier, None, config, resolver) and returns a
RoutingDecision when found, and extract the final error path into
_raise_no_available(request, config, tier) which logs ROUTING_FALLBACK_EXHAUSTED
and raises NoAvailableModelError with the same constructed message; then rewrite
RoleBasedStrategy.select to call these helpers in sequence (validate ->
_select_by_rule -> _select_by_seniority -> _raise_no_available) so each function
stays small and the main select is under 50 lines.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7618335 and 99fa913.

📒 Files selected for processing (14)

src/ai_company/config/schema.py
src/ai_company/core/role_catalog.py
src/ai_company/observability/events.py
src/ai_company/providers/__init__.py
src/ai_company/providers/routing/__init__.py
src/ai_company/providers/routing/models.py
src/ai_company/providers/routing/resolver.py
src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/strategies.py
tests/unit/config/conftest.py
tests/unit/config/test_schema.py
tests/unit/providers/routing/test_models.py
tests/unit/providers/routing/test_resolver.py
tests/unit/providers/routing/test_strategies.py

📜 Review details

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations — Python 3.14 has native lazy annotations via PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
All public functions must include type hints; enforce via mypy strict mode
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Use Pydantic v2 models with BaseModel, model_validator, and ConfigDict
Enforce 88-character line length (enforced by ruff)
Keep functions under 50 lines; keep files under 800 lines

Files:

src/ai_company/providers/routing/strategies.py
src/ai_company/providers/routing/models.py
src/ai_company/config/schema.py
src/ai_company/observability/events.py
tests/unit/providers/routing/test_strategies.py
src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/resolver.py
src/ai_company/core/role_catalog.py
tests/unit/providers/routing/test_models.py
src/ai_company/providers/routing/__init__.py
tests/unit/config/conftest.py
src/ai_company/providers/__init__.py
tests/unit/config/test_schema.py
tests/unit/providers/routing/test_resolver.py

src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic must import: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging or logging.getLogger() or print() in application code — use the ai_company logger
Always use logger as the variable name (not _logger or log)
Always use event name constants from ai_company.observability.events in logger calls
Use structured logging format: logger.info(EVENT, key=value) — never use format strings like logger.info("msg %s", val)
All error paths must log at WARNING or ERROR level with context before raising an exception
All state transitions must log at INFO level
Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions

Files:

src/ai_company/providers/routing/strategies.py
src/ai_company/providers/routing/models.py
src/ai_company/config/schema.py
src/ai_company/observability/events.py
src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/resolver.py
src/ai_company/core/role_catalog.py
src/ai_company/providers/routing/__init__.py
src/ai_company/providers/__init__.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark all test cases with pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, or @pytest.mark.slow
Configure pytest async mode as asyncio_mode = "auto" — do not manually add @pytest.mark.asyncio to test functions

Files:

tests/unit/providers/routing/test_strategies.py
tests/unit/providers/routing/test_models.py
tests/unit/config/conftest.py
tests/unit/config/test_schema.py
tests/unit/providers/routing/test_resolver.py

🧠 Learnings (5)

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to **/*.py : Do not use `from __future__ import annotations` — Python 3.14 has native lazy annotations via PEP 649

Applied to files:

src/ai_company/providers/routing/strategies.py
src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/resolver.py

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : NEVER add `from __future__ import annotations` - Python 3.14+ has native support for deferred annotation evaluation and does not require this import

Applied to files:

src/ai_company/providers/routing/strategies.py
src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/resolver.py

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` in logger calls

Applied to files:

src/ai_company/observability/events.py

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising an exception

Applied to files:

src/ai_company/providers/routing/router.py

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions

Applied to files:

src/ai_company/providers/routing/router.py

🧬 Code graph analysis (8)

src/ai_company/providers/routing/models.py (1)

src/ai_company/core/enums.py (1)

SeniorityLevel (6-21)

tests/unit/providers/routing/test_strategies.py (5)

src/ai_company/config/schema.py (3)

ProviderConfig (52-95)

RoutingConfig (137-159)

RoutingRuleConfig (98-134)

src/ai_company/core/enums.py (1)

SeniorityLevel (6-21)

src/ai_company/providers/routing/errors.py (2)

ModelResolutionError (16-19)

NoAvailableModelError (22-25)

src/ai_company/providers/routing/models.py (1)

RoutingRequest (48-83)

src/ai_company/providers/routing/resolver.py (2)

ModelResolver (28-181)

from_config (57-115)

src/ai_company/providers/routing/router.py (5)

src/ai_company/providers/routing/errors.py (2)

RoutingError (10-13)

UnknownStrategyError (28-31)

tests/unit/providers/routing/conftest.py (1)

resolver (97-101)

src/ai_company/providers/routing/resolver.py (2)

ModelResolver (28-181)

from_config (57-115)

src/ai_company/config/schema.py (1)

ProviderConfig (52-95)

src/ai_company/providers/routing/models.py (2)

RoutingDecision (86-104)

RoutingRequest (48-83)

src/ai_company/providers/routing/resolver.py (4)

src/ai_company/observability/_logger.py (1)

get_logger (8-28)

src/ai_company/providers/routing/errors.py (1)

ModelResolutionError (16-19)

src/ai_company/providers/routing/models.py (2)

ResolvedModel (9-45)

total_cost_per_1k (43-45)

src/ai_company/config/schema.py (1)

ProviderConfig (52-95)

src/ai_company/core/role_catalog.py (2)

src/ai_company/core/enums.py (2)

CostTier (57-69)

SeniorityLevel (6-21)

src/ai_company/core/role.py (1)

SeniorityInfo (87-120)

tests/unit/providers/routing/test_models.py (3)

src/ai_company/core/enums.py (1)

SeniorityLevel (6-21)

src/ai_company/providers/routing/models.py (3)

ResolvedModel (9-45)

RoutingDecision (86-104)

RoutingRequest (48-83)

tests/unit/providers/routing/conftest.py (3)

ResolvedModelFactory (25-34)

RoutingDecisionFactory (47-54)

RoutingRequestFactory (37-44)

src/ai_company/providers/routing/__init__.py (4)

src/ai_company/providers/routing/errors.py (4)

ModelResolutionError (16-19)

NoAvailableModelError (22-25)

RoutingError (10-13)

UnknownStrategyError (28-31)

src/ai_company/providers/routing/models.py (3)

ResolvedModel (9-45)

RoutingDecision (86-104)

RoutingRequest (48-83)

src/ai_company/providers/routing/resolver.py (1)

ModelResolver (28-181)

src/ai_company/providers/routing/strategies.py (5)

CostAwareStrategy (482-526)

ManualStrategy (338-375)

RoleBasedStrategy (381-476)

RoutingStrategy (46-74)

SmartStrategy (532-654)

tests/unit/config/test_schema.py (2)

src/ai_company/config/schema.py (1)

RoutingRuleConfig (98-134)

src/ai_company/core/enums.py (1)

SeniorityLevel (6-21)

🔇 Additional comments (14)

src/ai_company/core/role_catalog.py (2)

25-67: Vendor-agnostic tier mapping is consistent and clean.

The typical_model_tier updates are consistent across all seniority levels and align with the routing decoupling objective.

443-447: Good observability on the miss/error path.

Adding a structured WARNING log with ROLE_LOOKUP_MISS before raising LookupError is the right behavior for diagnosability.

As per coding guidelines, "All error paths must log at WARNING or ERROR level with context before raising an exception."

src/ai_company/config/schema.py (1)

101-103: Matcher requirement documentation is clear and actionable.

This makes the rule contract explicit and easier to reason about in config/tests.

tests/unit/config/conftest.py (1)

42-44: Factory default update is appropriate for new routing-rule validation.

Setting task_type here keeps generated test objects valid by default.

src/ai_company/observability/events.py (1)

79-91: Routing event constant set is comprehensive and consistent.

These additions improve observability coverage across routing lifecycle/error paths.

tests/unit/config/test_schema.py (1)

136-154: Test updates correctly reflect the new routing-rule matcher contract.

Coverage now includes both valid minimal shapes and the no-matcher rejection path.

Also applies to: 195-200, 357-362, 377-377, 421-427, 436-442

src/ai_company/providers/routing/resolver.py (1)

52-54: Immutable index wrapping is a solid safety choice.

Using MappingProxyType prevents accidental mutation of resolver state.

tests/unit/providers/routing/test_models.py (1)

18-141: Domain model test coverage is strong and well-targeted.

The suite validates defaults, invariants, and frozen-model behavior across all routing models.

src/ai_company/providers/routing/router.py (1)

79-85: Router build logging includes the right operational context.

The emitted payload is useful for startup diagnostics and config verification.

src/ai_company/providers/routing/__init__.py (1)

7-51: Routing package exports are clean and complete.

__all__ is explicit and aligns with the intended public routing API.

tests/unit/providers/routing/test_strategies.py (1)

26-470: Comprehensive strategy coverage looks solid.

Great depth here across protocol conformance, fallback chains, soft-fail behavior, and budget-aware paths.

src/ai_company/providers/__init__.py (1)

37-109: Routing re-exports are clean and consistent.

The import/export surface is coherent and the new symbols are correctly exposed via __all__.

tests/unit/providers/routing/test_resolver.py (1)

12-143: Resolver tests cover the critical invariants well.

Good checks for collision detection, immutability, and cost ordering semantics.

src/ai_company/providers/routing/models.py (1)

9-104: Routing domain models are well-structured and validated.

Immutability + field constraints + clear typed contracts look good for strategy/resolver integration.

src/ai_company/config/schema.py

coderabbitai · 2026-03-02T05:31:14Z

src/ai_company/providers/routing/resolver.py

+    def from_config(
+        cls,
+        providers: dict[str, ProviderConfig],
+    ) -> ModelResolver:
+        """Build a resolver from a provider config dict.
+
+        Args:
+            providers: Provider config dict (key = provider name).
+
+        Returns:
+            A new ``ModelResolver`` with all models indexed.
+        """
+        index: dict[str, ResolvedModel] = {}
+
+        for provider_name, provider_config in providers.items():
+            for model_config in provider_config.models:
+                resolved = ResolvedModel(
+                    provider_name=provider_name,
+                    model_id=model_config.id,
+                    alias=model_config.alias,
+                    cost_per_1k_input=model_config.cost_per_1k_input,
+                    cost_per_1k_output=model_config.cost_per_1k_output,
+                    max_context=model_config.max_context,
+                )
+                for ref in (model_config.id, model_config.alias):
+                    if ref is None:
+                        continue
+                    existing = index.get(ref)
+                    if existing is not None and existing != resolved:
+                        logger.error(
+                            ROUTING_MODEL_RESOLUTION_FAILED,
+                            ref=ref,
+                            existing_provider=existing.provider_name,
+                            existing_model_id=existing.model_id,
+                            new_provider=provider_name,
+                            new_model_id=resolved.model_id,
+                        )
+                        msg = (
+                            f"Duplicate model reference {ref!r}: "
+                            f"{existing.provider_name}/{existing.model_id} "
+                            f"vs {provider_name}/{resolved.model_id}"
+                        )
+                        raise ModelResolutionError(
+                            msg,
+                            context={
+                                "ref": ref,
+                                "existing_provider": existing.provider_name,
+                                "new_provider": provider_name,
+                            },
+                        )
+                    index[ref] = resolved
+
+        logger.info(
+            ROUTING_RESOLVER_BUILT,
+            model_count=len({m.model_id for m in index.values()}),
+            ref_count=len(index),
+            providers=sorted(providers),
+        )
+        return cls(index)


🛠️ Refactor suggestion | 🟠 Major

Refactor from_config into smaller helpers to satisfy function-size rule.

This method currently combines multiple concerns and exceeds the 50-line limit.

♻️ Proposed refactor direction

class ModelResolver: + `@staticmethod` + def _build_resolved_model( + provider_name: str, + model_config: ProviderModelConfig, + ) -> ResolvedModel: + return ResolvedModel( + provider_name=provider_name, + model_id=model_config.id, + alias=model_config.alias, + cost_per_1k_input=model_config.cost_per_1k_input, + cost_per_1k_output=model_config.cost_per_1k_output, + max_context=model_config.max_context, + ) + + `@staticmethod` + def _index_ref( + index: dict[str, ResolvedModel], + ref: str, + resolved: ResolvedModel, + provider_name: str, + ) -> None: + existing = index.get(ref) + if existing is not None and existing != resolved: + logger.error( + ROUTING_MODEL_RESOLUTION_FAILED, + ref=ref, + existing_provider=existing.provider_name, + existing_model_id=existing.model_id, + new_provider=provider_name, + new_model_id=resolved.model_id, + ) + msg = ( + f"Duplicate model reference {ref!r}: " + f"{existing.provider_name}/{existing.model_id} " + f"vs {provider_name}/{resolved.model_id}" + ) + raise ModelResolutionError(msg, context={"ref": ref}) + index[ref] = resolved @@ def from_config( cls, providers: dict[str, ProviderConfig], ) -> ModelResolver: index: dict[str, ResolvedModel] = {} for provider_name, provider_config in providers.items(): for model_config in provider_config.models: - resolved = ResolvedModel(...) + resolved = cls._build_resolved_model(provider_name, model_config) for ref in (model_config.id, model_config.alias): if ref is None: continue - ... + cls._index_ref(index, ref, resolved, provider_name)

As per coding guidelines: "Keep functions under 50 lines; keep files under 800 lines."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/providers/routing/resolver.py` around lines 57 - 115, The from_config classmethod on ModelResolver is too large and should be split into focused helpers: extract a helper to build a ResolvedModel from (provider_name, model_config) (referencing ResolvedModel and model_config fields: id, alias, cost_per_1k_input, cost_per_1k_output, max_context), extract a helper to index a resolved model into the index dict (handling duplicate detection and raising ModelResolutionError and logging ROUTING_MODEL_RESOLUTION_FAILED with the same context), and extract a small helper to compute the final metrics for the logger call (model_count/ref_count). Replace the current inline loops in from_config with calls to these helpers and keep from_config as an orchestration layer that builds the index and returns cls(index).

src/ai_company/providers/routing/router.py

src/ai_company/providers/routing/strategies.py

Copilot

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-02T05:32:45Z

src/ai_company/core/role_catalog.py

 SENIORITY_INFO: tuple[SeniorityInfo, ...] = (
    SeniorityInfo(
        level=SeniorityLevel.JUNIOR,
        authority_scope="Execute assigned tasks only",
-        typical_model_tier="haiku",
+        typical_model_tier="small",
        cost_tier=CostTier.LOW,
    ),


typical_model_tier values were changed to vendor-agnostic aliases (small/medium/large), but built-in templates + template rendering logic still default to/use haiku/sonnet/opus (e.g. src/ai_company/templates/renderer.py default "sonnet" and multiple templates/builtins/*.yaml). This will likely make template-generated configs fail to resolve models unless users keep legacy aliases; either update templates/renderer to the new tier names or add backward-compatible alias mapping (haiku→small, sonnet→medium, opus→large).

…and templates - Add allow_inf_nan=False to ResolvedModel and ProviderModelConfig - Add logging before raise in RoutingRuleConfig._at_least_one_matcher - Add strategy-required-fields docstring to RoutingRequest - Update all template tiers from vendor-specific (opus/sonnet/haiku) to generic (large/medium/small) across schema, renderer, and 7 YAML builtins - Add tests for total_cost_per_1k, inf/nan rejection, fallback chain skipping unresolvable entries, exhausted non-empty chain, rule fallback dedup, and routing failure warning log - Rename misleading test_budget_filters_models to test_tight_budget_picks_cheapest - Add noqa comments for intentional runtime imports (PEP 649 compatibility)

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/config/schema.py`:
- Around line 137-141: The validation-failure warning for RoutingRuleConfig
currently logs CONFIG_VALIDATION_FAILED with model and error only; update the
logger.warning call (the one that emits CONFIG_VALIDATION_FAILED in
src/ai_company/config/schema.py, around the RoutingRuleConfig validation path)
to include the routing rule context keys role_level, task_type, preferred_model,
and fallback in the payload (pass them as keyword args alongside
model="RoutingRuleConfig" and error=msg) so the warning contains those fields
for easier triage.

In `@src/ai_company/providers/routing/models.py`:
- Around line 69-90: The remaining_budget field on the RoutingRequest model
currently allows NaN/Infinity which breaks deterministic cost comparisons;
update the Field(...) for remaining_budget in the RoutingRequest class to
disallow non-finite floats by adding allow_inf_nan=False (i.e.,
remaining_budget: float | None = Field(default=None, ge=0.0,
allow_inf_nan=False, description=...)); ensure the same semantics as
ResolvedModel so budget filtering remains reliable.

In `@src/ai_company/providers/routing/strategies.py`:
- Around line 562-569: The code returns immediately when
_try_task_type_rules(...) yields a decision, which skips budget checks and
allows models that exceed remaining_budget; update the flow in the method
containing the call to _try_task_type_rules so that when a non-None decision is
produced you still validate it against remaining_budget (or run the same
budget-filtering logic used in the cost-aware branch) before returning—e.g.,
check the selected model's cost against remaining_budget and only return the
decision if it passes, otherwise continue to the next decision path or fall back
to a budget-compliant selection; reference _try_task_type_rules,
remaining_budget, and the cost-aware routing logic when making the change.
- Around line 615-633: The SmartStrategy priority chain is short-circuited by
calling self._try_global_chain before role/seniority stages, letting global
fallback models win too early; change the resolution order in the composite OR
chain so that _try_task_type_rules, _try_role_rules, and _try_seniority_default
run (and may traverse config.fallback_chain as they currently do) before any
global-chain resolution, and only call self._try_global_chain after
_try_seniority_default and _try_cheapest. Update the OR sequence that currently
ends with "or self._try_global_chain(config, resolver) or
self._try_cheapest(request, resolver)" to instead call
self._try_cheapest(request, resolver) first and then
self._try_global_chain(config, resolver) so global-chain models are only
considered last.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99fa913 and b17601b.

📒 Files selected for processing (18)

src/ai_company/config/schema.py
src/ai_company/providers/routing/models.py
src/ai_company/providers/routing/resolver.py
src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/strategies.py
src/ai_company/templates/builtins/agency.yaml
src/ai_company/templates/builtins/dev_shop.yaml
src/ai_company/templates/builtins/full_company.yaml
src/ai_company/templates/builtins/product_team.yaml
src/ai_company/templates/builtins/research_lab.yaml
src/ai_company/templates/builtins/solo_founder.yaml
src/ai_company/templates/builtins/startup.yaml
src/ai_company/templates/renderer.py
src/ai_company/templates/schema.py
tests/unit/providers/routing/test_models.py
tests/unit/providers/routing/test_router.py
tests/unit/providers/routing/test_strategies.py
tests/unit/templates/test_schema.py

📜 Review details

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations — Python 3.14 has native lazy annotations via PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
All public functions must include type hints; enforce via mypy strict mode
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Use Pydantic v2 models with BaseModel, model_validator, and ConfigDict
Enforce 88-character line length (enforced by ruff)
Keep functions under 50 lines; keep files under 800 lines

Files:

src/ai_company/templates/schema.py
tests/unit/templates/test_schema.py
src/ai_company/templates/renderer.py
tests/unit/providers/routing/test_router.py
src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/models.py
tests/unit/providers/routing/test_strategies.py
tests/unit/providers/routing/test_models.py
src/ai_company/providers/routing/resolver.py
src/ai_company/providers/routing/strategies.py
src/ai_company/config/schema.py

src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic must import: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging or logging.getLogger() or print() in application code — use the ai_company logger
Always use logger as the variable name (not _logger or log)
Always use event name constants from ai_company.observability.events in logger calls
Use structured logging format: logger.info(EVENT, key=value) — never use format strings like logger.info("msg %s", val)
All error paths must log at WARNING or ERROR level with context before raising an exception
All state transitions must log at INFO level
Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions

Files:

src/ai_company/templates/schema.py
src/ai_company/templates/renderer.py
src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/models.py
src/ai_company/providers/routing/resolver.py
src/ai_company/providers/routing/strategies.py
src/ai_company/config/schema.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark all test cases with pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, or @pytest.mark.slow
Configure pytest async mode as asyncio_mode = "auto" — do not manually add @pytest.mark.asyncio to test functions

Files:

tests/unit/templates/test_schema.py
tests/unit/providers/routing/test_router.py
tests/unit/providers/routing/test_strategies.py
tests/unit/providers/routing/test_models.py

🧠 Learnings (11)

📚 Learning: 2026-01-24T16:33:29.354Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/model_service.py} : Respect existing model configuration patterns in Ollama integration

Applied to files:

src/ai_company/templates/schema.py
tests/unit/templates/test_schema.py
src/ai_company/templates/renderer.py
src/ai_company/providers/routing/resolver.py

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Tests must use fake model names (e.g., `test-model:8b`, `fake-writer:latest`)—never use real model IDs from `RECOMMENDED_MODELS`.

Applied to files:

tests/unit/templates/test_schema.py

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Configure appropriate temperature settings based on agent role: Writer (0.9), Editor (0.6), Continuity (0.3), Architect (0.85), Interviewer (0.7)

Applied to files:

src/ai_company/templates/builtins/research_lab.yaml

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/agents/**/*.py : Use context size of 32768 tokens as default for Ollama models.

Applied to files:

src/ai_company/templates/renderer.py

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to **/*.py : Do not use `from __future__ import annotations` — Python 3.14 has native lazy annotations via PEP 649

Applied to files:

src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/resolver.py
src/ai_company/providers/routing/strategies.py

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : NEVER add `from __future__ import annotations` - Python 3.14+ has native support for deferred annotation evaluation and does not require this import

Applied to files:

src/ai_company/providers/routing/router.py
src/ai_company/providers/routing/resolver.py
src/ai_company/providers/routing/strategies.py

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising an exception

Applied to files:

src/ai_company/providers/routing/router.py
src/ai_company/config/schema.py

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions

Applied to files:

src/ai_company/providers/routing/router.py

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic must import: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

src/ai_company/config/schema.py

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` in logger calls

Applied to files:

src/ai_company/config/schema.py

📚 Learning: 2026-03-01T19:59:46.936Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All state transitions must log at INFO level

Applied to files:

src/ai_company/config/schema.py

🧬 Code graph analysis (4)

tests/unit/providers/routing/test_router.py (5)

src/ai_company/core/enums.py (1)

SeniorityLevel (6-21)

src/ai_company/providers/routing/errors.py (3)

ModelResolutionError (16-19)

NoAvailableModelError (22-25)

UnknownStrategyError (28-31)

src/ai_company/providers/routing/router.py (4)

ModelRouter (24-134)

strategy_name (88-90)

resolver (83-85)

route (92-134)

tests/unit/providers/routing/conftest.py (3)

three_model_provider (85-93)

resolver (97-101)

standard_routing_config (105-130)

src/ai_company/providers/routing/resolver.py (1)

all_models (173-176)

src/ai_company/providers/routing/router.py (3)

src/ai_company/providers/routing/errors.py (2)

RoutingError (10-13)

UnknownStrategyError (28-31)

src/ai_company/providers/routing/models.py (1)

RoutingRequest (48-90)

src/ai_company/providers/routing/resolver.py (2)

ModelResolver (25-188)

from_config (87-122)

tests/unit/providers/routing/test_models.py (3)

src/ai_company/core/enums.py (1)

SeniorityLevel (6-21)

src/ai_company/providers/routing/models.py (4)

ResolvedModel (9-45)

RoutingDecision (93-111)

RoutingRequest (48-90)

total_cost_per_1k (43-45)

tests/unit/providers/routing/conftest.py (3)

ResolvedModelFactory (25-34)

RoutingDecisionFactory (47-54)

RoutingRequestFactory (37-44)

src/ai_company/config/schema.py (2)

src/ai_company/observability/_logger.py (1)

get_logger (8-28)

src/ai_company/observability/config.py (1)

LogConfig (115-192)

🔇 Additional comments (17)

src/ai_company/templates/builtins/solo_founder.yaml (1)

36-36: Tier migration is consistent and clean.

The updated model tiers align with the new vendor-agnostic template strategy.

Also applies to: 41-41

src/ai_company/templates/builtins/research_lab.yaml (1)

35-35: Model tier updates look correct.

These assignments are consistent with the standardized large/medium/small naming.

Also applies to: 40-40, 45-45

src/ai_company/templates/builtins/dev_shop.yaml (1)

38-38: Good tier normalization across agent roles.

The migration preserves intent and removes provider coupling as expected.

Also applies to: 43-43, 48-48, 53-53, 58-58

src/ai_company/templates/builtins/full_company.yaml (1)

44-44: Template model mapping update is coherent.

The revised tiers are consistent with the new generic catalog.

Also applies to: 49-49, 54-54, 58-58, 63-63

src/ai_company/config/schema.py (1)

32-32: allow_inf_nan=False is a solid hardening change.

Good guardrail for provider model numeric fields.

src/ai_company/templates/builtins/startup.yaml (1)

39-39: Startup template tier migration looks correct.

The updated model values are consistent and aligned with the new routing vocabulary.

Also applies to: 44-44, 49-49, 54-54, 59-59

tests/unit/templates/test_schema.py (1)

79-79: Test expectation update is correct.

This assertion now matches the new default model tier.

src/ai_company/templates/builtins/agency.yaml (1)

38-38: Agency template model updates are consistent.

The changes correctly apply the vendor-agnostic tier naming.

Also applies to: 43-43, 47-47

src/ai_company/templates/renderer.py (1)

379-380: LGTM! Vendor-agnostic model tier default.

The change from "sonnet" to "medium" aligns with the PR objective to use vendor-agnostic tier names (small/medium/large) instead of vendor-specific model names.

src/ai_company/templates/builtins/product_team.yaml (1)

38-52: LGTM! Consistent vendor-agnostic model tier assignments.

The model tier updates appropriately map agent roles to capability tiers:

Senior roles (Product Manager, Full-Stack Developer) and mid-level UX Designer → "medium"

QA Engineer → "small"

This aligns with the PR objective for vendor-agnostic tier naming.

src/ai_company/templates/schema.py (1)

88-102: LGTM! Schema default and documentation aligned.

The default value change to "medium" and updated docstring documenting tier options ("large", "medium", "small") are consistent with the vendor-agnostic tier naming across the codebase.

tests/unit/providers/routing/test_models.py (2)

1-98: LGTM! Comprehensive model validation tests.

Good coverage of ResolvedModel including:

Factory construction and immutability

Optional alias handling

Cost defaults and non-negative validation

max_context positivity constraint

NotBlankStr validation for provider_name/model_id

total_cost_per_1k computed property

Inf/NaN rejection via allow_inf_nan=False

100-166: LGTM! Request and decision model tests.

Good coverage of RoutingRequest and RoutingDecision:

All-optional field construction

Negative budget rejection

NotBlankStr validation for task_type and model_override

Default empty fallbacks tuple

Immutability enforcement

tests/unit/providers/routing/test_router.py (1)

1-171: LGTM! Comprehensive router tests with observability verification.

Excellent coverage including:

Strategy construction and alias mapping (cheapest → cost_aware)

Error handling for unknown strategies

Routing behavior across all strategies (cost_aware, manual, role_based, smart)

Observability verification for ROUTING_ROUTER_BUILT, ROUTING_DECISION_MADE, and ROUTING_SELECTION_FAILED events

Error propagation validation

src/ai_company/providers/routing/router.py (1)

1-134: LGTM! Well-structured router with proper observability.

The implementation:

Correctly delegates to strategies via STRATEGY_MAP

Logs structured events for router construction (ROUTING_ROUTER_BUILT) and routing decisions (ROUTING_DECISION_MADE)

Logs warning with full context (ROUTING_SELECTION_FAILED) before re-raising RoutingError

Exposes resolver and strategy_name properties for introspection

tests/unit/providers/routing/test_strategies.py (1)

1-540: LGTM! Comprehensive strategy test suite.

Excellent coverage of all four strategies with edge cases:

Protocol conformance and STRATEGY_MAP completeness (including cheapest alias)

ManualStrategy: override resolution, missing override, unknown model errors

RoleBasedStrategy: rule matching, seniority defaults, rule fallbacks, global chain fallbacks

CostAwareStrategy: cheapest selection, task_type priority, budget filtering, zero-budget warning

SmartStrategy: priority ordering (override > task_type > role > seniority > chain > cheapest), soft-fail override handling

Fallback chain deduplication and exhaustion scenarios

src/ai_company/providers/routing/resolver.py (1)

1-188: LGTM! Clean resolver with O(1) lookup and collision detection.

Well-designed implementation:

MappingProxyType ensures post-construction immutability

_index_ref helper properly handles collision detection with structured error logging

resolve() logs warning before raising ModelResolutionError

resolve_safe() provides non-throwing variant with debug logging

all_models_sorted_by_cost() supports cost-aware strategy selection

Refactoring from past review (extracting _index_ref) has been applied, keeping from_config concise

src/ai_company/config/schema.py

coderabbitai · 2026-03-02T19:27:55Z

src/ai_company/providers/routing/models.py

+    model_config = ConfigDict(frozen=True)
+
+    agent_level: SeniorityLevel | None = Field(
+        default=None,
+        description="Seniority level of the requesting agent",
+    )
+    task_type: NotBlankStr | None = Field(
+        default=None,
+        description="Task type label",
+    )
+    model_override: NotBlankStr | None = Field(
+        default=None,
+        description="Explicit model reference for manual routing",
+    )
+    remaining_budget: float | None = Field(
+        default=None,
+        ge=0.0,
+        description=(
+            "Per-request cost ceiling (compared against model "
+            "total_cost_per_1k). Not a total session budget."
+        ),
+    )


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, examine the file to confirm the model definition cat -n src/ai_company/providers/routing/models.py | head -100

Repository: Aureliolo/ai-company

Length of output: 4271

🌐 Web query:

Pydantic v2 ConfigDict allow_inf_nan default value and float field validation behavior

💡 Result:

ConfigDict.allow_inf_nan default (Pydantic v2): True — i.e., by default Pydantic accepts float('inf'), float('-inf'), and float('nan') (and likewise for Decimal fields). [1]

What it changes for float fields:

If allow_inf_nan=True: float validation allows inf/-inf/nan (including commonly accepted string forms like "inf", "infinity", "nan" in non-strict parsing). [1][3]

If allow_inf_nan=False: those values are rejected with a validation error for float/decimal fields. [1]

Schema-level / per-field override: in pydantic-core, float_schema(..., allow_inf_nan=...) exists; its default is None (meaning “don’t override here”, so model config can apply). [2]

Sources
[1] Pydantic v2 config docs (ConfigDict.allow_inf_nan, default True) (docs.pydantic.dev)
[2] Pydantic docs: pydantic_core.core_schema.float_schema(allow_inf_nan=...) (docs.pydantic.dev)
[3] Summary of pydantic-core float validator accepted inputs (incl. "inf", "infinity", "nan" when allowed) (deepwiki.com)

Citations:

1: https://docs.pydantic.dev/latest/api/config/

2: https://docs.pydantic.dev/latest/api/pydantic_core_schema/?utm_source=openai

3: https://deepwiki.com/pydantic/pydantic-core/3.1-basic-type-validators?utm_source=openai

Disallow NaN/Infinity for remaining_budget to keep budget routing deterministic.

RoutingRequest omits allow_inf_nan=False while ResolvedModel in the same file enforces it. This allows non-finite float inputs for the remaining_budget field, which is used in cost comparisons (lines 62–66) and would produce unreliable filtering results.

Proposed fix

class RoutingRequest(BaseModel): - model_config = ConfigDict(frozen=True) + model_config = ConfigDict(frozen=True, allow_inf_nan=False)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

model_config = ConfigDict(frozen=True)

agent_level: SeniorityLevel | None = Field(

default=None,

description="Seniority level of the requesting agent",

)

task_type: NotBlankStr | None = Field(

default=None,

description="Task type label",

)

model_override: NotBlankStr | None = Field(

default=None,

description="Explicit model reference for manual routing",

)

remaining_budget: float | None = Field(

default=None,

ge=0.0,

description=(

"Per-request cost ceiling (compared against model "

"total_cost_per_1k). Not a total session budget."

),

)

model_config = ConfigDict(frozen=True, allow_inf_nan=False)

agent_level: SeniorityLevel | None = Field(

default=None,

description="Seniority level of the requesting agent",

)

task_type: NotBlankStr | None = Field(

default=None,

description="Task type label",

)

model_override: NotBlankStr | None = Field(

default=None,

description="Explicit model reference for manual routing",

)

remaining_budget: float | None = Field(

default=None,

ge=0.0,

description=(

"Per-request cost ceiling (compared against model "

"total_cost_per_1k). Not a total session budget."

),

)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/providers/routing/models.py` around lines 69 - 90, The remaining_budget field on the RoutingRequest model currently allows NaN/Infinity which breaks deterministic cost comparisons; update the Field(...) for remaining_budget in the RoutingRequest class to disallow non-finite floats by adding allow_inf_nan=False (i.e., remaining_budget: float | None = Field(default=None, ge=0.0, allow_inf_nan=False, description=...)); ensure the same semantics as ResolvedModel so budget filtering remains reliable.

src/ai_company/providers/routing/strategies.py

- Add routing rule context (role_level, task_type, preferred_model, fallback) to RoutingRuleConfig validation warning log payload - Add allow_inf_nan=False to RoutingRequest ConfigDict so inf/nan budgets are rejected at validation time - Gate CostAwareStrategy task_type rule decisions against remaining_budget before returning (fall through to cheapest if the matched model exceeds budget) - Swap SmartStrategy priority: cheapest (budget-aware) now runs before global fallback chain so cost-informed selection takes precedence over blind chain traversal

Copilot

Pull request overview

Copilot reviewed 33 out of 34 changed files in this pull request and generated 5 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-02T19:46:37Z

src/ai_company/providers/routing/strategies.py

+        logger.info(
+            ROUTING_FALLBACK_EXHAUSTED,
+            ref=ref,
+            source="safe_wrapper",
+        )


_try_resolve_with_fallback_safe() currently logs ROUTING_FALLBACK_EXHAUSTED after catching NoAvailableModelError, but _try_resolve_with_fallback() already logs the same event at WARNING right before raising. This results in duplicate exhaustion logs for every safe attempt. Consider removing the wrapper log, or refactoring _try_resolve_with_fallback() to optionally suppress/downgrade its exhaustion log when used by the safe variant.

Suggested change

logger.info(

ROUTING_FALLBACK_EXHAUSTED,

ref=ref,

source="safe_wrapper",

)

Copilot · 2026-03-02T19:46:37Z

tests/unit/providers/routing/test_strategies.py

+        """MID has no rule -> uses seniority catalog (sonnet tier)."""
+        strategy = RoleBasedStrategy()


This docstring still refers to the seniority catalog default as the “sonnet tier”, but the routing tiers were renamed to vendor-agnostic small/medium/large in this PR. Updating this keeps tests/documentation consistent with the new tier naming.

Copilot · 2026-03-02T19:46:37Z

tests/unit/providers/routing/test_strategies.py

+        # review rule -> opus; junior role rule -> haiku; task wins
+        assert decision.resolved_model.alias == "large"
+        assert "task-type" in decision.reason.lower()


This comment uses vendor-specific names (“opus/haiku”). Since this PR’s stated goal is vendor-agnostic tiers (small/medium/large), it would be clearer to update the comment to those tier names (e.g., “review rule -> large; junior role rule -> small”).

Copilot · 2026-03-02T19:46:38Z

tests/unit/providers/routing/test_strategies.py

+        """Budget large enough for haiku+sonnet but not opus picks haiku."""
+        # haiku total=0.006, sonnet total=0.018, opus total=0.090
+        strategy = CostAwareStrategy()


This docstring still references vendor model names (“haiku/sonnet/opus”), but the suite otherwise uses the vendor-agnostic tier aliases (small/medium/large). Updating the wording avoids reintroducing vendor coupling in test docs.

Copilot · 2026-03-02T19:46:38Z

tests/unit/providers/routing/test_strategies.py

+        """LEAD has tier=opus; if opus not registered, use fallback chain."""
+        provider = ProviderConfig(


This docstring/comment mentions “tier=opus”. With the tier aliases renamed to large/medium/small, consider updating the text to reflect the new large tier naming.

🤖 I have created a release *beep* *boop* --- ## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

🤖 I have created a release *beep* *boop* --- ## [0.1.0](v0.0.0...v0.1.0) (2026-03-11) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15)) * bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e)) * bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238)) * bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e)) * bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7)) * bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>

Aureliolo added 2 commits March 1, 2026 22:00

feat: implement model routing engine

78f4bff

Add strategy-based LLM model routing with resolver, 4 strategies (manual, role_based, cost_aware, smart), and 83 unit tests at 94%+ coverage. Closes #6.

Copilot AI review requested due to automatic review settings March 1, 2026 21:52

Copilot started reviewing on behalf of Aureliolo March 1, 2026 21:52 View session

Copilot AI reviewed Mar 1, 2026

View reviewed changes

coderabbitai bot reviewed Mar 1, 2026

View reviewed changes

gemini-code-assist bot reviewed Mar 1, 2026

View reviewed changes

Aureliolo requested a review from Copilot March 2, 2026 05:27

Copilot started reviewing on behalf of Aureliolo March 2, 2026 05:27 View session

gemini-code-assist bot reviewed Mar 2, 2026

View reviewed changes

coderabbitai bot reviewed Mar 2, 2026

View reviewed changes

Copilot AI reviewed Mar 2, 2026

View reviewed changes

coderabbitai bot reviewed Mar 2, 2026

View reviewed changes

Copilot AI review requested due to automatic review settings March 2, 2026 19:40

Copilot started reviewing on behalf of Aureliolo March 2, 2026 19:41 View session

Aureliolo merged commit d3c250b into main Mar 2, 2026
11 of 12 checks passed

Aureliolo deleted the feat/model-routing-engine branch March 2, 2026 19:43

Copilot AI reviewed Mar 2, 2026

View reviewed changes

This was referenced Mar 6, 2026

refactor: split events.py into per-domain event modules #136

Merged

feat: add FastestStrategy routing + vendor-agnostic cleanup #140

Merged

fix: incorporate post-merge feedback + pre-PR review fixes #164

Merged

coderabbitai bot mentioned this pull request Mar 9, 2026

refactor: harden BudgetEnforcer with error handling and review fixes #182

Merged

6 tasks

Aureliolo mentioned this pull request Mar 10, 2026

chore(main): release ai-company 0.1.1 #282

Merged

coderabbitai bot mentioned this pull request Mar 10, 2026

fix: resolve circular imports, bump litellm, fix release tag format #285

Closed

5 tasks

Aureliolo mentioned this pull request Mar 10, 2026

chore(main): release 0.1.0 #283

Merged

This was referenced Mar 15, 2026

chore(main): release 0.2.4 #431

Merged

chore(main): release 0.2.0 #442

Closed

chore(main): release 0.2.5 #447

Merged

chore(main): release 0.2.0 #460

Closed

chore(main): release 0.2.0 #471

Closed

		STRATEGY_MAP: Mapping[str, RoutingStrategy] = MappingProxyType(
		{

		if TYPE_CHECKING:
		from ai_company.config.schema import ProviderConfig

	if existing is not None and existing.model_id != resolved.model_id:
	if existing is not None and existing != resolved:

	if not resolver.all_models_sorted_by_cost():
	# Use a cheap existence check to avoid extra sorting work here.
	if not resolver.all_models():

		"""MID has no rule -> uses seniority catalog (sonnet tier)."""
		strategy = RoleBasedStrategy()

		"""LEAD has tier=opus; if opus not registered, use fallback chain."""
		provider = ProviderConfig(

Conversation

Aureliolo commented Mar 1, 2026

Summary

Test plan

Uh oh!

github-actions bot commented Mar 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Dependency Review

Scanned Files

Uh oh!

coderabbitai bot commented Mar 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist bot commented Mar 1, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 1, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Mar 1, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Mar 1, 2026

Choose a reason for hiding this comment

Uh oh!

Aureliolo commented Mar 2, 2026

Uh oh!

coderabbitai bot commented Mar 2, 2026

Uh oh!

Aureliolo commented Mar 2, 2026

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Mar 2, 2026

Choose a reason for hiding this comment

github-actions bot commented Mar 1, 2026 •

edited

Loading

coderabbitai bot commented Mar 1, 2026 •

edited

Loading