Skip to content

feat: implement model routing engine#99

Merged
Aureliolo merged 5 commits intomainfrom
feat/model-routing-engine
Mar 2, 2026
Merged

feat: implement model routing engine#99
Aureliolo merged 5 commits intomainfrom
feat/model-routing-engine

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

  • Implements the model routing engine with 4 strategies: Manual, RoleBasedStrategy, CostAwareStrategy, and SmartStrategy
  • Adds ModelResolver for O(1) alias/ID-to-model lookup with immutable MappingProxyType index
  • Adds collision detection, structured logging on all error paths, and NotBlankStr validation
  • Makes seniority catalog tiers vendor-agnostic (small/medium/large instead of vendor model names)
  • All test fixtures use generic provider names and model IDs — no vendor coupling

Closes #6

Test plan

  • All 1415 unit tests pass
  • ruff check clean
  • ruff format clean
  • mypy strict clean
  • CI pipeline passes

🤖 Generated with Claude Code

Add strategy-based LLM model routing with resolver, 4 strategies
(manual, role_based, cost_aware, smart), and 83 unit tests at 94%+
coverage. Closes #6.
Apply 23 review items from local agents and external reviewers:
- Add NotBlankStr validation on routing model string fields
- Add collision detection in ModelResolver.from_config
- Add missing logging on all error paths before raising
- Freeze STRATEGY_MAP with MappingProxyType
- Fix SmartStrategy priority order (global chain before cheapest)
- Remove redundant fallback chain walk in RoleBasedStrategy
- Use NoReturn type for _raise_exhausted
- Rename seniority catalog tiers from vendor-specific (haiku/sonnet/opus)
  to generic (small/medium/large)
- Make all test fixtures vendor-agnostic (no real provider names or model IDs)
- Add vendor-agnostic fixture rule to CLAUDE.md
Copilot AI review requested due to automatic review settings March 1, 2026 21:52
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 1, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 1, 2026

Note

Currently processing new changes in this PR. This may take a few minutes, please wait...

📥 Commits

Reviewing files that changed from the base of the PR and between b17601b and 50c17a3.

📒 Files selected for processing (5)
  • src/ai_company/config/schema.py
  • src/ai_company/providers/routing/models.py
  • src/ai_company/providers/routing/strategies.py
  • tests/unit/providers/routing/test_models.py
  • tests/unit/providers/routing/test_strategies.py

✏️ Tip: You can disable in-progress messages and the fortune message in your review settings.

Tip

You can validate your CodeRabbit configuration file in your editor.

If your editor has YAML language server, you can enable auto-completion and validation by adding # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json at the top of your CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Implements a new model routing subsystem (models, resolver, router, strategies, errors), adds routing observability events and config validation, updates seniority→model-tier mappings and template defaults, expands provider public exports, and adds comprehensive unit tests and routing fixtures. Also adds minor docs and testing guidance.

Changes

Cohort / File(s) Summary
Documentation & Guidance
\.claude/skills/aurelio-review-pr/skill.md, CLAUDE.md
Added logging-audit guidance and vendor-agnostic testing fixture guidance; docs only.
Core role mappings & tests
src/ai_company/core/role.py, src/ai_company/core/role_catalog.py, tests/unit/core/test_role.py, tests/unit/core/test_role_catalog.py
Updated typical_model_tier values across seniority levels (haiku/sonnet/opus → small/medium/large) and added ROLE_LOOKUP_MISS warning; tests updated accordingly.
Observability events
src/ai_company/observability/events.py
Added routing lifecycle event constants (router/resolver built, model resolved/resolution failed, decision made, fallback attempted/exhausted, no rule matched, budget exceeded, selection failed, strategy unknown).
Providers public API surface
src/ai_company/providers/__init__.py, src/ai_company/providers/routing/__init__.py
Re-exported routing package: strategy names/map, strategy classes, resolver/router, models, and routing errors; expanded module all.
Routing errors & tests
src/ai_company/providers/routing/errors.py, tests/unit/providers/routing/test_errors.py
New routing error hierarchy: RoutingError, ModelResolutionError, NoAvailableModelError, UnknownStrategyError (non-retryable); unit tests for inheritance/semantics.
Routing domain models & tests
src/ai_company/providers/routing/models.py, tests/unit/providers/routing/test_models.py
Added immutable Pydantic models: ResolvedModel, RoutingRequest, RoutingDecision with validation and computed total_cost; unit tests covering validation/immutability.
ModelResolver & tests/fixtures
src/ai_company/providers/routing/resolver.py, tests/unit/providers/routing/test_resolver.py, tests/unit/providers/routing/conftest.py
New ModelResolver with from_config, resolve/resolve_safe, deduplication, collision detection, cost-sorting, immutable index, and observability logging; comprehensive tests and test fixtures.
Router & strategies + tests
src/ai_company/providers/routing/router.py, src/ai_company/providers/routing/strategies.py, tests/unit/providers/routing/test_router.py, tests/unit/providers/routing/test_strategies.py
New ModelRouter, four routing strategies (Manual, RoleBased, CostAware, Smart), helper utilities, STRATEGY_MAP and name constants, error handling/logging, and extensive strategy/router tests.
Routing config schema & tests
src/ai_company/config/schema.py, tests/unit/config/test_schema.py, tests/unit/config/conftest.py
Added logger wiring, strengthened ProviderModelConfig config, introduced RoutingRuleConfig validator enforcing at least one matcher (role_level/task_type/fallback); tests updated to include task_type where required.
Templates & renderer defaults
src/ai_company/templates/..., src/ai_company/templates/renderer.py, src/ai_company/templates/schema.py, tests/unit/templates/test_schema.py
Default model tier changed from sonnetmedium (templates and renderer); several built-in template agent model values updated to new tiers; test expectation updated.
Routing tests suite
tests/unit/providers/routing/*
Large set of new/updated unit tests covering models, resolver, router, strategies, errors, fixtures and many edge cases.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Router as ModelRouter
    participant Strategy as RoutingStrategy
    participant Resolver as ModelResolver

    Client->>Router: route(RoutingRequest)
    Router->>Strategy: select(request, routing_config, resolver)
    alt override present
        Strategy->>Resolver: resolve(override_ref)
        Resolver-->>Strategy: ResolvedModel
    else task-type rule
        Strategy->>Resolver: resolve(preferred_model)
        Resolver-->>Strategy: ResolvedModel
    else role-based rule
        Strategy->>Resolver: resolve(role_preferred)
        Resolver-->>Strategy: ResolvedModel
    else cheapest/fallback
        Strategy->>Resolver: all_models_sorted_by_cost()
        Resolver-->>Strategy: ResolvedModel(s)
    end
    Strategy-->>Router: RoutingDecision
    Router->>Router: Log ROUTING_DECISION_MADE
    Router-->>Client: RoutingDecision
Loading
sequenceDiagram
    participant Config as RoutingConfig
    participant Router as ModelRouter
    participant Resolver as ModelResolver

    Config->>Router: __init__(routing_config, providers)
    Router->>Resolver: ModelResolver.from_config(providers)
    Resolver->>Resolver: Index models by id & alias
    Resolver-->>Router: ModelResolver instance
    Router->>Router: strategy = STRATEGY_MAP[name]
    alt unknown strategy
        Router->>Router: Log ROUTING_STRATEGY_UNKNOWN (rgba(255,0,0,0.5))
        Router-->>Router: Raise UnknownStrategyError
    else
        Router->>Router: Log ROUTING_ROUTER_BUILT (rgba(0,128,0,0.5))
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 38.82% which is insufficient. The required threshold is 100.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: implement model routing engine' clearly and concisely summarizes the main change—implementing a model routing system with multiple strategies.
Description check ✅ Passed The PR description is well-related to the changeset, detailing the routing engine implementation, strategies, resolver, validation, seniority tier changes, and testing status.
Linked Issues check ✅ Passed The PR implements all core requirements from issue #6: Manual, Role-based, Cost-aware, and Smart routing strategies; ModelResolver with O(1) lookup; fallback chain support; collision detection; structured logging; vendor-agnostic tiers; YAML config support; comprehensive unit tests; and routing decision logging.
Out of Scope Changes check ✅ Passed All changes are directly related to routing engine implementation and supporting infrastructure updates (seniority tiers, templates, config schema validation, event constants) without introducing unrelated functionality.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/model-routing-engine

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a foundational model routing engine designed to intelligently select the most appropriate LLM model based on various criteria. It provides a flexible framework with multiple strategies, enabling dynamic model choices driven by factors like user role, task type, cost considerations, or explicit overrides. The changes enhance the system's ability to manage and utilize diverse LLM resources efficiently, while also improving maintainability through structured error handling, robust validation, and vendor-agnostic configurations.

Highlights

  • Model Routing Engine Implementation: Implemented a comprehensive model routing engine supporting four distinct strategies: Manual, RoleBasedStrategy, CostAwareStrategy, and SmartStrategy, allowing for flexible and intelligent LLM model selection.
  • Efficient Model Resolution: Introduced a ModelResolver for O(1) lookup of model aliases and IDs, utilizing an immutable MappingProxyType index to ensure data integrity and performance.
  • Robustness and Observability: Added collision detection for model references, structured logging across all error paths for enhanced debuggability, and NotBlankStr validation for critical string fields.
  • Vendor-Agnostic Seniority Tiers: Refactored the seniority catalog to use vendor-agnostic model tier names (e.g., 'small', 'medium', 'large') instead of specific vendor model names, improving portability and reducing coupling.
  • Improved Test Fixtures: Ensured all new and updated test fixtures use generic provider names and model IDs, further decoupling tests from specific vendor implementations.
Changelog
  • .claude/skills/aurelio-review-pr/skill.md
    • Updated logging audit rules to include logging coverage suggestions and exclusions for specific file types.
  • CLAUDE.md
    • Added a new guideline for vendor-agnostic test fixtures, emphasizing the use of fake model IDs/names in tests.
  • src/ai_company/core/role.py
    • Updated the example for typical_model_tier in SeniorityInfo to use 'large' instead of 'opus', reflecting vendor-agnostic naming.
  • src/ai_company/core/role_catalog.py
    • Modified typical_model_tier values for all SeniorityInfo entries to use vendor-agnostic terms like 'small', 'medium', and 'large'.
  • src/ai_company/observability/events.py
    • Added a new section of constants for 'Routing lifecycle' events, including ROUTING_ROUTER_BUILT, ROUTING_MODEL_RESOLVED, and various error events.
  • src/ai_company/providers/init.py
    • Imported and exposed new classes and errors related to the routing engine, such as ModelRouter, ModelResolver, and various routing strategies.
  • src/ai_company/providers/routing/init.py
    • Added a new __init__.py file to the routing subpackage, exporting core routing components like errors, models, resolver, router, and strategies.
  • src/ai_company/providers/routing/errors.py
    • Added new exception classes for routing-specific errors, including RoutingError, ModelResolutionError, NoAvailableModelError, and UnknownStrategyError, all inheriting from ProviderError.
  • src/ai_company/providers/routing/models.py
    • Added new Pydantic models: ResolvedModel to represent a fully resolved LLM, RoutingRequest for routing inputs, and RoutingDecision for routing outputs.
  • src/ai_company/providers/routing/resolver.py
    • Added ModelResolver class to map model aliases and IDs to ResolvedModel instances, supporting O(1) lookup and collision detection during configuration.
  • src/ai_company/providers/routing/router.py
    • Added ModelRouter class, serving as the main entry point for routing decisions, which delegates to specific strategy implementations.
  • src/ai_company/providers/routing/strategies.py
    • Added RoutingStrategy protocol and implemented four concrete strategies: ManualStrategy, RoleBasedStrategy, CostAwareStrategy, and SmartStrategy, along with shared helper functions for model resolution and fallback logic.
  • tests/unit/core/test_role.py
    • Updated typical_model_tier values in SeniorityInfo test cases from vendor-specific names ('sonnet', 'haiku') to generic tiers ('medium', 'small').
  • tests/unit/core/test_role_catalog.py
    • Updated the test method name test_senior_uses_sonnet_tier to test_senior_uses_medium_tier and adjusted the assertion to check for 'medium' as the typical_model_tier.
  • tests/unit/providers/routing/conftest.py
    • Added new test factories (ResolvedModelFactory, RoutingRequestFactory, RoutingDecisionFactory) and fixtures (three_model_provider, resolver, standard_routing_config) to support unit testing of the routing engine.
  • tests/unit/providers/routing/test_errors.py
    • Added unit tests to verify the hierarchy and behavior of routing-specific error classes, ensuring they extend ProviderError and are not retryable.
  • tests/unit/providers/routing/test_models.py
    • Added unit tests for the ResolvedModel, RoutingRequest, and RoutingDecision Pydantic models, covering instantiation, immutability, validation, and default values.
  • tests/unit/providers/routing/test_resolver.py
    • Added unit tests for the ModelResolver class, covering its construction from configuration, model resolution by ID and alias, safe resolution, and handling of duplicate references.
  • tests/unit/providers/routing/test_router.py
    • Added unit tests for the ModelRouter class, verifying its construction with various strategies, accessibility of the resolver, and correct routing behavior for different strategies.
  • tests/unit/providers/routing/test_strategies.py
    • Added unit tests for each routing strategy (ManualStrategy, RoleBasedStrategy, CostAwareStrategy, SmartStrategy), covering their model selection logic, fallback mechanisms, and error handling.
Activity
  • All 1415 unit tests pass.
  • Ruff check is clean.
  • Ruff format is clean.
  • Mypy strict check is clean.
  • CI pipeline is pending.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements a new model routing engine under ai_company.providers.routing, including a resolver for model ref lookup and multiple routing strategies, plus associated observability events and unit tests. This supports Issue #6’s goal of configurable routing with role/cost/smart strategies and fallback handling.

Changes:

  • Added routing subpackage (ModelRouter, ModelResolver, routing domain models/errors, and 4 strategy implementations + registry).
  • Added routing observability event constants and extensive unit test coverage for resolver/router/strategies.
  • Updated seniority catalog to use vendor-agnostic model tiers (small/medium/large) and aligned tests/docs.

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/ai_company/providers/routing/strategies.py Implements routing strategies + shared fallback/budget helpers and strategy registry.
src/ai_company/providers/routing/router.py Adds ModelRouter as the routing entry point + decision logging.
src/ai_company/providers/routing/resolver.py Adds ModelResolver with immutable ref→model index and collision detection.
src/ai_company/providers/routing/models.py Adds ResolvedModel, RoutingRequest, RoutingDecision Pydantic models.
src/ai_company/providers/routing/errors.py Introduces routing error hierarchy extending provider errors.
src/ai_company/providers/routing/__init__.py Exposes routing public API exports.
src/ai_company/providers/__init__.py Re-exports routing APIs from the top-level providers package.
src/ai_company/observability/events.py Adds routing lifecycle event constants.
src/ai_company/core/role_catalog.py Switches seniority tiers to vendor-agnostic sizes.
src/ai_company/core/role.py Updates seniority tier docstring example to vendor-agnostic naming.
tests/unit/providers/routing/conftest.py Adds routing-specific fixtures/factories and 3-model provider config.
tests/unit/providers/routing/test_strategies.py Adds unit tests for each routing strategy and registry behavior.
tests/unit/providers/routing/test_router.py Adds unit tests for router construction, routing, and logging.
tests/unit/providers/routing/test_resolver.py Adds unit tests for resolver indexing, immutability, and sorting.
tests/unit/providers/routing/test_models.py Adds unit tests for routing domain model validation/immutability.
tests/unit/providers/routing/test_errors.py Adds unit tests for routing error hierarchy and context/message behavior.
tests/unit/providers/routing/__init__.py Creates routing test package.
tests/unit/core/test_role_catalog.py Updates expectations for vendor-agnostic tier names.
tests/unit/core/test_role.py Updates expectations for vendor-agnostic tier names.
CLAUDE.md Documents vendor-agnostic test fixture convention.
.claude/skills/aurelio-review-pr/skill.md Expands logging-audit guidance text.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +26 to +33
if TYPE_CHECKING:
from collections.abc import Mapping

from ai_company.config.schema import RoutingConfig, RoutingRuleConfig

from .resolver import ModelResolver


Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module uses if TYPE_CHECKING: imports for RoutingConfig, RoutingRuleConfig, ModelResolver, and Mapping, but those names are referenced in runtime annotations (Python 3.14 PEP 649 evaluates annotations lazily at runtime). This can raise NameError when annotations are accessed (e.g., inspect.get_annotations). Import these types at runtime instead (optionally with # noqa: TC00x) rather than behind TYPE_CHECKING.

Suggested change
if TYPE_CHECKING:
from collections.abc import Mapping
from ai_company.config.schema import RoutingConfig, RoutingRuleConfig
from .resolver import ModelResolver
from collections.abc import Mapping
from ai_company.config.schema import RoutingConfig, RoutingRuleConfig
from .resolver import ModelResolver

Copilot uses AI. Check for mistakes.
Comment on lines +625 to +626
STRATEGY_MAP: Mapping[str, RoutingStrategy] = MappingProxyType(
{
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

STRATEGY_MAP is annotated as Mapping[...], but Mapping is only imported under TYPE_CHECKING above. With Python 3.14 lazy annotations, evaluating this annotation will fail at runtime. Import Mapping from collections.abc at runtime (or remove the annotation).

Copilot uses AI. Check for mistakes.
Comment on lines +22 to +25
if TYPE_CHECKING:
from ai_company.config.schema import ProviderConfig, RoutingConfig

from .models import RoutingDecision, RoutingRequest
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProviderConfig, RoutingConfig, RoutingRequest, and RoutingDecision are imported only under TYPE_CHECKING but used in annotations. Under Python 3.14 (PEP 649 lazy annotations), accessing annotations can raise NameError. Import these types at runtime (with # noqa: TC00x if needed) instead of behind TYPE_CHECKING.

Copilot uses AI. Check for mistakes.
Comment on lines +24 to +26
if TYPE_CHECKING:
from ai_company.config.schema import ProviderConfig

Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProviderConfig is imported only under TYPE_CHECKING but referenced in the from_config() signature annotation. With Python 3.14 lazy annotations, accessing annotations can raise NameError. Import ProviderConfig at runtime (with # noqa: TC00x if needed) instead of behind TYPE_CHECKING.

Copilot uses AI. Check for mistakes.
if ref is None:
continue
existing = index.get(ref)
if existing is not None and existing.model_id != resolved.model_id:
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collision detection only raises when existing.model_id != resolved.model_id. If two providers define the same ref (ID/alias) pointing to different providers but with the same model_id string, this will silently overwrite the index and make the ref ambiguous. Consider raising on any ref collision where either provider_name or model_id differ (or more simply: if existing is not None and existing != resolved).

Suggested change
if existing is not None and existing.model_id != resolved.model_id:
if existing is not None and existing != resolved:

Copilot uses AI. Check for mistakes.
Comment on lines +67 to +71
remaining_budget: float | None = Field(
default=None,
ge=0.0,
description="Remaining cost budget in USD",
)
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remaining_budget is documented as a remaining budget in USD, but the cost-aware selection compares it against cost_per_1k_input + cost_per_1k_output (USD per 1k tokens). This is a unit mismatch and makes the API ambiguous. Either clarify in the field description/name that the budget is per-1k-token (or per-request) threshold, or extend RoutingRequest with enough information (e.g., expected token counts) to compute an actual USD cost against a USD budget.

Copilot uses AI. Check for mistakes.
request: RoutingRequest,
resolver: ModelResolver,
) -> RoutingDecision | None:
if not resolver.all_models_sorted_by_cost():
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_try_cheapest() calls resolver.all_models_sorted_by_cost() just to check emptiness and then _cheapest_within_budget() calls it again. Cache the sorted tuple (or call resolver.all_models() for the emptiness check) to avoid the extra sort/work on the hot path.

Suggested change
if not resolver.all_models_sorted_by_cost():
# Use a cheap existence check to avoid extra sorting work here.
if not resolver.all_models():

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/aurelio-review-pr/skill.md:
- Around line 107-114: The "Logging coverage suggestions" subsection continues
numbering at 7-10 which triggers MD029; update the list under the "Logging
coverage suggestions (soft rules — mark as SUGGESTION, must be validated by user
in triage):" header so numbering restarts at 1 (i.e., change items labeled 7–10
to 1–4), or explicitly suppress the MD029 warning if the cross-section numbering
is intentional for easier triage reference; locate the header text "Logging
coverage suggestions" and the following list items to make the change and ensure
consistency in the document.

In `@src/ai_company/providers/routing/router.py`:
- Around line 109-113: Wrap the call to self._strategy.select(...) inside
route() in a try/except that catches routing exceptions, log a WARNING or ERROR
with structured context including the request object, the strategy identifier
(from self._strategy), and relevant config/resolver context (self._config,
self._resolver) before re-raising the exception; ensure the logged message is
descriptive (e.g., "routing selection failed") and include the caught exception
details, then re-raise the original exception to preserve behavior.

In `@src/ai_company/providers/routing/strategies.py`:
- Around line 620-633: Add a brief inline comment next to the "cheapest" alias
in the STRATEGY_MAP to document that it intentionally aliases "cost_aware"
(e.g., mention _COST_AWARE / "cheapest" maps to the same strategy), keeping the
registry immutable via MappingProxyType; update the comment near STRATEGY_MAP to
be concise and clarify the UX intent of the alias.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0338d0c and 7618335.

📒 Files selected for processing (21)
  • .claude/skills/aurelio-review-pr/skill.md
  • CLAUDE.md
  • src/ai_company/core/role.py
  • src/ai_company/core/role_catalog.py
  • src/ai_company/observability/events.py
  • src/ai_company/providers/__init__.py
  • src/ai_company/providers/routing/__init__.py
  • src/ai_company/providers/routing/errors.py
  • src/ai_company/providers/routing/models.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/strategies.py
  • tests/unit/core/test_role.py
  • tests/unit/core/test_role_catalog.py
  • tests/unit/providers/routing/__init__.py
  • tests/unit/providers/routing/conftest.py
  • tests/unit/providers/routing/test_errors.py
  • tests/unit/providers/routing/test_models.py
  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/providers/routing/test_router.py
  • tests/unit/providers/routing/test_strategies.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Agent
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations — Python 3.14 has native lazy annotations via PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
All public functions must include type hints; enforce via mypy strict mode
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Use Pydantic v2 models with BaseModel, model_validator, and ConfigDict
Enforce 88-character line length (enforced by ruff)
Keep functions under 50 lines; keep files under 800 lines

Files:

  • src/ai_company/core/role.py
  • src/ai_company/providers/routing/router.py
  • tests/unit/providers/routing/test_router.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/providers/routing/__init__.py
  • tests/unit/providers/routing/test_models.py
  • src/ai_company/providers/routing/strategies.py
  • src/ai_company/core/role_catalog.py
  • src/ai_company/observability/events.py
  • src/ai_company/providers/routing/errors.py
  • tests/unit/core/test_role_catalog.py
  • tests/unit/providers/routing/conftest.py
  • tests/unit/core/test_role.py
  • src/ai_company/providers/routing/models.py
  • tests/unit/providers/routing/test_strategies.py
  • tests/unit/providers/routing/test_resolver.py
  • src/ai_company/providers/__init__.py
  • tests/unit/providers/routing/test_errors.py
src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic must import: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging or logging.getLogger() or print() in application code — use the ai_company logger
Always use logger as the variable name (not _logger or log)
Always use event name constants from ai_company.observability.events in logger calls
Use structured logging format: logger.info(EVENT, key=value) — never use format strings like logger.info("msg %s", val)
All error paths must log at WARNING or ERROR level with context before raising an exception
All state transitions must log at INFO level
Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions

Files:

  • src/ai_company/core/role.py
  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/providers/routing/__init__.py
  • src/ai_company/providers/routing/strategies.py
  • src/ai_company/core/role_catalog.py
  • src/ai_company/observability/events.py
  • src/ai_company/providers/routing/errors.py
  • src/ai_company/providers/routing/models.py
  • src/ai_company/providers/__init__.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark all test cases with pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, or @pytest.mark.slow
Configure pytest async mode as asyncio_mode = "auto" — do not manually add @pytest.mark.asyncio to test functions

Files:

  • tests/unit/providers/routing/test_router.py
  • tests/unit/providers/routing/test_models.py
  • tests/unit/core/test_role_catalog.py
  • tests/unit/providers/routing/conftest.py
  • tests/unit/core/test_role.py
  • tests/unit/providers/routing/test_strategies.py
  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/providers/routing/test_errors.py
🧠 Learnings (22)
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.

Applied to files:

  • CLAUDE.md
  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Tests must use fake model names (e.g., `test-model:8b`, `fake-writer:latest`)—never use real model IDs from `RECOMMENDED_MODELS`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to README.md : Update README.md for significant feature changes

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use `async def` for async test functions; pytest-asyncio is configured with `asyncio_mode = "auto"`; mark async tests with `pytest.mark.asyncio` if needed

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to tests/**/*.py : Configure pytest async mode as `asyncio_mode = "auto"` — do not manually add `pytest.mark.asyncio` to test functions

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Set test timeout to 30 seconds per test (via pytest configuration)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Use pytest-xdist for parallelism with `-n auto` flag

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Use pytest fixtures for test setup. Shared fixtures should be in `tests/conftest.py`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use appropriate fixture scopes (`function`, `class`, `module`, `session`) and document complex fixtures with docstrings

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Mock external dependencies such as file system operations and network calls in tests

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/conftest.py : Place shared pytest fixtures in `tests/conftest.py`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Log all significant operations using Python logging with appropriate levels: debug for detailed info, info for successes, warning for retries, error for failures

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`), CI enforces this coverage requirement

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit. The CI enforces 100% coverage on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`).

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/**/*.py : All functions and methods should have appropriate logging using `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use logging levels: debug (routine), info (significant), warning (unexpected but recoverable), error (failures)

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : Core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`) must maintain 100% test coverage

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/agents/**/*.py : Agents must extend `BaseAgent`, use retry logic, and implement configurable timeout via settings.

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` in logger calls

Applied to files:

  • src/ai_company/observability/events.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising an exception

Applied to files:

  • src/ai_company/providers/routing/errors.py
🧬 Code graph analysis (12)
src/ai_company/providers/routing/router.py (4)
src/ai_company/providers/routing/errors.py (1)
  • UnknownStrategyError (28-31)
src/ai_company/providers/routing/resolver.py (2)
  • ModelResolver (28-181)
  • from_config (57-115)
src/ai_company/config/schema.py (2)
  • ProviderConfig (52-95)
  • RoutingConfig (127-149)
src/ai_company/providers/routing/models.py (2)
  • RoutingDecision (74-92)
  • RoutingRequest (43-71)
tests/unit/providers/routing/test_router.py (6)
src/ai_company/config/schema.py (2)
  • ProviderConfig (52-95)
  • RoutingConfig (127-149)
src/ai_company/core/enums.py (1)
  • SeniorityLevel (6-21)
src/ai_company/providers/routing/errors.py (3)
  • ModelResolutionError (16-19)
  • NoAvailableModelError (22-25)
  • UnknownStrategyError (28-31)
src/ai_company/providers/routing/models.py (1)
  • RoutingRequest (43-71)
src/ai_company/providers/routing/router.py (4)
  • ModelRouter (28-122)
  • strategy_name (92-94)
  • resolver (87-89)
  • route (96-122)
src/ai_company/providers/routing/resolver.py (1)
  • all_models (166-169)
src/ai_company/providers/routing/__init__.py (4)
src/ai_company/providers/routing/errors.py (4)
  • ModelResolutionError (16-19)
  • NoAvailableModelError (22-25)
  • RoutingError (10-13)
  • UnknownStrategyError (28-31)
src/ai_company/providers/routing/router.py (2)
  • resolver (87-89)
  • ModelRouter (28-122)
tests/unit/providers/routing/conftest.py (1)
  • resolver (97-101)
src/ai_company/providers/routing/resolver.py (1)
  • ModelResolver (28-181)
src/ai_company/providers/routing/strategies.py (5)
src/ai_company/core/role_catalog.py (1)
  • get_seniority_info (429-445)
src/ai_company/providers/routing/errors.py (2)
  • ModelResolutionError (16-19)
  • NoAvailableModelError (22-25)
src/ai_company/providers/routing/models.py (2)
  • ResolvedModel (9-40)
  • RoutingDecision (74-92)
src/ai_company/config/schema.py (2)
  • RoutingConfig (127-149)
  • RoutingRuleConfig (98-124)
src/ai_company/providers/routing/resolver.py (5)
  • ModelResolver (28-181)
  • resolve_safe (146-164)
  • all_models (166-169)
  • all_models_sorted_by_cost (171-181)
  • resolve (117-144)
src/ai_company/core/role_catalog.py (2)
src/ai_company/core/enums.py (2)
  • CostTier (57-69)
  • SeniorityLevel (6-21)
src/ai_company/core/role.py (1)
  • SeniorityInfo (87-120)
src/ai_company/providers/routing/errors.py (1)
src/ai_company/providers/errors.py (1)
  • ProviderError (22-69)
tests/unit/core/test_role_catalog.py (2)
src/ai_company/core/role_catalog.py (1)
  • get_seniority_info (429-445)
src/ai_company/core/enums.py (1)
  • SeniorityLevel (6-21)
tests/unit/core/test_role.py (2)
src/ai_company/core/role.py (1)
  • SeniorityInfo (87-120)
src/ai_company/core/enums.py (1)
  • SeniorityLevel (6-21)
src/ai_company/providers/routing/models.py (1)
src/ai_company/core/enums.py (1)
  • SeniorityLevel (6-21)
tests/unit/providers/routing/test_resolver.py (2)
tests/unit/providers/routing/conftest.py (2)
  • resolver (97-101)
  • three_model_provider (85-93)
src/ai_company/providers/routing/resolver.py (6)
  • ModelResolver (28-181)
  • from_config (57-115)
  • resolve (117-144)
  • all_models (166-169)
  • resolve_safe (146-164)
  • all_models_sorted_by_cost (171-181)
src/ai_company/providers/__init__.py (5)
src/ai_company/providers/routing/strategies.py (5)
  • CostAwareStrategy (450-494)
  • ManualStrategy (306-343)
  • RoleBasedStrategy (349-444)
  • RoutingStrategy (38-66)
  • SmartStrategy (500-615)
src/ai_company/providers/routing/errors.py (4)
  • ModelResolutionError (16-19)
  • NoAvailableModelError (22-25)
  • RoutingError (10-13)
  • UnknownStrategyError (28-31)
src/ai_company/providers/routing/resolver.py (1)
  • ModelResolver (28-181)
src/ai_company/providers/routing/router.py (1)
  • ModelRouter (28-122)
src/ai_company/providers/routing/models.py (3)
  • ResolvedModel (9-40)
  • RoutingDecision (74-92)
  • RoutingRequest (43-71)
tests/unit/providers/routing/test_errors.py (2)
src/ai_company/providers/errors.py (1)
  • ProviderError (22-69)
src/ai_company/providers/routing/errors.py (4)
  • ModelResolutionError (16-19)
  • NoAvailableModelError (22-25)
  • RoutingError (10-13)
  • UnknownStrategyError (28-31)
🪛 markdownlint-cli2 (0.21.0)
.claude/skills/aurelio-review-pr/skill.md

[warning] 111-111: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/2/3

(MD029, ol-prefix)


[warning] 112-112: Ordered list item prefix
Expected: 2; Actual: 8; Style: 1/2/3

(MD029, ol-prefix)


[warning] 113-113: Ordered list item prefix
Expected: 3; Actual: 9; Style: 1/2/3

(MD029, ol-prefix)


[warning] 114-114: Ordered list item prefix
Expected: 4; Actual: 10; Style: 1/2/3

(MD029, ol-prefix)

🔇 Additional comments (28)
src/ai_company/core/role_catalog.py (1)

21-70: LGTM! Vendor-agnostic model tier naming is a solid improvement.

The update from vendor-specific tiers (haiku, sonnet, opus) to generic tiers (small, medium, large) decouples the seniority catalog from any specific LLM provider. The mapping is consistent and covers all 8 seniority levels appropriately.

src/ai_company/core/role.py (1)

93-93: LGTM! Docstring example updated to match new tier naming.

The example value in the docstring now reflects the vendor-agnostic tier naming ("large" instead of "opus"), keeping documentation consistent with the actual values used in SENIORITY_INFO.

tests/unit/core/test_role_catalog.py (1)

60-63: LGTM! Test correctly updated to reflect the new tier value.

The test method name, docstring, and assertion are all consistently updated to verify that SENIOR maps to "medium" tier, matching the updated SENIORITY_INFO mapping in role_catalog.py.

tests/unit/core/test_role.py (1)

123-200: LGTM! Test fixtures updated consistently with vendor-agnostic tier names.

All SeniorityInfo test instances now use the generic tier names ("small", "medium") matching their respective seniority levels, consistent with the updated SENIORITY_INFO mappings:

  • JUNIOR"small"
  • MID"medium"
  • SENIOR"medium"

The test behavior remains unchanged; only the literal values were updated.

src/ai_company/providers/routing/errors.py (1)

10-31: LGTM! Clean error hierarchy.

The routing error classes properly extend ProviderError and follow the established pattern. The docstrings clearly describe each exception's purpose.

Minor observation: The is_retryable = False on subclasses (lines 13, 19, 25, 31) is redundant since RoutingError already sets it to False and it's inherited. You could remove these for DRYness, but keeping them is also fine for explicit documentation.

src/ai_company/providers/routing/__init__.py (1)

1-39: LGTM!

Clean public API surface with comprehensive exports. The __all__ list is alphabetically sorted and complete, matching all imports.

src/ai_company/providers/__init__.py (1)

37-52: LGTM!

Clean integration of the routing module's public API into the providers package. Alphabetical ordering in __all__ is maintained, and all new exports are properly imported.

src/ai_company/observability/events.py (1)

79-91: LGTM!

All new routing event constants follow the established domain.noun.verb naming convention and are properly typed as Final[str]. Good coverage for the routing lifecycle: build, resolution, decision, fallback, budget, and error scenarios.

src/ai_company/providers/routing/models.py (2)

9-40: LGTM! Well-designed immutable domain models.

Good use of ConfigDict(frozen=True) for immutability, NotBlankStr for string validation, and appropriate field constraints (ge=0.0 for costs, gt=0 for context size).


43-92: LGTM!

RoutingRequest and RoutingDecision are well-structured with proper optional field handling and immutability. The fallbacks_tried default of () is appropriate for the tuple type.

src/ai_company/providers/routing/resolver.py (3)

12-22: LGTM! Proper observability setup.

Correct use of get_logger(__name__) with event constants from ai_company.observability.events. The logger variable naming follows conventions.


56-115: LGTM! Robust collision detection with proper logging.

The from_config method correctly:

  • Detects duplicate refs mapping to different models
  • Logs at ERROR level with full context before raising
  • Uses structured logging with kwargs
  • Raises ModelResolutionError with appropriate context

117-164: LGTM! Good differentiation between resolve and resolve_safe.

Appropriate log levels: resolve logs WARNING on miss (unexpected), while resolve_safe logs DEBUG (expected possibility). Both use structured kwargs.

src/ai_company/providers/routing/strategies.py (7)

37-66: LGTM! Clean protocol definition.

The RoutingStrategy protocol is well-documented with @runtime_checkable for isinstance checks. The docstring clearly specifies the contract and possible exceptions.


72-143: LGTM! Robust fallback resolution helpers.

Good separation between throwing (_try_resolve_with_fallback) and non-throwing (_try_resolve_with_fallback_safe) variants. Proper logging at each fallback attempt and exhaustion.


171-209: LGTM! Budget-aware selection with proper fallback.

The function correctly handles:

  • No models registered (logs warning + raises)
  • No budget constraint (returns cheapest)
  • Budget exceeded by all models (logs warning + returns cheapest anyway with flag)

306-343: LGTM! ManualStrategy implementation.

Correctly validates that model_override is set, logs warning before raising ModelResolutionError, and delegates to resolver for actual lookup.


349-444: LGTM! RoleBasedStrategy with comprehensive fallback handling.

Good flow: rule match → seniority default tier → global fallback chain. All failure paths log at WARNING level before raising NoAvailableModelError.


450-494: LGTM! CostAwareStrategy implementation.

Clean delegation to _try_task_type_rules and _cheapest_within_budget helpers. Budget exceeded case is clearly communicated in the reason string.


500-616: LGTM! SmartStrategy with clear priority order.

The chained or expression at lines 524-546 clearly expresses the priority: override → task_type → role → seniority → global chain → cheapest → exhausted. Helper methods are well-encapsulated. The NoReturn type hint on _raise_exhausted is correct.

src/ai_company/providers/routing/router.py (1)

63-75: Unknown-strategy handling is implemented well.

Structured error logging with contextual fields before raising UnknownStrategyError is solid.

CLAUDE.md (1)

80-80: Good documentation hardening for test portability.

The added vendor-agnostic fixture rule is clear and helps keep tests independent from real provider/model IDs.

tests/unit/providers/routing/conftest.py (1)

59-130: Fixture design is clean and deterministic.

The shared provider/resolver/routing fixtures are reusable, stable, and vendor-agnostic.

tests/unit/providers/routing/test_errors.py (1)

22-61: Error-hierarchy coverage is solid.

These tests correctly validate inheritance, retryability flags, and context/message propagation.

tests/unit/providers/routing/test_models.py (1)

21-125: Model validation tests are comprehensive.

Coverage of defaults, non-negative constraints, and frozen-field behavior is strong.

tests/unit/providers/routing/test_router.py (1)

20-146: Router behavior tests are well-rounded.

Construction, decision paths, logging assertions, and key error scenarios are all covered.

tests/unit/providers/routing/test_strategies.py (1)

32-402: Excellent strategy-path coverage.

The tests exercise precedence logic, fallback behavior, and empty-model failure paths across all strategies.

tests/unit/providers/routing/test_resolver.py (1)

12-114: Resolver test coverage is strong.

Indexing, lookup failure semantics, ordering, deduplication, and immutability checks are all well represented.

Comment on lines +107 to +114
**Logging coverage suggestions (soft rules — mark as SUGGESTION, must be validated by user in triage):**

For every function touched by the PR, analyze its logic and suggest missing logging where appropriate:

7. Error/except paths that don't `logger.warning()` or `logger.error()` with context before raising or returning (SUGGESTION)
8. State transitions (status changes, lifecycle events, mode switches) that don't `logger.info()` (SUGGESTION)
9. Object creation, entry/exit of key functions, or important branching decisions that don't `logger.debug()` (SUGGESTION)
10. Any other code path that would benefit from logging for debuggability or operational visibility — think about what an operator investigating a production issue would want to see (SUGGESTION)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Consider restarting list numbering for the soft rules section.

The static analysis tool flagged that items 7-10 should restart from 1 after the new "Logging coverage suggestions" header, per MD029. However, if the cross-section numbering (1-6 for hard rules, 7-10 for soft rules) is intentional for easier reference in triage discussions, you may suppress this warning. Otherwise, consider restarting from 1.

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 111-111: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/2/3

(MD029, ol-prefix)


[warning] 112-112: Ordered list item prefix
Expected: 2; Actual: 8; Style: 1/2/3

(MD029, ol-prefix)


[warning] 113-113: Ordered list item prefix
Expected: 3; Actual: 9; Style: 1/2/3

(MD029, ol-prefix)


[warning] 114-114: Ordered list item prefix
Expected: 4; Actual: 10; Style: 1/2/3

(MD029, ol-prefix)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/aurelio-review-pr/skill.md around lines 107 - 114, The
"Logging coverage suggestions" subsection continues numbering at 7-10 which
triggers MD029; update the list under the "Logging coverage suggestions (soft
rules — mark as SUGGESTION, must be validated by user in triage):" header so
numbering restarts at 1 (i.e., change items labeled 7–10 to 1–4), or explicitly
suppress the MD029 warning if the cross-section numbering is intentional for
easier triage reference; locate the header text "Logging coverage suggestions"
and the following list items to make the change and ensure consistency in the
document.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a comprehensive model routing engine with multiple strategies, a model resolver for efficient lookups, and extensive test coverage. While the implementation demonstrates high quality and good design principles, a critical security vulnerability was identified: budget constraints specified in the RoutingRequest can be bypassed by manual overrides and routing rules across all strategies. This could lead to unauthorized financial costs if users can influence routing signals such as model_override or task_type. It is crucial to update the routing logic to enforce budget limits as a hard constraint across all selection paths. Additionally, consider improving the fallback logic to prevent redundant model lookups, which will enhance efficiency and clarity.

Comment on lines +317 to +616
def select(
self,
request: RoutingRequest,
config: RoutingConfig, # noqa: ARG002
resolver: ModelResolver,
) -> RoutingDecision:
"""Select the explicitly requested model.

Raises:
ModelResolutionError: If ``model_override`` is not set or
the model cannot be resolved.
"""
if request.model_override is None:
logger.warning(
ROUTING_NO_RULE_MATCHED,
strategy=self.name,
reason="model_override not set",
)
msg = "ManualStrategy requires model_override to be set"
raise ModelResolutionError(msg)

model = resolver.resolve(request.model_override)
return RoutingDecision(
resolved_model=model,
strategy_used=self.name,
reason=f"Explicit override: {request.model_override}",
)


# ── Strategy 2: Role-Based ───────────────────────────────────────


class RoleBasedStrategy:
"""Select model based on agent seniority level.

Matches the first routing rule where ``rule.role_level`` equals
``request.agent_level``. If no rule matches, uses the seniority
catalog's ``typical_model_tier`` as a fallback lookup.
"""

@property
def name(self) -> str:
"""Return strategy name."""
return "role_based"

def select(
self,
request: RoutingRequest,
config: RoutingConfig,
resolver: ModelResolver,
) -> RoutingDecision:
"""Select model based on role level.

Raises:
ModelResolutionError: If no agent_level is set.
NoAvailableModelError: If all candidates are exhausted.
"""
if request.agent_level is None:
logger.warning(
ROUTING_NO_RULE_MATCHED,
strategy=self.name,
reason="agent_level not set",
)
msg = "RoleBasedStrategy requires agent_level to be set"
raise ModelResolutionError(msg)

# Try matching rules
for rule in config.rules:
if rule.role_level == request.agent_level:
model, tried = _try_resolve_with_fallback(
rule.preferred_model,
rule,
config,
resolver,
)
return RoutingDecision(
resolved_model=model,
strategy_used=self.name,
reason=(
f"Role rule match: "
f"level={request.agent_level.value}"
f", model={model.model_id}"
),
fallbacks_tried=tried,
)

# No rule matched — use seniority default
logger.debug(
ROUTING_NO_RULE_MATCHED,
level=request.agent_level.value,
strategy=self.name,
)
tier = get_seniority_info(
request.agent_level,
).typical_model_tier
result = _try_resolve_with_fallback_safe(
tier,
None,
config,
resolver,
)
if result is not None:
model, tried = result
return RoutingDecision(
resolved_model=model,
strategy_used=self.name,
reason=(
f"Seniority default: level={request.agent_level.value}, tier={tier}"
),
fallbacks_tried=tried,
)

if config.fallback_chain:
chain_detail = f"fallback chain exhausted: {list(config.fallback_chain)}"
else:
chain_detail = "no fallback chain configured"
msg = (
f"No model available for "
f"level={request.agent_level.value} "
f"(tier={tier}, no rules matched, {chain_detail})"
)
logger.warning(
ROUTING_FALLBACK_EXHAUSTED,
level=request.agent_level.value,
tier=tier,
strategy=self.name,
)
raise NoAvailableModelError(msg)


# ── Strategy 3: Cost-Aware ───────────────────────────────────────


class CostAwareStrategy:
"""Select the cheapest model, optionally respecting a budget.

Matches ``task_type`` rules first, then falls back to the cheapest
model from the resolver.
"""

@property
def name(self) -> str:
"""Return strategy name."""
return "cost_aware"

def select(
self,
request: RoutingRequest,
config: RoutingConfig,
resolver: ModelResolver,
) -> RoutingDecision:
"""Select the cheapest available model.

Raises:
NoAvailableModelError: If no models are registered.
"""
decision = _try_task_type_rules(
request,
config,
resolver,
self.name,
)
if decision is not None:
return decision

# Pick cheapest
model, budget_exceeded = _cheapest_within_budget(
resolver,
request.remaining_budget,
)
reason = f"Cheapest available: {model.model_id}"
if budget_exceeded:
reason += " (all models exceed remaining budget)"
return RoutingDecision(
resolved_model=model,
strategy_used=self.name,
reason=reason,
)


# ── Strategy 4: Smart ────────────────────────────────────────────


class SmartStrategy:
"""Combined strategy with priority-based signal merging.

Priority order: model_override > task_type rules > role_level
rules > seniority default > global fallback_chain > cheapest
available.
"""

@property
def name(self) -> str:
"""Return strategy name."""
return "smart"

def select(
self,
request: RoutingRequest,
config: RoutingConfig,
resolver: ModelResolver,
) -> RoutingDecision:
"""Select a model using all available signals.

Raises:
NoAvailableModelError: If all candidates are exhausted.
"""
return (
self._try_override(request, resolver)
or _try_task_type_rules(
request,
config,
resolver,
self.name,
)
or _try_role_rules(
request,
config,
resolver,
self.name,
)
or _try_seniority_default(
request,
resolver,
self.name,
)
or self._try_global_chain(config, resolver)
or self._try_cheapest(request, resolver)
or self._raise_exhausted()
)

def _try_override(
self,
request: RoutingRequest,
resolver: ModelResolver,
) -> RoutingDecision | None:
if request.model_override is None:
return None
model = resolver.resolve_safe(request.model_override)
if model is None:
logger.warning(
ROUTING_MODEL_RESOLUTION_FAILED,
ref=request.model_override,
source="smart_override",
)
return None
return RoutingDecision(
resolved_model=model,
strategy_used=self.name,
reason=f"Explicit override: {request.model_override}",
)

def _try_cheapest(
self,
request: RoutingRequest,
resolver: ModelResolver,
) -> RoutingDecision | None:
if not resolver.all_models_sorted_by_cost():
return None
model, budget_exceeded = _cheapest_within_budget(
resolver,
request.remaining_budget,
)
reason = f"Cheapest available: {model.model_id}"
if budget_exceeded:
reason += " (all models exceed remaining budget)"
return RoutingDecision(
resolved_model=model,
strategy_used=self.name,
reason=reason,
)

def _try_global_chain(
self,
config: RoutingConfig,
resolver: ModelResolver,
) -> RoutingDecision | None:
chain_result = _walk_fallback_chain(config, resolver)
if chain_result is None:
return None
model, tried = chain_result
return RoutingDecision(
resolved_model=model,
strategy_used=self.name,
reason="Global fallback chain",
fallbacks_tried=tried,
)

def _raise_exhausted(self) -> NoReturn:
logger.warning(
ROUTING_FALLBACK_EXHAUSTED,
strategy="smart",
reason="all signals exhausted",
)
msg = "SmartStrategy: no model available from any signal"
raise NoAvailableModelError(
msg,
context={"strategy": "smart"},
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The model routing engine implements several strategies (ManualStrategy, RoleBasedStrategy, CostAwareStrategy, and SmartStrategy) that prioritize manual overrides and routing rules over budget constraints.

Specifically:

  1. ManualStrategy (lines 317-343) selects the explicitly requested model without any budget check.
  2. RoleBasedStrategy (lines 362-444) selects models based on seniority rules or defaults without checking the remaining_budget.
  3. CostAwareStrategy (lines 450-495) prioritizes task_type rules (which don't check budget) over the cheapest-model selection.
  4. SmartStrategy (lines 500-616) prioritizes overrides, task rules, role rules, and seniority defaults—all of which bypass budget checks—before finally falling back to a budget-aware cheapest-model selection.

Furthermore, even the budget-aware helper _cheapest_within_budget (lines 171-209) is designed to return the cheapest available model even if it exceeds the remaining_budget, rather than raising an error to enforce the limit.

This design allows users to bypass financial controls by triggering rules or providing overrides, potentially leading to significant unauthorized costs.

Recommendation: Modify the select method of each strategy to validate the cost of the chosen model against request.remaining_budget before returning a RoutingDecision. If the cost exceeds the budget, the strategy should either attempt to find a cheaper alternative that fits the budget or raise a NoAvailableModelError. The _cheapest_within_budget helper should also be updated to enforce a hard limit when a budget is provided.

Comment on lines +72 to +122
def _try_resolve_with_fallback(
ref: str,
rule: RoutingRuleConfig | None,
config: RoutingConfig,
resolver: ModelResolver,
) -> tuple[ResolvedModel, tuple[str, ...]]:
"""Try to resolve *ref*, then rule fallback, then global chain.

Returns:
Tuple of (resolved_model, fallbacks_tried).

Raises:
NoAvailableModelError: If all candidates are exhausted.
"""
tried: list[str] = []

# Try the primary ref
model = resolver.resolve_safe(ref)
if model is not None:
return model, tuple(tried)

tried.append(ref)
logger.debug(ROUTING_FALLBACK_ATTEMPTED, ref=ref, source="primary")

# Try rule-level fallback
if rule is not None and rule.fallback is not None:
model = resolver.resolve_safe(rule.fallback)
if model is not None:
return model, tuple(tried)
tried.append(rule.fallback)
logger.debug(
ROUTING_FALLBACK_ATTEMPTED,
ref=rule.fallback,
source="rule_fallback",
)

# Walk global fallback chain
for chain_ref in config.fallback_chain:
model = resolver.resolve_safe(chain_ref)
if model is not None:
return model, tuple(tried)
tried.append(chain_ref)
logger.debug(
ROUTING_FALLBACK_ATTEMPTED,
ref=chain_ref,
source="global_chain",
)

logger.warning(ROUTING_FALLBACK_EXHAUSTED, tried=tried)
msg = f"All model candidates exhausted: {tried}"
raise NoAvailableModelError(msg, context={"tried": tried})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation of fallback logic may perform redundant model lookups if the same model reference appears in multiple places (e.g., as a primary reference and also in the global fallback chain). This is inefficient and can lead to confusing logs with duplicate ROUTING_FALLBACK_ATTEMPTED events and duplicate entries in the tried list of failed attempts.

To prevent this, you should check if a model reference has already been tried before attempting to resolve it again. This makes the fallback logic more robust and efficient.

def _try_resolve_with_fallback(
    ref: str,
    rule: RoutingRuleConfig | None,
    config: RoutingConfig,
    resolver: ModelResolver,
) -> tuple[ResolvedModel, tuple[str, ...]]:
    """Try to resolve *ref*, then rule fallback, then global chain.

    Returns:
        Tuple of (resolved_model, fallbacks_tried).

    Raises:
        NoAvailableModelError: If all candidates are exhausted.
    """
    tried: list[str] = []

    # Try the primary ref
    model = resolver.resolve_safe(ref)
    if model is not None:
        return model, tuple(tried)

    tried.append(ref)
    logger.debug(ROUTING_FALLBACK_ATTEMPTED, ref=ref, source="primary")

    # Try rule-level fallback
    if rule is not None and rule.fallback is not None:
        if rule.fallback not in tried:
            model = resolver.resolve_safe(rule.fallback)
            if model is not None:
                return model, tuple(tried)
            tried.append(rule.fallback)
            logger.debug(
                ROUTING_FALLBACK_ATTEMPTED,
                ref=rule.fallback,
                source="rule_fallback",
            )

    # Walk global fallback chain
    for chain_ref in config.fallback_chain:
        if chain_ref not in tried:
            model = resolver.resolve_safe(chain_ref)
            if model is not None:
                return model, tuple(tried)
            tried.append(chain_ref)
            logger.debug(
                ROUTING_FALLBACK_ATTEMPTED,
                ref=chain_ref,
                source="global_chain",
            )

    logger.warning(ROUTING_FALLBACK_EXHAUSTED, tried=tried)
    msg = f"All model candidates exhausted: {tried}"
    raise NoAvailableModelError(msg, context={"tried": tried})

…bbit, and Gemini

- Log routing failures in route() before re-raising (CodeRabbit)
- Clarify remaining_budget as per-request cost ceiling, not total USD budget
- Add total_cost_per_1k property to ResolvedModel, use in 3+ call sites
- Update docstring examples to vendor-agnostic (no real provider names)
- Extract strategy name constants to prevent drift in STRATEGY_MAP
- Dedup tried refs in _try_resolve_with_fallback to avoid redundant lookups
- Log empty fallback chains in _walk_fallback_chain for diagnostics
- Add debug logs for task_type and role rule miss paths
- Document SmartStrategy._try_override soft-fail design
- Fix _try_cheapest to use all_models() instead of double-sort
- Add "cheapest" alias inline comment in STRATEGY_MAP
- Tighten collision detection in ModelResolver (compare full object)
- Add RoutingRuleConfig model_validator: at least one matcher required
- Add warning log before LookupError in get_seniority_info
- Add 10 new tests: collision detection, override soft-fail, 3-stage
  fallback, mid-range budget, blank validation for RoutingRequest fields
@Aureliolo
Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 2, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Aureliolo
Copy link
Copy Markdown
Owner Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a comprehensive and well-designed model routing engine. The implementation is robust, featuring multiple routing strategies, a fast model resolver using immutable data structures, and thorough validation. The code is well-structured, with excellent separation of concerns, extensive structured logging, and a solid test suite that covers numerous edge cases. Decoupling model tiers from vendor-specific names is a significant improvement for maintainability. I have one suggestion for a minor refactoring in the SmartStrategy to improve its internal logic, but overall, this is a high-quality contribution.

Comment on lines +556 to +578
return (
self._try_override(request, resolver)
or _try_task_type_rules(
request,
config,
resolver,
self.name,
)
or _try_role_rules(
request,
config,
resolver,
self.name,
)
or _try_seniority_default(
request,
resolver,
self.name,
)
or self._try_global_chain(config, resolver)
or self._try_cheapest(request, resolver)
or self._raise_exhausted()
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation of SmartStrategy.select has some logical redundancy that could make it harder to maintain. Specifically, the global fallback chain can be walked multiple times during a single select call.

For example:

  1. _try_task_type_rules is called. If it matches a rule, it internally calls _try_resolve_with_fallback_safe, which walks the global fallback chain.
  2. If that fails to find a model, the or chain continues.
  3. Eventually, self._try_global_chain(config, resolver) is called, which may walk the same global fallback chain again.

While this is a minor inefficiency, it points to a conceptual complexity where candidate selection (e.g., finding a rule) is mixed with resolution logic (trying the ref and all its fallbacks).

A potential refactoring could separate these concerns. The select method could first identify a prioritized list of candidate model references from all available signals (override, task rule, role rule, seniority default), and then orchestrate the resolution with fallbacks centrally. This would eliminate the redundancy and make the priority logic in SmartStrategy more explicit and easier to follow.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/config/schema.py`:
- Around line 129-134: The validation in _at_least_one_matcher currently raises
a ValueError without logging; update it to log the failure at WARNING or ERROR
level with contextual details (include the message "Routing rule must specify at
least role_level or task_type" plus relevant self fields such as the rule
id/name, role_level, and task_type) immediately before raising. Use the
module/class logger (create a module-level logger via
logging.getLogger(__name__) if one doesn't exist) and call logger.warning or
logger.error inside _at_least_one_matcher, then raise the ValueError as before
so the exception path is logged with context.

In `@src/ai_company/providers/routing/resolver.py`:
- Around line 57-115: The from_config classmethod on ModelResolver is too large
and should be split into focused helpers: extract a helper to build a
ResolvedModel from (provider_name, model_config) (referencing ResolvedModel and
model_config fields: id, alias, cost_per_1k_input, cost_per_1k_output,
max_context), extract a helper to index a resolved model into the index dict
(handling duplicate detection and raising ModelResolutionError and logging
ROUTING_MODEL_RESOLUTION_FAILED with the same context), and extract a small
helper to compute the final metrics for the logger call (model_count/ref_count).
Replace the current inline loops in from_config with calls to these helpers and
keep from_config as an orchestration layer that builds the index and returns
cls(index).

In `@src/ai_company/providers/routing/router.py`:
- Around line 116-127: The RoutingError except block currently swallows the
exception; change the handler to capture the exception (e.g., except
RoutingError as err) and include its type/message in the warning log for
ROUTING_SELECTION_FAILED; update the logger.warning call near
ROUTING_SELECTION_FAILED (keeping strategy=self._strategy.name and request
fields) to add error details (err or str(err)) or pass exc_info=err so logs
include the exception metadata for easier diagnosis.

In `@src/ai_company/providers/routing/strategies.py`:
- Around line 80-137: The _try_resolve_with_fallback function is overlong; split
its three resolution stages into small helpers to keep functions under 50 lines:
create helper functions like _resolve_primary(ref, resolver, tried, seen),
_resolve_rule_fallback(rule, resolver, tried, seen) and
_resolve_global_chain(config, resolver, tried, seen) that each attempt
resolution with resolver.resolve_safe, update tried/seen, emit
ROUTING_FALLBACK_ATTEMPTED logs, and return a ResolvedModel or None; keep
NoAvailableModelError raising and ROUTING_FALLBACK_EXHAUSTED logging in the
original _try_resolve_with_fallback which now orchestrates the helpers and
preserves types (RoutingRuleConfig, RoutingConfig, ModelResolver) and the final
error context {"tried": tried}.
- Around line 394-477: The select method in RoleBasedStrategy is too large;
split its responsibilities into small helpers: extract validation into a
_validate_request(request) that checks request.agent_level and raises
ModelResolutionError with the same logging (ROUTING_NO_RULE_MATCHED, logger),
extract rule matching into _select_by_rule(request, config, resolver) that
iterates config.rules and calls _try_resolve_with_fallback(rule.preferred_model,
rule, config, resolver) and returns a RoutingDecision when found, extract
seniority fallback into _select_by_seniority(request, config, resolver) that
calls get_seniority_info(...).typical_model_tier and
_try_resolve_with_fallback_safe(tier, None, config, resolver) and returns a
RoutingDecision when found, and extract the final error path into
_raise_no_available(request, config, tier) which logs ROUTING_FALLBACK_EXHAUSTED
and raises NoAvailableModelError with the same constructed message; then rewrite
RoleBasedStrategy.select to call these helpers in sequence (validate ->
_select_by_rule -> _select_by_seniority -> _raise_no_available) so each function
stays small and the main select is under 50 lines.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7618335 and 99fa913.

📒 Files selected for processing (14)
  • src/ai_company/config/schema.py
  • src/ai_company/core/role_catalog.py
  • src/ai_company/observability/events.py
  • src/ai_company/providers/__init__.py
  • src/ai_company/providers/routing/__init__.py
  • src/ai_company/providers/routing/models.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/strategies.py
  • tests/unit/config/conftest.py
  • tests/unit/config/test_schema.py
  • tests/unit/providers/routing/test_models.py
  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/providers/routing/test_strategies.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations — Python 3.14 has native lazy annotations via PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
All public functions must include type hints; enforce via mypy strict mode
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Use Pydantic v2 models with BaseModel, model_validator, and ConfigDict
Enforce 88-character line length (enforced by ruff)
Keep functions under 50 lines; keep files under 800 lines

Files:

  • src/ai_company/providers/routing/strategies.py
  • src/ai_company/providers/routing/models.py
  • src/ai_company/config/schema.py
  • src/ai_company/observability/events.py
  • tests/unit/providers/routing/test_strategies.py
  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/core/role_catalog.py
  • tests/unit/providers/routing/test_models.py
  • src/ai_company/providers/routing/__init__.py
  • tests/unit/config/conftest.py
  • src/ai_company/providers/__init__.py
  • tests/unit/config/test_schema.py
  • tests/unit/providers/routing/test_resolver.py
src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic must import: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging or logging.getLogger() or print() in application code — use the ai_company logger
Always use logger as the variable name (not _logger or log)
Always use event name constants from ai_company.observability.events in logger calls
Use structured logging format: logger.info(EVENT, key=value) — never use format strings like logger.info("msg %s", val)
All error paths must log at WARNING or ERROR level with context before raising an exception
All state transitions must log at INFO level
Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions

Files:

  • src/ai_company/providers/routing/strategies.py
  • src/ai_company/providers/routing/models.py
  • src/ai_company/config/schema.py
  • src/ai_company/observability/events.py
  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/core/role_catalog.py
  • src/ai_company/providers/routing/__init__.py
  • src/ai_company/providers/__init__.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark all test cases with pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, or @pytest.mark.slow
Configure pytest async mode as asyncio_mode = "auto" — do not manually add @pytest.mark.asyncio to test functions

Files:

  • tests/unit/providers/routing/test_strategies.py
  • tests/unit/providers/routing/test_models.py
  • tests/unit/config/conftest.py
  • tests/unit/config/test_schema.py
  • tests/unit/providers/routing/test_resolver.py
🧠 Learnings (5)
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to **/*.py : Do not use `from __future__ import annotations` — Python 3.14 has native lazy annotations via PEP 649

Applied to files:

  • src/ai_company/providers/routing/strategies.py
  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/resolver.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : NEVER add `from __future__ import annotations` - Python 3.14+ has native support for deferred annotation evaluation and does not require this import

Applied to files:

  • src/ai_company/providers/routing/strategies.py
  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/resolver.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` in logger calls

Applied to files:

  • src/ai_company/observability/events.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising an exception

Applied to files:

  • src/ai_company/providers/routing/router.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions

Applied to files:

  • src/ai_company/providers/routing/router.py
🧬 Code graph analysis (8)
src/ai_company/providers/routing/models.py (1)
src/ai_company/core/enums.py (1)
  • SeniorityLevel (6-21)
tests/unit/providers/routing/test_strategies.py (5)
src/ai_company/config/schema.py (3)
  • ProviderConfig (52-95)
  • RoutingConfig (137-159)
  • RoutingRuleConfig (98-134)
src/ai_company/core/enums.py (1)
  • SeniorityLevel (6-21)
src/ai_company/providers/routing/errors.py (2)
  • ModelResolutionError (16-19)
  • NoAvailableModelError (22-25)
src/ai_company/providers/routing/models.py (1)
  • RoutingRequest (48-83)
src/ai_company/providers/routing/resolver.py (2)
  • ModelResolver (28-181)
  • from_config (57-115)
src/ai_company/providers/routing/router.py (5)
src/ai_company/providers/routing/errors.py (2)
  • RoutingError (10-13)
  • UnknownStrategyError (28-31)
tests/unit/providers/routing/conftest.py (1)
  • resolver (97-101)
src/ai_company/providers/routing/resolver.py (2)
  • ModelResolver (28-181)
  • from_config (57-115)
src/ai_company/config/schema.py (1)
  • ProviderConfig (52-95)
src/ai_company/providers/routing/models.py (2)
  • RoutingDecision (86-104)
  • RoutingRequest (48-83)
src/ai_company/providers/routing/resolver.py (4)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/providers/routing/errors.py (1)
  • ModelResolutionError (16-19)
src/ai_company/providers/routing/models.py (2)
  • ResolvedModel (9-45)
  • total_cost_per_1k (43-45)
src/ai_company/config/schema.py (1)
  • ProviderConfig (52-95)
src/ai_company/core/role_catalog.py (2)
src/ai_company/core/enums.py (2)
  • CostTier (57-69)
  • SeniorityLevel (6-21)
src/ai_company/core/role.py (1)
  • SeniorityInfo (87-120)
tests/unit/providers/routing/test_models.py (3)
src/ai_company/core/enums.py (1)
  • SeniorityLevel (6-21)
src/ai_company/providers/routing/models.py (3)
  • ResolvedModel (9-45)
  • RoutingDecision (86-104)
  • RoutingRequest (48-83)
tests/unit/providers/routing/conftest.py (3)
  • ResolvedModelFactory (25-34)
  • RoutingDecisionFactory (47-54)
  • RoutingRequestFactory (37-44)
src/ai_company/providers/routing/__init__.py (4)
src/ai_company/providers/routing/errors.py (4)
  • ModelResolutionError (16-19)
  • NoAvailableModelError (22-25)
  • RoutingError (10-13)
  • UnknownStrategyError (28-31)
src/ai_company/providers/routing/models.py (3)
  • ResolvedModel (9-45)
  • RoutingDecision (86-104)
  • RoutingRequest (48-83)
src/ai_company/providers/routing/resolver.py (1)
  • ModelResolver (28-181)
src/ai_company/providers/routing/strategies.py (5)
  • CostAwareStrategy (482-526)
  • ManualStrategy (338-375)
  • RoleBasedStrategy (381-476)
  • RoutingStrategy (46-74)
  • SmartStrategy (532-654)
tests/unit/config/test_schema.py (2)
src/ai_company/config/schema.py (1)
  • RoutingRuleConfig (98-134)
src/ai_company/core/enums.py (1)
  • SeniorityLevel (6-21)
🔇 Additional comments (14)
src/ai_company/core/role_catalog.py (2)

25-67: Vendor-agnostic tier mapping is consistent and clean.

The typical_model_tier updates are consistent across all seniority levels and align with the routing decoupling objective.


443-447: Good observability on the miss/error path.

Adding a structured WARNING log with ROLE_LOOKUP_MISS before raising LookupError is the right behavior for diagnosability.

As per coding guidelines, "All error paths must log at WARNING or ERROR level with context before raising an exception."

src/ai_company/config/schema.py (1)

101-103: Matcher requirement documentation is clear and actionable.

This makes the rule contract explicit and easier to reason about in config/tests.

tests/unit/config/conftest.py (1)

42-44: Factory default update is appropriate for new routing-rule validation.

Setting task_type here keeps generated test objects valid by default.

src/ai_company/observability/events.py (1)

79-91: Routing event constant set is comprehensive and consistent.

These additions improve observability coverage across routing lifecycle/error paths.

tests/unit/config/test_schema.py (1)

136-154: Test updates correctly reflect the new routing-rule matcher contract.

Coverage now includes both valid minimal shapes and the no-matcher rejection path.

Also applies to: 195-200, 357-362, 377-377, 421-427, 436-442

src/ai_company/providers/routing/resolver.py (1)

52-54: Immutable index wrapping is a solid safety choice.

Using MappingProxyType prevents accidental mutation of resolver state.

tests/unit/providers/routing/test_models.py (1)

18-141: Domain model test coverage is strong and well-targeted.

The suite validates defaults, invariants, and frozen-model behavior across all routing models.

src/ai_company/providers/routing/router.py (1)

79-85: Router build logging includes the right operational context.

The emitted payload is useful for startup diagnostics and config verification.

src/ai_company/providers/routing/__init__.py (1)

7-51: Routing package exports are clean and complete.

__all__ is explicit and aligns with the intended public routing API.

tests/unit/providers/routing/test_strategies.py (1)

26-470: Comprehensive strategy coverage looks solid.

Great depth here across protocol conformance, fallback chains, soft-fail behavior, and budget-aware paths.

src/ai_company/providers/__init__.py (1)

37-109: Routing re-exports are clean and consistent.

The import/export surface is coherent and the new symbols are correctly exposed via __all__.

tests/unit/providers/routing/test_resolver.py (1)

12-143: Resolver tests cover the critical invariants well.

Good checks for collision detection, immutability, and cost ordering semantics.

src/ai_company/providers/routing/models.py (1)

9-104: Routing domain models are well-structured and validated.

Immutability + field constraints + clear typed contracts look good for strategy/resolver integration.

Comment on lines +57 to +115
def from_config(
cls,
providers: dict[str, ProviderConfig],
) -> ModelResolver:
"""Build a resolver from a provider config dict.

Args:
providers: Provider config dict (key = provider name).

Returns:
A new ``ModelResolver`` with all models indexed.
"""
index: dict[str, ResolvedModel] = {}

for provider_name, provider_config in providers.items():
for model_config in provider_config.models:
resolved = ResolvedModel(
provider_name=provider_name,
model_id=model_config.id,
alias=model_config.alias,
cost_per_1k_input=model_config.cost_per_1k_input,
cost_per_1k_output=model_config.cost_per_1k_output,
max_context=model_config.max_context,
)
for ref in (model_config.id, model_config.alias):
if ref is None:
continue
existing = index.get(ref)
if existing is not None and existing != resolved:
logger.error(
ROUTING_MODEL_RESOLUTION_FAILED,
ref=ref,
existing_provider=existing.provider_name,
existing_model_id=existing.model_id,
new_provider=provider_name,
new_model_id=resolved.model_id,
)
msg = (
f"Duplicate model reference {ref!r}: "
f"{existing.provider_name}/{existing.model_id} "
f"vs {provider_name}/{resolved.model_id}"
)
raise ModelResolutionError(
msg,
context={
"ref": ref,
"existing_provider": existing.provider_name,
"new_provider": provider_name,
},
)
index[ref] = resolved

logger.info(
ROUTING_RESOLVER_BUILT,
model_count=len({m.model_id for m in index.values()}),
ref_count=len(index),
providers=sorted(providers),
)
return cls(index)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Refactor from_config into smaller helpers to satisfy function-size rule.

This method currently combines multiple concerns and exceeds the 50-line limit.

♻️ Proposed refactor direction
 class ModelResolver:
+    `@staticmethod`
+    def _build_resolved_model(
+        provider_name: str,
+        model_config: ProviderModelConfig,
+    ) -> ResolvedModel:
+        return ResolvedModel(
+            provider_name=provider_name,
+            model_id=model_config.id,
+            alias=model_config.alias,
+            cost_per_1k_input=model_config.cost_per_1k_input,
+            cost_per_1k_output=model_config.cost_per_1k_output,
+            max_context=model_config.max_context,
+        )
+
+    `@staticmethod`
+    def _index_ref(
+        index: dict[str, ResolvedModel],
+        ref: str,
+        resolved: ResolvedModel,
+        provider_name: str,
+    ) -> None:
+        existing = index.get(ref)
+        if existing is not None and existing != resolved:
+            logger.error(
+                ROUTING_MODEL_RESOLUTION_FAILED,
+                ref=ref,
+                existing_provider=existing.provider_name,
+                existing_model_id=existing.model_id,
+                new_provider=provider_name,
+                new_model_id=resolved.model_id,
+            )
+            msg = (
+                f"Duplicate model reference {ref!r}: "
+                f"{existing.provider_name}/{existing.model_id} "
+                f"vs {provider_name}/{resolved.model_id}"
+            )
+            raise ModelResolutionError(msg, context={"ref": ref})
+        index[ref] = resolved
@@
     def from_config(
         cls,
         providers: dict[str, ProviderConfig],
     ) -> ModelResolver:
         index: dict[str, ResolvedModel] = {}
         for provider_name, provider_config in providers.items():
             for model_config in provider_config.models:
-                resolved = ResolvedModel(...)
+                resolved = cls._build_resolved_model(provider_name, model_config)
                 for ref in (model_config.id, model_config.alias):
                     if ref is None:
                         continue
-                    ...
+                    cls._index_ref(index, ref, resolved, provider_name)

As per coding guidelines: "Keep functions under 50 lines; keep files under 800 lines."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/providers/routing/resolver.py` around lines 57 - 115, The
from_config classmethod on ModelResolver is too large and should be split into
focused helpers: extract a helper to build a ResolvedModel from (provider_name,
model_config) (referencing ResolvedModel and model_config fields: id, alias,
cost_per_1k_input, cost_per_1k_output, max_context), extract a helper to index a
resolved model into the index dict (handling duplicate detection and raising
ModelResolutionError and logging ROUTING_MODEL_RESOLUTION_FAILED with the same
context), and extract a small helper to compute the final metrics for the logger
call (model_count/ref_count). Replace the current inline loops in from_config
with calls to these helpers and keep from_config as an orchestration layer that
builds the index and returns cls(index).

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 21 to 27
SENIORITY_INFO: tuple[SeniorityInfo, ...] = (
SeniorityInfo(
level=SeniorityLevel.JUNIOR,
authority_scope="Execute assigned tasks only",
typical_model_tier="haiku",
typical_model_tier="small",
cost_tier=CostTier.LOW,
),
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typical_model_tier values were changed to vendor-agnostic aliases (small/medium/large), but built-in templates + template rendering logic still default to/use haiku/sonnet/opus (e.g. src/ai_company/templates/renderer.py default "sonnet" and multiple templates/builtins/*.yaml). This will likely make template-generated configs fail to resolve models unless users keep legacy aliases; either update templates/renderer to the new tier names or add backward-compatible alias mapping (haiku→small, sonnet→medium, opus→large).

Copilot uses AI. Check for mistakes.
…and templates

- Add allow_inf_nan=False to ResolvedModel and ProviderModelConfig
- Add logging before raise in RoutingRuleConfig._at_least_one_matcher
- Add strategy-required-fields docstring to RoutingRequest
- Update all template tiers from vendor-specific (opus/sonnet/haiku) to
  generic (large/medium/small) across schema, renderer, and 7 YAML builtins
- Add tests for total_cost_per_1k, inf/nan rejection, fallback chain
  skipping unresolvable entries, exhausted non-empty chain, rule fallback
  dedup, and routing failure warning log
- Rename misleading test_budget_filters_models to test_tight_budget_picks_cheapest
- Add noqa comments for intentional runtime imports (PEP 649 compatibility)
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/config/schema.py`:
- Around line 137-141: The validation-failure warning for RoutingRuleConfig
currently logs CONFIG_VALIDATION_FAILED with model and error only; update the
logger.warning call (the one that emits CONFIG_VALIDATION_FAILED in
src/ai_company/config/schema.py, around the RoutingRuleConfig validation path)
to include the routing rule context keys role_level, task_type, preferred_model,
and fallback in the payload (pass them as keyword args alongside
model="RoutingRuleConfig" and error=msg) so the warning contains those fields
for easier triage.

In `@src/ai_company/providers/routing/models.py`:
- Around line 69-90: The remaining_budget field on the RoutingRequest model
currently allows NaN/Infinity which breaks deterministic cost comparisons;
update the Field(...) for remaining_budget in the RoutingRequest class to
disallow non-finite floats by adding allow_inf_nan=False (i.e.,
remaining_budget: float | None = Field(default=None, ge=0.0,
allow_inf_nan=False, description=...)); ensure the same semantics as
ResolvedModel so budget filtering remains reliable.

In `@src/ai_company/providers/routing/strategies.py`:
- Around line 562-569: The code returns immediately when
_try_task_type_rules(...) yields a decision, which skips budget checks and
allows models that exceed remaining_budget; update the flow in the method
containing the call to _try_task_type_rules so that when a non-None decision is
produced you still validate it against remaining_budget (or run the same
budget-filtering logic used in the cost-aware branch) before returning—e.g.,
check the selected model's cost against remaining_budget and only return the
decision if it passes, otherwise continue to the next decision path or fall back
to a budget-compliant selection; reference _try_task_type_rules,
remaining_budget, and the cost-aware routing logic when making the change.
- Around line 615-633: The SmartStrategy priority chain is short-circuited by
calling self._try_global_chain before role/seniority stages, letting global
fallback models win too early; change the resolution order in the composite OR
chain so that _try_task_type_rules, _try_role_rules, and _try_seniority_default
run (and may traverse config.fallback_chain as they currently do) before any
global-chain resolution, and only call self._try_global_chain after
_try_seniority_default and _try_cheapest. Update the OR sequence that currently
ends with "or self._try_global_chain(config, resolver) or
self._try_cheapest(request, resolver)" to instead call
self._try_cheapest(request, resolver) first and then
self._try_global_chain(config, resolver) so global-chain models are only
considered last.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99fa913 and b17601b.

📒 Files selected for processing (18)
  • src/ai_company/config/schema.py
  • src/ai_company/providers/routing/models.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/strategies.py
  • src/ai_company/templates/builtins/agency.yaml
  • src/ai_company/templates/builtins/dev_shop.yaml
  • src/ai_company/templates/builtins/full_company.yaml
  • src/ai_company/templates/builtins/product_team.yaml
  • src/ai_company/templates/builtins/research_lab.yaml
  • src/ai_company/templates/builtins/solo_founder.yaml
  • src/ai_company/templates/builtins/startup.yaml
  • src/ai_company/templates/renderer.py
  • src/ai_company/templates/schema.py
  • tests/unit/providers/routing/test_models.py
  • tests/unit/providers/routing/test_router.py
  • tests/unit/providers/routing/test_strategies.py
  • tests/unit/templates/test_schema.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations — Python 3.14 has native lazy annotations via PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
All public functions must include type hints; enforce via mypy strict mode
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Use Pydantic v2 models with BaseModel, model_validator, and ConfigDict
Enforce 88-character line length (enforced by ruff)
Keep functions under 50 lines; keep files under 800 lines

Files:

  • src/ai_company/templates/schema.py
  • tests/unit/templates/test_schema.py
  • src/ai_company/templates/renderer.py
  • tests/unit/providers/routing/test_router.py
  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/models.py
  • tests/unit/providers/routing/test_strategies.py
  • tests/unit/providers/routing/test_models.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/providers/routing/strategies.py
  • src/ai_company/config/schema.py
src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic must import: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging or logging.getLogger() or print() in application code — use the ai_company logger
Always use logger as the variable name (not _logger or log)
Always use event name constants from ai_company.observability.events in logger calls
Use structured logging format: logger.info(EVENT, key=value) — never use format strings like logger.info("msg %s", val)
All error paths must log at WARNING or ERROR level with context before raising an exception
All state transitions must log at INFO level
Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions

Files:

  • src/ai_company/templates/schema.py
  • src/ai_company/templates/renderer.py
  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/models.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/providers/routing/strategies.py
  • src/ai_company/config/schema.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark all test cases with pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, or @pytest.mark.slow
Configure pytest async mode as asyncio_mode = "auto" — do not manually add @pytest.mark.asyncio to test functions

Files:

  • tests/unit/templates/test_schema.py
  • tests/unit/providers/routing/test_router.py
  • tests/unit/providers/routing/test_strategies.py
  • tests/unit/providers/routing/test_models.py
🧠 Learnings (11)
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/model_service.py} : Respect existing model configuration patterns in Ollama integration

Applied to files:

  • src/ai_company/templates/schema.py
  • tests/unit/templates/test_schema.py
  • src/ai_company/templates/renderer.py
  • src/ai_company/providers/routing/resolver.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Tests must use fake model names (e.g., `test-model:8b`, `fake-writer:latest`)—never use real model IDs from `RECOMMENDED_MODELS`.

Applied to files:

  • tests/unit/templates/test_schema.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Configure appropriate temperature settings based on agent role: Writer (0.9), Editor (0.6), Continuity (0.3), Architect (0.85), Interviewer (0.7)

Applied to files:

  • src/ai_company/templates/builtins/research_lab.yaml
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/agents/**/*.py : Use context size of 32768 tokens as default for Ollama models.

Applied to files:

  • src/ai_company/templates/renderer.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to **/*.py : Do not use `from __future__ import annotations` — Python 3.14 has native lazy annotations via PEP 649

Applied to files:

  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/providers/routing/strategies.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : NEVER add `from __future__ import annotations` - Python 3.14+ has native support for deferred annotation evaluation and does not require this import

Applied to files:

  • src/ai_company/providers/routing/router.py
  • src/ai_company/providers/routing/resolver.py
  • src/ai_company/providers/routing/strategies.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising an exception

Applied to files:

  • src/ai_company/providers/routing/router.py
  • src/ai_company/config/schema.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions

Applied to files:

  • src/ai_company/providers/routing/router.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic must import: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • src/ai_company/config/schema.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` in logger calls

Applied to files:

  • src/ai_company/config/schema.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All state transitions must log at INFO level

Applied to files:

  • src/ai_company/config/schema.py
🧬 Code graph analysis (4)
tests/unit/providers/routing/test_router.py (5)
src/ai_company/core/enums.py (1)
  • SeniorityLevel (6-21)
src/ai_company/providers/routing/errors.py (3)
  • ModelResolutionError (16-19)
  • NoAvailableModelError (22-25)
  • UnknownStrategyError (28-31)
src/ai_company/providers/routing/router.py (4)
  • ModelRouter (24-134)
  • strategy_name (88-90)
  • resolver (83-85)
  • route (92-134)
tests/unit/providers/routing/conftest.py (3)
  • three_model_provider (85-93)
  • resolver (97-101)
  • standard_routing_config (105-130)
src/ai_company/providers/routing/resolver.py (1)
  • all_models (173-176)
src/ai_company/providers/routing/router.py (3)
src/ai_company/providers/routing/errors.py (2)
  • RoutingError (10-13)
  • UnknownStrategyError (28-31)
src/ai_company/providers/routing/models.py (1)
  • RoutingRequest (48-90)
src/ai_company/providers/routing/resolver.py (2)
  • ModelResolver (25-188)
  • from_config (87-122)
tests/unit/providers/routing/test_models.py (3)
src/ai_company/core/enums.py (1)
  • SeniorityLevel (6-21)
src/ai_company/providers/routing/models.py (4)
  • ResolvedModel (9-45)
  • RoutingDecision (93-111)
  • RoutingRequest (48-90)
  • total_cost_per_1k (43-45)
tests/unit/providers/routing/conftest.py (3)
  • ResolvedModelFactory (25-34)
  • RoutingDecisionFactory (47-54)
  • RoutingRequestFactory (37-44)
src/ai_company/config/schema.py (2)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/observability/config.py (1)
  • LogConfig (115-192)
🔇 Additional comments (17)
src/ai_company/templates/builtins/solo_founder.yaml (1)

36-36: Tier migration is consistent and clean.

The updated model tiers align with the new vendor-agnostic template strategy.

Also applies to: 41-41

src/ai_company/templates/builtins/research_lab.yaml (1)

35-35: Model tier updates look correct.

These assignments are consistent with the standardized large/medium/small naming.

Also applies to: 40-40, 45-45

src/ai_company/templates/builtins/dev_shop.yaml (1)

38-38: Good tier normalization across agent roles.

The migration preserves intent and removes provider coupling as expected.

Also applies to: 43-43, 48-48, 53-53, 58-58

src/ai_company/templates/builtins/full_company.yaml (1)

44-44: Template model mapping update is coherent.

The revised tiers are consistent with the new generic catalog.

Also applies to: 49-49, 54-54, 58-58, 63-63

src/ai_company/config/schema.py (1)

32-32: allow_inf_nan=False is a solid hardening change.

Good guardrail for provider model numeric fields.

src/ai_company/templates/builtins/startup.yaml (1)

39-39: Startup template tier migration looks correct.

The updated model values are consistent and aligned with the new routing vocabulary.

Also applies to: 44-44, 49-49, 54-54, 59-59

tests/unit/templates/test_schema.py (1)

79-79: Test expectation update is correct.

This assertion now matches the new default model tier.

src/ai_company/templates/builtins/agency.yaml (1)

38-38: Agency template model updates are consistent.

The changes correctly apply the vendor-agnostic tier naming.

Also applies to: 43-43, 47-47

src/ai_company/templates/renderer.py (1)

379-380: LGTM! Vendor-agnostic model tier default.

The change from "sonnet" to "medium" aligns with the PR objective to use vendor-agnostic tier names (small/medium/large) instead of vendor-specific model names.

src/ai_company/templates/builtins/product_team.yaml (1)

38-52: LGTM! Consistent vendor-agnostic model tier assignments.

The model tier updates appropriately map agent roles to capability tiers:

  • Senior roles (Product Manager, Full-Stack Developer) and mid-level UX Designer → "medium"
  • QA Engineer → "small"

This aligns with the PR objective for vendor-agnostic tier naming.

src/ai_company/templates/schema.py (1)

88-102: LGTM! Schema default and documentation aligned.

The default value change to "medium" and updated docstring documenting tier options ("large", "medium", "small") are consistent with the vendor-agnostic tier naming across the codebase.

tests/unit/providers/routing/test_models.py (2)

1-98: LGTM! Comprehensive model validation tests.

Good coverage of ResolvedModel including:

  • Factory construction and immutability
  • Optional alias handling
  • Cost defaults and non-negative validation
  • max_context positivity constraint
  • NotBlankStr validation for provider_name/model_id
  • total_cost_per_1k computed property
  • Inf/NaN rejection via allow_inf_nan=False

100-166: LGTM! Request and decision model tests.

Good coverage of RoutingRequest and RoutingDecision:

  • All-optional field construction
  • Negative budget rejection
  • NotBlankStr validation for task_type and model_override
  • Default empty fallbacks tuple
  • Immutability enforcement
tests/unit/providers/routing/test_router.py (1)

1-171: LGTM! Comprehensive router tests with observability verification.

Excellent coverage including:

  • Strategy construction and alias mapping (cheapestcost_aware)
  • Error handling for unknown strategies
  • Routing behavior across all strategies (cost_aware, manual, role_based, smart)
  • Observability verification for ROUTING_ROUTER_BUILT, ROUTING_DECISION_MADE, and ROUTING_SELECTION_FAILED events
  • Error propagation validation
src/ai_company/providers/routing/router.py (1)

1-134: LGTM! Well-structured router with proper observability.

The implementation:

  • Correctly delegates to strategies via STRATEGY_MAP
  • Logs structured events for router construction (ROUTING_ROUTER_BUILT) and routing decisions (ROUTING_DECISION_MADE)
  • Logs warning with full context (ROUTING_SELECTION_FAILED) before re-raising RoutingError
  • Exposes resolver and strategy_name properties for introspection
tests/unit/providers/routing/test_strategies.py (1)

1-540: LGTM! Comprehensive strategy test suite.

Excellent coverage of all four strategies with edge cases:

  • Protocol conformance and STRATEGY_MAP completeness (including cheapest alias)
  • ManualStrategy: override resolution, missing override, unknown model errors
  • RoleBasedStrategy: rule matching, seniority defaults, rule fallbacks, global chain fallbacks
  • CostAwareStrategy: cheapest selection, task_type priority, budget filtering, zero-budget warning
  • SmartStrategy: priority ordering (override > task_type > role > seniority > chain > cheapest), soft-fail override handling
  • Fallback chain deduplication and exhaustion scenarios
src/ai_company/providers/routing/resolver.py (1)

1-188: LGTM! Clean resolver with O(1) lookup and collision detection.

Well-designed implementation:

  • MappingProxyType ensures post-construction immutability
  • _index_ref helper properly handles collision detection with structured error logging
  • resolve() logs warning before raising ModelResolutionError
  • resolve_safe() provides non-throwing variant with debug logging
  • all_models_sorted_by_cost() supports cost-aware strategy selection
  • Refactoring from past review (extracting _index_ref) has been applied, keeping from_config concise

Comment on lines +69 to +90
model_config = ConfigDict(frozen=True)

agent_level: SeniorityLevel | None = Field(
default=None,
description="Seniority level of the requesting agent",
)
task_type: NotBlankStr | None = Field(
default=None,
description="Task type label",
)
model_override: NotBlankStr | None = Field(
default=None,
description="Explicit model reference for manual routing",
)
remaining_budget: float | None = Field(
default=None,
ge=0.0,
description=(
"Per-request cost ceiling (compared against model "
"total_cost_per_1k). Not a total session budget."
),
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, examine the file to confirm the model definition
cat -n src/ai_company/providers/routing/models.py | head -100

Repository: Aureliolo/ai-company

Length of output: 4271


🌐 Web query:

Pydantic v2 ConfigDict allow_inf_nan default value and float field validation behavior

💡 Result:

  • ConfigDict.allow_inf_nan default (Pydantic v2): True — i.e., by default Pydantic accepts float('inf'), float('-inf'), and float('nan') (and likewise for Decimal fields). [1]
  • What it changes for float fields:
    • If allow_inf_nan=True: float validation allows inf/-inf/nan (including commonly accepted string forms like "inf", "infinity", "nan" in non-strict parsing). [1][3]
    • If allow_inf_nan=False: those values are rejected with a validation error for float/decimal fields. [1]
  • Schema-level / per-field override: in pydantic-core, float_schema(..., allow_inf_nan=...) exists; its default is None (meaning “don’t override here”, so model config can apply). [2]

Sources
[1] Pydantic v2 config docs (ConfigDict.allow_inf_nan, default True) (docs.pydantic.dev)
[2] Pydantic docs: pydantic_core.core_schema.float_schema(allow_inf_nan=...) (docs.pydantic.dev)
[3] Summary of pydantic-core float validator accepted inputs (incl. "inf", "infinity", "nan" when allowed) (deepwiki.com)

Citations:


Disallow NaN/Infinity for remaining_budget to keep budget routing deterministic.

RoutingRequest omits allow_inf_nan=False while ResolvedModel in the same file enforces it. This allows non-finite float inputs for the remaining_budget field, which is used in cost comparisons (lines 62–66) and would produce unreliable filtering results.

Proposed fix
 class RoutingRequest(BaseModel):
-    model_config = ConfigDict(frozen=True)
+    model_config = ConfigDict(frozen=True, allow_inf_nan=False)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
model_config = ConfigDict(frozen=True)
agent_level: SeniorityLevel | None = Field(
default=None,
description="Seniority level of the requesting agent",
)
task_type: NotBlankStr | None = Field(
default=None,
description="Task type label",
)
model_override: NotBlankStr | None = Field(
default=None,
description="Explicit model reference for manual routing",
)
remaining_budget: float | None = Field(
default=None,
ge=0.0,
description=(
"Per-request cost ceiling (compared against model "
"total_cost_per_1k). Not a total session budget."
),
)
model_config = ConfigDict(frozen=True, allow_inf_nan=False)
agent_level: SeniorityLevel | None = Field(
default=None,
description="Seniority level of the requesting agent",
)
task_type: NotBlankStr | None = Field(
default=None,
description="Task type label",
)
model_override: NotBlankStr | None = Field(
default=None,
description="Explicit model reference for manual routing",
)
remaining_budget: float | None = Field(
default=None,
ge=0.0,
description=(
"Per-request cost ceiling (compared against model "
"total_cost_per_1k). Not a total session budget."
),
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/providers/routing/models.py` around lines 69 - 90, The
remaining_budget field on the RoutingRequest model currently allows NaN/Infinity
which breaks deterministic cost comparisons; update the Field(...) for
remaining_budget in the RoutingRequest class to disallow non-finite floats by
adding allow_inf_nan=False (i.e., remaining_budget: float | None =
Field(default=None, ge=0.0, allow_inf_nan=False, description=...)); ensure the
same semantics as ResolvedModel so budget filtering remains reliable.

- Add routing rule context (role_level, task_type, preferred_model,
  fallback) to RoutingRuleConfig validation warning log payload
- Add allow_inf_nan=False to RoutingRequest ConfigDict so inf/nan
  budgets are rejected at validation time
- Gate CostAwareStrategy task_type rule decisions against
  remaining_budget before returning (fall through to cheapest if
  the matched model exceeds budget)
- Swap SmartStrategy priority: cheapest (budget-aware) now runs
  before global fallback chain so cost-informed selection takes
  precedence over blind chain traversal
Copilot AI review requested due to automatic review settings March 2, 2026 19:40
@Aureliolo Aureliolo merged commit d3c250b into main Mar 2, 2026
11 of 12 checks passed
@Aureliolo Aureliolo deleted the feat/model-routing-engine branch March 2, 2026 19:43
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 33 out of 34 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +161 to +165
logger.info(
ROUTING_FALLBACK_EXHAUSTED,
ref=ref,
source="safe_wrapper",
)
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_try_resolve_with_fallback_safe() currently logs ROUTING_FALLBACK_EXHAUSTED after catching NoAvailableModelError, but _try_resolve_with_fallback() already logs the same event at WARNING right before raising. This results in duplicate exhaustion logs for every safe attempt. Consider removing the wrapper log, or refactoring _try_resolve_with_fallback() to optionally suppress/downgrade its exhaustion log when used by the safe variant.

Suggested change
logger.info(
ROUTING_FALLBACK_EXHAUSTED,
ref=ref,
source="safe_wrapper",
)

Copilot uses AI. Check for mistakes.
Comment on lines +140 to +141
"""MID has no rule -> uses seniority catalog (sonnet tier)."""
strategy = RoleBasedStrategy()
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring still refers to the seniority catalog default as the “sonnet tier”, but the routing tiers were renamed to vendor-agnostic small/medium/large in this PR. Updating this keeps tests/documentation consistent with the new tier naming.

Copilot uses AI. Check for mistakes.
Comment on lines +343 to +345
# review rule -> opus; junior role rule -> haiku; task wins
assert decision.resolved_model.alias == "large"
assert "task-type" in decision.reason.lower()
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment uses vendor-specific names (“opus/haiku”). Since this PR’s stated goal is vendor-agnostic tiers (small/medium/large), it would be clearer to update the comment to those tier names (e.g., “review rule -> large; junior role rule -> small”).

Copilot uses AI. Check for mistakes.
Comment on lines +547 to +549
"""Budget large enough for haiku+sonnet but not opus picks haiku."""
# haiku total=0.006, sonnet total=0.018, opus total=0.090
strategy = CostAwareStrategy()
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring still references vendor model names (“haiku/sonnet/opus”), but the suite otherwise uses the vendor-agnostic tier aliases (small/medium/large). Updating the wording avoids reintroducing vendor coupling in test docs.

Copilot uses AI. Check for mistakes.
Comment on lines +154 to +155
"""LEAD has tier=opus; if opus not registered, use fallback chain."""
provider = ProviderConfig(
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring/comment mentions “tier=opus”. With the tier aliases renamed to large/medium/small, consider updating the text to reflect the new large tier naming.

Copilot uses AI. Check for mistakes.
Aureliolo added a commit that referenced this pull request Mar 10, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.1.1](ai-company-v0.1.0...ai-company-v0.1.1)
(2026-03-10)


### Features

* add autonomy levels and approval timeout policies
([#42](#42),
[#126](#126))
([#197](#197))
([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and
approval decisions
([#186](#186))
([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot)
([#63](#63))
([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking
([#67](#67))
([#185](#185))
([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline
([#269](#269))
([435bdfe](435bdfe)),
closes [#267](#267)
* add coordination error taxonomy classification pipeline
([#146](#146))
([#181](#181))
([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies
([#175](#175))
([ce924fa](ce924fa)),
closes [#173](#173)
* add design specification, license, and project setup
([8669a09](8669a09))
* add env var substitution and config file auto-discovery
([#77](#77))
([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup
([#140](#140))
([09619cb](09619cb)),
closes [#139](#139)
* add HR engine and performance tracking
([#45](#45),
[#47](#47))
([#193](#193))
([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill
([#119](#119))
([deecc39](deecc39))
* add memory retrieval, ranking, and context injection pipeline
([#41](#41))
([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events
([#180](#180))
([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events
([#32](#32))
([46cfdd4](46cfdd4))
* add pluggable PersistenceBackend protocol with SQLite implementation
([#36](#36))
([f753779](f753779))
* add progressive trust and promotion/demotion subsystems
([#43](#43),
[#49](#49))
([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience
([#100](#100))
([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker
integration ([#40](#40))
([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival
([#125](#125),
[#48](#48))
([4a0832b](4a0832b))
* design unified provider interface
([#86](#86))
([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance
([#80](#80),
[#81](#81),
[#84](#84))
([15a9134](15a9134))
* implement agent runtime state vs immutable config split
([#115](#115))
([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator
([#11](#11))
([#143](#143))
([f2eb73a](f2eb73a))
* implement basic tool system (registry, invocation, results)
([#15](#15))
([c51068b](c51068b))
* implement built-in file system tools
([#18](#18))
([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and
messenger ([#157](#157))
([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets
([#85](#85))
([cbf1496](cbf1496))
* implement conflict resolution protocol
([#122](#122))
([#166](#166))
([e03f9f2](e03f9f2))
* implement core entity and role system models
([#69](#69))
([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy
([#149](#149))
([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call
categorization
([#134](#134),
[#135](#135))
([#159](#159))
([9b2699f](9b2699f))
* implement enterprise logging system with structlog
([#73](#73))
([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy
([#130](#130))
([6592515](6592515))
* implement hierarchical delegation and loop prevention
([#12](#12),
[#17](#17))
([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry
([#88](#88))
([ae3f18b](ae3f18b)),
closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation
([#174](#174))
([aa0eefe](aa0eefe))
* implement meeting protocol system
([#123](#123))
([ee7caca](ee7caca))
* implement message and communication domain models
([#74](#74))
([560a5d2](560a5d2))
* implement model routing engine
([#99](#99))
([d3c250b](d3c250b))
* implement parallel agent execution
([#22](#22))
([#161](#161))
([65940b3](65940b3))
* implement per-call cost tracking service
([#7](#7))
([#102](#102))
([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction
([#105](#105))
([934dd85](934dd85))
* implement single-task execution lifecycle
([#21](#21))
([#144](#144))
([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation
([#131](#131))
([#153](#153))
([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies
([#172](#172))
([c7f1b26](c7f1b26)),
closes [#26](#26)
[#30](#30)
* implement task decomposition and routing engine
([#14](#14))
([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models
([#71](#71))
([81eabf1](81eabf1))
* implement tool permission checking
([#16](#16))
([833c190](833c190))
* implement YAML config loader with Pydantic validation
([#59](#59))
([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation
([#75](#75))
([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout
([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout
([#62](#62))
([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6)
([#189](#189))
([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field
([#118](#118))
([c0bab18](c0bab18)),
closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all
([#137](#137))
([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes
([#64](#64))
([f581749](f581749))
* wire all modules into observability system
([#97](#97))
([f7a0617](f7a0617))


### Bug Fixes

* address Greptile post-merge review findings from PRs
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175)
([#176](#176))
([c5ca929](c5ca929))
* address post-merge review feedback from PRs
[#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167)
([#170](#170))
([3bf897a](3bf897a)),
closes [#169](#169)
* enforce strict mypy on test files
([#89](#89))
([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner
([#50](#50),
[#53](#53))
([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements
([#150](#150))
([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience
([#155](#155))
([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes
([#164](#164))
([c02832a](c02832a))
* pre-PR review fixes for post-merge findings
([#183](#183))
([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries
([#117](#117))
([7e5e861](7e5e861))


### Performance

* harden non-inferable principle implementation
([#195](#195))
([02b5f4e](02b5f4e)),
closes [#188](#188)


### Refactoring

* adopt NotBlankStr across all models
([#108](#108))
([#120](#120))
([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models
([#111](#111))
([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and
review fixes
([#182](#182))
([c107bf9](c107bf9))
* harden personality profiles, department validation, and template
rendering ([#158](#158))
([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop
([#124](#124))
([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules
([#136](#136))
([e9cba89](e9cba89))


### Documentation

* add ADR-001 memory layer evaluation and selection
([#178](#178))
([db3026f](db3026f)),
closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC
([#145](#145))
([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation
([#65](#65))
([55c1025](55c1025)),
closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions
([#127](#127))
([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols
([#128](#128))
([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols
([#121](#121))
([6832db6](6832db6))
* finalize 23 design decisions (ADR-002)
([#190](#190))
([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency
review agent
([#114](#114))
([99766ee](99766ee))


### Tests

* add e2e single agent integration tests
([#24](#24))
([#156](#156))
([f566fb4](f566fb4))
* add provider adapter integration tests
([#90](#90))
([40a61f4](40a61f4))


### CI/CD

* add Release Please for automated versioning and GitHub Releases
([#278](#278))
([a488758](a488758))
* bump actions/checkout from 4 to 6
([#95](#95))
([1897247](1897247))
* bump actions/upload-artifact from 4 to 7
([#94](#94))
([27b1517](27b1517))
* harden CI/CD pipeline
([#92](#92))
([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers
([#277](#277))
([aba48af](aba48af))


### Maintenance

* add /worktree skill for parallel worktree management
([#171](#171))
([951e337](951e337))
* add design spec context loading to research-link skill
([8ef9685](8ef9685))
* add post-merge-cleanup skill
([#70](#70))
([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md
([#103](#103))
([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md
([#101](#101))
([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1
([#191](#191))
([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group
([#96](#96))
([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5
([a49ee46](a49ee46))
* fix M0 audit items
([#66](#66))
([c7724b5](c7724b5))
* pin setup-uv action to full SHA
([#281](#281))
([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests,
hookify rules
([#148](#148))
([c57a6a9](c57a6a9))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Aureliolo added a commit that referenced this pull request Mar 11, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.1.0](v0.0.0...v0.1.0)
(2026-03-11)


### Features

* add autonomy levels and approval timeout policies
([#42](#42),
[#126](#126))
([#197](#197))
([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and
approval decisions
([#186](#186))
([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot)
([#63](#63))
([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking
([#67](#67))
([#185](#185))
([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline
([#269](#269))
([435bdfe](435bdfe)),
closes [#267](#267)
* add coordination error taxonomy classification pipeline
([#146](#146))
([#181](#181))
([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies
([#175](#175))
([ce924fa](ce924fa)),
closes [#173](#173)
* add design specification, license, and project setup
([8669a09](8669a09))
* add env var substitution and config file auto-discovery
([#77](#77))
([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup
([#140](#140))
([09619cb](09619cb)),
closes [#139](#139)
* add HR engine and performance tracking
([#45](#45),
[#47](#47))
([#193](#193))
([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill
([#119](#119))
([deecc39](deecc39))
* add mandatory JWT + API key authentication
([#256](#256))
([c279cfe](c279cfe))
* add memory retrieval, ranking, and context injection pipeline
([#41](#41))
([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events
([#180](#180))
([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events
([#32](#32))
([46cfdd4](46cfdd4))
* add pluggable output scan response policies
([#263](#263))
([b9907e8](b9907e8))
* add pluggable PersistenceBackend protocol with SQLite implementation
([#36](#36))
([f753779](f753779))
* add progressive trust and promotion/demotion subsystems
([#43](#43),
[#49](#49))
([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience
([#100](#100))
([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker
integration ([#40](#40))
([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival
([#125](#125),
[#48](#48))
([4a0832b](4a0832b))
* design unified provider interface
([#86](#86))
([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance
([#80](#80),
[#81](#81),
[#84](#84))
([15a9134](15a9134))
* implement agent runtime state vs immutable config split
([#115](#115))
([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator
([#11](#11))
([#143](#143))
([f2eb73a](f2eb73a))
* implement AuditRepository for security audit log persistence
([#279](#279))
([94bc29f](94bc29f))
* implement basic tool system (registry, invocation, results)
([#15](#15))
([c51068b](c51068b))
* implement built-in file system tools
([#18](#18))
([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and
messenger ([#157](#157))
([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets
([#85](#85))
([cbf1496](cbf1496))
* implement conflict resolution protocol
([#122](#122))
([#166](#166))
([e03f9f2](e03f9f2))
* implement core entity and role system models
([#69](#69))
([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy
([#149](#149))
([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call
categorization
([#134](#134),
[#135](#135))
([#159](#159))
([9b2699f](9b2699f))
* implement enterprise logging system with structlog
([#73](#73))
([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy
([#130](#130))
([6592515](6592515))
* implement hierarchical delegation and loop prevention
([#12](#12),
[#17](#17))
([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry
([#88](#88))
([ae3f18b](ae3f18b)),
closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation
([#174](#174))
([aa0eefe](aa0eefe))
* implement meeting protocol system
([#123](#123))
([ee7caca](ee7caca))
* implement message and communication domain models
([#74](#74))
([560a5d2](560a5d2))
* implement model routing engine
([#99](#99))
([d3c250b](d3c250b))
* implement parallel agent execution
([#22](#22))
([#161](#161))
([65940b3](65940b3))
* implement per-call cost tracking service
([#7](#7))
([#102](#102))
([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction
([#105](#105))
([934dd85](934dd85))
* implement single-task execution lifecycle
([#21](#21))
([#144](#144))
([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation
([#131](#131))
([#153](#153))
([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies
([#172](#172))
([c7f1b26](c7f1b26)),
closes [#26](#26)
[#30](#30)
* implement task decomposition and routing engine
([#14](#14))
([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models
([#71](#71))
([81eabf1](81eabf1))
* implement tool permission checking
([#16](#16))
([833c190](833c190))
* implement YAML config loader with Pydantic validation
([#59](#59))
([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation
([#75](#75))
([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout
([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout
([#62](#62))
([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6)
([#189](#189))
([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field
([#118](#118))
([c0bab18](c0bab18)),
closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all
([#137](#137))
([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes
([#64](#64))
([f581749](f581749))
* wire all modules into observability system
([#97](#97))
([f7a0617](f7a0617))


### Bug Fixes

* address Greptile post-merge review findings from PRs
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175)
([#176](#176))
([c5ca929](c5ca929))
* address post-merge review feedback from PRs
[#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167)
([#170](#170))
([3bf897a](3bf897a)),
closes [#169](#169)
* enforce strict mypy on test files
([#89](#89))
([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner
([#50](#50),
[#53](#53))
([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements
([#150](#150))
([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience
([#155](#155))
([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes
([#164](#164))
([c02832a](c02832a))
* pre-PR review fixes for post-merge findings
([#183](#183))
([26b3108](26b3108))
* resolve circular imports, bump litellm, fix release tag format
([#286](#286))
([a6659b5](a6659b5))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries
([#117](#117))
([7e5e861](7e5e861))


### Performance

* harden non-inferable principle implementation
([#195](#195))
([02b5f4e](02b5f4e)),
closes [#188](#188)


### Refactoring

* adopt NotBlankStr across all models
([#108](#108))
([#120](#120))
([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models
([#111](#111))
([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and
review fixes
([#182](#182))
([c107bf9](c107bf9))
* harden personality profiles, department validation, and template
rendering ([#158](#158))
([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop
([#124](#124))
([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules
([#136](#136))
([e9cba89](e9cba89))


### Documentation

* add ADR-001 memory layer evaluation and selection
([#178](#178))
([db3026f](db3026f)),
closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC
([#145](#145))
([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation
([#65](#65))
([55c1025](55c1025)),
closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions
([#127](#127))
([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols
([#128](#128))
([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols
([#121](#121))
([6832db6](6832db6))
* finalize 23 design decisions (ADR-002)
([#190](#190))
([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency
review agent
([#114](#114))
([99766ee](99766ee))


### Tests

* add e2e single agent integration tests
([#24](#24))
([#156](#156))
([f566fb4](f566fb4))
* add provider adapter integration tests
([#90](#90))
([40a61f4](40a61f4))


### CI/CD

* add Release Please for automated versioning and GitHub Releases
([#278](#278))
([a488758](a488758))
* bump actions/checkout from 4 to 6
([#95](#95))
([1897247](1897247))
* bump actions/upload-artifact from 4 to 7
([#94](#94))
([27b1517](27b1517))
* bump anchore/scan-action from 6.5.1 to 7.3.2
([#271](#271))
([80a1c15](80a1c15))
* bump docker/build-push-action from 6.19.2 to 7.0.0
([#273](#273))
([dd0219e](dd0219e))
* bump docker/login-action from 3.7.0 to 4.0.0
([#272](#272))
([33d6238](33d6238))
* bump docker/metadata-action from 5.10.0 to 6.0.0
([#270](#270))
([baee04e](baee04e))
* bump docker/setup-buildx-action from 3.12.0 to 4.0.0
([#274](#274))
([5fc06f7](5fc06f7))
* bump sigstore/cosign-installer from 3.9.1 to 4.1.0
([#275](#275))
([29dd16c](29dd16c))
* harden CI/CD pipeline
([#92](#92))
([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers
([#277](#277))
([aba48af](aba48af))


### Maintenance

* add /worktree skill for parallel worktree management
([#171](#171))
([951e337](951e337))
* add design spec context loading to research-link skill
([8ef9685](8ef9685))
* add post-merge-cleanup skill
([#70](#70))
([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md
([#103](#103))
([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md
([#101](#101))
([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1
([#191](#191))
([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group
([#96](#96))
([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5
([a49ee46](a49ee46))
* fix M0 audit items
([#66](#66))
([c7724b5](c7724b5))
* **main:** release ai-company 0.1.1
([#282](#282))
([2f4703d](2f4703d))
* pin setup-uv action to full SHA
([#281](#281))
([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests,
hookify rules
([#148](#148))
([c57a6a9](c57a6a9))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement model routing engine (role-based, cost-based, fallback chains)

2 participants