Conversation
Add strategy-based LLM model routing with resolver, 4 strategies (manual, role_based, cost_aware, smart), and 83 unit tests at 94%+ coverage. Closes #6.
Apply 23 review items from local agents and external reviewers: - Add NotBlankStr validation on routing model string fields - Add collision detection in ModelResolver.from_config - Add missing logging on all error paths before raising - Freeze STRATEGY_MAP with MappingProxyType - Fix SmartStrategy priority order (global chain before cheapest) - Remove redundant fallback chain walk in RoleBasedStrategy - Use NoReturn type for _raise_exhausted - Rename seniority catalog tiers from vendor-specific (haiku/sonnet/opus) to generic (small/medium/large) - Make all test fixtures vendor-agnostic (no real provider names or model IDs) - Add vendor-agnostic fixture rule to CLAUDE.md
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
|
Note Currently processing new changes in this PR. This may take a few minutes, please wait... 📒 Files selected for processing (5)
✏️ Tip: You can disable in-progress messages and the fortune message in your review settings. Tip You can validate your CodeRabbit configuration file in your editor.If your editor has YAML language server, you can enable auto-completion and validation by adding 📝 WalkthroughWalkthroughImplements a new model routing subsystem (models, resolver, router, strategies, errors), adds routing observability events and config validation, updates seniority→model-tier mappings and template defaults, expands provider public exports, and adds comprehensive unit tests and routing fixtures. Also adds minor docs and testing guidance. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Router as ModelRouter
participant Strategy as RoutingStrategy
participant Resolver as ModelResolver
Client->>Router: route(RoutingRequest)
Router->>Strategy: select(request, routing_config, resolver)
alt override present
Strategy->>Resolver: resolve(override_ref)
Resolver-->>Strategy: ResolvedModel
else task-type rule
Strategy->>Resolver: resolve(preferred_model)
Resolver-->>Strategy: ResolvedModel
else role-based rule
Strategy->>Resolver: resolve(role_preferred)
Resolver-->>Strategy: ResolvedModel
else cheapest/fallback
Strategy->>Resolver: all_models_sorted_by_cost()
Resolver-->>Strategy: ResolvedModel(s)
end
Strategy-->>Router: RoutingDecision
Router->>Router: Log ROUTING_DECISION_MADE
Router-->>Client: RoutingDecision
sequenceDiagram
participant Config as RoutingConfig
participant Router as ModelRouter
participant Resolver as ModelResolver
Config->>Router: __init__(routing_config, providers)
Router->>Resolver: ModelResolver.from_config(providers)
Resolver->>Resolver: Index models by id & alias
Resolver-->>Router: ModelResolver instance
Router->>Router: strategy = STRATEGY_MAP[name]
alt unknown strategy
Router->>Router: Log ROUTING_STRATEGY_UNKNOWN (rgba(255,0,0,0.5))
Router-->>Router: Raise UnknownStrategyError
else
Router->>Router: Log ROUTING_ROUTER_BUILT (rgba(0,128,0,0.5))
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a foundational model routing engine designed to intelligently select the most appropriate LLM model based on various criteria. It provides a flexible framework with multiple strategies, enabling dynamic model choices driven by factors like user role, task type, cost considerations, or explicit overrides. The changes enhance the system's ability to manage and utilize diverse LLM resources efficiently, while also improving maintainability through structured error handling, robust validation, and vendor-agnostic configurations. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Pull request overview
Implements a new model routing engine under ai_company.providers.routing, including a resolver for model ref lookup and multiple routing strategies, plus associated observability events and unit tests. This supports Issue #6’s goal of configurable routing with role/cost/smart strategies and fallback handling.
Changes:
- Added routing subpackage (
ModelRouter,ModelResolver, routing domain models/errors, and 4 strategy implementations + registry). - Added routing observability event constants and extensive unit test coverage for resolver/router/strategies.
- Updated seniority catalog to use vendor-agnostic model tiers (
small/medium/large) and aligned tests/docs.
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
src/ai_company/providers/routing/strategies.py |
Implements routing strategies + shared fallback/budget helpers and strategy registry. |
src/ai_company/providers/routing/router.py |
Adds ModelRouter as the routing entry point + decision logging. |
src/ai_company/providers/routing/resolver.py |
Adds ModelResolver with immutable ref→model index and collision detection. |
src/ai_company/providers/routing/models.py |
Adds ResolvedModel, RoutingRequest, RoutingDecision Pydantic models. |
src/ai_company/providers/routing/errors.py |
Introduces routing error hierarchy extending provider errors. |
src/ai_company/providers/routing/__init__.py |
Exposes routing public API exports. |
src/ai_company/providers/__init__.py |
Re-exports routing APIs from the top-level providers package. |
src/ai_company/observability/events.py |
Adds routing lifecycle event constants. |
src/ai_company/core/role_catalog.py |
Switches seniority tiers to vendor-agnostic sizes. |
src/ai_company/core/role.py |
Updates seniority tier docstring example to vendor-agnostic naming. |
tests/unit/providers/routing/conftest.py |
Adds routing-specific fixtures/factories and 3-model provider config. |
tests/unit/providers/routing/test_strategies.py |
Adds unit tests for each routing strategy and registry behavior. |
tests/unit/providers/routing/test_router.py |
Adds unit tests for router construction, routing, and logging. |
tests/unit/providers/routing/test_resolver.py |
Adds unit tests for resolver indexing, immutability, and sorting. |
tests/unit/providers/routing/test_models.py |
Adds unit tests for routing domain model validation/immutability. |
tests/unit/providers/routing/test_errors.py |
Adds unit tests for routing error hierarchy and context/message behavior. |
tests/unit/providers/routing/__init__.py |
Creates routing test package. |
tests/unit/core/test_role_catalog.py |
Updates expectations for vendor-agnostic tier names. |
tests/unit/core/test_role.py |
Updates expectations for vendor-agnostic tier names. |
CLAUDE.md |
Documents vendor-agnostic test fixture convention. |
.claude/skills/aurelio-review-pr/skill.md |
Expands logging-audit guidance text. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if TYPE_CHECKING: | ||
| from collections.abc import Mapping | ||
|
|
||
| from ai_company.config.schema import RoutingConfig, RoutingRuleConfig | ||
|
|
||
| from .resolver import ModelResolver | ||
|
|
||
|
|
There was a problem hiding this comment.
This module uses if TYPE_CHECKING: imports for RoutingConfig, RoutingRuleConfig, ModelResolver, and Mapping, but those names are referenced in runtime annotations (Python 3.14 PEP 649 evaluates annotations lazily at runtime). This can raise NameError when annotations are accessed (e.g., inspect.get_annotations). Import these types at runtime instead (optionally with # noqa: TC00x) rather than behind TYPE_CHECKING.
| if TYPE_CHECKING: | |
| from collections.abc import Mapping | |
| from ai_company.config.schema import RoutingConfig, RoutingRuleConfig | |
| from .resolver import ModelResolver | |
| from collections.abc import Mapping | |
| from ai_company.config.schema import RoutingConfig, RoutingRuleConfig | |
| from .resolver import ModelResolver |
| STRATEGY_MAP: Mapping[str, RoutingStrategy] = MappingProxyType( | ||
| { |
There was a problem hiding this comment.
STRATEGY_MAP is annotated as Mapping[...], but Mapping is only imported under TYPE_CHECKING above. With Python 3.14 lazy annotations, evaluating this annotation will fail at runtime. Import Mapping from collections.abc at runtime (or remove the annotation).
| if TYPE_CHECKING: | ||
| from ai_company.config.schema import ProviderConfig, RoutingConfig | ||
|
|
||
| from .models import RoutingDecision, RoutingRequest |
There was a problem hiding this comment.
ProviderConfig, RoutingConfig, RoutingRequest, and RoutingDecision are imported only under TYPE_CHECKING but used in annotations. Under Python 3.14 (PEP 649 lazy annotations), accessing annotations can raise NameError. Import these types at runtime (with # noqa: TC00x if needed) instead of behind TYPE_CHECKING.
| if TYPE_CHECKING: | ||
| from ai_company.config.schema import ProviderConfig | ||
|
|
There was a problem hiding this comment.
ProviderConfig is imported only under TYPE_CHECKING but referenced in the from_config() signature annotation. With Python 3.14 lazy annotations, accessing annotations can raise NameError. Import ProviderConfig at runtime (with # noqa: TC00x if needed) instead of behind TYPE_CHECKING.
| if ref is None: | ||
| continue | ||
| existing = index.get(ref) | ||
| if existing is not None and existing.model_id != resolved.model_id: |
There was a problem hiding this comment.
Collision detection only raises when existing.model_id != resolved.model_id. If two providers define the same ref (ID/alias) pointing to different providers but with the same model_id string, this will silently overwrite the index and make the ref ambiguous. Consider raising on any ref collision where either provider_name or model_id differ (or more simply: if existing is not None and existing != resolved).
| if existing is not None and existing.model_id != resolved.model_id: | |
| if existing is not None and existing != resolved: |
| remaining_budget: float | None = Field( | ||
| default=None, | ||
| ge=0.0, | ||
| description="Remaining cost budget in USD", | ||
| ) |
There was a problem hiding this comment.
remaining_budget is documented as a remaining budget in USD, but the cost-aware selection compares it against cost_per_1k_input + cost_per_1k_output (USD per 1k tokens). This is a unit mismatch and makes the API ambiguous. Either clarify in the field description/name that the budget is per-1k-token (or per-request) threshold, or extend RoutingRequest with enough information (e.g., expected token counts) to compute an actual USD cost against a USD budget.
| request: RoutingRequest, | ||
| resolver: ModelResolver, | ||
| ) -> RoutingDecision | None: | ||
| if not resolver.all_models_sorted_by_cost(): |
There was a problem hiding this comment.
_try_cheapest() calls resolver.all_models_sorted_by_cost() just to check emptiness and then _cheapest_within_budget() calls it again. Cache the sorted tuple (or call resolver.all_models() for the emptiness check) to avoid the extra sort/work on the hot path.
| if not resolver.all_models_sorted_by_cost(): | |
| # Use a cheap existence check to avoid extra sorting work here. | |
| if not resolver.all_models(): |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/aurelio-review-pr/skill.md:
- Around line 107-114: The "Logging coverage suggestions" subsection continues
numbering at 7-10 which triggers MD029; update the list under the "Logging
coverage suggestions (soft rules — mark as SUGGESTION, must be validated by user
in triage):" header so numbering restarts at 1 (i.e., change items labeled 7–10
to 1–4), or explicitly suppress the MD029 warning if the cross-section numbering
is intentional for easier triage reference; locate the header text "Logging
coverage suggestions" and the following list items to make the change and ensure
consistency in the document.
In `@src/ai_company/providers/routing/router.py`:
- Around line 109-113: Wrap the call to self._strategy.select(...) inside
route() in a try/except that catches routing exceptions, log a WARNING or ERROR
with structured context including the request object, the strategy identifier
(from self._strategy), and relevant config/resolver context (self._config,
self._resolver) before re-raising the exception; ensure the logged message is
descriptive (e.g., "routing selection failed") and include the caught exception
details, then re-raise the original exception to preserve behavior.
In `@src/ai_company/providers/routing/strategies.py`:
- Around line 620-633: Add a brief inline comment next to the "cheapest" alias
in the STRATEGY_MAP to document that it intentionally aliases "cost_aware"
(e.g., mention _COST_AWARE / "cheapest" maps to the same strategy), keeping the
registry immutable via MappingProxyType; update the comment near STRATEGY_MAP to
be concise and clarify the UX intent of the alias.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (21)
.claude/skills/aurelio-review-pr/skill.mdCLAUDE.mdsrc/ai_company/core/role.pysrc/ai_company/core/role_catalog.pysrc/ai_company/observability/events.pysrc/ai_company/providers/__init__.pysrc/ai_company/providers/routing/__init__.pysrc/ai_company/providers/routing/errors.pysrc/ai_company/providers/routing/models.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/strategies.pytests/unit/core/test_role.pytests/unit/core/test_role_catalog.pytests/unit/providers/routing/__init__.pytests/unit/providers/routing/conftest.pytests/unit/providers/routing/test_errors.pytests/unit/providers/routing/test_models.pytests/unit/providers/routing/test_resolver.pytests/unit/providers/routing/test_router.pytests/unit/providers/routing/test_strategies.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Agent
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not usefrom __future__ import annotations— Python 3.14 has native lazy annotations via PEP 649
Use PEP 758 except syntax:except A, B:(no parentheses) — ruff enforces this on Python 3.14
All public functions must include type hints; enforce via mypy strict mode
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Use Pydantic v2 models withBaseModel,model_validator, andConfigDict
Enforce 88-character line length (enforced by ruff)
Keep functions under 50 lines; keep files under 800 lines
Files:
src/ai_company/core/role.pysrc/ai_company/providers/routing/router.pytests/unit/providers/routing/test_router.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/providers/routing/__init__.pytests/unit/providers/routing/test_models.pysrc/ai_company/providers/routing/strategies.pysrc/ai_company/core/role_catalog.pysrc/ai_company/observability/events.pysrc/ai_company/providers/routing/errors.pytests/unit/core/test_role_catalog.pytests/unit/providers/routing/conftest.pytests/unit/core/test_role.pysrc/ai_company/providers/routing/models.pytests/unit/providers/routing/test_strategies.pytests/unit/providers/routing/test_resolver.pysrc/ai_company/providers/__init__.pytests/unit/providers/routing/test_errors.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic must import:from ai_company.observability import get_loggerthenlogger = get_logger(__name__)
Never useimport loggingorlogging.getLogger()orprint()in application code — use the ai_company logger
Always useloggeras the variable name (not_loggerorlog)
Always use event name constants fromai_company.observability.eventsin logger calls
Use structured logging format:logger.info(EVENT, key=value)— never use format strings likelogger.info("msg %s", val)
All error paths must log at WARNING or ERROR level with context before raising an exception
All state transitions must log at INFO level
Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions
Files:
src/ai_company/core/role.pysrc/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/providers/routing/__init__.pysrc/ai_company/providers/routing/strategies.pysrc/ai_company/core/role_catalog.pysrc/ai_company/observability/events.pysrc/ai_company/providers/routing/errors.pysrc/ai_company/providers/routing/models.pysrc/ai_company/providers/__init__.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark all test cases with pytest markers:@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e, or@pytest.mark.slow
Configure pytest async mode asasyncio_mode = "auto"— do not manually add@pytest.mark.asyncioto test functions
Files:
tests/unit/providers/routing/test_router.pytests/unit/providers/routing/test_models.pytests/unit/core/test_role_catalog.pytests/unit/providers/routing/conftest.pytests/unit/core/test_role.pytests/unit/providers/routing/test_strategies.pytests/unit/providers/routing/test_resolver.pytests/unit/providers/routing/test_errors.py
🧠 Learnings (22)
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.
Applied to files:
CLAUDE.md.claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Tests must use fake model names (e.g., `test-model:8b`, `fake-writer:latest`)—never use real model IDs from `RECOMMENDED_MODELS`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to README.md : Update README.md for significant feature changes
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use `async def` for async test functions; pytest-asyncio is configured with `asyncio_mode = "auto"`; mark async tests with `pytest.mark.asyncio` if needed
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to tests/**/*.py : Configure pytest async mode as `asyncio_mode = "auto"` — do not manually add `pytest.mark.asyncio` to test functions
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Set test timeout to 30 seconds per test (via pytest configuration)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Use pytest-xdist for parallelism with `-n auto` flag
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Use pytest fixtures for test setup. Shared fixtures should be in `tests/conftest.py`
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use appropriate fixture scopes (`function`, `class`, `module`, `session`) and document complex fixtures with docstrings
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Mock external dependencies such as file system operations and network calls in tests
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/conftest.py : Place shared pytest fixtures in `tests/conftest.py`
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Log all significant operations using Python logging with appropriate levels: debug for detailed info, info for successes, warning for retries, error for failures
Applied to files:
.claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`), CI enforces this coverage requirement
Applied to files:
.claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions
Applied to files:
.claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit. The CI enforces 100% coverage on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`).
Applied to files:
.claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/**/*.py : All functions and methods should have appropriate logging using `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.
Applied to files:
.claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use logging levels: debug (routine), info (significant), warning (unexpected but recoverable), error (failures)
Applied to files:
.claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : Core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`) must maintain 100% test coverage
Applied to files:
.claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/agents/**/*.py : Agents must extend `BaseAgent`, use retry logic, and implement configurable timeout via settings.
Applied to files:
.claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` in logger calls
Applied to files:
src/ai_company/observability/events.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising an exception
Applied to files:
src/ai_company/providers/routing/errors.py
🧬 Code graph analysis (12)
src/ai_company/providers/routing/router.py (4)
src/ai_company/providers/routing/errors.py (1)
UnknownStrategyError(28-31)src/ai_company/providers/routing/resolver.py (2)
ModelResolver(28-181)from_config(57-115)src/ai_company/config/schema.py (2)
ProviderConfig(52-95)RoutingConfig(127-149)src/ai_company/providers/routing/models.py (2)
RoutingDecision(74-92)RoutingRequest(43-71)
tests/unit/providers/routing/test_router.py (6)
src/ai_company/config/schema.py (2)
ProviderConfig(52-95)RoutingConfig(127-149)src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)src/ai_company/providers/routing/errors.py (3)
ModelResolutionError(16-19)NoAvailableModelError(22-25)UnknownStrategyError(28-31)src/ai_company/providers/routing/models.py (1)
RoutingRequest(43-71)src/ai_company/providers/routing/router.py (4)
ModelRouter(28-122)strategy_name(92-94)resolver(87-89)route(96-122)src/ai_company/providers/routing/resolver.py (1)
all_models(166-169)
src/ai_company/providers/routing/__init__.py (4)
src/ai_company/providers/routing/errors.py (4)
ModelResolutionError(16-19)NoAvailableModelError(22-25)RoutingError(10-13)UnknownStrategyError(28-31)src/ai_company/providers/routing/router.py (2)
resolver(87-89)ModelRouter(28-122)tests/unit/providers/routing/conftest.py (1)
resolver(97-101)src/ai_company/providers/routing/resolver.py (1)
ModelResolver(28-181)
src/ai_company/providers/routing/strategies.py (5)
src/ai_company/core/role_catalog.py (1)
get_seniority_info(429-445)src/ai_company/providers/routing/errors.py (2)
ModelResolutionError(16-19)NoAvailableModelError(22-25)src/ai_company/providers/routing/models.py (2)
ResolvedModel(9-40)RoutingDecision(74-92)src/ai_company/config/schema.py (2)
RoutingConfig(127-149)RoutingRuleConfig(98-124)src/ai_company/providers/routing/resolver.py (5)
ModelResolver(28-181)resolve_safe(146-164)all_models(166-169)all_models_sorted_by_cost(171-181)resolve(117-144)
src/ai_company/core/role_catalog.py (2)
src/ai_company/core/enums.py (2)
CostTier(57-69)SeniorityLevel(6-21)src/ai_company/core/role.py (1)
SeniorityInfo(87-120)
src/ai_company/providers/routing/errors.py (1)
src/ai_company/providers/errors.py (1)
ProviderError(22-69)
tests/unit/core/test_role_catalog.py (2)
src/ai_company/core/role_catalog.py (1)
get_seniority_info(429-445)src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)
tests/unit/core/test_role.py (2)
src/ai_company/core/role.py (1)
SeniorityInfo(87-120)src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)
src/ai_company/providers/routing/models.py (1)
src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)
tests/unit/providers/routing/test_resolver.py (2)
tests/unit/providers/routing/conftest.py (2)
resolver(97-101)three_model_provider(85-93)src/ai_company/providers/routing/resolver.py (6)
ModelResolver(28-181)from_config(57-115)resolve(117-144)all_models(166-169)resolve_safe(146-164)all_models_sorted_by_cost(171-181)
src/ai_company/providers/__init__.py (5)
src/ai_company/providers/routing/strategies.py (5)
CostAwareStrategy(450-494)ManualStrategy(306-343)RoleBasedStrategy(349-444)RoutingStrategy(38-66)SmartStrategy(500-615)src/ai_company/providers/routing/errors.py (4)
ModelResolutionError(16-19)NoAvailableModelError(22-25)RoutingError(10-13)UnknownStrategyError(28-31)src/ai_company/providers/routing/resolver.py (1)
ModelResolver(28-181)src/ai_company/providers/routing/router.py (1)
ModelRouter(28-122)src/ai_company/providers/routing/models.py (3)
ResolvedModel(9-40)RoutingDecision(74-92)RoutingRequest(43-71)
tests/unit/providers/routing/test_errors.py (2)
src/ai_company/providers/errors.py (1)
ProviderError(22-69)src/ai_company/providers/routing/errors.py (4)
ModelResolutionError(16-19)NoAvailableModelError(22-25)RoutingError(10-13)UnknownStrategyError(28-31)
🪛 markdownlint-cli2 (0.21.0)
.claude/skills/aurelio-review-pr/skill.md
[warning] 111-111: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/2/3
(MD029, ol-prefix)
[warning] 112-112: Ordered list item prefix
Expected: 2; Actual: 8; Style: 1/2/3
(MD029, ol-prefix)
[warning] 113-113: Ordered list item prefix
Expected: 3; Actual: 9; Style: 1/2/3
(MD029, ol-prefix)
[warning] 114-114: Ordered list item prefix
Expected: 4; Actual: 10; Style: 1/2/3
(MD029, ol-prefix)
🔇 Additional comments (28)
src/ai_company/core/role_catalog.py (1)
21-70: LGTM! Vendor-agnostic model tier naming is a solid improvement.The update from vendor-specific tiers (
haiku,sonnet,opus) to generic tiers (small,medium,large) decouples the seniority catalog from any specific LLM provider. The mapping is consistent and covers all 8 seniority levels appropriately.src/ai_company/core/role.py (1)
93-93: LGTM! Docstring example updated to match new tier naming.The example value in the docstring now reflects the vendor-agnostic tier naming (
"large"instead of"opus"), keeping documentation consistent with the actual values used inSENIORITY_INFO.tests/unit/core/test_role_catalog.py (1)
60-63: LGTM! Test correctly updated to reflect the new tier value.The test method name, docstring, and assertion are all consistently updated to verify that
SENIORmaps to"medium"tier, matching the updatedSENIORITY_INFOmapping inrole_catalog.py.tests/unit/core/test_role.py (1)
123-200: LGTM! Test fixtures updated consistently with vendor-agnostic tier names.All
SeniorityInfotest instances now use the generic tier names ("small","medium") matching their respective seniority levels, consistent with the updatedSENIORITY_INFOmappings:
JUNIOR→"small"MID→"medium"SENIOR→"medium"The test behavior remains unchanged; only the literal values were updated.
src/ai_company/providers/routing/errors.py (1)
10-31: LGTM! Clean error hierarchy.The routing error classes properly extend
ProviderErrorand follow the established pattern. The docstrings clearly describe each exception's purpose.Minor observation: The
is_retryable = Falseon subclasses (lines 13, 19, 25, 31) is redundant sinceRoutingErroralready sets it toFalseand it's inherited. You could remove these for DRYness, but keeping them is also fine for explicit documentation.src/ai_company/providers/routing/__init__.py (1)
1-39: LGTM!Clean public API surface with comprehensive exports. The
__all__list is alphabetically sorted and complete, matching all imports.src/ai_company/providers/__init__.py (1)
37-52: LGTM!Clean integration of the routing module's public API into the providers package. Alphabetical ordering in
__all__is maintained, and all new exports are properly imported.src/ai_company/observability/events.py (1)
79-91: LGTM!All new routing event constants follow the established
domain.noun.verbnaming convention and are properly typed asFinal[str]. Good coverage for the routing lifecycle: build, resolution, decision, fallback, budget, and error scenarios.src/ai_company/providers/routing/models.py (2)
9-40: LGTM! Well-designed immutable domain models.Good use of
ConfigDict(frozen=True)for immutability,NotBlankStrfor string validation, and appropriate field constraints (ge=0.0for costs,gt=0for context size).
43-92: LGTM!
RoutingRequestandRoutingDecisionare well-structured with proper optional field handling and immutability. Thefallbacks_trieddefault of()is appropriate for the tuple type.src/ai_company/providers/routing/resolver.py (3)
12-22: LGTM! Proper observability setup.Correct use of
get_logger(__name__)with event constants fromai_company.observability.events. The logger variable naming follows conventions.
56-115: LGTM! Robust collision detection with proper logging.The
from_configmethod correctly:
- Detects duplicate refs mapping to different models
- Logs at ERROR level with full context before raising
- Uses structured logging with kwargs
- Raises
ModelResolutionErrorwith appropriate context
117-164: LGTM! Good differentiation between resolve and resolve_safe.Appropriate log levels:
resolvelogs WARNING on miss (unexpected), whileresolve_safelogs DEBUG (expected possibility). Both use structured kwargs.src/ai_company/providers/routing/strategies.py (7)
37-66: LGTM! Clean protocol definition.The
RoutingStrategyprotocol is well-documented with@runtime_checkablefor isinstance checks. The docstring clearly specifies the contract and possible exceptions.
72-143: LGTM! Robust fallback resolution helpers.Good separation between throwing (
_try_resolve_with_fallback) and non-throwing (_try_resolve_with_fallback_safe) variants. Proper logging at each fallback attempt and exhaustion.
171-209: LGTM! Budget-aware selection with proper fallback.The function correctly handles:
- No models registered (logs warning + raises)
- No budget constraint (returns cheapest)
- Budget exceeded by all models (logs warning + returns cheapest anyway with flag)
306-343: LGTM! ManualStrategy implementation.Correctly validates that
model_overrideis set, logs warning before raisingModelResolutionError, and delegates to resolver for actual lookup.
349-444: LGTM! RoleBasedStrategy with comprehensive fallback handling.Good flow: rule match → seniority default tier → global fallback chain. All failure paths log at WARNING level before raising
NoAvailableModelError.
450-494: LGTM! CostAwareStrategy implementation.Clean delegation to
_try_task_type_rulesand_cheapest_within_budgethelpers. Budget exceeded case is clearly communicated in the reason string.
500-616: LGTM! SmartStrategy with clear priority order.The chained
orexpression at lines 524-546 clearly expresses the priority: override → task_type → role → seniority → global chain → cheapest → exhausted. Helper methods are well-encapsulated. TheNoReturntype hint on_raise_exhaustedis correct.src/ai_company/providers/routing/router.py (1)
63-75: Unknown-strategy handling is implemented well.Structured error logging with contextual fields before raising
UnknownStrategyErroris solid.CLAUDE.md (1)
80-80: Good documentation hardening for test portability.The added vendor-agnostic fixture rule is clear and helps keep tests independent from real provider/model IDs.
tests/unit/providers/routing/conftest.py (1)
59-130: Fixture design is clean and deterministic.The shared provider/resolver/routing fixtures are reusable, stable, and vendor-agnostic.
tests/unit/providers/routing/test_errors.py (1)
22-61: Error-hierarchy coverage is solid.These tests correctly validate inheritance, retryability flags, and context/message propagation.
tests/unit/providers/routing/test_models.py (1)
21-125: Model validation tests are comprehensive.Coverage of defaults, non-negative constraints, and frozen-field behavior is strong.
tests/unit/providers/routing/test_router.py (1)
20-146: Router behavior tests are well-rounded.Construction, decision paths, logging assertions, and key error scenarios are all covered.
tests/unit/providers/routing/test_strategies.py (1)
32-402: Excellent strategy-path coverage.The tests exercise precedence logic, fallback behavior, and empty-model failure paths across all strategies.
tests/unit/providers/routing/test_resolver.py (1)
12-114: Resolver test coverage is strong.Indexing, lookup failure semantics, ordering, deduplication, and immutability checks are all well represented.
| **Logging coverage suggestions (soft rules — mark as SUGGESTION, must be validated by user in triage):** | ||
|
|
||
| For every function touched by the PR, analyze its logic and suggest missing logging where appropriate: | ||
|
|
||
| 7. Error/except paths that don't `logger.warning()` or `logger.error()` with context before raising or returning (SUGGESTION) | ||
| 8. State transitions (status changes, lifecycle events, mode switches) that don't `logger.info()` (SUGGESTION) | ||
| 9. Object creation, entry/exit of key functions, or important branching decisions that don't `logger.debug()` (SUGGESTION) | ||
| 10. Any other code path that would benefit from logging for debuggability or operational visibility — think about what an operator investigating a production issue would want to see (SUGGESTION) |
There was a problem hiding this comment.
Consider restarting list numbering for the soft rules section.
The static analysis tool flagged that items 7-10 should restart from 1 after the new "Logging coverage suggestions" header, per MD029. However, if the cross-section numbering (1-6 for hard rules, 7-10 for soft rules) is intentional for easier reference in triage discussions, you may suppress this warning. Otherwise, consider restarting from 1.
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 111-111: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/2/3
(MD029, ol-prefix)
[warning] 112-112: Ordered list item prefix
Expected: 2; Actual: 8; Style: 1/2/3
(MD029, ol-prefix)
[warning] 113-113: Ordered list item prefix
Expected: 3; Actual: 9; Style: 1/2/3
(MD029, ol-prefix)
[warning] 114-114: Ordered list item prefix
Expected: 4; Actual: 10; Style: 1/2/3
(MD029, ol-prefix)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.claude/skills/aurelio-review-pr/skill.md around lines 107 - 114, The
"Logging coverage suggestions" subsection continues numbering at 7-10 which
triggers MD029; update the list under the "Logging coverage suggestions (soft
rules — mark as SUGGESTION, must be validated by user in triage):" header so
numbering restarts at 1 (i.e., change items labeled 7–10 to 1–4), or explicitly
suppress the MD029 warning if the cross-section numbering is intentional for
easier triage reference; locate the header text "Logging coverage suggestions"
and the following list items to make the change and ensure consistency in the
document.
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive model routing engine with multiple strategies, a model resolver for efficient lookups, and extensive test coverage. While the implementation demonstrates high quality and good design principles, a critical security vulnerability was identified: budget constraints specified in the RoutingRequest can be bypassed by manual overrides and routing rules across all strategies. This could lead to unauthorized financial costs if users can influence routing signals such as model_override or task_type. It is crucial to update the routing logic to enforce budget limits as a hard constraint across all selection paths. Additionally, consider improving the fallback logic to prevent redundant model lookups, which will enhance efficiency and clarity.
| def select( | ||
| self, | ||
| request: RoutingRequest, | ||
| config: RoutingConfig, # noqa: ARG002 | ||
| resolver: ModelResolver, | ||
| ) -> RoutingDecision: | ||
| """Select the explicitly requested model. | ||
|
|
||
| Raises: | ||
| ModelResolutionError: If ``model_override`` is not set or | ||
| the model cannot be resolved. | ||
| """ | ||
| if request.model_override is None: | ||
| logger.warning( | ||
| ROUTING_NO_RULE_MATCHED, | ||
| strategy=self.name, | ||
| reason="model_override not set", | ||
| ) | ||
| msg = "ManualStrategy requires model_override to be set" | ||
| raise ModelResolutionError(msg) | ||
|
|
||
| model = resolver.resolve(request.model_override) | ||
| return RoutingDecision( | ||
| resolved_model=model, | ||
| strategy_used=self.name, | ||
| reason=f"Explicit override: {request.model_override}", | ||
| ) | ||
|
|
||
|
|
||
| # ── Strategy 2: Role-Based ─────────────────────────────────────── | ||
|
|
||
|
|
||
| class RoleBasedStrategy: | ||
| """Select model based on agent seniority level. | ||
|
|
||
| Matches the first routing rule where ``rule.role_level`` equals | ||
| ``request.agent_level``. If no rule matches, uses the seniority | ||
| catalog's ``typical_model_tier`` as a fallback lookup. | ||
| """ | ||
|
|
||
| @property | ||
| def name(self) -> str: | ||
| """Return strategy name.""" | ||
| return "role_based" | ||
|
|
||
| def select( | ||
| self, | ||
| request: RoutingRequest, | ||
| config: RoutingConfig, | ||
| resolver: ModelResolver, | ||
| ) -> RoutingDecision: | ||
| """Select model based on role level. | ||
|
|
||
| Raises: | ||
| ModelResolutionError: If no agent_level is set. | ||
| NoAvailableModelError: If all candidates are exhausted. | ||
| """ | ||
| if request.agent_level is None: | ||
| logger.warning( | ||
| ROUTING_NO_RULE_MATCHED, | ||
| strategy=self.name, | ||
| reason="agent_level not set", | ||
| ) | ||
| msg = "RoleBasedStrategy requires agent_level to be set" | ||
| raise ModelResolutionError(msg) | ||
|
|
||
| # Try matching rules | ||
| for rule in config.rules: | ||
| if rule.role_level == request.agent_level: | ||
| model, tried = _try_resolve_with_fallback( | ||
| rule.preferred_model, | ||
| rule, | ||
| config, | ||
| resolver, | ||
| ) | ||
| return RoutingDecision( | ||
| resolved_model=model, | ||
| strategy_used=self.name, | ||
| reason=( | ||
| f"Role rule match: " | ||
| f"level={request.agent_level.value}" | ||
| f", model={model.model_id}" | ||
| ), | ||
| fallbacks_tried=tried, | ||
| ) | ||
|
|
||
| # No rule matched — use seniority default | ||
| logger.debug( | ||
| ROUTING_NO_RULE_MATCHED, | ||
| level=request.agent_level.value, | ||
| strategy=self.name, | ||
| ) | ||
| tier = get_seniority_info( | ||
| request.agent_level, | ||
| ).typical_model_tier | ||
| result = _try_resolve_with_fallback_safe( | ||
| tier, | ||
| None, | ||
| config, | ||
| resolver, | ||
| ) | ||
| if result is not None: | ||
| model, tried = result | ||
| return RoutingDecision( | ||
| resolved_model=model, | ||
| strategy_used=self.name, | ||
| reason=( | ||
| f"Seniority default: level={request.agent_level.value}, tier={tier}" | ||
| ), | ||
| fallbacks_tried=tried, | ||
| ) | ||
|
|
||
| if config.fallback_chain: | ||
| chain_detail = f"fallback chain exhausted: {list(config.fallback_chain)}" | ||
| else: | ||
| chain_detail = "no fallback chain configured" | ||
| msg = ( | ||
| f"No model available for " | ||
| f"level={request.agent_level.value} " | ||
| f"(tier={tier}, no rules matched, {chain_detail})" | ||
| ) | ||
| logger.warning( | ||
| ROUTING_FALLBACK_EXHAUSTED, | ||
| level=request.agent_level.value, | ||
| tier=tier, | ||
| strategy=self.name, | ||
| ) | ||
| raise NoAvailableModelError(msg) | ||
|
|
||
|
|
||
| # ── Strategy 3: Cost-Aware ─────────────────────────────────────── | ||
|
|
||
|
|
||
| class CostAwareStrategy: | ||
| """Select the cheapest model, optionally respecting a budget. | ||
|
|
||
| Matches ``task_type`` rules first, then falls back to the cheapest | ||
| model from the resolver. | ||
| """ | ||
|
|
||
| @property | ||
| def name(self) -> str: | ||
| """Return strategy name.""" | ||
| return "cost_aware" | ||
|
|
||
| def select( | ||
| self, | ||
| request: RoutingRequest, | ||
| config: RoutingConfig, | ||
| resolver: ModelResolver, | ||
| ) -> RoutingDecision: | ||
| """Select the cheapest available model. | ||
|
|
||
| Raises: | ||
| NoAvailableModelError: If no models are registered. | ||
| """ | ||
| decision = _try_task_type_rules( | ||
| request, | ||
| config, | ||
| resolver, | ||
| self.name, | ||
| ) | ||
| if decision is not None: | ||
| return decision | ||
|
|
||
| # Pick cheapest | ||
| model, budget_exceeded = _cheapest_within_budget( | ||
| resolver, | ||
| request.remaining_budget, | ||
| ) | ||
| reason = f"Cheapest available: {model.model_id}" | ||
| if budget_exceeded: | ||
| reason += " (all models exceed remaining budget)" | ||
| return RoutingDecision( | ||
| resolved_model=model, | ||
| strategy_used=self.name, | ||
| reason=reason, | ||
| ) | ||
|
|
||
|
|
||
| # ── Strategy 4: Smart ──────────────────────────────────────────── | ||
|
|
||
|
|
||
| class SmartStrategy: | ||
| """Combined strategy with priority-based signal merging. | ||
|
|
||
| Priority order: model_override > task_type rules > role_level | ||
| rules > seniority default > global fallback_chain > cheapest | ||
| available. | ||
| """ | ||
|
|
||
| @property | ||
| def name(self) -> str: | ||
| """Return strategy name.""" | ||
| return "smart" | ||
|
|
||
| def select( | ||
| self, | ||
| request: RoutingRequest, | ||
| config: RoutingConfig, | ||
| resolver: ModelResolver, | ||
| ) -> RoutingDecision: | ||
| """Select a model using all available signals. | ||
|
|
||
| Raises: | ||
| NoAvailableModelError: If all candidates are exhausted. | ||
| """ | ||
| return ( | ||
| self._try_override(request, resolver) | ||
| or _try_task_type_rules( | ||
| request, | ||
| config, | ||
| resolver, | ||
| self.name, | ||
| ) | ||
| or _try_role_rules( | ||
| request, | ||
| config, | ||
| resolver, | ||
| self.name, | ||
| ) | ||
| or _try_seniority_default( | ||
| request, | ||
| resolver, | ||
| self.name, | ||
| ) | ||
| or self._try_global_chain(config, resolver) | ||
| or self._try_cheapest(request, resolver) | ||
| or self._raise_exhausted() | ||
| ) | ||
|
|
||
| def _try_override( | ||
| self, | ||
| request: RoutingRequest, | ||
| resolver: ModelResolver, | ||
| ) -> RoutingDecision | None: | ||
| if request.model_override is None: | ||
| return None | ||
| model = resolver.resolve_safe(request.model_override) | ||
| if model is None: | ||
| logger.warning( | ||
| ROUTING_MODEL_RESOLUTION_FAILED, | ||
| ref=request.model_override, | ||
| source="smart_override", | ||
| ) | ||
| return None | ||
| return RoutingDecision( | ||
| resolved_model=model, | ||
| strategy_used=self.name, | ||
| reason=f"Explicit override: {request.model_override}", | ||
| ) | ||
|
|
||
| def _try_cheapest( | ||
| self, | ||
| request: RoutingRequest, | ||
| resolver: ModelResolver, | ||
| ) -> RoutingDecision | None: | ||
| if not resolver.all_models_sorted_by_cost(): | ||
| return None | ||
| model, budget_exceeded = _cheapest_within_budget( | ||
| resolver, | ||
| request.remaining_budget, | ||
| ) | ||
| reason = f"Cheapest available: {model.model_id}" | ||
| if budget_exceeded: | ||
| reason += " (all models exceed remaining budget)" | ||
| return RoutingDecision( | ||
| resolved_model=model, | ||
| strategy_used=self.name, | ||
| reason=reason, | ||
| ) | ||
|
|
||
| def _try_global_chain( | ||
| self, | ||
| config: RoutingConfig, | ||
| resolver: ModelResolver, | ||
| ) -> RoutingDecision | None: | ||
| chain_result = _walk_fallback_chain(config, resolver) | ||
| if chain_result is None: | ||
| return None | ||
| model, tried = chain_result | ||
| return RoutingDecision( | ||
| resolved_model=model, | ||
| strategy_used=self.name, | ||
| reason="Global fallback chain", | ||
| fallbacks_tried=tried, | ||
| ) | ||
|
|
||
| def _raise_exhausted(self) -> NoReturn: | ||
| logger.warning( | ||
| ROUTING_FALLBACK_EXHAUSTED, | ||
| strategy="smart", | ||
| reason="all signals exhausted", | ||
| ) | ||
| msg = "SmartStrategy: no model available from any signal" | ||
| raise NoAvailableModelError( | ||
| msg, | ||
| context={"strategy": "smart"}, | ||
| ) | ||
|
|
There was a problem hiding this comment.
The model routing engine implements several strategies (ManualStrategy, RoleBasedStrategy, CostAwareStrategy, and SmartStrategy) that prioritize manual overrides and routing rules over budget constraints.
Specifically:
ManualStrategy(lines 317-343) selects the explicitly requested model without any budget check.RoleBasedStrategy(lines 362-444) selects models based on seniority rules or defaults without checking theremaining_budget.CostAwareStrategy(lines 450-495) prioritizestask_typerules (which don't check budget) over the cheapest-model selection.SmartStrategy(lines 500-616) prioritizes overrides, task rules, role rules, and seniority defaults—all of which bypass budget checks—before finally falling back to a budget-aware cheapest-model selection.
Furthermore, even the budget-aware helper _cheapest_within_budget (lines 171-209) is designed to return the cheapest available model even if it exceeds the remaining_budget, rather than raising an error to enforce the limit.
This design allows users to bypass financial controls by triggering rules or providing overrides, potentially leading to significant unauthorized costs.
Recommendation: Modify the select method of each strategy to validate the cost of the chosen model against request.remaining_budget before returning a RoutingDecision. If the cost exceeds the budget, the strategy should either attempt to find a cheaper alternative that fits the budget or raise a NoAvailableModelError. The _cheapest_within_budget helper should also be updated to enforce a hard limit when a budget is provided.
| def _try_resolve_with_fallback( | ||
| ref: str, | ||
| rule: RoutingRuleConfig | None, | ||
| config: RoutingConfig, | ||
| resolver: ModelResolver, | ||
| ) -> tuple[ResolvedModel, tuple[str, ...]]: | ||
| """Try to resolve *ref*, then rule fallback, then global chain. | ||
|
|
||
| Returns: | ||
| Tuple of (resolved_model, fallbacks_tried). | ||
|
|
||
| Raises: | ||
| NoAvailableModelError: If all candidates are exhausted. | ||
| """ | ||
| tried: list[str] = [] | ||
|
|
||
| # Try the primary ref | ||
| model = resolver.resolve_safe(ref) | ||
| if model is not None: | ||
| return model, tuple(tried) | ||
|
|
||
| tried.append(ref) | ||
| logger.debug(ROUTING_FALLBACK_ATTEMPTED, ref=ref, source="primary") | ||
|
|
||
| # Try rule-level fallback | ||
| if rule is not None and rule.fallback is not None: | ||
| model = resolver.resolve_safe(rule.fallback) | ||
| if model is not None: | ||
| return model, tuple(tried) | ||
| tried.append(rule.fallback) | ||
| logger.debug( | ||
| ROUTING_FALLBACK_ATTEMPTED, | ||
| ref=rule.fallback, | ||
| source="rule_fallback", | ||
| ) | ||
|
|
||
| # Walk global fallback chain | ||
| for chain_ref in config.fallback_chain: | ||
| model = resolver.resolve_safe(chain_ref) | ||
| if model is not None: | ||
| return model, tuple(tried) | ||
| tried.append(chain_ref) | ||
| logger.debug( | ||
| ROUTING_FALLBACK_ATTEMPTED, | ||
| ref=chain_ref, | ||
| source="global_chain", | ||
| ) | ||
|
|
||
| logger.warning(ROUTING_FALLBACK_EXHAUSTED, tried=tried) | ||
| msg = f"All model candidates exhausted: {tried}" | ||
| raise NoAvailableModelError(msg, context={"tried": tried}) |
There was a problem hiding this comment.
The current implementation of fallback logic may perform redundant model lookups if the same model reference appears in multiple places (e.g., as a primary reference and also in the global fallback chain). This is inefficient and can lead to confusing logs with duplicate ROUTING_FALLBACK_ATTEMPTED events and duplicate entries in the tried list of failed attempts.
To prevent this, you should check if a model reference has already been tried before attempting to resolve it again. This makes the fallback logic more robust and efficient.
def _try_resolve_with_fallback(
ref: str,
rule: RoutingRuleConfig | None,
config: RoutingConfig,
resolver: ModelResolver,
) -> tuple[ResolvedModel, tuple[str, ...]]:
"""Try to resolve *ref*, then rule fallback, then global chain.
Returns:
Tuple of (resolved_model, fallbacks_tried).
Raises:
NoAvailableModelError: If all candidates are exhausted.
"""
tried: list[str] = []
# Try the primary ref
model = resolver.resolve_safe(ref)
if model is not None:
return model, tuple(tried)
tried.append(ref)
logger.debug(ROUTING_FALLBACK_ATTEMPTED, ref=ref, source="primary")
# Try rule-level fallback
if rule is not None and rule.fallback is not None:
if rule.fallback not in tried:
model = resolver.resolve_safe(rule.fallback)
if model is not None:
return model, tuple(tried)
tried.append(rule.fallback)
logger.debug(
ROUTING_FALLBACK_ATTEMPTED,
ref=rule.fallback,
source="rule_fallback",
)
# Walk global fallback chain
for chain_ref in config.fallback_chain:
if chain_ref not in tried:
model = resolver.resolve_safe(chain_ref)
if model is not None:
return model, tuple(tried)
tried.append(chain_ref)
logger.debug(
ROUTING_FALLBACK_ATTEMPTED,
ref=chain_ref,
source="global_chain",
)
logger.warning(ROUTING_FALLBACK_EXHAUSTED, tried=tried)
msg = f"All model candidates exhausted: {tried}"
raise NoAvailableModelError(msg, context={"tried": tried})…bbit, and Gemini - Log routing failures in route() before re-raising (CodeRabbit) - Clarify remaining_budget as per-request cost ceiling, not total USD budget - Add total_cost_per_1k property to ResolvedModel, use in 3+ call sites - Update docstring examples to vendor-agnostic (no real provider names) - Extract strategy name constants to prevent drift in STRATEGY_MAP - Dedup tried refs in _try_resolve_with_fallback to avoid redundant lookups - Log empty fallback chains in _walk_fallback_chain for diagnostics - Add debug logs for task_type and role rule miss paths - Document SmartStrategy._try_override soft-fail design - Fix _try_cheapest to use all_models() instead of double-sort - Add "cheapest" alias inline comment in STRATEGY_MAP - Tighten collision detection in ModelResolver (compare full object) - Add RoutingRuleConfig model_validator: at least one matcher required - Add warning log before LookupError in get_seniority_info - Add 10 new tests: collision detection, override soft-fail, 3-stage fallback, mid-range budget, blank validation for RoutingRequest fields
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive and well-designed model routing engine. The implementation is robust, featuring multiple routing strategies, a fast model resolver using immutable data structures, and thorough validation. The code is well-structured, with excellent separation of concerns, extensive structured logging, and a solid test suite that covers numerous edge cases. Decoupling model tiers from vendor-specific names is a significant improvement for maintainability. I have one suggestion for a minor refactoring in the SmartStrategy to improve its internal logic, but overall, this is a high-quality contribution.
| return ( | ||
| self._try_override(request, resolver) | ||
| or _try_task_type_rules( | ||
| request, | ||
| config, | ||
| resolver, | ||
| self.name, | ||
| ) | ||
| or _try_role_rules( | ||
| request, | ||
| config, | ||
| resolver, | ||
| self.name, | ||
| ) | ||
| or _try_seniority_default( | ||
| request, | ||
| resolver, | ||
| self.name, | ||
| ) | ||
| or self._try_global_chain(config, resolver) | ||
| or self._try_cheapest(request, resolver) | ||
| or self._raise_exhausted() | ||
| ) |
There was a problem hiding this comment.
The current implementation of SmartStrategy.select has some logical redundancy that could make it harder to maintain. Specifically, the global fallback chain can be walked multiple times during a single select call.
For example:
_try_task_type_rulesis called. If it matches a rule, it internally calls_try_resolve_with_fallback_safe, which walks the global fallback chain.- If that fails to find a model, the
orchain continues. - Eventually,
self._try_global_chain(config, resolver)is called, which may walk the same global fallback chain again.
While this is a minor inefficiency, it points to a conceptual complexity where candidate selection (e.g., finding a rule) is mixed with resolution logic (trying the ref and all its fallbacks).
A potential refactoring could separate these concerns. The select method could first identify a prioritized list of candidate model references from all available signals (override, task rule, role rule, seniority default), and then orchestrate the resolution with fallbacks centrally. This would eliminate the redundancy and make the priority logic in SmartStrategy more explicit and easier to follow.
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/config/schema.py`:
- Around line 129-134: The validation in _at_least_one_matcher currently raises
a ValueError without logging; update it to log the failure at WARNING or ERROR
level with contextual details (include the message "Routing rule must specify at
least role_level or task_type" plus relevant self fields such as the rule
id/name, role_level, and task_type) immediately before raising. Use the
module/class logger (create a module-level logger via
logging.getLogger(__name__) if one doesn't exist) and call logger.warning or
logger.error inside _at_least_one_matcher, then raise the ValueError as before
so the exception path is logged with context.
In `@src/ai_company/providers/routing/resolver.py`:
- Around line 57-115: The from_config classmethod on ModelResolver is too large
and should be split into focused helpers: extract a helper to build a
ResolvedModel from (provider_name, model_config) (referencing ResolvedModel and
model_config fields: id, alias, cost_per_1k_input, cost_per_1k_output,
max_context), extract a helper to index a resolved model into the index dict
(handling duplicate detection and raising ModelResolutionError and logging
ROUTING_MODEL_RESOLUTION_FAILED with the same context), and extract a small
helper to compute the final metrics for the logger call (model_count/ref_count).
Replace the current inline loops in from_config with calls to these helpers and
keep from_config as an orchestration layer that builds the index and returns
cls(index).
In `@src/ai_company/providers/routing/router.py`:
- Around line 116-127: The RoutingError except block currently swallows the
exception; change the handler to capture the exception (e.g., except
RoutingError as err) and include its type/message in the warning log for
ROUTING_SELECTION_FAILED; update the logger.warning call near
ROUTING_SELECTION_FAILED (keeping strategy=self._strategy.name and request
fields) to add error details (err or str(err)) or pass exc_info=err so logs
include the exception metadata for easier diagnosis.
In `@src/ai_company/providers/routing/strategies.py`:
- Around line 80-137: The _try_resolve_with_fallback function is overlong; split
its three resolution stages into small helpers to keep functions under 50 lines:
create helper functions like _resolve_primary(ref, resolver, tried, seen),
_resolve_rule_fallback(rule, resolver, tried, seen) and
_resolve_global_chain(config, resolver, tried, seen) that each attempt
resolution with resolver.resolve_safe, update tried/seen, emit
ROUTING_FALLBACK_ATTEMPTED logs, and return a ResolvedModel or None; keep
NoAvailableModelError raising and ROUTING_FALLBACK_EXHAUSTED logging in the
original _try_resolve_with_fallback which now orchestrates the helpers and
preserves types (RoutingRuleConfig, RoutingConfig, ModelResolver) and the final
error context {"tried": tried}.
- Around line 394-477: The select method in RoleBasedStrategy is too large;
split its responsibilities into small helpers: extract validation into a
_validate_request(request) that checks request.agent_level and raises
ModelResolutionError with the same logging (ROUTING_NO_RULE_MATCHED, logger),
extract rule matching into _select_by_rule(request, config, resolver) that
iterates config.rules and calls _try_resolve_with_fallback(rule.preferred_model,
rule, config, resolver) and returns a RoutingDecision when found, extract
seniority fallback into _select_by_seniority(request, config, resolver) that
calls get_seniority_info(...).typical_model_tier and
_try_resolve_with_fallback_safe(tier, None, config, resolver) and returns a
RoutingDecision when found, and extract the final error path into
_raise_no_available(request, config, tier) which logs ROUTING_FALLBACK_EXHAUSTED
and raises NoAvailableModelError with the same constructed message; then rewrite
RoleBasedStrategy.select to call these helpers in sequence (validate ->
_select_by_rule -> _select_by_seniority -> _raise_no_available) so each function
stays small and the main select is under 50 lines.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (14)
src/ai_company/config/schema.pysrc/ai_company/core/role_catalog.pysrc/ai_company/observability/events.pysrc/ai_company/providers/__init__.pysrc/ai_company/providers/routing/__init__.pysrc/ai_company/providers/routing/models.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/strategies.pytests/unit/config/conftest.pytests/unit/config/test_schema.pytests/unit/providers/routing/test_models.pytests/unit/providers/routing/test_resolver.pytests/unit/providers/routing/test_strategies.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not usefrom __future__ import annotations— Python 3.14 has native lazy annotations via PEP 649
Use PEP 758 except syntax:except A, B:(no parentheses) — ruff enforces this on Python 3.14
All public functions must include type hints; enforce via mypy strict mode
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Use Pydantic v2 models withBaseModel,model_validator, andConfigDict
Enforce 88-character line length (enforced by ruff)
Keep functions under 50 lines; keep files under 800 lines
Files:
src/ai_company/providers/routing/strategies.pysrc/ai_company/providers/routing/models.pysrc/ai_company/config/schema.pysrc/ai_company/observability/events.pytests/unit/providers/routing/test_strategies.pysrc/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/core/role_catalog.pytests/unit/providers/routing/test_models.pysrc/ai_company/providers/routing/__init__.pytests/unit/config/conftest.pysrc/ai_company/providers/__init__.pytests/unit/config/test_schema.pytests/unit/providers/routing/test_resolver.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic must import:from ai_company.observability import get_loggerthenlogger = get_logger(__name__)
Never useimport loggingorlogging.getLogger()orprint()in application code — use the ai_company logger
Always useloggeras the variable name (not_loggerorlog)
Always use event name constants fromai_company.observability.eventsin logger calls
Use structured logging format:logger.info(EVENT, key=value)— never use format strings likelogger.info("msg %s", val)
All error paths must log at WARNING or ERROR level with context before raising an exception
All state transitions must log at INFO level
Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions
Files:
src/ai_company/providers/routing/strategies.pysrc/ai_company/providers/routing/models.pysrc/ai_company/config/schema.pysrc/ai_company/observability/events.pysrc/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/core/role_catalog.pysrc/ai_company/providers/routing/__init__.pysrc/ai_company/providers/__init__.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark all test cases with pytest markers:@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e, or@pytest.mark.slow
Configure pytest async mode asasyncio_mode = "auto"— do not manually add@pytest.mark.asyncioto test functions
Files:
tests/unit/providers/routing/test_strategies.pytests/unit/providers/routing/test_models.pytests/unit/config/conftest.pytests/unit/config/test_schema.pytests/unit/providers/routing/test_resolver.py
🧠 Learnings (5)
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to **/*.py : Do not use `from __future__ import annotations` — Python 3.14 has native lazy annotations via PEP 649
Applied to files:
src/ai_company/providers/routing/strategies.pysrc/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/resolver.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : NEVER add `from __future__ import annotations` - Python 3.14+ has native support for deferred annotation evaluation and does not require this import
Applied to files:
src/ai_company/providers/routing/strategies.pysrc/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/resolver.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` in logger calls
Applied to files:
src/ai_company/observability/events.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising an exception
Applied to files:
src/ai_company/providers/routing/router.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions
Applied to files:
src/ai_company/providers/routing/router.py
🧬 Code graph analysis (8)
src/ai_company/providers/routing/models.py (1)
src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)
tests/unit/providers/routing/test_strategies.py (5)
src/ai_company/config/schema.py (3)
ProviderConfig(52-95)RoutingConfig(137-159)RoutingRuleConfig(98-134)src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)src/ai_company/providers/routing/errors.py (2)
ModelResolutionError(16-19)NoAvailableModelError(22-25)src/ai_company/providers/routing/models.py (1)
RoutingRequest(48-83)src/ai_company/providers/routing/resolver.py (2)
ModelResolver(28-181)from_config(57-115)
src/ai_company/providers/routing/router.py (5)
src/ai_company/providers/routing/errors.py (2)
RoutingError(10-13)UnknownStrategyError(28-31)tests/unit/providers/routing/conftest.py (1)
resolver(97-101)src/ai_company/providers/routing/resolver.py (2)
ModelResolver(28-181)from_config(57-115)src/ai_company/config/schema.py (1)
ProviderConfig(52-95)src/ai_company/providers/routing/models.py (2)
RoutingDecision(86-104)RoutingRequest(48-83)
src/ai_company/providers/routing/resolver.py (4)
src/ai_company/observability/_logger.py (1)
get_logger(8-28)src/ai_company/providers/routing/errors.py (1)
ModelResolutionError(16-19)src/ai_company/providers/routing/models.py (2)
ResolvedModel(9-45)total_cost_per_1k(43-45)src/ai_company/config/schema.py (1)
ProviderConfig(52-95)
src/ai_company/core/role_catalog.py (2)
src/ai_company/core/enums.py (2)
CostTier(57-69)SeniorityLevel(6-21)src/ai_company/core/role.py (1)
SeniorityInfo(87-120)
tests/unit/providers/routing/test_models.py (3)
src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)src/ai_company/providers/routing/models.py (3)
ResolvedModel(9-45)RoutingDecision(86-104)RoutingRequest(48-83)tests/unit/providers/routing/conftest.py (3)
ResolvedModelFactory(25-34)RoutingDecisionFactory(47-54)RoutingRequestFactory(37-44)
src/ai_company/providers/routing/__init__.py (4)
src/ai_company/providers/routing/errors.py (4)
ModelResolutionError(16-19)NoAvailableModelError(22-25)RoutingError(10-13)UnknownStrategyError(28-31)src/ai_company/providers/routing/models.py (3)
ResolvedModel(9-45)RoutingDecision(86-104)RoutingRequest(48-83)src/ai_company/providers/routing/resolver.py (1)
ModelResolver(28-181)src/ai_company/providers/routing/strategies.py (5)
CostAwareStrategy(482-526)ManualStrategy(338-375)RoleBasedStrategy(381-476)RoutingStrategy(46-74)SmartStrategy(532-654)
tests/unit/config/test_schema.py (2)
src/ai_company/config/schema.py (1)
RoutingRuleConfig(98-134)src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)
🔇 Additional comments (14)
src/ai_company/core/role_catalog.py (2)
25-67: Vendor-agnostic tier mapping is consistent and clean.The
typical_model_tierupdates are consistent across all seniority levels and align with the routing decoupling objective.
443-447: Good observability on the miss/error path.Adding a structured
WARNINGlog withROLE_LOOKUP_MISSbefore raisingLookupErroris the right behavior for diagnosability.As per coding guidelines, "All error paths must log at WARNING or ERROR level with context before raising an exception."
src/ai_company/config/schema.py (1)
101-103: Matcher requirement documentation is clear and actionable.This makes the rule contract explicit and easier to reason about in config/tests.
tests/unit/config/conftest.py (1)
42-44: Factory default update is appropriate for new routing-rule validation.Setting
task_typehere keeps generated test objects valid by default.src/ai_company/observability/events.py (1)
79-91: Routing event constant set is comprehensive and consistent.These additions improve observability coverage across routing lifecycle/error paths.
tests/unit/config/test_schema.py (1)
136-154: Test updates correctly reflect the new routing-rule matcher contract.Coverage now includes both valid minimal shapes and the no-matcher rejection path.
Also applies to: 195-200, 357-362, 377-377, 421-427, 436-442
src/ai_company/providers/routing/resolver.py (1)
52-54: Immutable index wrapping is a solid safety choice.Using
MappingProxyTypeprevents accidental mutation of resolver state.tests/unit/providers/routing/test_models.py (1)
18-141: Domain model test coverage is strong and well-targeted.The suite validates defaults, invariants, and frozen-model behavior across all routing models.
src/ai_company/providers/routing/router.py (1)
79-85: Router build logging includes the right operational context.The emitted payload is useful for startup diagnostics and config verification.
src/ai_company/providers/routing/__init__.py (1)
7-51: Routing package exports are clean and complete.
__all__is explicit and aligns with the intended public routing API.tests/unit/providers/routing/test_strategies.py (1)
26-470: Comprehensive strategy coverage looks solid.Great depth here across protocol conformance, fallback chains, soft-fail behavior, and budget-aware paths.
src/ai_company/providers/__init__.py (1)
37-109: Routing re-exports are clean and consistent.The import/export surface is coherent and the new symbols are correctly exposed via
__all__.tests/unit/providers/routing/test_resolver.py (1)
12-143: Resolver tests cover the critical invariants well.Good checks for collision detection, immutability, and cost ordering semantics.
src/ai_company/providers/routing/models.py (1)
9-104: Routing domain models are well-structured and validated.Immutability + field constraints + clear typed contracts look good for strategy/resolver integration.
| def from_config( | ||
| cls, | ||
| providers: dict[str, ProviderConfig], | ||
| ) -> ModelResolver: | ||
| """Build a resolver from a provider config dict. | ||
|
|
||
| Args: | ||
| providers: Provider config dict (key = provider name). | ||
|
|
||
| Returns: | ||
| A new ``ModelResolver`` with all models indexed. | ||
| """ | ||
| index: dict[str, ResolvedModel] = {} | ||
|
|
||
| for provider_name, provider_config in providers.items(): | ||
| for model_config in provider_config.models: | ||
| resolved = ResolvedModel( | ||
| provider_name=provider_name, | ||
| model_id=model_config.id, | ||
| alias=model_config.alias, | ||
| cost_per_1k_input=model_config.cost_per_1k_input, | ||
| cost_per_1k_output=model_config.cost_per_1k_output, | ||
| max_context=model_config.max_context, | ||
| ) | ||
| for ref in (model_config.id, model_config.alias): | ||
| if ref is None: | ||
| continue | ||
| existing = index.get(ref) | ||
| if existing is not None and existing != resolved: | ||
| logger.error( | ||
| ROUTING_MODEL_RESOLUTION_FAILED, | ||
| ref=ref, | ||
| existing_provider=existing.provider_name, | ||
| existing_model_id=existing.model_id, | ||
| new_provider=provider_name, | ||
| new_model_id=resolved.model_id, | ||
| ) | ||
| msg = ( | ||
| f"Duplicate model reference {ref!r}: " | ||
| f"{existing.provider_name}/{existing.model_id} " | ||
| f"vs {provider_name}/{resolved.model_id}" | ||
| ) | ||
| raise ModelResolutionError( | ||
| msg, | ||
| context={ | ||
| "ref": ref, | ||
| "existing_provider": existing.provider_name, | ||
| "new_provider": provider_name, | ||
| }, | ||
| ) | ||
| index[ref] = resolved | ||
|
|
||
| logger.info( | ||
| ROUTING_RESOLVER_BUILT, | ||
| model_count=len({m.model_id for m in index.values()}), | ||
| ref_count=len(index), | ||
| providers=sorted(providers), | ||
| ) | ||
| return cls(index) |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Refactor from_config into smaller helpers to satisfy function-size rule.
This method currently combines multiple concerns and exceeds the 50-line limit.
♻️ Proposed refactor direction
class ModelResolver:
+ `@staticmethod`
+ def _build_resolved_model(
+ provider_name: str,
+ model_config: ProviderModelConfig,
+ ) -> ResolvedModel:
+ return ResolvedModel(
+ provider_name=provider_name,
+ model_id=model_config.id,
+ alias=model_config.alias,
+ cost_per_1k_input=model_config.cost_per_1k_input,
+ cost_per_1k_output=model_config.cost_per_1k_output,
+ max_context=model_config.max_context,
+ )
+
+ `@staticmethod`
+ def _index_ref(
+ index: dict[str, ResolvedModel],
+ ref: str,
+ resolved: ResolvedModel,
+ provider_name: str,
+ ) -> None:
+ existing = index.get(ref)
+ if existing is not None and existing != resolved:
+ logger.error(
+ ROUTING_MODEL_RESOLUTION_FAILED,
+ ref=ref,
+ existing_provider=existing.provider_name,
+ existing_model_id=existing.model_id,
+ new_provider=provider_name,
+ new_model_id=resolved.model_id,
+ )
+ msg = (
+ f"Duplicate model reference {ref!r}: "
+ f"{existing.provider_name}/{existing.model_id} "
+ f"vs {provider_name}/{resolved.model_id}"
+ )
+ raise ModelResolutionError(msg, context={"ref": ref})
+ index[ref] = resolved
@@
def from_config(
cls,
providers: dict[str, ProviderConfig],
) -> ModelResolver:
index: dict[str, ResolvedModel] = {}
for provider_name, provider_config in providers.items():
for model_config in provider_config.models:
- resolved = ResolvedModel(...)
+ resolved = cls._build_resolved_model(provider_name, model_config)
for ref in (model_config.id, model_config.alias):
if ref is None:
continue
- ...
+ cls._index_ref(index, ref, resolved, provider_name)As per coding guidelines: "Keep functions under 50 lines; keep files under 800 lines."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/providers/routing/resolver.py` around lines 57 - 115, The
from_config classmethod on ModelResolver is too large and should be split into
focused helpers: extract a helper to build a ResolvedModel from (provider_name,
model_config) (referencing ResolvedModel and model_config fields: id, alias,
cost_per_1k_input, cost_per_1k_output, max_context), extract a helper to index a
resolved model into the index dict (handling duplicate detection and raising
ModelResolutionError and logging ROUTING_MODEL_RESOLUTION_FAILED with the same
context), and extract a small helper to compute the final metrics for the logger
call (model_count/ref_count). Replace the current inline loops in from_config
with calls to these helpers and keep from_config as an orchestration layer that
builds the index and returns cls(index).
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 23 out of 24 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| SENIORITY_INFO: tuple[SeniorityInfo, ...] = ( | ||
| SeniorityInfo( | ||
| level=SeniorityLevel.JUNIOR, | ||
| authority_scope="Execute assigned tasks only", | ||
| typical_model_tier="haiku", | ||
| typical_model_tier="small", | ||
| cost_tier=CostTier.LOW, | ||
| ), |
There was a problem hiding this comment.
typical_model_tier values were changed to vendor-agnostic aliases (small/medium/large), but built-in templates + template rendering logic still default to/use haiku/sonnet/opus (e.g. src/ai_company/templates/renderer.py default "sonnet" and multiple templates/builtins/*.yaml). This will likely make template-generated configs fail to resolve models unless users keep legacy aliases; either update templates/renderer to the new tier names or add backward-compatible alias mapping (haiku→small, sonnet→medium, opus→large).
…and templates - Add allow_inf_nan=False to ResolvedModel and ProviderModelConfig - Add logging before raise in RoutingRuleConfig._at_least_one_matcher - Add strategy-required-fields docstring to RoutingRequest - Update all template tiers from vendor-specific (opus/sonnet/haiku) to generic (large/medium/small) across schema, renderer, and 7 YAML builtins - Add tests for total_cost_per_1k, inf/nan rejection, fallback chain skipping unresolvable entries, exhausted non-empty chain, rule fallback dedup, and routing failure warning log - Rename misleading test_budget_filters_models to test_tight_budget_picks_cheapest - Add noqa comments for intentional runtime imports (PEP 649 compatibility)
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/config/schema.py`:
- Around line 137-141: The validation-failure warning for RoutingRuleConfig
currently logs CONFIG_VALIDATION_FAILED with model and error only; update the
logger.warning call (the one that emits CONFIG_VALIDATION_FAILED in
src/ai_company/config/schema.py, around the RoutingRuleConfig validation path)
to include the routing rule context keys role_level, task_type, preferred_model,
and fallback in the payload (pass them as keyword args alongside
model="RoutingRuleConfig" and error=msg) so the warning contains those fields
for easier triage.
In `@src/ai_company/providers/routing/models.py`:
- Around line 69-90: The remaining_budget field on the RoutingRequest model
currently allows NaN/Infinity which breaks deterministic cost comparisons;
update the Field(...) for remaining_budget in the RoutingRequest class to
disallow non-finite floats by adding allow_inf_nan=False (i.e.,
remaining_budget: float | None = Field(default=None, ge=0.0,
allow_inf_nan=False, description=...)); ensure the same semantics as
ResolvedModel so budget filtering remains reliable.
In `@src/ai_company/providers/routing/strategies.py`:
- Around line 562-569: The code returns immediately when
_try_task_type_rules(...) yields a decision, which skips budget checks and
allows models that exceed remaining_budget; update the flow in the method
containing the call to _try_task_type_rules so that when a non-None decision is
produced you still validate it against remaining_budget (or run the same
budget-filtering logic used in the cost-aware branch) before returning—e.g.,
check the selected model's cost against remaining_budget and only return the
decision if it passes, otherwise continue to the next decision path or fall back
to a budget-compliant selection; reference _try_task_type_rules,
remaining_budget, and the cost-aware routing logic when making the change.
- Around line 615-633: The SmartStrategy priority chain is short-circuited by
calling self._try_global_chain before role/seniority stages, letting global
fallback models win too early; change the resolution order in the composite OR
chain so that _try_task_type_rules, _try_role_rules, and _try_seniority_default
run (and may traverse config.fallback_chain as they currently do) before any
global-chain resolution, and only call self._try_global_chain after
_try_seniority_default and _try_cheapest. Update the OR sequence that currently
ends with "or self._try_global_chain(config, resolver) or
self._try_cheapest(request, resolver)" to instead call
self._try_cheapest(request, resolver) first and then
self._try_global_chain(config, resolver) so global-chain models are only
considered last.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (18)
src/ai_company/config/schema.pysrc/ai_company/providers/routing/models.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/strategies.pysrc/ai_company/templates/builtins/agency.yamlsrc/ai_company/templates/builtins/dev_shop.yamlsrc/ai_company/templates/builtins/full_company.yamlsrc/ai_company/templates/builtins/product_team.yamlsrc/ai_company/templates/builtins/research_lab.yamlsrc/ai_company/templates/builtins/solo_founder.yamlsrc/ai_company/templates/builtins/startup.yamlsrc/ai_company/templates/renderer.pysrc/ai_company/templates/schema.pytests/unit/providers/routing/test_models.pytests/unit/providers/routing/test_router.pytests/unit/providers/routing/test_strategies.pytests/unit/templates/test_schema.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not usefrom __future__ import annotations— Python 3.14 has native lazy annotations via PEP 649
Use PEP 758 except syntax:except A, B:(no parentheses) — ruff enforces this on Python 3.14
All public functions must include type hints; enforce via mypy strict mode
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Use Pydantic v2 models withBaseModel,model_validator, andConfigDict
Enforce 88-character line length (enforced by ruff)
Keep functions under 50 lines; keep files under 800 lines
Files:
src/ai_company/templates/schema.pytests/unit/templates/test_schema.pysrc/ai_company/templates/renderer.pytests/unit/providers/routing/test_router.pysrc/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/models.pytests/unit/providers/routing/test_strategies.pytests/unit/providers/routing/test_models.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/providers/routing/strategies.pysrc/ai_company/config/schema.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic must import:from ai_company.observability import get_loggerthenlogger = get_logger(__name__)
Never useimport loggingorlogging.getLogger()orprint()in application code — use the ai_company logger
Always useloggeras the variable name (not_loggerorlog)
Always use event name constants fromai_company.observability.eventsin logger calls
Use structured logging format:logger.info(EVENT, key=value)— never use format strings likelogger.info("msg %s", val)
All error paths must log at WARNING or ERROR level with context before raising an exception
All state transitions must log at INFO level
Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions
Files:
src/ai_company/templates/schema.pysrc/ai_company/templates/renderer.pysrc/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/models.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/providers/routing/strategies.pysrc/ai_company/config/schema.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark all test cases with pytest markers:@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e, or@pytest.mark.slow
Configure pytest async mode asasyncio_mode = "auto"— do not manually add@pytest.mark.asyncioto test functions
Files:
tests/unit/templates/test_schema.pytests/unit/providers/routing/test_router.pytests/unit/providers/routing/test_strategies.pytests/unit/providers/routing/test_models.py
🧠 Learnings (11)
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/model_service.py} : Respect existing model configuration patterns in Ollama integration
Applied to files:
src/ai_company/templates/schema.pytests/unit/templates/test_schema.pysrc/ai_company/templates/renderer.pysrc/ai_company/providers/routing/resolver.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Tests must use fake model names (e.g., `test-model:8b`, `fake-writer:latest`)—never use real model IDs from `RECOMMENDED_MODELS`.
Applied to files:
tests/unit/templates/test_schema.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Configure appropriate temperature settings based on agent role: Writer (0.9), Editor (0.6), Continuity (0.3), Architect (0.85), Interviewer (0.7)
Applied to files:
src/ai_company/templates/builtins/research_lab.yaml
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/agents/**/*.py : Use context size of 32768 tokens as default for Ollama models.
Applied to files:
src/ai_company/templates/renderer.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to **/*.py : Do not use `from __future__ import annotations` — Python 3.14 has native lazy annotations via PEP 649
Applied to files:
src/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/providers/routing/strategies.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : NEVER add `from __future__ import annotations` - Python 3.14+ has native support for deferred annotation evaluation and does not require this import
Applied to files:
src/ai_company/providers/routing/router.pysrc/ai_company/providers/routing/resolver.pysrc/ai_company/providers/routing/strategies.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising an exception
Applied to files:
src/ai_company/providers/routing/router.pysrc/ai_company/config/schema.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions
Applied to files:
src/ai_company/providers/routing/router.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic must import: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Applied to files:
src/ai_company/config/schema.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` in logger calls
Applied to files:
src/ai_company/config/schema.py
📚 Learning: 2026-03-01T19:59:46.936Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T19:59:46.936Z
Learning: Applies to src/ai_company/**/*.py : All state transitions must log at INFO level
Applied to files:
src/ai_company/config/schema.py
🧬 Code graph analysis (4)
tests/unit/providers/routing/test_router.py (5)
src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)src/ai_company/providers/routing/errors.py (3)
ModelResolutionError(16-19)NoAvailableModelError(22-25)UnknownStrategyError(28-31)src/ai_company/providers/routing/router.py (4)
ModelRouter(24-134)strategy_name(88-90)resolver(83-85)route(92-134)tests/unit/providers/routing/conftest.py (3)
three_model_provider(85-93)resolver(97-101)standard_routing_config(105-130)src/ai_company/providers/routing/resolver.py (1)
all_models(173-176)
src/ai_company/providers/routing/router.py (3)
src/ai_company/providers/routing/errors.py (2)
RoutingError(10-13)UnknownStrategyError(28-31)src/ai_company/providers/routing/models.py (1)
RoutingRequest(48-90)src/ai_company/providers/routing/resolver.py (2)
ModelResolver(25-188)from_config(87-122)
tests/unit/providers/routing/test_models.py (3)
src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)src/ai_company/providers/routing/models.py (4)
ResolvedModel(9-45)RoutingDecision(93-111)RoutingRequest(48-90)total_cost_per_1k(43-45)tests/unit/providers/routing/conftest.py (3)
ResolvedModelFactory(25-34)RoutingDecisionFactory(47-54)RoutingRequestFactory(37-44)
src/ai_company/config/schema.py (2)
src/ai_company/observability/_logger.py (1)
get_logger(8-28)src/ai_company/observability/config.py (1)
LogConfig(115-192)
🔇 Additional comments (17)
src/ai_company/templates/builtins/solo_founder.yaml (1)
36-36: Tier migration is consistent and clean.The updated model tiers align with the new vendor-agnostic template strategy.
Also applies to: 41-41
src/ai_company/templates/builtins/research_lab.yaml (1)
35-35: Model tier updates look correct.These assignments are consistent with the standardized
large/medium/smallnaming.Also applies to: 40-40, 45-45
src/ai_company/templates/builtins/dev_shop.yaml (1)
38-38: Good tier normalization across agent roles.The migration preserves intent and removes provider coupling as expected.
Also applies to: 43-43, 48-48, 53-53, 58-58
src/ai_company/templates/builtins/full_company.yaml (1)
44-44: Template model mapping update is coherent.The revised tiers are consistent with the new generic catalog.
Also applies to: 49-49, 54-54, 58-58, 63-63
src/ai_company/config/schema.py (1)
32-32:allow_inf_nan=Falseis a solid hardening change.Good guardrail for provider model numeric fields.
src/ai_company/templates/builtins/startup.yaml (1)
39-39: Startup template tier migration looks correct.The updated model values are consistent and aligned with the new routing vocabulary.
Also applies to: 44-44, 49-49, 54-54, 59-59
tests/unit/templates/test_schema.py (1)
79-79: Test expectation update is correct.This assertion now matches the new default model tier.
src/ai_company/templates/builtins/agency.yaml (1)
38-38: Agency template model updates are consistent.The changes correctly apply the vendor-agnostic tier naming.
Also applies to: 43-43, 47-47
src/ai_company/templates/renderer.py (1)
379-380: LGTM! Vendor-agnostic model tier default.The change from
"sonnet"to"medium"aligns with the PR objective to use vendor-agnostic tier names (small/medium/large) instead of vendor-specific model names.src/ai_company/templates/builtins/product_team.yaml (1)
38-52: LGTM! Consistent vendor-agnostic model tier assignments.The model tier updates appropriately map agent roles to capability tiers:
- Senior roles (Product Manager, Full-Stack Developer) and mid-level UX Designer →
"medium"- QA Engineer →
"small"This aligns with the PR objective for vendor-agnostic tier naming.
src/ai_company/templates/schema.py (1)
88-102: LGTM! Schema default and documentation aligned.The default value change to
"medium"and updated docstring documenting tier options ("large","medium","small") are consistent with the vendor-agnostic tier naming across the codebase.tests/unit/providers/routing/test_models.py (2)
1-98: LGTM! Comprehensive model validation tests.Good coverage of
ResolvedModelincluding:
- Factory construction and immutability
- Optional alias handling
- Cost defaults and non-negative validation
max_contextpositivity constraintNotBlankStrvalidation for provider_name/model_idtotal_cost_per_1kcomputed property- Inf/NaN rejection via
allow_inf_nan=False
100-166: LGTM! Request and decision model tests.Good coverage of
RoutingRequestandRoutingDecision:
- All-optional field construction
- Negative budget rejection
NotBlankStrvalidation fortask_typeandmodel_override- Default empty fallbacks tuple
- Immutability enforcement
tests/unit/providers/routing/test_router.py (1)
1-171: LGTM! Comprehensive router tests with observability verification.Excellent coverage including:
- Strategy construction and alias mapping (
cheapest→cost_aware)- Error handling for unknown strategies
- Routing behavior across all strategies (cost_aware, manual, role_based, smart)
- Observability verification for
ROUTING_ROUTER_BUILT,ROUTING_DECISION_MADE, andROUTING_SELECTION_FAILEDevents- Error propagation validation
src/ai_company/providers/routing/router.py (1)
1-134: LGTM! Well-structured router with proper observability.The implementation:
- Correctly delegates to strategies via
STRATEGY_MAP- Logs structured events for router construction (
ROUTING_ROUTER_BUILT) and routing decisions (ROUTING_DECISION_MADE)- Logs warning with full context (
ROUTING_SELECTION_FAILED) before re-raisingRoutingError- Exposes
resolverandstrategy_nameproperties for introspectiontests/unit/providers/routing/test_strategies.py (1)
1-540: LGTM! Comprehensive strategy test suite.Excellent coverage of all four strategies with edge cases:
- Protocol conformance and
STRATEGY_MAPcompleteness (includingcheapestalias)- ManualStrategy: override resolution, missing override, unknown model errors
- RoleBasedStrategy: rule matching, seniority defaults, rule fallbacks, global chain fallbacks
- CostAwareStrategy: cheapest selection, task_type priority, budget filtering, zero-budget warning
- SmartStrategy: priority ordering (override > task_type > role > seniority > chain > cheapest), soft-fail override handling
- Fallback chain deduplication and exhaustion scenarios
src/ai_company/providers/routing/resolver.py (1)
1-188: LGTM! Clean resolver with O(1) lookup and collision detection.Well-designed implementation:
MappingProxyTypeensures post-construction immutability_index_refhelper properly handles collision detection with structured error loggingresolve()logs warning before raisingModelResolutionErrorresolve_safe()provides non-throwing variant with debug loggingall_models_sorted_by_cost()supports cost-aware strategy selection- Refactoring from past review (extracting
_index_ref) has been applied, keepingfrom_configconcise
| model_config = ConfigDict(frozen=True) | ||
|
|
||
| agent_level: SeniorityLevel | None = Field( | ||
| default=None, | ||
| description="Seniority level of the requesting agent", | ||
| ) | ||
| task_type: NotBlankStr | None = Field( | ||
| default=None, | ||
| description="Task type label", | ||
| ) | ||
| model_override: NotBlankStr | None = Field( | ||
| default=None, | ||
| description="Explicit model reference for manual routing", | ||
| ) | ||
| remaining_budget: float | None = Field( | ||
| default=None, | ||
| ge=0.0, | ||
| description=( | ||
| "Per-request cost ceiling (compared against model " | ||
| "total_cost_per_1k). Not a total session budget." | ||
| ), | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, examine the file to confirm the model definition
cat -n src/ai_company/providers/routing/models.py | head -100Repository: Aureliolo/ai-company
Length of output: 4271
🌐 Web query:
Pydantic v2 ConfigDict allow_inf_nan default value and float field validation behavior
💡 Result:
ConfigDict.allow_inf_nandefault (Pydantic v2):True— i.e., by default Pydantic acceptsfloat('inf'),float('-inf'), andfloat('nan')(and likewise forDecimalfields). [1]- What it changes for
floatfields:- If
allow_inf_nan=True: float validation allowsinf/-inf/nan(including commonly accepted string forms like"inf","infinity","nan"in non-strict parsing). [1][3] - If
allow_inf_nan=False: those values are rejected with a validation error for float/decimal fields. [1]
- If
- Schema-level / per-field override: in pydantic-core,
float_schema(..., allow_inf_nan=...)exists; its default isNone(meaning “don’t override here”, so model config can apply). [2]
Sources
[1] Pydantic v2 config docs (ConfigDict.allow_inf_nan, default True) (docs.pydantic.dev)
[2] Pydantic docs: pydantic_core.core_schema.float_schema(allow_inf_nan=...) (docs.pydantic.dev)
[3] Summary of pydantic-core float validator accepted inputs (incl. "inf", "infinity", "nan" when allowed) (deepwiki.com)
Citations:
- 1: https://docs.pydantic.dev/latest/api/config/
- 2: https://docs.pydantic.dev/latest/api/pydantic_core_schema/?utm_source=openai
- 3: https://deepwiki.com/pydantic/pydantic-core/3.1-basic-type-validators?utm_source=openai
Disallow NaN/Infinity for remaining_budget to keep budget routing deterministic.
RoutingRequest omits allow_inf_nan=False while ResolvedModel in the same file enforces it. This allows non-finite float inputs for the remaining_budget field, which is used in cost comparisons (lines 62–66) and would produce unreliable filtering results.
Proposed fix
class RoutingRequest(BaseModel):
- model_config = ConfigDict(frozen=True)
+ model_config = ConfigDict(frozen=True, allow_inf_nan=False)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| model_config = ConfigDict(frozen=True) | |
| agent_level: SeniorityLevel | None = Field( | |
| default=None, | |
| description="Seniority level of the requesting agent", | |
| ) | |
| task_type: NotBlankStr | None = Field( | |
| default=None, | |
| description="Task type label", | |
| ) | |
| model_override: NotBlankStr | None = Field( | |
| default=None, | |
| description="Explicit model reference for manual routing", | |
| ) | |
| remaining_budget: float | None = Field( | |
| default=None, | |
| ge=0.0, | |
| description=( | |
| "Per-request cost ceiling (compared against model " | |
| "total_cost_per_1k). Not a total session budget." | |
| ), | |
| ) | |
| model_config = ConfigDict(frozen=True, allow_inf_nan=False) | |
| agent_level: SeniorityLevel | None = Field( | |
| default=None, | |
| description="Seniority level of the requesting agent", | |
| ) | |
| task_type: NotBlankStr | None = Field( | |
| default=None, | |
| description="Task type label", | |
| ) | |
| model_override: NotBlankStr | None = Field( | |
| default=None, | |
| description="Explicit model reference for manual routing", | |
| ) | |
| remaining_budget: float | None = Field( | |
| default=None, | |
| ge=0.0, | |
| description=( | |
| "Per-request cost ceiling (compared against model " | |
| "total_cost_per_1k). Not a total session budget." | |
| ), | |
| ) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/providers/routing/models.py` around lines 69 - 90, The
remaining_budget field on the RoutingRequest model currently allows NaN/Infinity
which breaks deterministic cost comparisons; update the Field(...) for
remaining_budget in the RoutingRequest class to disallow non-finite floats by
adding allow_inf_nan=False (i.e., remaining_budget: float | None =
Field(default=None, ge=0.0, allow_inf_nan=False, description=...)); ensure the
same semantics as ResolvedModel so budget filtering remains reliable.
- Add routing rule context (role_level, task_type, preferred_model, fallback) to RoutingRuleConfig validation warning log payload - Add allow_inf_nan=False to RoutingRequest ConfigDict so inf/nan budgets are rejected at validation time - Gate CostAwareStrategy task_type rule decisions against remaining_budget before returning (fall through to cheapest if the matched model exceeds budget) - Swap SmartStrategy priority: cheapest (budget-aware) now runs before global fallback chain so cost-informed selection takes precedence over blind chain traversal
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 33 out of 34 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| logger.info( | ||
| ROUTING_FALLBACK_EXHAUSTED, | ||
| ref=ref, | ||
| source="safe_wrapper", | ||
| ) |
There was a problem hiding this comment.
_try_resolve_with_fallback_safe() currently logs ROUTING_FALLBACK_EXHAUSTED after catching NoAvailableModelError, but _try_resolve_with_fallback() already logs the same event at WARNING right before raising. This results in duplicate exhaustion logs for every safe attempt. Consider removing the wrapper log, or refactoring _try_resolve_with_fallback() to optionally suppress/downgrade its exhaustion log when used by the safe variant.
| logger.info( | |
| ROUTING_FALLBACK_EXHAUSTED, | |
| ref=ref, | |
| source="safe_wrapper", | |
| ) |
| """MID has no rule -> uses seniority catalog (sonnet tier).""" | ||
| strategy = RoleBasedStrategy() |
There was a problem hiding this comment.
This docstring still refers to the seniority catalog default as the “sonnet tier”, but the routing tiers were renamed to vendor-agnostic small/medium/large in this PR. Updating this keeps tests/documentation consistent with the new tier naming.
| # review rule -> opus; junior role rule -> haiku; task wins | ||
| assert decision.resolved_model.alias == "large" | ||
| assert "task-type" in decision.reason.lower() |
There was a problem hiding this comment.
This comment uses vendor-specific names (“opus/haiku”). Since this PR’s stated goal is vendor-agnostic tiers (small/medium/large), it would be clearer to update the comment to those tier names (e.g., “review rule -> large; junior role rule -> small”).
| """Budget large enough for haiku+sonnet but not opus picks haiku.""" | ||
| # haiku total=0.006, sonnet total=0.018, opus total=0.090 | ||
| strategy = CostAwareStrategy() |
There was a problem hiding this comment.
This docstring still references vendor model names (“haiku/sonnet/opus”), but the suite otherwise uses the vendor-agnostic tier aliases (small/medium/large). Updating the wording avoids reintroducing vendor coupling in test docs.
| """LEAD has tier=opus; if opus not registered, use fallback chain.""" | ||
| provider = ProviderConfig( |
There was a problem hiding this comment.
This docstring/comment mentions “tier=opus”. With the tier aliases renamed to large/medium/small, consider updating the text to reflect the new large tier naming.
🤖 I have created a release *beep* *boop* --- ## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
🤖 I have created a release *beep* *boop* --- ## [0.1.0](v0.0.0...v0.1.0) (2026-03-11) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15)) * bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e)) * bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238)) * bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e)) * bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7)) * bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Summary
Closes #6
Test plan
🤖 Generated with Claude Code