
feat: implement LiteLLM driver and provider registry#88

Merged
Aureliolo merged 3 commits into main from feat/litellm-driver
Mar 1, 2026

Conversation

@Aureliolo
Owner

Summary

  • Implements the Employment Agency, a swappable driver system with LiteLLM as the default backend, behind the contracts from #86 (feat: design unified provider interface)
  • LiteLLMDriver(BaseCompletionProvider) wrapping litellm.acompletion for streaming/non-streaming completions, model capabilities, and full exception mapping (10 LiteLLM exception types → ProviderError hierarchy)
  • ProviderRegistry (immutable, MappingProxyType) mapping provider names → driver instances, built from config via from_config() with factory_overrides for testing/native SDK swaps
  • Pure mapping functions for messages, tools, finish reasons, and tool call extraction
  • Upgraded litellm 1.67.2 → 1.82.0, removing all Python 3.14 compatibility workarounds (warning filters, PYTHONUTF8 env var, stale type: ignore)
  • Removed unused F403 ruff ignore for __init__.py

Architecture

Engine  ──>  ProviderRegistry  ──>  LiteLLMDriver (anthropic)
             ("Employment Agency")  LiteLLMDriver (openrouter)
                                    LiteLLMDriver (ollama)
                                    # Future: NativeAnthropicDriver, etc.

New files

  • providers/drivers/litellm_driver.py (~530 lines): LiteLLM-backed completion driver
  • providers/drivers/mappers.py (~170 lines): Pure message/tool/reason mapping functions
  • providers/registry.py (~185 lines): Immutable provider name → driver registry
  • providers/drivers/__init__.py (~20 lines): Sub-package exports

Modified files

  • config/schema.py: Added driver field to ProviderConfig (default "litellm")
  • providers/errors.py: Added 3 driver error classes
  • providers/__init__.py: Added new exports
  • pyproject.toml: litellm → 1.82.0, removed warning filters and F403 ignore

Test plan

  • 235 new unit tests covering driver, mappers, registry, and exception mapping
  • All 1273 tests pass (uv run pytest tests/ -n auto)
  • 94% coverage (--cov-fail-under=80)
  • Lint clean (uv run ruff check src/ tests/)
  • Format clean (uv run ruff format --check src/ tests/)
  • Type-check clean (uv run mypy src/ — 0 errors in 64 files)
  • All pre-commit hooks pass

Closes #4

Add the "Employment Agency" — a swappable driver system with LiteLLM as
the default backend. Implements the concrete provider layer behind the
contracts designed in #86.

- LiteLLMDriver wrapping litellm.acompletion for streaming and
  non-streaming completions, model capability queries, and full
  exception mapping (10 LiteLLM exception types → ProviderError hierarchy)
- ProviderRegistry (immutable) mapping provider names to driver instances,
  built from config via from_config() with factory_overrides for testing
- Pure mapping functions (messages, tools, finish reasons, tool calls)
- 3 new driver error classes (DriverNotRegistered, DriverAlreadyRegistered,
  DriverFactoryNotFound)
- driver field on ProviderConfig (defaults to "litellm")
- Upgraded litellm 1.67.2 → 1.82.0 (fixes Python 3.14 compat, removes
  need for PYTHONUTF8 env var and deprecation warning filters)
- Removed unused F403 ruff ignore for __init__.py
- 235 new unit tests, all 1273 tests pass at 94% coverage
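
The exception-mapping bullet above follows a common translation pattern: a lookup table from backend exception types to the domain hierarchy, with a fallback for anything unrecognized. A minimal sketch under assumed names (the Backend* classes below stand in for litellm exceptions such as RateLimitError; the Provider* names are illustrative, not the PR's actual classes):

```python
# Domain error hierarchy (illustrative names).
class ProviderError(Exception): ...
class ProviderRateLimitError(ProviderError): ...
class ProviderAuthError(ProviderError): ...

# Stand-ins for backend exceptions such as litellm.RateLimitError.
class BackendRateLimitError(Exception): ...
class BackendAuthError(Exception): ...

# One table drives the mapping: backend exception type -> domain error type.
_EXCEPTION_MAP: dict[type[Exception], type[ProviderError]] = {
    BackendRateLimitError: ProviderRateLimitError,
    BackendAuthError: ProviderAuthError,
}

def map_exception(exc: Exception) -> ProviderError:
    """Translate a backend exception into the domain hierarchy."""
    for source, target in _EXCEPTION_MAP.items():
        if isinstance(exc, source):
            return target(str(exc))
    # Unknown exceptions still surface as the base ProviderError
    # rather than leaking raw backend types to callers.
    return ProviderError(str(exc))
```

The fallback branch is what keeps callers from ever having to catch litellm-specific types directly.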
Copilot AI review requested due to automatic review settings March 1, 2026 11:15
@coderabbitai

coderabbitai bot commented Mar 1, 2026

Warning

Rate limit exceeded

@Aureliolo has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 7 minutes and 53 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.


📥 Commits

Reviewing files that changed from the base of the PR and between 63f0774 and c8c7300.

📒 Files selected for processing (5)
  • src/ai_company/providers/drivers/litellm_driver.py
  • src/ai_company/providers/drivers/mappers.py
  • src/ai_company/providers/registry.py
  • tests/unit/providers/drivers/test_litellm_driver.py
  • tests/unit/providers/test_registry.py
📝 Walkthrough

Adds a LiteLLM-backed provider layer: new LiteLLMDriver, mapping utilities, driver registry and errors, ProviderConfig.driver field, package export updates, pyproject/mypy adjustments, and comprehensive unit tests for drivers, mappers, and the registry.

Changes

  • Project config (pyproject.toml): Added the litellm dependency, adjusted mypy overrides (litellm.*, tests.*), and wrapped filterwarnings as a list.
  • Schema (src/ai_company/config/schema.py): Added driver: NotBlankStr field to ProviderConfig (default "litellm").
  • Provider package exports (src/ai_company/providers/__init__.py): Exported LiteLLMDriver, ProviderRegistry, and the driver error classes (DriverAlreadyRegisteredError, DriverFactoryNotFoundError, DriverNotRegisteredError).
  • Drivers package init (src/ai_company/providers/drivers/__init__.py): New drivers package initializer exporting LiteLLMDriver.
  • LiteLLM driver (src/ai_company/providers/drivers/litellm_driver.py): New LiteLLMDriver: model resolution, request construction, streaming support (with tool-call accumulation), error mapping, response mapping, cost computation, and capability discovery.
  • Driver mappers (src/ai_company/providers/drivers/mappers.py): New mapping utilities: messages_to_dicts, tools_to_dicts, extract_tool_calls, map_finish_reason, and helpers for converting domain models to OpenAI-style dicts.
  • Provider errors (src/ai_company/providers/errors.py): Added DriverNotRegisteredError, DriverAlreadyRegisteredError, DriverFactoryNotFoundError (subclassing ProviderError).
  • Provider registry (src/ai_company/providers/registry.py): New ProviderRegistry with from_config factory construction, driver instantiation/validation, an immutable mapping, and lookup APIs.
  • Driver test fixtures (tests/unit/providers/drivers/conftest.py): New test fixtures and helpers for provider configs, mock LiteLLM responses, stream chunks, and tool-call deltas.
  • LiteLLMDriver tests (tests/unit/providers/drivers/test_litellm_driver.py): Comprehensive unit tests for non-streaming/streaming flows, exception mapping, model capabilities, tool handling, and cost/provenance assertions.
  • Mapper tests (tests/unit/providers/drivers/test_mappers.py): Unit tests for message/tool conversion, finish-reason mapping, and tool-call extraction.
  • Registry tests (tests/unit/providers/test_registry.py): Tests for registry retrieval, listing, membership, factory overrides, error cases, and immutability.

Sequence Diagram

sequenceDiagram
    actor Client
    participant ProviderRegistry
    participant LiteLLMDriver
    participant Mappers
    participant litellm

    Client->>ProviderRegistry: get(provider_name)
    ProviderRegistry-->>Client: LiteLLMDriver instance

    Client->>LiteLLMDriver: complete(messages, tools, config)
    LiteLLMDriver->>LiteLLMDriver: resolve model & apply config
    LiteLLMDriver->>Mappers: messages_to_dicts(messages)
    Mappers-->>LiteLLMDriver: message dicts
    LiteLLMDriver->>Mappers: tools_to_dicts(tools)
    Mappers-->>LiteLLMDriver: tool dicts
    LiteLLMDriver->>litellm: acompletion(model, messages, tools, ...)
    litellm-->>LiteLLMDriver: ModelResponse / stream
    LiteLLMDriver->>Mappers: extract_tool_calls(raw_tool_calls)
    Mappers-->>LiteLLMDriver: ToolCall tuples
    LiteLLMDriver-->>Client: CompletionResponse / stream chunks
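
The Mappers participant in the diagram above is described as pure functions from domain models to OpenAI-style dicts. A minimal sketch of that shape, with an illustrative message type (the field names are assumptions, not the PR's actual Pydantic models):

```python
from dataclasses import dataclass

# Illustrative domain message; the real PR uses its own models.
@dataclass(frozen=True)
class Message:
    role: str       # e.g. "system", "user", "assistant", "tool"
    content: str

def messages_to_dicts(messages: list[Message]) -> list[dict[str, str]]:
    """Pure mapper: domain messages -> OpenAI-style chat dicts."""
    # Builds new dicts rather than mutating inputs, matching the
    # "pure mapping functions" framing in the PR description.
    return [{"role": m.role, "content": m.content} for m in messages]
```

Keeping the mappers free of I/O is what lets the PR unit-test them in isolation and reuse them for future native SDK drivers.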

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 42.97%, which is below the required 100.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)
  • Title check: ✅ Passed. The PR title 'feat: implement LiteLLM driver and provider registry' directly summarizes the main changes: adding a LiteLLM-backed driver and a provider registry system.
  • Description check: ✅ Passed. The PR description comprehensively relates to the changeset, detailing the implementation of the Employment Agency pattern with the LiteLLM driver, registry, mapping functions, dependencies, and test coverage.
  • Linked Issues check: ✅ Passed. The PR implements core requirements from #4: a LiteLLM integration prototype with Anthropic/OpenRouter/Ollama support, cost tracking, exception mapping, tool calling, streaming, and error propagation, demonstrated through 235+ tests.
  • Out of Scope Changes check: ✅ Passed. All changes are in scope: LiteLLMDriver and ProviderRegistry (core components), mappers (required for message/tool conversion), error classes (supporting the exception-mapping requirement), config schema update (driver field), and the litellm upgrade with cleanup.


@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a foundational architectural shift by implementing a pluggable driver system for LLM providers. It centralizes the management and interaction with various LLM backends through a ProviderRegistry and provides a robust LiteLLMDriver as the initial default. This change significantly improves the system's flexibility, maintainability, and error handling capabilities, paving the way for easier integration of diverse AI models and services.

Highlights

  • Modular Driver System: Implemented a swappable driver system, referred to as the 'Employment Agency', with LiteLLM serving as the default backend for LLM providers. This architecture allows for easy integration of different LLM backends.
  • LiteLLM Driver Implementation: Introduced the LiteLLMDriver which wraps litellm.acompletion to handle both streaming and non-streaming completions, manage model capabilities, and provide comprehensive exception mapping for 10 LiteLLM exception types to a custom ProviderError hierarchy.
  • Provider Registry: Created a ProviderRegistry (an immutable MappingProxyType) that maps provider names to their respective driver instances. This registry is built from configuration via from_config(), supporting factory_overrides for testing or native SDK integrations.
  • Mapping Functions: Added pure mapping functions to convert between internal domain models and OpenAI-compatible dict formats for messages, tools, finish reasons, and tool call extraction, ensuring reusability for future native SDK drivers.
  • Dependency and Tooling Updates: Upgraded the litellm dependency from 1.67.2 to 1.82.0, which allowed for the removal of Python 3.14 compatibility workarounds (warning filters, PYTHONUTF8 env var, stale type: ignore). Also removed an unused F403 ruff ignore for __init__.py files.
  • Configuration Schema Enhancement: Modified the ProviderConfig schema to include a new driver field, defaulting to "litellm", enabling configuration-driven selection of provider backends.
  • New Error Classes: Added three new driver-specific error classes: DriverNotRegisteredError, DriverAlreadyRegisteredError, and DriverFactoryNotFoundError to enhance error handling within the provider layer.
Changelog
  • pyproject.toml
    • Added litellm==1.82.0 to project dependencies.
    • Removed F403 from the ruff ignore list for __init__.py files.
    • Configured mypy to ignore missing imports for litellm.* modules.
    • Updated filterwarnings to use a multi-line list format.
  • src/ai_company/config/schema.py
    • Added a new driver field to the ProviderConfig class, defaulting to "litellm".
  • src/ai_company/providers/__init__.py
    • Updated the module docstring to reflect new exports including drivers and the registry.
    • Imported LiteLLMDriver and ProviderRegistry.
    • Imported new driver-related error classes: DriverAlreadyRegisteredError, DriverFactoryNotFoundError, and DriverNotRegisteredError.
    • Added newly imported classes and modules to the __all__ export list.
  • src/ai_company/providers/drivers/__init__.py
    • Added a new file to define the drivers sub-package and export LiteLLMDriver.
  • src/ai_company/providers/drivers/litellm_driver.py
    • Added a new file implementing the LiteLLMDriver class, which integrates with LiteLLM for LLM completions and streaming.
  • src/ai_company/providers/drivers/mappers.py
    • Added a new file containing utility functions for mapping between internal data models and LiteLLM's OpenAI-compatible API formats.
  • src/ai_company/providers/errors.py
    • Added DriverNotRegisteredError for unregistered drivers.
    • Added DriverAlreadyRegisteredError for duplicate driver registrations.
    • Added DriverFactoryNotFoundError for missing driver factories.
  • src/ai_company/providers/registry.py
    • Added a new file implementing the ProviderRegistry class for managing and retrieving LLM provider drivers.
  • tests/unit/providers/drivers/conftest.py
    • Added a new file providing shared fixtures and mock factories for unit tests of provider drivers.
  • tests/unit/providers/drivers/test_litellm_driver.py
    • Added a new file containing unit tests for the LiteLLMDriver functionality.
  • tests/unit/providers/drivers/test_mappers.py
    • Added a new file containing unit tests for the mapping functions in mappers.py.
  • tests/unit/providers/test_registry.py
    • Added a new file containing unit tests for the ProviderRegistry.
Activity
  • 235 new unit tests were added, specifically covering the driver, mappers, registry, and exception mapping logic.
  • All 1273 existing tests passed successfully.
  • Code coverage was maintained at 94%, exceeding the 80% minimum threshold.
  • Linting checks passed without issues.
  • Formatting checks passed without issues.
  • Type-checking completed with 0 errors across 64 files.
  • All pre-commit hooks passed.


Copilot AI left a comment


Pull request overview

This PR introduces a swappable provider “driver” layer (defaulting to LiteLLM) plus an immutable ProviderRegistry that builds provider instances from config, aligning with the unified provider contracts introduced in #86.

Changes:

  • Added LiteLLMDriver implementing BaseCompletionProvider with streaming/non-streaming completions, model capability lookup, and provider error mapping.
  • Added ProviderRegistry to construct and expose provider-name → driver mappings from ProviderConfig.driver.
  • Added pure mapping utilities (messages_to_dicts, tools_to_dicts, finish-reason/tool-call extraction) and comprehensive unit tests; bumped litellm to 1.82.0.

Reviewed changes

Copilot reviewed 12 out of 14 changed files in this pull request and generated 3 comments.

Show a summary per file
  • pyproject.toml: Adds litellm==1.82.0, a mypy override for litellm.*, and adjusts the ruff per-file ignore for __init__.py.
  • uv.lock: Locks litellm and its transitive dependencies.
  • src/ai_company/config/schema.py: Adds the ProviderConfig.driver field (default "litellm").
  • src/ai_company/providers/errors.py: Introduces registry/driver-related error types.
  • src/ai_company/providers/registry.py: Implements the immutable ProviderRegistry plus factory-based construction from config.
  • src/ai_company/providers/drivers/mappers.py: Adds message/tool/finish-reason/tool-call mapping helpers.
  • src/ai_company/providers/drivers/litellm_driver.py: Implements the LiteLLM-backed driver with streaming support and exception mapping.
  • src/ai_company/providers/drivers/__init__.py: Exposes driver(s) from the subpackage.
  • src/ai_company/providers/__init__.py: Exports the registry, driver, and new error types from the top-level providers package.
  • tests/unit/providers/test_registry.py: Unit tests for ProviderRegistry.
  • tests/unit/providers/drivers/test_mappers.py: Unit tests for mapping helpers.
  • tests/unit/providers/drivers/test_litellm_driver.py: Unit tests for LiteLLMDriver (mocked LiteLLM calls).
  • tests/unit/providers/drivers/conftest.py: Shared mock factories/fixtures for driver tests.


if isinstance(raw, str):
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError, ValueError:

Copilot AI Mar 1, 2026


Invalid exception syntax: except json.JSONDecodeError, ValueError: is Python 2 syntax and raises a SyntaxError on import under Python 3. Use except (json.JSONDecodeError, ValueError): instead.

Suggested change
except json.JSONDecodeError, ValueError:
except (json.JSONDecodeError, ValueError):

return None
try:
    return float(raw)
except ValueError, TypeError:

Copilot AI Mar 1, 2026


Invalid exception syntax: except ValueError, TypeError: is Python 2 syntax and will raise a SyntaxError under Python 3. Use except (ValueError, TypeError): instead.

Suggested change
except ValueError, TypeError:
except (ValueError, TypeError):

return None
try:
    parsed = json.loads(self.arguments) if self.arguments else {}
except json.JSONDecodeError, ValueError:

Copilot AI Mar 1, 2026


Invalid exception syntax: except json.JSONDecodeError, ValueError: is Python 2 syntax and will raise a SyntaxError under Python 3. Use except (json.JSONDecodeError, ValueError): (or just json.JSONDecodeError) instead.

Suggested change
except json.JSONDecodeError, ValueError:
except (json.JSONDecodeError, ValueError):

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a well-designed and extensible driver system for LLM providers, with LiteLLM as the default implementation. The architecture is clean, with good separation of concerns between the driver, mappers, and registry. The code is extensively tested and the exception handling is robust. I've found a few critical syntax issues related to exception handling that appear to be from Python 2, which will cause errors in the target Python 3.14 environment. Once these are addressed, this will be an excellent addition to the codebase.

return None
try:
    return float(raw)
except ValueError, TypeError:
Contributor


critical

This except syntax is from Python 2 and will raise a SyntaxError in Python 3. To catch multiple exception types, they must be enclosed in a tuple.

        except (ValueError, TypeError):

return None
try:
    parsed = json.loads(self.arguments) if self.arguments else {}
except json.JSONDecodeError, ValueError:
Contributor


critical

This except syntax is from Python 2 and will raise a SyntaxError in Python 3. To catch multiple exception types, they must be enclosed in a tuple.

Suggested change
except json.JSONDecodeError, ValueError:
except (json.JSONDecodeError, ValueError):

if isinstance(raw, str):
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError, ValueError:
Contributor


critical

This except syntax is from Python 2 and will raise a SyntaxError in Python 3. To catch multiple exception types, they must be enclosed in a tuple.

Suggested change
except json.JSONDecodeError, ValueError:
except (json.JSONDecodeError, ValueError):


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/providers/drivers/litellm_driver.py`:
- Around line 205-208: The loop that builds _model_lookup (the for m in models
block creating lookup[m.id] and lookup[m.alias]) can silently overwrite entries
when an alias equals another model's id or alias; update the builder to detect
collisions: when about to assign lookup[key] check if key already exists and if
so raise or log a clear validation error referencing the conflicting model
ids/aliases (include both existing and new m.id/m.alias) and skip or abort
loading as appropriate per policy; ensure checks cover both m.id and m.alias and
run before any assignment so _model_lookup cannot be silently remapped.
- Around line 360-363: The streaming branch currently drops usage events when
usage_obj.prompt_tokens is zero; change the condition in the streaming path
(where usage_obj is retrieved from chunk and _make_usage_chunk is called) to
emit usage whenever usage_obj is not None (i.e., check usage_obj is not None)
rather than requiring prompt_tokens to be truthy, so
result.append(self._make_usage_chunk(usage_obj, model_config)) runs for valid
usage objects even if prompt_tokens == 0.
- Around line 276-277: The code converts usage_obj prompt/completion token
attributes with int(getattr(...)) which throws TypeError if the attribute exists
but is None; change the conversions to coerce None to 0 (e.g., input_tok =
int(getattr(usage_obj, "prompt_tokens", 0) or 0) and output_tok =
int(getattr(usage_obj, "completion_tokens", 0) or 0)) and make the same change
for the other occurrence referenced (the conversions at the second location
around lines with input_tok/output_tok in the later block) so None values are
safely treated as 0 before int().
- Around line 421-425: The code currently does a case-sensitive lookup raw =
headers.get("retry-after") which breaks HTTP semantics; update the lookup in
litellm_driver.py (around the getattr(exc, "headers", None) handling) to perform
a case-insensitive search (for example, by normalizing keys or iterating
headers.items() and matching k.lower() == "retry-after") and assign the found
value to raw; keep the existing isinstance(headers, dict) guard and ensure the
new lookup still returns None when no Retry-After header is present.
- Around line 265-266: In _map_response, avoid direct indexing of
response.choices[0]; instead retrieve choices via getattr(response, "choices",
[]) and check for emptiness—if empty, raise ProviderInternalError with a clear
message; otherwise use the first choice (e.g., choice = choices[0]) and continue
mapping as before. Ensure the change is made inside the _map_response method and
mirror the defensive pattern used in _process_chunk.
- Around line 186-189: The supports_streaming flag is hard-coded True; change it
to read the model info like the other capabilities (e.g., supports_streaming =
bool(info.get("supports_streaming", False))) and set
supports_streaming_tool_calls to the logical AND of streaming and
function-calling (e.g., bool(info.get("supports_function_calling", False)) and
supports_streaming) so non-streaming models aren't routed to streaming
endpoints; update the assignment locations where supports_streaming and
supports_streaming_tool_calls are set (the dict building that currently contains
supports_streaming=True and supports_streaming_tool_calls=...) to use these
extracted values.

In `@src/ai_company/providers/drivers/mappers.py`:
- Around line 73-79: The mapper currently returns payloads containing mutable
dicts by reference (e.g., using tool.parameters_schema in the function payload
and other dicts around lines 153-161); fix by returning defensive copies of any
mutable objects before including them in the returned dicts (use shallow or deep
copy as appropriate for nested structures) so callers cannot mutate the original
tool or schema objects — update the mapper return that constructs the "function"
payload (references: tool.name, tool.description, tool.parameters_schema) and
the other dict-producing mapper(s) around lines 153-161 to clone their dict/list
values before returning.
- Around line 83-103: The _FINISH_REASON_MAP in map_finish_reason is missing
Anthropic-specific keys so Anthropic finish reasons like "end_turn",
"stop_sequence", and "tool_use" currently fall back to FinishReason.ERROR;
update _FINISH_REASON_MAP to include "end_turn" -> FinishReason.STOP,
"stop_sequence" -> FinishReason.STOP, and "tool_use" -> FinishReason.TOOL_USE
(optionally normalize incoming reason with .lower() inside map_finish_reason
before lookup) so these provider-native values map correctly instead of
defaulting to ERROR.

In `@src/ai_company/providers/registry.py`:
- Around line 84-86: The __contains__ method currently does "name in
self._drivers" which raises TypeError for unhashable inputs; update the
Registry.__contains__ implementation to handle unhashable objects by performing
the membership test inside a try/except TypeError block (or using a safe lookup)
and return False when a TypeError occurs so unhashable probes (e.g., lists) do
not propagate exceptions; reference the __contains__ method and the
self._drivers attribute when making the change.
- Around line 174-175: The call to factory(name, config) can raise raw
exceptions which bypass the registry's driver error type; wrap the invocation of
factory inside a try/except in the registry code where driver = factory(name,
config) is executed, catch any Exception, and re-raise the registry's driver
error type (including contextual information: provider name and config) while
preserving the original exception as the cause; then proceed to the isinstance
check for BaseCompletionProvider as before.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e23d64 and 49710c6.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • pyproject.toml
  • src/ai_company/config/schema.py
  • src/ai_company/providers/__init__.py
  • src/ai_company/providers/drivers/__init__.py
  • src/ai_company/providers/drivers/litellm_driver.py
  • src/ai_company/providers/drivers/mappers.py
  • src/ai_company/providers/errors.py
  • src/ai_company/providers/registry.py
  • tests/unit/providers/drivers/__init__.py
  • tests/unit/providers/drivers/conftest.py
  • tests/unit/providers/drivers/test_litellm_driver.py
  • tests/unit/providers/drivers/test_mappers.py
  • tests/unit/providers/test_registry.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Agent
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Python 3.14+ with PEP 649 native lazy annotations
Do not use from __future__ import annotations — Python 3.14 has PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
Add type hints to all public functions, enforced by mypy strict mode
Use Google style docstrings on all public classes and functions, enforced by ruff D rules
Create new objects instead of mutating existing ones — enforce immutability
Use Pydantic v2 with BaseModel, model_validator, and ConfigDict
Keep line length to 88 characters, enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries: user input, external APIs, and config files

Files:

  • tests/unit/providers/drivers/test_litellm_driver.py
  • tests/unit/providers/drivers/test_mappers.py
  • src/ai_company/providers/registry.py
  • src/ai_company/providers/__init__.py
  • src/ai_company/config/schema.py
  • src/ai_company/providers/drivers/__init__.py
  • src/ai_company/providers/errors.py
  • src/ai_company/providers/drivers/litellm_driver.py
  • tests/unit/providers/test_registry.py
  • src/ai_company/providers/drivers/mappers.py
  • tests/unit/providers/drivers/conftest.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Use asyncio_mode = 'auto' in pytest — no manual @pytest.mark.asyncio needed
Set 30-second timeout per test

Files:

  • tests/unit/providers/drivers/test_litellm_driver.py
  • tests/unit/providers/drivers/test_mappers.py
  • tests/unit/providers/test_registry.py
  • tests/unit/providers/drivers/conftest.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain 80% minimum code coverage, enforced in CI

Files:

  • src/ai_company/providers/registry.py
  • src/ai_company/providers/__init__.py
  • src/ai_company/config/schema.py
  • src/ai_company/providers/drivers/__init__.py
  • src/ai_company/providers/errors.py
  • src/ai_company/providers/drivers/litellm_driver.py
  • src/ai_company/providers/drivers/mappers.py
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

Pin all dependency versions using == in pyproject.toml

Files:

  • pyproject.toml
🧠 Learnings (8)
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/test*.py : Agent tests should cover: successful generation with valid output, handling malformed LLM responses, error conditions (network errors, timeouts), output format validation, and integration with story state

Applied to files:

  • tests/unit/providers/drivers/test_litellm_driver.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/conftest.py : Place shared pytest fixtures in `tests/conftest.py`

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Use pytest fixtures for test setup. Shared fixtures should be in `tests/conftest.py`

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Mock Ollama API responses to support both dict (`models.get("models")`) and object (`response.models`) patterns in test mocks.

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Mock Ollama API calls in tests to avoid requiring a running Ollama instance

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to tests/**/*.py : Mock Ollama in tests to avoid requiring running instance - use model names from `RECOMMENDED_MODELS` (e.g., `huihui_ai/dolphin3-abliterated:8b`)

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-31T13:51:16.868Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-31T13:51:16.868Z
Learning: Applies to tests/**/*.py : Ollama API response mocks must support both dict pattern (`models.get('models')`) and object pattern (`response.models`) to match actual API behavior.

Applied to files:

  • tests/unit/providers/drivers/conftest.py
🧬 Code graph analysis (5)
tests/unit/providers/drivers/test_mappers.py (3)
src/ai_company/providers/drivers/mappers.py (4)
  • extract_tool_calls (106-134)
  • map_finish_reason (93-103)
  • messages_to_dicts (15-25)
  • tools_to_dicts (58-68)
src/ai_company/providers/enums.py (2)
  • FinishReason (15-22)
  • MessageRole (6-12)
src/ai_company/providers/models.py (4)
  • ChatMessage (114-186)
  • ToolCall (73-95)
  • ToolDefinition (45-70)
  • ToolResult (98-111)
src/ai_company/providers/registry.py (3)
src/ai_company/providers/base.py (1)
  • BaseCompletionProvider (26-273)
src/ai_company/providers/errors.py (2)
  • DriverFactoryNotFoundError (155-158)
  • DriverNotRegisteredError (143-146)
src/ai_company/config/schema.py (1)
  • ProviderConfig (52-95)
src/ai_company/providers/drivers/__init__.py (1)
src/ai_company/providers/drivers/litellm_driver.py (1)
  • LiteLLMDriver (86-447)
src/ai_company/providers/drivers/litellm_driver.py (6)
src/ai_company/providers/errors.py (4)
  • AuthenticationError (72-75)
  • RateLimitError (78-104)
  • ProviderError (22-69)
  • InvalidRequestError (113-116)
src/ai_company/providers/base.py (2)
  • BaseCompletionProvider (26-273)
  • compute_cost (182-238)
src/ai_company/providers/capabilities.py (1)
  • ModelCapabilities (10-83)
src/ai_company/providers/enums.py (1)
  • StreamEventType (25-32)
src/ai_company/providers/models.py (5)
  • CompletionResponse (233-282)
  • StreamChunk (285-363)
  • ToolCall (73-95)
  • ChatMessage (114-186)
  • ToolDefinition (45-70)
src/ai_company/config/schema.py (2)
  • ProviderConfig (52-95)
  • ProviderModelConfig (17-49)
tests/unit/providers/drivers/conftest.py (1)
src/ai_company/config/schema.py (2)
  • ProviderConfig (52-95)
  • ProviderModelConfig (17-49)
🔇 Additional comments (14)
pyproject.toml (4)

125-125: LGTM!

Removing the F403 (star imports) ignore tightens lint rules appropriately. Retaining F401 for __init__.py is correct since these files commonly re-export symbols.


149-152: LGTM!

The MyPy override to ignore missing imports for litellm.* is appropriate since the library lacks complete type stubs.


195-197: LGTM!

The list format for filterwarnings is valid and more extensible. Treating warnings as errors during tests ensures deprecation issues are caught early.


17-17: No changes needed—all dependencies are correctly pinned.

The litellm==1.82.0 dependency is properly pinned with == and version 1.82.0 exists on PyPI. All other dependencies in pyproject.toml (lines 15-21 and 34-52) are also pinned with == per coding guidelines.

src/ai_company/config/schema.py (1)

64-67: Good default driver wiring.

This adds a typed, immutable config selector with a safe default and keeps existing configs working.

src/ai_company/providers/drivers/__init__.py (1)

7-9: Clean public export surface.

Explicitly exporting LiteLLMDriver here keeps driver imports stable and discoverable.

src/ai_company/providers/errors.py (1)

143-158: Error taxonomy extension looks solid.

The new driver-registry errors are specific and keep retry semantics explicit.

tests/unit/providers/test_registry.py (1)

130-202: Nice coverage for construction and immutability paths.

The from_config and source-dict mutation cases are especially valuable for guarding registry behavior.

tests/unit/providers/drivers/test_mappers.py (1)

22-278: Mapper tests are thorough and well-structured.

Good balance of nominal and edge-case coverage, especially for tool-call parsing variants.

src/ai_company/providers/__init__.py (1)

9-59: Public API exports are coherent with the new driver architecture.

LiteLLMDriver, ProviderRegistry, and driver-related errors are surfaced cleanly.

tests/unit/providers/drivers/test_litellm_driver.py (2)

338-440: Exception-path coverage is excellent.

The mapped-exception and stream-iteration failure tests are strong and directly exercise resilience behavior.


80-505: No action needed — 30-second timeout is already configured globally.

The pyproject.toml file already sets timeout = 30 in [tool.pytest.ini_options] (line 187), and pytest-timeout is installed as a test dependency. This global configuration automatically applies to all tests in the repository, including the async tests in this file, preventing hung streams from stalling CI.

src/ai_company/providers/drivers/mappers.py (1)

158-159: PEP 758 syntax is correctly applied for Python 3.14+.

Line 158 uses the PEP 758 except A, B: syntax, which is valid since the project requires Python 3.14+ (pinned in pyproject.toml), ruff targets py314, and CI runs on Python 3.14.
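As a minimal illustration of the PEP 758 point (the function below is hypothetical, not code from this PR), the two spellings catch exactly the same exceptions; the parenthesized form works on any Python 3, while the bare form requires 3.14+:

```python
def parse_port(raw):
    """Return ``raw`` as an int port, or None when it cannot be parsed."""
    try:
        return int(raw)
    # Portable spelling; on Python 3.14+ PEP 758 also permits the
    # unparenthesized form: `except ValueError, TypeError:`
    except (ValueError, TypeError):
        return None
```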

tests/unit/providers/drivers/conftest.py (1)

16-176: Fixture utilities look solid.

The helpers are deterministic and keep driver tests isolated from real provider/network behavior.

…pilot

Source fixes:
- Add collision detection in _build_model_lookup for alias/ID conflicts
- Defensive check for empty choices in _map_response
- Wrap response mapping in try/except to keep ProviderError hierarchy
- Read supports_streaming from model info instead of hard-coding True
- Case-insensitive retry-after header lookup per HTTP semantics
- None-safe int conversion for usage token counts (or 0 pattern)
- Fix streaming usage drop when prompt_tokens is zero
- Replace bare except Exception with targeted catches + logging
- Add warning logging for silent JSON parse failures in tool calls
- Add warning logging for dropped/incomplete tool calls
- Add warning logging for unknown finish reasons and skipped items
- Add Anthropic-specific finish reason keys (end_turn, stop_sequence, tool_use)
- Deep copy parameters_schema in tool mapper for immutability
- Handle unhashable inputs in ProviderRegistry.__contains__
- Wrap factory call in _build_driver to catch construction errors
- Document DriverAlreadyRegisteredError as reserved for future use
- Remove unused mock_acompletion fixture
- Multiple docstring improvements across driver, mappers, registry

New tests (18 added, 253 total):
- Stream exception before iteration
- Response mapping error wrapped as ProviderError
- Incomplete tool call accumulator dropped
- Multiple concurrent streaming tool calls
- Usage-only chunk with empty choices
- Usage emitted when prompt_tokens is zero
- Case-insensitive retry-after header
- No headers / non-numeric retry-after edge cases
- supports_streaming from model info
- supports_streaming_tool_calls requires both
- Non-callable factory, non-provider return, factory exception
- Unhashable __contains__ returns False
- Anthropic finish reasons (end_turn, stop_sequence, tool_use)
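The Anthropic-specific finish-reason keys listed above amount to extra entries in a normalization table. A rough sketch of the idea (the enum members and the STOP fallback are illustrative assumptions, not the repository's actual mapper):

```python
import logging
from enum import Enum

logger = logging.getLogger(__name__)


class FinishReason(Enum):
    """Illustrative unified finish reasons (not the repo's actual enum)."""

    STOP = "stop"
    LENGTH = "length"
    TOOL_CALLS = "tool_calls"
    CONTENT_FILTER = "content_filter"


# Provider-specific raw strings normalized to one enum, including
# Anthropic's end_turn / stop_sequence / tool_use variants.
_FINISH_REASONS = {
    "stop": FinishReason.STOP,
    "length": FinishReason.LENGTH,
    "tool_calls": FinishReason.TOOL_CALLS,
    "content_filter": FinishReason.CONTENT_FILTER,
    "end_turn": FinishReason.STOP,
    "stop_sequence": FinishReason.STOP,
    "tool_use": FinishReason.TOOL_CALLS,
}


def map_finish_reason(raw: str) -> FinishReason:
    """Map a provider finish-reason string; warn and default on unknowns."""
    reason = _FINISH_REASONS.get(raw)
    if reason is None:
        logger.warning("Unknown finish reason %r, defaulting to STOP", raw)
        return FinishReason.STOP
    return reason
```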
@Aureliolo Aureliolo requested a review from Copilot March 1, 2026 11:34
@Aureliolo
Owner Author

/gemini review


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (2)
src/ai_company/providers/drivers/mappers.py (2)

194-196: 🧹 Nitpick | 🔵 Trivial

Return a defensive copy for parsed dict arguments.

Same concern as above—the parsed dict is returned directly without copying.

♻️ Proposed fix
         if isinstance(parsed, dict):
-            return parsed
+            return dict(parsed)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/providers/drivers/mappers.py` around lines 194 - 196, The code
returns the input dict object directly from the branch checking "if
isinstance(parsed, dict): return parsed", which can lead to callers mutating
internal state; change that to return a defensive shallow copy (e.g., return
parsed.copy() or dict(parsed)) instead, updating the return in the function in
mappers.py that uses the "parsed" variable so callers receive a copy rather than
the original dict.

183-184: 🧹 Nitpick | 🔵 Trivial

Return a defensive copy for dict arguments to enforce immutability.

When raw is already a dict, returning it directly allows callers to mutate the original. This is inconsistent with the deep copy applied to parameters_schema at line 82.

♻️ Proposed fix
     if isinstance(raw, dict):
-        return raw
+        return dict(raw)

As per coding guidelines "Create new objects instead of mutating existing ones — enforce immutability".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/providers/drivers/mappers.py` around lines 183 - 184, Replace
the direct return of the dict `raw` with a defensive deep copy to enforce
immutability (consistent with the `parameters_schema` deep copy at line 82);
specifically, where `if isinstance(raw, dict): return raw` appears in
mappers.py, return a deep copy of `raw` instead (use copy.deepcopy(raw)) and add
the necessary `import copy` if not already present so callers cannot mutate the
original dict.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/ai_company/providers/drivers/mappers.py`:
- Around line 194-196: The code returns the input dict object directly from the
branch checking "if isinstance(parsed, dict): return parsed", which can lead to
callers mutating internal state; change that to return a defensive shallow copy
(e.g., return parsed.copy() or dict(parsed)) instead, updating the return in the
function in mappers.py that uses the "parsed" variable so callers receive a copy
rather than the original dict.
- Around line 183-184: Replace the direct return of the dict `raw` with a
defensive deep copy to enforce immutability (consistent with the
`parameters_schema` deep copy at line 82); specifically, where `if
isinstance(raw, dict): return raw` appears in mappers.py, return a deep copy of
`raw` instead (use copy.deepcopy(raw)) and add the necessary `import copy` if
not already present so callers cannot mutate the original dict.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 49710c6 and 63f0774.

📒 Files selected for processing (8)
  • src/ai_company/providers/drivers/litellm_driver.py
  • src/ai_company/providers/drivers/mappers.py
  • src/ai_company/providers/errors.py
  • src/ai_company/providers/registry.py
  • tests/unit/providers/drivers/conftest.py
  • tests/unit/providers/drivers/test_litellm_driver.py
  • tests/unit/providers/drivers/test_mappers.py
  • tests/unit/providers/test_registry.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Agent
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Python 3.14+ with PEP 649 native lazy annotations
Do not use from __future__ import annotations — Python 3.14 has PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
Add type hints to all public functions, enforced by mypy strict mode
Use Google style docstrings on all public classes and functions, enforced by ruff D rules
Create new objects instead of mutating existing ones — enforce immutability
Use Pydantic v2 with BaseModel, model_validator, and ConfigDict
Keep line length to 88 characters, enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries: user input, external APIs, and config files

Files:

  • src/ai_company/providers/drivers/mappers.py
  • tests/unit/providers/test_registry.py
  • tests/unit/providers/drivers/test_mappers.py
  • src/ai_company/providers/errors.py
  • src/ai_company/providers/drivers/litellm_driver.py
  • tests/unit/providers/drivers/test_litellm_driver.py
  • src/ai_company/providers/registry.py
  • tests/unit/providers/drivers/conftest.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain 80% minimum code coverage, enforced in CI

Files:

  • src/ai_company/providers/drivers/mappers.py
  • src/ai_company/providers/errors.py
  • src/ai_company/providers/drivers/litellm_driver.py
  • src/ai_company/providers/registry.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Use asyncio_mode = 'auto' in pytest — no manual @pytest.mark.asyncio needed
Set 30-second timeout per test

Files:

  • tests/unit/providers/test_registry.py
  • tests/unit/providers/drivers/test_mappers.py
  • tests/unit/providers/drivers/test_litellm_driver.py
  • tests/unit/providers/drivers/conftest.py
🧠 Learnings (12)
📚 Learning: 2026-03-01T10:09:25.209Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T10:09:25.209Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/ai_company/providers/drivers/mappers.py
  • src/ai_company/providers/drivers/litellm_driver.py
📚 Learning: 2026-03-01T10:09:25.209Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T10:09:25.209Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — ruff enforces this on Python 3.14

Applied to files:

  • src/ai_company/providers/drivers/mappers.py
  • src/ai_company/providers/drivers/litellm_driver.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Tests must use fake model names (e.g., `test-model:8b`, `fake-writer:latest`)—never use real model IDs from `RECOMMENDED_MODELS`.

Applied to files:

  • src/ai_company/providers/drivers/litellm_driver.py
📚 Learning: 2026-01-31T13:51:16.868Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-31T13:51:16.868Z
Learning: Applies to tests/**/*.py : Mock models in tests must use a name from `RECOMMENDED_MODELS` (e.g., `huihui_ai/dolphin3-abliterated:8b`) - fake model names cause `ValueError: No model tagged for role`.

Applied to files:

  • src/ai_company/providers/drivers/litellm_driver.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/test*.py : Agent tests should cover: successful generation with valid output, handling malformed LLM responses, error conditions (network errors, timeouts), output format validation, and integration with story state

Applied to files:

  • tests/unit/providers/drivers/test_litellm_driver.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/conftest.py : Place shared pytest fixtures in `tests/conftest.py`

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Use pytest fixtures for test setup. Shared fixtures should be in `tests/conftest.py`

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use appropriate fixture scopes (`function`, `class`, `module`, `session`) and document complex fixtures with docstrings

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Mock Ollama API responses to support both dict (`models.get("models")`) and object (`response.models`) patterns in test mocks.

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Mock Ollama API calls in tests to avoid requiring a running Ollama instance

Applied to files:

  • tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to tests/**/*.py : Mock Ollama in tests to avoid requiring running instance - use model names from `RECOMMENDED_MODELS` (e.g., `huihui_ai/dolphin3-abliterated:8b`)

Applied to files:

  • tests/unit/providers/drivers/conftest.py
🧬 Code graph analysis (5)
tests/unit/providers/test_registry.py (3)
src/ai_company/config/schema.py (2)
  • ProviderConfig (52-95)
  • ProviderModelConfig (17-49)
src/ai_company/providers/errors.py (2)
  • DriverFactoryNotFoundError (159-162)
  • DriverNotRegisteredError (143-146)
src/ai_company/providers/registry.py (4)
  • ProviderRegistry (21-139)
  • get (56-78)
  • list_providers (80-82)
  • from_config (96-139)
tests/unit/providers/drivers/test_mappers.py (2)
src/ai_company/providers/enums.py (2)
  • FinishReason (15-22)
  • MessageRole (6-12)
src/ai_company/providers/models.py (4)
  • ChatMessage (114-186)
  • ToolCall (73-95)
  • ToolDefinition (45-70)
  • ToolResult (98-111)
tests/unit/providers/drivers/test_litellm_driver.py (4)
src/ai_company/config/schema.py (2)
  • ProviderConfig (52-95)
  • ProviderModelConfig (17-49)
src/ai_company/providers/drivers/litellm_driver.py (1)
  • LiteLLMDriver (89-514)
src/ai_company/providers/enums.py (3)
  • FinishReason (15-22)
  • MessageRole (6-12)
  • StreamEventType (25-32)
src/ai_company/providers/errors.py (8)
  • AuthenticationError (72-75)
  • ContentFilterError (119-122)
  • InvalidRequestError (113-116)
  • ModelNotFoundError (107-110)
  • ProviderError (22-69)
  • ProviderInternalError (137-140)
  • ProviderTimeoutError (125-128)
  • RateLimitError (78-104)
src/ai_company/providers/registry.py (3)
src/ai_company/providers/base.py (1)
  • BaseCompletionProvider (26-273)
src/ai_company/providers/errors.py (2)
  • DriverFactoryNotFoundError (159-162)
  • DriverNotRegisteredError (143-146)
src/ai_company/config/schema.py (1)
  • ProviderConfig (52-95)
tests/unit/providers/drivers/conftest.py (1)
src/ai_company/config/schema.py (2)
  • ProviderConfig (52-95)
  • ProviderModelConfig (17-49)
🔇 Additional comments (13)
src/ai_company/providers/errors.py (1)

141-162: LGTM!

The new driver error classes follow the established pattern, have appropriate docstrings, and correctly set is_retryable = False for configuration-time errors.
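The pattern the comment describes, configuration-time errors that opt out of retries via a class attribute, can be sketched roughly like this (class names match the review, but the bodies and base class are simplified assumptions):

```python
class ProviderError(Exception):
    """Base of the provider error hierarchy (sketch, not the repo's classes)."""

    is_retryable: bool = False


class DriverNotRegisteredError(ProviderError):
    """Config references a provider name with no registered driver."""

    # Configuration-time errors should fail fast rather than be retried.
    is_retryable = False


class DriverFactoryNotFoundError(ProviderError):
    """Config names a driver for which no factory is known."""

    is_retryable = False
```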

src/ai_company/providers/drivers/mappers.py (1)

1-124: LGTM!

The mapping module is well-structured with:

  • Proper type hints and Google-style docstrings
  • Comprehensive finish reason mapping including Anthropic-specific values
  • Defensive deep copy for parameters_schema
  • Appropriate logging for unknown finish reasons
tests/unit/providers/drivers/conftest.py (1)

1-170: LGTM!

The conftest module provides well-designed test utilities:

  • Reusable mock factories with sensible defaults
  • Support for both dict and attribute-access patterns (per learnings about Ollama API mocks)
  • Configurable fixtures for various test scenarios
  • Proper async generator for streaming tests
tests/unit/providers/test_registry.py (1)

1-249: LGTM!

Comprehensive test coverage for the ProviderRegistry including:

  • Core operations (get, list, contains, len)
  • Factory configuration with overrides
  • Error cases (unknown driver, non-callable factory, non-provider return, factory exceptions)
  • Immutability verification
  • Default LiteLLM driver resolution

All tests properly marked with @pytest.mark.unit.

tests/unit/providers/drivers/test_mappers.py (1)

1-281: LGTM!

Thorough test coverage for mapper functions including:

  • Message conversion for all role types
  • Tool definition conversion
  • Finish reason mapping with Anthropic-specific values
  • Tool call extraction with edge cases (None, empty, invalid JSON, missing fields)

All tests properly marked with @pytest.mark.unit.

src/ai_company/providers/registry.py (1)

1-196: LGTM!

Well-designed registry implementation:

  • Immutable driver mapping using MappingProxyType
  • Defensive copy on construction
  • Robust __contains__ handling unhashable inputs
  • Comprehensive factory validation (callable check, type check, exception wrapping)
  • Clear error messages listing available providers/drivers
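The properties listed above can be sketched in a few lines (a simplified illustration, not the actual registry; error types and method names are assumptions):

```python
from types import MappingProxyType


class ProviderRegistry:
    """Minimal sketch of an immutable provider-name -> driver registry."""

    def __init__(self, drivers: dict):
        # Defensive copy, then a read-only view: later mutation of the
        # source dict cannot affect the registry.
        self._drivers = MappingProxyType(dict(drivers))

    def get(self, name: str):
        try:
            return self._drivers[name]
        except KeyError:
            available = ", ".join(sorted(self._drivers)) or "<none>"
            raise KeyError(
                f"Provider {name!r} not registered (available: {available})"
            ) from None

    def __contains__(self, name: object) -> bool:
        try:
            return name in self._drivers
        except TypeError:  # unhashable lookup keys are simply "not present"
            return False

    def __len__(self) -> int:
        return len(self._drivers)
```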
tests/unit/providers/drivers/test_litellm_driver.py (1)

1-759: LGTM!

Excellent test coverage for LiteLLMDriver including:

  • Non-streaming completion with various configurations
  • Streaming with tool call deltas and usage tracking
  • Exception mapping for all LiteLLM exception types
  • Model capabilities with fallbacks
  • Edge cases (empty choices, zero prompt_tokens, case-insensitive headers)

All tests properly marked with @pytest.mark.unit and use mocked LiteLLM calls.

src/ai_company/providers/drivers/litellm_driver.py (6)

1-106: LGTM!

The driver module is well-structured with:

  • Clean separation of concerns (hooks, mapping, streaming, exception handling)
  • Comprehensive LiteLLM exception mapping table
  • Proper TYPE_CHECKING pattern for type-only imports

219-261: LGTM!

Model resolution properly validates:

  • Duplicate model IDs
  • Alias collisions with existing keys
  • Clear error messages with conflicting identifiers
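The alias-collision validation described above can be sketched as a single lookup-building pass (a hypothetical helper operating on plain dicts; the PR's `_build_model_lookup` may differ in shape):

```python
def build_model_lookup(models: list[dict]) -> dict:
    """Index models by ID and alias, rejecting duplicate identifiers."""
    lookup: dict = {}
    for model in models:
        # Every identifier (canonical ID plus aliases) must be unique.
        for key in (model["id"], *model.get("aliases", [])):
            if key in lookup:
                raise ValueError(f"Conflicting model identifier {key!r}")
            lookup[key] = model
    return lookup
```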

293-337: LGTM!

Response mapping implements defensive patterns:

  • Guard against empty choices with descriptive error
  • Null-safe token extraction with or 0 coercion
  • Proper use of getattr for attribute access

370-431: LGTM!

Streaming implementation handles edge cases:

  • Usage-only chunks with empty choices
  • Usage emission regardless of prompt_tokens value
  • Null-safe token conversion
  • Proper delta accumulation for tool calls

464-485: LGTM!

Retry-after extraction properly implements:

  • Case-insensitive header lookup per HTTP semantics
  • Graceful handling of non-numeric values
  • PEP 758 compliant exception syntax

573-622: LGTM!

The _ToolCallAccumulator class properly:

  • Accumulates streaming deltas incrementally
  • Handles incomplete tool calls with appropriate logging
  • Gracefully handles JSON parse failures
  • Uses PEP 758 compliant exception syntax

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 14 changed files in this pull request and generated 4 comments.



Comment on lines +186 to +193
try:
parsed = json.loads(raw)
except json.JSONDecodeError, ValueError:
_logger.warning(
"Failed to parse tool call arguments: %r",
raw[:200],
)
return {}

Copilot AI Mar 1, 2026


Invalid exception syntax: except json.JSONDecodeError, ValueError: is not valid in Python 3. Use except (json.JSONDecodeError, ValueError): (or just json.JSONDecodeError) so this file parses and the fallback {} path works.

Comment on lines +478 to +485
try:
return float(raw)
except ValueError, TypeError:
_logger.debug(
"Could not parse retry-after header as seconds: %r",
raw,
)
return None

Copilot AI Mar 1, 2026


Invalid exception syntax: except ValueError, TypeError: is not valid in Python 3. Wrap the exception types in parentheses (e.g., except (ValueError, TypeError):) so retry-after parsing doesn’t cause a SyntaxError.

Comment on lines +498 to +506
try:
raw = _litellm.get_model_info(model=litellm_model)
info: dict[str, Any] = dict(raw) if raw else {}
except KeyError, ValueError:
_logger.info(
"No LiteLLM metadata for model %r, using config defaults",
litellm_model,
)
return {}

Copilot AI Mar 1, 2026


Invalid exception syntax: except KeyError, ValueError: is not valid in Python 3. Use except (KeyError, ValueError): so model-info fallback works and the module can be imported.

Comment on lines +611 to +620
try:
parsed = json.loads(self.arguments) if self.arguments else {}
except json.JSONDecodeError, ValueError:
_logger.warning(
"Failed to parse tool call arguments for tool %r (id=%r): %r",
self.name,
self.id,
self.arguments[:200] if self.arguments else "",
)
parsed = {}

Copilot AI Mar 1, 2026


Invalid exception syntax: except json.JSONDecodeError, ValueError: is not valid in Python 3. Use except (json.JSONDecodeError, ValueError): (or just json.JSONDecodeError) so streamed tool-call argument parsing doesn’t break module import.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a well-designed and extensible driver system for LLM providers, with LiteLLM as the default backend. The new ProviderRegistry provides a clean, immutable way to manage provider drivers, and the pure mapping functions in mappers.py create a strong separation of concerns. The exception handling is particularly robust, mapping a wide range of provider-specific errors to a unified hierarchy, and the code is accompanied by a comprehensive suite of unit tests.

However, potential security issues have been identified, specifically the leakage of sensitive information in error logs and a resource-exhaustion vulnerability in the streaming tool-call accumulation logic. Additionally, there are critical syntax errors in the exception handling blocks, which use Python 2-style syntax (e.g., except A, B:) that is invalid in Python 3 and will cause a SyntaxError. Addressing these concerns will improve the production readiness and security posture of the provider layer.

return None
try:
return float(raw)
except ValueError, TypeError:

critical

This except syntax is for Python 2. For Python 3, multiple exceptions must be caught as a tuple. This will raise a SyntaxError at runtime.

        except (ValueError, TypeError):

try:
raw = _litellm.get_model_info(model=litellm_model)
info: dict[str, Any] = dict(raw) if raw else {}
except KeyError, ValueError:

critical

This except syntax is from Python 2. In Python 3, multiple exception types must be caught as a parenthesized tuple; as written, this raises a SyntaxError as soon as the module is imported, not at runtime.

        except (KeyError, ValueError):

return None
try:
parsed = json.loads(self.arguments) if self.arguments else {}
except json.JSONDecodeError, ValueError:
critical

This except syntax is from Python 2. In Python 3, multiple exception types must be caught as a parenthesized tuple; as written, this raises a SyntaxError as soon as the module is imported, not at runtime.

Suggested change
except json.JSONDecodeError, ValueError:
except (json.JSONDecodeError, ValueError):

if isinstance(raw, str):
try:
parsed = json.loads(raw)
except json.JSONDecodeError, ValueError:
critical

This except syntax is from Python 2. In Python 3, multiple exception types must be caught as a parenthesized tuple; as written, this raises a SyntaxError as soon as the module is imported, not at runtime.

Suggested change
except json.JSONDecodeError, ValueError:
except (json.JSONDecodeError, ValueError):

retry_after=self._extract_retry_after(exc),
context=ctx,
)
return our_type(str(exc), context=ctx)
security-medium

Raw exception strings from LiteLLM are included directly in the error message. These strings can contain sensitive information like API keys (e.g., in authentication errors). Since these messages are often logged, this can lead to secret leakage. It is safer to use a generic message and put the exception details in the context dictionary, which handles redaction.

return exc

return errors.ProviderInternalError(
f"Unexpected error from {self._provider_name}: {exc}",
security-medium

Including the raw exception exc in the error message can leak sensitive information if the exception string contains secrets. Consider using a generic message and moving the exception details to the context dictionary.

self.name = str(name)
args = getattr(func, "arguments", None)
if args:
self.arguments += str(args)
security-medium

Tool call arguments are accumulated here without any length limit. A malicious LLM provider or a prompt injection attack could send an infinite stream of deltas, leading to memory exhaustion and a Denial of Service (DoS). Please implement a maximum length check for self.arguments.
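The follow-up commit adds a 1 MiB cap for exactly this reason; a standalone sketch of that guard (the class name and error type are illustrative):

```python
MAX_TOOL_ARGS_LEN = 1 << 20  # 1 MiB cap on accumulated argument deltas


class ToolCallAccumulator:
    """Accumulates streamed tool-call argument deltas with a hard size limit."""

    def __init__(self) -> None:
        self.arguments = ""

    def add_delta(self, delta: str) -> None:
        # Reject before concatenating, so a hostile stream cannot grow
        # self.arguments without bound and exhaust memory.
        if len(self.arguments) + len(delta) > MAX_TOOL_ARGS_LEN:
            raise ValueError("tool-call arguments exceed 1 MiB limit")
        self.arguments += delta


acc = ToolCallAccumulator()
acc.add_delta('{"query": ')
acc.add_delta('"status"}')
print(acc.arguments)  # {"query": "status"}
```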

except Exception as exc:
msg = (
f"Failed to instantiate driver {driver_type!r} for provider {name!r}: {exc}"
)
security-medium

The raw exception exc is included in the error message, which may leak sensitive configuration data if the exception string contains secrets. Use a generic error message instead.

…copies

- Move raw exception strings from error messages to context dicts to prevent
  potential API key leakage in logs (Gemini security-medium)
- Add 1 MiB length limit to _ToolCallAccumulator to prevent DoS via infinite
  streaming deltas (Gemini security-medium)
- Return defensive copies from _parse_arguments for immutability consistency
  (CodeRabbit nitpick)
- Update registry factory error to use generic message with detail in context
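The defensive-copy change from that list can be illustrated with a cached-parse sketch — the class and method names are hypothetical, chosen to mirror the described _parse_arguments behavior:

```python
import copy
import json


class ToolCall:
    """Parses JSON arguments once, then hands out defensive copies."""

    def __init__(self, arguments: str) -> None:
        self.arguments = arguments
        self._parsed: dict | None = None

    def parse_arguments(self) -> dict:
        if self._parsed is None:
            try:
                raw = json.loads(self.arguments) if self.arguments else {}
            except (json.JSONDecodeError, ValueError):
                raw = {}
            self._parsed = raw if isinstance(raw, dict) else {}
        # Deep copy so callers cannot mutate the cached parse result.
        return copy.deepcopy(self._parsed)


tc = ToolCall('{"ids": [1, 2]}')
tc.parse_arguments()["ids"].append(3)  # mutates only the returned copy
print(tc.parse_arguments())            # {'ids': [1, 2]}
```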
@Aureliolo Aureliolo merged commit ae3f18b into main Mar 1, 2026
7 of 8 checks passed
@Aureliolo Aureliolo deleted the feat/litellm-driver branch March 1, 2026 12:07
@coderabbitai coderabbitai bot mentioned this pull request Mar 1, 2026
4 tasks
Aureliolo added a commit that referenced this pull request Mar 1, 2026
## Summary

- Adds **39 integration tests** for the provider adapter layer,
completing the final unchecked acceptance criterion from #5:
_"Integration tests with mock/recorded API responses"_
- All source code was already implemented in PRs #86 and #88 — this PR
covers only the integration test suite
- Mocks at `litellm.acompletion` level using **real
`litellm.ModelResponse`** objects (not MagicMock), exercising actual
attribute access paths through `_map_response`, `_process_chunk`, and
`extract_tool_calls`

### Test files

| File | Tests | Coverage |
|------|-------|----------|
| `test_anthropic_pipeline.py` | 13 | Config→registry→complete/stream, alias resolution, cost computation, streaming |
| `test_openrouter_pipeline.py` | 5 | Custom base_url forwarding, model prefixing, multi-model alias resolution |
| `test_ollama_pipeline.py` | 4 | No api_key, localhost base_url, zero-cost models |
| `test_error_scenarios.py` | 9 | Rate limit (429 + retry-after), auth (401), timeout, connection, internal, unknown |
| `test_tool_calling_pipeline.py` | 8 | Single/multiple tool calls, streaming accumulation, mixed text+tools, multi-turn |
| `conftest.py` | — | Config factories, real ModelResponse builders, stream helpers |
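Mocking at the acompletion boundary can be sketched with an injected async callable. This simplified wrapper and the dict-shaped response are assumptions for illustration — the real suite builds actual litellm.ModelResponse objects rather than plain dicts:

```python
import asyncio
from unittest.mock import AsyncMock


async def complete(acompletion, model: str, messages: list[dict]) -> str:
    """Driver-style wrapper: awaits the injected completion callable."""
    resp = await acompletion(model=model, messages=messages)
    return resp["choices"][0]["message"]["content"]


mock = AsyncMock(return_value={"choices": [{"message": {"content": "pong"}}]})
out = asyncio.run(
    complete(mock, "claude-test", [{"role": "user", "content": "ping"}])
)
print(out)  # pong
mock.assert_awaited_once_with(
    model="claude-test", messages=[{"role": "user", "content": "ping"}]
)
```

Because the mock sits at the same seam the drivers call through, the test still exercises the real attribute-access paths in the code under test.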

### Verification

- `ruff check` — all passed
- `ruff format` — all formatted
- `mypy` — 0 errors (7 files)
- `pytest` — 1331 total tests pass, **94.49% coverage** (80% required)

Closes #5

## Test plan

- [ ] CI passes (lint + type-check + test + coverage)
- [ ] 39 integration tests pass under `pytest -m integration`
- [ ] No regressions in existing 1292 unit tests
- [ ] Coverage remains above 80% threshold
Aureliolo added a commit that referenced this pull request Mar 10, 2026
🤖 I have created a release *beep* *boop*
---


## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10)


### Features

* add autonomy levels and approval timeout policies
([#42](#42),
[#126](#126))
([#197](#197))
([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and
approval decisions
([#186](#186))
([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot)
([#63](#63))
([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking
([#67](#67))
([#185](#185))
([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline
([#269](#269))
([435bdfe](435bdfe)),
closes [#267](#267)
* add coordination error taxonomy classification pipeline
([#146](#146))
([#181](#181))
([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies
([#175](#175))
([ce924fa](ce924fa)),
closes [#173](#173)
* add design specification, license, and project setup
([8669a09](8669a09))
* add env var substitution and config file auto-discovery
([#77](#77))
([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup
([#140](#140))
([09619cb](09619cb)),
closes [#139](#139)
* add HR engine and performance tracking
([#45](#45),
[#47](#47))
([#193](#193))
([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill
([#119](#119))
([deecc39](deecc39))
* add memory retrieval, ranking, and context injection pipeline
([#41](#41))
([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events
([#180](#180))
([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events
([#32](#32))
([46cfdd4](46cfdd4))
* add pluggable PersistenceBackend protocol with SQLite implementation
([#36](#36))
([f753779](f753779))
* add progressive trust and promotion/demotion subsystems
([#43](#43),
[#49](#49))
([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience
([#100](#100))
([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker
integration ([#40](#40))
([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival
([#125](#125),
[#48](#48))
([4a0832b](4a0832b))
* design unified provider interface
([#86](#86))
([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance
([#80](#80),
[#81](#81),
[#84](#84))
([15a9134](15a9134))
* implement agent runtime state vs immutable config split
([#115](#115))
([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator
([#11](#11))
([#143](#143))
([f2eb73a](f2eb73a))
* implement basic tool system (registry, invocation, results)
([#15](#15))
([c51068b](c51068b))
* implement built-in file system tools
([#18](#18))
([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and
messenger ([#157](#157))
([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets
([#85](#85))
([cbf1496](cbf1496))
* implement conflict resolution protocol
([#122](#122))
([#166](#166))
([e03f9f2](e03f9f2))
* implement core entity and role system models
([#69](#69))
([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy
([#149](#149))
([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call
categorization
([#134](#134),
[#135](#135))
([#159](#159))
([9b2699f](9b2699f))
* implement enterprise logging system with structlog
([#73](#73))
([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy
([#130](#130))
([6592515](6592515))
* implement hierarchical delegation and loop prevention
([#12](#12),
[#17](#17))
([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry
([#88](#88))
([ae3f18b](ae3f18b)),
closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation
([#174](#174))
([aa0eefe](aa0eefe))
* implement meeting protocol system
([#123](#123))
([ee7caca](ee7caca))
* implement message and communication domain models
([#74](#74))
([560a5d2](560a5d2))
* implement model routing engine
([#99](#99))
([d3c250b](d3c250b))
* implement parallel agent execution
([#22](#22))
([#161](#161))
([65940b3](65940b3))
* implement per-call cost tracking service
([#7](#7))
([#102](#102))
([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction
([#105](#105))
([934dd85](934dd85))
* implement single-task execution lifecycle
([#21](#21))
([#144](#144))
([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation
([#131](#131))
([#153](#153))
([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies
([#172](#172))
([c7f1b26](c7f1b26)),
closes [#26](#26)
[#30](#30)
* implement task decomposition and routing engine
([#14](#14))
([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models
([#71](#71))
([81eabf1](81eabf1))
* implement tool permission checking
([#16](#16))
([833c190](833c190))
* implement YAML config loader with Pydantic validation
([#59](#59))
([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation
([#75](#75))
([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout
([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout
([#62](#62))
([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6)
([#189](#189))
([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field
([#118](#118))
([c0bab18](c0bab18)),
closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all
([#137](#137))
([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes
([#64](#64))
([f581749](f581749))
* wire all modules into observability system
([#97](#97))
([f7a0617](f7a0617))


### Bug Fixes

* address Greptile post-merge review findings from PRs
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175)
([#176](#176))
([c5ca929](c5ca929))
* address post-merge review feedback from PRs
[#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167)
([#170](#170))
([3bf897a](3bf897a)),
closes [#169](#169)
* enforce strict mypy on test files
([#89](#89))
([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner
([#50](#50),
[#53](#53))
([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements
([#150](#150))
([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience
([#155](#155))
([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes
([#164](#164))
([c02832a](c02832a))
* pre-PR review fixes for post-merge findings
([#183](#183))
([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries
([#117](#117))
([7e5e861](7e5e861))


### Performance

* harden non-inferable principle implementation
([#195](#195))
([02b5f4e](02b5f4e)),
closes [#188](#188)


### Refactoring

* adopt NotBlankStr across all models
([#108](#108))
([#120](#120))
([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models
([#111](#111))
([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and
review fixes
([#182](#182))
([c107bf9](c107bf9))
* harden personality profiles, department validation, and template
rendering ([#158](#158))
([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop
([#124](#124))
([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules
([#136](#136))
([e9cba89](e9cba89))


### Documentation

* add ADR-001 memory layer evaluation and selection
([#178](#178))
([db3026f](db3026f)),
closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC
([#145](#145))
([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation
([#65](#65))
([55c1025](55c1025)),
closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions
([#127](#127))
([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols
([#128](#128))
([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols
([#121](#121))
([6832db6](6832db6))
* finalize 23 design decisions (ADR-002)
([#190](#190))
([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency
review agent
([#114](#114))
([99766ee](99766ee))


### Tests

* add e2e single agent integration tests
([#24](#24))
([#156](#156))
([f566fb4](f566fb4))
* add provider adapter integration tests
([#90](#90))
([40a61f4](40a61f4))


### CI/CD

* add Release Please for automated versioning and GitHub Releases
([#278](#278))
([a488758](a488758))
* bump actions/checkout from 4 to 6
([#95](#95))
([1897247](1897247))
* bump actions/upload-artifact from 4 to 7
([#94](#94))
([27b1517](27b1517))
* harden CI/CD pipeline
([#92](#92))
([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers
([#277](#277))
([aba48af](aba48af))


### Maintenance

* add /worktree skill for parallel worktree management
([#171](#171))
([951e337](951e337))
* add design spec context loading to research-link skill
([8ef9685](8ef9685))
* add post-merge-cleanup skill
([#70](#70))
([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md
([#103](#103))
([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md
([#101](#101))
([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1
([#191](#191))
([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group
([#96](#96))
([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5
([a49ee46](a49ee46))
* fix M0 audit items
([#66](#66))
([c7724b5](c7724b5))
* pin setup-uv action to full SHA
([#281](#281))
([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests,
hookify rules
([#148](#148))
([c57a6a9](c57a6a9))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Aureliolo added a commit that referenced this pull request Mar 11, 2026
🤖 I have created a release *beep* *boop*
---


## [0.1.0](v0.0.0...v0.1.0) (2026-03-11)


### Features

* add autonomy levels and approval timeout policies
([#42](#42),
[#126](#126))
([#197](#197))
([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and
approval decisions
([#186](#186))
([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot)
([#63](#63))
([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking
([#67](#67))
([#185](#185))
([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline
([#269](#269))
([435bdfe](435bdfe)),
closes [#267](#267)
* add coordination error taxonomy classification pipeline
([#146](#146))
([#181](#181))
([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies
([#175](#175))
([ce924fa](ce924fa)),
closes [#173](#173)
* add design specification, license, and project setup
([8669a09](8669a09))
* add env var substitution and config file auto-discovery
([#77](#77))
([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup
([#140](#140))
([09619cb](09619cb)),
closes [#139](#139)
* add HR engine and performance tracking
([#45](#45),
[#47](#47))
([#193](#193))
([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill
([#119](#119))
([deecc39](deecc39))
* add mandatory JWT + API key authentication
([#256](#256))
([c279cfe](c279cfe))
* add memory retrieval, ranking, and context injection pipeline
([#41](#41))
([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events
([#180](#180))
([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events
([#32](#32))
([46cfdd4](46cfdd4))
* add pluggable output scan response policies
([#263](#263))
([b9907e8](b9907e8))
* add pluggable PersistenceBackend protocol with SQLite implementation
([#36](#36))
([f753779](f753779))
* add progressive trust and promotion/demotion subsystems
([#43](#43),
[#49](#49))
([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience
([#100](#100))
([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker
integration ([#40](#40))
([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival
([#125](#125),
[#48](#48))
([4a0832b](4a0832b))
* design unified provider interface
([#86](#86))
([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance
([#80](#80),
[#81](#81),
[#84](#84))
([15a9134](15a9134))
* implement agent runtime state vs immutable config split
([#115](#115))
([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator
([#11](#11))
([#143](#143))
([f2eb73a](f2eb73a))
* implement AuditRepository for security audit log persistence
([#279](#279))
([94bc29f](94bc29f))
* implement basic tool system (registry, invocation, results)
([#15](#15))
([c51068b](c51068b))
* implement built-in file system tools
([#18](#18))
([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and
messenger ([#157](#157))
([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets
([#85](#85))
([cbf1496](cbf1496))
* implement conflict resolution protocol
([#122](#122))
([#166](#166))
([e03f9f2](e03f9f2))
* implement core entity and role system models
([#69](#69))
([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy
([#149](#149))
([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call
categorization
([#134](#134),
[#135](#135))
([#159](#159))
([9b2699f](9b2699f))
* implement enterprise logging system with structlog
([#73](#73))
([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy
([#130](#130))
([6592515](6592515))
* implement hierarchical delegation and loop prevention
([#12](#12),
[#17](#17))
([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry
([#88](#88))
([ae3f18b](ae3f18b)),
closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation
([#174](#174))
([aa0eefe](aa0eefe))
* implement meeting protocol system
([#123](#123))
([ee7caca](ee7caca))
* implement message and communication domain models
([#74](#74))
([560a5d2](560a5d2))
* implement model routing engine
([#99](#99))
([d3c250b](d3c250b))
* implement parallel agent execution
([#22](#22))
([#161](#161))
([65940b3](65940b3))
* implement per-call cost tracking service
([#7](#7))
([#102](#102))
([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction
([#105](#105))
([934dd85](934dd85))
* implement single-task execution lifecycle
([#21](#21))
([#144](#144))
([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation
([#131](#131))
([#153](#153))
([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies
([#172](#172))
([c7f1b26](c7f1b26)),
closes [#26](#26)
[#30](#30)
* implement task decomposition and routing engine
([#14](#14))
([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models
([#71](#71))
([81eabf1](81eabf1))
* implement tool permission checking
([#16](#16))
([833c190](833c190))
* implement YAML config loader with Pydantic validation
([#59](#59))
([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation
([#75](#75))
([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout
([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout
([#62](#62))
([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6)
([#189](#189))
([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field
([#118](#118))
([c0bab18](c0bab18)),
closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all
([#137](#137))
([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes
([#64](#64))
([f581749](f581749))
* wire all modules into observability system
([#97](#97))
([f7a0617](f7a0617))


### Bug Fixes

* address Greptile post-merge review findings from PRs
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175)
([#176](#176))
([c5ca929](c5ca929))
* address post-merge review feedback from PRs
[#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167)
([#170](#170))
([3bf897a](3bf897a)),
closes [#169](#169)
* enforce strict mypy on test files
([#89](#89))
([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner
([#50](#50),
[#53](#53))
([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements
([#150](#150))
([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience
([#155](#155))
([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes
([#164](#164))
([c02832a](c02832a))
* pre-PR review fixes for post-merge findings
([#183](#183))
([26b3108](26b3108))
* resolve circular imports, bump litellm, fix release tag format
([#286](#286))
([a6659b5](a6659b5))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries
([#117](#117))
([7e5e861](7e5e861))


### Performance

* harden non-inferable principle implementation
([#195](#195))
([02b5f4e](02b5f4e)),
closes [#188](#188)


### Refactoring

* adopt NotBlankStr across all models
([#108](#108))
([#120](#120))
([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models
([#111](#111))
([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and
review fixes
([#182](#182))
([c107bf9](c107bf9))
* harden personality profiles, department validation, and template
rendering ([#158](#158))
([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop
([#124](#124))
([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules
([#136](#136))
([e9cba89](e9cba89))


### Documentation

* add ADR-001 memory layer evaluation and selection
([#178](#178))
([db3026f](db3026f)),
closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC
([#145](#145))
([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation
([#65](#65))
([55c1025](55c1025)),
closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions
([#127](#127))
([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols
([#128](#128))
([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols
([#121](#121))
([6832db6](6832db6))
* finalize 23 design decisions (ADR-002)
([#190](#190))
([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency
review agent
([#114](#114))
([99766ee](99766ee))


### Tests

* add e2e single agent integration tests
([#24](#24))
([#156](#156))
([f566fb4](f566fb4))
* add provider adapter integration tests
([#90](#90))
([40a61f4](40a61f4))


### CI/CD

* add Release Please for automated versioning and GitHub Releases
([#278](#278))
([a488758](a488758))
* bump actions/checkout from 4 to 6
([#95](#95))
([1897247](1897247))
* bump actions/upload-artifact from 4 to 7
([#94](#94))
([27b1517](27b1517))
* bump anchore/scan-action from 6.5.1 to 7.3.2
([#271](#271))
([80a1c15](80a1c15))
* bump docker/build-push-action from 6.19.2 to 7.0.0
([#273](#273))
([dd0219e](dd0219e))
* bump docker/login-action from 3.7.0 to 4.0.0
([#272](#272))
([33d6238](33d6238))
* bump docker/metadata-action from 5.10.0 to 6.0.0
([#270](#270))
([baee04e](baee04e))
* bump docker/setup-buildx-action from 3.12.0 to 4.0.0
([#274](#274))
([5fc06f7](5fc06f7))
* bump sigstore/cosign-installer from 3.9.1 to 4.1.0
([#275](#275))
([29dd16c](29dd16c))
* harden CI/CD pipeline
([#92](#92))
([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers
([#277](#277))
([aba48af](aba48af))


### Maintenance

* add /worktree skill for parallel worktree management
([#171](#171))
([951e337](951e337))
* add design spec context loading to research-link skill
([8ef9685](8ef9685))
* add post-merge-cleanup skill
([#70](#70))
([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md
([#103](#103))
([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md
([#101](#101))
([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1
([#191](#191))
([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group
([#96](#96))
([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5
([a49ee46](a49ee46))
* fix M0 audit items
([#66](#66))
([c7724b5](c7724b5))
* **main:** release ai-company 0.1.1
([#282](#282))
([2f4703d](2f4703d))
* pin setup-uv action to full SHA
([#281](#281))
([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests,
hookify rules
([#148](#148))
([c57a6a9](c57a6a9))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
This was referenced Mar 15, 2026


Development

Successfully merging this pull request may close these issues.

Evaluate LiteLLM: integration prototype, limitations, alternatives

2 participants