feat: implement LiteLLM driver and provider registry #88
Conversation
Add the "Employment Agency" — a swappable driver system with LiteLLM as the default backend. Implements the concrete provider layer behind the contracts designed in #86.

- `LiteLLMDriver` wrapping `litellm.acompletion` for streaming and non-streaming completions, model capability queries, and full exception mapping (10 LiteLLM exception types → `ProviderError` hierarchy)
- `ProviderRegistry` (immutable) mapping provider names to driver instances, built from config via `from_config()` with `factory_overrides` for testing
- Pure mapping functions (messages, tools, finish reasons, tool calls)
- 3 new driver error classes (`DriverNotRegistered`, `DriverAlreadyRegistered`, `DriverFactoryNotFound`)
- `driver` field on `ProviderConfig` (defaults to `"litellm"`)
- Upgraded litellm 1.67.2 → 1.82.0 (fixes Python 3.14 compatibility, removes the need for the `PYTHONUTF8` env var and deprecation warning filters)
- Removed unused F403 ruff ignore for `__init__.py`
- 235 new unit tests; all 1273 tests pass at 94% coverage
📝 Walkthrough

Adds a LiteLLM-backed provider layer: new LiteLLMDriver, mapping utilities, driver registry and errors, ProviderConfig.driver field, package export updates, pyproject/mypy adjustments, and comprehensive unit tests for drivers, mappers, and the registry.
Sequence Diagram

sequenceDiagram
actor Client
participant ProviderRegistry
participant LiteLLMDriver
participant Mappers
participant litellm
Client->>ProviderRegistry: get(provider_name)
ProviderRegistry-->>Client: LiteLLMDriver instance
Client->>LiteLLMDriver: complete(messages, tools, config)
LiteLLMDriver->>LiteLLMDriver: resolve model & apply config
LiteLLMDriver->>Mappers: messages_to_dicts(messages)
Mappers-->>LiteLLMDriver: message dicts
LiteLLMDriver->>Mappers: tools_to_dicts(tools)
Mappers-->>LiteLLMDriver: tool dicts
LiteLLMDriver->>litellm: acompletion(model, messages, tools, ...)
litellm-->>LiteLLMDriver: ModelResponse / stream
LiteLLMDriver->>Mappers: extract_tool_calls(raw_tool_calls)
Mappers-->>LiteLLMDriver: ToolCall tuples
LiteLLMDriver-->>Client: CompletionResponse / stream chunks
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces a foundational architectural shift by implementing a pluggable driver system for LLM providers, centralizing the management of and interaction with various LLM backends.
Pull request overview
This PR introduces a swappable provider “driver” layer (defaulting to LiteLLM) plus an immutable ProviderRegistry that builds provider instances from config, aligning with the unified provider contracts introduced in #86.
Changes:

- Added `LiteLLMDriver` implementing `BaseCompletionProvider` with streaming/non-streaming completions, model capability lookup, and provider error mapping.
- Added `ProviderRegistry` to construct and expose provider-name → driver mappings from `ProviderConfig.driver`.
- Added pure mapping utilities (`messages_to_dicts`, `tools_to_dicts`, finish-reason/tool-call extraction) and comprehensive unit tests; bumped `litellm` to `1.82.0`.
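A pure mapping helper like the `messages_to_dicts` mentioned here could look roughly like this. The `ChatMessage` shape is a simplified assumption; the real model carries more fields (tool calls, names, etc.).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatMessage:
    """Simplified stand-in for the project's ChatMessage model."""
    role: str
    content: str


def messages_to_dicts(messages: list[ChatMessage]) -> list[dict]:
    # Pure function: builds new dicts, never mutates its input.
    return [{"role": m.role, "content": m.content} for m in messages]


print(messages_to_dicts([ChatMessage("user", "hi")]))
# [{'role': 'user', 'content': 'hi'}]
```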
Reviewed changes
Copilot reviewed 12 out of 14 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `pyproject.toml` | Adds `litellm==1.82.0`, mypy override for `litellm.*`, and adjusts ruff per-file ignore for `__init__.py`. |
| `uv.lock` | Locks `litellm` and its transitive dependencies. |
| `src/ai_company/config/schema.py` | Adds `ProviderConfig.driver` field (default `"litellm"`). |
| `src/ai_company/providers/errors.py` | Introduces registry/driver-related error types. |
| `src/ai_company/providers/registry.py` | Implements immutable `ProviderRegistry` + factory-based construction from config. |
| `src/ai_company/providers/drivers/mappers.py` | Adds message/tool/finish-reason/tool-call mapping helpers. |
| `src/ai_company/providers/drivers/litellm_driver.py` | Implements LiteLLM-backed driver with streaming support and exception mapping. |
| `src/ai_company/providers/drivers/__init__.py` | Exposes driver(s) from the subpackage. |
| `src/ai_company/providers/__init__.py` | Exports registry + driver + new error types from the top-level providers package. |
| `tests/unit/providers/test_registry.py` | Unit tests for `ProviderRegistry`. |
| `tests/unit/providers/drivers/test_mappers.py` | Unit tests for mapping helpers. |
| `tests/unit/providers/drivers/test_litellm_driver.py` | Unit tests for `LiteLLMDriver` (mocked LiteLLM calls). |
| `tests/unit/providers/drivers/conftest.py` | Shared mock factories/fixtures for driver tests. |
Flagged code:

    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError, ValueError:

Invalid exception syntax: `except json.JSONDecodeError, ValueError:` is Python 2 syntax and raises a SyntaxError on import under Python 3. Use `except (json.JSONDecodeError, ValueError):` instead.

Suggested change:

    - except json.JSONDecodeError, ValueError:
    + except (json.JSONDecodeError, ValueError):
Flagged code:

        return None
    try:
        return float(raw)
    except ValueError, TypeError:

Invalid exception syntax: `except ValueError, TypeError:` is Python 2 syntax and will raise a SyntaxError under Python 3. Use `except (ValueError, TypeError):` instead.

Suggested change:

    - except ValueError, TypeError:
    + except (ValueError, TypeError):
Flagged code:

        return None
    try:
        parsed = json.loads(self.arguments) if self.arguments else {}
    except json.JSONDecodeError, ValueError:

Invalid exception syntax: `except json.JSONDecodeError, ValueError:` is Python 2 syntax and will raise a SyntaxError under Python 3. Use `except (json.JSONDecodeError, ValueError):` (or just `json.JSONDecodeError`) instead.

Suggested change:

    - except json.JSONDecodeError, ValueError:
    + except (json.JSONDecodeError, ValueError):
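For reference, a small runnable sketch of the parenthesized form the reviewer suggests, which is valid on every Python 3 release. (The function name and fallback behavior are illustrative, not the project's actual code; note also that PEP 758, targeting Python 3.14, makes the bare comma form legal again.)

```python
import json


def parse_arguments(raw: str) -> dict:
    """Parse a tool-call arguments string, falling back to {} on bad input."""
    try:
        parsed = json.loads(raw)
    # Parenthesized tuple of exception types: portable Python 3 syntax.
    except (json.JSONDecodeError, ValueError):
        return {}
    return parsed if isinstance(parsed, dict) else {}


print(parse_arguments('{"a": 1}'))  # {'a': 1}
print(parse_arguments("not json"))  # {}
```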
Code Review
This pull request introduces a well-designed and extensible driver system for LLM providers, with LiteLLM as the default implementation. The architecture is clean, with good separation of concerns between the driver, mappers, and registry. The code is extensively tested and the exception handling is robust. I've found a few critical syntax issues related to exception handling that appear to be from Python 2, which will cause errors in the target Python 3.14 environment. Once these are addressed, this will be an excellent addition to the codebase.
The three locations flagged (same Python 2 except syntax):

        return None
    try:
        return float(raw)
    except ValueError, TypeError:

        return None
    try:
        parsed = json.loads(self.arguments) if self.arguments else {}
    except json.JSONDecodeError, ValueError:

    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError, ValueError:
Actionable comments posted: 10
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/providers/drivers/litellm_driver.py`:
- Around line 205-208: The loop that builds _model_lookup (the for m in models
block creating lookup[m.id] and lookup[m.alias]) can silently overwrite entries
when an alias equals another model's id or alias; update the builder to detect
collisions: when about to assign lookup[key] check if key already exists and if
so raise or log a clear validation error referencing the conflicting model
ids/aliases (include both existing and new m.id/m.alias) and skip or abort
loading as appropriate per policy; ensure checks cover both m.id and m.alias and
run before any assignment so _model_lookup cannot be silently remapped.
- Around line 360-363: The streaming branch currently drops usage events when
usage_obj.prompt_tokens is zero; change the condition in the streaming path
(where usage_obj is retrieved from chunk and _make_usage_chunk is called) to
emit usage whenever usage_obj is not None (i.e., check usage_obj is not None)
rather than requiring prompt_tokens to be truthy, so
result.append(self._make_usage_chunk(usage_obj, model_config)) runs for valid
usage objects even if prompt_tokens == 0.
- Around line 276-277: The code converts usage_obj prompt/completion token
attributes with int(getattr(...)) which throws TypeError if the attribute exists
but is None; change the conversions to coerce None to 0 (e.g., input_tok =
int(getattr(usage_obj, "prompt_tokens", 0) or 0) and output_tok =
int(getattr(usage_obj, "completion_tokens", 0) or 0)) and make the same change
for the other occurrence referenced (the conversions at the second location
around lines with input_tok/output_tok in the later block) so None values are
safely treated as 0 before int().
- Around line 421-425: The code currently does a case-sensitive lookup raw =
headers.get("retry-after") which breaks HTTP semantics; update the lookup in
litellm_driver.py (around the getattr(exc, "headers", None) handling) to perform
a case-insensitive search (for example, by normalizing keys or iterating
headers.items() and matching k.lower() == "retry-after") and assign the found
value to raw; keep the existing isinstance(headers, dict) guard and ensure the
new lookup still returns None when no Retry-After header is present.
- Around line 265-266: In _map_response, avoid direct indexing of
response.choices[0]; instead retrieve choices via getattr(response, "choices",
[]) and check for emptiness—if empty, raise ProviderInternalError with a clear
message; otherwise use the first choice (e.g., choice = choices[0]) and continue
mapping as before. Ensure the change is made inside the _map_response method and
mirror the defensive pattern used in _process_chunk.
- Around line 186-189: The supports_streaming flag is hard-coded True; change it
to read the model info like the other capabilities (e.g., supports_streaming =
bool(info.get("supports_streaming", False))) and set
supports_streaming_tool_calls to the logical AND of streaming and
function-calling (e.g., bool(info.get("supports_function_calling", False)) and
supports_streaming) so non-streaming models aren't routed to streaming
endpoints; update the assignment locations where supports_streaming and
supports_streaming_tool_calls are set (the dict building that currently contains
supports_streaming=True and supports_streaming_tool_calls=...) to use these
extracted values.
In `@src/ai_company/providers/drivers/mappers.py`:
- Around line 73-79: The mapper currently returns payloads containing mutable
dicts by reference (e.g., using tool.parameters_schema in the function payload
and other dicts around lines 153-161); fix by returning defensive copies of any
mutable objects before including them in the returned dicts (use shallow or deep
copy as appropriate for nested structures) so callers cannot mutate the original
tool or schema objects — update the mapper return that constructs the "function"
payload (references: tool.name, tool.description, tool.parameters_schema) and
the other dict-producing mapper(s) around lines 153-161 to clone their dict/list
values before returning.
- Around line 83-103: The _FINISH_REASON_MAP in map_finish_reason is missing
Anthropic-specific keys so Anthropic finish reasons like "end_turn",
"stop_sequence", and "tool_use" currently fall back to FinishReason.ERROR;
update _FINISH_REASON_MAP to include "end_turn" -> FinishReason.STOP,
"stop_sequence" -> FinishReason.STOP, and "tool_use" -> FinishReason.TOOL_USE
(optionally normalize incoming reason with .lower() inside map_finish_reason
before lookup) so these provider-native values map correctly instead of
defaulting to ERROR.
In `@src/ai_company/providers/registry.py`:
- Around line 84-86: The __contains__ method currently does "name in
self._drivers" which raises TypeError for unhashable inputs; update the
Registry.__contains__ implementation to handle unhashable objects by performing
the membership test inside a try/except TypeError block (or using a safe lookup)
and return False when a TypeError occurs so unhashable probes (e.g., lists) do
not propagate exceptions; reference the __contains__ method and the
self._drivers attribute when making the change.
- Around line 174-175: The call to factory(name, config) can raise raw
exceptions which bypass the registry's driver error type; wrap the invocation of
factory inside a try/except in the registry code where driver = factory(name,
config) is executed, catch any Exception, and re-raise the registry's driver
error type (including contextual information: provider name and config) while
preserving the original exception as the cause; then proceed to the isinstance
check for BaseCompletionProvider as before.
⛔ Files ignored due to path filters (1)
- `uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (13)
- `pyproject.toml`
- `src/ai_company/config/schema.py`
- `src/ai_company/providers/__init__.py`
- `src/ai_company/providers/drivers/__init__.py`
- `src/ai_company/providers/drivers/litellm_driver.py`
- `src/ai_company/providers/drivers/mappers.py`
- `src/ai_company/providers/errors.py`
- `src/ai_company/providers/registry.py`
- `tests/unit/providers/drivers/__init__.py`
- `tests/unit/providers/drivers/conftest.py`
- `tests/unit/providers/drivers/test_litellm_driver.py`
- `tests/unit/providers/drivers/test_mappers.py`
- `tests/unit/providers/test_registry.py`
🧰 Additional context used

📓 Path-based instructions (4)

`**/*.py` — CodeRabbit inference engine (CLAUDE.md):

- Use Python 3.14+ with PEP 649 native lazy annotations
- Do not use `from __future__ import annotations` — Python 3.14 has PEP 649
- Use PEP 758 except syntax: `except A, B:` (no parentheses) — ruff enforces this on Python 3.14
- Add type hints to all public functions, enforced by mypy strict mode
- Use Google style docstrings on all public classes and functions, enforced by ruff D rules
- Create new objects instead of mutating existing ones — enforce immutability
- Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict`
- Keep line length to 88 characters, enforced by ruff
- Keep functions under 50 lines and files under 800 lines
- Handle errors explicitly, never silently swallow exceptions
- Validate at system boundaries: user input, external APIs, and config files

`tests/**/*.py` — CodeRabbit inference engine (CLAUDE.md):

- Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
- Use `asyncio_mode = 'auto'` in pytest — no manual `@pytest.mark.asyncio` needed
- Set 30-second timeout per test

`src/**/*.py` — CodeRabbit inference engine (CLAUDE.md):

- Maintain 80% minimum code coverage, enforced in CI

`pyproject.toml` — CodeRabbit inference engine (CLAUDE.md):

- Pin all dependency versions using `==` in `pyproject.toml`
🧠 Learnings (8)
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/test*.py : Agent tests should cover: successful generation with valid output, handling malformed LLM responses, error conditions (network errors, timeouts), output format validation, and integration with story state
Applied to files:
tests/unit/providers/drivers/test_litellm_driver.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/conftest.py : Place shared pytest fixtures in `tests/conftest.py`
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Use pytest fixtures for test setup. Shared fixtures should be in `tests/conftest.py`
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Mock Ollama API responses to support both dict (`models.get("models")`) and object (`response.models`) patterns in test mocks.
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Mock Ollama API calls in tests to avoid requiring a running Ollama instance
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to tests/**/*.py : Mock Ollama in tests to avoid requiring running instance - use model names from `RECOMMENDED_MODELS` (e.g., `huihui_ai/dolphin3-abliterated:8b`)
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-31T13:51:16.868Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-31T13:51:16.868Z
Learning: Applies to tests/**/*.py : Ollama API response mocks must support both dict pattern (`models.get('models')`) and object pattern (`response.models`) to match actual API behavior.
Applied to files:
tests/unit/providers/drivers/conftest.py
🔇 Additional comments (14)

`pyproject.toml` (4)

- 125-125: LGTM! Removing the F403 (star imports) ignore tightens lint rules appropriately. Retaining F401 for `__init__.py` is correct since these files commonly re-export symbols.
- 149-152: LGTM! The MyPy override to ignore missing imports for `litellm.*` is appropriate since the library lacks complete type stubs.
- 195-197: LGTM! The list format for `filterwarnings` is valid and more extensible. Treating warnings as errors during tests ensures deprecation issues are caught early.
- 17-17: No changes needed — all dependencies are correctly pinned. The `litellm==1.82.0` dependency is properly pinned with `==` and version 1.82.0 exists on PyPI. All other dependencies in `pyproject.toml` (lines 15-21 and 34-52) are also pinned with `==` per coding guidelines.

`src/ai_company/config/schema.py` (1)

- 64-67: Good default driver wiring. This adds a typed, immutable config selector with a safe default and keeps existing configs working.

`src/ai_company/providers/drivers/__init__.py` (1)

- 7-9: Clean public export surface. Explicitly exporting `LiteLLMDriver` here keeps driver imports stable and discoverable.

`src/ai_company/providers/errors.py` (1)

- 143-158: Error taxonomy extension looks solid. The new driver-registry errors are specific and keep retry semantics explicit.

`tests/unit/providers/test_registry.py` (1)

- 130-202: Nice coverage for construction and immutability paths. The `from_config` and source-dict mutation cases are especially valuable for guarding registry behavior.

`tests/unit/providers/drivers/test_mappers.py` (1)

- 22-278: Mapper tests are thorough and well-structured. Good balance of nominal and edge-case coverage, especially for tool-call parsing variants.

`src/ai_company/providers/__init__.py` (1)

- 9-59: Public API exports are coherent with the new driver architecture. `LiteLLMDriver`, `ProviderRegistry`, and driver-related errors are surfaced cleanly.

`tests/unit/providers/drivers/test_litellm_driver.py` (2)

- 338-440: Exception-path coverage is excellent. The mapped-exception and stream-iteration failure tests are strong and directly exercise resilience behavior.
- 80-505: No action needed — the 30-second timeout is already configured globally. The `pyproject.toml` file already sets `timeout = 30` in `[tool.pytest.ini_options]` (line 187), and `pytest-timeout` is installed as a test dependency. This global configuration automatically applies to all tests in the repository, including the async tests in this file, preventing hung streams from stalling CI.

`src/ai_company/providers/drivers/mappers.py` (1)

- 158-159: PEP 758 syntax is correctly applied for Python 3.14+. Line 158 uses the PEP 758 `except A, B:` syntax, which is valid since the project requires Python 3.14+ (pinned in `pyproject.toml`), ruff targets `py314`, and CI runs on Python 3.14.

`tests/unit/providers/drivers/conftest.py` (1)

- 16-176: Fixture utilities look solid. The helpers are deterministic and keep driver tests isolated from real provider/network behavior.
…pilot

Source fixes:

- Add collision detection in `_build_model_lookup` for alias/ID conflicts
- Defensive check for empty choices in `_map_response`
- Wrap response mapping in try/except to keep `ProviderError` hierarchy
- Read `supports_streaming` from model info instead of hard-coding True
- Case-insensitive retry-after header lookup per HTTP semantics
- None-safe int conversion for usage token counts (`or 0` pattern)
- Fix streaming usage drop when `prompt_tokens` is zero
- Replace bare `except Exception` with targeted catches + logging
- Add warning logging for silent JSON parse failures in tool calls
- Add warning logging for dropped/incomplete tool calls
- Add warning logging for unknown finish reasons and skipped items
- Add Anthropic-specific finish reason keys (`end_turn`, `stop_sequence`, `tool_use`)
- Deep copy `parameters_schema` in tool mapper for immutability
- Handle unhashable inputs in `ProviderRegistry.__contains__`
- Wrap factory call in `_build_driver` to catch construction errors
- Document `DriverAlreadyRegisteredError` as reserved for future use
- Remove unused `mock_acompletion` fixture
- Multiple docstring improvements across driver, mappers, registry

New tests (18 added, 253 total):

- Stream exception before iteration
- Response mapping error wrapped as `ProviderError`
- Incomplete tool call accumulator dropped
- Multiple concurrent streaming tool calls
- Usage-only chunk with empty choices
- Usage emitted when `prompt_tokens` is zero
- Case-insensitive retry-after header
- No headers / non-numeric retry-after edge cases
- `supports_streaming` from model info
- `supports_streaming_tool_calls` requires both
- Non-callable factory, non-provider return, factory exception
- Unhashable `__contains__` returns False
- Anthropic finish reasons (`end_turn`, `stop_sequence`, `tool_use`)
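The "None-safe int conversion" fix listed above boils down to the `or 0` pattern. A tiny demonstration (the `Usage` class is a stand-in; real provider usage objects carry more fields):

```python
class Usage:
    """Stand-in for a provider usage object; attribute names assumed."""
    prompt_tokens = None  # providers sometimes return None here
    completion_tokens = 7


usage = Usage()
# `or 0` coerces both a missing attribute and an explicit None to 0
# before int(); plain int(None) would raise TypeError.
input_tok = int(getattr(usage, "prompt_tokens", 0) or 0)
output_tok = int(getattr(usage, "completion_tokens", 0) or 0)
print(input_tok, output_tok)  # 0 7
```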
/gemini review |
♻️ Duplicate comments (2)

`src/ai_company/providers/drivers/mappers.py` (2)

- 194-196: 🧹 Nitpick | 🔵 Trivial: Return a defensive copy for parsed dict arguments. Same concern as above — the parsed dict is returned directly without copying.

  ♻️ Proposed fix:

        if isinstance(parsed, dict):
      -     return parsed
      +     return dict(parsed)

  🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `src/ai_company/providers/drivers/mappers.py` around lines 194-196, the code returns the input dict object directly from the branch checking `if isinstance(parsed, dict): return parsed`, which can lead to callers mutating internal state; change that to return a defensive shallow copy (e.g., `return parsed.copy()` or `dict(parsed)`) instead, updating the return in the function in mappers.py that uses the `parsed` variable so callers receive a copy rather than the original dict.

- 183-184: 🧹 Nitpick | 🔵 Trivial: Return a defensive copy for dict arguments to enforce immutability. When `raw` is already a dict, returning it directly allows callers to mutate the original. This is inconsistent with the deep copy applied to `parameters_schema` at line 82.

  ♻️ Proposed fix:

        if isinstance(raw, dict):
      -     return raw
      +     return dict(raw)

  As per coding guidelines: "Create new objects instead of mutating existing ones — enforce immutability".

  🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `src/ai_company/providers/drivers/mappers.py` around lines 183-184, replace the direct return of the dict `raw` with a defensive deep copy to enforce immutability (consistent with the `parameters_schema` deep copy at line 82); specifically, where `if isinstance(raw, dict): return raw` appears in mappers.py, return a deep copy of `raw` instead (use `copy.deepcopy(raw)`) and add the necessary `import copy` if not already present so callers cannot mutate the original dict.
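Both nitpicks come down to the same pattern: hand callers a copy so they cannot mutate driver-internal state. A minimal demonstration (the function name is hypothetical; `deepcopy` matches the second comment's suggestion for nested data):

```python
import copy


def tool_arguments(raw: object) -> dict:
    # Returning a deep copy keeps callers from mutating internal state,
    # including nested dicts that a shallow copy would still share.
    if isinstance(raw, dict):
        return copy.deepcopy(raw)
    return {}


original = {"query": {"nested": True}}
out = tool_arguments(original)
out["query"]["nested"] = False  # mutate only the returned copy
print(original["query"]["nested"])  # True: the original is untouched
```

`dict(parsed)` (a shallow copy) is cheaper and enough when values are immutable; `deepcopy` is the safer default for arbitrary JSON-shaped data.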
📒 Files selected for processing (8)
- `src/ai_company/providers/drivers/litellm_driver.py`
- `src/ai_company/providers/drivers/mappers.py`
- `src/ai_company/providers/errors.py`
- `src/ai_company/providers/registry.py`
- `tests/unit/providers/drivers/conftest.py`
- `tests/unit/providers/drivers/test_litellm_driver.py`
- `tests/unit/providers/drivers/test_mappers.py`
- `tests/unit/providers/test_registry.py`
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py:
- Use Python 3.14+ with PEP 649 native lazy annotations
- Do not use `from __future__ import annotations` — Python 3.14 has PEP 649
- Use PEP 758 except syntax: `except A, B:` (no parentheses) — ruff enforces this on Python 3.14
- Add type hints to all public functions, enforced by mypy strict mode
- Use Google style docstrings on all public classes and functions, enforced by ruff D rules
- Create new objects instead of mutating existing ones — enforce immutability
- Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict`
- Keep line length to 88 characters, enforced by ruff
- Keep functions under 50 lines and files under 800 lines
- Handle errors explicitly, never silently swallow exceptions
- Validate at system boundaries: user input, external APIs, and config files
Files:
- src/ai_company/providers/drivers/mappers.py
- tests/unit/providers/test_registry.py
- tests/unit/providers/drivers/test_mappers.py
- src/ai_company/providers/errors.py
- src/ai_company/providers/drivers/litellm_driver.py
- tests/unit/providers/drivers/test_litellm_driver.py
- src/ai_company/providers/registry.py
- tests/unit/providers/drivers/conftest.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Maintain 80% minimum code coverage, enforced in CI
Files:
- src/ai_company/providers/drivers/mappers.py
- src/ai_company/providers/errors.py
- src/ai_company/providers/drivers/litellm_driver.py
- src/ai_company/providers/registry.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py:
- Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
- Use `asyncio_mode = 'auto'` in pytest — no manual `@pytest.mark.asyncio` needed
- Set 30-second timeout per test
Files:
- tests/unit/providers/test_registry.py
- tests/unit/providers/drivers/test_mappers.py
- tests/unit/providers/drivers/test_litellm_driver.py
- tests/unit/providers/drivers/conftest.py
🧠 Learnings (12)
📚 Learning: 2026-03-01T10:09:25.209Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T10:09:25.209Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions
Applied to files:
- src/ai_company/providers/drivers/mappers.py
- src/ai_company/providers/drivers/litellm_driver.py
📚 Learning: 2026-03-01T10:09:25.209Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-01T10:09:25.209Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — ruff enforces this on Python 3.14
Applied to files:
- src/ai_company/providers/drivers/mappers.py
- src/ai_company/providers/drivers/litellm_driver.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Tests must use fake model names (e.g., `test-model:8b`, `fake-writer:latest`)—never use real model IDs from `RECOMMENDED_MODELS`.
Applied to files:
src/ai_company/providers/drivers/litellm_driver.py
📚 Learning: 2026-01-31T13:51:16.868Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-31T13:51:16.868Z
Learning: Applies to tests/**/*.py : Mock models in tests must use a name from `RECOMMENDED_MODELS` (e.g., `huihui_ai/dolphin3-abliterated:8b`) - fake model names cause `ValueError: No model tagged for role`.
Applied to files:
src/ai_company/providers/drivers/litellm_driver.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/test*.py : Agent tests should cover: successful generation with valid output, handling malformed LLM responses, error conditions (network errors, timeouts), output format validation, and integration with story state
Applied to files:
tests/unit/providers/drivers/test_litellm_driver.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/conftest.py : Place shared pytest fixtures in `tests/conftest.py`
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Use pytest fixtures for test setup. Shared fixtures should be in `tests/conftest.py`
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use appropriate fixture scopes (`function`, `class`, `module`, `session`) and document complex fixtures with docstrings
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Mock Ollama API responses to support both dict (`models.get("models")`) and object (`response.models`) patterns in test mocks.
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Mock Ollama API calls in tests to avoid requiring a running Ollama instance
Applied to files:
tests/unit/providers/drivers/conftest.py
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to tests/**/*.py : Mock Ollama in tests to avoid requiring running instance - use model names from `RECOMMENDED_MODELS` (e.g., `huihui_ai/dolphin3-abliterated:8b`)
Applied to files:
tests/unit/providers/drivers/conftest.py
🧬 Code graph analysis (5)
tests/unit/providers/test_registry.py (3)
- src/ai_company/config/schema.py (2): ProviderConfig (52-95), ProviderModelConfig (17-49)
- src/ai_company/providers/errors.py (2): DriverFactoryNotFoundError (159-162), DriverNotRegisteredError (143-146)
- src/ai_company/providers/registry.py (4): ProviderRegistry (21-139), get (56-78), list_providers (80-82), from_config (96-139)

tests/unit/providers/drivers/test_mappers.py (2)
- src/ai_company/providers/enums.py (2): FinishReason (15-22), MessageRole (6-12)
- src/ai_company/providers/models.py (4): ChatMessage (114-186), ToolCall (73-95), ToolDefinition (45-70), ToolResult (98-111)

tests/unit/providers/drivers/test_litellm_driver.py (4)
- src/ai_company/config/schema.py (2): ProviderConfig (52-95), ProviderModelConfig (17-49)
- src/ai_company/providers/drivers/litellm_driver.py (1): LiteLLMDriver (89-514)
- src/ai_company/providers/enums.py (3): FinishReason (15-22), MessageRole (6-12), StreamEventType (25-32)
- src/ai_company/providers/errors.py (8): AuthenticationError (72-75), ContentFilterError (119-122), InvalidRequestError (113-116), ModelNotFoundError (107-110), ProviderError (22-69), ProviderInternalError (137-140), ProviderTimeoutError (125-128), RateLimitError (78-104)

src/ai_company/providers/registry.py (3)
- src/ai_company/providers/base.py (1): BaseCompletionProvider (26-273)
- src/ai_company/providers/errors.py (2): DriverFactoryNotFoundError (159-162), DriverNotRegisteredError (143-146)
- src/ai_company/config/schema.py (1): ProviderConfig (52-95)

tests/unit/providers/drivers/conftest.py (1)
- src/ai_company/config/schema.py (2): ProviderConfig (52-95), ProviderModelConfig (17-49)
🔇 Additional comments (13)
src/ai_company/providers/errors.py (1)
141-162: LGTM! The new driver error classes follow the established pattern, have appropriate docstrings, and correctly set `is_retryable = False` for configuration-time errors.

src/ai_company/providers/drivers/mappers.py (1)

1-124: LGTM! The mapping module is well-structured with:
- Proper type hints and Google-style docstrings
- Comprehensive finish reason mapping including Anthropic-specific values
- Defensive deep copy for `parameters_schema`
- Appropriate logging for unknown finish reasons
tests/unit/providers/drivers/conftest.py (1)
1-170: LGTM! The conftest module provides well-designed test utilities:
- Reusable mock factories with sensible defaults
- Support for both dict and attribute-access patterns (per learnings about Ollama API mocks)
- Configurable fixtures for various test scenarios
- Proper async generator for streaming tests
tests/unit/providers/test_registry.py (1)
1-249: LGTM! Comprehensive test coverage for the ProviderRegistry including:
- Core operations (get, list, contains, len)
- Factory configuration with overrides
- Error cases (unknown driver, non-callable factory, non-provider return, factory exceptions)
- Immutability verification
- Default LiteLLM driver resolution

All tests properly marked with `@pytest.mark.unit`.

tests/unit/providers/drivers/test_mappers.py (1)
1-281: LGTM! Thorough test coverage for mapper functions including:
- Message conversion for all role types
- Tool definition conversion
- Finish reason mapping with Anthropic-specific values
- Tool call extraction with edge cases (None, empty, invalid JSON, missing fields)

All tests properly marked with `@pytest.mark.unit`.

src/ai_company/providers/registry.py (1)
1-196: LGTM! Well-designed registry implementation:
- Immutable driver mapping using `MappingProxyType`
- Defensive copy on construction
- Robust `__contains__` handling unhashable inputs
- Comprehensive factory validation (callable check, type check, exception wrapping)
- Clear error messages listing available providers/drivers
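The immutability pattern described here can be illustrated with a small sketch. Class and method names are simplified stand-ins for the real ProviderRegistry:

```python
from types import MappingProxyType


class TinyRegistry:
    """Illustrative immutable registry: defensive copy in, read-only view out."""

    def __init__(self, drivers: dict[str, object]) -> None:
        # Copy first so later mutation of the caller's dict has no effect,
        # then wrap in MappingProxyType so the mapping itself is read-only.
        self._drivers = MappingProxyType(dict(drivers))

    def get(self, name: str) -> object:
        try:
            return self._drivers[name]
        except KeyError:
            available = ", ".join(sorted(self._drivers))
            raise KeyError(f"No driver for {name!r}; available: {available}") from None

    def __contains__(self, name: object) -> bool:
        try:
            return name in self._drivers
        except TypeError:  # unhashable input, e.g. a list
            return False


source = {"anthropic": object()}
registry = TinyRegistry(source)
source["openai"] = object()  # does not leak into the registry
```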
tests/unit/providers/drivers/test_litellm_driver.py (1)
1-759: LGTM! Excellent test coverage for LiteLLMDriver including:
- Non-streaming completion with various configurations
- Streaming with tool call deltas and usage tracking
- Exception mapping for all LiteLLM exception types
- Model capabilities with fallbacks
- Edge cases (empty choices, zero prompt_tokens, case-insensitive headers)

All tests properly marked with `@pytest.mark.unit` and use mocked LiteLLM calls.

src/ai_company/providers/drivers/litellm_driver.py (6)
1-106: LGTM! The driver module is well-structured with:
- Clean separation of concerns (hooks, mapping, streaming, exception handling)
- Comprehensive LiteLLM exception mapping table
- Proper TYPE_CHECKING pattern for type-only imports
219-261: LGTM! Model resolution properly validates:
- Duplicate model IDs
- Alias collisions with existing keys
- Clear error messages with conflicting identifiers
293-337: LGTM! Response mapping implements defensive patterns:
- Guard against empty choices with descriptive error
- Null-safe token extraction with `or 0` coercion
- Proper use of `getattr` for attribute access
370-431: LGTM! Streaming implementation handles edge cases:
- Usage-only chunks with empty choices
- Usage emission regardless of prompt_tokens value
- Null-safe token conversion
- Proper delta accumulation for tool calls
464-485: LGTM! Retry-after extraction properly implements:
- Case-insensitive header lookup per HTTP semantics
- Graceful handling of non-numeric values
- PEP 758 compliant exception syntax
573-622: LGTM! The `_ToolCallAccumulator` class properly:
- Accumulates streaming deltas incrementally
- Handles incomplete tool calls with appropriate logging
- Gracefully handles JSON parse failures
- Uses PEP 758 compliant exception syntax
Pull request overview
Copilot reviewed 12 out of 14 changed files in this pull request and generated 4 comments.
```python
try:
    parsed = json.loads(raw)
except json.JSONDecodeError, ValueError:
    _logger.warning(
        "Failed to parse tool call arguments: %r",
        raw[:200],
    )
    return {}
```
Invalid exception syntax: except json.JSONDecodeError, ValueError: is not valid in Python 3. Use except (json.JSONDecodeError, ValueError): (or just json.JSONDecodeError) so this file parses and the fallback {} path works.
```python
try:
    return float(raw)
except ValueError, TypeError:
    _logger.debug(
        "Could not parse retry-after header as seconds: %r",
        raw,
    )
    return None
```
Invalid exception syntax: except ValueError, TypeError: is not valid in Python 3. Wrap the exception types in parentheses (e.g., except (ValueError, TypeError):) so retry-after parsing doesn’t cause a SyntaxError.
```python
try:
    raw = _litellm.get_model_info(model=litellm_model)
    info: dict[str, Any] = dict(raw) if raw else {}
except KeyError, ValueError:
    _logger.info(
        "No LiteLLM metadata for model %r, using config defaults",
        litellm_model,
    )
    return {}
```
Invalid exception syntax: except KeyError, ValueError: is not valid in Python 3. Use except (KeyError, ValueError): so model-info fallback works and the module can be imported.
```python
try:
    parsed = json.loads(self.arguments) if self.arguments else {}
except json.JSONDecodeError, ValueError:
    _logger.warning(
        "Failed to parse tool call arguments for tool %r (id=%r): %r",
        self.name,
        self.id,
        self.arguments[:200] if self.arguments else "",
    )
    parsed = {}
```
Invalid exception syntax: except json.JSONDecodeError, ValueError: is not valid in Python 3. Use except (json.JSONDecodeError, ValueError): (or just json.JSONDecodeError) so streamed tool-call argument parsing doesn’t break module import.
Code Review
This pull request introduces a well-designed and extensible driver system for LLM providers, with LiteLLM as the default backend. The new ProviderRegistry provides a clean, immutable way to manage provider drivers, and the pure mapping functions in mappers.py create a strong separation of concerns. The exception handling is particularly robust, mapping a wide range of provider-specific errors to a unified hierarchy, and the code is accompanied by a comprehensive suite of unit tests. However, potential security issues have been identified, specifically the leakage of sensitive information in error logs and a resource exhaustion vulnerability in the streaming tool call accumulation logic. Additionally, there are critical syntax errors in the exception handling blocks, using Python 2-style syntax (e.g., except A, B:) which is invalid in Python 3 and will cause a SyntaxError. Addressing these concerns will improve the production readiness and security posture of the provider layer.
```python
    return None
try:
    return float(raw)
except ValueError, TypeError:
```
```python
try:
    raw = _litellm.get_model_info(model=litellm_model)
    info: dict[str, Any] = dict(raw) if raw else {}
except KeyError, ValueError:
```
```python
    return None
try:
    parsed = json.loads(self.arguments) if self.arguments else {}
except json.JSONDecodeError, ValueError:
```
```python
if isinstance(raw, str):
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError, ValueError:
```
```python
    retry_after=self._extract_retry_after(exc),
    context=ctx,
)
return our_type(str(exc), context=ctx)
```
Raw exception strings from LiteLLM are included directly in the error message. These strings can contain sensitive information like API keys (e.g., in authentication errors). Since these messages are often logged, this can lead to secret leakage. It is safer to use a generic message and put the exception details in the context dictionary, which handles redaction.
```python
return exc

return errors.ProviderInternalError(
    f"Unexpected error from {self._provider_name}: {exc}",
```
```python
self.name = str(name)
args = getattr(func, "arguments", None)
if args:
    self.arguments += str(args)
```
src/ai_company/providers/registry.py (outdated)
```python
except Exception as exc:
    msg = (
        f"Failed to instantiate driver {driver_type!r} for provider {name!r}: {exc}"
    )
```
…copies
- Move raw exception strings from error messages to context dicts to prevent potential API key leakage in logs (Gemini security-medium)
- Add 1 MiB length limit to _ToolCallAccumulator to prevent DoS via infinite streaming deltas (Gemini security-medium)
- Return defensive copies from _parse_arguments for immutability consistency (CodeRabbit nitpick)
- Update registry factory error to use generic message with detail in context
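The 1 MiB cap on accumulated deltas might be sketched like this — a rough illustration based only on the commit summary; the constant, names, and drop-on-overflow behavior are assumptions (and character length is used as a stand-in for byte length):

```python
MAX_ARGUMENT_LENGTH = 1024 * 1024  # assumed 1 MiB cap on accumulated arguments


class CappedBuffer:
    """Stops accepting deltas once the accumulated size would exceed the cap."""

    def __init__(self, limit: int = MAX_ARGUMENT_LENGTH) -> None:
        self.limit = limit
        self.data = ""
        self.truncated = False

    def append(self, fragment: str) -> None:
        if self.truncated:
            return  # already over the cap; ignore further deltas
        if len(self.data) + len(fragment) > self.limit:
            self.truncated = True  # a real implementation would also log
            return
        self.data += fragment


buf = CappedBuffer(limit=8)
buf.append("12345")
buf.append("678")
buf.append("9")  # would exceed the cap, so it is dropped
```

Bounding the buffer this way turns an unbounded stream of tool-call deltas from a memory-exhaustion vector into a logged, truncated call.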
## Summary

- Adds **39 integration tests** for the provider adapter layer, completing the final unchecked acceptance criterion from #5: _"Integration tests with mock/recorded API responses"_
- All source code was already implemented in PRs #86 and #88 — this PR covers only the integration test suite
- Mocks at `litellm.acompletion` level using **real `litellm.ModelResponse`** objects (not MagicMock), exercising actual attribute access paths through `_map_response`, `_process_chunk`, and `extract_tool_calls`

### Test files

| File | Tests | Coverage |
|------|-------|----------|
| `test_anthropic_pipeline.py` | 13 | Config→registry→complete/stream, alias resolution, cost computation, streaming |
| `test_openrouter_pipeline.py` | 5 | Custom base_url forwarding, model prefixing, multi-model alias resolution |
| `test_ollama_pipeline.py` | 4 | No api_key, localhost base_url, zero-cost models |
| `test_error_scenarios.py` | 9 | Rate limit (429 + retry-after), auth (401), timeout, connection, internal, unknown |
| `test_tool_calling_pipeline.py` | 8 | Single/multiple tool calls, streaming accumulation, mixed text+tools, multi-turn |
| `conftest.py` | — | Config factories, real ModelResponse builders, stream helpers |

### Verification

- `ruff check` — all passed
- `ruff format` — all formatted
- `mypy` — 0 errors (7 files)
- `pytest` — 1331 total tests pass, **94.49% coverage** (80% required)

Closes #5

## Test plan

- [ ] CI passes (lint + type-check + test + coverage)
- [ ] 39 integration tests pass under `pytest -m integration`
- [ ] No regressions in existing 1292 unit tests
- [ ] Coverage remains above 80% threshold
🤖 I have created a release *beep* *boop*

---

## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
🤖 I have created a release *beep* *boop*

---

## [0.1.0](v0.0.0...v0.1.0) (2026-03-11)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

###
Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15)) * bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e)) * bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238)) * bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e)) * bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7)) * bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree 
management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Summary
- `LiteLLMDriver(BaseCompletionProvider)` wrapping `litellm.acompletion` for streaming/non-streaming completions, model capabilities, and full exception mapping (10 LiteLLM exception types → `ProviderError` hierarchy)
- `ProviderRegistry` (immutable, `MappingProxyType`) mapping provider names → driver instances, built from config via `from_config()` with `factory_overrides` for testing/native SDK swaps
- Upgraded litellm `1.67.2` → `1.82.0`, removing all Python 3.14 compatibility workarounds (warning filters, `PYTHONUTF8` env var, stale `type: ignore`)
- Removed unused `F403` ruff ignore for `__init__.py`

Architecture
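The exception mapping described above can be sketched as a table-driven translation. This is a minimal illustration, not the project's actual code: only `ProviderError` and the "10 exception types → hierarchy" idea come from the PR; every other class name here (`ProviderRateLimitError`, `BackendRateLimitError`, `map_exception`, etc.) is a hypothetical stand-in.

```python
class ProviderError(Exception):
    """Base of the provider error hierarchy (name from the PR)."""

class ProviderRateLimitError(ProviderError): ...  # hypothetical subclass
class ProviderAuthError(ProviderError): ...       # hypothetical subclass

# Stand-ins for backend (LiteLLM) exception types; the real driver maps ten.
class BackendRateLimitError(Exception): ...
class BackendAuthenticationError(Exception): ...

# A single mapping table keeps the backend -> provider translation in one place.
_EXCEPTION_MAP: dict[type[BaseException], type[ProviderError]] = {
    BackendRateLimitError: ProviderRateLimitError,
    BackendAuthenticationError: ProviderAuthError,
}

def map_exception(exc: BaseException) -> ProviderError:
    """Translate a backend exception into the ProviderError hierarchy."""
    for backend_type, provider_type in _EXCEPTION_MAP.items():
        if isinstance(exc, backend_type):
            return provider_type(str(exc))
    # Unknown backend errors fall back to the base class rather than leaking.
    return ProviderError(str(exc))

mapped = map_exception(BackendRateLimitError("429: slow down"))
print(type(mapped).__name__)  # ProviderRateLimitError
```

A mapping table like this keeps the driver's `except` blocks flat and makes adding an eleventh exception type a one-line change.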
New files
- `providers/drivers/litellm_driver.py`
- `providers/drivers/mappers.py`
- `providers/registry.py`
- `providers/drivers/__init__.py`

Modified files
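The registry in `providers/registry.py` might look roughly like the sketch below. Only the names `ProviderRegistry`, `from_config`, `factory_overrides`, `DriverNotRegistered`, and the `MappingProxyType` immutability choice are taken from the PR; the config shape (provider name → driver name), `FakeDriver`, and the method signatures are assumptions for illustration.

```python
from __future__ import annotations

from types import MappingProxyType
from typing import Callable

class FakeDriver:
    """Stand-in for a driver; the real base class is BaseCompletionProvider."""
    def __init__(self, backend: str) -> None:
        self.backend = backend

class DriverNotRegistered(LookupError):
    """Raised on lookup of an unknown provider name (error name from the PR)."""

class ProviderRegistry:
    """Immutable provider-name -> driver mapping, per the PR's MappingProxyType design."""

    def __init__(self, drivers: dict[str, FakeDriver]) -> None:
        # MappingProxyType exposes a read-only view over a private copy,
        # so the registry cannot be mutated after construction.
        self._drivers = MappingProxyType(dict(drivers))

    def get(self, provider: str) -> FakeDriver:
        try:
            return self._drivers[provider]
        except KeyError:
            raise DriverNotRegistered(provider) from None

    @classmethod
    def from_config(
        cls,
        config: dict[str, str],  # provider name -> driver name (shape is assumed)
        factory_overrides: dict[str, Callable[[], FakeDriver]] | None = None,
    ) -> ProviderRegistry:
        # Built-in factories; overrides let tests or native SDKs swap drivers in.
        factories: dict[str, Callable[[], FakeDriver]] = {
            "litellm": lambda: FakeDriver("litellm"),  # default driver per the PR
        }
        factories.update(factory_overrides or {})
        return cls({name: factories[driver]() for name, driver in config.items()})

registry = ProviderRegistry.from_config({"openai": "litellm"})
print(registry.get("openai").backend)  # litellm
```

Tests can pass `factory_overrides={"litellm": lambda: FakeDriver("stub")}` to avoid touching the real backend, which is the point of the override hook.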
- `config/schema.py`: added `driver` field to `ProviderConfig` (default `"litellm"`)
- `providers/errors.py`: new driver error classes (`DriverNotRegistered`, `DriverAlreadyRegistered`, `DriverFactoryNotFound`)
- `providers/__init__.py`: updated package exports
- `pyproject.toml`: litellm bumped to `1.82.0`, removed warning filters and `F403` ignore

Test plan
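The `driver` field behaves like a defaulted config attribute: existing configs that never mention a driver keep working on `"litellm"`. A dataclass stand-in (the real `ProviderConfig` in `config/schema.py` is presumably a Pydantic model, and the `name` field here is assumed) shows the idea:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProviderConfig:
    """Sketch only: `driver` with default "litellm" is from the PR; `name` is assumed."""
    name: str
    driver: str = "litellm"

# Configs without an explicit driver fall back to the LiteLLM backend.
cfg = ProviderConfig(name="openai")
print(cfg.driver)  # litellm

# A future native-SDK driver would just override the field.
native = ProviderConfig(name="openai", driver="openai-native")
print(native.driver)  # openai-native
```

Defaulting at the schema layer keeps the registry logic simple: `from_config()` can assume every provider entry names a driver.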
- All 1273 tests pass (`uv run pytest tests/ -n auto`)
- Coverage at 94%, above the gate (`--cov-fail-under=80`)
- Lint clean (`uv run ruff check src/ tests/`)
- Formatting clean (`uv run ruff format --check src/ tests/`)
- Type checks clean (`uv run mypy src/` — 0 errors in 64 files)

Closes #4