test: add provider adapter integration tests (#90)
Conversation
39 integration tests exercising the full provider pipeline (config → registry → driver.complete/stream) with real litellm ModelResponse objects, mocked at the acompletion level:

- test_anthropic_pipeline: 13 tests (alias resolution, cost, streaming)
- test_openrouter_pipeline: 5 tests (base_url, model prefix, multi-model)
- test_ollama_pipeline: 4 tests (no api_key, localhost, zero-cost)
- test_error_scenarios: 9 tests (rate limit, auth, timeout, connection)
- test_tool_calling_pipeline: 8 tests (single/multi tool calls, streaming)
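The mocking pattern the suite relies on can be sketched as follows: patch the async completion call with `AsyncMock` so the driver pipeline runs end to end without network I/O. The real tests patch `litellm.acompletion` and return genuine `litellm.ModelResponse` objects; the stand-in `Driver`, `acompletion`, and dict response below are hypothetical names chosen to keep the sketch self-contained.

```python
# Illustrative sketch only: Driver, acompletion, and the dict response are
# stand-ins, not the repo's actual classes or litellm's real objects.
import asyncio
from unittest.mock import AsyncMock, patch


async def acompletion(**kwargs) -> dict:
    """Stand-in for litellm.acompletion; tests never let this run."""
    raise RuntimeError("real network call attempted in a test")


class Driver:
    """Toy driver whose complete() delegates to the patched acompletion."""

    async def complete(self, messages: list[dict], model: str) -> str:
        response = await acompletion(model=model, messages=messages)
        return response["choices"][0]["message"]["content"]


async def run_pipeline_test() -> str:
    mock_resp = {"choices": [{"message": {"content": "Hello!"}}]}
    target = f"{__name__}.acompletion"  # patch where the driver looks it up
    with patch(target, new_callable=AsyncMock, return_value=mock_resp) as mocked:
        result = await Driver().complete(
            [{"role": "user", "content": "hi"}], "sonnet"
        )
    # The mock also lets tests assert on what was forwarded downstream.
    assert mocked.call_args.kwargs["model"] == "sonnet"
    return result
```

Because the mock records its call, the same pattern supports both response-mapping assertions and forwarded-argument assertions, which is how the error and tool-calling tests verify the pipeline in both directions.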
📝 Walkthrough

The PR introduces a comprehensive integration test suite for the provider adapter layer, covering end-to-end pipelines for multiple providers (Anthropic, OpenRouter, Ollama), error mapping scenarios, streaming behavior, and tool-calling functionality using mocked litellm responses.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the testing suite by introducing 39 new integration tests for the provider adapter layer. These tests ensure the robustness and correctness of the LLM integration across various providers and scenarios, including streaming, error handling, and tool calling, by mocking at the litellm.acompletion level.
Code Review
This pull request introduces a comprehensive suite of 39 integration tests for the provider adapter layer. The tests are well-structured, covering various providers (Anthropic, OpenRouter, Ollama), different features like streaming and tool calling, and a wide range of error scenarios. The approach of mocking at the litellm.acompletion level and using real litellm.ModelResponse objects is excellent for verifying the response mapping logic. Overall, this is a high-quality addition that significantly improves confidence in the provider layer. I have one suggestion to enhance the multi-turn tool conversation test to make it more robust and reflect a more standard interaction pattern.
```python
async def test_multi_turn_tool_conversation(
    sample_tool_definitions: list[ToolDefinition],
) -> None:
    """Multi-turn: user -> assistant(tool_call) -> tool_result -> assistant."""
    driver = _make_driver()

    # Turn 1: user asks, model calls tool
    messages_t1 = [
        ChatMessage(role=MessageRole.USER, content="What's the weather?"),
    ]
    tc = build_tool_call_dict(
        call_id="call_w1",
        name="get_weather",
        arguments='{"location": "Tokyo"}',
    )
    mock_resp_t1 = build_model_response(
        content=None,
        tool_calls=[tc],
        finish_reason="tool_calls",
    )
    with patch(_PATCH_TARGET, new_callable=AsyncMock, return_value=mock_resp_t1):
        result_t1 = await driver.complete(
            messages_t1, "sonnet", tools=sample_tool_definitions
        )

    assert result_t1.finish_reason == FinishReason.TOOL_USE
    assert len(result_t1.tool_calls) == 1

    # Turn 2: include tool result, model responds with text
    messages_t2 = [
        ChatMessage(role=MessageRole.USER, content="What's the weather?"),
        ChatMessage(
            role=MessageRole.ASSISTANT,
            tool_calls=(
                ToolCall(
                    id="call_w1",
                    name="get_weather",
                    arguments={"location": "Tokyo"},
                ),
            ),
        ),
        ChatMessage(
            role=MessageRole.TOOL,
            tool_result=ToolResult(
                tool_call_id="call_w1",
                content="Sunny, 25°C",
            ),
        ),
        ChatMessage(role=MessageRole.USER, content="Tell me the result"),
    ]
    mock_resp_t2 = build_model_response(
        content="It's sunny and 25°C in Tokyo!",
        finish_reason="stop",
    )
    with patch(_PATCH_TARGET, new_callable=AsyncMock, return_value=mock_resp_t2):
        result_t2 = await driver.complete(
            messages_t2, "sonnet", tools=sample_tool_definitions
        )

    assert result_t2.content == "It's sunny and 25°C in Tokyo!"
    assert result_t2.finish_reason == FinishReason.STOP
    assert len(result_t2.tool_calls) == 0
```
This test is great for verifying a multi-turn tool conversation. However, it could be improved in two ways to be more robust and realistic:

1. Simplify the conversation flow: The final user message, `ChatMessage(role=MessageRole.USER, content="Tell me the result")`, is not typical in a tool-use conversation. The assistant should be able to generate a response based on the tool result without an extra prompt. Removing this message would make the test case reflect a more standard interaction pattern.
2. Assert on forwarded messages: The test currently only asserts on the final `CompletionResponse`. It would be more thorough to also assert that the conversation history (`messages_t2`) is correctly formatted and passed to the underlying `litellm.acompletion` call. This would ensure the message mapping logic for `TOOL` and `ASSISTANT` (with tool calls) roles is working as expected.
Here's a suggested implementation that incorporates these points:
```python
async def test_multi_turn_tool_conversation(
    sample_tool_definitions: list[ToolDefinition],
) -> None:
    """Multi-turn: user -> assistant(tool_call) -> tool_result -> assistant."""
    driver = _make_driver()

    # Turn 1: user asks, model calls tool
    messages_t1 = [
        ChatMessage(role=MessageRole.USER, content="What's the weather in Tokyo?"),
    ]
    tc = build_tool_call_dict(
        call_id="call_w1",
        name="get_weather",
        arguments='{"location": "Tokyo"}',
    )
    mock_resp_t1 = build_model_response(
        content=None,
        tool_calls=[tc],
        finish_reason="tool_calls",
    )
    with patch(_PATCH_TARGET, new_callable=AsyncMock, return_value=mock_resp_t1):
        result_t1 = await driver.complete(
            messages_t1, "sonnet", tools=sample_tool_definitions
        )
    assert result_t1.finish_reason == FinishReason.TOOL_USE
    assert len(result_t1.tool_calls) == 1

    assistant_message = ChatMessage(
        role=MessageRole.ASSISTANT,
        content=result_t1.content,
        tool_calls=result_t1.tool_calls,
    )

    # Turn 2: include tool result, model responds with text
    messages_t2 = [
        *messages_t1,
        assistant_message,
        ChatMessage(
            role=MessageRole.TOOL,
            tool_result=ToolResult(
                tool_call_id="call_w1",
                content="Sunny, 25°C",
            ),
        ),
    ]
    mock_resp_t2 = build_model_response(
        content="It's sunny and 25°C in Tokyo!",
        finish_reason="stop",
    )
    with patch(
        _PATCH_TARGET, new_callable=AsyncMock, return_value=mock_resp_t2
    ) as mock_call_t2:
        result_t2 = await driver.complete(
            messages_t2, "sonnet", tools=sample_tool_definitions
        )

    # Assert on final response
    assert result_t2.content == "It's sunny and 25°C in Tokyo!"
    assert result_t2.finish_reason == FinishReason.STOP
    assert not result_t2.tool_calls

    # Assert on messages forwarded to litellm
    kwargs = mock_call_t2.call_args.kwargs
    forwarded_messages = kwargs["messages"]
    assert len(forwarded_messages) == 3
    assert forwarded_messages[0]["role"] == "user"
    assert forwarded_messages[1]["role"] == "assistant"
    assert forwarded_messages[1]["tool_calls"][0]["id"] == "call_w1"
    assert forwarded_messages[2]["role"] == "tool"
    assert forwarded_messages[2]["tool_call_id"] == "call_w1"
    assert forwarded_messages[2]["content"] == "Sunny, 25°C"
```
Pull request overview
Adds an integration test suite for the provider adapter layer (LiteLLM-backed drivers + ProviderRegistry), using real litellm.ModelResponse objects while mocking litellm.acompletion to exercise the full mapping pipeline and error translation.
Changes:
- Adds end-to-end “pipeline” integration tests for Anthropic, OpenRouter, and Ollama adapters (config → registry → complete/stream).
- Adds integration coverage for tool-calling (non-stream + streaming accumulation + multi-turn tool conversations).
- Adds integration coverage for LiteLLM exception → `ProviderError` mapping, including retryability + retry-after parsing.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| `tests/integration/providers/test_anthropic_pipeline.py` | Anthropic pipeline coverage: config/aliasing, kwargs forwarding, cost computation, basic streaming behavior. |
| `tests/integration/providers/test_openrouter_pipeline.py` | OpenRouter behaviors: base_url/api_base forwarding, provider-prefixed model IDs, alias resolution, cost. |
| `tests/integration/providers/test_ollama_pipeline.py` | Ollama behaviors: no api_key, localhost api_base, zero-cost pricing, response mapping. |
| `tests/integration/providers/test_tool_calling_pipeline.py` | Tool definition forwarding + tool call extraction and streaming accumulation; multi-turn tool conversation. |
| `tests/integration/providers/test_error_scenarios.py` | Cross-provider error mapping coverage from LiteLLM exceptions to internal ProviderError hierarchy. |
| `tests/integration/providers/conftest.py` | Shared integration fixtures and builders for real ModelResponse + streaming chunk helpers and provider configs. |
| `tests/integration/providers/__init__.py` | Package marker for integration provider tests. |
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integration/providers/conftest.py`:
- Around line 120-122: In build_model_response, avoid mutating the message dict
after construction; instead construct message in a single expression that
includes "role" and "content" plus an optional "tool_calls" key only when
tool_calls is not None (reference symbols: build_model_response, message,
tool_calls). Replace the two-step creation/mutation with a single immutable
construction (e.g., combining the base keys with a conditional small dict or
using a dict-union/merge expression) so no post-creation assignment to message
occurs.
In `@tests/integration/providers/test_error_scenarios.py`:
- Line 44: The pytest file currently sets only pytestmark =
pytest.mark.integration; add an explicit 30-second timeout marker for
consistency by changing pytestmark to include the timeout (e.g., set pytestmark
to a list or chained markers such as pytestmark = [pytest.mark.integration,
pytest.mark.timeout(30)]) so the file applies both the integration marker and
the 30s timeout.
In `@tests/integration/providers/test_ollama_pipeline.py`:
- Line 20: The module-level pytest marker only sets pytest.mark.integration and
is missing the required 30-second timeout; update the module-level pytestmark to
include the timeout marker so each test in this file has a 30s bound (e.g.,
change pytestmark to include pytest.mark.timeout(30) alongside
pytest.mark.integration) — look for the pytestmark symbol in
tests/integration/providers/test_ollama_pipeline.py and make it a list
containing both markers.
In `@tests/integration/providers/test_openrouter_pipeline.py`:
- Line 20: The module-level pytest marker declaration only sets pytestmark =
pytest.mark.integration but must also include an explicit 30-second timeout
marker; update the module-level pytestmark to include both markers (e.g., make
pytestmark a list containing pytest.mark.integration and
pytest.mark.timeout(30)) so the test module is marked as integration and has an
explicit 30s timeout.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (7)
- tests/integration/providers/__init__.py
- tests/integration/providers/conftest.py
- tests/integration/providers/test_anthropic_pipeline.py
- tests/integration/providers/test_error_scenarios.py
- tests/integration/providers/test_ollama_pipeline.py
- tests/integration/providers/test_openrouter_pipeline.py
- tests/integration/providers/test_tool_calling_pipeline.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Agent
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`**/*.py`:
- Use Python 3.14+ with PEP 649 native lazy annotations
- Do not use `from __future__ import annotations` — Python 3.14 has PEP 649
- Use PEP 758 except syntax: `except A, B:` (no parentheses) — ruff enforces this on Python 3.14
- Add type hints to all public functions, enforced by mypy strict mode
- Use Google style docstrings on all public classes and functions, enforced by ruff D rules
- Create new objects instead of mutating existing ones — enforce immutability
- Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict`
- Keep line length to 88 characters, enforced by ruff
- Keep functions under 50 lines and files under 800 lines
- Handle errors explicitly, never silently swallow exceptions
- Validate at system boundaries: user input, external APIs, and config files

Files: all seven files under `tests/integration/providers/` in this PR.
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`tests/**/*.py`:
- Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
- Use `asyncio_mode = 'auto'` in pytest — no manual `@pytest.mark.asyncio` needed
- Set 30-second timeout per test

Files: all seven files under `tests/integration/providers/` in this PR.
🔇 Additional comments (9)

tests/integration/providers/__init__.py (1)
- 1-1: Module scaffolding looks good. Clean and appropriate package-level docstring for the integration test namespace.

tests/integration/providers/conftest.py (1)
- 33-307: Strong shared test infrastructure. The config factories plus `ModelResponse`/stream builders give good integration-level realism while keeping tests deterministic.

tests/integration/providers/test_ollama_pipeline.py (1)
- 25-97: Ollama pipeline checks are well targeted. Good coverage of key local-provider behaviors: omitted API key, base URL forwarding, zero-cost usage mapping, and finish/model/request-id mapping.

tests/integration/providers/test_openrouter_pipeline.py (1)
- 25-124: OpenRouter integration scenarios are comprehensive. Great coverage of base_url propagation, model transformation, API key forwarding, full mapping, and multi-model alias cost behavior.

tests/integration/providers/test_anthropic_pipeline.py (2)
- 36-303: Excellent end-to-end coverage for Anthropic adapter behavior. The suite exercises aliasing, config forwarding, cost mapping, finish reasons, streaming events, and multi-turn message serialization in a realistic way.
- 28-28: No action needed. This module already complies with the timeout requirement through the global pytest configuration in `pyproject.toml`, which sets `timeout = 30` for all tests. Adding an explicit `pytest.mark.timeout(30)` marker is redundant. Likely an incorrect or invalid review comment.

tests/integration/providers/test_error_scenarios.py (1)
- 131-301: Error-mapping integration coverage is strong. Good depth across retryability flags, `retry_after` extraction, and streaming/non-streaming exception translation paths.

tests/integration/providers/test_tool_calling_pipeline.py (2)
- 50-318: Tool-calling integration coverage is excellent. The suite validates extraction, forwarding, streaming accumulation, mixed event flows, and realistic multi-turn tool-result conversations.
- 35-35: No action required — 30-second timeout is already enforced globally. The pytest configuration in `pyproject.toml` sets `timeout = 30` in `[tool.pytest.ini_options]`, which automatically applies a 30-second timeout to all tests. Adding an explicit `pytest.mark.timeout(30)` to the pytestmark would be redundant. Likely an incorrect or invalid review comment.
…mini
- Add pytest.mark.timeout(30) to all 5 integration test files
- Build message dict immutably in conftest.py build_model_response
- Strengthen error context assertions (provider, model) across all error tests
- Add retry_after assertion to streaming rate limit test
- Add streaming AuthenticationError test (stream setup failure)
- Add streaming ConnectionError test (mid-stream failure)
- Add ModelNotFoundError test for unknown model alias
- Add ProviderError passthrough test (re-raise without double-wrapping)
- Strengthen unknown exception message assertion with full context
- Add cost_usd assertion to streaming usage test
- Add stop_sequences and top_p to CompletionConfig forwarding test
- Add malformed JSON tool call args test (silent degradation to {})
- Assert forwarded messages in multi-turn tool conversation test
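The "malformed JSON tool call args test (silent degradation to {})" item above can be illustrated with a small sketch. This is a hypothetical stand-in for the driver's argument parsing, assuming the documented behavior: a tool call's JSON argument string that fails to parse degrades silently to an empty dict rather than raising.

```python
# Hedged sketch of the behavior the new test covers; parse_tool_arguments is
# an illustrative name, not the repo's actual function.
import json


def parse_tool_arguments(raw: str) -> dict:
    """Parse a tool call's JSON argument string, degrading to {} on bad input."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    # Non-object JSON (e.g. a bare list or string) also degrades to {}.
    return parsed if isinstance(parsed, dict) else {}
```

A test for this behavior would feed a malformed argument string through the mocked pipeline and assert the extracted `ToolCall.arguments` equals `{}`.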
🤖 I have created a release *beep* *boop*

---

## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Summary
39 integration tests exercising the full provider pipeline (config → registry → driver.complete/stream), mocked at the `litellm.acompletion` level using real `litellm.ModelResponse` objects (not MagicMock) and exercising the actual attribute-access paths through `_map_response`, `_process_chunk`, and `extract_tool_calls`.

Test files
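The mocking seam described above can be sketched as follows. This is a stdlib-only illustration, not the suite's actual code: `SimpleNamespace` stands in for litellm's `ModelResponse` shape, and `map_response` is a hypothetical helper mirroring the attribute-access paths a `_map_response`-style function walks.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock

# Stand-in response shaped like litellm's ModelResponse; SimpleNamespace is a
# hypothetical stdlib substitute so this sketch runs without litellm installed.
response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(role="assistant", content="hi", tool_calls=None),
            finish_reason="stop",
        )
    ],
    usage=SimpleNamespace(prompt_tokens=5, completion_tokens=2),
)

# The real suite patches litellm.acompletion itself with a mock like this one.
acompletion = AsyncMock(return_value=response)

async def map_response() -> dict:
    # Walk the same attribute paths a response-mapping helper would exercise.
    raw = await acompletion(model="anthropic/claude-sonnet-4", messages=[])
    choice = raw.choices[0]
    return {
        "content": choice.message.content,
        "finish_reason": choice.finish_reason,
        "total_tokens": raw.usage.prompt_tokens + raw.usage.completion_tokens,
    }

result = asyncio.run(map_response())
print(result)  # {'content': 'hi', 'finish_reason': 'stop', 'total_tokens': 7}
```

Because the fake response uses real attribute access rather than MagicMock's auto-attributes, a typo like `raw.usage.promt_tokens` fails loudly, which is the point of mocking with concrete objects.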
- `test_anthropic_pipeline.py`
- `test_openrouter_pipeline.py`
- `test_ollama_pipeline.py`
- `test_error_scenarios.py`
- `test_tool_calling_pipeline.py`
- `conftest.py`

Verification
- `ruff check` — all passed
- `ruff format` — all formatted
- `mypy` — 0 errors (7 files)
- `pytest` — 1331 total tests pass, 94.49% coverage (80% required)

Closes #5
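The streaming tests consume chunk objects through a `_process_chunk`-style path; the accumulation pattern can be sketched like this. It is a hedged, stdlib-only illustration: `SimpleNamespace` stands in for litellm's streaming chunk objects, and `collect_text` is a hypothetical stand-in for the driver's stream handling.

```python
import asyncio
from types import SimpleNamespace

# Fake stream of delta chunks shaped like litellm's streaming output
# (a hypothetical stand-in; the real tests use litellm chunk objects).
async def fake_stream():
    for piece in ("Hel", "lo", None):  # a final empty-content chunk is typical
        yield SimpleNamespace(
            choices=[SimpleNamespace(delta=SimpleNamespace(content=piece))]
        )

async def collect_text() -> str:
    # Mirror what a chunk-processing helper does: pull delta.content,
    # skip empty deltas, and accumulate the pieces in order.
    parts: list[str] = []
    async for chunk in fake_stream():
        content = chunk.choices[0].delta.content
        if content:
            parts.append(content)
    return "".join(parts)

text = asyncio.run(collect_text())
print(text)  # Hello
```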
Test plan
`pytest -m integration`
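The error-scenario tests assert that low-level provider failures surface as typed errors. A minimal sketch of that pattern, with the caveat that both exception names here are hypothetical illustrations, not the project's actual error taxonomy:

```python
import asyncio
from unittest.mock import AsyncMock

# Hypothetical error types for illustration; the real suite covers rate limit,
# auth, timeout, and connection failures raised by litellm.
class FakeRateLimitError(Exception):
    pass

class ProviderRateLimited(Exception):
    pass

# Mock acompletion to raise, as a rate-limit scenario test would.
acompletion = AsyncMock(side_effect=FakeRateLimitError("429 Too Many Requests"))

async def complete_with_mapping() -> None:
    # A driver.complete-style wrapper translating the low-level error
    # into the adapter layer's own exception type.
    try:
        await acompletion(model="anthropic/claude-sonnet-4", messages=[])
    except FakeRateLimitError as exc:
        raise ProviderRateLimited(str(exc)) from exc

try:
    asyncio.run(complete_with_mapping())
    outcome = "no error"
except ProviderRateLimited as err:
    outcome = f"{type(err).__name__}: {err}"

print(outcome)  # ProviderRateLimited: 429 Too Many Requests
```

A test then asserts on the mapped exception type rather than the raw litellm error, which keeps callers decoupled from the underlying client library.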