feat: implement basic tool system (registry, invocation, results) (#15)#104
feat: implement basic tool system (registry, invocation, results) (#15)#104
Conversation
Adds the application-level tool system that bridges LLM tool calls with actual tool execution: - ToolError hierarchy (ToolNotFoundError, ToolParameterError, ToolExecutionError) - BaseTool ABC with ToolExecutionResult frozen model - ToolRegistry with immutable MappingProxyType backing - ToolInvoker with JSON Schema validation and SSRF-safe $ref resolution - EchoTool reference implementation - 65 unit tests with near-100% coverage on new modules Pre-reviewed by 8 agents, 17 findings addressed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.OpenSSF Scorecard
Scanned Files
|
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds a new tool subsystem: immutable ToolExecutionResult and BaseTool, a ToolError hierarchy, an immutable ToolRegistry, a ToolInvoker with JSON Schema validation and SSRF blocking, an EchoTool example, observability event constants, jsonschema dependency, and comprehensive unit tests and fixtures. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Invoker as ToolInvoker
participant Registry as ToolRegistry
participant Tool as BaseTool
Client->>Invoker: invoke(tool_call)
Invoker->>Registry: get(tool_name)
alt tool found
Registry-->>Invoker: BaseTool
Invoker->>Invoker: validate(arguments, schema) (jsonschema, no-remote-$ref)
alt validation OK
Invoker->>Tool: execute(arguments)
Tool-->>Invoker: ToolExecutionResult
Invoker-->>Client: ToolResult (is_error: false)
else validation failed
Invoker-->>Client: ToolResult (is_error: true, validation error)
end
else not found
Registry-->>Invoker: raises ToolNotFoundError
Invoker-->>Client: ToolResult (is_error: true, not found)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a foundational tool system that enables the application to effectively manage and execute tools based on LLM requests. It establishes clear abstractions for tools, robust error handling, and secure parameter validation, providing a reliable and extensible framework for integrating external functionalities. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
Greptile SummaryThis PR implements the complete application-level tool system (Issue #15, Design Spec §11.1–11.2), bridging LLM
Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant LLM as LLM Provider
participant TI as ToolInvoker
participant TR as ToolRegistry
participant BT as BaseTool
participant JS as jsonschema (SSRF-safe)
LLM->>TI: invoke(ToolCall)
TI->>TR: get(tool_call.name)
alt Tool not found
TR-->>TI: raises ToolNotFoundError
TI-->>LLM: ToolResult(is_error=True)
else Tool found
TR-->>TI: BaseTool instance
TI->>JS: validate(arguments, schema, _SAFE_REGISTRY)
alt SchemaError (invalid schema def)
JS-->>TI: raises SchemaError
TI-->>LLM: ToolResult(is_error=True)
else ValidationError (bad arguments)
JS-->>TI: raises ValidationError
TI-->>LLM: ToolResult(is_error=True)
else MemoryError / RecursionError
JS-->>TI: raises non-recoverable
TI-->>LLM: re-raises (propagates)
else Unexpected exception
JS-->>TI: raises Exception
TI-->>LLM: ToolResult(is_error=True)
else Validation passed
JS-->>TI: None (success)
TI->>BT: execute(arguments=...)
alt Execution raises MemoryError/RecursionError
BT-->>TI: raises non-recoverable
TI-->>LLM: re-raises (propagates)
else Execution raises Exception
BT-->>TI: raises Exception
TI-->>LLM: ToolResult(is_error=True)
else Execution returns result
BT-->>TI: ToolExecutionResult
TI-->>LLM: ToolResult(content, is_error)
end
end
end
Last reviewed commit: 88bc06f |
There was a problem hiding this comment.
Pull request overview
Implements the core “tool system” layer that maps provider ToolCalls to registered application tools, with JSON Schema validation, structured results, and observability events (Issue #15 / Design Spec §11.1–11.2).
Changes:
- Added
BaseTool+ToolExecutionResult, tool error hierarchy, and anEchoToolexample implementation. - Added
ToolRegistry(immutable mapping) andToolInvoker(schema validation + safe execution returningToolResulton failures). - Added tool-related observability event constants and pinned
jsonschema==4.26.0as a direct dependency; included unit test suite for the tool subsystem.
Reviewed changes
Copilot reviewed 14 out of 17 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Adds jsonschema to the locked dependencies. |
| pyproject.toml | Pins jsonschema==4.26.0 and adds a mypy override for jsonschema.*. |
| src/ai_company/tools/base.py | Introduces BaseTool abstraction and ToolExecutionResult. |
| src/ai_company/tools/errors.py | Adds ToolError hierarchy with immutable context mapping. |
| src/ai_company/tools/registry.py | Implements immutable ToolRegistry with lookup/list/definitions. |
| src/ai_company/tools/invoker.py | Implements tool invocation with JSON Schema validation and error-to-ToolResult conversion. |
| src/ai_company/tools/examples/echo.py | Adds a reference EchoTool implementation. |
| src/ai_company/tools/init.py | Exposes tool system public API surface. |
| src/ai_company/observability/events.py | Adds tool lifecycle event constants for structured logging. |
| tests/unit/tools/conftest.py | Adds fixtures and test-only tools for invoker/registry testing. |
| tests/unit/tools/test_base.py | Unit tests for BaseTool and ToolExecutionResult. |
| tests/unit/tools/test_errors.py | Unit tests for error hierarchy and context immutability. |
| tests/unit/tools/test_registry.py | Unit tests for registry behavior and immutability. |
| tests/unit/tools/test_invoker.py | Unit tests for invoker happy-path, validation, and failure modes. |
| tests/unit/tools/test_echo.py | Unit + integration-style tests for EchoTool pipeline. |
| tests/unit/tools/init.py | Adds tools test package marker. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| try: | ||
| result = await tool.execute(arguments=dict(tool_call.arguments)) | ||
| except MemoryError, RecursionError: | ||
| raise | ||
| except Exception as exc: | ||
| error_msg = str(exc) or f"{type(exc).__name__} (no message)" |
There was a problem hiding this comment.
invoke() catches all Exception from tool.execute(...), which will also catch asyncio.CancelledError when it subclasses Exception (common in recent Python versions). That breaks task cancellation semantics and contradicts the module docstring claim that CancelledError propagates. Add an explicit except asyncio.CancelledError: raise (and keep it before the broad except Exception) so cancellations are not converted into ToolResult errors.
src/ai_company/tools/registry.py
Outdated
| available = sorted(self._tools) or ["(none)"] | ||
| logger.warning( | ||
| TOOL_NOT_FOUND, | ||
| name=name, |
There was a problem hiding this comment.
For TOOL_NOT_FOUND logging, this uses the field key name=... while the rest of the tool logging consistently uses tool_name=... (e.g., all TOOL_INVOKE_* events). Using inconsistent keys makes log filtering/analytics harder; consider renaming name to tool_name here for consistency.
| name=name, | |
| tool_name=name, |
| Returns: | ||
| Tuple of results in the same order as the input. | ||
| """ | ||
| return tuple([await self.invoke(call) for call in tool_calls]) |
There was a problem hiding this comment.
invoke_all() builds a list via comprehension and then immediately converts it to a tuple (tuple([ ... ])), which does an extra allocation/copy. Prefer collecting results in a list and returning tuple(results), or other approaches that avoid the intermediate list-to-tuple conversion overhead.
| return tuple([await self.invoke(call) for call in tool_calls]) | |
| results: list[ToolResult] = [] | |
| for call in tool_calls: | |
| results.append(await self.invoke(call)) | |
| return tuple(results) |
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive and well-designed tool system, including an abstract base class for tools, a registry for managing them, and an invoker for executing them. While the implementation demonstrates strong attention to security (SSRF prevention), there are critical issues related to internal state mutation, version-dependent exception handling that could lead to a SyntaxError in some Python versions, and potential information exposure through error messages. Addressing these will ensure the system's reliability and security.
src/ai_company/tools/invoker.py
Outdated
|
|
||
| try: | ||
| result = await tool.execute(arguments=dict(tool_call.arguments)) | ||
| except MemoryError, RecursionError: |
There was a problem hiding this comment.
The except MemoryError, RecursionError: syntax is problematic. While it's valid in Python 3.13+ (PEP 758), in earlier Python 3 versions, it's interpreted as except MemoryError as RecursionError:. This means RecursionError is not caught as intended and the RecursionError built-in class name is shadowed. This could lead to SyntaxError in some Python 3 environments or incorrect exception handling, contradicting the design goal of propagating non-recoverable errors.
| except MemoryError, RecursionError: | |
| except (MemoryError, RecursionError): |
src/ai_company/tools/base.py
Outdated
| return ToolDefinition( | ||
| name=self._name, | ||
| description=self._description, | ||
| parameters_schema=self._parameters_schema or {}, |
There was a problem hiding this comment.
The to_definition method returns a ToolDefinition object that contains a reference to the tool's internal parameters_schema dictionary. Since dictionaries are mutable in Python, any external modification to the parameters_schema in the returned ToolDefinition will directly mutate the tool's internal state. This can lead to unexpected behavior or security bypasses if the schema is used for input validation in other parts of the system (like the ToolInvoker).
| parameters_schema=self._parameters_schema or {}, | |
| parameters_schema=copy.deepcopy(self._parameters_schema) if self._parameters_schema is not None else {}, |
| except MemoryError, RecursionError: | ||
| raise | ||
| except Exception as exc: | ||
| error_msg = str(exc) or f"{type(exc).__name__} (no message)" |
There was a problem hiding this comment.
When a tool execution fails, the raw exception message (str(exc)) is captured and returned to the LLM as part of the ToolResult. These error messages may contain sensitive information about the system's internal state, database schemas, or third-party service details, depending on the tool's implementation. An attacker could potentially use the LLM to trigger specific errors and exfiltrate this information.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/tools/invoker.py`:
- Around line 77-194: The invoke() method is too large; split it into small
private helpers: implement _get_tool(tool_name, tool_call_id) to perform
registry lookup and handle ToolNotFoundError, _validate_parameters(tool,
tool_call) to run jsonschema.validate and return ToolParameterError/ToolResult
on SchemaError/ValidationError, _execute_tool(tool, tool_call) to call await
tool.execute(...) and translate exceptions into ToolExecutionError (preserve the
MemoryError/RecursionError re-raise behavior), and
_map_execution_result(tool_call, result) to log
TOOL_INVOKE_TOOL_ERROR/TOOL_INVOKE_SUCCESS and return the final ToolResult; then
refactor invoke() to orchestrate these helpers so each helper is under ~50 lines
and invoke() becomes a thin coordinator calling _get_tool, _validate_parameters,
_execute_tool, and _map_execution_result.
- Around line 157-158: The except clause in invoke that catches MemoryError and
RecursionError re-raises them silently; update that clause in
src/ai_company/tools/invoker.py (function invoke) to log an ERROR-level message
with contextual details (including input args/state and exception info) before
re-raising, using the module's existing logger object (e.g., logger) and
exc_info=True; after adding the logging, refactor invoke into smaller helpers
(validate_params, execute_tool, handle_results, and handle_errors) to keep each
function under ~50 lines and separate validation, execution, and error-logging
responsibilities.
In `@src/ai_company/tools/registry.py`:
- Around line 64-68: The state-transition log using TOOL_REGISTRY_BUILT is
currently emitted with logger.debug; change that call to logger.info so the
event is logged at INFO level (replace the logger.debug(...) call that passes
TOOL_REGISTRY_BUILT, tool_count=len(self._tools), tools=sorted(self._tools) with
logger.info while preserving the same message and keyword args) — locate this in
the ToolRegistry/build (or the method containing the TOOL_REGISTRY_BUILT
emission) in registry.py and update the logging level only.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2aa8c784-cbed-4316-ac43-ffec8d080bef
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (16)
pyproject.tomlsrc/ai_company/observability/events.pysrc/ai_company/tools/__init__.pysrc/ai_company/tools/base.pysrc/ai_company/tools/errors.pysrc/ai_company/tools/examples/__init__.pysrc/ai_company/tools/examples/echo.pysrc/ai_company/tools/invoker.pysrc/ai_company/tools/registry.pytests/unit/tools/__init__.pytests/unit/tools/conftest.pytests/unit/tools/test_base.pytests/unit/tools/test_echo.pytests/unit/tools/test_errors.pytests/unit/tools/test_invoker.pytests/unit/tools/test_registry.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not usefrom __future__ import annotations— Python 3.14 has PEP 649 native lazy annotations
Useexcept A, B:syntax (without parentheses) for exception handling — ruff enforces this on Python 3.14
Include type hints on all public functions, validated with mypy strict mode
Use Google-style docstrings on all public classes and functions, enforced by ruff D rules
Create new objects rather than mutating existing ones — prioritize immutability
Use Pydantic v2 for data models withBaseModel,model_validator, andConfigDict
Enforce 88 character line length (ruff configured)
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files) in Python code
Every module with business logic must import logging with:from ai_company.observability import get_loggerthenlogger = get_logger(__name__)
Always use variable namelogger(not_logger, notlog) for the logging instance
Pure data models, enums, and re-exports do not require logging
Files:
tests/unit/tools/test_registry.pytests/unit/tools/test_errors.pysrc/ai_company/observability/events.pysrc/ai_company/tools/errors.pysrc/ai_company/tools/base.pysrc/ai_company/tools/__init__.pytests/unit/tools/test_base.pytests/unit/tools/test_echo.pysrc/ai_company/tools/registry.pytests/unit/tools/test_invoker.pytests/unit/tools/conftest.pysrc/ai_company/tools/invoker.pysrc/ai_company/tools/examples/echo.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e, or@pytest.mark.slow
Useasyncio_mode = 'auto'for async tests — no manual@pytest.mark.asyncioneeded
Enforce 30 second timeout per test
Use vendor-agnostic fake model IDs/names in tests (e.g.test-haiku-001,test-provider), never real vendor model IDs
Files:
tests/unit/tools/test_registry.pytests/unit/tools/test_errors.pytests/unit/tools/test_base.pytests/unit/tools/test_echo.pytests/unit/tools/test_invoker.pytests/unit/tools/conftest.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Never useimport logging,logging.getLogger(), orprint()in application code — use the project's logger
Use event name constants fromai_company.observability.eventsrather than inline strings
Use structured logging format:logger.info(EVENT, key=value)— neverlogger.info('msg %s', val)
All error paths must log at WARNING or ERROR level with context before raising
All state transitions must log at INFO level
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions
All provider calls must go throughBaseCompletionProviderwhich applies retry and rate limiting automatically
Files:
src/ai_company/observability/events.pysrc/ai_company/tools/errors.pysrc/ai_company/tools/base.pysrc/ai_company/tools/__init__.pysrc/ai_company/tools/registry.pysrc/ai_company/tools/invoker.pysrc/ai_company/tools/examples/echo.py
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
pyproject.toml: Pin all dependency versions using==inpyproject.toml
Organize dependencies into groups:test(pytest + plugins),dev(includes test + ruff, mypy, pre-commit, commitizen, pydantic)
Files:
pyproject.toml
🧠 Learnings (10)
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to src/ai_company/**/*.py : Use event name constants from `ai_company.observability.events` rather than inline strings
Applied to files:
src/ai_company/observability/events.py
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to src/ai_company/providers/**/*.py : Retryable errors include `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, and `ProviderInternalError` with `is_retryable=True`
Applied to files:
src/ai_company/tools/errors.py
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising
Applied to files:
src/ai_company/tools/errors.py
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to src/ai_company/providers/**/*.py : Non-retryable errors raise immediately without retry
Applied to files:
src/ai_company/tools/errors.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/conftest.py : Place shared pytest fixtures in `tests/conftest.py`
Applied to files:
tests/unit/tools/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures
Applied to files:
tests/unit/tools/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use appropriate fixture scopes (`function`, `class`, `module`, `session`) and document complex fixtures with docstrings
Applied to files:
tests/unit/tools/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Use pytest fixtures for test setup. Shared fixtures should be in `tests/conftest.py`
Applied to files:
tests/unit/tools/conftest.py
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to tests/unit/test_*.py : Write unit tests for new functionality using pytest in `tests/unit/` with `test_*.py` naming convention
Applied to files:
tests/unit/tools/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/unit/test_*.py : Write unit tests for new functionality using pytest. Place test files in `tests/unit/` with `test_*.py` naming convention.
Applied to files:
tests/unit/tools/conftest.py
🔇 Additional comments (14)
pyproject.toml (2)
154-156: LGTM!The mypy override for
jsonschema.*follows the established pattern for packages without complete type stubs, consistent with the existinglitellm.*override.
17-17: jsonschema 4.26.0 is correctly configured.The version is available on PyPI and properly pinned with
==per coding guidelines. It's correctly placed in main dependencies as a runtime requirement.src/ai_company/tools/examples/echo.py (1)
1-49: LGTM!Clean reference implementation of
BaseTool. The schema is well-defined withadditionalProperties: Falsefor strict validation, and the reliance on upstream validation byToolInvokeris correctly documented in the base class contract.src/ai_company/tools/__init__.py (1)
1-19: LGTM!Clean public API surface with alphabetically sorted
__all__. The re-exports provide convenient access to all tool system components from a single import path.src/ai_company/tools/base.py (2)
16-42: LGTM!Well-documented frozen model with clear note about shallow immutability of the
metadatadict. The field defaults and descriptions are appropriate.
45-132: LGTM!Solid design with defensive deep-copy on construction and property access. The
to_definition()method correctly passes the internal schema reference to the frozenToolDefinition, which is documented as read-only in its docstring. The abstractexecute()contract clearly states that arguments are pre-validated byToolInvoker.tests/unit/tools/test_registry.py (1)
1-121: LGTM!Comprehensive test coverage for
ToolRegistryincluding edge cases (empty registry, non-string containment, unhashable types) and immutability verification. Tests are properly marked with@pytest.mark.unitand follow the established patterns.tests/unit/tools/test_base.py (1)
1-145: LGTM!Thorough test coverage for both
ToolExecutionResultandBaseTool. The tests validate immutability, deep-copy semantics, name validation, and the abstract contract. The_ConcreteToolhelper is a clean minimal implementation for testing.tests/unit/tools/test_echo.py (1)
1-77: LGTM!Excellent test coverage for
EchoToolincluding unit tests for properties/execution and an integration test demonstrating the full pipeline (registry → invoker → result). The test uses vendor-agnostic IDs as required.tests/unit/tools/test_errors.py (1)
1-98: LGTM!Comprehensive test coverage for the error hierarchy including message storage, context immutability (both read-only and isolation from original dict), inheritance relationships, and string formatting. The tests validate the key design decisions around
MappingProxyTypefor context protection.tests/unit/tools/test_invoker.py (1)
15-229: Strong unit coverage forinvoke()andinvoke_all().This test module exercises the key success/error and ordering paths with clear assertions on
ToolResultshape and IDs.src/ai_company/tools/errors.py (1)
12-56: Error hierarchy and immutable context handling look solid.
ToolError+ specific subclasses are clear, andMappingProxyTypeprotects context mutation as intended.src/ai_company/observability/events.py (1)
111-122: Tool event constants are clean and consistent with the catalog.Naming and grouping align with the existing observability convention and support the new tool lifecycle logs.
tests/unit/tools/conftest.py (1)
15-184: Fixture scaffolding is well-structured and reusable.The private test tools plus
sample_registry/sample_invokerfixtures provide clear, deterministic setup for tool-system unit tests.
Source fixes: - Fix to_definition() schema leak: use property (deep copy) instead of raw _parameters_schema reference (Greptile, Gemini, type-analyzer) - Refactor invoke() into 4 private helpers to meet <50-line guideline (CodeRabbit) - Add logging before re-raising MemoryError/RecursionError (CodeRabbit) - Add catch-all except in _validate_params for SSRF/ref errors - Change TOOL_REGISTRY_BUILT log level from DEBUG to INFO (CodeRabbit) - Fix inconsistent log key name= to tool_name= in registry (Copilot) - Add debug log for TypeError in __contains__ (silent-failure-hunter) - Add logger to base.py, log before ValueError raise (logging-audit) - Fix execute() docstring: caveat no-schema case (comment-analyzer) - Fix registry module docstring "tuples" wording (comment-analyzer) - Remove TODO-like "future iteration" note from invoke_all (comment-analyzer) New tests (12 added, 77 total): - RecursionError re-raise propagation (2 tests) - jsonschema.SchemaError path with invalid tool schema (2 tests) - SSRF protection via blocked remote $ref resolution (2 tests) - Empty exception message fallback formatting (1 test) New event constants: - TOOL_INVOKE_NON_RECOVERABLE, TOOL_BASE_INVALID_NAME, TOOL_REGISTRY_CONTAINS_TYPE_ERROR All 1672 tests pass, mypy strict clean, ruff clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/tools/invoker.py`:
- Around line 57-59: Update the docstrings in src/ai_company/tools/invoker.py to
accurately describe the error contract: state that while most tool errors are
caught and converted to ToolResult(is_error=True), certain non-recoverable
exceptions are re-raised and not swallowed. Modify the docstrings for the public
entry points (e.g., invoke, any public ToolInvoker method) and any internal
methods that currently claim "all errors are caught" (referencing ToolResult and
the method names present in this file) so callers know which errors will be
returned as ToolResult and which will propagate.
- Around line 133-197: The _validate_params method is too large and should be
split into focused helpers: extract the jsonschema.validate call and try/except
into a new helper (e.g., _run_schema_validation(tool_call, schema) that returns
None or raises the underlying exception), and create small mapping helpers like
_map_schema_error(exc, tool_call) , _map_validation_error(exc, tool_call) and
_map_generic_error(exc, tool_call) that each log using TOOL_INVOKE_SCHEMA_ERROR
or TOOL_INVOKE_PARAMETER_ERROR and return the appropriate ToolResult (using
ToolResult and ToolParameterError) or None; then refactor _validate_params to
simply fetch schema, call _run_schema_validation with _SAFE_REGISTRY, catch
exceptions and delegate to the mapping helpers, keeping _validate_params under
50 lines.
In `@src/ai_company/tools/registry.py`:
- Around line 110-119: The __contains__ method currently only logs
TOOL_REGISTRY_CONTAINS_TYPE_ERROR on TypeError (unhashable inputs) but silently
returns False for hashable non-str inputs; update __contains__ (in class using
self._tools) to consistently emit the observability event whenever the queried
name is not a str (e.g., use isinstance(name, str) check) before membership
testing, calling logger.debug(TOOL_REGISTRY_CONTAINS_TYPE_ERROR,
name_type=type(name).__name__) for non-str inputs, and then return False; keep
the existing TypeError handling for unhashable values.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2323a323-ac50-448c-a2d1-abb71379af44
📒 Files selected for processing (6)
src/ai_company/observability/events.pysrc/ai_company/tools/base.pysrc/ai_company/tools/invoker.pysrc/ai_company/tools/registry.pytests/unit/tools/conftest.pytests/unit/tools/test_invoker.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not usefrom __future__ import annotations— Python 3.14 has PEP 649 native lazy annotations
Useexcept A, B:syntax (without parentheses) for exception handling — ruff enforces this on Python 3.14
Include type hints on all public functions, validated with mypy strict mode
Use Google-style docstrings on all public classes and functions, enforced by ruff D rules
Create new objects rather than mutating existing ones — prioritize immutability
Use Pydantic v2 for data models withBaseModel,model_validator, andConfigDict
Enforce 88 character line length (ruff configured)
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files) in Python code
Every module with business logic must import logging with:from ai_company.observability import get_loggerthenlogger = get_logger(__name__)
Always use variable namelogger(not_logger, notlog) for the logging instance
Pure data models, enums, and re-exports do not require logging
Files:
src/ai_company/tools/registry.pysrc/ai_company/tools/invoker.pysrc/ai_company/observability/events.pytests/unit/tools/test_invoker.pytests/unit/tools/conftest.pysrc/ai_company/tools/base.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Never useimport logging,logging.getLogger(), orprint()in application code — use the project's logger
Use event name constants fromai_company.observability.eventsrather than inline strings
Use structured logging format:logger.info(EVENT, key=value)— neverlogger.info('msg %s', val)
All error paths must log at WARNING or ERROR level with context before raising
All state transitions must log at INFO level
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions
All provider calls must go throughBaseCompletionProviderwhich applies retry and rate limiting automatically
Files:
src/ai_company/tools/registry.pysrc/ai_company/tools/invoker.pysrc/ai_company/observability/events.pysrc/ai_company/tools/base.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e, or@pytest.mark.slow
Useasyncio_mode = 'auto'for async tests — no manual@pytest.mark.asyncioneeded
Enforce 30 second timeout per test
Use vendor-agnostic fake model IDs/names in tests (e.g.test-haiku-001,test-provider), never real vendor model IDs
Files:
tests/unit/tools/test_invoker.pytests/unit/tools/conftest.py
🧠 Learnings (17)
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to src/ai_company/**/*.py : All state transitions must log at INFO level
Applied to files:
src/ai_company/tools/registry.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/**/*.py : All functions and methods should have appropriate logging using `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.
Applied to files:
src/ai_company/tools/registry.py
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to src/ai_company/**/*.py : Use structured logging format: `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`
Applied to files:
src/ai_company/tools/registry.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : All functions and methods should have appropriate logging for debugging and traceability. Use `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.
Applied to files:
src/ai_company/tools/registry.py
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to src/ai_company/**/*.py : Use DEBUG logging for object creation, internal flow, and entry/exit of key functions
Applied to files:
src/ai_company/tools/registry.py
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) for exception handling — ruff enforces this on Python 3.14
Applied to files:
src/ai_company/tools/invoker.py
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions
Applied to files:
src/ai_company/tools/invoker.py
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to **/*.py : Keep functions under 50 lines and files under 800 lines
Applied to files:
src/ai_company/tools/invoker.py
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR level with context before raising
Applied to files:
src/ai_company/tools/invoker.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions
Applied to files:
src/ai_company/tools/invoker.py
📚 Learning: 2026-03-05T14:30:10.714Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T14:30:10.714Z
Learning: Applies to src/ai_company/**/*.py : Use event name constants from `ai_company.observability.events` rather than inline strings
Applied to files:
src/ai_company/observability/events.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/conftest.py : Place shared pytest fixtures in `tests/conftest.py`
Applied to files:
tests/unit/tools/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use appropriate fixture scopes (`function`, `class`, `module`, `session`) and document complex fixtures with docstrings
Applied to files:
tests/unit/tools/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures
Applied to files:
tests/unit/tools/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Use pytest fixtures for test setup. Shared fixtures should be in `tests/conftest.py`
Applied to files:
tests/unit/tools/conftest.py
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to tests/unit/test_*.py : Write unit tests for new functionality using pytest in `tests/unit/` with `test_*.py` naming convention
Applied to files:
tests/unit/tools/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/unit/test_*.py : Write unit tests for new functionality using pytest. Place test files in `tests/unit/` with `test_*.py` naming convention.
Applied to files:
tests/unit/tools/conftest.py
🔇 Additional comments (4)
src/ai_company/observability/events.py (1)
111-125: Tool lifecycle event constants look solid.Clear, consistent event catalog extension for the tool subsystem.
tests/unit/tools/test_invoker.py (1)
12-336: Great invoker test coverage.The suite exercises happy paths, error mapping, schema failures, SSRF blocking behavior, and non-recoverable propagation in a clean, structured way.
src/ai_company/tools/base.py (1)
63-118: Base abstractions are implemented cleanly.Name validation, schema deep-copy protection, and
to_definition()mapping are all in good shape for safe tool contracts.tests/unit/tools/conftest.py (1)
15-282: Fixture scaffolding is strong and practical.The private tool doubles and invoker/registry fixtures make edge-case testing straightforward and keep test setup isolated.
| def __contains__(self, name: object) -> bool: | ||
| """Check whether a tool name is registered.""" | ||
| try: | ||
| return name in self._tools | ||
| except TypeError: | ||
| logger.debug( | ||
| TOOL_REGISTRY_CONTAINS_TYPE_ERROR, | ||
| name_type=type(name).__name__, | ||
| ) | ||
| return False |
There was a problem hiding this comment.
Log non-string membership checks consistently.
__contains__ logs TOOL_REGISTRY_CONTAINS_TYPE_ERROR only when input is unhashable. Hashable non-str values (for example, 1) return False without the observability event.
Proposed fix
def __contains__(self, name: object) -> bool:
"""Check whether a tool name is registered."""
- try:
- return name in self._tools
- except TypeError:
+ if not isinstance(name, str):
logger.debug(
TOOL_REGISTRY_CONTAINS_TYPE_ERROR,
name_type=type(name).__name__,
)
return False
+ return name in self._tools📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def __contains__(self, name: object) -> bool: | |
| """Check whether a tool name is registered.""" | |
| try: | |
| return name in self._tools | |
| except TypeError: | |
| logger.debug( | |
| TOOL_REGISTRY_CONTAINS_TYPE_ERROR, | |
| name_type=type(name).__name__, | |
| ) | |
| return False | |
| def __contains__(self, name: object) -> bool: | |
| """Check whether a tool name is registered.""" | |
| if not isinstance(name, str): | |
| logger.debug( | |
| TOOL_REGISTRY_CONTAINS_TYPE_ERROR, | |
| name_type=type(name).__name__, | |
| ) | |
| return False | |
| return name in self._tools |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/tools/registry.py` around lines 110 - 119, The __contains__
method currently only logs TOOL_REGISTRY_CONTAINS_TYPE_ERROR on TypeError
(unhashable inputs) but silently returns False for hashable non-str inputs;
update __contains__ (in class using self._tools) to consistently emit the
observability event whenever the queried name is not a str (e.g., use
isinstance(name, str) check) before membership testing, calling
logger.debug(TOOL_REGISTRY_CONTAINS_TYPE_ERROR, name_type=type(name).__name__)
for non-str inputs, and then return False; keep the existing TypeError handling
for unhashable values.
…refactors) - Add MemoryError/RecursionError re-raise in _validate_params before catch-all - Use dedicated TOOL_INVOKE_VALIDATION_UNEXPECTED event for catch-all handler - Extract 3 error handler helpers to keep _validate_params under 50 lines - Update docstrings to document non-recoverable error propagation - Use isinstance check in ToolRegistry.__contains__ instead of try/except Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| return self._schema_error_result(tool_call, exc.message) | ||
| except jsonschema.ValidationError as exc: | ||
| return self._param_error_result(tool_call, exc.message) | ||
| except (MemoryError, RecursionError) as exc: |
There was a problem hiding this comment.
PEP 758 no-parentheses convention not applied
CLAUDE.md explicitly states the project convention: "PEP 758 except syntax: use except A, B: (no parentheses) — ruff enforces this on Python 3.14." Both MemoryError/RecursionError guard clauses still use the parenthesized form.
The same violation occurs at line 240.
| except (MemoryError, RecursionError) as exc: | |
| except MemoryError, RecursionError as exc: |
Context Used: Rule from dashboard - CLAUDE.md (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/tools/invoker.py
Line: 157
Comment:
**PEP 758 no-parentheses convention not applied**
`CLAUDE.md` explicitly states the project convention: *"PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14."* Both `MemoryError`/`RecursionError` guard clauses still use the parenthesized form.
The same violation occurs at line 240.
```suggestion
except MemoryError, RecursionError as exc:
```
**Context Used:** Rule from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=6816cd03-d0e1-4fd0-9d04-2417487a584c))
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| """Execute the tool, catching errors as ``ToolResult``.""" | ||
| try: | ||
| return await tool.execute(arguments=dict(tool_call.arguments)) | ||
| except (MemoryError, RecursionError) as exc: |
There was a problem hiding this comment.
PEP 758 no-parentheses convention not applied (second occurrence)
Same CLAUDE.md convention applies here — except A, B: without parentheses. See also line 157.
| except (MemoryError, RecursionError) as exc: | |
| except MemoryError, RecursionError as exc: |
Context Used: Rule from dashboard - CLAUDE.md (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/tools/invoker.py
Line: 240
Comment:
**PEP 758 no-parentheses convention not applied (second occurrence)**
Same `CLAUDE.md` convention applies here — `except A, B:` without parentheses. See also line 157.
```suggestion
except MemoryError, RecursionError as exc:
```
**Context Used:** Rule from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=6816cd03-d0e1-4fd0-9d04-2417487a584c))
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| logger.exception( | ||
| TOOL_INVOKE_SCHEMA_ERROR, | ||
| tool_call_id=tool_call.id, | ||
| tool_name=tool_call.name, | ||
| error=error_msg, | ||
| ) |
There was a problem hiding this comment.
logger.exception called outside the immediate except scope
_schema_error_result is a helper method called from within an except jsonschema.SchemaError block, and _unexpected_validation_result is called from except Exception. Python does propagate sys.exc_info() into called functions, so both calls work correctly today — the traceback is captured. However, this is a fragile pattern: if either helper is ever called from a non-except context in a future refactor (e.g., called proactively with a pre-built error message), logger.exception() will silently log NoneType: None as the exception info, making the log record misleading without any error at the call site.
A more explicit and resilient approach is to accept the live exception as a parameter:
def _schema_error_result(
self,
tool_call: ToolCall,
error_msg: str,
exc: Exception | None = None,
) -> ToolResult:
logger.error(
TOOL_INVOKE_SCHEMA_ERROR,
...,
exc_info=exc,
)This makes the dependency on exception context explicit and prevents future accidental misuse.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/tools/invoker.py
Line: 176-181
Comment:
**`logger.exception` called outside the immediate `except` scope**
`_schema_error_result` is a helper method called from within an `except jsonschema.SchemaError` block, and `_unexpected_validation_result` is called from `except Exception`. Python does propagate `sys.exc_info()` into called functions, so both calls work correctly today — the traceback is captured. However, this is a fragile pattern: if either helper is ever called from a non-except context in a future refactor (e.g., called proactively with a pre-built error message), `logger.exception()` will silently log `NoneType: None` as the exception info, making the log record misleading without any error at the call site.
A more explicit and resilient approach is to accept the live exception as a parameter:
```python
def _schema_error_result(
self,
tool_call: ToolCall,
error_msg: str,
exc: Exception | None = None,
) -> ToolResult:
logger.error(
TOOL_INVOKE_SCHEMA_ERROR,
...,
exc_info=exc,
)
```
This makes the dependency on exception context explicit and prevents future accidental misuse.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 17 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Returns: | ||
| Tuple of results in the same order as the input. | ||
| """ | ||
| return tuple([await self.invoke(call) for call in tool_calls]) |
There was a problem hiding this comment.
This builds an intermediate list before converting to a tuple. For large tool call batches, avoid the extra allocation by accumulating results in a preallocated list + converting once, or by using a direct tuple construction approach that doesn’t materialize an intermediate list.
| def __init__( | ||
| self, | ||
| message: str, | ||
| *, | ||
| context: dict[str, Any] | None = None, | ||
| ) -> None: |
There was a problem hiding this comment.
The context parameter is typed as dict[str, Any], but the implementation immediately copies it and stores an immutable mapping. Accepting a broader Mapping[str, Any] | None (and documenting that it’s copied) would make the API more flexible for callers (e.g., when they already have an immutable mapping) without changing runtime behavior.
| logger.warning( | ||
| TOOL_NOT_FOUND, | ||
| tool_name=name, | ||
| available=available, | ||
| ) |
There was a problem hiding this comment.
A tool-miss is logged here (TOOL_NOT_FOUND) and then logged again in ToolInvoker._lookup_tool (TOOL_INVOKE_NOT_FOUND). Consider consolidating responsibility for logging misses in one layer (registry or invoker) to reduce duplicate log volume and make alerting/dashboards less noisy.
| logger.warning( | |
| TOOL_NOT_FOUND, | |
| tool_name=name, | |
| available=available, | |
| ) |
Summary
Implements the application-level tool system that bridges LLM
ToolCallobjects with actual tool execution (Issue #15, Design Spec §11.1-11.2).ToolError→ToolNotFoundError,ToolParameterError,ToolExecutionErrorwith immutableMappingProxyTypecontextToolExecutionResultfrozen Pydantic model, deep-copy schema protection,to_definition()for LLM integrationMappingProxyType-backed registry with duplicate rejection, sorted tool listing, andto_definitions()for provider layer$refresolution), executes tools, never propagates exceptions — all errors becomeToolResult(is_error=True)jsonschema==4.26.0added as explicit dependency (was only transitive via litellm)Key Design Decisions
ToolInvokerusesreferencing.Registrywith a custom retrieve function that raisesNoSuchResourceto block remote$refresolution (SSRF prevention)except MemoryError, RecursionError: raisebeforeexcept Exceptionensures non-recoverable errors propagate (PEP 758 syntax)BaseTool.parameters_schemaproperty returnscopy.deepcopy()to prevent external mutationinvoke_all()executes sequentially (parallel execution deferred to future iteration)Test Plan
pytest tests/ -n auto)Review Coverage
Pre-reviewed by 8 agents (code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, security-reviewer). 17 findings addressed:
Closes #15
🤖 Generated with Claude Code