Skip to content

feat: implement tool-based memory retrieval injection strategy#1039

Merged
Aureliolo merged 4 commits intomainfrom
feat/tool-based-memory-retrieval
Apr 3, 2026
Merged

feat: implement tool-based memory retrieval injection strategy#1039
Aureliolo merged 4 commits intomainfrom
feat/tool-based-memory-retrieval

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

Bridge search_memory and recall_memory tools into the standard ToolRegistry/ToolInvoker dispatch pipeline. The ToolBasedInjectionStrategy already handles memory retrieval logic; this adds BaseTool wrappers that delegate to it, following the established registry_with_approval_tool pattern.

Changes

New files

  • src/synthorg/memory/tools.py -- SearchMemoryTool, RecallMemoryTool (BaseTool subclasses), create_memory_tools factory, registry_with_memory_tools augmentation function
  • tests/unit/memory/test_memory_tools.py -- 48 unit tests

Modified files

  • src/synthorg/core/enums.py -- Added ToolCategory.MEMORY, ActionType.MEMORY_READ
  • src/synthorg/engine/agent_engine.py -- Added memory_injection_strategy parameter, wired into _make_tool_invoker
  • src/synthorg/memory/tool_retriever.py -- Made schema/name constants public, wrapped schemas with MappingProxyType, extracted error message constants
  • src/synthorg/security/action_type_mapping.py -- Added MEMORY -> MEMORY_READ mapping
  • src/synthorg/security/action_types.py -- Added ActionTypeCategory.MEMORY
  • src/synthorg/security/rules/risk_classifier.py -- Added MEMORY_READ as LOW risk
  • src/synthorg/security/timeout/risk_tier_classifier.py -- Added MEMORY_READ as LOW risk
  • src/synthorg/tools/permissions.py -- Added MEMORY to SANDBOXED/RESTRICTED/STANDARD access levels
  • docs/design/memory.md -- Added ToolRegistry integration details to Tool-Based Retrieval section
  • docs/design/operations.md -- Added MEMORY to Tool Categories table and memory:read to Action Type Taxonomy
  • CLAUDE.md -- Updated memory/ package description

Design decisions

  • Per-agent tool binding: Tools are constructed with agent_id baked in (via NotBlankStr), preventing cross-agent memory leakage
  • Thin wrappers: BaseTool.execute() delegates entirely to ToolBasedInjectionStrategy.handle_tool_call() -- no duplicate logic
  • Shared error constants: Error message strings defined once in tool_retriever.py, imported by tools.py for _is_error_response detection
  • Graceful degradation: registry_with_memory_tools catches construction failures and returns the original registry
  • MEMORY at all access levels: Read-only, agent-scoped -- safe from SANDBOXED upward

Test plan

  • 48 new tests covering: tool properties, delegation, error paths, schema isolation, factory, registry augmentation, round-trips, error response detection, generic exception paths, graceful degradation
  • Updated existing parametrize lists for MEMORY category in permissions, risk classifiers, action type mapping, and enum count tests
  • Full suite: 13151 passed, 0 failed

Review coverage

Pre-reviewed by 8 agents (code-reviewer, conventions-enforcer, silent-failure-hunter, type-design-analyzer, docs-consistency, test-quality-reviewer, issue-resolution-verifier, logging-audit). 15 findings addressed.

Closes #207

Bridge search_memory and recall_memory tools into the standard
ToolRegistry/ToolInvoker dispatch pipeline. The ToolBasedInjection
Strategy already handles memory retrieval logic; this adds BaseTool
wrappers (SearchMemoryTool, RecallMemoryTool) that delegate to it.

- Add MEMORY to ToolCategory enum and MEMORY_READ to ActionType
- Add ToolCategory.MEMORY to all permission access levels
- Create memory/tools.py with BaseTool wrappers, factory function,
  and registry_with_memory_tools() augmentation
- Wire memory_injection_strategy into AgentEngine._make_tool_invoker
- Add MEMORY_READ to both risk classifier maps (LOW tier)
- Add ActionTypeCategory.MEMORY for action type taxonomy
- Make tool_retriever.py schema constants public for reuse
- Full test coverage in test_memory_tools.py (33 tests)

Closes #207
- Extract error message constants from tool_retriever.py, import in
  tools.py to eliminate fragile string duplication
- Wrap SEARCH_MEMORY_SCHEMA and RECALL_MEMORY_SCHEMA with
  MappingProxyType + Final for read-only enforcement
- Replace MEMORY_RETRIEVAL_START with TOOL_REGISTRY_BUILT event in
  registry_with_memory_tools (semantic correctness)
- Remove __slots__ from tool classes (BaseTool has no __slots__)
- Change agent_id type to NotBlankStr for construction-time validation
- Add graceful degradation for registry construction failures
- Fix whitespace-only memory_id validation gap
- Update docs/design/operations.md with MEMORY category and memory:read
- Update docs/design/memory.md with ToolRegistry integration details
- Update CLAUDE.md memory/ package description
- Add _is_error_response direct tests (10 tests)
- Add generic exception path tests (2 tests)
- Add MEMORY_READ to risk classifier parametrize lists
- Add MEMORY to permissions parametrize lists
- Add graceful degradation test for duplicate tool names
Copilot AI review requested due to automatic review settings April 3, 2026 14:42
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview April 3, 2026 14:43 — with GitHub Actions Inactive
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 586f154c-2804-45a0-9d2e-156534d18ed6

📥 Commits

Reviewing files that changed from the base of the PR and between 87a69c7 and 665179e.

📒 Files selected for processing (2)
  • src/synthorg/memory/tools.py
  • tests/unit/memory/test_memory_tools.py
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: use except A, B: (no parentheses)—ruff enforces this on Python 3.14.
All public functions must have type hints; mypy strict mode is required.
Docstrings must use Google style and are required on all public classes and functions—enforced by ruff D rules.
Create new objects instead of mutating existing ones—use immutability as the default pattern.
For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement.
Use model_copy(update=...) for runtime state that evolves; separate mutable-via-copy models from frozen Pydantic models for config/identity. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Use @computed_field for derived values instead of storing and validating redundant fields (e.g. TokenUsage.total_tokens).
Use NotBlankStr from core.types for all identifier/name fields—including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task. Existing code is being migrated incrementally.
Function line length limit: 88 characters; enforced by ruff.
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly; never silently swallow exceptions.
Validate at system boundaries (user input, external APIs, config files).
Logger variable name must always be logger (not _logger, not log).
Always use structured logging with kwargs: `l...

Files:

  • tests/unit/memory/test_memory_tools.py
  • src/synthorg/memory/tools.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow.
Prefer @pytest.mark.parametrize for testing similar cases in Python tests.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names in tests: test-provider, test-small-001, etc.
Use Hypothesis for property-based testing in Python with @given and @settings decorators. Hypothesis profiles: ci (50 examples, default) and dev (1000 examples). Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties.
Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Files:

  • tests/unit/memory/test_memory_tools.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/memory/test_memory_tools.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__).
Never use import logging / logging.getLogger() / print() in application code. Exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr) for handler construction and bootstrap.
Always use event name constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001. Vendor names may only appear in: (1) Operations design page, (2) .claude/ files, (3) third-party import paths, (4) provider presets. Tests must use test-provider, test-small-001, etc.

Files:

  • src/synthorg/memory/tools.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/memory/tools.py
🧠 Learnings (9)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)

Applied to files:

  • tests/unit/memory/test_memory_tools.py
  • src/synthorg/memory/tools.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory

Applied to files:

  • src/synthorg/memory/tools.py
📚 Learning: 2026-04-03T13:34:37.835Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T13:34:37.835Z
Learning: Applies to **/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement.

Applied to files:

  • src/synthorg/memory/tools.py
📚 Learning: 2026-04-01T09:09:43.948Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:09:43.948Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement in non-Pydantic internal collections (registries, BaseTool)

Applied to files:

  • src/synthorg/memory/tools.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement

Applied to files:

  • src/synthorg/memory/tools.py
📚 Learning: 2026-04-01T09:58:27.410Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:58:27.410Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization) for dict/list fields in frozen Pydantic models

Applied to files:

  • src/synthorg/memory/tools.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization) for `dict`/`list` fields

Applied to files:

  • src/synthorg/memory/tools.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.

Applied to files:

  • src/synthorg/memory/tools.py
🔇 Additional comments (14)
src/synthorg/memory/tools.py (7)

1-39: Module setup and imports look good.

The module docstring is clear and follows Google style. The imports are well-organized with TYPE_CHECKING for type-only imports, the logger is correctly initialized using get_logger(__name__), and error prefixes are imported from the single source of truth in tool_retriever.py.


41-56: Error classification is well-documented and correctly implemented.

The contract comment at lines 41-43 clearly states the maintenance requirement. The _is_error_response function correctly uses startswith to match error prefixes, and the test suite (lines 552-584) verifies edge cases like "No memories found." not being classified as an error.


59-113: SearchMemoryTool implementation is clean and follows the delegation pattern.

The class correctly:

  • Uses keyword-only arguments in __init__
  • Documents the shallow dict() conversion with a comment explaining that BaseTool.__init__ handles the defensive deepcopy
  • Binds agent_id at construction time for per-agent tool instances
  • Delegates execution to the strategy and wraps results appropriately

116-166: RecallMemoryTool correctly mirrors the SearchMemoryTool pattern.

The implementation follows the same delegation pattern with appropriate differences in name, description, and schema. The symmetry between the two tool classes is intentional and aids maintainability.


169-195: Factory function follows best practices.

The create_memory_tools factory:

  • Uses keyword-only arguments
  • Returns an immutable tuple
  • Logs at DEBUG level (appropriate for object creation per coding guidelines)
  • Uses structured logging with kwargs

198-213: Helper correctly constructs augmented registry.

The local import avoids circular dependencies, and the spread pattern [*existing, *memory_tools] creates a new list passed to the ToolRegistry constructor, preserving immutability.


216-267: Registry augmentation handles errors correctly.

The exception handling properly:

  • Re-raises MemoryError and RecursionError (system errors)
  • Re-raises ValueError (configuration errors like duplicate names) per the commit fix
  • Gracefully degrades for unexpected exceptions with appropriate WARNING logging

This addresses the previous review feedback about not silently downgrading reserved-name collisions.

tests/unit/memory/test_memory_tools.py (7)

1-85: Test fixtures and helpers are well-structured.

The helper functions (_make_entry, _make_backend, _tool_config, _make_strategy, _make_empty_registry) provide clean test setup. The _DummyTool class is a minimal concrete implementation for testing registry augmentation.


87-220: SearchMemoryTool tests provide comprehensive coverage.

The tests cover:

  • Tool identity and metadata (name, category, description, schema)
  • Schema immutability via deep-copy verification
  • Successful execution delegation
  • Error cases (blank query, no results, backend errors)
  • Security: verifying internal error details aren't leaked (lines 208-209)

222-360: RecallMemoryTool tests correctly distinguish "not found" as an error.

The test at line 304 asserts that a not-found recall is classified as an error (is_error=True), which is correct behavior since RECALL_NOT_FOUND_PREFIX is in _ERROR_PREFIXES. This correctly distinguishes from search's "No memories found." (empty results are valid), while recall's "Memory not found: {id}" indicates the specific requested memory doesn't exist.


362-414: Factory tests verify agent-scoped isolation.

The test_agent_id_bound_to_tools test (lines 398-413) is particularly important—it verifies that the bound agent_id is correctly passed through to backend calls, ensuring memory isolation between agents.


416-547: Registry integration tests cover the ValueError propagation fix.

The test_duplicate_names_raise_value_error test (lines 532-546) directly validates the fix from the previous review—duplicate tool names now properly raise ValueError instead of being silently caught. The round-trip tests (lines 477-516) provide excellent end-to-end coverage.


549-584: Error classification tests include important edge cases.

The test_partial_match_not_error test (lines 582-584) verifies that "Error occurred" is not misclassified—only strings starting with "Error:" (with colon) match. The test_no_memories_found_is_not_error test (lines 572-574) documents the intentional distinction between empty search results (valid) and actual errors.


587-621: Generic exception tests enforce error message sanitization.

These tests verify that when unexpected exceptions occur, the error messages returned to users contain generic language ("unexpected error") and don't leak internal exception details like "internal boom" or "refused". This is important for security.


Walkthrough

Adds a tool-based memory injection path: search_memory and recall_memory BaseTool wrappers delegate to a ToolBasedInjectionStrategy via handle_tool_call and can be injected into an agent’s ToolRegistry at runtime using a new AgentEngine constructor parameter memory_injection_strategy and helper registry_with_memory_tools(). Introduces ToolCategory.MEMORY and ActionType.MEMORY_READ, updates security mappings and permission tables, refactors memory retriever constants/validation and tool schemas, adds observability event for augmentation failures, and includes tests and docs covering wiring, behavior, and error handling.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 24.42% which is insufficient. The required threshold is 40.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: implement tool-based memory retrieval injection strategy' accurately summarizes the main change: integrating memory tools into the ToolRegistry/ToolInvoker pipeline via BaseTool wrappers.
Description check ✅ Passed The description comprehensively explains the changes (new files, modified files), design decisions (per-agent binding, thin wrappers, error constants, graceful degradation), and test coverage (48 new tests, 13,151 passed).
Linked Issues check ✅ Passed The PR fully implements the requirements from issue #207: tool-based memory retrieval (recall_memory and search_memory tools), MemoryInjectionStrategy protocol compliance, schema exposure via get_tool_definitions(), and ToolRegistry/ToolInvoker integration.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing tool-based memory retrieval: core enums (MEMORY category, MEMORY_READ action type), security/action mappings, permission rules, agent engine wiring, and comprehensive test coverage. No unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 665179e.
Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice.

Scanned Files

None

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request integrates memory retrieval capabilities into the standard tool execution pipeline by introducing SearchMemoryTool and RecallMemoryTool. Key changes include updating the AgentEngine to register these tools, expanding the security taxonomy with a new MEMORY_READ action type, and adding comprehensive unit tests. Feedback points out that the error detection logic in src/synthorg/memory/tools.py is fragile because it relies on string prefix matching, which could cause false positives; refactoring to use custom exceptions or unique error markers is recommended for better robustness.

Comment on lines +50 to +52
def _is_error_response(text: str) -> bool:
"""Check whether the strategy response indicates an error."""
return any(text.startswith(prefix) for prefix in _ERROR_PREFIXES)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current error detection mechanism, which relies on checking if the response string starts with common error prefixes like "Error:" or "Memory not found:", is fragile. A valid memory entry could coincidentally start with one of these strings, leading to a false positive where a successful retrieval is incorrectly flagged as an error.

To make this more robust, I suggest one of two approaches:

  1. (Recommended) Refactor the error handling to use custom exceptions instead of returning error strings. The handle_tool_call method in ToolBasedInjectionStrategy could raise specific exceptions (e.g., MemoryNotFoundError, MemorySearchError), and the execute methods in the tools would catch these and create a ToolExecutionResult with is_error=True.

  2. (Simpler fix) If refactoring to exceptions is too large a change, make the error prefixes more unique and less likely to appear in normal content. For example, you could introduce a special prefix like __MEMORY_TOOL_ERROR__: and prepend it to all error messages.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 3, 2026

Codecov Report

❌ Patch coverage is 95.06173% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.48%. Comparing base (760bfbd) to head (665179e).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/synthorg/memory/tools.py 91.11% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1039      +/-   ##
==========================================
+ Coverage   91.45%   91.48%   +0.02%     
==========================================
  Files         686      687       +1     
  Lines       38324    38392      +68     
  Branches     3818     3821       +3     
==========================================
+ Hits        35051    35124      +73     
+ Misses       2613     2608       -5     
  Partials      660      660              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Integrates tool-based memory retrieval (search_memory, recall_memory) into the standard tool dispatch pipeline by wrapping the existing ToolBasedInjectionStrategy with BaseTool implementations and wiring them into ToolRegistry construction per agent execution.

Changes:

  • Added SearchMemoryTool / RecallMemoryTool wrappers plus registry augmentation helpers for ToolRegistry integration.
  • Introduced ToolCategory.MEMORY and ActionType.MEMORY_READ, and updated security/permissions/risk classification to recognize them.
  • Added extensive unit tests for the new memory tools and updated enum/mapping/risk/permissions tests accordingly.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/synthorg/memory/tools.py New BaseTool wrappers for memory retrieval and helper to augment registries.
src/synthorg/memory/tool_retriever.py Exposes tool names/schemas/constants for reuse; hardens recall arg validation; centralizes error strings.
src/synthorg/engine/agent_engine.py Adds optional memory_injection_strategy and augments per-run tool registries when tool-based strategy is provided.
src/synthorg/core/enums.py Adds ToolCategory.MEMORY and ActionType.MEMORY_READ.
src/synthorg/security/action_type_mapping.py Maps ToolCategory.MEMORYActionType.MEMORY_READ.
src/synthorg/security/action_types.py Adds ActionTypeCategory.MEMORY to taxonomy.
src/synthorg/security/rules/risk_classifier.py Classifies MEMORY_READ as LOW risk.
src/synthorg/security/timeout/risk_tier_classifier.py Classifies MEMORY_READ as LOW risk for timeout tiering.
src/synthorg/tools/permissions.py Allows ToolCategory.MEMORY across SANDBOXED/RESTRICTED/STANDARD access levels.
tests/unit/memory/test_memory_tools.py New unit tests covering tool behavior, delegation, registry augmentation, and error detection.
tests/unit/tools/test_permissions.py Updates permissions matrix tests to include MEMORY category.
tests/unit/tools/test_base_action_type.py Ensures default action type mapping includes MEMORY → MEMORY_READ.
tests/unit/security/test_action_type_mapping.py Updates mapping tests to include MEMORY category.
tests/unit/security/test_action_types.py Updates taxonomy tests for new MEMORY category.
tests/unit/security/rules/test_risk_classifier.py Adds MEMORY_READ to LOW-risk action list.
tests/unit/security/timeout/test_risk_tier_classifier.py Adds MEMORY_READ to LOW-risk tier assertions.
tests/unit/core/test_enums.py Updates ActionType member count to include MEMORY_READ.
docs/design/memory.md Documents ToolRegistry integration for tool-based strategy.
docs/design/operations.md Adds MEMORY to tool categories and memory:read to action taxonomy.
CLAUDE.md Updates repository package description to mention the new memory tool wrappers.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +73 to +81
super().__init__(
name=SEARCH_MEMORY_TOOL_NAME,
description=(
"Search agent memory for relevant past context, "
"decisions, or learned information."
),
parameters_schema=copy.deepcopy(dict(SEARCH_MEMORY_SCHEMA)),
category=ToolCategory.MEMORY,
)
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parameters_schema is deep-copied here before passing into BaseTool, but BaseTool.__init__ already deep-copies and freezes the schema internally. This results in an unnecessary extra deepcopy for both tools (minor but avoidable overhead and duplication). Consider passing dict(SEARCH_MEMORY_SCHEMA) / dict(RECALL_MEMORY_SCHEMA) (or even the mapping directly) and rely on BaseTool for the defensive copy/freeze.

Copilot uses AI. Check for mistakes.
Comment on lines +218 to +226
except Exception as exc:
logger.warning(
MEMORY_RETRIEVAL_DEGRADED,
source="registry_augmentation",
agent_id=agent_id,
error=str(exc),
exc_info=True,
)
return tool_registry
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception path logs MEMORY_RETRIEVAL_DEGRADED for registry augmentation failures (e.g., duplicate tool names). That event name is also used for actual retrieval pipeline failures, so this can create misleading operational signals/alerts. Consider logging a tool-registry–specific event (or at least a TOOL_* event) here, and reserve MEMORY_RETRIEVAL_DEGRADED for backend/retrieval failures.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/core/test_enums.py`:
- Around line 117-118: The test test_action_type_has_26_members only asserts
len(ActionType) == 26 which can miss accidental member replacements; update the
test to assert the presence of specific critical members (e.g.,
ActionType.MEMORY_READ) and/or assert that the set of ActionType member names
equals the expected set of 26 names so the test fails if any required member is
renamed/removed/replaced; modify test_action_type_has_26_members to include
these direct membership checks alongside or instead of the count assertion.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0a62ecd3-e5de-4d4c-be96-219d68fd787f

📥 Commits

Reviewing files that changed from the base of the PR and between 760bfbd and 9aea7ae.

📒 Files selected for processing (20)
  • CLAUDE.md
  • docs/design/memory.md
  • docs/design/operations.md
  • src/synthorg/core/enums.py
  • src/synthorg/engine/agent_engine.py
  • src/synthorg/memory/tool_retriever.py
  • src/synthorg/memory/tools.py
  • src/synthorg/security/action_type_mapping.py
  • src/synthorg/security/action_types.py
  • src/synthorg/security/rules/risk_classifier.py
  • src/synthorg/security/timeout/risk_tier_classifier.py
  • src/synthorg/tools/permissions.py
  • tests/unit/core/test_enums.py
  • tests/unit/memory/test_memory_tools.py
  • tests/unit/security/rules/test_risk_classifier.py
  • tests/unit/security/test_action_type_mapping.py
  • tests/unit/security/test_action_types.py
  • tests/unit/security/timeout/test_risk_tier_classifier.py
  • tests/unit/tools/test_base_action_type.py
  • tests/unit/tools/test_permissions.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Agent
  • GitHub Check: Build Backend
  • GitHub Check: Test (Python 3.14)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: use except A, B: (no parentheses)—ruff enforces this on Python 3.14.
All public functions must have type hints; mypy strict mode is required.
Docstrings must use Google style and are required on all public classes and functions—enforced by ruff D rules.
Create new objects instead of mutating existing ones—use immutability as the default pattern.
For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement.
Use model_copy(update=...) for runtime state that evolves; separate mutable-via-copy models from frozen Pydantic models for config/identity. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Use @computed_field for derived values instead of storing and validating redundant fields (e.g. TokenUsage.total_tokens).
Use NotBlankStr from core.types for all identifier/name fields—including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task. Existing code is being migrated incrementally.
Function line length limit: 88 characters; enforced by ruff.
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly; never silently swallow exceptions.
Validate at system boundaries (user input, external APIs, config files).
Logger variable name must always be logger (not _logger, not log).
Always use structured logging with kwargs: `l...

Files:

  • tests/unit/security/rules/test_risk_classifier.py
  • src/synthorg/security/rules/risk_classifier.py
  • tests/unit/tools/test_base_action_type.py
  • tests/unit/security/timeout/test_risk_tier_classifier.py
  • tests/unit/security/test_action_types.py
  • tests/unit/security/test_action_type_mapping.py
  • tests/unit/tools/test_permissions.py
  • src/synthorg/tools/permissions.py
  • src/synthorg/security/action_types.py
  • src/synthorg/security/action_type_mapping.py
  • src/synthorg/security/timeout/risk_tier_classifier.py
  • src/synthorg/core/enums.py
  • tests/unit/core/test_enums.py
  • src/synthorg/engine/agent_engine.py
  • tests/unit/memory/test_memory_tools.py
  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow.
Prefer @pytest.mark.parametrize for testing similar cases in Python tests.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names in tests: test-provider, test-small-001, etc.
Use Hypothesis for property-based testing in Python with @given and @settings decorators. Hypothesis profiles: ci (50 examples, default) and dev (1000 examples). Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties.
Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Files:

  • tests/unit/security/rules/test_risk_classifier.py
  • tests/unit/tools/test_base_action_type.py
  • tests/unit/security/timeout/test_risk_tier_classifier.py
  • tests/unit/security/test_action_types.py
  • tests/unit/security/test_action_type_mapping.py
  • tests/unit/tools/test_permissions.py
  • tests/unit/core/test_enums.py
  • tests/unit/memory/test_memory_tools.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/security/rules/test_risk_classifier.py
  • tests/unit/tools/test_base_action_type.py
  • tests/unit/security/timeout/test_risk_tier_classifier.py
  • tests/unit/security/test_action_types.py
  • tests/unit/security/test_action_type_mapping.py
  • tests/unit/tools/test_permissions.py
  • tests/unit/core/test_enums.py
  • tests/unit/memory/test_memory_tools.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__).
Never use import logging / logging.getLogger() / print() in application code. Exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr) for handler construction and bootstrap.
Always use event name constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001. Vendor names may only appear in: (1) Operations design page, (2) .claude/ files, (3) third-party import paths, (4) provider presets. Tests must use test-provider, test-small-001, etc.

Files:

  • src/synthorg/security/rules/risk_classifier.py
  • src/synthorg/tools/permissions.py
  • src/synthorg/security/action_types.py
  • src/synthorg/security/action_type_mapping.py
  • src/synthorg/security/timeout/risk_tier_classifier.py
  • src/synthorg/core/enums.py
  • src/synthorg/engine/agent_engine.py
  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/security/rules/risk_classifier.py
  • src/synthorg/tools/permissions.py
  • src/synthorg/security/action_types.py
  • src/synthorg/security/action_type_mapping.py
  • src/synthorg/security/timeout/risk_tier_classifier.py
  • src/synthorg/core/enums.py
  • src/synthorg/engine/agent_engine.py
  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
src/synthorg/engine/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

RetryExhaustedError signals that all retries failed—the engine layer catches this to trigger fallback chains.

Files:

  • src/synthorg/engine/agent_engine.py
🧠 Learnings (24)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • src/synthorg/security/rules/risk_classifier.py
  • src/synthorg/security/timeout/risk_tier_classifier.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)

Applied to files:

  • CLAUDE.md
  • docs/design/memory.md
  • src/synthorg/engine/agent_engine.py
  • tests/unit/memory/test_memory_tools.py
  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • CLAUDE.md
  • src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • CLAUDE.md
  • src/synthorg/engine/agent_engine.py
  • docs/design/operations.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Applied to files:

  • CLAUDE.md
  • src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/core/**/*.py : Core module must contain shared domain models, base classes, resilience config (RetryConfig, RateLimiterConfig)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

  • src/synthorg/security/timeout/risk_tier_classifier.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory

Applied to files:

  • src/synthorg/engine/agent_engine.py
  • src/synthorg/memory/tools.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals

Applied to files:

  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module

Applied to files:

  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-31T16:09:24.320Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:09:24.320Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement

Applied to files:

  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-04-01T09:09:43.948Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:09:43.948Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement in non-Pydantic internal collections (registries, BaseTool)

Applied to files:

  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-04-03T13:34:37.835Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T13:34:37.835Z
Learning: Applies to **/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement.

Applied to files:

  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.

Applied to files:

  • src/synthorg/memory/tool_retriever.py
🔇 Additional comments (34)
src/synthorg/core/enums.py (1)

376-376: Good taxonomy extension for memory actions.

ToolCategory.MEMORY and ActionType.MEMORY_READ are aligned and consistent with the existing category:action convention and downstream wiring expectations.

Also applies to: 479-479

src/synthorg/security/action_types.py (1)

3-3: Category registry update looks correct.

Adding memory to ActionTypeCategory matches the built-in taxonomy expansion and is guarded by the existing category parity validation.

Also applies to: 35-35

src/synthorg/security/rules/risk_classifier.py (1)

43-43: Correct risk classification addition for memory reads.

This explicit LOW mapping removes unintended HIGH fallback behavior for memory:read in default classification paths.

tests/unit/security/rules/test_risk_classifier.py (1)

94-94: Nice coverage extension for the new action type.

Including ActionType.MEMORY_READ in the LOW-risk parametrization correctly validates the classifier update.

tests/unit/tools/test_base_action_type.py (1)

58-58: Default action-type mapping test is correctly extended.

The new MEMORY case ensures BaseTool resolves to memory:read through the shared default map.

tests/unit/security/timeout/test_risk_tier_classifier.py (1)

35-35: Good parity update for timeout risk-tier defaults.

This assertion closes the gap so memory:read is covered in both risk classification paths.

docs/design/operations.md (1)

498-499: Documentation update is consistent and complete.

The new Memory tool category row and memory:read taxonomy entry match the code-level action/category additions.

Also applies to: 651-665

tests/unit/tools/test_permissions.py (1)

47-77: Memory category coverage in the permission matrix looks correct.

The new SANDBOXED/RESTRICTED/STANDARD ToolCategory.MEMORY expectations are consistent and well-integrated into the existing parametrized test table.

src/synthorg/security/action_type_mapping.py (1)

14-31: Default action-type mapping for MEMORY is correctly wired.

Adding ToolCategory.MEMORY -> ActionType.MEMORY_READ keeps category-to-action resolution complete for memory tools.

tests/unit/security/test_action_type_mapping.py (1)

19-20: Mapping tests now correctly enforce exhaustiveness and MEMORY behavior.

The new assertions improve future-proof coverage and explicitly validate the ToolCategory.MEMORY mapping.

Also applies to: 50-66

tests/unit/security/test_action_types.py (1)

13-15: ActionTypeCategory test updates are consistent with MEMORY enum expansion.

Both the member-count and value-table updates are correct and keep enum-contract coverage complete.

Also applies to: 16-31

src/synthorg/tools/permissions.py (1)

54-91: Access-level category allowlists now consistently permit MEMORY tools.

The SANDBOXED/RESTRICTED/STANDARD updates are coherent and preserve existing resolution precedence (denied > allowed > level).

CLAUDE.md (1)

100-100: Memory package docs now reflect the tool-wrapper architecture clearly.

Including SearchMemoryTool and RecallMemoryTool in the package map improves alignment between docs and implementation.

src/synthorg/security/timeout/risk_tier_classifier.py (1)

13-46: Risk-tier defaults correctly include MEMORY_READ as LOW.

This keeps timeout-policy risk classification explicit and avoids unnecessary HIGH fallback for known memory-read actions.

src/synthorg/engine/agent_engine.py (1)

122-123: Memory tool injection is cleanly wired into the existing ToolInvoker pipeline.

The optional strategy path is well-contained, preserves existing behavior when unset, and correctly binds memory tools with agent-scoped context.

Also applies to: 207-210, 239-240, 306-307, 1202-1211

docs/design/memory.md (1)

717-726: LGTM!

The documentation accurately describes the tool-based memory retrieval integration: BaseTool wrappers delegating to ToolBasedInjectionStrategy, the registry_with_memory_tools() factory, and AgentEngine wiring. The content aligns with the implementation in src/synthorg/memory/tools.py.

tests/unit/memory/test_memory_tools.py (7)

27-85: LGTM!

The test fixtures and helper functions are well-structured. The _make_entry, _make_backend, and _make_strategy helpers provide clean abstractions for setting up test scenarios, and _DummyTool is minimal yet sufficient for registry augmentation tests.


90-220: LGTM!

Comprehensive test coverage for SearchMemoryTool: property assertions, schema deep-copy verification, delegation testing, input validation (empty query), empty results handling, category filtering, and backend error handling with proper assertions that internal error details are not leaked to users.


225-345: LGTM!

Test coverage for RecallMemoryTool properly mirrors SearchMemoryTool tests while adding recall-specific scenarios: memory not found, empty memory_id validation, and backend error handling with internal detail suppression.


350-399: LGTM!

Factory tests verify the create_memory_tools contract: returns two tools with expected names, all are BaseTool instances with ToolCategory.MEMORY, and importantly validates agent ID binding (per-agent isolation) by checking the backend call arguments.


404-534: LGTM!

Thorough registry augmentation tests covering: tool addition, existing tool preservation, strategy type checking (non-ToolBasedInjectionStrategy and None), round-trip execution through the registry, to_definitions() output, and graceful degradation on duplicate tool names. The duplicate-name test at lines 517-533 validates the fallback behavior documented in the implementation.


539-572: LGTM!

Critical test coverage for _is_error_response(). The test at lines 559-561 correctly ensures "No memories found." (successful empty result) is NOT classified as an error, which is essential for correct is_error flag behavior. The partial match test at lines 569-571 validates strict prefix matching ("Error:" vs "Error occurred").


577-608: LGTM!

Generic exception path tests validate that unexpected errors (RuntimeError, ConnectionError) are handled gracefully with user-friendly messages and importantly do not leak internal exception details ("internal boom", "refused").

src/synthorg/memory/tools.py (5)

40-52: LGTM!

The _ERROR_PREFIXES tuple consolidates all error message prefixes from tool_retriever.py as a single source of truth, and _is_error_response() correctly uses startswith() for prefix matching. This approach ensures consistent error detection between the wrapper tools and the underlying strategy.


55-107: LGTM!

SearchMemoryTool correctly implements the thin wrapper pattern: copy.deepcopy(dict(SEARCH_MEMORY_SCHEMA)) properly handles the MappingProxyType, and execute() safely wraps the string result from handle_tool_call() (which always returns str per the strategy's contract) in ToolExecutionResult with appropriate error detection.


109-158: LGTM!

RecallMemoryTool follows the same thin wrapper pattern as SearchMemoryTool, maintaining consistency in the codebase. The implementation correctly delegates to handle_tool_call() with the recall tool name.


160-180: LGTM!

The create_memory_tools() factory is clean and straightforward: returns both memory tools bound to the specified agent ID and sharing the strategy instance. The return type tuple[BaseTool, ...] is appropriately abstract.


211-233: LGTM!

The registry_with_memory_tools() function follows the registry_with_approval_tool pattern with appropriate graceful degradation: ValueError from duplicate tool names (per ToolRegistry.__init__) is caught and logged with full context (exc_info=True), returning the original registry. The success path logs at DEBUG level with structured metadata.

src/synthorg/memory/tool_retriever.py (6)

32-41: LGTM!

The new public constants establish a single source of truth for tool names and error message prefixes. These are correctly exported and consumed by src/synthorg/memory/tools.py for error detection in _is_error_response().


49-88: LGTM!

The schema constants now use Final[MappingProxyType[str, Any]] for read-only enforcement at runtime, following the coding guideline for non-Pydantic internal collections. Consumers correctly use copy.deepcopy(dict(...)) to obtain mutable copies when needed.


231-245: LGTM!

get_tool_definitions() correctly uses the new tool name constants and creates deep copies of the immutable schemas via copy.deepcopy(dict(SCHEMA)). Each invocation returns fresh schema copies, preventing accidental mutation of the source.


266-271: LGTM!

The handle_tool_call() dispatch now uses the public tool name constants, ensuring consistency with the exported API.


273-331: LGTM!

_handle_search() now uses the error message constants (SEARCH_UNAVAILABLE, SEARCH_UNEXPECTED) and tool name constant in logging. This ensures the error prefixes match exactly what _is_error_response() checks in tools.py.


333-372: LGTM!

_handle_recall() improvements:

  • Line 340: Tightened validation rejects whitespace-only memory_id (not just falsy values).
  • Error message constants ensure prefix matching works with _is_error_response().
  • Line 370's "Memory not found: {id}" format correctly starts with RECALL_NOT_FOUND_PREFIX.

…nd CodeRabbit

- Add MemoryError/RecursionError re-raise guard in registry_with_memory_tools
- Fix stale ~25 -> ~26 action type count in operations.md
- Add MEMORY_RETRIEVAL_START/COMPLETE INFO logs to _handle_recall
- Add WARNING log before ValueError raise in handle_tool_call
- Add whitespace-only tests for query and memory_id
- Add engine wiring tests for memory_injection_strategy
- Extract _build_augmented_registry to keep function under 50 lines
- Remove redundant deepcopy (BaseTool.__init__ already deep-copies)
- Use TOOL_MEMORY_AUGMENTATION_FAILED event instead of MEMORY_RETRIEVAL_DEGRADED
- Add maxLength:256 to RECALL_MEMORY_SCHEMA + length validation
- Add comments explaining dict(MappingProxyType) pattern
- Add DEBUG log to create_memory_tools factory
- Truncate memory_id in not-found response to 64 chars
- Add MEMORY_READ value assertion alongside count check in test
- Document MEMORY at SANDBOXED as read-only in permissions.py
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/tools/permissions.py (1)

57-85: ⚠️ Potential issue | 🟠 Major

Encode the read-only MEMORY guarantee in code, not just a comment.

ToolPermissionChecker allows categories wholesale, so this change now permits every ToolCategory.MEMORY tool for SANDBOXED/RESTRICTED/STANDARD agents. The problem is that the guarantee here is only documentary: src/synthorg/memory/protocol.py still exposes mutating operations like store() and delete(). If a write-capable memory tool lands under the same category later, it will inherit these low-access allowances automatically. Please split read/write memory categories or otherwise enforce read-only memory actions in the permission model itself.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/permissions.py` around lines 57 - 85, The permission model
currently grants ToolCategory.MEMORY wholesale to
ToolAccessLevel.SANDBOXED/RESTRICTED/STANDARD via the mapping in the permissions
dict, but the read-only guarantee is only a comment; update the model by either
(A) introducing separate categories or flags (e.g., ToolCategory.MEMORY_READ vs
ToolCategory.MEMORY_WRITE) and replace references in the mapping so low-access
levels only get MEMORY_READ, or (B) enhance ToolPermissionChecker to
special-case ToolCategory.MEMORY and deny any mutating memory operations by
checking for memory protocol methods (e.g., store(), delete()) and refusing them
unless the access level is elevated; make changes around the permissions mapping
and in ToolPermissionChecker to enforce read-only memory at runtime and adjust
any uses of ToolCategory.MEMORY in code (and memory/protocol.py) to reflect the
new read/write distinction.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/memory/tools.py`:
- Around line 240-256: The current except block around _build_augmented_registry
silently treats all exceptions as operational and returns the original
tool_registry; instead detect and re-raise configuration errors coming from
ToolRegistry (e.g., ValueError for reserved-name collisions like "search_memory"
or "recall_memory") so they fail hard. Update the except handling in the
registry augmentation code to let ValueError propagate (or explicitly catch
Exception as exc and if isinstance(exc, ValueError): raise) while continuing to
log-and-return only for unexpected runtime errors; reference
_build_augmented_registry, ToolRegistry, and ToolBasedInjectionStrategy to
locate the relevant logic.

---

Outside diff comments:
In `@src/synthorg/tools/permissions.py`:
- Around line 57-85: The permission model currently grants ToolCategory.MEMORY
wholesale to ToolAccessLevel.SANDBOXED/RESTRICTED/STANDARD via the mapping in
the permissions dict, but the read-only guarantee is only a comment; update the
model by either (A) introducing separate categories or flags (e.g.,
ToolCategory.MEMORY_READ vs ToolCategory.MEMORY_WRITE) and replace references in
the mapping so low-access levels only get MEMORY_READ, or (B) enhance
ToolPermissionChecker to special-case ToolCategory.MEMORY and deny any mutating
memory operations by checking for memory protocol methods (e.g., store(),
delete()) and refusing them unless the access level is elevated; make changes
around the permissions mapping and in ToolPermissionChecker to enforce read-only
memory at runtime and adjust any uses of ToolCategory.MEMORY in code (and
memory/protocol.py) to reflect the new read/write distinction.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4a8597c9-8bf1-406a-a69a-95d1b47bb5a6

📥 Commits

Reviewing files that changed from the base of the PR and between 9aea7ae and 87a69c7.

📒 Files selected for processing (8)
  • docs/design/operations.md
  • src/synthorg/memory/tool_retriever.py
  • src/synthorg/memory/tools.py
  • src/synthorg/observability/events/tool.py
  • src/synthorg/tools/permissions.py
  • tests/unit/core/test_enums.py
  • tests/unit/engine/test_agent_engine.py
  • tests/unit/memory/test_memory_tools.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: use except A, B: (no parentheses)—ruff enforces this on Python 3.14.
All public functions must have type hints; mypy strict mode is required.
Docstrings must use Google style and are required on all public classes and functions—enforced by ruff D rules.
Create new objects instead of mutating existing ones—use immutability as the default pattern.
For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement.
Use model_copy(update=...) for runtime state that evolves; separate mutable-via-copy models from frozen Pydantic models for config/identity. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Use @computed_field for derived values instead of storing and validating redundant fields (e.g. TokenUsage.total_tokens).
Use NotBlankStr from core.types for all identifier/name fields—including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task. Existing code is being migrated incrementally.
Function line length limit: 88 characters; enforced by ruff.
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly; never silently swallow exceptions.
Validate at system boundaries (user input, external APIs, config files).
Logger variable name must always be logger (not _logger, not log).
Always use structured logging with kwargs: `l...

Files:

  • src/synthorg/tools/permissions.py
  • src/synthorg/observability/events/tool.py
  • tests/unit/engine/test_agent_engine.py
  • tests/unit/core/test_enums.py
  • tests/unit/memory/test_memory_tools.py
  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__).
Never use import logging / logging.getLogger() / print() in application code. Exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr) for handler construction and bootstrap.
Always use event name constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001. Vendor names may only appear in: (1) Operations design page, (2) .claude/ files, (3) third-party import paths, (4) provider presets. Tests must use test-provider, test-small-001, etc.

Files:

  • src/synthorg/tools/permissions.py
  • src/synthorg/observability/events/tool.py
  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/tools/permissions.py
  • src/synthorg/observability/events/tool.py
  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow.
Prefer @pytest.mark.parametrize for testing similar cases in Python tests.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names in tests: test-provider, test-small-001, etc.
Use Hypothesis for property-based testing in Python with @given and @settings decorators. Hypothesis profiles: ci (50 examples, default) and dev (1000 examples). Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties.
Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Files:

  • tests/unit/engine/test_agent_engine.py
  • tests/unit/core/test_enums.py
  • tests/unit/memory/test_memory_tools.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/engine/test_agent_engine.py
  • tests/unit/core/test_enums.py
  • tests/unit/memory/test_memory_tools.py
🧠 Learnings (20)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module

Applied to files:

  • src/synthorg/observability/events/tool.py
  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

  • src/synthorg/observability/events/tool.py
  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-04-03T13:34:37.835Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T13:34:37.835Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals

Applied to files:

  • src/synthorg/observability/events/tool.py
  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-31T16:09:24.320Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:09:24.320Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • src/synthorg/observability/events/tool.py
  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

  • src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)

Applied to files:

  • tests/unit/memory/test_memory_tools.py
  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory

Applied to files:

  • src/synthorg/memory/tools.py
📚 Learning: 2026-04-03T13:34:37.835Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T13:34:37.835Z
Learning: Applies to **/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement.

Applied to files:

  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-04-01T09:09:43.948Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:09:43.948Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement in non-Pydantic internal collections (registries, BaseTool)

Applied to files:

  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement

Applied to files:

  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-04-01T09:58:27.410Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:58:27.410Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization) for dict/list fields in frozen Pydantic models

Applied to files:

  • src/synthorg/memory/tools.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization) for `dict`/`list` fields

Applied to files:

  • src/synthorg/memory/tools.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.

Applied to files:

  • src/synthorg/memory/tools.py
  • src/synthorg/memory/tool_retriever.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • docs/design/operations.md

Configuration errors like duplicate tool names are programming bugs,
not transient runtime failures. Re-raise ValueError explicitly so
they fail hard instead of being silently caught by the broad except.
@Aureliolo
Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 3, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Aureliolo Aureliolo merged commit 329270e into main Apr 3, 2026
33 checks passed
@Aureliolo Aureliolo deleted the feat/tool-based-memory-retrieval branch April 3, 2026 15:53
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview April 3, 2026 15:53 — with GitHub Actions Inactive
Aureliolo added a commit that referenced this pull request Apr 3, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.6.0](v0.5.9...v0.6.0)
(2026-04-03)


### Features

* dashboard UI for ceremony policy settings
([#1038](#1038))
([865554c](865554c)),
closes [#979](#979)
* implement tool-based memory retrieval injection strategy
([#1039](#1039))
([329270e](329270e)),
closes [#207](#207)
* local model management for Ollama and LM Studio
([#1037](#1037))
([e1b14d3](e1b14d3)),
closes [#1030](#1030)
* workflow execution -- instantiate tasks from WorkflowDefinition
([#1040](#1040))
([e9235e3](e9235e3)),
closes [#1004](#1004)


### Maintenance

* shared Hypothesis failure DB + deterministic CI profile
([#1041](#1041))
([901ae92](901ae92))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: implement tool-based memory retrieval injection strategy

2 participants