feat: add pluggable PersistenceBackend protocol with SQLite implementation (#179)
…eBackend protocol (#36): Adds DESIGN_SPEC §7.5 defining the PersistenceBackend protocol for operational data (tasks, costs, messages, audit logs), separate from agent memory (§7.1–7.4). SQLite ships as the first backend; PostgreSQL and MariaDB are planned as future backends, swappable via config. Updates the tech stack entry and project structure to include the persistence/ module.
…ation (#36): Implement §7.5 Operational Data Persistence:

- PersistenceBackend protocol with connect/disconnect/health_check/migrate
- Repository protocols: TaskRepository, CostRecordRepository, MessageRepository
- SQLite backend via aiosqlite with WAL mode, schema migrations, TOCTOU-safe writes
- Frozen Pydantic config models (PersistenceConfig, SQLiteConfig)
- Error hierarchy: PersistenceError → ConnectionError, MigrationError, etc.
- Factory pattern for backend creation
- Structured logging with 30+ event constants
- 96% test coverage (unit + integration)

Pre-reviewed by 10 agents, 42 findings addressed.
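The lifecycle surface listed above can be sketched as a runtime-checkable protocol. This is a minimal illustration based only on the bullet list; the exact signatures and the fake backend below are assumptions, not code from the PR.

```python
# Illustrative sketch of the PersistenceBackend lifecycle protocol described
# in the PR summary. Signatures and FakeBackend are assumptions for demo only.
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class PersistenceBackend(Protocol):
    async def connect(self) -> None: ...
    async def disconnect(self) -> None: ...
    async def health_check(self) -> bool: ...
    async def migrate(self) -> None: ...


class FakeBackend:
    """In-memory stand-in that satisfies the protocol structurally."""

    async def connect(self) -> None:
        pass

    async def disconnect(self) -> None:
        pass

    async def health_check(self) -> bool:
        return True

    async def migrate(self) -> None:
        pass


# runtime_checkable allows structural isinstance() checks against the protocol.
conforms = isinstance(FakeBackend(), PersistenceBackend)
healthy = asyncio.run(FakeBackend().health_check())
```

A swappable PostgreSQL or MariaDB backend would only need to provide the same four coroutine methods to pass the same `isinstance` check.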
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
⚠️ Review failed: the pull request is closed.
📝 Walkthrough (Summary by CodeRabbit)

Adds a pluggable Operational Data Persistence layer: runtime-checkable PersistenceBackend protocol, Task/CostRecord/Message repository protocols, Pydantic PersistenceConfig (with SQLite subconfig), SQLite backend (aiosqlite) with migrations, observability event constants, package exports, config integration, and comprehensive unit + integration tests. RootConfig and defaults updated to include persistence.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Application
    participant Factory as PersistenceFactory
    participant Backend as SQLitePersistenceBackend
    participant Migrate as Migrations
    participant DB as SQLiteDB
    participant Repos as Repositories
    App->>Factory: create_backend(config)
    Factory->>Backend: instantiate (disconnected)
    Factory-->>App: backend instance
    App->>Backend: connect()
    Backend->>DB: aiosqlite.connect(path)
    DB-->>Backend: connection
    Backend->>DB: PRAGMA journal_mode=WAL
    Backend->>Backend: init repositories
    Backend-->>App: connected
    App->>Backend: migrate()
    Backend->>Migrate: run_migrations(db)
    Migrate->>DB: PRAGMA user_version / execute DDL
    DB-->>Migrate: migrations applied
    Migrate-->>Backend: migrations completed
    Backend-->>App: migrated
    App->>Repos: tasks.save(task)
    Repos->>DB: INSERT/UPDATE (JSON fields)
    DB-->>Repos: result
    Repos-->>App: saved
    App->>Repos: messages.get_history(channel)
    Repos->>DB: SELECT ... ORDER BY timestamp DESC LIMIT ?
    DB-->>Repos: rows
    Repos-->>App: messages
    App->>Backend: disconnect()
    Backend->>DB: close()
    DB-->>Backend: closed
    Backend-->>App: disconnected
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~70 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request establishes a foundational persistence layer for the project's operational data, such as tasks, cost records, and messages, by introducing a pluggable PersistenceBackend protocol with an initial SQLite implementation.
Code Review
This pull request introduces a well-designed and comprehensive persistence layer with a pluggable protocol and an initial SQLite backend. While the implementation demonstrates high code quality and good separation of concerns, a critical security vulnerability was identified: the SQLiteConfig model allows for arbitrary database paths, potentially leading to path traversal and unauthorized file access in multi-tenant deployments. Further improvements are suggested regarding the atomicity of database migrations, a minor simplification in data mapping, and strengthening an integration test assertion.
src/ai_company/persistence/config.py
Outdated
```python
def _resolve_path(self) -> Self:
    """Resolve relative paths to absolute to prevent traversal ambiguity.

    The special ``:memory:`` identifier is passed through unchanged.
    """
    if self.path == ":memory:":
        return self
    resolved = str(PurePosixPath(self.path))
    object.__setattr__(self, "path", resolved)
    return self
```
The SQLiteConfig model is vulnerable to path traversal because it allows arbitrary filesystem paths to be specified for the database file without sufficient validation. The _resolve_path validator uses PurePosixPath, which does not prevent the use of absolute paths or parent directory references (..). In a multi-tenant environment where configuration might be provided by tenants, an attacker could exploit this to access or modify other tenants' databases or sensitive system files.
To remediate this, implement strict validation to ensure the path remains within a designated base directory. You should resolve the path relative to a trusted root and verify that the resulting absolute path is still contained within that root.
Suggested change:

```diff
 def _resolve_path(self) -> Self:
-    """Resolve relative paths to absolute to prevent traversal ambiguity.
+    """Validate and normalize the database path.

     The special ``:memory:`` identifier is passed through unchanged.
     """
     if self.path == ":memory:":
         return self
-    resolved = str(PurePosixPath(self.path))
-    object.__setattr__(self, "path", resolved)
+    path_obj = PurePosixPath(self.path)
+    if path_obj.is_absolute() or ".." in path_obj.parts:
+        raise ValueError(
+            "Database path must be a relative path without traversal components"
+        )
+    object.__setattr__(self, "path", str(path_obj))
     return self
```
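The remediation described above (resolve against a trusted root, then verify containment) could look roughly like this; `BASE_DIR` and the helper name are illustrative assumptions, not part of the PR:

```python
# Hedged sketch of base-directory containment for the database path.
# BASE_DIR and validate_db_path are hypothetical names for illustration.
from pathlib import Path

BASE_DIR = Path("/var/lib/ai_company").resolve()


def validate_db_path(raw: str) -> str:
    """Return an absolute path guaranteed to live under BASE_DIR."""
    if raw == ":memory:":
        return raw
    # Resolving after joining collapses ".." and follows symlinks, so the
    # containment check below cannot be bypassed by traversal components.
    candidate = (BASE_DIR / raw).resolve()
    if not candidate.is_relative_to(BASE_DIR):  # Python 3.9+
        raise ValueError(f"Database path escapes base directory: {raw}")
    return str(candidate)
```

With this check, `validate_db_path("tenant1/app.db")` is accepted while `validate_db_path("../etc/passwd")` raises, even though both are syntactically relative paths.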
```python
try:
    for target_version, migrate_fn in _MIGRATIONS:
        if current < target_version:
            await migrate_fn(db)
            current = target_version

    await set_user_version(db, SCHEMA_VERSION)
    await db.commit()
except (sqlite3.Error, aiosqlite.Error) as exc:
    await db.rollback()
    msg = f"Migration to version {SCHEMA_VERSION} failed"
    logger.exception(
        PERSISTENCE_MIGRATION_FAILED,
        target_version=SCHEMA_VERSION,
        error=str(exc),
    )
    raise MigrationError(msg) from exc
```
The current implementation of run_migrations does not guarantee atomicity. In aiosqlite's default autocommit mode, each DDL statement in _apply_v1 runs in its own transaction and is committed immediately. If a later statement fails, the earlier ones will not be rolled back. To ensure all migrations for a version are applied atomically, open an explicit transaction before running them.
Suggested change (note that aiosqlite connections have no `transaction()` context manager, so an explicit `BEGIN` is issued instead):

```diff
 try:
+    await db.execute("BEGIN")
     for target_version, migrate_fn in _MIGRATIONS:
         if current < target_version:
             await migrate_fn(db)
             current = target_version
     await set_user_version(db, SCHEMA_VERSION)
     await db.commit()
 except (sqlite3.Error, aiosqlite.Error) as exc:
     await db.rollback()
     msg = f"Migration to version {SCHEMA_VERSION} failed"
     logger.exception(
         PERSISTENCE_MIGRATION_FAILED,
         target_version=SCHEMA_VERSION,
         error=str(exc),
     )
     raise MigrationError(msg) from exc
```
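The autocommit pitfall can be reproduced with the stdlib `sqlite3` module alone (table names are illustrative): under the default isolation level, DDL commits immediately and survives a rollback, while DDL inside an explicit transaction does not.

```python
# Demonstrates the atomicity concern from the review using stdlib sqlite3.
import sqlite3

# Default isolation level: DDL runs in autocommit, so rollback cannot undo it.
con = sqlite3.connect(":memory:")
try:
    con.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY)")
    con.execute("CREATE TABLE nope (")  # malformed SQL aborts the "migration"
except sqlite3.Error:
    con.rollback()  # too late: the first CREATE TABLE was already committed
autocommit_tables = {
    r[0] for r in con.execute("SELECT name FROM sqlite_master WHERE type='table'")
}

# Manual transaction control: the same failure rolls everything back.
con2 = sqlite3.connect(":memory:", isolation_level=None)
try:
    con2.execute("BEGIN")
    con2.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY)")
    con2.execute("CREATE TABLE nope (")
    con2.commit()
except sqlite3.Error:
    con2.rollback()  # undoes the CREATE TABLE as well
explicit_tables = {
    r[0] for r in con2.execute("SELECT name FROM sqlite_master WHERE type='table'")
}
```

In the first connection `tasks` survives the rollback; in the second it does not, which is exactly the partial-migration state the reviewer is warning about.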
```python
class MessageRepository(Protocol):
    """CRUD + query interface for Message persistence."""

    async def save(self, message: Message) -> None: ...
    async def get_history(
        self, channel: str, *, limit: int | None = None
    ) -> tuple[Message, ...]: ...
```
The MessageRepository protocol definition for get_history seems less capable than what is described in the "Entities Persisted" table (line 1440). The protocol only allows filtering by channel and limit, while the table also mentions filtering by sender and time range. It would be good to align the protocol with the documented query capabilities for consistency, or update the table to reflect the current protocol's scope.
```python
# Map DB column "sender" to Message's "from" alias.
data["from"] = data.pop("sender")
```
The manual mapping from the sender database column to the from field alias is not necessary. Since Message.model_config has populate_by_name=True, Pydantic can automatically handle populating the sender field from a dictionary with a sender key. You can simplify this method by removing the manual re-mapping.
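A quick self-contained check of that Pydantic behavior (the model here is trimmed to the relevant field and is not the real `Message` class):

```python
# Minimal reproduction: with populate_by_name=True, the field name "sender"
# works directly, so no manual "from" remapping is needed when loading rows.
from pydantic import BaseModel, ConfigDict, Field


class Message(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    sender: str = Field(alias="from")


by_alias = Message.model_validate({"from": "agent-a"})
by_name = Message.model_validate({"sender": "agent-a"})
```

Both inputs populate the same `sender` field, so the repository can pass the row dict straight to `model_validate`.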
```python
# WAL file may or may not exist depending on checkpoint behavior,
# but the db file should exist
assert Path(db_path).exists()  # noqa: ASYNC240
await backend.disconnect()
```
The assertion Path(db_path).exists() is a bit weak for a test named test_wal_file_created. It only confirms the database file was created, not that WAL mode was successfully enabled. As the comment notes, the -wal file's existence can be intermittent. A more robust way to verify this would be to connect to the database and query the journal_mode PRAGMA to ensure it's set to 'wal'.
Suggested change:

```diff
-# WAL file may or may not exist depending on checkpoint behavior,
-# but the db file should exist
-assert Path(db_path).exists()  # noqa: ASYNC240
 await backend.disconnect()
+# Reconnect and check journal_mode to confirm WAL was enabled.
+async with aiosqlite.connect(db_path) as db:
+    async with db.execute("PRAGMA journal_mode") as cursor:
+        row = await cursor.fetchone()
+        assert row is not None
+        assert row[0] == "wal"
```
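For reference, WAL mode is persistent in the database file itself, which is why the reconnect-and-check approach works; a stdlib-only sketch (paths illustrative):

```python
# Demonstrates that journal_mode=WAL survives across connections, so a fresh
# connection can verify it — a stronger check than mere file existence.
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

con = sqlite3.connect(db_path)
con.execute("PRAGMA journal_mode=WAL")  # enable WAL; persisted in the db file
con.close()

con2 = sqlite3.connect(db_path)
mode = con2.execute("PRAGMA journal_mode").fetchone()[0]
con2.close()
```

`mode` is `"wal"` on the second, fresh connection even though WAL was set on the first one.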
Greptile Summary

This PR implements DESIGN_SPEC §7.5 Operational Data Persistence, introducing a pluggable PersistenceBackend protocol with an initial SQLite backend.

Issues found:
Confidence Score: 4/5
Pull request overview
Implements DESIGN_SPEC §7.5 by introducing a pluggable operational persistence layer (PersistenceBackend + repository protocols) with an initial SQLite (aiosqlite) backend, integrating it into config and observability, and adding comprehensive unit/integration tests.
Changes:
- Added `PersistenceBackend` + repository protocols, error hierarchy, config models, and a backend factory.
- Implemented SQLite backend with migrations (`user_version`), repositories for tasks/cost-records/messages, and structured logging events.
- Integrated persistence config into `RootConfig`, updated docs (DESIGN_SPEC/CLAUDE), and added extensive tests.
Reviewed changes
Copilot reviewed 30 out of 34 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Adds aiosqlite dependency to the lockfile. |
| pyproject.toml | Adds aiosqlite dependency and mypy override for aiosqlite.*. |
| src/ai_company/persistence/__init__.py | Re-exports persistence public API (protocols/config/errors/factory). |
| src/ai_company/persistence/config.py | Adds frozen Pydantic config models for persistence + SQLite settings. |
| src/ai_company/persistence/errors.py | Adds persistence domain error hierarchy. |
| src/ai_company/persistence/factory.py | Adds create_backend() factory for backend selection. |
| src/ai_company/persistence/protocol.py | Defines PersistenceBackend runtime-checkable protocol. |
| src/ai_company/persistence/repositories.py | Defines repository protocols (Task/CostRecord/Message). |
| src/ai_company/persistence/sqlite/__init__.py | Exports SQLite backend components. |
| src/ai_company/persistence/sqlite/backend.py | Implements SQLite backend lifecycle + repo access + WAL setup. |
| src/ai_company/persistence/sqlite/migrations.py | Implements SQLite migrations using PRAGMA user_version. |
| src/ai_company/persistence/sqlite/repositories.py | Implements SQLite repositories for Task/CostRecord/Message. |
| src/ai_company/observability/events/persistence.py | Adds structured logging event constants for persistence domain. |
| src/ai_company/config/schema.py | Integrates PersistenceConfig into RootConfig. |
| src/ai_company/config/defaults.py | Adds persistence defaults entry. |
| tests/unit/observability/test_events.py | Extends event discovery/constant assertions to persistence domain. |
| tests/unit/config/test_schema.py | Adds assertions for persistence defaults/customization on RootConfig. |
| tests/unit/config/conftest.py | Updates RootConfigFactory to include PersistenceConfig. |
| tests/unit/persistence/__init__.py | Adds unit test package marker for persistence tests. |
| tests/unit/persistence/conftest.py | Adds shared helpers (task/message builders) for persistence tests. |
| tests/unit/persistence/test_config.py | Adds unit tests for persistence config validation/frozenness. |
| tests/unit/persistence/test_errors.py | Adds unit tests for persistence error hierarchy. |
| tests/unit/persistence/test_factory.py | Adds unit tests for backend factory behavior + isolation. |
| tests/unit/persistence/test_protocol.py | Adds runtime protocol compliance tests via fake implementations. |
| tests/unit/persistence/sqlite/__init__.py | Adds sqlite unit test subpackage marker. |
| tests/unit/persistence/sqlite/conftest.py | Adds in-memory SQLite fixtures + migration fixture. |
| tests/unit/persistence/sqlite/test_backend.py | Adds unit tests for SQLite backend lifecycle/health/migrate behavior. |
| tests/unit/persistence/sqlite/test_migrations.py | Adds unit tests for migration/versioning behavior and failures. |
| tests/unit/persistence/sqlite/test_repositories.py | Adds unit tests for SQLite repositories CRUD/query/round-trips. |
| tests/integration/persistence/__init__.py | Adds integration test package marker for persistence. |
| tests/integration/persistence/conftest.py | Adds on-disk backend fixtures for integration coverage. |
| tests/integration/persistence/test_sqlite_integration.py | Adds on-disk SQLite lifecycle/concurrency/persistence integration tests. |
| DESIGN_SPEC.md | Documents §7.5 persistence architecture/config/strategy + structure updates. |
| CLAUDE.md | Updates repo structure docs to include new persistence package. |
```python
await backend.connect()
assert backend.is_connected is True

await backend.disconnect()  # type: ignore[unreachable]
```
The # type: ignore[unreachable] on await backend.disconnect() looks spurious (disconnect is reachable here) and can mask real control-flow issues. Please remove it, or replace it with a narrower ignore that matches the actual type-checker error being suppressed (with a brief comment explaining why it’s needed).
Suggested change:

```diff
-await backend.disconnect()  # type: ignore[unreachable]
+await backend.disconnect()
```
DESIGN_SPEC.md
Outdated
- Migrations run programmatically at startup via `PersistenceBackend.migrate()`
- Initial migration creates all tables
- Versioned migration scripts tracked in `persistence/migrations/`
DESIGN_SPEC §7.5 says versioned migration scripts are tracked in persistence/migrations/, but this PR’s implementation places migrations under ai_company/persistence/sqlite/migrations.py (and there is no persistence/migrations/ package). Please update this bullet to match the actual structure (e.g., per-backend migrations modules/directories) to avoid sending implementers to a non-existent path.
Suggested change:

```diff
-- Versioned migration scripts tracked in `persistence/migrations/`
+- Versioned migration logic is implemented per-backend (e.g. `ai_company/persistence/sqlite/migrations.py` for SQLite), rather than in a shared `persistence/migrations/` package
```
DESIGN_SPEC.md
Outdated
| `Task` | `core/task.py` | `TaskRepository` | by status, by assignee, by project |
| `CostRecord` | `budget/cost_record.py` | `CostRecordRepository` | by agent, by task, aggregations |
| `Message` | `communication/message.py` | `MessageRepository` | by channel, by sender, time range |
| Audit entries | `security/` | `AuditRepository` | by agent, by action type, time range |
The “Entities Persisted” table lists an AuditRepository and audit entries as persisted, but there is no AuditRepository protocol or backend implementation in this PR. Either add the audit repository/backend support, or mark audit persistence as planned/future work in the table so the spec matches what’s actually implemented.
Suggested change:

```diff
-| Audit entries | `security/` | `AuditRepository` | by agent, by action type, time range |
+| Audit entries (planned) | `security/` (planned) | `AuditRepository` (planned) | planned: by agent, by action type, time range |
```
src/ai_company/persistence/config.py
Outdated
```python
def _resolve_path(self) -> Self:
    """Resolve relative paths to absolute to prevent traversal ambiguity.

    The special ``:memory:`` identifier is passed through unchanged.
    """
    if self.path == ":memory:":
        return self
    resolved = str(PurePosixPath(self.path))
    object.__setattr__(self, "path", resolved)
    return self
```
SQLiteConfig._resolve_path docstring claims it resolves relative paths to absolute and prevents traversal ambiguity, but the implementation only normalizes the string via PurePosixPath and does not make paths absolute or reject '..' components. Either adjust the validator to actually enforce/resolve what the docstring describes, or update the docstring (and method name) to match the current behavior so callers aren’t misled about security/semantics.
```python
err_msg = f"Message {msg_id} already exists"
logger.warning(PERSISTENCE_MESSAGE_DUPLICATE, message_id=msg_id)
raise DuplicateRecordError(err_msg) from exc
```
SQLiteMessageRepository.save treats any sqlite3.IntegrityError as a duplicate-id condition and raises DuplicateRecordError. IntegrityError can also be triggered by other constraint failures (e.g., NOT NULL), which would be misreported as a duplicate. Consider checking the error message / extended error code to confirm it’s specifically the primary-key/unique violation on messages.id, and otherwise raise QueryError (or wrap the original error appropriately).
Suggested change:

```diff
-err_msg = f"Message {msg_id} already exists"
-logger.warning(PERSISTENCE_MESSAGE_DUPLICATE, message_id=msg_id)
-raise DuplicateRecordError(err_msg) from exc
+error_text = str(exc)
+is_duplicate_id = (
+    "UNIQUE constraint failed: messages.id" in error_text
+    or "PRIMARY KEY constraint failed: messages.id" in error_text
+)
+if is_duplicate_id:
+    err_msg = f"Message {msg_id} already exists"
+    logger.warning(PERSISTENCE_MESSAGE_DUPLICATE, message_id=msg_id)
+    raise DuplicateRecordError(err_msg) from exc
+# Other integrity errors (e.g., NOT NULL, different UNIQUE) are treated
+# as generic query errors rather than duplicate-ID conditions.
+msg = f"Failed to save message {msg_id!r}"
+logger.exception(
+    PERSISTENCE_MESSAGE_SAVE_FAILED,
+    message_id=msg_id,
+    error=str(exc),
+)
+raise QueryError(msg) from exc
```
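The distinction this comment relies on can be verified with stdlib `sqlite3` (schema illustrative): the exception text differs per constraint type, which is what the string checks discriminate on.

```python
# Shows that sqlite3.IntegrityError carries constraint-specific messages.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE messages (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
con.execute("INSERT INTO messages VALUES ('m1', 'hello')")

try:
    con.execute("INSERT INTO messages VALUES ('m1', 'again')")  # duplicate id
except sqlite3.IntegrityError as exc:
    duplicate_msg = str(exc)

try:
    con.execute("INSERT INTO messages (id) VALUES ('m2')")  # body missing
except sqlite3.IntegrityError as exc:
    not_null_msg = str(exc)
```

The first failure reports a UNIQUE constraint on `messages.id` while the second reports a NOT NULL constraint on `messages.body`, so a blanket `except IntegrityError` genuinely conflates two different conditions. On Python 3.11+, `exc.sqlite_errorname` offers a string-parsing-free alternative.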
Actionable comments posted: 10
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/persistence/config.py`:
- Around line 47-57: The _resolve_path model_validator currently uses
PurePosixPath which only normalizes paths; change it to produce absolute paths
by replacing PurePosixPath usage with pathlib.Path and call
Path(self.path).resolve(strict=False) (preserving the ":memory:" special-case),
then set the resolved string back on self via object.__setattr__(self, "path",
str(resolved_path)); alternatively, if absolute resolution is not desired,
update the _resolve_path docstring to say it normalizes rather than resolves so
the behavior and docs match.
In `@src/ai_company/persistence/factory.py`:
- Around line 19-27: The function create_backend is annotated with
PersistenceConfig and PersistenceBackend but those names are only imported
inside the if TYPE_CHECKING: block, which causes runtime annotation resolution
(e.g. inspect.get_annotations()/typing.get_type_hints()) to raise NameError
under Python 3.14; fix this by importing PersistenceConfig and
PersistenceBackend at runtime (move or duplicate the imports out of the if
TYPE_CHECKING: block or add a safe runtime import fallback) so the symbols exist
when annotations are evaluated, keeping references to create_backend,
PersistenceConfig, PersistenceBackend and TYPE_CHECKING to locate the change.
In `@src/ai_company/persistence/repositories.py`:
- Around line 132-166: Add a new runtime-checkable Protocol named
AuditRepository next to the existing MessageRepository to expose the pluggable
audit-log persistence contract; define async methods (at minimum) save(self,
audit: AuditRecord) -> None to persist an audit/approval event and
list_for_message(self, message_id: str, *, limit: int | None = None) ->
tuple[AuditRecord, ...] (or get_history equivalent) to query audits for a
message/channel, include docstrings mirroring MessageRepository (args, returns,
and raises DuplicateRecordError and PersistenceError) and keep the same
typing/async style used by MessageRepository so implementers can be swapped in
consistently.
In `@src/ai_company/persistence/sqlite/repositories.py`:
- Around line 401-413: The deserialization error path in _row_to_message
currently raises QueryError without logging; add a log call (e.g.,
self._logger.warning or self._logger.error) just before raising in the except
block inside _row_to_message to record the exception and message id (use
row["id"] if present) and include the exception details (exc) and the serialized
attachments/metadata or full row dict for debugging; keep the existing raise
QueryError(msg) from exc behavior after logging.
- Around line 137-147: The _row_to_task method currently raises QueryError when
JSON decoding or validation fails but does not log; update _row_to_task to log
the failure at WARNING or ERROR including task_id, the exception details, and
relevant row/context before raising QueryError (use the same logger used
elsewhere in this module), and consider adding/using a new event constant
PERSISTENCE_TASK_DESERIALIZE_FAILED (semantically distinct from fetch failures)
when emitting or recording the log; ensure you reference _JSON_FIELDS,
Task.model_validate, and the caught exceptions (json.JSONDecodeError,
ValidationError) in the log message so debugging context is preserved.
In `@tests/integration/persistence/test_sqlite_integration.py`:
- Around line 101-107: The read_all helper unnecessarily calls the migration on
a reader connection; remove the await b.migrate() call in the read_all function
so it only constructs SQLitePersistenceBackend(SQLiteConfig(path=db_path)),
await b.connect(), reads tasks via await b.tasks.list_tasks(), then await
b.disconnect(), which simulates a concurrent reader without redundant
migrations.
In `@tests/unit/observability/test_events.py`:
- Around line 370-411: Refactor test_persistence_events_exist to use
pytest.mark.parametrize instead of many one-off asserts: replace the repeated
assertions that compare each constant (e.g., PERSISTENCE_BACKEND_CONNECTING,
PERSISTENCE_BACKEND_CONNECTED, PERSISTENCE_BACKEND_DISCONNECTING,
PERSISTENCE_BACKEND_DISCONNECTED, PERSISTENCE_BACKEND_HEALTH_CHECK,
PERSISTENCE_MIGRATION_STARTED, PERSISTENCE_MIGRATION_COMPLETED,
PERSISTENCE_MIGRATION_SKIPPED, PERSISTENCE_TASK_SAVED, PERSISTENCE_TASK_FETCHED,
PERSISTENCE_TASK_LISTED, PERSISTENCE_TASK_DELETED,
PERSISTENCE_COST_RECORD_SAVED, PERSISTENCE_COST_RECORD_QUERIED,
PERSISTENCE_COST_RECORD_AGGREGATED, PERSISTENCE_MESSAGE_SAVED,
PERSISTENCE_MESSAGE_HISTORY_FETCHED) against their expected string values by
defining a single `@pytest.mark.parametrize` table of (constant, expected_string)
pairs and a single test function that asserts constant == expected_string,
keeping the import of the constants (from
ai_company.observability.events.persistence) and the test name
(test_persistence_events_exist) intact.
In `@tests/unit/persistence/conftest.py`:
- Around line 15-38: The helper make_task can produce invalid Task objects when
callers pass status in {TaskStatus.ASSIGNED, TaskStatus.IN_PROGRESS,
TaskStatus.IN_REVIEW, TaskStatus.COMPLETED} but leave assigned_to=None; update
make_task to detect those statuses and either (a) auto-fill a sensible default
assignee (e.g., "auto-assignee") when assigned_to is None before constructing
Task, or (b) raise a clear ValueError from make_task indicating that assigned_to
must be provided for those statuses; modify the logic around the assigned_to
parameter in make_task (and its docstring) to enforce one of these behaviors so
Task(...) validation no longer fails silently.
In `@tests/unit/persistence/sqlite/conftest.py`:
- Around line 14-30: The fixtures memory_db and migrated_db can leak the
aiosqlite.Connection if setup raises; modify both fixtures (memory_db and
migrated_db) so the connection is always closed on errors by wrapping
setup/yield/teardown in a try/finally (or use async with) and call await
db.close() in the finally block; for migrated_db ensure run_migrations(db) is
called before the yield but still inside the try so any exception still triggers
the finally cleanup.
In `@tests/unit/persistence/test_errors.py`:
- Around line 17-33: The tests duplicate the same issubclass assertions across
five functions; replace the multiple tests (test_connection_error_inherits,
test_migration_error_inherits, test_record_not_found_inherits,
test_duplicate_record_inherits, test_query_error_inherits) with a single
parametrized pytest test that iterates over the subclasses
[PersistenceConnectionError, MigrationError, RecordNotFoundError,
DuplicateRecordError, QueryError] and asserts issubclass(subclass,
PersistenceError); use pytest.mark.parametrize to supply each subclass to the
new test function (e.g., test_error_subclasses_inherit_from_persistence_error)
so the inheritance contract is covered with one concise test.
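The last refactor above could look like this sketch, using local stand-ins for the error classes (the real ones live in `ai_company.persistence.errors`):

```python
# Sketch of the suggested parametrized inheritance test. The exception
# classes here are stand-ins mirroring the PR's error hierarchy.
import pytest


class PersistenceError(Exception): ...
class PersistenceConnectionError(PersistenceError): ...
class MigrationError(PersistenceError): ...
class RecordNotFoundError(PersistenceError): ...
class DuplicateRecordError(PersistenceError): ...
class QueryError(PersistenceError): ...


ERROR_CLASSES = [
    PersistenceConnectionError,
    MigrationError,
    RecordNotFoundError,
    DuplicateRecordError,
    QueryError,
]


@pytest.mark.parametrize("subclass", ERROR_CLASSES)
def test_error_subclasses_inherit_from_persistence_error(subclass: type) -> None:
    assert issubclass(subclass, PersistenceError)
```

Each class becomes its own test case in pytest's report, so a broken inheritance link is pinpointed without five near-identical test functions.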
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 754abba8-0975-4a71-90ae-9e0b49fbceb3
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (33)
`CLAUDE.md`, `DESIGN_SPEC.md`, `pyproject.toml`, `src/ai_company/config/defaults.py`, `src/ai_company/config/schema.py`, `src/ai_company/observability/events/persistence.py`, `src/ai_company/persistence/__init__.py`, `src/ai_company/persistence/config.py`, `src/ai_company/persistence/errors.py`, `src/ai_company/persistence/factory.py`, `src/ai_company/persistence/protocol.py`, `src/ai_company/persistence/repositories.py`, `src/ai_company/persistence/sqlite/__init__.py`, `src/ai_company/persistence/sqlite/backend.py`, `src/ai_company/persistence/sqlite/migrations.py`, `src/ai_company/persistence/sqlite/repositories.py`, `tests/integration/persistence/__init__.py`, `tests/integration/persistence/conftest.py`, `tests/integration/persistence/test_sqlite_integration.py`, `tests/unit/config/conftest.py`, `tests/unit/config/test_schema.py`, `tests/unit/observability/test_events.py`, `tests/unit/persistence/__init__.py`, `tests/unit/persistence/conftest.py`, `tests/unit/persistence/sqlite/__init__.py`, `tests/unit/persistence/sqlite/conftest.py`, `tests/unit/persistence/sqlite/test_backend.py`, `tests/unit/persistence/sqlite/test_migrations.py`, `tests/unit/persistence/sqlite/test_repositories.py`, `tests/unit/persistence/test_config.py`, `tests/unit/persistence/test_errors.py`, `tests/unit/persistence/test_factory.py`, `tests/unit/persistence/test_protocol.py`
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations` — Python 3.14 has PEP 649 native lazy annotations
Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 style enforced by ruff on Python 3.14
All public functions and classes must have type hints; use mypy strict mode
Use Google-style docstrings on all public classes and functions — enforced by ruff D rules
Create new objects instead of mutating existing ones; for non-Pydantic internal collections use `copy.deepcopy()` at construction and `MappingProxyType` for read-only enforcement
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`); use `@computed_field` for derived values instead of storing redundant fields
Use `NotBlankStr` from `core.types` for all identifier and name fields (including optional and tuple variants) instead of manual whitespace validators
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations (e.g., multiple tool invocations, parallel agent calls) over bare `create_task`
Keep functions under 50 lines and files under 800 lines
Enforce 88-character line length — enforced by ruff
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Every module with business logic must import logger via `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the structured logger only
Always use `logger` as the variable name (not `_logger` or `log`)
Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly instead of using string literals
Use structured kwargs in logger calls: `logger.info(EVENT, key=value...`
Files:
tests/unit/persistence/test_config.py, tests/unit/persistence/conftest.py, src/ai_company/persistence/protocol.py, tests/integration/persistence/conftest.py, src/ai_company/persistence/__init__.py, tests/unit/persistence/sqlite/test_repositories.py, src/ai_company/config/defaults.py, src/ai_company/persistence/factory.py, src/ai_company/persistence/errors.py, tests/unit/persistence/sqlite/test_migrations.py, tests/integration/persistence/test_sqlite_integration.py, tests/unit/config/test_schema.py, src/ai_company/persistence/sqlite/migrations.py, tests/unit/persistence/test_factory.py, tests/unit/persistence/test_errors.py, src/ai_company/config/schema.py, src/ai_company/persistence/sqlite/backend.py, src/ai_company/persistence/repositories.py, src/ai_company/persistence/sqlite/repositories.py, tests/unit/persistence/sqlite/test_backend.py, tests/unit/persistence/sqlite/conftest.py, tests/unit/config/conftest.py, src/ai_company/persistence/sqlite/__init__.py, tests/unit/observability/test_events.py, tests/unit/persistence/test_protocol.py, src/ai_company/persistence/config.py, src/ai_company/observability/events/persistence.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, or `@pytest.mark.slow`
Maintain minimum 80% code coverage (enforced in CI)
Use `asyncio_mode = 'auto'` in pytest configuration — no manual `@pytest.mark.asyncio` needed on async tests
Set 30-second timeout per test
Use `@pytest.mark.parametrize` to test similar cases rather than duplicating test code
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in tests; use `test-provider`, `test-small-001`, `test-medium-001`, `test-large-001` instead
Files:
tests/unit/persistence/test_config.py, tests/unit/persistence/conftest.py, tests/integration/persistence/conftest.py, tests/unit/persistence/sqlite/test_repositories.py, tests/unit/persistence/sqlite/test_migrations.py, tests/integration/persistence/test_sqlite_integration.py, tests/unit/config/test_schema.py, tests/unit/persistence/test_factory.py, tests/unit/persistence/test_errors.py, tests/unit/persistence/sqlite/test_backend.py, tests/unit/persistence/sqlite/conftest.py, tests/unit/config/conftest.py, tests/unit/observability/test_events.py, tests/unit/persistence/test_protocol.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, or comments; use generic names like `example-provider`, `example-large-001`, `large`/`medium`/`small`
Files:
src/ai_company/persistence/protocol.py, src/ai_company/persistence/__init__.py, src/ai_company/config/defaults.py, src/ai_company/persistence/factory.py, src/ai_company/persistence/errors.py, src/ai_company/persistence/sqlite/migrations.py, src/ai_company/config/schema.py, src/ai_company/persistence/sqlite/backend.py, src/ai_company/persistence/repositories.py, src/ai_company/persistence/sqlite/repositories.py, src/ai_company/persistence/sqlite/__init__.py, src/ai_company/persistence/config.py, src/ai_company/observability/events/persistence.py
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
pyproject.toml: Pin all dependency versions using `==` in `pyproject.toml`
Organize dependencies into groups: `test` (pytest + plugins) and `dev` (includes test + ruff, mypy, pre-commit, commitizen); install with `uv sync`
Files:
pyproject.toml
🧠 Learnings (3)
📚 Learning: 2026-03-08T22:05:51.784Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.784Z
Learning: Applies to **/*.py : Every module with business logic must import logger via `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
Applied to files:
src/ai_company/config/schema.py, src/ai_company/observability/events/persistence.py
📚 Learning: 2026-03-08T22:05:51.784Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.784Z
Learning: Applies to **/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly instead of using string literals
Applied to files:
tests/unit/observability/test_events.py, src/ai_company/observability/events/persistence.py
📚 Learning: 2026-03-08T22:05:51.783Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.783Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves
Applied to files:
src/ai_company/persistence/config.py
🧬 Code graph analysis (18)
tests/unit/persistence/test_config.py (1)
- src/ai_company/persistence/config.py (2): `PersistenceConfig` (60-98), `SQLiteConfig` (19-57)

tests/unit/persistence/conftest.py (2)
- src/ai_company/communication/message.py (2): `Message` (88-138), `MessageMetadata` (33-85)
- src/ai_company/core/task.py (1): `Task` (45-261)

src/ai_company/persistence/protocol.py (2)
- src/ai_company/persistence/repositories.py (3): `CostRecordRepository` (83-129), `MessageRepository` (133-166), `TaskRepository` (17-79)
- src/ai_company/persistence/sqlite/backend.py (9): `connect` (60-102), `disconnect` (104-121), `health_check` (123-140), `migrate` (142-153), `is_connected` (156-158), `backend_name` (161-163), `tasks` (182-188), `cost_records` (191-197), `messages` (200-206)

src/ai_company/persistence/__init__.py (5)
- src/ai_company/persistence/config.py (2): `PersistenceConfig` (60-98), `SQLiteConfig` (19-57)
- src/ai_company/persistence/errors.py (6): `DuplicateRecordError` (28-29), `MigrationError` (16-17), `PersistenceConnectionError` (12-13), `PersistenceError` (8-9), `QueryError` (32-33), `RecordNotFoundError` (20-25)
- src/ai_company/persistence/factory.py (1): `create_backend` (26-65)
- src/ai_company/persistence/protocol.py (1): `PersistenceBackend` (18-88)
- src/ai_company/persistence/repositories.py (3): `CostRecordRepository` (83-129), `MessageRepository` (133-166), `TaskRepository` (17-79)

src/ai_company/persistence/factory.py (5)
- src/ai_company/observability/_logger.py (1): `get_logger` (8-28)
- src/ai_company/persistence/errors.py (1): `PersistenceConnectionError` (12-13)
- src/ai_company/persistence/sqlite/backend.py (1): `SQLitePersistenceBackend` (35-206)
- src/ai_company/persistence/config.py (1): `PersistenceConfig` (60-98)
- src/ai_company/persistence/protocol.py (1): `PersistenceBackend` (18-88)

tests/unit/persistence/sqlite/test_migrations.py (3)
- src/ai_company/persistence/errors.py (1): `MigrationError` (16-17)
- src/ai_company/persistence/sqlite/migrations.py (3): `get_user_version` (94-98), `run_migrations` (131-181), `set_user_version` (101-115)
- tests/unit/persistence/sqlite/conftest.py (2): `memory_db` (15-20), `migrated_db` (24-30)

tests/unit/config/test_schema.py (2)
- src/ai_company/config/schema.py (1): `RootConfig` (439-623)
- src/ai_company/persistence/config.py (2): `PersistenceConfig` (60-98), `SQLiteConfig` (19-57)

src/ai_company/persistence/sqlite/migrations.py (2)
- src/ai_company/observability/_logger.py (1): `get_logger` (8-28)
- src/ai_company/persistence/errors.py (1): `MigrationError` (16-17)

tests/unit/persistence/test_factory.py (5)
- src/ai_company/persistence/config.py (2): `PersistenceConfig` (60-98), `SQLiteConfig` (19-57)
- src/ai_company/persistence/errors.py (1): `PersistenceConnectionError` (12-13)
- src/ai_company/persistence/factory.py (1): `create_backend` (26-65)
- src/ai_company/persistence/protocol.py (7): `PersistenceBackend` (18-88), `backend_name` (71-73), `is_connected` (66-68), `connect` (33-40), `migrate` (57-63), `tasks` (76-78), `disconnect` (42-47)
- src/ai_company/persistence/sqlite/backend.py (7): `SQLitePersistenceBackend` (35-206), `backend_name` (161-163), `is_connected` (156-158), `connect` (60-102), `migrate` (142-153), `tasks` (182-188), `disconnect` (104-121)

tests/unit/persistence/test_errors.py (1)
- src/ai_company/persistence/errors.py (6): `DuplicateRecordError` (28-29), `MigrationError` (16-17), `PersistenceConnectionError` (12-13), `PersistenceError` (8-9), `QueryError` (32-33), `RecordNotFoundError` (20-25)

src/ai_company/config/schema.py (1)
- src/ai_company/persistence/config.py (1): `PersistenceConfig` (60-98)

src/ai_company/persistence/sqlite/backend.py (4)
- src/ai_company/persistence/errors.py (1): `PersistenceConnectionError` (12-13)
- src/ai_company/persistence/sqlite/migrations.py (1): `run_migrations` (131-181)
- src/ai_company/persistence/sqlite/repositories.py (3): `SQLiteCostRecordRepository` (233-345), `SQLiteMessageRepository` (348-450), `SQLiteTaskRepository` (57-230)
- src/ai_company/persistence/config.py (1): `SQLiteConfig` (19-57)

src/ai_company/persistence/repositories.py (4)
- src/ai_company/budget/cost_record.py (1): `CostRecord` (15-56)
- src/ai_company/communication/message.py (1): `Message` (88-138)
- src/ai_company/core/enums.py (1): `TaskStatus` (165-191)
- src/ai_company/core/task.py (1): `Task` (45-261)

src/ai_company/persistence/sqlite/repositories.py (7)
- src/ai_company/budget/cost_record.py (1): `CostRecord` (15-56)
- src/ai_company/communication/message.py (1): `Message` (88-138)
- src/ai_company/core/task.py (1): `Task` (45-261)
- src/ai_company/observability/_logger.py (1): `get_logger` (8-28)
- src/ai_company/persistence/errors.py (2): `DuplicateRecordError` (28-29), `QueryError` (32-33)
- src/ai_company/persistence/protocol.py (2): `tasks` (76-78), `messages` (86-88)
- src/ai_company/persistence/sqlite/backend.py (2): `tasks` (182-188), `messages` (200-206)

tests/unit/persistence/sqlite/conftest.py (3)
- src/ai_company/persistence/sqlite/migrations.py (1): `run_migrations` (131-181)
- src/ai_company/persistence/protocol.py (1): `connect` (33-40)
- src/ai_company/persistence/sqlite/backend.py (1): `connect` (60-102)

tests/unit/config/conftest.py (1)
- src/ai_company/persistence/config.py (1): `PersistenceConfig` (60-98)

src/ai_company/persistence/sqlite/__init__.py (3)
- src/ai_company/persistence/sqlite/backend.py (1): `SQLitePersistenceBackend` (35-206)
- src/ai_company/persistence/sqlite/migrations.py (1): `run_migrations` (131-181)
- src/ai_company/persistence/sqlite/repositories.py (3): `SQLiteCostRecordRepository` (233-345), `SQLiteMessageRepository` (348-450), `SQLiteTaskRepository` (57-230)

src/ai_company/persistence/config.py (1)
- src/ai_company/observability/_logger.py (1): `get_logger` (8-28)
🔇 Additional comments (28)
tests/unit/persistence/test_config.py (1)
9-78: LGTM: config coverage is well targeted. Defaults, immutability, invalid inputs, and custom SQLite wiring are all covered without coupling these tests to backend internals.
tests/unit/config/test_schema.py (1)
330-345: LGTM: the new root-level persistence wiring is pinned cleanly. These tests cover both the default `PersistenceConfig` and the custom-path flow through `RootConfig`.
src/ai_company/config/schema.py (1)
23-23: LGTM: `RootConfig` exposes persistence with a safe default. Using `default_factory=PersistenceConfig` keeps existing configs valid while surfacing backend selection at the top level.
Also applies to: 462-533
tests/unit/persistence/test_protocol.py (1)
21-116: LGTM! The fake implementations correctly satisfy the protocol contracts, and the `isinstance` checks verify runtime-checkable protocol compliance. Clean approach to testing structural subtyping.
tests/unit/persistence/test_factory.py (1)
12-77: LGTM! Comprehensive factory tests covering backend creation, protocol compliance, config passthrough, error handling for unknown backends (with clever validation bypass via `model_copy`), and multi-tenancy isolation. The inline import in the async test is acceptable for test-scoped helpers.
tests/integration/persistence/conftest.py (1)
15-28: LGTM! Well-structured fixtures following proper backend lifecycle (connect → migrate → yield → disconnect). The `on_disk_backend` fixture correctly handles cleanup in the implicit finally block after yield.
tests/integration/persistence/test_sqlite_integration.py (2)
12-48: LGTM! Solid integration tests covering WAL mode behavior, data persistence across reconnects, and proper backend lifecycle management. The tests correctly exercise on-disk SQLite persistence scenarios.
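Since these tests hinge on WAL semantics, a quick illustration may help: write-ahead logging is enabled by a single PRAGMA and then persists in the database file across reconnects. A minimal stdlib `sqlite3` sketch (the PR uses `aiosqlite`, but the PRAGMA behaves the same; the table name here is illustrative):

```python
import os
import sqlite3
import tempfile

# WAL only applies to on-disk databases; ":memory:" ignores the PRAGMA.
path = os.path.join(tempfile.mkdtemp(), "ops.db")

conn = sqlite3.connect(path)
# The PRAGMA returns the journal mode actually in effect ("wal").
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.commit()
conn.close()

# WAL is a persistent property of the database file: a fresh connection
# sees it without re-issuing the PRAGMA.
conn2 = sqlite3.connect(path)
mode2 = conn2.execute("PRAGMA journal_mode").fetchone()[0]
conn2.close()
```

This is why a reconnect test can assert WAL mode without the backend re-running its PRAGMA setup.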
49-88: LGTM! Good coverage of multi-entity persistence, verifying that tasks, cost records, and messages all persist correctly in the same database.
tests/unit/persistence/sqlite/test_migrations.py (1)
21-96: LGTM! Comprehensive migration tests covering version management, table/index creation, idempotence, skip behavior, and error handling. The failure simulation test (lines 82-96) properly patches `_MIGRATIONS` to verify error wrapping.
src/ai_company/persistence/errors.py (1)
1-33: LGTM! Clean and well-documented error hierarchy. The base `PersistenceError` enables catch-all handling, and each derived exception has appropriate docstrings. Good practice documenting `RecordNotFoundError` as reserved for future strict-fetch methods.
tests/unit/persistence/sqlite/test_backend.py (1)
18-127: LGTM! Comprehensive backend tests covering lifecycle management, idempotent operations, health checks, migration behavior, repository access guards, WAL mode configuration, error handling, and protocol compliance. The `# type: ignore[unreachable]` on line 27 suppresses a mypy narrowing false positive and is acceptable.
tests/unit/persistence/sqlite/test_repositories.py (3)
38-197: LGTM! Thorough task repository tests covering save/get, upsert behavior, listing with filters (status, assigned_to, project), delete operations, combined filters, and complex nested model round-trips. The round-trip test (lines 147-196) properly validates that all nested fields survive JSON serialization.
202-315: LGTM! Good cost record repository coverage including query operations with filters, aggregation (total and by agent), empty aggregation handling, combined filters, and `call_category` enum round-trips (including null handling).
320-440: LGTM! Comprehensive message repository tests covering save/get_history, ordering (newest first), limit functionality, channel filtering, duplicate rejection with `DuplicateRecordError`, alias handling for `sender`/`from`, attachments/metadata round-trips, and UUID identity preservation.
src/ai_company/persistence/protocol.py (1)
1-88: LGTM! The protocol is well-designed with:
- Correct use of `@runtime_checkable` for structural subtyping checks.
- TYPE_CHECKING guard for repository imports preventing circular dependencies.
- Clear async lifecycle methods (`connect`, `disconnect`, `health_check`, `migrate`) and property accessors.
- Comprehensive docstrings documenting error semantics.
One minor observation: the `migrate()` docstring only mentions `MigrationError`, but the SQLite implementation (lines 142-153 in `sqlite/backend.py`) also raises `PersistenceConnectionError` when not connected. Consider documenting both exceptions for completeness.
src/ai_company/persistence/config.py (1)
60-98: LGTM! The `PersistenceConfig` implementation is solid:
- Frozen model for immutability.
- `ClassVar` for valid backends allows extension without instance mutation.
- Backend validation logs at WARNING level with context before raising, following coding guidelines.
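For context, the frozen-model pattern being approved can be sketched as follows. This is an illustrative stand-in, not the actual `PersistenceConfig` definition — the field name, the `ClassVar` registry, and the validator are all assumptions:

```python
from typing import ClassVar

from pydantic import BaseModel, ConfigDict, ValidationError, field_validator


class PersistenceConfigSketch(BaseModel):
    """Illustrative stand-in; not the real PersistenceConfig."""

    model_config = ConfigDict(frozen=True)

    # ClassVar: shared set of valid backends, extensible (e.g., by a
    # future PostgreSQL backend) without touching any instance.
    VALID_BACKENDS: ClassVar[frozenset[str]] = frozenset({"sqlite"})

    backend: str = "sqlite"

    @field_validator("backend")
    @classmethod
    def _check_backend(cls, v: str) -> str:
        if v not in cls.VALID_BACKENDS:
            raise ValueError(f"unknown backend: {v}")
        return v


cfg = PersistenceConfigSketch()

try:
    cfg.backend = "postgres"  # frozen: mutation is rejected
    mutated = True
except ValidationError:
    mutated = False

try:
    PersistenceConfigSketch(backend="postgres")  # validator rejects
    rejected = False
except ValidationError:
    rejected = True
```

In Pydantic v2, a frozen model raises `ValidationError` on attribute assignment, which is what makes the config safe to share across the backend and factory.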
src/ai_company/persistence/sqlite/migrations.py (3)
28-89: LGTM! The schema definition is well-structured:
- Tables align with domain models (tasks, cost_records, messages).
- Appropriate indexes on frequently filtered columns (status, assigned_to, project, agent_id, task_id, channel, timestamp).
- JSON fields stored as TEXT with sensible defaults (`'[]'`, `'{}'`).
131-181: LGTM on migration orchestration logic. The `run_migrations` function properly:
- Skips migrations when already at or above target version.
- Logs state transitions at INFO level.
- Commits on success, rolls back on failure.
- Wraps errors in domain `MigrationError` with proper logging.
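The `PRAGMA user_version` bookkeeping that underlies this orchestration is compact enough to sketch with stdlib `sqlite3`. The migration SQL below is illustrative, not the project's real schema:

```python
import sqlite3

# Ordered, append-only list: entry N migrates the schema to version N.
_MIGRATIONS_SKETCH = [
    "CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT NOT NULL)",
    "CREATE INDEX idx_tasks_status ON tasks(status)",
]


def run_migrations_sketch(conn: sqlite3.Connection) -> int:
    """Apply pending migrations; return the resulting schema version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, sql in enumerate(_MIGRATIONS_SKETCH, start=1):
        if version <= current:
            continue  # already applied — reruns are no-ops
        try:
            conn.execute(sql)
            conn.execute(f"PRAGMA user_version = {version}")
            conn.commit()
        except sqlite3.Error:
            conn.rollback()
            raise  # a real backend would wrap this in MigrationError
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
first_run = run_migrations_sketch(conn)   # applies both migrations
second_run = run_migrations_sketch(conn)  # skips: version already current
```

The version counter lives in the database file itself, so idempotence falls out naturally: reruns see the stored version and skip everything at or below it.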
168-176: No action needed. The exception handling syntax is correct per PEP 758. PEP 758 distinguishes between two cases: `except A, B:` (no parentheses) applies only when there is no `as` target. When an `as` clause is present, parentheses are required: `except (A, B) as exc:`. The code correctly uses parentheses because it includes the `as exc:` clause and therefore complies with PEP 758.
> Likely an incorrect or invalid review comment.
src/ai_company/persistence/sqlite/backend.py (2)
60-102: LGTM on `connect()` implementation. Solid connection handling:
- Idempotent (returns early if already connected).
- Configures WAL mode with warning on failure (appropriately excludes `:memory:` check).
- Properly cleans up partial state on failure before re-raising.
- Repository initialization after successful connection setup.
104-206: LGTM! The remaining implementation is well-structured:
- `disconnect()` gracefully handles close failures with warning-level logging.
- `health_check()` is simple and effective.
- `migrate()` correctly enforces connection requirement before delegating.
- Generic `_require_connected[T]` helper (Python 3.12+ syntax) elegantly guards repository access with proper error logging.
- All properties have clear docstrings documenting potential exceptions.
src/ai_company/persistence/__init__.py (1)
1-39: LGTM! Clean package initialization providing a convenient public API surface:
- All persistence components (protocols, config, errors, factory, repositories) are re-exported.
- `__all__` is alphabetically sorted for maintainability.
- Enables clean imports like `from ai_company.persistence import PersistenceBackend, create_backend`.
src/ai_company/persistence/sqlite/repositories.py (6)
43-54: LGTM on `_json_list` helper. Clean utility function for serializing tuple fields to JSON, properly handling both Pydantic models (via `model_dump`) and scalar values.
67-126: LGTM on `save()` implementation. Correct upsert semantics using `ON CONFLICT(id) DO UPDATE`. JSON fields are properly serialized, and error handling logs before raising.
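For reference, the upsert shape being approved here, demonstrated on an illustrative two-column table (not the real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT NOT NULL)")

# SQLite (>= 3.24) upsert: insert, or update the existing row on
# primary-key conflict. `excluded` refers to the row that failed to insert.
UPSERT = """
    INSERT INTO tasks (id, status) VALUES (?, ?)
    ON CONFLICT(id) DO UPDATE SET status = excluded.status
"""

conn.execute(UPSERT, ("t-1", "pending"))
conn.execute(UPSERT, ("t-1", "done"))  # same id: updates instead of failing

status = conn.execute(
    "SELECT status FROM tasks WHERE id = ?", ("t-1",)
).fetchone()[0]
count = conn.execute("SELECT COUNT(*) FROM tasks").fetchone()[0]
```

The single-statement form avoids a read-then-write race, which is presumably what the PR description means by TOCTOU-safe writes.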
156-230: LGTM on `get()`, `list_tasks()`, and `delete()` implementations.
- Query parameterization prevents SQL injection.
- Dynamic WHERE clause construction is safe (clauses are hardcoded strings, only values are parameterized).
- Proper logging for success and error paths.
233-345: LGTM on `SQLiteCostRecordRepository`. Well-implemented append-only cost record storage:
- `save()` correctly uses insert-only semantics.
- `query()` builds filters safely with parameterized queries.
- `aggregate()` uses `COALESCE` for empty result handling with a defensive `None` check.
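The `COALESCE` detail is worth illustrating, since `SUM` over zero rows yields SQL NULL (Python `None`) rather than zero. A minimal sketch with an assumed table layout:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cost_records (agent_id TEXT, cost_usd REAL)")

# SUM over an empty table is SQL NULL, which arrives as Python None...
raw = conn.execute("SELECT SUM(cost_usd) FROM cost_records").fetchone()[0]

# ...while COALESCE turns it into a usable zero.
total = conn.execute(
    "SELECT COALESCE(SUM(cost_usd), 0.0) FROM cost_records"
).fetchone()[0]
```

Without the `COALESCE`, every caller of the aggregation would need its own `None` check before doing arithmetic.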
358-399: LGTM on `save()` implementation. Correctly handles duplicate detection via `sqlite3.IntegrityError`, raising the domain-specific `DuplicateRecordError`. General database errors are logged and wrapped in `QueryError`.
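The driver-to-domain error translation described here can be sketched like this (synchronous and simplified; the `DuplicateRecordError` below is a stand-in, not the project's class):

```python
import sqlite3


class DuplicateRecordError(Exception):
    """Stand-in for the project's domain exception."""


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id TEXT PRIMARY KEY, body TEXT)")


def save_message(message_id: str, body: str) -> None:
    try:
        conn.execute(
            "INSERT INTO messages (id, body) VALUES (?, ?)",
            (message_id, body),
        )
    except sqlite3.IntegrityError as exc:
        # A primary-key collision becomes a domain-specific error;
        # `from exc` keeps the driver traceback for debugging.
        raise DuplicateRecordError(message_id) from exc


save_message("m-1", "hello")
try:
    save_message("m-1", "hello again")
    duplicate_rejected = False
except DuplicateRecordError:
    duplicate_rejected = True
```

Wrapping at the repository boundary keeps callers decoupled from the SQLite driver, which is what makes the backend pluggable.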
415-450: LGTM on `get_history()` implementation. Properly implements channel-scoped history retrieval with optional limit, ordered by timestamp descending. Parameterized queries prevent injection.
```python
@runtime_checkable
class MessageRepository(Protocol):
    """CRUD + query interface for Message persistence."""

    async def save(self, message: Message) -> None:
        """Persist a message.

        Args:
            message: The message to persist.

        Raises:
            DuplicateRecordError: If a message with the same ID exists.
            PersistenceError: If the operation fails.
        """
        ...

    async def get_history(
        self,
        channel: str,
        *,
        limit: int | None = None,
    ) -> tuple[Message, ...]:
        """Retrieve message history for a channel.

        Args:
            channel: Channel name to query.
            limit: Maximum number of messages to return (newest first).

        Returns:
            Messages ordered by timestamp descending.

        Raises:
            PersistenceError: If the operation fails.
        """
        ...
```
Add the missing AuditRepository to the initial persistence contract.
The linked issue scope and DESIGN_SPEC §7.5 both treat audit-log persistence as part of this backend slice, but the public API stops at MessageRepository. Without an AuditRepository protocol here, callers still have no pluggable contract for persisting approval/audit events, and the backend surface is not feature-complete.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/persistence/repositories.py` around lines 132 - 166, Add a new
runtime-checkable Protocol named AuditRepository next to the existing
MessageRepository to expose the pluggable audit-log persistence contract; define
async methods (at minimum) save(self, audit: AuditRecord) -> None to persist an
audit/approval event and list_for_message(self, message_id: str, *, limit: int |
None = None) -> tuple[AuditRecord, ...] (or get_history equivalent) to query
audits for a message/channel, include docstrings mirroring MessageRepository
(args, returns, and raises DuplicateRecordError and PersistenceError) and keep
the same typing/async style used by MessageRepository so implementers can be
swapped in consistently.
…iewers - Fix PEP 649 runtime annotation errors in factory, protocol, repositories - Harden path traversal validation in SQLiteConfig (reject .. components) - Add deserialization error logging in SQLite repositories - Fix disconnect() to only log success on successful close - Add rollback error protection in migrations - Add set_user_version input validation - Add limit validation in MessageRepository.get_history - Fix docstrings for append-only repos (CostRecord, Message) - Add fixture cleanup (try/finally) to prevent connection leaks - Add auto-fill assignee for statuses requiring one in make_task - Update DESIGN_SPEC and README to reflect persistence completion - Add comprehensive test coverage: deserialization failures, protocol compliance, disconnect error handling, PRAGMA failure cleanup, migration version validation, invalid limit, path traversal rejection
aiosqlite is MIT-licensed (confirmed via PyPI classifiers) but scancode detects it as "LicenseRef-scancode-free-unknown AND MIT" because the package metadata lacks a License field.
Pull request overview
Copilot reviewed 32 out of 36 changed files in this pull request and generated 4 comments.
```python
await backend.connect()
assert backend.is_connected is True

await backend.disconnect()  # type: ignore[unreachable]
```
The `# type: ignore[unreachable]` on this `await backend.disconnect()` looks incorrect (the line is reachable) and can hide real dead-code issues in the future. Please remove the ignore (or, if there's a real mypy issue here, adjust the typing/structure so the ignore is actually justified).
Suggested change:
```diff
-await backend.disconnect()  # type: ignore[unreachable]
+await backend.disconnect()
```
```
CC0-1.0, Python-2.0,
LicenseRef-scancode-free-unknown
```
Allowing LicenseRef-scancode-free-unknown effectively permits dependencies with an unknown/unclassified license, which weakens the purpose of the dependency-review gate. If this was added to accommodate a specific package’s metadata, consider tightening this (e.g., fixing/overriding the detected license upstream or documenting why this exception is safe/necessary) rather than allowing unknown licenses repo-wide.
Suggested change:
```diff
-CC0-1.0, Python-2.0,
-LicenseRef-scancode-free-unknown
+CC0-1.0, Python-2.0
```
```python
@pytest.mark.integration
```
Integration tests in this repo typically include a pytest.mark.timeout(30) at module scope (often combined with pytest.mark.integration). This new integration module only marks integration, so these tests can hang indefinitely in CI; please add the standard timeout marker for consistency and reliability.
Suggested change:
```diff
-@pytest.mark.integration
+pytestmark = [pytest.mark.integration, pytest.mark.timeout(30)]
```
```python
@pytest.mark.integration
class TestSQLiteOnDisk:
    async def test_wal_file_created(self, db_path: str) -> None:
        """WAL mode creates -wal and -shm files on disk."""
```
This test docstring claims it verifies that "-wal" and "-shm" files are created, but the assertions only check the main DB file exists and that PRAGMA journal_mode returns wal. Either update the docstring to match what’s asserted, or add explicit checks for the *-wal / *-shm files after writing.
Suggested change:
```diff
-"""WAL mode creates -wal and -shm files on disk."""
+"""WAL journal mode is enabled for the on-disk SQLite database."""
```
Actionable comments posted: 5
♻️ Duplicate comments (2)
src/ai_company/persistence/sqlite/repositories.py (1)
458-460: ⚠️ Potential issue | 🟡 Minor — Log the invalid `limit` before raising. This validation branch raises `QueryError` silently, which leaves the bad input out of the persistence logs.
🔧 Proposed fix:
```diff
 if limit is not None and limit < 1:
     msg = f"limit must be a positive integer, got {limit}"
+    logger.warning(
+        PERSISTENCE_MESSAGE_HISTORY_FAILED,
+        channel=channel,
+        limit=limit,
+        error=msg,
+    )
     raise QueryError(msg)
```
As per coding guidelines, "All error paths must log at WARNING or ERROR with context before raising".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/persistence/sqlite/repositories.py` around lines 458 - 460, log the invalid `limit` value with context before raising the QueryError: locate the validation branch that checks `if limit is not None and limit < 1` and add a warning-level log statement (using the repository/module logger used elsewhere, e.g., logger or self._logger) that includes the offending `limit` value and the current operation or function name, then raise QueryError as before.
src/ai_company/persistence/repositories.py (1)
131-165: ⚠️ Potential issue | 🟠 Major — The initial persistence contract is still missing `AuditRepository`. Issue #36 / DESIGN_SPEC §7.5 include audit-log persistence in this slice. Without an `AuditRepository` protocol here, callers still have no pluggable audit abstraction and the backend surface remains incomplete.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/persistence/repositories.py` around lines 131 - 165, Add a runtime-checkable AuditRepository Protocol alongside MessageRepository to provide the pluggable audit-log abstraction; define methods such as async def append(self, audit: AuditRecord) -> None (raise DuplicateRecordError/PersistenceError where appropriate) and async def query_by_channel(self, channel: str, *, limit: int | None = None) -> tuple[AuditRecord, ...] (or an async def query_by_message(self, message_id: str) -> tuple[AuditRecord, ...]) so callers have a consistent interface for persisting and retrieving audit entries; place it in the same module and mirror docstrings/exception semantics used by MessageRepository (refer to AuditRepository, append, query_by_channel/query_by_message, and AuditRecord to locate and implement).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@DESIGN_SPEC.md`:
- Around line 1436-1441: The MessageRepository row in the table currently
promises sender and time-range queries but the protocol only exposes
get_history(channel, *, limit=None); either update the design surface to add
explicit repository methods (e.g., add get_by_sender(sender, *, limit=None) and
get_history(channel, *, start_time=None, end_time=None, limit=None) to
MessageRepository) or narrow the table row to only "by channel
(get_history(channel, *, limit=None))" so the table matches the
MessageRepository protocol; modify the DESIGN_SPEC.md table entry and/or the
protocol section to keep MessageRepository and get_history(channel, *,
limit=None) consistent.
In `@src/ai_company/persistence/repositories.py`:
- Around line 30-34: The repository protocol methods currently accept plain str
for public identifiers (e.g., the TaskRepository.get signature) which allows
blank values; update those method signatures to use NotBlankStr from core.types
(e.g., change get(self, task_id: str) -> ... to get(self, task_id: NotBlankStr)
-> ...), and where identifiers are optional use NotBlankStr | None, and for any
tuple/sequence identifiers use Tuple[NotBlankStr, ...]; also add the import for
NotBlankStr from core.types and update all other repository method signatures
mentioned (the other ID/name parameters at the same locations) so
implementations receive the tightened contract.
In `@src/ai_company/persistence/sqlite/backend.py`:
- Around line 116-128: The disconnect() implementation currently logs close
failures then clears the DB state, hiding the error and losing the handle;
change disconnect() so that after catching sqlite3.Error or OSError from await
self._db.close() you log the failure (as now) and then raise
PersistenceConnectionError (preserving the original exception info) instead of
proceeding to finally; only call self._clear_state() after a successful close
(i.e., move the _clear_state() call to the success path following await
self._db.close()), ensuring callers can catch/inspect the raised
PersistenceConnectionError and the DB handle remains until close succeeds.
- Around line 59-128: connect() and disconnect() mutate shared backend state
(_db, _tasks, _messages, _cost_records) without synchronization, allowing races
when called concurrently; fix by adding an asyncio.Lock (e.g.,
self._lifecycle_lock) on the instance and acquiring it at the start of both
connect() and disconnect() to serialize the lifecycle operations, perform the
existing None-checks, connection open/close, repository assignments and calls to
_clear_state while holding the lock, and finally release the lock (use async
with self._lifecycle_lock:) so overlapping calls cannot open/close connections
concurrently.
In `@src/ai_company/persistence/sqlite/repositories.py`:
- Around line 39-40: The annotation TaskStatus is only imported under
TYPE_CHECKING but is used at runtime in the public method signature
list_tasks(self, status: TaskStatus | None, ...), causing NameError during
get_type_hints()/inspect.get_annotations(); move the import for TaskStatus out
of the TYPE_CHECKING block into a normal runtime import (import TaskStatus from
ai_company.core.enums at top-level), keep the TYPE_CHECKING block for other
type-only imports if needed, and run tests that call
typing.get_type_hints(list_tasks) to verify the NameError is resolved.
---
Duplicate comments:
In `@src/ai_company/persistence/repositories.py`:
- Around line 131-165: Add a runtime-checkable AuditRepository Protocol
alongside MessageRepository to provide the pluggable audit-log abstraction;
define methods such as async def append(self, audit: AuditRecord) -> None (raise
DuplicateRecordError/PersistenceError where appropriate) and async def
query_by_channel(self, channel: str, *, limit: int | None = None) ->
tuple[AuditRecord, ...] (or an async def query_by_message(self, message_id: str)
-> tuple[AuditRecord, ...]) so callers have a consistent interface for
persisting and retrieving audit entries; place it in the same module and mirror
docstrings/exception semantics used by MessageRepository (refer to
AuditRepository, append, query_by_channel/query_by_message, and AuditRecord to
locate and implement).
In `@src/ai_company/persistence/sqlite/repositories.py`:
- Around line 458-460: Log the invalid `limit` value with context before raising
the QueryError: locate the validation branch that checks `if limit is not None
and limit < 1` and add a warning-level log statement (using the
repository/module logger used elsewhere, e.g., logger or self._logger) that
includes the offending `limit` value and the current operation or function name,
then raise QueryError as before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 79afc1a9-25f0-44af-9440-32430d3cd30a
📒 Files selected for processing (20)
DESIGN_SPEC.md, README.md, src/ai_company/observability/events/persistence.py, src/ai_company/persistence/config.py, src/ai_company/persistence/errors.py, src/ai_company/persistence/factory.py, src/ai_company/persistence/protocol.py, src/ai_company/persistence/repositories.py, src/ai_company/persistence/sqlite/backend.py, src/ai_company/persistence/sqlite/migrations.py, src/ai_company/persistence/sqlite/repositories.py, tests/integration/persistence/test_sqlite_integration.py, tests/unit/observability/test_events.py, tests/unit/persistence/conftest.py, tests/unit/persistence/sqlite/conftest.py, tests/unit/persistence/sqlite/test_backend.py, tests/unit/persistence/sqlite/test_migrations.py, tests/unit/persistence/sqlite/test_repositories.py, tests/unit/persistence/test_config.py, tests/unit/persistence/test_errors.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations` — Python 3.14 has PEP 649 native lazy annotations
Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 style enforced by ruff on Python 3.14
All public functions and classes must have type hints; use mypy strict mode
Use Google-style docstrings on all public classes and functions — enforced by ruff D rules
Create new objects instead of mutating existing ones; for non-Pydantic internal collections use `copy.deepcopy()` at construction and `MappingProxyType` for read-only enforcement
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`); use `@computed_field` for derived values instead of storing redundant fields
Use `NotBlankStr` from `core.types` for all identifier and name fields (including optional and tuple variants) instead of manual whitespace validators
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations (e.g., multiple tool invocations, parallel agent calls) over bare `create_task`
Keep functions under 50 lines and files under 800 lines
Enforce 88-character line length — enforced by ruff
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Every module with business logic must import logger via `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the structured logger only
Always use `logger` as the variable name (not `_logger` or `log`)
Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly instead of using string literals
Use structured kwargs in logger calls: `logger.info(EVENT, key=value...
Files:
- tests/unit/persistence/sqlite/test_backend.py
- src/ai_company/persistence/factory.py
- tests/unit/persistence/test_errors.py
- tests/unit/persistence/sqlite/test_repositories.py
- tests/unit/observability/test_events.py
- src/ai_company/persistence/sqlite/migrations.py
- src/ai_company/persistence/sqlite/repositories.py
- tests/unit/persistence/test_config.py
- tests/unit/persistence/sqlite/test_migrations.py
- src/ai_company/persistence/repositories.py
- src/ai_company/persistence/sqlite/backend.py
- tests/unit/persistence/sqlite/conftest.py
- src/ai_company/persistence/protocol.py
- tests/unit/persistence/conftest.py
- src/ai_company/persistence/config.py
- tests/integration/persistence/test_sqlite_integration.py
- src/ai_company/persistence/errors.py
- src/ai_company/observability/events/persistence.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, or `@pytest.mark.slow`
Maintain minimum 80% code coverage (enforced in CI)
Use `asyncio_mode = 'auto'` in pytest configuration — no manual `@pytest.mark.asyncio` needed on async tests
Set 30-second timeout per test
Use `@pytest.mark.parametrize` to test similar cases rather than duplicating test code
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in tests; use `test-provider`, `test-small-001`, `test-medium-001`, `test-large-001` instead
Files:
- tests/unit/persistence/sqlite/test_backend.py
- tests/unit/persistence/test_errors.py
- tests/unit/persistence/sqlite/test_repositories.py
- tests/unit/observability/test_events.py
- tests/unit/persistence/test_config.py
- tests/unit/persistence/sqlite/test_migrations.py
- tests/unit/persistence/sqlite/conftest.py
- tests/unit/persistence/conftest.py
- tests/integration/persistence/test_sqlite_integration.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, or comments; use generic names like `example-provider`, `example-large-001`, `large`/`medium`/`small`
Files:
- src/ai_company/persistence/factory.py
- src/ai_company/persistence/sqlite/migrations.py
- src/ai_company/persistence/sqlite/repositories.py
- src/ai_company/persistence/repositories.py
- src/ai_company/persistence/sqlite/backend.py
- src/ai_company/persistence/protocol.py
- src/ai_company/persistence/config.py
- src/ai_company/persistence/errors.py
- src/ai_company/observability/events/persistence.py
🧠 Learnings (8)
📚 Learning: 2026-03-08T22:05:51.783Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.783Z
Learning: Applies to **/*.py : No `from __future__ import annotations` — Python 3.14 has PEP 649 native lazy annotations
Applied to files:
src/ai_company/persistence/factory.py
📚 Learning: 2026-03-08T22:05:51.785Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.785Z
Learning: Target Python 3.14+ with PEP 649 native lazy annotations support
Applied to files:
src/ai_company/persistence/factory.py
📚 Learning: 2026-03-08T22:05:51.784Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.784Z
Learning: Applies to tests/**/*.py : Use `pytest.mark.parametrize` to test similar cases rather than duplicating test code
Applied to files:
tests/unit/persistence/test_errors.py
tests/unit/observability/test_events.py
📚 Learning: 2026-03-08T22:05:51.784Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.784Z
Learning: Applies to **/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly instead of using string literals
Applied to files:
tests/unit/observability/test_events.py
src/ai_company/observability/events/persistence.py
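Illustrating the learning above, an event-constants module might look like this (the constant names appear elsewhere in this PR; the string values are assumptions, not the real ones):

```python
# Hypothetical constants in the style of ai_company.observability.events.persistence;
# string values are illustrative.
PERSISTENCE_BACKEND_CONNECTING = "persistence.backend.connecting"
PERSISTENCE_BACKEND_CONNECTED = "persistence.backend.connected"
PERSISTENCE_BACKEND_DISCONNECTING = "persistence.backend.disconnecting"
PERSISTENCE_BACKEND_DISCONNECTED = "persistence.backend.disconnected"
PERSISTENCE_BACKEND_NOT_CONNECTED = "persistence.backend.not_connected"

# Call sites import the constant instead of repeating a string literal:
#     logger.info(PERSISTENCE_BACKEND_CONNECTED, path=config.path)
```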
📚 Learning: 2026-03-08T22:05:51.784Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.784Z
Learning: Applies to **/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
src/ai_company/persistence/sqlite/repositories.py
📚 Learning: 2026-03-08T22:05:51.784Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.784Z
Learning: Applies to tests/**/*.py : Use `asyncio_mode = 'auto'` in pytest configuration — no manual `pytest.mark.asyncio` needed on async tests
Applied to files:
tests/unit/persistence/sqlite/conftest.py
📚 Learning: 2026-03-08T22:05:51.783Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.783Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves
Applied to files:
src/ai_company/persistence/config.py
📚 Learning: 2026-03-08T22:05:51.785Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-08T22:05:51.785Z
Learning: Project layout uses src layout (`src/ai_company/`) with tests in `tests/` (unit/integration/e2e); see DESIGN_SPEC.md for full architecture
Applied to files:
DESIGN_SPEC.md
🧬 Code graph analysis (9)
tests/unit/persistence/sqlite/test_backend.py (3)
- src/ai_company/persistence/config.py (1): `SQLiteConfig` (19-68)
- src/ai_company/persistence/errors.py (1): `PersistenceConnectionError` (12-13)
- src/ai_company/persistence/protocol.py (10): `PersistenceBackend` (17-87), `is_connected` (65-67), `connect` (32-39), `disconnect` (41-46), `health_check` (48-54), `backend_name` (70-72), `migrate` (56-62), `tasks` (75-77), `cost_records` (80-82), `messages` (85-87)
src/ai_company/persistence/factory.py (5)
- src/ai_company/observability/_logger.py (1): `get_logger` (8-28)
- src/ai_company/persistence/config.py (1): `PersistenceConfig` (71-109)
- src/ai_company/persistence/errors.py (1): `PersistenceConnectionError` (12-13)
- src/ai_company/persistence/protocol.py (1): `PersistenceBackend` (17-87)
- src/ai_company/persistence/sqlite/backend.py (1): `SQLitePersistenceBackend` (34-213)
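The factory wiring referenced above can be sketched as follows. This is a simplified stand-in: the real `create_backend()` takes a `PersistenceConfig`, and the protocol here is trimmed to one property; the point is that callers receive the protocol type, never `SQLitePersistenceBackend` directly:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class PersistenceBackend(Protocol):
    """Trimmed protocol surface for this sketch."""

    @property
    def backend_name(self) -> str: ...


class SQLitePersistenceBackend:
    """Concrete backend; hidden behind the factory's protocol return type."""

    @property
    def backend_name(self) -> str:
        return "sqlite"


def create_backend(backend: str) -> PersistenceBackend:
    """Return a backend by name; unknown names fail loudly at the boundary."""
    if backend == "sqlite":
        return SQLitePersistenceBackend()
    raise ValueError(f"unsupported persistence backend: {backend!r}")
```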
tests/unit/persistence/test_errors.py (1)
src/ai_company/persistence/errors.py (6)
`DuplicateRecordError` (29-30), `MigrationError` (16-17), `PersistenceConnectionError` (12-13), `PersistenceError` (8-9), `QueryError` (33-34), `RecordNotFoundError` (20-26)
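The error hierarchy listed above can be expressed directly (a sketch; the real docstrings are fuller):

```python
class PersistenceError(Exception):
    """Base class for all persistence-layer failures."""


class PersistenceConnectionError(PersistenceError):
    """Connecting to or disconnecting from the backend failed."""


class MigrationError(PersistenceError):
    """A schema migration could not be applied."""


class RecordNotFoundError(PersistenceError):
    """A requested record does not exist."""


class DuplicateRecordError(PersistenceError):
    """An insert would violate a uniqueness constraint."""


class QueryError(PersistenceError):
    """A query received invalid arguments or hit corrupt data."""
```

Because every subclass derives from `PersistenceError`, callers can catch the base class at module boundaries while tests assert on the specific subtypes.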
tests/unit/persistence/sqlite/test_repositories.py (10)
- src/ai_company/budget/cost_record.py (1): `CostRecord` (15-56)
- src/ai_company/communication/message.py (2): `Message` (88-138), `MessageMetadata` (33-85)
- src/ai_company/core/enums.py (5): `ArtifactType` (223-228), `CoordinationTopology` (325-336), `Priority` (205-211), `TaskStatus` (165-191), `TaskType` (194-202)
- src/ai_company/persistence/errors.py (2): `DuplicateRecordError` (29-30), `QueryError` (33-34)
- src/ai_company/persistence/sqlite/repositories.py (11): `SQLiteCostRecordRepository` (245-357), `SQLiteMessageRepository` (360-489), `save` (69-128), `save` (255-284), `save` (370-425), `get` (168-188), `list_tasks` (190-223), `delete` (225-242), `query` (286-323), `aggregate` (325-357), `get_history` (451-489)
- tests/unit/persistence/conftest.py (2): `make_message` (58-84), `make_task` (25-55)
- src/ai_company/persistence/repositories.py (12): `save` (19-28), `save` (85-94), `save` (135-145), `get` (30-42), `list_tasks` (44-64), `delete` (66-78), `query` (96-114), `aggregate` (116-128), `get_history` (147-165), `TaskRepository` (16-78), `CostRecordRepository` (82-128), `MessageRepository` (132-165)
- src/ai_company/persistence/protocol.py (1): `tasks` (75-77)
- src/ai_company/core/artifact.py (1): `ExpectedArtifact` (13-28)
- src/ai_company/communication/enums.py (1): `AttachmentType` (50-61)
src/ai_company/persistence/sqlite/migrations.py (2)
- src/ai_company/observability/_logger.py (1): `get_logger` (8-28)
- src/ai_company/persistence/errors.py (1): `MigrationError` (16-17)
tests/unit/persistence/test_config.py (1)
src/ai_company/persistence/config.py (2)
`PersistenceConfig` (71-109), `SQLiteConfig` (19-68)
tests/unit/persistence/sqlite/conftest.py (1)
src/ai_company/persistence/sqlite/migrations.py (1)
`run_migrations` (136-200)
src/ai_company/persistence/protocol.py (3)
- src/ai_company/persistence/repositories.py (3): `CostRecordRepository` (82-128), `MessageRepository` (132-165), `TaskRepository` (16-78)
- src/ai_company/persistence/sqlite/backend.py (9): `connect` (59-109), `disconnect` (111-128), `health_check` (130-147), `migrate` (149-160), `is_connected` (163-165), `backend_name` (168-170), `tasks` (189-195), `cost_records` (198-204), `messages` (207-213)
- tests/unit/persistence/test_protocol.py (9): `connect` (71-72), `disconnect` (74-75), `health_check` (77-78), `migrate` (80-81), `is_connected` (84-85), `backend_name` (88-89), `tasks` (92-93), `cost_records` (96-97), `messages` (100-101)
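The lifecycle surface listed above can be sketched as a runtime-checkable protocol (a hypothetical minimal subset; the real protocol also exposes the repository properties and documents error semantics per method):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class PersistenceBackend(Protocol):
    """Sketch of the backend lifecycle surface (repository properties omitted)."""

    async def connect(self) -> None: ...
    async def disconnect(self) -> None: ...
    async def health_check(self) -> bool: ...
    async def migrate(self) -> None: ...

    @property
    def is_connected(self) -> bool: ...
    @property
    def backend_name(self) -> str: ...
```

Structural `isinstance()` checks against this protocol are how the compliance tests above verify a backend without importing its concrete class.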
src/ai_company/persistence/config.py (2)
- src/ai_company/observability/_logger.py (1): `get_logger` (8-28)
- src/ai_company/tools/base.py (1): `description` (115-117)
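The frozen-config pattern behind these entries can be sketched with Pydantic v2 (the field names and defaults here are assumptions for illustration, not the real `SQLiteConfig`):

```python
from pydantic import BaseModel, ConfigDict, Field


class SQLiteConfigSketch(BaseModel):
    """Frozen config model: any attribute assignment after construction raises."""

    model_config = ConfigDict(frozen=True)

    path: str = Field(default="data/app.db", min_length=1)  # hypothetical default
    busy_timeout_ms: int = Field(default=5000, ge=0)  # hypothetical default
```

Freezing pushes all validation to construction time, so a config object that exists is always valid and safe to share across tasks.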
🪛 LanguageTool
README.md
[typographical] ~23-~23: To join two clauses or introduce examples, consider using an em dash.
Context: ...ity presets - Persistence Layer (M5) - Pluggable PersistenceBackend protocol ...
(DASH_RULE)
🔇 Additional comments (18)
tests/unit/persistence/sqlite/conftest.py (1)
14-34: LGTM! Both fixtures now correctly wrap their bodies in `try/finally` blocks, ensuring `db.close()` is always called even if `run_migrations()` or other setup code raises. This addresses the earlier review feedback about guaranteeing fixture teardown.

tests/integration/persistence/test_sqlite_integration.py (1)
13-116: LGTM! The integration tests comprehensively cover on-disk SQLite persistence including WAL mode, data persistence across reconnect, multi-entity types, and concurrent reads. The `read_all()` helper no longer includes the redundant `migrate()` call, addressing the previous review feedback.

tests/unit/persistence/test_config.py (1)
9-86: LGTM! Comprehensive test coverage for persistence configuration models including validation edge cases (path traversal, negative values, blank strings), immutability enforcement, and backend validation. The tests properly use `@pytest.mark.unit` markers.

src/ai_company/persistence/sqlite/migrations.py (4)
1-27: LGTM! Module structure is well-organized with proper imports, structured logging setup via `get_logger(__name__)`, and event constants from the persistence events module. The `SCHEMA_VERSION` constant provides a clear single source of truth for the current schema version.
28-89: LGTM! The V1 schema definitions are comprehensive with appropriate table structures, indexes for common query patterns (status, assigned_to, project, agent_id, task_id, channel, timestamp), and proper use of SQLite features like `AUTOINCREMENT` for cost_records rowid.
94-121: LGTM! The `get_user_version` and `set_user_version` functions properly handle PRAGMA operations with appropriate validation. The version validation in `set_user_version` (line 111) correctly rejects non-integer and negative values, with proper error logging before raising `MigrationError`.
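The pattern these helpers implement can be sketched with the stdlib driver (the real code is async via aiosqlite and logs before raising; this synchronous version keeps the example self-contained):

```python
import sqlite3


class MigrationError(Exception):
    """Raised when a schema version is invalid or a migration fails."""


def get_user_version(db: sqlite3.Connection) -> int:
    """Read the schema version stored in SQLite's user_version pragma."""
    (version,) = db.execute("PRAGMA user_version").fetchone()
    return int(version)


def set_user_version(db: sqlite3.Connection, version: int) -> None:
    """Write the schema version; PRAGMA cannot be parameterized, so validate first."""
    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(version, int) or isinstance(version, bool) or version < 0:
        raise MigrationError(f"invalid schema version: {version!r}")
    db.execute(f"PRAGMA user_version = {version}")
```

Validating before interpolation matters because `PRAGMA user_version = ?` placeholders are not supported, so the value ends up in the SQL string.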
136-200: LGTM! The `run_migrations` function implements a robust migration workflow with:
- Version check before applying migrations
- Ordered migration application
- Explicit transaction commit on success
- Rollback on failure with secondary error handling
- Proper exception wrapping and logging
The error handling correctly distinguishes between `MigrationError` (re-raised directly) and other database errors (wrapped with context).

tests/unit/persistence/sqlite/test_backend.py (1)

19-164: LGTM! Comprehensive test coverage for `SQLitePersistenceBackend` including:
- Connection lifecycle (connect, disconnect, idempotent behavior)
- Health check scenarios (connected, disconnected, error cases)
- Migration error handling (not connected)
- Repository access guards (before connect)
- WAL mode configuration
- Error path cleanup (PRAGMA failures, close failures)
- Protocol compliance verification
The use of `AsyncMock` for simulating database errors is appropriate for unit testing error handling paths.

tests/unit/persistence/sqlite/test_repositories.py (4)
38-197: LGTM! `TestSQLiteTaskRepository` provides excellent coverage including:
- Basic CRUD operations (save, get, delete)
- Upsert behavior for existing records
- List operations with various filter combinations
- Round-trip verification for complex nested models (artifacts, acceptance criteria, delegation chain, topology)
202-315: LGTM! `TestSQLiteCostRecordRepository` covers:
- Append-only save and query operations
- Filtering by agent_id and task_id (individual and combined)
- Aggregation with and without filters
- Edge case for empty aggregation returning 0.0
- Round-trip of optional `call_category` field (including null)

The `_make_record` helper appropriately uses `"test-provider"` and `"test-model-001"` per coding guidelines.
320-453: LGTM! `TestSQLiteMessageRepository` comprehensively tests:
- Save and history retrieval
- Ordering (newest first)
- Limit enforcement and validation
- Channel filtering
- Duplicate rejection via `DuplicateRecordError`
- Round-trip of sender alias (`from` field)
- Complex nested fields (attachments, metadata)
- UUID identity preservation
455-531: LGTM! Protocol compliance tests verify that SQLite repository implementations satisfy their respective protocol interfaces. Deserialization failure tests validate proper error handling when encountering corrupt JSON data in the database, ensuring `QueryError` is raised with appropriate messages.

src/ai_company/persistence/protocol.py (1)
1-87: LGTM! The `PersistenceBackend` protocol is well-designed with:
- Clear lifecycle methods (`connect`, `disconnect`, `health_check`, `migrate`)
- Property-based repository access (`tasks`, `cost_records`, `messages`)
- `@runtime_checkable` decorator enabling `isinstance()` checks
- Comprehensive docstrings with error semantics documented
- Proper use of `# noqa: TC001` for imports used in runtime type checking

tests/unit/persistence/sqlite/test_migrations.py (2)
21-43: LGTM! `TestUserVersion` properly validates the PRAGMA user_version operations including:
- Default version is 0
- Set/get round-trip
- Rejection of negative versions
- Rejection of non-integer versions (with appropriate `type: ignore` for static analysis)
45-108: LGTM! `TestRunMigrations` provides thorough coverage:
- Table and index creation verification
- Version setting after migration
- Idempotency (running twice is safe)
- Skip behavior when already at target version
- Failure handling with `MigrationError` wrapping

The patch-based approach for simulating migration failures is clean and effective.
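The migration workflow exercised by these tests (version check, ordered application, commit on success, wrap-and-raise on failure) can be sketched synchronously with the stdlib driver; the real implementation is async via aiosqlite, and the DDL here is a hypothetical single-table stand-in:

```python
import sqlite3

SCHEMA_VERSION = 1  # hypothetical target version
V1_DDL = (  # IF NOT EXISTS guards make re-runs after a crash safe
    "CREATE TABLE IF NOT EXISTS tasks (id TEXT PRIMARY KEY, payload TEXT)",
)


class MigrationError(Exception):
    """Raised when a schema migration fails."""


def run_migrations(db: sqlite3.Connection) -> None:
    """Apply pending migrations, committing on success and rolling back on failure."""
    (current,) = db.execute("PRAGMA user_version").fetchone()
    if current >= SCHEMA_VERSION:
        return  # already at (or past) the target version: no-op, idempotent
    try:
        for statement in V1_DDL:
            db.execute(statement)
        db.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")
        db.commit()
    except sqlite3.Error as exc:
        db.rollback()  # best-effort: some DDL auto-commits in SQLite
        raise MigrationError("migration failed") from exc
```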
src/ai_company/persistence/config.py (1)
19-109: Config boundary looks solid. Frozen models, boundary validation, and warning-before-raise logging are a good fit for this surface. Based on learnings, "Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves".

src/ai_company/observability/events/persistence.py (1)
10-62: Nice centralization of persistence event names. This keeps logger call sites consistent and grepable across the new persistence surface. Based on learnings, "Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly instead of using string literals".

src/ai_company/persistence/sqlite/backend.py (1)
88-97: The existing exception handlers are already compliant with PEP 758. When an exception is bound with `as exc`, parentheses are required per PEP 758. The comma-separated form without parentheses applies only to bare `except` clauses without the `as` target. All occurrences at lines 88, 97, 120, and 138 use the correct syntax. No changes needed.
> Likely an incorrect or invalid review comment.
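The distinction can be shown concretely (the bare `except A, B:` form requires Python 3.14, so it is left as a comment here; `close_quietly` is an illustrative helper, not project code):

```python
import sqlite3


def close_quietly(close) -> str:
    """Return the failure class name, or 'ok' when close() succeeds."""
    try:
        close()
    except (sqlite3.Error, OSError) as exc:  # parentheses required when binding `as exc`
        return type(exc).__name__
    # On Python 3.14+ (PEP 758), a clause WITHOUT an `as` target may drop them:
    #     except sqlite3.Error, OSError:
    #         ...
    return "ok"
```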
logger.info(PERSISTENCE_BACKEND_DISCONNECTING, path=self._config.path)
try:
    await self._db.close()
    logger.info(PERSISTENCE_BACKEND_DISCONNECTED, path=self._config.path)
except (sqlite3.Error, OSError) as exc:
    logger.warning(
        PERSISTENCE_BACKEND_DISCONNECT_ERROR,
        path=self._config.path,
        error=str(exc),
        error_type=type(exc).__name__,
    )
finally:
    self._clear_state()
There was a problem hiding this comment.
Don't report a failed close as a clean disconnect.
disconnect() logs the error and then clears _db in finally, so callers cannot tell that the close failed and the backend loses the only handle it had for retry/inspection. Raise PersistenceConnectionError after logging, and only clear state once the close succeeds.
🔧 Suggested fix
 logger.info(PERSISTENCE_BACKEND_DISCONNECTING, path=self._config.path)
 try:
     await self._db.close()
     logger.info(PERSISTENCE_BACKEND_DISCONNECTED, path=self._config.path)
+    self._clear_state()
 except (sqlite3.Error, OSError) as exc:
     logger.warning(
         PERSISTENCE_BACKEND_DISCONNECT_ERROR,
         path=self._config.path,
         error=str(exc),
         error_type=type(exc).__name__,
     )
-finally:
-    self._clear_state()
+    msg = "Failed to disconnect from persistence backend"
+    raise PersistenceConnectionError(msg) from exc

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
logger.info(PERSISTENCE_BACKEND_DISCONNECTING, path=self._config.path)
try:
    await self._db.close()
    logger.info(PERSISTENCE_BACKEND_DISCONNECTED, path=self._config.path)
    self._clear_state()
except (sqlite3.Error, OSError) as exc:
    logger.warning(
        PERSISTENCE_BACKEND_DISCONNECT_ERROR,
        path=self._config.path,
        error=str(exc),
        error_type=type(exc).__name__,
    )
    msg = "Failed to disconnect from persistence backend"
    raise PersistenceConnectionError(msg) from exc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/persistence/sqlite/backend.py` around lines 116 - 128, The
disconnect() implementation currently logs close failures then clears the DB
state, hiding the error and losing the handle; change disconnect() so that after
catching sqlite3.Error or OSError from await self._db.close() you log the
failure (as now) and then raise PersistenceConnectionError (preserving the
original exception info) instead of proceeding to finally; only call
self._clear_state() after a successful close (i.e., move the _clear_state() call
to the success path following await self._db.close()), ensuring callers can
catch/inspect the raised PersistenceConnectionError and the DB handle remains
until close succeeds.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/dependency-review.yml:
- Around line 28-29: Add a short explanatory comment next to the license list
that includes "LicenseRef-scancode-free-unknown" stating which dependency
(package name and version) in the dependency tree required this ScanCode
fallback allowance; update the workflow's license list comment (the block that
contains CC0-1.0, Python-2.0, LicenseRef-scancode-free-unknown) and/or include
the same note in the PR description so future maintainers can see the rationale
for allowing LicenseRef-scancode-free-unknown.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 89273b71-3bdd-40a3-84ed-a819f4ac8aac
📒 Files selected for processing (1)
.github/workflows/dependency-review.yml
…eptile
- Add asyncio.Lock for backend lifecycle serialization (CodeRabbit)
- Use NotBlankStr for repository protocol identifier params (CodeRabbit)
- Move TaskStatus to runtime import for PEP 649 compat (CodeRabbit)
- Add IF NOT EXISTS guards to migration DDL for crash recovery (Greptile)
- Fix migration docstring to be honest about DDL auto-commit (Greptile)
- Add task_id filter to CostRecordRepository.aggregate() (Greptile)
- Add PERSISTENCE_BACKEND_NOT_CONNECTED event, stop reusing CONNECTION_FAILED for precondition checks (Greptile)
- Add pytest.mark.timeout(30) to integration tests (Copilot)
- Fix WAL test docstring to match actual assertions (Copilot)
- Narrow DESIGN_SPEC MessageRepository queries to match protocol
- Document aiosqlite license exception in CI workflow
Summary
- `PersistenceBackend` protocol and SQLite as the initial backend
- Repository protocols (`TaskRepository`, `CostRecordRepository`, `MessageRepository`) with full async CRUD operations
- SQLite backend via `aiosqlite` with WAL mode, schema migrations via `user_version` PRAGMA, TOCTOU-safe duplicate detection, and named column access
- Frozen config models (`PersistenceConfig`, `SQLiteConfig`) with path resolution and validation
- Error hierarchy: `PersistenceError` → `PersistenceConnectionError`, `MigrationError`, `RecordNotFoundError`, `DuplicateRecordError`, `QueryError`
- Factory (`create_backend()`) returns protocol type, not concrete class
- Event constants in `observability.events.persistence`

Test plan
Review coverage
Pre-reviewed by 10 agents, 42 findings addressed:
Key fixes from review: TOCTOU race condition fix, protocol abstraction leak, half-connected state prevention, error wrapping on all repo methods, named column access, `dict(row)` simplifications, `_clear_state()` and `_require_connected()` helpers.

Closes #36
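The TOCTOU race condition fix mentioned above can be sketched like this: instead of a racy SELECT-then-INSERT existence check, rely on the UNIQUE constraint and translate the driver error (table and column names are illustrative, and the real code is async via aiosqlite):

```python
import sqlite3


class DuplicateRecordError(Exception):
    """Raised when saving a record whose id already exists."""


def save_message(db: sqlite3.Connection, message_id: str, body: str) -> None:
    """TOCTOU-safe insert: the PRIMARY KEY detects duplicates atomically,
    so two concurrent writers cannot both pass a pre-check and insert."""
    try:
        db.execute(
            "INSERT INTO messages (id, body) VALUES (?, ?)", (message_id, body)
        )
        db.commit()
    except sqlite3.IntegrityError as exc:
        raise DuplicateRecordError(message_id) from exc
```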