Conversation
…cy review agent - CLAUDE.md: add immutability strategy (MappingProxyType), config/runtime split, computed_field, NotBlankStr, asyncio.TaskGroup, per-domain event imports - DESIGN_SPEC.md: update §3.1 (config/runtime split), §10.2 (computed_field note), §11.1.1 (tool execution model), §15.2-15.5 (Python 3.14+, project structure, Pydantic conventions table) - pre-pr-review: add docs-consistency agent that ALWAYS runs on every PR - aurelio-review-pr: add docs-consistency agent that ALWAYS runs on every PR Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dependency Review
✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
Scanned files: None
Caution: Review failed. The pull request is closed.
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID:
📒 Files selected for processing (3)
📝 Walkthrough
Adds an always-run docs-consistency agent to two PR-review SKILLs; expands CLAUDE.md with immutability, Pydantic, concurrency, and logging guidance; updates README status/tech-stack; and reworks DESIGN_SPEC.md to separate Agent Config vs Runtime state, introduce NotBlankStr and spending models, and describe planned parallel tool execution semantics.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
Summary of Changes
This pull request addresses documentation drift by updating the project's core documentation, CLAUDE.md and DESIGN_SPEC.md, to reflect the latest M2.5 architectural and coding conventions. It also introduces an automated docs-consistency review agent into the PR review workflow. The agent checks future pull requests so that documentation stays synchronized with the evolving codebase and remains a reliable source of truth for the project.
Code Review
This pull request is a great initiative to update project documentation with M2.5 conventions and add an automated process to prevent future documentation drift. The updates to CLAUDE.md and DESIGN_SPEC.md are comprehensive and clearly document important patterns. The addition of the docs-consistency agent is a valuable step towards maintaining a high-quality source of truth. I have a couple of minor suggestions to improve the clarity of the new agent's instructions.
Note: Security Review has been skipped due to the limited scope of the PR.
2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
For better readability and to align with the document's structure, it would be clearer to list the DESIGN_SPEC.md checklist items in the same order as their section numbers. Section §15.4 should come before §15.5.
Suggested change:
- 2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
- 3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
+ 2. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
+ 3. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
To improve clarity and consistency with the document's layout, I suggest reordering these checklist items to follow the section numbering in DESIGN_SPEC.md. This places §15.4 before §15.5.
Suggested change:
- 2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
- 3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
+ 2. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
+ 3. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
Greptile Summary
This PR brings All five issues flagged in previous review threads have been resolved in this PR:
New content verified accurate against codebase:
One minor gap to follow up on:
Confidence Score: 4/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
PR[Pull Request opened] --> AGENTS[Launch review agents in parallel]
AGENTS --> TYPE[type-design-analyzer\nif types/classes changed]
AGENTS --> LOG[logging-audit\nif any src/*.py changed]
AGENTS --> RES[resilience-audit\nif providers/*.py changed]
AGENTS --> SEC[security-reviewer\nif api/security/tools changed\nor dangerous patterns detected]
AGENTS --> DOCS[docs-consistency\n⬛ ALWAYS — every PR]
DOCS --> CHECK1[Read DESIGN_SPEC.md\nCLAUDE.md, README.md]
CHECK1 --> CHECK2{Compare against\nPR diff + codebase}
CHECK2 --> FLAG1[§15.3 project structure\nmatch? CRITICAL]
CHECK2 --> FLAG2[CLAUDE.md conventions\nmatch code? CRITICAL]
CHECK2 --> FLAG3[Event import paths\naccurate? CRITICAL]
CHECK2 --> FLAG4[Tech stack / versions\ncurrent? MAJOR]
CHECK2 --> FLAG5[README instructions\nstill valid? MAJOR]
FLAG1 & FLAG2 & FLAG3 & FLAG4 & FLAG5 --> REPORT[Flag drift findings\nin review]
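
The dispatch rule in the flowchart (conditional agents keyed on changed paths, plus a docs-consistency agent that always runs) can be sketched in a few lines of Python. This is only an illustration: the `CONDITIONAL_AGENTS` table, the glob patterns, and the `select_agents` helper are assumptions made for the example, not the actual SKILL.md logic.

```python
from fnmatch import fnmatch

# Hypothetical trigger table: agent name -> glob patterns that activate it.
# The patterns are guesses derived from the flowchart labels.
# Note: fnmatch's "*" also matches "/" so "src/*.py" covers nested modules.
CONDITIONAL_AGENTS = {
    "logging-audit": ["src/*.py"],
    "resilience-audit": ["*providers/*.py"],
}

# Agents that run on every PR regardless of which files changed.
ALWAYS_RUN = ["docs-consistency"]


def select_agents(changed_files: list[str]) -> list[str]:
    """Return the agents to launch for a given list of changed paths."""
    selected = list(ALWAYS_RUN)
    for agent, patterns in CONDITIONAL_AGENTS.items():
        if any(fnmatch(path, pat) for path in changed_files for pat in patterns):
            selected.append(agent)
    return selected
```

With this shape, a docs-only PR still triggers `docs-consistency`, which is the behaviour the flowchart marks as "ALWAYS".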
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
DESIGN_SPEC.md (2)
104-172: ⚠️ Potential issue | 🔴 Critical
CRITICAL: Documentation describes unimplemented two-layer agent model architecture.
The DESIGN_SPEC.md (lines 104-172) documents a split between `AgentConfig` (frozen at hire time) and `AgentRuntimeState` (mutable-via-copy), but the actual implementation combines all fields in a single frozen `AgentIdentity` class (src/ai_company/core/agent.py:213-280).
Specifically:
- Spec promises `AgentRuntimeState` with fields like `current_task_id`, `turn_count`, `accumulated_cost_usd` updated via `model_copy(update=...)`
- Code has `AgentIdentity` with `model_config = ConfigDict(frozen=True)`: completely immutable
- No `AgentRuntimeState` class exists; only `AgentConfig` exists (in `config/schema.py` for YAML loading)
- Runtime state evolution pattern documented in spec is not implemented
Users reading the spec will expect an API for managing mutable agent state that does not exist in code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@DESIGN_SPEC.md` around lines 104 - 172, The spec calls for a frozen config layer and a separately updatable runtime layer but the code only defines a single frozen AgentIdentity; add a distinct Pydantic model AgentRuntimeState (with fields current_task_id, turn_count, accumulated_cost_usd, status, etc.) and keep AgentConfig/AgentIdentity as the immutable ConfigDict-based model; update the agent factory/loader (where AgentConfig is currently created) to populate both AgentConfig and AgentRuntimeState, change any callers that mutate agent state to call runtime.model_copy(update=...) instead of mutating AgentIdentity, and adjust AgentIdentity (or the agent wrapper) to reference both AgentConfig and AgentRuntimeState rather than being fully frozen itself so runtime updates are supported while config remains immutable.
1276-1340: ⚠️ Potential issue | 🔴 Critical
Update DESIGN_SPEC.md or complete missing module implementations to match documented structure.
The documented project structure significantly diverges from the actual codebase:
- engine/, memory/, security/ directories are empty (only `__init__.py` exists), but spec documents multiple implementation files for each
- observability/events is documented as a subdirectory with 8 separate files (config.py, provider.py, routing.py, budget.py, tool.py, task.py, template.py, prompt.py), but actually exists as a single `events.py` file
- Milestone annotations (M3, M4, M7) documented throughout the spec are not present in the actual code
Either update DESIGN_SPEC.md lines 1276-1340 to reflect the actual structure, or implement the missing modules to match the documented specification.
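
A structure check of the kind described above can be approximated mechanically. The sketch below compares a list of documented module paths against the `.py` files actually present under a source root; `find_structure_drift` and its return keys are names invented for this example, and extracting the documented paths from DESIGN_SPEC.md is left out.

```python
from pathlib import Path


def find_structure_drift(root: Path, documented: list[str]) -> dict[str, list[str]]:
    """Compare documented relative paths with the .py files found under root."""
    # Collect every Python file below root as a POSIX-style relative path.
    actual = {p.relative_to(root).as_posix() for p in root.rglob("*.py")}
    documented_set = set(documented)
    return {
        # Paths the spec lists but the repository does not contain.
        "missing_in_code": sorted(documented_set - actual),
        # Files that exist but are not mentioned in the spec.
        "undocumented": sorted(actual - documented_set),
    }
```

Run against the situation in this comment, a documented `observability/events/provider.py` would land in `missing_in_code`, while the real `observability/events.py` would show up as `undocumented`.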
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@DESIGN_SPEC.md` around lines 1276 - 1340, The DESIGN_SPEC.md section describing module layout (lines referencing engine/, memory/, security/, observability/events and files like prompt.py, agent_engine.py, task_engine.py, workflow_engine.py, meeting_engine.py, hr_engine.py, store.py, retrieval.py, consolidation.py, approval.py, secops_agent.py, audit.py, permissions.py and the events.* files) does not match the codebase (many engine/memory/security modules missing and events.py collapsed). Either update DESIGN_SPEC.md to reflect the actual repo structure or implement the missing modules; specifically decide whether to (A) edit the DESIGN_SPEC.md entries under the engine/, memory/, security/ and observability/events sections to list the actual files (replace the multi-file events.* list with events.py and remove milestone annotations), or (B) create the documented modules (add prompt.py, agent_engine.py, task_engine.py, workflow_engine.py, meeting_engine.py, hr_engine.py under engine/, add store.py/retrieval.py/consolidation.py under memory/, add approval.py/secops_agent.py/audit.py/permissions.py under security/, and split observability/events into the eight files config.py/provider.py/routing.py/budget.py/tool.py/task.py/template.py/prompt.py), keeping names exactly as in the spec so references (e.g., agent_engine, task_engine, events.*) resolve.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CLAUDE.md`:
- Around line 69-72: CLAUDE.md currently asserts conventions that the codebase
hasn't adopted (NotBlankStr, `@computed_field`, asyncio.TaskGroup) while other
items (MappingProxyType wrapping, frozen Pydantic config) are accurate; update
the document to remove or defer the NotBlankStr, `@computed_field`, and TaskGroup
recommendations (or mark them as "planned" with a short migration TODO) and keep
the MappingProxyType and frozen model guidelines as-is; reference the exact
symbols NotBlankStr, `@computed_field`, and asyncio.TaskGroup in the change so
readers know which conventions are deferred and optionally add a one-line note
pointing to the work needed to adopt each in future (e.g., replace
Field(min_length=1) usages with NotBlankStr, replace stored derived fields with
`@computed_field`, and migrate parallel code to asyncio.TaskGroup).
- Line 83: Update the documentation sentence that references non-existent
submodules (e.g., "events.provider", "events.budget", etc.) to describe the
actual flat module pattern: state that event constants live directly in the
ai_company.observability.events module and show the typical import style (for
example, importing PROVIDER_CALL_ERROR from ai_company.observability.events).
Replace the list of domain submodules with guidance to use the single flat
module and an example constant name like PROVIDER_CALL_ERROR to make the correct
usage explicit.
In `@DESIGN_SPEC.md`:
- Around line 901-906: The implementation currently runs tool calls
sequentially; update ToolInvoker.invoke_all to run calls concurrently using
asyncio.TaskGroup (or create tasks and await them with structured concurrency),
spawning a task for each call that calls self.invoke(call) and capturing
exceptions so each task returns a ToolResult(is_error=True) on failure without
cancelling sibling tasks; ensure the method still returns a tuple of ToolResult
objects and preserve the docstring/semantics (i.e., individual errors are
captured and do not cancel other invocations).
- Line 109: AgentIdentity currently types name, role, and department as plain
str with a manual `@model_validator` to reject whitespace-only values; change
those field annotations to use NotBlankStr from core.types (update imports to
import NotBlankStr) and remove the manual whitespace-checking `@model_validator`
method (or its checks for those fields) so validation relies on NotBlankStr;
keep other fields unchanged.
---
Outside diff comments:
In `@DESIGN_SPEC.md`:
- Around line 104-172: The spec calls for a frozen config layer and a separately
updatable runtime layer but the code only defines a single frozen AgentIdentity;
add a distinct Pydantic model AgentRuntimeState (with fields current_task_id,
turn_count, accumulated_cost_usd, status, etc.) and keep
AgentConfig/AgentIdentity as the immutable ConfigDict-based model; update the
agent factory/loader (where AgentConfig is currently created) to populate both
AgentConfig and AgentRuntimeState, change any callers that mutate agent state to
call runtime.model_copy(update=...) instead of mutating AgentIdentity, and
adjust AgentIdentity (or the agent wrapper) to reference both AgentConfig and
AgentRuntimeState rather than being fully frozen itself so runtime updates are
supported while config remains immutable.
- Around line 1276-1340: The DESIGN_SPEC.md section describing module layout
(lines referencing engine/, memory/, security/, observability/events and files
like prompt.py, agent_engine.py, task_engine.py, workflow_engine.py,
meeting_engine.py, hr_engine.py, store.py, retrieval.py, consolidation.py,
approval.py, secops_agent.py, audit.py, permissions.py and the events.* files)
does not match the codebase (many engine/memory/security modules missing and
events.py collapsed). Either update DESIGN_SPEC.md to reflect the actual repo
structure or implement the missing modules; specifically decide whether to (A)
edit the DESIGN_SPEC.md entries under the engine/, memory/, security/ and
observability/events sections to list the actual files (replace the multi-file
events.* list with events.py and remove milestone annotations), or (B) create
the documented modules (add prompt.py, agent_engine.py, task_engine.py,
workflow_engine.py, meeting_engine.py, hr_engine.py under engine/, add
store.py/retrieval.py/consolidation.py under memory/, add
approval.py/secops_agent.py/audit.py/permissions.py under security/, and split
observability/events into the eight files
config.py/provider.py/routing.py/budget.py/tool.py/task.py/template.py/prompt.py),
keeping names exactly as in the spec so references (e.g., agent_engine,
task_engine, events.*) resolve.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1c8bb46d-840d-4d12-a81b-cd90ab3499ef
📒 Files selected for processing (4)
.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
CLAUDE.md
DESIGN_SPEC.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
🧠 Learnings (36)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.310Z
Learning: For trivial/docs-only changes, use `/pre-pr-review quick` to skip agents but still run automated checks
📚 Learning: 2026-03-05T19:01:04.310Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.310Z
Learning: For trivial/docs-only changes, use `/pre-pr-review quick` to skip agents but still run automated checks
Applied to files:
.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.
Applied to files:
.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
CLAUDE.md
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`), CI enforces this coverage requirement
Applied to files:
.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Validate all LLM outputs by checking required fields, verifying data types, handling malformed responses, and using Pydantic models for structured data
Applied to files:
.claude/skills/aurelio-review-pr/SKILL.md
DESIGN_SPEC.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit. The CI enforces 100% coverage on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`).
Applied to files:
.claude/skills/aurelio-review-pr/SKILL.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : Core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`) must maintain 100% test coverage
Applied to files:
.claude/skills/aurelio-review-pr/SKILL.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions
Applied to files:
.claude/skills/pre-pr-review/SKILL.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Use structured prompts with clear instructions including role definition, constraints, output format (JSON when needed), and context from story state
Applied to files:
.claude/skills/pre-pr-review/SKILL.md
DESIGN_SPEC.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Line length: 100 characters (enforced by Ruff)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Keep line length to 88 characters (enforced by ruff)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Google-style docstrings on all public classes and functions (enforced by ruff D rules)
Applied to files:
CLAUDE.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to **/*.py : Maintain line length at 100 characters per Ruff formatting rules.
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `ruff check .` to lint code (use `ruff check --fix .` to auto-fix)
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Before each commit, run `ruff format .` to format code, `ruff check .` to lint code (use `ruff check --fix .` to auto-fix), and `pytest` to run tests
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `ruff format .` to format code with Ruff
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Run `ruff check .` and `ruff check --fix .` to lint code before committing
Applied to files:
CLAUDE.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to **/*.py : Never dismiss warnings from `ruff check`, `pytest`, or other tools. Investigate and fix them, even if they appear unrelated to the current task.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 style enforced by ruff on Python 3.14
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic MUST import logger with: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the structured logger instead
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Always name logger variable as `logger` (not `_logger`, not `log`)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use event name constants from `ai_company.observability.events` for logging
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : All functions and methods should have appropriate logging for debugging and traceability. Use `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use structured keyword argument logging: `logger.info(EVENT, key=value)` — never use string formatting like `logger.info('msg %s', val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/**/*.py : All functions and methods should have appropriate logging using `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use logging levels: debug (routine), info (significant), warning (unexpected but recoverable), error (failures)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Log all error paths at WARNING or ERROR level with context before raising exceptions
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions
Applied to files:
CLAUDE.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/agents/**/*.py : Agents must extend `BaseAgent`, use retry logic, and implement configurable timeout via settings.
Applied to files:
DESIGN_SPEC.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Configure appropriate temperature settings based on agent role: Writer (0.9), Editor (0.6), Continuity (0.3), Architect (0.85), Interviewer (0.7)
Applied to files:
DESIGN_SPEC.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : All agents must extend `agents/base.py` with BaseAgent class
Applied to files:
DESIGN_SPEC.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Keep agent logic focused on its specific role - don't mix agent responsibilities (Interviewer, Architect, Writer, Editor, Continuity)
Applied to files:
DESIGN_SPEC.md
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py} : Ollama Integration - all AI agents use Ollama for local LLM serving with default endpoint `http://localhost:11434`
Applied to files:
DESIGN_SPEC.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict` for data models
Applied to files:
DESIGN_SPEC.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use type hints where appropriate. Use Pydantic models for data validation in `src/memory/story_state.py`, dataclasses in `src/settings.py`.
Applied to files:
DESIGN_SPEC.md
🪛 LanguageTool
CLAUDE.md
[typographical] ~70-~70: In American English, use a period after an abbreviation.
Context: ...erializing for persistence). - Config vs runtime state: frozen Pydantic models...
(MISSING_PERIOD_AFTER_ABBREVIATION)
[style] ~70-~70: Consider using the typographical ellipsis character here instead.
Context: ...separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves (e.g....
(ELLIPSIS)
[style] ~83-~83: A comma is missing here.
Context: ...der ai_company.observability.events/ (e.g. events.provider, events.budget, `ev...
(EG_NO_COMMA)
DESIGN_SPEC.md
[style] ~107-~107: Consider using the typographical ellipsis character here instead.
Context: ...n. Represented as Pydantic models using model_copy(update=...) for state transitions — never mutated...
(ELLIPSIS)
[typographical] ~1419-~1419: In American English, use a period after an abbreviation.
Context: ...d shallow (pydantic#7784). | | Config vs runtime split | Frozen models for con...
(MISSING_PERIOD_AFTER_ABBREVIATION)
[style] ~1419-~1419: Consider using the typographical ellipsis character here instead.
Context: ...** | Frozen models for config/identity; model_copy(update=...) for runtime state transitions | Froze...
(ELLIPSIS)
🔇 Additional comments (4)
.claude/skills/aurelio-review-pr/SKILL.md (1)
97-128: Consistent docs-consistency agent across both review skills.
The docs-consistency agent is identically configured in both `pre-pr-review` and `aurelio-review-pr` skills, which provides consistent documentation drift detection across the entire PR lifecycle:
- Pre-PR: catches drift before PR creation
- Post-external-review: catches drift after external reviewers provide feedback
This symmetry is good design.
Minor observation: The prompt is duplicated verbatim in both files (169-200 in pre-pr-review, 99-128 here). If the checks need updates in the future, both files must be updated. Consider extracting to a shared prompt file if this becomes a maintenance burden.
DESIGN_SPEC.md (2)
1412-1425:⚠️ Potential issue | 🟠 MajorPydantic conventions table references unverified patterns.
Lines 1412-1425 document M2.5 Pydantic conventions including:
- MappingProxyType at construction vs deepcopy on access
- Config/runtime split with frozen models
- `@computed_field` for derived values
- NotBlankStr for identifiers
- Shared field groups via base models
- Per-domain event modules
- TaskGroup for parallel tool execution
Several of these are contradicted by earlier findings (agent split, NotBlankStr usage, TaskGroup execution, event organization). This table should reflect actual implementation, not aspirational design.
Consider adding a conventions status column.
Since several documented conventions aren't yet implemented, consider adding a "Status" column to this table (Implemented / Planned / In Progress) to clarify which conventions are currently in use versus planned for future milestones.
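For reference, the first convention in the list above (wrapping once at construction versus copying on every access) can be sketched roughly as follows. This is a minimal illustration only: `DeepcopyTool` and `ProxyTool` are hypothetical names, not classes from the repository, and the real tool base class may look quite different.

```python
import copy
from collections.abc import Mapping
from types import MappingProxyType
from typing import Any


class DeepcopyTool:
    """Hypothetical tool that hands out a fresh deep copy on every access."""

    def __init__(self, schema: dict[str, Any]) -> None:
        self._schema = copy.deepcopy(schema)

    @property
    def parameters_schema(self) -> dict[str, Any]:
        # Callers may mutate the result freely; the stored schema is untouched.
        return copy.deepcopy(self._schema)


class ProxyTool:
    """Hypothetical tool that wraps the schema once, at construction time."""

    def __init__(self, schema: dict[str, Any]) -> None:
        # Copy once so later changes to the caller's dict do not leak in,
        # then expose a read-only view of that private copy.
        self._schema: Mapping[str, Any] = MappingProxyType(copy.deepcopy(schema))

    @property
    def parameters_schema(self) -> Mapping[str, Any]:
        # Same object on every access; top-level writes raise TypeError.
        return self._schema
```

Note that `MappingProxyType` only guards the top level: nested dictionaries inside the schema stay mutable, which is why a deep copy at system boundaries can still be needed.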
⛔ Skipped due to learnings
Learnt from: CR Repo: Aureliolo/ai-company PR: 0 File: CLAUDE.md:0-0 Timestamp: 2026-03-05T19:01:04.309Z Learning: Applies to **/*.py : Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict` for data models
Learnt from: CR Repo: Aureliolo/story-factory PR: 0 File: .github/copilot-instructions.md:0-0 Timestamp: 2026-01-26T08:59:32.818Z Learning: Applies to **/*.py : Use type hints where appropriate. Use Pydantic models for data validation in `src/memory/story_state.py`, dataclasses in `src/settings.py`.
841-848: Remove or correct misleading implementation details from documentation.
The documented patterns do not exist in the codebase:
- `total_tokens` is not a `@computed_field` in `CostRecord`; it has separate `input_tokens` and `output_tokens` fields with no computed total.
- There is no `_SpendingTotals` base class. `PeriodSpending`, `AgentSpending`, and `DepartmentSpending` are independent Pydantic models that duplicate field definitions (`total_cost_usd`, `total_input_tokens`, `total_output_tokens`, `record_count`) rather than inheriting from a shared base.
Update the implementation note to match the actual design, or refactor the code to match the documented pattern if that was the intended design.
> Likely an incorrect or invalid review comment.
.claude/skills/pre-pr-review/SKILL.md (1)
167-200: 🧹 Nitpick | 🔵 Trivial
Docs-consistency agent is well-designed and will catch existing drift.
The new docs-consistency agent with ALWAYS execution is excellent. Its checks directly target the areas where this PR introduces documentation-code mismatches:
Will catch:
- §15.3 Project Structure check → will flag if new modules documented but not created
- §3.1 Agent Identity Card check → will catch the config/runtime split documentation mismatch
- §11.1.1 Tool Execution Model check → will catch the TaskGroup parallel execution claim
- §15.5 Pydantic Model Conventions check → will validate convention adoption
- CLAUDE.md logging section → will catch the events/ reorganization claim
Suggestion: Consider adding this agent to run before merge, not just on PR creation. The current setup runs during `/pre-pr-review`, but the drift introduced by this PR won't be detected until someone creates a future PR that changes code. A pre-merge check would catch documentation-code mismatches before they reach main.
Would you like me to add a pre-merge workflow that runs the docs-consistency check as a CI gate?
⛔ Skipped due to learnings
Learnt from: CR Repo: Aureliolo/ai-company PR: 0 File: CLAUDE.md:0-0 Timestamp: 2026-03-05T19:01:04.310Z Learning: For trivial/docs-only changes, use `/pre-pr-review quick` to skip agents but still run automated checks
Learnt from: CR Repo: Aureliolo/ai-company PR: 0 File: CLAUDE.md:0-0 Timestamp: 2026-03-05T19:01:04.309Z Learning: NEVER create a PR directly — ALWAYS use `/pre-pr-review` to create PRs, which runs automated checks + review agents + fixes before creating the PR
Learnt from: CR Repo: Aureliolo/story-factory PR: 0 File: CLAUDE.md:0-0 Timestamp: 2026-02-26T17:43:50.902Z Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.
Learnt from: CR Repo: Aureliolo/story-factory PR: 0 File: .github/instructions/agents.instructions.md:0-0 Timestamp: 2026-01-24T09:54:45.426Z Learning: Applies to agents/*.py : Use structured prompts with clear instructions including role definition, constraints, output format (JSON when needed), and context from story state
Learnt from: CR Repo: Aureliolo/ai-company PR: 0 File: CLAUDE.md:0-0 Timestamp: 2026-03-05T19:01:04.309Z Learning: Always read `DESIGN_SPEC.md` before implementing any feature or planning any issue
CLAUDE.md
| - **Never** use `import logging` / `logging.getLogger()` / `print()` in application code | ||
| - **Variable name**: always `logger` (not `_logger`, not `log`) | ||
| - **Event names**: always use constants from `ai_company.observability.events` | ||
| - **Event names**: always use constants from the appropriate domain module under `ai_company.observability.events/` (e.g. `events.provider`, `events.budget`, `events.tool`, `events.config`, `events.template`, `events.routing`, `events.task`, `events.prompt`) |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if domain-specific event submodules exist
echo "=== Checking events/ directory structure ==="
ls -la src/ai_company/observability/events/ 2>/dev/null || echo "events/ directory does not exist or is a file"
echo ""
echo "=== Checking if events.py is a file or directory ==="
file src/ai_company/observability/events.py
echo ""
echo "=== Files importing from ai_company.observability.events ==="
rg -l "from ai_company.observability.events import" src/ai_company/ 2>/dev/null | head -20
echo ""
echo "=== Count of import statements ==="
rg "from ai_company.observability.events import" src/ai_company/ 2>/dev/null | wc -l
echo ""
echo "=== Sample imports from flat structure ==="
rg -n "from ai_company.observability.events import" src/ai_company/ 2>/dev/null | head -15
echo ""
echo "=== Check CLAUDE.md line 83 context ==="
sed -n '80,90p' CLAUDE.md
Repository: Aureliolo/ai-company
Length of output: 3530
🏁 Script executed:
#!/bin/bash
echo "=== Content of events.py (first 60 lines) ==="
head -60 src/ai_company/observability/events.py
echo ""
echo "=== Show import pattern in one file ==="
head -30 src/ai_company/providers/base.py | grep -A 10 "from ai_company.observability.events"
Repository: Aureliolo/ai-company
Length of output: 3392
CLAUDE.md line 83 documents a non-existent module structure. The documentation prescribes domain-specific submodules (events.provider, events.budget, events.tool, etc.), but the actual implementation uses a flat module structure (ai_company.observability.events) where all constants are defined at the module level. This is confirmed by all 20 files across the codebase, which correctly import directly from the flat module (e.g., from ai_company.observability.events import PROVIDER_CALL_ERROR). The learnings also explicitly recommend the flat structure.
Update CLAUDE.md line 83 to reflect the actual flat module pattern instead of describing non-existent submodules.
🧰 Tools
🪛 LanguageTool
[style] ~83-~83: A comma is missing here.
Context: ...der ai_company.observability.events/ (e.g. events.provider, events.budget, `ev...
(EG_NO_COMMA)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CLAUDE.md` at line 83, Update the documentation sentence that references
non-existent submodules (e.g., "events.provider", "events.budget", etc.) to
describe the actual flat module pattern: state that event constants live
directly in the ai_company.observability.events module and show the typical
import style (for example, importing PROVIDER_CALL_ERROR from
ai_company.observability.events). Replace the list of domain submodules with
guidance to use the single flat module and an example constant name like
PROVIDER_CALL_ERROR to make the correct usage explicit.
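A drift check of this kind could also be automated. The following is a minimal sketch, not an existing tool in the repository: `find_submodule_event_imports` is a hypothetical helper that flags imports from per-domain submodules such as `events.provider`, on the assumption that only the flat `ai_company.observability.events` module exists.

```python
import re

# Matches `from ai_company.observability.events.<submodule> import ...`
# and captures the submodule name. Flat imports do not match because
# they have no dotted segment after `events`.
_SUBMODULE_IMPORT = re.compile(
    r"^[ \t]*from[ \t]+ai_company\.observability\.events\.(\w+)[ \t]+import\b",
    re.MULTILINE,
)


def find_submodule_event_imports(source: str) -> list[str]:
    """Return the submodule names used in per-domain event imports."""
    return _SUBMODULE_IMPORT.findall(source)


sample = """
from ai_company.observability.events import PROVIDER_CALL_ERROR
from ai_company.observability.events.provider import PROVIDER_CALL_ERROR
"""

# Only the second import targets a (non-existent) per-domain submodule.
flagged = find_submodule_event_imports(sample)
```

Running the helper over each file under `src/ai_company/` and over the code samples in CLAUDE.md would surface any place where the documented import style and the flat module disagree.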
DESIGN_SPEC.md
| - **Config (immutable)**: identity, personality, skills, model preferences, tool permissions, authority. Defined at hire time, changed only by explicit reconfiguration. Represented as frozen Pydantic models. | ||
| - **Runtime state (mutable-via-copy)**: current status, active task, conversation history, execution metrics. Evolves during agent operation. Represented as Pydantic models using `model_copy(update=...)` for state transitions — never mutated in place. | ||
|
|
||
| All non-optional identifier and name fields (id, name, role, department, etc.) use the `NotBlankStr` validated type from `core.types`. |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, check if the file exists and get its structure
echo "=== Checking agent.py file ==="
wc -l src/ai_company/core/agent.py
echo -e "\n=== Looking for AgentIdentity class ==="
rg -n "class AgentIdentity" src/ai_company/core/agent.py
echo -e "\n=== Checking NotBlankStr imports in agent.py ==="
rg -n "NotBlankStr" src/ai_company/core/agent.py
echo -e "\n=== Checking AgentIdentity field definitions ==="
rg -A 30 "class AgentIdentity" src/ai_company/core/agent.py
echo -e "\n=== Checking validators in AgentIdentity (around lines 268-273) ==="
sed -n '260,280p' src/ai_company/core/agent.py
Repository: Aureliolo/ai-company
Length of output: 2454
🏁 Script executed:
#!/bin/bash
# Check NotBlankStr definition in core.types
echo "=== Checking core.types for NotBlankStr ==="
rg -n "NotBlankStr" src/ai_company/core/types.py
echo -e "\n=== Getting full NotBlankStr definition ==="
rg -A 10 "NotBlankStr" src/ai_company/core/types.py
echo -e "\n=== Checking DESIGN_SPEC.md around line 109 ==="
sed -n '105,115p' DESIGN_SPEC.md
Repository: Aureliolo/ai-company
Length of output: 1208
AgentIdentity must use NotBlankStr for identifier and name fields as specified in DESIGN_SPEC.md.
Line 109 of DESIGN_SPEC.md requires: "All non-optional identifier and name fields (id, name, role, department, etc.) use the NotBlankStr validated type from core.types."
However, AgentIdentity (src/ai_company/core/agent.py, line 213) uses plain str with min_length=1 for the name, role, and department fields, and implements a manual @model_validator workaround (lines 274-281) to check for whitespace-only values.
Replace the plain str type with NotBlankStr for these fields, as NotBlankStr is already defined in core.types with the proper validation constraints.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@DESIGN_SPEC.md` at line 109, AgentIdentity currently types name, role, and
department as plain str with a manual `@model_validator` to reject whitespace-only
values; change those field annotations to use NotBlankStr from core.types
(update imports to import NotBlankStr) and remove the manual whitespace-checking
`@model_validator` method (or its checks for those fields) so validation relies on
NotBlankStr; keep other fields unchanged.
| ### 11.1.1 Tool Execution Model | ||
|
|
||
| When the LLM requests multiple tool calls in a single turn, `ToolInvoker.invoke_all` executes them **in parallel** using `asyncio.TaskGroup` for structured concurrency. Individual errors are captured as `ToolResult(is_error=True)` without cancelling sibling invocations. | ||
|
|
||
| Tool parameter schemas (`parameters_schema`) are exposed as read-only via `MappingProxyType` wrapping at construction time — not via `deepcopy` on each access. Deep copies are made only at system boundaries (e.g. when passing arguments to `tool.execute()`). | ||
|
|
🧩 Analysis chain
🏁 Script executed:
cat -n src/ai_company/tools/invoker.py | sed -n '280,320p'
Repository: Aureliolo/ai-company
Length of output: 1071
Tool execution model documentation contradicts implementation.
Lines 903-904 of DESIGN_SPEC.md state: "ToolInvoker.invoke_all executes them in parallel using asyncio.TaskGroup", but the current implementation executes tool calls sequentially using a list comprehension: tuple([await self.invoke(call) for call in tool_calls]) (lines 291-306 in src/ai_company/tools/invoker.py). The method's own docstring confirms this: "Execute multiple tool calls sequentially."
This is a critical behavioral mismatch. The documented parallel execution with TaskGroup semantics (individual errors captured without canceling siblings) is not yet implemented.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@DESIGN_SPEC.md` around lines 901 - 906, The implementation currently runs
tool calls sequentially; update ToolInvoker.invoke_all to run calls concurrently
using asyncio.TaskGroup (or create tasks and await them with structured
concurrency), spawning a task for each call that calls self.invoke(call) and
capturing exceptions so each task returns a ToolResult(is_error=True) on failure
without cancelling sibling tasks; ensure the method still returns a tuple of
ToolResult objects and preserve the docstring/semantics (i.e., individual errors
are captured and do not cancel other invocations).
…le, and Gemini
DESIGN_SPEC.md:
- §3.1: Note config/runtime split as planned (M3), current state is single AgentIdentity
- §10.2: Fix implementation note: total_tokens is stored Field + model_validator, not @computed_field; _SpendingTotals base doesn't exist yet
- §11.1.1: Fix tool execution model: sequential (not parallel), deepcopy (not MappingProxyType) for parameters_schema
- §15.2: Fix Web UI: "Vue 3" (not "React or Vue 3") to match §15.4
- §15.3: Complete overhaul: match actual filesystem, mark planned files with milestone tags, fix all wrong filenames (context.py→correlation.py, cost_tracker.py→tracker.py, litellm_driver.py→drivers/litellm_driver.py), add missing files/dirs
- §15.5: Add Adopted/Planned status column: MappingProxyType and events.py are adopted; @computed_field, NotBlankStr, _SpendingTotals, TaskGroup, config/runtime split are planned
CLAUDE.md:
- Fix event import path: flat ai_company.observability.events module (not per-domain submodules)
- Distinguish adopted vs planned conventions (Models, Async concurrency lines)
- Fix Dependencies section: pydantic is a main dep, not dev group
README.md:
- Update status to M2 complete (was M0 only)
- Update tech stack: LiteLLM is implemented (not planned), add structlog
SKILL.md (both aurelio-review-pr and pre-pr-review):
- Reorder docs-consistency checklist to follow DESIGN_SPEC section numbering
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
DESIGN_SPEC.md (1)
104-172: 🧹 Nitpick | 🔵 Trivial
Documentation's M2/M3 split is accurate; suggestion for clearer labeling remains valid.
Verification confirms: `AgentRuntimeState` doesn't exist, runtime state fields (`current_task_id`, `turn_count`, `accumulated_cost_usd`) are not implemented, and `AgentIdentity` is frozen with only config-layer fields. The documented placement of `status` in the config layer matches the codebase. Line 109's note correctly describes M2 as config-only with M3 runtime state planned.
However, section 3.1 mixes current M2 state with aspirational M3 design (shown commented). Adding explicit labels like "Current (M2):" and "Planned (M3):" throughout the section, not just in line 109, would prevent readers from mistaking planned fields for already implemented ones. The suggestion stands as a valid documentation improvement.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@DESIGN_SPEC.md` around lines 104 - 172, The documentation mixes implemented M2 config fields with planned M3 runtime fields; update DESIGN_SPEC.md to clearly separate current vs planned by adding explicit labels and relocating commented examples: mark the existing config description and AgentIdentity as "Current (M2): config-only — AgentIdentity (frozen Pydantic model)", and mark runtime fields and examples (current_task_id, turn_count, accumulated_cost_usd) under a distinct "Planned (M3): runtime state — AgentRuntimeState (mutable-via-copy)" section so readers cannot confuse planned fields with implemented ones.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/pre-pr-review/SKILL.md:
- Around line 167-200: Add two new checklist entries to the docs-consistency
agent's "DESIGN_SPEC.md" checks: include "§9.2 Provider Configuration — verify
model IDs, provider capability examples, and config/runtime mapping" and "§9.3
LiteLLM Integration — verify integration examples, API surface, and supported
model list"; place these entries alongside the existing DESIGN_SPEC checks in
the Docs-consistency custom prompt (referencing the docs-consistency agent block
and DESIGN_SPEC.md checklist), and mark their severity as MAJOR or HIGH so they
are treated as important drift checks.
---
Outside diff comments:
In `@DESIGN_SPEC.md`:
- Around line 104-172: The documentation mixes implemented M2 config fields with
planned M3 runtime fields; update DESIGN_SPEC.md to clearly separate current vs
planned by adding explicit labels and relocating commented examples: mark the
existing config description and AgentIdentity as "Current (M2): config-only —
AgentIdentity (frozen Pydantic model)", and mark runtime fields and examples
(current_task_id, turn_count, accumulated_cost_usd) under a distinct "Planned
(M3): runtime state — AgentRuntimeState (mutable-via-copy)" section so readers
cannot confuse planned fields with implemented ones.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3d9199a3-ca65-4be7-ba1b-34bb97adcb43
📒 Files selected for processing (5)
.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
CLAUDE.md
DESIGN_SPEC.md
README.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
🧠 Learnings (40)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.310Z
Learning: For trivial/docs-only changes, use `/pre-pr-review quick` to skip agents but still run automated checks
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Always read `DESIGN_SPEC.md` before implementing any feature or planning any issue
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to README.md : Update README.md for significant feature changes
Applied to files:
README.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.
Applied to files:
README.md
CLAUDE.md
.claude/skills/pre-pr-review/SKILL.md
.claude/skills/aurelio-review-pr/SKILL.md
DESIGN_SPEC.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use type hints where appropriate. Use Pydantic models for data validation in `src/memory/story_state.py`, dataclasses in `src/settings.py`.
Applied to files:
CLAUDE.md
DESIGN_SPEC.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use event name constants from `ai_company.observability.events` for logging
Applied to files:
CLAUDE.md
DESIGN_SPEC.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Line length: 100 characters (enforced by Ruff)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Keep line length to 88 characters (enforced by ruff)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Google-style docstrings on all public classes and functions (enforced by ruff D rules)
Applied to files:
CLAUDE.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to **/*.py : Maintain line length at 100 characters per Ruff formatting rules.
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `ruff check .` to lint code (use `ruff check --fix .` to auto-fix)
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `ruff format .` to format code with Ruff
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Before each commit, run `ruff format .` to format code, `ruff check .` to lint code (use `ruff check --fix .` to auto-fix), and `pytest` to run tests
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 style enforced by ruff on Python 3.14
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to **/*.py : Follow PEP 8 style guidelines for Python code
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic MUST import logger with: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the structured logger instead
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Always name logger variable as `logger` (not `_logger`, not `log`)
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : All functions and methods should have appropriate logging for debugging and traceability. Use `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.
Applied to files:
CLAUDE.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/**/*.py : All functions and methods should have appropriate logging using `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use structured keyword argument logging: `logger.info(EVENT, key=value)` — never use string formatting like `logger.info('msg %s', val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use logging levels: debug (routine), info (significant), warning (unexpected but recoverable), error (failures)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Log all error paths at WARNING or ERROR level with context before raising exceptions
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to pyproject.toml : Organize dependencies into groups: `test` (pytest + plugins), `dev` (includes test + ruff, mypy, pre-commit, commitizen, pydantic)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.310Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.310Z
Learning: Install all dependencies with `uv sync` (installs dev group by default)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to pyproject.toml : Pin all dependency versions using `==` in `pyproject.toml`
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to requirements.txt : When adding new dependencies, add them to `requirements.txt` and pin to exact versions (e.g., `nicegui==3.5.0`) - Dependabot handles upgrades
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to pyproject.toml : Minimize external dependencies. When adding new dependencies, add them to `pyproject.toml` with exact version pins.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-05T19:01:04.310Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.310Z
Learning: For trivial/docs-only changes, use `/pre-pr-review quick` to skip agents but still run automated checks
Applied to files:
.claude/skills/pre-pr-review/SKILL.md
.claude/skills/aurelio-review-pr/SKILL.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Always read `DESIGN_SPEC.md` before implementing any feature or planning any issue
Applied to files:
.claude/skills/pre-pr-review/SKILL.md
.claude/skills/aurelio-review-pr/SKILL.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions
Applied to files:
.claude/skills/pre-pr-review/SKILL.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Use structured prompts with clear instructions including role definition, constraints, output format (JSON when needed), and context from story state
Applied to files:
.claude/skills/pre-pr-review/SKILL.md
DESIGN_SPEC.md
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`), CI enforces this coverage requirement
Applied to files:
.claude/skills/pre-pr-review/SKILL.md
.claude/skills/aurelio-review-pr/SKILL.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: When approved deviations occur, update `DESIGN_SPEC.md` to reflect the new reality
Applied to files:
.claude/skills/aurelio-review-pr/SKILL.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Validate all LLM outputs by checking required fields, verifying data types, handling malformed responses, and using Pydantic models for structured data
Applied to files:
.claude/skills/aurelio-review-pr/SKILL.md
DESIGN_SPEC.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit. The CI enforces 100% coverage on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`).
Applied to files:
.claude/skills/aurelio-review-pr/SKILL.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : Core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`) must maintain 100% test coverage
Applied to files:
.claude/skills/aurelio-review-pr/SKILL.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/agents/**/*.py : Agents must extend `BaseAgent`, use retry logic, and implement configurable timeout via settings.
Applied to files:
DESIGN_SPEC.md
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Core modules (`agents/`, `services/`, `workflows/`, `memory/`, `utils/`, `settings.py`) must maintain 100% test coverage; UI components (`ui/`) are excluded from coverage requirements
Applied to files:
DESIGN_SPEC.md
📚 Learning: 2026-03-05T19:01:04.309Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict` for data models
Applied to files:
DESIGN_SPEC.md
🪛 LanguageTool
CLAUDE.md
[typographical] ~70-~70: In American English, use a period after an abbreviation.
Context: ...erializing for persistence). - Config vs runtime state: frozen Pydantic models...
(MISSING_PERIOD_AFTER_ABBREVIATION)
[style] ~70-~70: Consider using the typographical ellipsis character here instead.
Context: ...separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves (e.g....
(ELLIPSIS)
DESIGN_SPEC.md
[style] ~107-~107: Consider using the typographical ellipsis character here instead.
Context: ...n. Represented as Pydantic models using model_copy(update=...) for state transitions — never mutated...
(ELLIPSIS)
[typographical] ~1423-~1423: In American English, use a period after an abbreviation.
Context: ...d shallow (pydantic#7784). | | Config vs runtime split | Planned (M3) | Frozen...
(MISSING_PERIOD_AFTER_ABBREVIATION)
[style] ~1423-~1423: Consider using the typographical ellipsis character here instead.
Context: ...3) | Frozen models for config/identity; model_copy(update=...) for runtime state transitions | Froze...
(ELLIPSIS)
🔇 Additional comments (4)
README.md (1)
26-37: LGTM! Documentation updates accurately reflect project progress. The status update to M2 completion and tech stack clarifications (LiteLLM implemented, structlog addition) align with the broader observability and provider layer work documented in DESIGN_SPEC.md and CLAUDE.md.
.claude/skills/aurelio-review-pr/SKILL.md (1)
97-128: LGTM! Docs-consistency agent correctly added to post-review workflow. The docs-consistency agent is properly duplicated from the pre-pr-review skill, ensuring documentation verification runs in both workflows:
- Pre-PR review (before external reviewers see the PR)
- Post-external review (after addressing external feedback)
This two-phase approach catches docs drift early while also ensuring external feedback didn't introduce new inconsistencies. The ALWAYS condition and identical checklist maintain consistency across both skills.
CLAUDE.md (1)
136-136: No changes needed. The documentation in CLAUDE.md correctly reflects the actual dependency configuration in `pyproject.toml`. The `dev` group includes `test + ruff, mypy, pre-commit, commitizen` as stated, and pydantic is correctly listed only in the main dependencies, not in the dev group.
DESIGN_SPEC.md (1)
1416-1429: ⚠️ Potential issue | 🟠 Major
Correct NotBlankStr adoption status in the conventions table.
The table largely matches reality, but NotBlankStr is mislabeled. It's marked "Planned" with the note that `NotBlankStr` is "defined but models still use `Field(min_length=1)` + manual validators," yet the codebase shows ~50 usages of `NotBlankStr` across 10+ files (templates/schema.py, providers/routing/models.py, config/schema.py, communication/message.py, etc.). This indicates `NotBlankStr` is already adopted, not planned.
All other convention labels verify correctly:
- MappingProxyType "Adopted" ✓ (15 usages across registries and error contexts)
- Event constants "Adopted (flat)" ✓ (130-line single module with domain-scoped naming)
- Config/runtime split "Planned (M3)" ✓ (no RuntimeState implementations found)
- Derived fields "Planned" ✓ (no `@computed_field` usages)
- Parallel tool execution "Planned" ✓ (`ToolInvoker.invoke_all` executes sequentially)
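As a rough illustration of the `MappingProxyType` convention checked in the list above, the sketch below wraps a registry's internal dict in a read-only view. The `ToolRegistrySketch` class and its tool names are hypothetical and are not taken from the codebase; only `types.MappingProxyType` is standard library.

```python
from collections.abc import Mapping
from types import MappingProxyType


class ToolRegistrySketch:
    """Hypothetical registry, for illustration only (not the project's class)."""

    def __init__(self, tools: dict[str, str]) -> None:
        # Copy first: a proxy over the caller's dict would still reflect
        # later mutations made through the original reference.
        self._tools: Mapping[str, str] = MappingProxyType(dict(tools))

    @property
    def tools(self) -> Mapping[str, str]:
        # Callers get a read-only view; item assignment raises TypeError.
        return self._tools


source = {"read_file": "Read a file from the workspace"}
registry = ToolRegistrySketch(source)
source["delete_all"] = "added after construction"  # does not leak into the registry

try:
    registry.tools["write_file"] = "attempted mutation"  # type: ignore[index]
    mutation_rejected = False
except TypeError:
    mutation_rejected = True
```

The proxy is shallow: it blocks changes to the mapping itself, but any mutable values stored inside it would remain mutable.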
⛔ Skipped due to learnings
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict` for data models
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use type hints where appropriate. Use Pydantic models for data validation in `src/memory/story_state.py`, dataclasses in `src/settings.py`.
- DESIGN_SPEC.md §11.1.1: Clarify non-recoverable errors (MemoryError, RecursionError) propagate immediately and abort invoke_all sequence
- DESIGN_SPEC.md §3.1: Add explicit "Current (M2)" / "Planned (M3)" labels to YAML section headers
- Both SKILL.md: Add §9.2 Provider Config and §9.3 LiteLLM Integration to docs-consistency checklist, fix numbering
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 I have created a release *beep* *boop* --- ## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider 
resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * 
implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs 
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, 
contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, 
loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
🤖 I have created a release *beep* *boop* --- ## [0.1.0](v0.0.0...v0.1.0) (2026-03-11) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive 
trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry 
([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into 
observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### 
Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15)) * bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e)) * bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238)) * bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e)) * bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7)) * bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree 
management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Summary
- `@computed_field`, `NotBlankStr`, `asyncio.TaskGroup`, per-domain event imports
- `docs-consistency` agent that runs on every PR regardless of change type; it checks DESIGN_SPEC.md, CLAUDE.md, and README.md against actual codebase state
- `docs-consistency` agent added
Motivation
M2.5 review identified that documentation drift was not caught by any automated process. Code changes could introduce new patterns, conventions, or structural changes without updating the design spec or CLAUDE.md. This PR:
Test plan
- `src/ai_company/` layout
- `/pre-pr-review quick` on a future PR to confirm the agent is invoked