docs: update project docs for M2.5 conventions and add docs-consistency review agent by Aureliolo · Pull Request #114 · Aureliolo/synthorg

Aureliolo · 2026-03-05T18:59:53Z

Summary

CLAUDE.md: Document M2.5 conventions — immutability strategy (MappingProxyType over deepcopy), config/runtime state split, @computed_field, NotBlankStr, asyncio.TaskGroup, per-domain event imports
DESIGN_SPEC.md: Update §3.1 (config/runtime split with YAML example), §10.2 (computed_field + spending base note), §11.1.1 (new: tool execution model), §15.2 (Python 3.14+), §15.3 (actual project structure with observability, tools, budget, engine details), §15.4 (Python version), §15.5 (new: Pydantic Model Conventions table)
pre-pr-review skill: Add docs-consistency agent that runs on every PR regardless of change type — checks DESIGN_SPEC.md, CLAUDE.md, and README.md against actual codebase state
aurelio-review-pr skill: Same docs-consistency agent added
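
The "MappingProxyType over deepcopy" convention named in the first bullet can be illustrated with a minimal, stdlib-only sketch. The helper `freeze_mapping` and the `schema` dict are hypothetical names chosen for illustration; they are not taken from the project.

```python
from types import MappingProxyType
from typing import Any, Mapping


def freeze_mapping(data: dict[str, Any]) -> Mapping[str, Any]:
    """Return a read-only view over a shallow copy of ``data``.

    Copying once at construction decouples the view from the caller's
    dict; readers can then share the view instead of deep-copying it.
    """
    return MappingProxyType(dict(data))


schema = {"type": "object", "required": ["path"]}
frozen = freeze_mapping(schema)

schema["type"] = "array"  # later mutation of the source dict
print(frozen["type"])  # the view still holds the value seen at construction

try:
    frozen["type"] = "string"  # type: ignore[index]
except TypeError as exc:
    print(f"write rejected: {exc}")
```

The protection is shallow: a nested list such as `required` stays mutable unless it is converted (for example to a tuple) before wrapping.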

Motivation

M2.5 review identified that documentation drift was not caught by any automated process. Code changes could introduce new patterns, conventions, or structural changes without updating the design spec or CLAUDE.md. This PR:

Brings docs up to date with all M2.5 decisions
Adds a mandatory review agent to prevent future drift

Test plan

Verify CLAUDE.md conventions match current codebase patterns
Verify DESIGN_SPEC.md §15.3 structure matches src/ai_company/ layout
Verify both skills include docs-consistency in agent roster
Run /pre-pr-review quick on a future PR to confirm the agent is invoked

…cy review agent

- CLAUDE.md: add immutability strategy (MappingProxyType), config/runtime split, computed_field, NotBlankStr, asyncio.TaskGroup, per-domain event imports
- DESIGN_SPEC.md: update §3.1 (config/runtime split), §10.2 (computed_field note), §11.1.1 (tool execution model), §15.2-15.5 (Python 3.14+, project structure, Pydantic conventions table)
- pre-pr-review: add docs-consistency agent that ALWAYS runs on every PR
- aurelio-review-pr: add docs-consistency agent that ALWAYS runs on every PR

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-03-05T19:00:04Z

Dependency Review

✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-03-05T19:00:25Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c5a9b008-fd8e-4d6c-8b11-c172901abcea

📥 Commits

Reviewing files that changed from the base of the PR and between 8559ef3 and 721a69a.

📒 Files selected for processing (3)

.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
DESIGN_SPEC.md

📝 Walkthrough

Summary by CodeRabbit

Chores
- Added an always-on documentation-consistency check to PR reviews to surface doc drift automatically.
Documentation
- Clarified immutability and state-management guidance, async/parallel execution patterns, and logging/observability practices.
- Expanded agent architecture, tool execution model, public API surface, and module organization.
- Updated project status and tech-stack notes in the main README.

Walkthrough

Adds an always-run docs-consistency agent to two PR-review SKILLs; expands CLAUDE.md with immutability, Pydantic, concurrency, and logging guidance; updates README status/tech-stack; and reworks DESIGN_SPEC.md to separate Agent Config vs Runtime state, introduce NotBlankStr and spending models, and describe planned parallel tool execution semantics.

Changes

Cohort / File(s)	Summary
PR review skills `.claude/skills/aurelio-review-pr/SKILL.md`, `.claude/skills/pre-pr-review/SKILL.md`	Added a persistent `docs-consistency` agent entry with `ALWAYS` trigger and inserted full custom prompt blocks (scope: `DESIGN_SPEC.md`, `CLAUDE.md`, `README.md`); prompt text duplicated in Phase 3 narratives.
Development guidance / docs `CLAUDE.md`, `README.md`	Expanded guidance on immutability (MappingProxyType for frozen models), Pydantic v2 conventions (`@computed_field`, NotBlankStr), runtime vs config model patterns, async concurrency recommendation (asyncio.TaskGroup), structured logging/event naming, and small README status/tech-stack edits.
Design spec / public API `DESIGN_SPEC.md`	Reworked public surface: split Agent identity into `AgentConfig` (frozen) and `AgentRuntimeState` (mutable via `model_copy`), added `NotBlankStr`, computed spending fields (`total_tokens`), described ToolInvoker’s planned move to parallel execution with `asyncio.TaskGroup`, and enumerated many new/reshuffled public modules (tools, providers, observability, cost/budget, API/CLI).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Add design specification, license, and project setup #2 — Introduced the original aurelio-review-pr skill and related README/design artifacts that this change extends by adding the always-run docs-consistency agent.
chore: add pre-pr-review skill and update CLAUDE.md #103 — Prior changes to the pre-pr-review skill and CLAUDE.md that this PR further modifies (adds docs-consistency prompts and guidance).
docs: update project docs for M2.5 conventions and add docs-consistency review agent #114 — Makes similar code-level edits: adding an always-on docs-consistency review agent to SKILL.md files and updating DESIGN_SPEC.md / CLAUDE.md / README.md conventions.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main changes: updating documentation for M2.5 conventions and adding a docs-consistency review agent.
Description check	✅ Passed	The description clearly outlines the specific documentation updates (CLAUDE.md, DESIGN_SPEC.md sections) and the addition of the docs-consistency review agent across skills.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


gemini-code-assist · 2026-03-05T19:00:51Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses the critical issue of documentation drift by performing a significant update to the project's core documentation, CLAUDE.md and DESIGN_SPEC.md, to reflect the latest M2.5 architectural and coding conventions. Concurrently, it introduces an automated docs-consistency review agent into the PR review workflow. This agent will continuously monitor future pull requests to ensure that documentation remains synchronized with the evolving codebase, thereby preventing future inconsistencies and maintaining a reliable source of truth for the project.

Highlights

Documentation Updates: CLAUDE.md and DESIGN_SPEC.md have been comprehensively updated to align with M2.5 conventions, covering immutability strategies, config/runtime state separation, Pydantic model usage (e.g., @computed_field, NotBlankStr), asynchronous concurrency with asyncio.TaskGroup, and refined logging event imports.
Automated Docs Consistency Agent: A new docs-consistency review agent has been integrated into both pre-pr-review and aurelio-review-pr skills. This agent is configured to run on every pull request, regardless of change type, to proactively identify and flag any discrepancies between the codebase and key documentation files (DESIGN_SPEC.md, CLAUDE.md, README.md).
Detailed Consistency Checks: The new agent includes specific, prioritized checks for various sections within DESIGN_SPEC.md (e.g., Project Structure, Pydantic Model Conventions, Agent Identity Card, Tool Execution Model) and CLAUDE.md (e.g., Code Conventions, Logging, Resilience), ensuring thorough documentation integrity.

Changelog

.claude/skills/aurelio-review-pr/SKILL.md
- Added the docs-consistency agent to the list of review agents.
- Included detailed instructions and a checklist for the docs-consistency agent, specifying what to check in DESIGN_SPEC.md, CLAUDE.md, and README.md.
.claude/skills/pre-pr-review/SKILL.md
- Added the docs-consistency agent to the list of review agents.
- Included a custom prompt with detailed instructions and a checklist for the docs-consistency agent, specifying what to check in DESIGN_SPEC.md, CLAUDE.md, and README.md.
CLAUDE.md
- Updated the immutability convention to specify MappingProxyType for dict/list fields in frozen Pydantic models and clarified deep-copy usage.
- Added a new convention for splitting config and runtime state using frozen Pydantic models and model_copy(update=...).
- Expanded Pydantic model conventions to include @computed_field for derived values and NotBlankStr for identifier fields.
- Introduced asyncio.TaskGroup as the preferred method for async concurrency.
- Refined the logging convention for event names to use constants from appropriate per-domain modules under ai_company.observability.events/.
DESIGN_SPEC.md
- Updated section §3.1 "Agent Identity Card" to explain the split between immutable config and mutable-via-copy runtime state, and the use of NotBlankStr.
- Modified the YAML example in §3.1 to illustrate the config and runtime state layers.
- Added an implementation note in §10.2 "Cost Tracking" to clarify that total_tokens is a @computed_field and how spending summary models share a base.
- Introduced a new section §11.1.1 "Tool Execution Model" detailing parallel tool execution with asyncio.TaskGroup and MappingProxyType for tool parameter schemas.
- Updated the specified Python version from 3.12+ to 3.14+ in §15.2 "Technology Stack" and §15.4 "Key Design Decisions".
- Revised section §15.3 "Project Structure" with updated comments and new entries for core/types.py, engine/prompt.py, observability/ subdirectories (context, events, sinks), providers/ subdirectories (litellm_driver, routing, models, capabilities, resilience), tools/ subdirectories (base, invoker, errors), and budget/ subdirectories (cost_record, cost_tracker, spending_summary, hierarchy).
- Added a new section §15.5 "Pydantic Model Conventions (M2.5)" with a table summarizing key decisions and rationales for immutability, config/runtime split, derived fields, string validation, shared field groups, event constants, and parallel tool execution.
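
The `@computed_field` convention mentioned for `total_tokens` can be sketched as below, assuming Pydantic v2. The class name `TokenUsage` and its two input fields are hypothetical; only `total_tokens` is named in the changelog, and a later review in this thread lists the convention as planned rather than adopted.

```python
from pydantic import BaseModel, ConfigDict, computed_field


class TokenUsage(BaseModel):
    """Hypothetical usage record; field names are illustrative only."""

    model_config = ConfigDict(frozen=True)

    input_tokens: int
    output_tokens: int

    @computed_field  # derived on access and included in model_dump()
    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


usage = TokenUsage(input_tokens=120, output_tokens=30)
print(usage.total_tokens)
print(usage.model_dump())
```

Deriving the value instead of storing it means the total cannot drift out of sync with its inputs.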

Activity

The pull request was created by Aureliolo to address documentation drift and introduce an automated consistency check.
The author provided a detailed summary, motivation, and test plan for the changes.


¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request is a great initiative to update project documentation with M2.5 conventions and add an automated process to prevent future documentation drift. The updates to CLAUDE.md and DESIGN_SPEC.md are comprehensive and clearly document important patterns. The addition of the docs-consistency agent is a valuable step towards maintaining a high-quality source of truth. I have a couple of minor suggestions to improve the clarity of the new agent's instructions.

Note: Security Review has been skipped due to the limited scope of the PR.

gemini-code-assist · 2026-03-05T19:03:51Z

.claude/skills/aurelio-review-pr/SKILL.md

+2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
+3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)


For better readability and to align with the document's structure, it would be clearer to list the DESIGN_SPEC.md checklist items in the same order as their section numbers. Section §15.4 should come before §15.5.

Suggested change

-2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
-3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
+2. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
+3. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)

gemini-code-assist · 2026-03-05T19:03:51Z

.claude/skills/pre-pr-review/SKILL.md

+2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
+3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)


To improve clarity and consistency with the document's layout, I suggest reordering these checklist items to follow the section numbering in DESIGN_SPEC.md. This places §15.4 before §15.5.

Suggested change

-2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
-3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
+2. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
+3. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)

greptile-apps · 2026-03-05T19:04:41Z

Greptile Summary

This PR brings CLAUDE.md and DESIGN_SPEC.md up to date with the M2.5 review cycle decisions, and adds a mandatory docs-consistency agent to both review skill files so future PRs are automatically checked for documentation drift.

All five issues flagged in previous review threads have been resolved in this PR:

observability/events/ split reverted — events.py is now documented as a flat module, and CLAUDE.md's event import path is corrected to from ai_company.observability.events import EVENT_CONSTANT
observability/context.py → correlation.py rename corrected in §15.3
providers/litellm_driver.py → providers/drivers/litellm_driver.py nesting corrected
budget/cost_tracker.py → budget/tracker.py rename corrected, and missing config.py/enums.py files added
engine/prompt.py label changed from (implemented) to (M3), consistent with only __init__.py existing

New content verified accurate against codebase:

§15.3 project structure matches every file in src/ai_company/ (confirmed by directory inspection)
§11.1.1 tool execution model correctly describes sequential execution and the recoverable/non-recoverable error split (verified against invoker.py)
CLAUDE.md dev-group description correctly drops pydantic — it lives in main dependencies in pyproject.toml, not the dev group
§15.5 Pydantic Model Conventions table accurately distinguishes "Adopted" (MappingProxyType, flat events) from "Planned" (@computed_field, NotBlankStr, shared field groups, parallel tool execution)

One minor gap to follow up on: invoker.py's module-level docstring (line 5) still reads "Never propagates exceptions — always returns a ToolResult", directly contradicting the accurate behavior described in §11.1.1. The class docstring on ToolInvoker is correct, but the module docstring's leading claim is misleading. Updating it in a follow-up (or alongside this PR) would close the loop.

Confidence Score: 4/5

Safe to merge — all previously flagged documentation inaccuracies are resolved, and the new content is verified accurate against the actual codebase.
All five issues from previous review threads are addressed. §15.3, CLAUDE.md conventions, and the new §11.1.1 tool execution model were verified against actual source files and match. The one outstanding item (invoker.py module docstring self-contradiction) is a minor documentation gap in a file not touched by this PR — it does not affect correctness or merge safety.
DESIGN_SPEC.md §11.1.1 — follow up by updating the invoker.py module-level docstring to remove the contradictory "Never propagates exceptions" claim.

Important Files Changed

Filename	Overview
DESIGN_SPEC.md	Major documentation overhaul correctly syncing spec with actual M2.5 codebase state. Previously flagged issues (events/ directory split, context.py naming, litellm_driver.py path, budget module filenames, prompt.py `(implemented)` label, invoke_all non-recoverable behavior) are all addressed. §15.3 project structure matches actual files. One minor gap: §11.1.1 now accurately describes ToolInvoker non-recoverable error propagation, but invoker.py's module-level docstring still self-contradicts ("Never propagates exceptions" vs the Note that says MemoryError/RecursionError are re-raised).
CLAUDE.md	Correctly documents M2.5 conventions: MappingProxyType at construction, config/runtime split, @computed_field, NotBlankStr, asyncio.TaskGroup, and flat event imports from events.py. Dev dependency group fix (pydantic removed) is accurate — pyproject.toml confirms pydantic is a main dependency, not a dev dependency.
.claude/skills/aurelio-review-pr/SKILL.md	Adds docs-consistency agent with full checklist embedded directly after the agent table. Content is functionally identical to the pre-pr-review counterpart. Follows the existing convention in this skill file (no separate "### X custom prompt" section heading), so the structural difference from pre-pr-review is intentional.
.claude/skills/pre-pr-review/SKILL.md	Adds docs-consistency agent in a dedicated "### Docs-consistency custom prompt" section, consistent with how other custom prompts are structured in this skill file (e.g., logging-audit). The agent runs unconditionally on every PR, which is correctly reflected in the trigger condition ("ALWAYS").
README.md	Status update to "M2: Provider Layer complete" accurately reflects the codebase (M0–M2 fully implemented per §15.3 structure). LiteLLM and structlog promoted from "(planned)" to active, consistent with actual dependencies in pyproject.toml.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    PR[Pull Request opened] --> AGENTS[Launch review agents in parallel]

    AGENTS --> TYPE[type-design-analyzer\nif types/classes changed]
    AGENTS --> LOG[logging-audit\nif any src/*.py changed]
    AGENTS --> RES[resilience-audit\nif providers/*.py changed]
    AGENTS --> SEC[security-reviewer\nif api/security/tools changed\nor dangerous patterns detected]
    AGENTS --> DOCS[docs-consistency\n⬛ ALWAYS — every PR]

    DOCS --> CHECK1[Read DESIGN_SPEC.md\nCLAUDE.md, README.md]
    CHECK1 --> CHECK2{Compare against\nPR diff + codebase}
    CHECK2 --> FLAG1[§15.3 project structure\nmatch? CRITICAL]
    CHECK2 --> FLAG2[CLAUDE.md conventions\nmatch code? CRITICAL]
    CHECK2 --> FLAG3[Event import paths\naccurate? CRITICAL]
    CHECK2 --> FLAG4[Tech stack / versions\ncurrent? MAJOR]
    CHECK2 --> FLAG5[README instructions\nstill valid? MAJOR]

    FLAG1 & FLAG2 & FLAG3 & FLAG4 & FLAG5 --> REPORT[Flag drift findings\nin review]
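
The trigger logic in the flowchart can be approximated with a short sketch. The glob patterns and the `select_agents` helper are assumptions made for illustration, not the rules in the skill files; the `type-design-analyzer` trigger and the "dangerous patterns" condition are left out because they depend on diff contents rather than file paths.

```python
from fnmatch import fnmatch

# Hypothetical trigger table loosely mirroring the flowchart.
# Note that fnmatch's "*" also matches path separators.
TRIGGERS: dict[str, tuple[str, ...]] = {
    "logging-audit": ("src/*.py",),
    "resilience-audit": ("*providers/*.py",),
    "security-reviewer": ("*api/*", "*security/*", "*tools/*"),
}
ALWAYS_ON = ("docs-consistency",)  # runs on every PR, whatever changed


def select_agents(changed_files: list[str]) -> list[str]:
    """Return the sorted agent names to launch for a set of changed paths."""
    selected = [
        agent
        for agent, patterns in TRIGGERS.items()
        if any(fnmatch(path, pat) for path in changed_files for pat in patterns)
    ]
    return sorted({*selected, *ALWAYS_ON})


print(select_agents(["README.md"]))
print(select_agents(["src/ai_company/providers/base.py"]))
```

A docs-only change still selects `docs-consistency`, which is the point of the unconditional trigger.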

Comments Outside Diff (1)

DESIGN_SPEC.md, line 182-184 (link)

invoker.py module docstring contradicts §11.1.1

The new §11.1.1 text correctly describes that non-recoverable errors propagate out of invoke_all. However, invoker.py's module-level docstring (line 5) still reads:

"Never propagates exceptions — always returns a ToolResult."

This directly contradicts the accurate description just added here, and will mislead contributors who read only the module docstring. The class-level docstring on ToolInvoker is accurate, but the module docstring is self-contradictory — it later clarifies in a Note: block that MemoryError/RecursionError are re-raised, yet leads with an unconditional "never propagates" claim.

Since this PR explicitly introduces a docs-consistency agent to catch exactly this kind of drift, updating invoker.py's module docstring as part of this PR would be the cleanest resolution:
```
# invoker.py module docstring suggestion
"""Tool invoker — validates and executes tool calls.

Bridges LLM ``ToolCall`` objects with concrete ``BaseTool.execute``
methods. Recoverable errors are returned as ``ToolResult(is_error=True)``;
non-recoverable errors (``MemoryError``, ``RecursionError``) are
re-raised after logging.

Note:
    ``BaseException`` subclasses (``KeyboardInterrupt``, ``SystemExit``,
    ``asyncio.CancelledError``) are NOT caught and will propagate
    normally.
"""
```
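
The behaviour that the suggested docstring describes can be sketched as follows. This is a simplified stand-in, not the project's `invoker.py`: the `ToolResult` dataclass, its `content` field, and the zero-argument tool callables are assumptions made so the example is self-contained.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class ToolResult:
    """Simplified stand-in for the project's ToolResult."""

    content: str
    is_error: bool = False


async def invoke(tool: Callable[[], Awaitable[str]]) -> ToolResult:
    # BaseException subclasses such as asyncio.CancelledError match
    # neither except clause below and therefore propagate normally.
    try:
        return ToolResult(content=await tool())
    except (MemoryError, RecursionError):
        # Non-recoverable: a real invoker would log here, then re-raise.
        raise
    except Exception as exc:
        # Recoverable: report the failure as data instead of raising.
        return ToolResult(content=str(exc), is_error=True)


async def ok() -> str:
    return "done"


async def bad_input() -> str:
    raise ValueError("bad path")


async def exhausted() -> str:
    raise MemoryError


print(asyncio.run(invoke(ok)))
print(asyncio.run(invoke(bad_input)))
```

The order of the `except` clauses matters: `MemoryError` and `RecursionError` are `Exception` subclasses, so they have to be handled before the general clause.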

Last reviewed commit: 721a69a

DESIGN_SPEC.md

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

DESIGN_SPEC.md (2)
104-172: ⚠️ Potential issue | 🔴 Critical

CRITICAL: Documentation describes unimplemented two-layer agent model architecture.

The DESIGN_SPEC.md (lines 104-172) documents a split between AgentConfig (frozen at hire time) and AgentRuntimeState (mutable-via-copy), but the actual implementation combines all fields in a single frozen AgentIdentity class (src/ai_company/core/agent.py:213-280).

Specifically:

Spec promises AgentRuntimeState with fields like current_task_id, turn_count, accumulated_cost_usd updated via model_copy(update=...)

Code has AgentIdentity with model_config = ConfigDict(frozen=True) — completely immutable

No AgentRuntimeState class exists; only AgentConfig exists (in config/schema.py for YAML loading)

Runtime state evolution pattern documented in spec is not implemented

Users reading the spec will expect an API for managing mutable agent state that does not exist in code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DESIGN_SPEC.md` around lines 104 - 172, The spec calls for a frozen config
layer and a separately updatable runtime layer but the code only defines a
single frozen AgentIdentity; add a distinct Pydantic model AgentRuntimeState
(with fields current_task_id, turn_count, accumulated_cost_usd, status, etc.)
and keep AgentConfig/AgentIdentity as the immutable ConfigDict-based model;
update the agent factory/loader (where AgentConfig is currently created) to
populate both AgentConfig and AgentRuntimeState, change any callers that mutate
agent state to call runtime.model_copy(update=...) instead of mutating
AgentIdentity, and adjust AgentIdentity (or the agent wrapper) to reference both
AgentConfig and AgentRuntimeState rather than being fully frozen itself so
runtime updates are supported while config remains immutable.
1276-1340: ⚠️ Potential issue | 🔴 Critical

Update DESIGN_SPEC.md or complete missing module implementations to match documented structure.

The documented project structure significantly diverges from the actual codebase:

engine/, memory/, security/ directories are empty (only __init__.py exists), but spec documents multiple implementation files for each

observability/events is documented as a subdirectory with 8 separate files (config.py, provider.py, routing.py, budget.py, tool.py, task.py, template.py, prompt.py), but actually exists as a single events.py file

Milestone annotations (M3, M4, M7) documented throughout the spec are not present in the actual code

Either update DESIGN_SPEC.md lines 1276-1340 to reflect the actual structure, or implement the missing modules to match the documented specification.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DESIGN_SPEC.md` around lines 1276 - 1340, The DESIGN_SPEC.md section
describing module layout (lines referencing engine/, memory/, security/,
observability/events and files like prompt.py, agent_engine.py, task_engine.py,
workflow_engine.py, meeting_engine.py, hr_engine.py, store.py, retrieval.py,
consolidation.py, approval.py, secops_agent.py, audit.py, permissions.py and the
events.* files) does not match the codebase (many engine/memory/security modules
missing and events.py collapsed). Either update DESIGN_SPEC.md to reflect the
actual repo structure or implement the missing modules; specifically decide
whether to (A) edit the DESIGN_SPEC.md entries under the engine/, memory/,
security/ and observability/events sections to list the actual files (replace
the multi-file events.* list with events.py and remove milestone annotations),
or (B) create the documented modules (add prompt.py, agent_engine.py,
task_engine.py, workflow_engine.py, meeting_engine.py, hr_engine.py under
engine/, add store.py/retrieval.py/consolidation.py under memory/, add
approval.py/secops_agent.py/audit.py/permissions.py under security/, and split
observability/events into the eight files
config.py/provider.py/routing.py/budget.py/tool.py/task.py/template.py/prompt.py),
keeping names exactly as in the spec so references (e.g., agent_engine,
task_engine, events.*) resolve.
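
A structure check of the kind this finding calls for could be approximated as below. The helper `find_structure_drift` and the sample file set are hypothetical; the sketch builds a throwaway directory instead of reading the real repository.

```python
import tempfile
from pathlib import Path


def find_structure_drift(root: Path, documented: set[str]) -> tuple[set[str], set[str]]:
    """Return (documented but missing, present but undocumented) .py paths."""
    actual = {p.relative_to(root).as_posix() for p in root.rglob("*.py")}
    return documented - actual, actual - documented


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "observability").mkdir()
    (root / "observability" / "events.py").touch()
    (root / "engine").mkdir()
    (root / "engine" / "__init__.py").touch()

    # Paths as a spec might list them; engine/prompt.py does not exist yet.
    documented = {"observability/events.py", "engine/__init__.py", "engine/prompt.py"}
    missing, undocumented = find_structure_drift(root, documented)
    print("documented but missing:", sorted(missing))
    print("present but undocumented:", sorted(undocumented))
```

Reporting both directions separates "the spec is ahead of the code" from "the code is ahead of the spec", which is the choice between options (A) and (B) above.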

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CLAUDE.md`:
- Around line 69-72: CLAUDE.md currently asserts conventions that the codebase
hasn't adopted (NotBlankStr, `@computed_field`, asyncio.TaskGroup) while other
items (MappingProxyType wrapping, frozen Pydantic config) are accurate; update
the document to remove or defer the NotBlankStr, `@computed_field`, and TaskGroup
recommendations (or mark them as "planned" with a short migration TODO) and keep
the MappingProxyType and frozen model guidelines as-is; reference the exact
symbols NotBlankStr, `@computed_field`, and asyncio.TaskGroup in the change so
readers know which conventions are deferred and optionally add a one-line note
pointing to the work needed to adopt each in future (e.g., replace
Field(min_length=1) usages with NotBlankStr, replace stored derived fields with
`@computed_field`, and migrate parallel code to asyncio.TaskGroup).
- Line 83: Update the documentation sentence that references non-existent
submodules (e.g., "events.provider", "events.budget", etc.) to describe the
actual flat module pattern: state that event constants live directly in the
ai_company.observability.events module and show the typical import style (for
example, importing PROVIDER_CALL_ERROR from ai_company.observability.events).
Replace the list of domain submodules with guidance to use the single flat
module and an example constant name like PROVIDER_CALL_ERROR to make the correct
usage explicit.

In `@DESIGN_SPEC.md`:
- Around line 901-906: The implementation currently runs tool calls
sequentially; update ToolInvoker.invoke_all to run calls concurrently using
asyncio.TaskGroup (or create tasks and await them with structured concurrency),
spawning a task for each call that calls self.invoke(call) and capturing
exceptions so each task returns a ToolResult(is_error=True) on failure without
cancelling sibling tasks; ensure the method still returns a tuple of ToolResult
objects and preserve the docstring/semantics (i.e., individual errors are
captured and do not cancel other invocations).
- Line 109: AgentIdentity currently types name, role, and department as plain
str with a manual `@model_validator` to reject whitespace-only values; change
those field annotations to use NotBlankStr from core.types (update imports to
import NotBlankStr) and remove the manual whitespace-checking `@model_validator`
method (or its checks for those fields) so validation relies on NotBlankStr;
keep other fields unchanged.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1c8bb46d-840d-4d12-a81b-cd90ab3499ef

📥 Commits

Reviewing files that changed from the base of the PR and between c51068b and b3b1f66.

📒 Files selected for processing (4)

.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
CLAUDE.md
DESIGN_SPEC.md

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Greptile Review

🧰 Additional context used

🧠 Learnings (36)

📚 Learning: 2026-03-05T19:01:04.310Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.310Z
Learning: For trivial/docs-only changes, use `/pre-pr-review quick` to skip agents but still run automated checks

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
CLAUDE.md

📚 Learning: 2026-01-24T16:33:29.354Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`), CI enforces this coverage requirement

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Validate all LLM outputs by checking required fields, verifying data types, handling malformed responses, and using Pydantic models for structured data

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md
DESIGN_SPEC.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit. The CI enforces 100% coverage on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`).

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : Core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`) must maintain 100% test coverage

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions

Applied to files:

.claude/skills/pre-pr-review/SKILL.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Use structured prompts with clear instructions including role definition, constraints, output format (JSON when needed), and context from story state

Applied to files:

.claude/skills/pre-pr-review/SKILL.md
DESIGN_SPEC.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Line length: 100 characters (enforced by Ruff)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Keep line length to 88 characters (enforced by ruff)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Google-style docstrings on all public classes and functions (enforced by ruff D rules)

Applied to files:

CLAUDE.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to **/*.py : Maintain line length at 100 characters per Ruff formatting rules.

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `ruff check .` to lint code (use `ruff check --fix .` to auto-fix)

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T16:33:29.354Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Before each commit, run `ruff format .` to format code, `ruff check .` to lint code (use `ruff check --fix .` to auto-fix), and `pytest` to run tests

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `ruff format .` to format code with Ruff

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T16:33:29.354Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Run `ruff check .` and `ruff check --fix .` to lint code before committing

Applied to files:

CLAUDE.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to **/*.py : Never dismiss warnings from `ruff check`, `pytest`, or other tools. Investigate and fix them, even if they appear unrelated to the current task.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 style enforced by ruff on Python 3.14

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic MUST import logger with: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the structured logger instead

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Always name logger variable as `logger` (not `_logger`, not `log`)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use event name constants from `ai_company.observability.events` for logging

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : All functions and methods should have appropriate logging for debugging and traceability. Use `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use structured keyword argument logging: `logger.info(EVENT, key=value)` — never use string formatting like `logger.info('msg %s', val)`

Applied to files:

CLAUDE.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/**/*.py : All functions and methods should have appropriate logging using `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use logging levels: debug (routine), info (significant), warning (unexpected but recoverable), error (failures)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Log all error paths at WARNING or ERROR level with context before raising exceptions

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions

Applied to files:

CLAUDE.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/agents/**/*.py : Agents must extend `BaseAgent`, use retry logic, and implement configurable timeout via settings.

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Configure appropriate temperature settings based on agent role: Writer (0.9), Editor (0.6), Continuity (0.3), Architect (0.85), Interviewer (0.7)

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : All agents must extend `agents/base.py` with BaseAgent class

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Keep agent logic focused on its specific role - don't mix agent responsibilities (Interviewer, Architect, Writer, Editor, Continuity)

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-01-24T16:33:29.354Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py} : Ollama Integration - all AI agents use Ollama for local LLM serving with default endpoint `http://localhost:11434`

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict` for data models

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use type hints where appropriate. Use Pydantic models for data validation in `src/memory/story_state.py`, dataclasses in `src/settings.py`.

Applied to files:

DESIGN_SPEC.md

🪛 LanguageTool

CLAUDE.md

[typographical] ~70-~70: In American English, use a period after an abbreviation.
Context: ...erializing for persistence). - Config vs runtime state: frozen Pydantic models...

(MISSING_PERIOD_AFTER_ABBREVIATION)

[style] ~70-~70: Consider using the typographical ellipsis character here instead.
Context: ...separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves (e.g....

(ELLIPSIS)

[style] ~83-~83: A comma is missing here.
Context: ...der ai_company.observability.events/ (e.g. events.provider, events.budget, `ev...

(EG_NO_COMMA)

DESIGN_SPEC.md

[style] ~107-~107: Consider using the typographical ellipsis character here instead.
Context: ...n. Represented as Pydantic models using model_copy(update=...) for state transitions — never mutated...

(ELLIPSIS)

[typographical] ~1419-~1419: In American English, use a period after an abbreviation.
Context: ...d shallow (pydantic#7784). | | Config vs runtime split | Frozen models for con...

(MISSING_PERIOD_AFTER_ABBREVIATION)

[style] ~1419-~1419: Consider using the typographical ellipsis character here instead.
Context: ...** | Frozen models for config/identity; model_copy(update=...) for runtime state transitions | Froze...

(ELLIPSIS)

🔇 Additional comments (4)

.claude/skills/aurelio-review-pr/SKILL.md (1)

97-128: Consistent docs-consistency agent across both review skills.

The docs-consistency agent is identically configured in both pre-pr-review and aurelio-review-pr skills, which provides consistent documentation drift detection across the entire PR lifecycle:

Pre-PR: catches drift before PR creation

Post-external-review: catches drift after external reviewers provide feedback

This symmetry is good design.

Minor observation: The prompt is duplicated verbatim in both files (169-200 in pre-pr-review, 99-128 here). If the checks need updates in the future, both files must be updated. Consider extracting to a shared prompt file if this becomes a maintenance burden.
DESIGN_SPEC.md (2)
1412-1425: ⚠️ Potential issue | 🟠 Major

Pydantic conventions table references unverified patterns.

Lines 1412-1425 document M2.5 Pydantic conventions including:

MappingProxyType at construction vs deepcopy on access

Config/runtime split with frozen models

@computed_field for derived values

NotBlankStr for identifiers

Shared field groups via base models

Per-domain event modules

TaskGroup for parallel tool execution

Several of these are contradicted by earlier findings (agent split, NotBlankStr usage, TaskGroup execution, event organization). This table should reflect actual implementation, not aspirational design.

Consider adding a conventions status column.

Since several documented conventions aren't yet implemented, consider adding a "Status" column to this table (Implemented / Planned / In Progress) to clarify which conventions are currently in use versus planned for future milestones.
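
For reference, the first convention in the list above (wrapping in `MappingProxyType` at construction instead of deep-copying on each access) might look like the sketch below. The `ToolSpec` class is a hypothetical stand-in, not the project's tool base class, and the proxy is only shallowly read-only: nested dictionaries can still be mutated through it.

```python
import copy
from types import MappingProxyType
from typing import Any, Mapping


class ToolSpec:
    """Hypothetical holder for a tool's parameter schema."""

    def __init__(self, parameters_schema: dict[str, Any]) -> None:
        # Copy once at construction so later changes to the caller's dict
        # do not leak in, then expose a read-only view of that copy.
        self._schema: Mapping[str, Any] = MappingProxyType(
            copy.deepcopy(parameters_schema)
        )

    @property
    def parameters_schema(self) -> Mapping[str, Any]:
        # No per-access copy: the same proxy object is returned every time.
        return self._schema


source = {"type": "object", "properties": {"path": {"type": "string"}}}
spec = ToolSpec(source)
source["type"] = "array"  # does not affect the stored schema

try:
    spec.parameters_schema["type"] = "array"  # top-level writes are rejected
    write_blocked = False
except TypeError:
    write_blocked = True
```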
⛔ Skipped due to learnings
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict` for data models
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use type hints where appropriate. Use Pydantic models for data validation in `src/memory/story_state.py`, dataclasses in `src/settings.py`.
841-848: Remove or correct misleading implementation details from documentation.

The documented patterns do not exist in the codebase:

total_tokens is not a @computed_field in CostRecord — it has separate input_tokens and output_tokens fields with no computed total.

There is no _SpendingTotals base class. PeriodSpending, AgentSpending, and DepartmentSpending are independent Pydantic models that duplicate field definitions (total_cost_usd, total_input_tokens, total_output_tokens, record_count) rather than inheriting from a shared base.

Update the implementation note to match the actual design, or refactor the code to match the documented pattern if that was the intended design.
> Likely an incorrect or invalid review comment.
.claude/skills/pre-pr-review/SKILL.md (1)
167-200: 🧹 Nitpick | 🔵 Trivial

Docs-consistency agent is well-designed and will catch existing drift.

The new docs-consistency agent with ALWAYS execution is excellent. Its checks directly target the areas where this PR introduces documentation-code mismatches:

Will catch:

§15.3 Project Structure check → will flag if new modules documented but not created

§3.1 Agent Identity Card check → will catch the config/runtime split documentation mismatch

§11.1.1 Tool Execution Model check → will catch the TaskGroup parallel execution claim

§15.5 Pydantic Model Conventions check → will validate convention adoption

CLAUDE.md logging section → will catch the events/ reorganization claim

Suggestion: Consider adding this agent to run before merge, not just on PR creation. The current setup runs during /pre-pr-review, but the drift introduced by this PR won't be detected until someone creates a future PR that changes code. A pre-merge check would catch documentation-code mismatches before they reach main.

Would you like me to add a pre-merge workflow that runs the docs-consistency check as a CI gate?
⛔ Skipped due to learnings
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.310Z
Learning: For trivial/docs-only changes, use `/pre-pr-review quick` to skip agents but still run automated checks
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: NEVER create a PR directly — ALWAYS use `/pre-pr-review` to create PRs, which runs automated checks + review agents + fixes before creating the PR
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Use structured prompts with clear instructions including role definition, constraints, output format (JSON when needed), and context from story state
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Always read `DESIGN_SPEC.md` before implementing any feature or planning any issue

CLAUDE.md

coderabbitai · 2026-03-05T19:06:36Z

CLAUDE.md

 - **Never** use `import logging` / `logging.getLogger()` / `print()` in application code
 - **Variable name**: always `logger` (not `_logger`, not `log`)
- **Event names**: always use constants from `ai_company.observability.events`
+- **Event names**: always use constants from the appropriate domain module under `ai_company.observability.events/` (e.g. `events.provider`, `events.budget`, `events.tool`, `events.config`, `events.template`, `events.routing`, `events.task`, `events.prompt`)


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if domain-specific event submodules exist
echo "=== Checking events/ directory structure ==="
ls -la src/ai_company/observability/events/ 2>/dev/null || echo "events/ directory does not exist or is a file"
echo ""
echo "=== Checking if events.py is a file or directory ==="
file src/ai_company/observability/events.py
echo ""
echo "=== Files importing from ai_company.observability.events ==="
rg -l "from ai_company.observability.events import" src/ai_company/ 2>/dev/null | head -20
echo ""
echo "=== Count of import statements ==="
rg "from ai_company.observability.events import" src/ai_company/ 2>/dev/null | wc -l
echo ""
echo "=== Sample imports from flat structure ==="
rg -n "from ai_company.observability.events import" src/ai_company/ 2>/dev/null | head -15
echo ""
echo "=== Check CLAUDE.md line 83 context ==="
sed -n '80,90p' CLAUDE.md

Repository: Aureliolo/ai-company

Length of output: 3530

🏁 Script executed:

#!/bin/bash
echo "=== Content of events.py (first 60 lines) ==="
head -60 src/ai_company/observability/events.py
echo ""
echo "=== Show import pattern in one file ==="
head -30 src/ai_company/providers/base.py | grep -A 10 "from ai_company.observability.events"

Repository: Aureliolo/ai-company

Length of output: 3392

CLAUDE.md line 83 documents a non-existent module structure. The documentation prescribes domain-specific submodules (events.provider, events.budget, events.tool, etc.), but the actual implementation uses a flat module structure (ai_company.observability.events) where all constants are defined at the module level. This is confirmed by all 20 files across the codebase, which correctly import directly from the flat module (e.g., from ai_company.observability.events import PROVIDER_CALL_ERROR). The learnings also explicitly recommend the flat structure.

Update CLAUDE.md line 83 to reflect the actual flat module pattern instead of describing non-existent submodules.

🧰 Tools

🪛 LanguageTool

[style] ~83-~83: A comma is missing here.
Context: ...der ai_company.observability.events/ (e.g. events.provider, events.budget, `ev...

(EG_NO_COMMA)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@CLAUDE.md` at line 83, Update the documentation sentence that references non-existent submodules (e.g., "events.provider", "events.budget", etc.) to describe the actual flat module pattern: state that event constants live directly in the ai_company.observability.events module and show the typical import style (for example, importing PROVIDER_CALL_ERROR from ai_company.observability.events). Replace the list of domain submodules with guidance to use the single flat module and an example constant name like PROVIDER_CALL_ERROR to make the correct usage explicit.

coderabbitai · 2026-03-05T19:06:36Z

DESIGN_SPEC.md

+- **Config (immutable)**: identity, personality, skills, model preferences, tool permissions, authority. Defined at hire time, changed only by explicit reconfiguration. Represented as frozen Pydantic models.
+- **Runtime state (mutable-via-copy)**: current status, active task, conversation history, execution metrics. Evolves during agent operation. Represented as Pydantic models using `model_copy(update=...)` for state transitions — never mutated in place.
+
+All non-optional identifier and name fields (id, name, role, department, etc.) use the `NotBlankStr` validated type from `core.types`.


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, check if the file exists and get its structure
echo "=== Checking agent.py file ==="
wc -l src/ai_company/core/agent.py
echo -e "\n=== Looking for AgentIdentity class ==="
rg -n "class AgentIdentity" src/ai_company/core/agent.py
echo -e "\n=== Checking NotBlankStr imports in agent.py ==="
rg -n "NotBlankStr" src/ai_company/core/agent.py
echo -e "\n=== Checking AgentIdentity field definitions ==="
rg -A 30 "class AgentIdentity" src/ai_company/core/agent.py
echo -e "\n=== Checking validators in AgentIdentity (around lines 268-273) ==="
sed -n '260,280p' src/ai_company/core/agent.py

Repository: Aureliolo/ai-company

Length of output: 2454

🏁 Script executed:

#!/bin/bash
# Check NotBlankStr definition in core.types
echo "=== Checking core.types for NotBlankStr ==="
rg -n "NotBlankStr" src/ai_company/core/types.py
echo -e "\n=== Getting full NotBlankStr definition ==="
rg -A 10 "NotBlankStr" src/ai_company/core/types.py
echo -e "\n=== Checking DESIGN_SPEC.md around line 109 ==="
sed -n '105,115p' DESIGN_SPEC.md

Repository: Aureliolo/ai-company

Length of output: 1208

AgentIdentity must use NotBlankStr for identifier and name fields as specified in DESIGN_SPEC.md.

Line 109 of DESIGN_SPEC.md requires: "All non-optional identifier and name fields (id, name, role, department, etc.) use the NotBlankStr validated type from core.types."

However, AgentIdentity (src/ai_company/core/agent.py, line 213) uses plain str with min_length=1 for the name, role, and department fields, and implements a manual @model_validator workaround (lines 274-281) to check for whitespace-only values.

Replace the plain str type with NotBlankStr for these fields, as NotBlankStr is already defined in core.types with the proper validation constraints.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@DESIGN_SPEC.md` at line 109, AgentIdentity currently types name, role, and department as plain str with a manual `@model_validator` to reject whitespace-only values; change those field annotations to use NotBlankStr from core.types (update imports to import NotBlankStr) and remove the manual whitespace-checking `@model_validator` method (or its checks for those fields) so validation relies on NotBlankStr; keep other fields unchanged.

coderabbitai · 2026-03-05T19:06:37Z

DESIGN_SPEC.md

+### 11.1.1 Tool Execution Model
+
+When the LLM requests multiple tool calls in a single turn, `ToolInvoker.invoke_all` executes them **in parallel** using `asyncio.TaskGroup` for structured concurrency. Individual errors are captured as `ToolResult(is_error=True)` without cancelling sibling invocations.
+
+Tool parameter schemas (`parameters_schema`) are exposed as read-only via `MappingProxyType` wrapping at construction time — not via `deepcopy` on each access. Deep copies are made only at system boundaries (e.g. when passing arguments to `tool.execute()`).
+


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n src/ai_company/tools/invoker.py | sed -n '280,320p'

Repository: Aureliolo/ai-company

Length of output: 1071

Tool execution model documentation contradicts implementation.

Lines 903-904 of DESIGN_SPEC.md state: "ToolInvoker.invoke_all executes them in parallel using asyncio.TaskGroup", but the current implementation executes tool calls sequentially using a list comprehension: tuple([await self.invoke(call) for call in tool_calls]) (lines 291-306 in src/ai_company/tools/invoker.py). The method's own docstring confirms this: "Execute multiple tool calls sequentially."

This is a critical behavioral mismatch. The documented parallel execution with TaskGroup semantics (individual errors captured without canceling siblings) is not yet implemented.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@DESIGN_SPEC.md` around lines 901 - 906, The implementation currently runs tool calls sequentially; update ToolInvoker.invoke_all to run calls concurrently using asyncio.TaskGroup (or create tasks and await them with structured concurrency), spawning a task for each call that calls self.invoke(call) and capturing exceptions so each task returns a ToolResult(is_error=True) on failure without cancelling sibling tasks; ensure the method still returns a tuple of ToolResult objects and preserve the docstring/semantics (i.e., individual errors are captured and do not cancel other invocations).

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

…le, and Gemini

DESIGN_SPEC.md:
- §3.1: Note config/runtime split as planned (M3), current state is single AgentIdentity
- §10.2: Fix implementation note — total_tokens is stored Field + model_validator, not @computed_field; _SpendingTotals base doesn't exist yet
- §11.1.1: Fix tool execution model — sequential (not parallel), deepcopy (not MappingProxyType) for parameters_schema
- §15.2: Fix Web UI — "Vue 3" (not "React or Vue 3") to match §15.4
- §15.3: Complete overhaul — match actual filesystem, mark planned files with milestone tags, fix all wrong filenames (context.py→correlation.py, cost_tracker.py→tracker.py, litellm_driver.py→drivers/litellm_driver.py), add missing files/dirs
- §15.5: Add Adopted/Planned status column — MappingProxyType and events.py are adopted; @computed_field, NotBlankStr, _SpendingTotals, TaskGroup, config/runtime split are planned

CLAUDE.md:
- Fix event import path — flat ai_company.observability.events module (not per-domain submodules)
- Distinguish adopted vs planned conventions (Models, Async concurrency lines)
- Fix Dependencies section — pydantic is a main dep, not dev group

README.md:
- Update status to M2 complete (was M0 only)
- Update tech stack — LiteLLM is implemented (not planned), add structlog

SKILL.md (both aurelio-review-pr and pre-pr-review):
- Reorder docs-consistency checklist to follow DESIGN_SPEC section numbering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

DESIGN_SPEC.md (1)
104-172: 🧹 Nitpick | 🔵 Trivial

Documentation's M2/M3 split is accurate; suggestion for clearer labeling remains valid.

Verification confirms: AgentRuntimeState doesn't exist, runtime state fields (current_task_id, turn_count, accumulated_cost_usd) are not implemented, and AgentIdentity is frozen with only config-layer fields. The documented placement of status in the config layer matches the codebase. Line 109's note correctly describes M2 as config-only with M3 runtime state planned.

However, section 3.1 mixes current M2 state with aspirational M3 design (shown commented). Adding explicit labels like "Current (M2):" and "Planned (M3):" throughout the section—not just in line 109—would prevent readers from mistaking planned fields as already implemented. The suggestion stands as a valid documentation improvement.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DESIGN_SPEC.md` around lines 104 - 172, The documentation mixes implemented
M2 config fields with planned M3 runtime fields; update DESIGN_SPEC.md to
clearly separate current vs planned by adding explicit labels and relocating
commented examples: mark the existing config description and AgentIdentity as
"Current (M2): config-only — AgentIdentity (frozen Pydantic model)", and mark
runtime fields and examples (current_task_id, turn_count, accumulated_cost_usd)
under a distinct "Planned (M3): runtime state — AgentRuntimeState
(mutable-via-copy)" section so readers cannot confuse planned fields with
implemented ones.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/pre-pr-review/SKILL.md:
- Around line 167-200: Add two new checklist entries to the docs-consistency
agent's "DESIGN_SPEC.md" checks: include "§9.2 Provider Configuration — verify
model IDs, provider capability examples, and config/runtime mapping" and "§9.3
LiteLLM Integration — verify integration examples, API surface, and supported
model list"; place these entries alongside the existing DESIGN_SPEC checks in
the Docs-consistency custom prompt (referencing the docs-consistency agent block
and DESIGN_SPEC.md checklist), and mark their severity as MAJOR or HIGH so they
are treated as important drift checks.


ℹ️ Review info


📥 Commits

Reviewing files that changed from the base of the PR and between b3b1f66 and 8559ef3.

📒 Files selected for processing (5)

.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
CLAUDE.md
DESIGN_SPEC.md
README.md

📜 Review details


🧰 Additional context used

🧠 Learnings (40)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.310Z
Learning: For trivial/docs-only changes, use `/pre-pr-review quick` to skip agents but still run automated checks

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Always read `DESIGN_SPEC.md` before implementing any feature or planning any issue

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to README.md : Update README.md for significant feature changes

Applied to files:

README.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.

Applied to files:

README.md
CLAUDE.md
.claude/skills/pre-pr-review/SKILL.md
.claude/skills/aurelio-review-pr/SKILL.md
DESIGN_SPEC.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use type hints where appropriate. Use Pydantic models for data validation in `src/memory/story_state.py`, dataclasses in `src/settings.py`.

Applied to files:

CLAUDE.md
DESIGN_SPEC.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use event name constants from `ai_company.observability.events` for logging

Applied to files:

CLAUDE.md
DESIGN_SPEC.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Line length: 100 characters (enforced by Ruff)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Keep line length to 88 characters (enforced by ruff)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Google-style docstrings on all public classes and functions (enforced by ruff D rules)

Applied to files:

CLAUDE.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to **/*.py : Maintain line length at 100 characters per Ruff formatting rules.

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `ruff check .` to lint code (use `ruff check --fix .` to auto-fix)

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `ruff format .` to format code with Ruff

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T16:33:29.354Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Before each commit, run `ruff format .` to format code, `ruff check .` to lint code (use `ruff check --fix .` to auto-fix), and `pytest` to run tests

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 style enforced by ruff on Python 3.14

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T16:33:29.354Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to **/*.py : Follow PEP 8 style guidelines for Python code

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic MUST import logger with: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the structured logger instead

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Always name logger variable as `logger` (not `_logger`, not `log`)

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : All functions and methods should have appropriate logging for debugging and traceability. Use `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.

Applied to files:

CLAUDE.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/**/*.py : All functions and methods should have appropriate logging using `logger.debug()` for routine operations, `logger.info()` for significant events, `logger.warning()` for unexpected but recoverable situations, and `logger.error()` for failures.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use structured keyword argument logging: `logger.info(EVENT, key=value)` — never use string formatting like `logger.info('msg %s', val)`

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use logging levels: debug (routine), info (significant), warning (unexpected but recoverable), error (failures)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Use DEBUG level logging for object creation, internal flow, and entry/exit of key functions

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to src/ai_company/**/*.py : Log all error paths at WARNING or ERROR level with context before raising exceptions

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to pyproject.toml : Organize dependencies into groups: `test` (pytest + plugins), `dev` (includes test + ruff, mypy, pre-commit, commitizen, pydantic)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.310Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.310Z
Learning: Install all dependencies with `uv sync` (installs dev group by default)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to pyproject.toml : Pin all dependency versions using `==` in `pyproject.toml`

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-24T16:33:29.354Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to requirements.txt : When adding new dependencies, add them to `requirements.txt` and pin to exact versions (e.g., `nicegui==3.5.0`) - Dependabot handles upgrades

Applied to files:

CLAUDE.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to pyproject.toml : Minimize external dependencies. When adding new dependencies, add them to `pyproject.toml` with exact version pins.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-05T19:01:04.310Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.310Z
Learning: For trivial/docs-only changes, use `/pre-pr-review quick` to skip agents but still run automated checks

Applied to files:

.claude/skills/pre-pr-review/SKILL.md
.claude/skills/aurelio-review-pr/SKILL.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Always read `DESIGN_SPEC.md` before implementing any feature or planning any issue

Applied to files:

.claude/skills/pre-pr-review/SKILL.md
.claude/skills/aurelio-review-pr/SKILL.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Implement graceful error recovery: retry with different prompts if needed, fall back to simpler approaches on failure, and don't fail silently - log and raise appropriate exceptions

Applied to files:

.claude/skills/pre-pr-review/SKILL.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Use structured prompts with clear instructions including role definition, constraints, output format (JSON when needed), and context from story state

Applied to files:

.claude/skills/pre-pr-review/SKILL.md
DESIGN_SPEC.md

📚 Learning: 2026-01-24T16:33:29.354Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`), CI enforces this coverage requirement

Applied to files:

.claude/skills/pre-pr-review/SKILL.md
.claude/skills/aurelio-review-pr/SKILL.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: When approved deviations occur, update `DESIGN_SPEC.md` to reflect the new reality

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md

📚 Learning: 2026-01-24T09:54:45.426Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Validate all LLM outputs by checking required fields, verifying data types, handling malformed responses, and using Pydantic models for structured data

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md
DESIGN_SPEC.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : 100% test coverage is MANDATORY for every commit. The CI enforces 100% coverage on core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`).

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to {src/agents/**/*.py,src/services/**/*.py,src/memory/**/*.py,src/utils/**/*.py,src/settings.py} : Core modules (`src/agents/`, `src/services/`, `src/memory/`, `src/utils/`, `src/settings.py`) must maintain 100% test coverage

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md

📚 Learning: 2026-02-26T17:43:50.902Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to src/agents/**/*.py : Agents must extend `BaseAgent`, use retry logic, and implement configurable timeout via settings.

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-01-24T09:54:56.100Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Core modules (`agents/`, `services/`, `workflows/`, `memory/`, `utils/`, `settings.py`) must maintain 100% test coverage; UI components (`ui/`) are excluded from coverage requirements

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-03-05T19:01:04.309Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict` for data models

Applied to files:

DESIGN_SPEC.md

🪛 LanguageTool

CLAUDE.md

[typographical] ~70-~70: In American English, use a period after an abbreviation.
Context: ...erializing for persistence). - Config vs runtime state: frozen Pydantic models...

(MISSING_PERIOD_AFTER_ABBREVIATION)

[style] ~70-~70: Consider using the typographical ellipsis character here instead.
Context: ...separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves (e.g....

(ELLIPSIS)

DESIGN_SPEC.md

[style] ~107-~107: Consider using the typographical ellipsis character here instead.
Context: ...n. Represented as Pydantic models using model_copy(update=...) for state transitions — never mutated...

(ELLIPSIS)

[typographical] ~1423-~1423: In American English, use a period after an abbreviation.
Context: ...d shallow (pydantic#7784). | | Config vs runtime split | Planned (M3) | Frozen...

(MISSING_PERIOD_AFTER_ABBREVIATION)

[style] ~1423-~1423: Consider using the typographical ellipsis character here instead.
Context: ...3) | Frozen models for config/identity; model_copy(update=...) for runtime state transitions | Froze...

(ELLIPSIS)

🔇 Additional comments (4)

README.md (1)

26-37: LGTM! Documentation updates accurately reflect project progress.

The status update to M2 completion and tech stack clarifications (LiteLLM implemented, structlog addition) align with the broader observability and provider layer work documented in DESIGN_SPEC.md and CLAUDE.md.

.claude/skills/aurelio-review-pr/SKILL.md (1)

97-128: LGTM! Docs-consistency agent correctly added to post-review workflow.

The docs-consistency agent is properly duplicated from the pre-pr-review skill, ensuring documentation verification runs in both workflows:

Pre-PR review (before external reviewers see the PR)

Post-external review (after addressing external feedback)

This two-phase approach catches docs drift early while also ensuring external feedback didn't introduce new inconsistencies. The ALWAYS condition and identical checklist maintain consistency across both skills.

CLAUDE.md (1)

136-136: No changes needed. The documentation in CLAUDE.md correctly reflects the actual dependency configuration in pyproject.toml. The dev group includes test + ruff, mypy, pre-commit, commitizen as stated, and pydantic is correctly listed only in the main dependencies, not in the dev group.

DESIGN_SPEC.md (1)

1416-1429: ⚠️ Potential issue | 🟠 Major

Correct NotBlankStr adoption status in the conventions table.

The table largely matches reality, but NotBlankStr is mislabeled. It's marked "Planned" with the note that NotBlankStr is "defined but models still use Field(min_length=1) + manual validators," yet the codebase shows ~50 usages of NotBlankStr across 10+ files (templates/schema.py, providers/routing/models.py, config/schema.py, communication/message.py, etc.). This indicates NotBlankStr is already adopted, not planned.

All other convention labels verify correctly:

MappingProxyType "Adopted" ✓ (15 usages across registries and error contexts)

Event constants "Adopted (flat)" ✓ (130-line single module with domain-scoped naming)

Config/runtime split "Planned (M3)" ✓ (no RuntimeState implementations found)

Derived fields "Planned" ✓ (no @computed_field usages)

Parallel tool execution "Planned" ✓ (ToolInvoker.invoke_all executes sequentially)
⛔ Skipped due to learnings
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-05T19:01:04.309Z
Learning: Applies to **/*.py : Use Pydantic v2 with `BaseModel`, `model_validator`, and `ConfigDict` for data models
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Use type hints where appropriate. Use Pydantic models for data validation in `src/memory/story_state.py`, dataclasses in `src/settings.py`.



- DESIGN_SPEC.md §11.1.1: Clarify that non-recoverable errors (MemoryError, RecursionError) propagate immediately and abort the invoke_all sequence
- DESIGN_SPEC.md §3.1: Add explicit "Current (M2)" / "Planned (M3)" labels to YAML section headers
- Both SKILL.md: Add §9.2 Provider Config and §9.3 LiteLLM Integration to docs-consistency checklist, fix numbering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
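The error-handling rule described for §11.1.1 could look roughly like the sketch below. It is an assumption-laden illustration, not the project's `ToolInvoker`: the function is synchronous for brevity, and the `(name, callable)` input and `(name, status, value)` result tuples are hypothetical shapes chosen for the example.

```python
from typing import Any, Callable, Iterable

# Errors treated as non-recoverable, per the commit message above.
NON_RECOVERABLE = (MemoryError, RecursionError)


def invoke_all(
    calls: Iterable[tuple[str, Callable[[], Any]]],
) -> list[tuple[str, str, Any]]:
    """Run tool calls one after another (hypothetical signature).

    Ordinary failures are captured as error results so later calls still run.
    Non-recoverable errors are re-raised, which aborts the rest of the sequence.
    """
    results: list[tuple[str, str, Any]] = []
    for name, fn in calls:
        try:
            results.append((name, "ok", fn()))
        except NON_RECOVERABLE:
            raise  # propagate immediately; remaining calls never start
        except Exception as exc:
            results.append((name, "error", str(exc)))
    return results
```

Placing the non-recoverable clause before the generic `except Exception` matters, since both `MemoryError` and `RecursionError` are `Exception` subclasses and would otherwise be swallowed as ordinary tool errors.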

🤖 I have created a release *beep* *boop*

---

## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation: message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions: Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup: PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

🤖 I have created a release *beep* *boop*

---

## [0.1.0](v0.0.0...v0.1.0) (2026-03-11)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15))
* bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e))
* bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238))
* bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e))
* bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7))
* bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 5, 2026 18:59

gemini-code-assist bot reviewed Mar 5, 2026

greptile-apps bot reviewed Mar 5, 2026

DESIGN_SPEC.md (resolved)

DESIGN_SPEC.md (outdated, resolved)

DESIGN_SPEC.md (outdated, resolved)

DESIGN_SPEC.md (outdated, resolved)

DESIGN_SPEC.md (outdated, resolved)

coderabbitai bot reviewed Mar 5, 2026

Copilot AI reviewed Mar 5, 2026

Aureliolo requested a review from Copilot March 5, 2026 19:28

coderabbitai bot reviewed Mar 5, 2026

.claude/skills/pre-pr-review/SKILL.md (resolved)

greptile-apps bot reviewed Mar 5, 2026

DESIGN_SPEC.md (outdated, resolved)

Copilot AI reviewed Mar 5, 2026

Aureliolo merged commit 99766ee into main Mar 5, 2026
10 of 11 checks passed

Aureliolo deleted the docs/m25-conventions-and-doc-consistency branch March 5, 2026 19:46

Aureliolo mentioned this pull request Mar 10, 2026

chore(main): release ai-company 0.1.1 #282

Merged

Aureliolo mentioned this pull request Mar 10, 2026

chore(main): release 0.1.0 #283

Merged

This was referenced Mar 15, 2026

chore(main): release 0.2.4 #431

Merged

chore(main): release 0.2.0 #442

Closed

This was referenced Mar 15, 2026

chore(main): release 0.2.5 #447

Merged

chore(main): release 0.2.0 #460

Closed

chore(main): release 0.2.0 #471

Closed

		2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
		3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
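The kind of comparison the docs-consistency agent is asked to make can be illustrated with a minimal sketch. It assumes the documented sub-package names have already been extracted from the spec into a set; the function name `find_structure_drift` and the returned keys are hypothetical and are not taken from the skill files in this PR.

```python
from pathlib import Path


def find_structure_drift(documented: set[str], package_root: Path) -> dict[str, list[str]]:
    """Compare documented sub-package names with the directories that exist.

    Hypothetical helper: `documented` stands in for names parsed out of a
    project-structure section; `package_root` is the package directory.
    """
    # Only real sub-directories count; skip private and hidden ones
    # such as __pycache__ or .git.
    actual = {
        entry.name
        for entry in package_root.iterdir()
        if entry.is_dir() and not entry.name.startswith(("_", "."))
    }
    return {
        # Named in the docs but absent on disk: the docs are stale.
        "documented_but_missing": sorted(documented - actual),
        # On disk but never mentioned: the docs are incomplete.
        "present_but_undocumented": sorted(actual - documented),
    }
```

Reporting drift in both directions matters here: a stale entry and a missing entry are different findings and would likely be fixed in different places.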

Conversation

Aureliolo commented Mar 5, 2026

Summary

Motivation

Test plan

github-actions bot commented Mar 5, 2026 (edited)

Dependency Review

Scanned Files

coderabbitai bot commented Mar 5, 2026 (edited)

Review failed

Summary by CodeRabbit

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

gemini-code-assist bot commented Mar 5, 2026

Summary of Changes

Highlights

Footnotes

gemini-code-assist bot left a comment

Code Review

gemini-code-assist bot Mar 5, 2026

gemini-code-assist bot Mar 5, 2026

greptile-apps bot commented Mar 5, 2026 (edited)

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

Comments Outside Diff (1)

coderabbitai bot left a comment

coderabbitai bot Mar 5, 2026

coderabbitai bot Mar 5, 2026

coderabbitai bot Mar 5, 2026

Copilot AI left a comment

coderabbitai bot left a comment

Copilot AI left a comment
