docs: address external review feedback with MVP scope and new protocols by Aureliolo · Pull Request #128 · Aureliolo/synthorg

Aureliolo · 2026-03-06T12:29:11Z

Summary

Addresses feedback from 3 external reviews of the design spec. All reviewers flagged scope communication, missing protocols, and over-engineering concerns.

Changes

- §1.4 MVP Definition: explicit M3 scope (what's built vs deferred), "how to read this spec" guide
- §1.5 Configuration Philosophy: progressive disclosure, minimal config example, pluggability note
- §6.7 Graceful Shutdown Protocol: new ShutdownStrategy protocol with 4 strategies (cooperative+timeout default), INTERRUPTED status, Windows compatibility notes, in-flight LLM cost handling
- §6.8 Concurrent Workspace Isolation: new WorkspaceIsolationStrategy protocol with 3 strategies (planner+worktrees default), state coordination table distinguishing framework state vs code/files
- §7.4 Memory backends 2-3 relabeled as "Research Directions (M5+)", not planned implementations
- §11.3 Progressive Trust: "Disabled" marked as default, security invariant (human gate for elevated access) made prominent
- 12 inline MVP markers across §5.6, §5.7, §6.5, §6.6, §6.7, §6.8, §8, §10.3, §10.5, §11.1.2, §11.3, §12.4
- Minimal config examples in §10.4, §12.2
- §15.5 Engineering Conventions: 3 new rows (state coordination, workspace isolation, graceful shutdown)
- Table of Contents updated with new sections

Design Decisions (from external review analysis)

Concern	Decision
Scope communication	§1.4 section + inline MVP markers
Concurrent state coordination	Centralized single-writer (`TaskEngine`)
Graceful shutdown	Cooperative + 30s timeout (pluggable `ShutdownStrategy`)
Concurrent file edits	Planner + git worktrees (pluggable `WorkspaceIsolationStrategy`)
Memory backend labeling	GraphRAG/TKG → "Research Directions"
Progressive trust default	Disabled (static access) for M3-M6
Config complexity	Progressive disclosure philosophy + minimal examples
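
For the "Concurrent file edits" decision, here is a minimal sketch of how a worktree-per-agent layout might be driven from Python. The helper names (`worktree_add_cmd`, `create_worktree`), the `agent/<id>` branch naming, and the `.worktrees/` directory are assumptions made for illustration and do not come from the spec; only the `git worktree add -b <branch> <path> <commit-ish>` invocation is standard git.

```python
import subprocess
from pathlib import Path


def worktree_add_cmd(repo: Path, agent_id: str, base: str = "HEAD") -> list[str]:
    """Build the git command that gives one agent its own checkout and branch."""
    path = repo / ".worktrees" / agent_id   # hypothetical directory layout
    branch = f"agent/{agent_id}"            # hypothetical branch naming scheme
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def create_worktree(repo: Path, agent_id: str) -> None:
    """Run the command; raises CalledProcessError if git rejects it."""
    subprocess.run(worktree_add_cmd(repo, agent_id), check=True)
```

Because each agent writes only inside its own worktree and branch, concurrent edits cannot clobber each other on disk; conflicts surface later at merge time, where a planner or reviewer can resolve them.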

Test plan

- Verify all markdown renders correctly on GitHub
- Verify Table of Contents links work
- Verify inline MVP markers are visible in each section
- Verify §6.7 and §6.8 section numbering is consistent
- Verify §15.5 table rows render correctly

🤖 Generated with Claude Code

… and inline markers Add §1.4 MVP Definition and §1.5 Configuration Philosophy to clearly separate the full vision from what M3 actually builds. Add §6.7 Graceful Shutdown Protocol (ShutdownStrategy protocol) and §6.8 Concurrent Workspace Isolation (WorkspaceIsolationStrategy protocol) addressing reviewer concerns about missing shutdown handling and multi-agent file conflicts. Relabel §7.4 GraphRAG/Temporal KG backends as Research Directions. Make progressive trust default to "disabled" with prominent security invariant. Add 12 inline MVP markers across all deferred feature sections. Add minimal config examples and sensible defaults documentation. Update §15.5 conventions table with state coordination, workspace isolation, and graceful shutdown decisions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-03-06T12:29:21Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

gemini-code-assist · 2026-03-06T12:29:28Z

Summary of Changes

This pull request refines the design specification based on external review feedback, primarily by clarifying the Minimum Viable Product (MVP) scope, introducing new essential protocols for system robustness and multi-agent coordination, and addressing concerns about over-engineering. The changes aim to provide a clearer roadmap for implementation, ensure a more robust system architecture, and improve the overall readability and understanding of the framework's design.

Highlights

- MVP Definition and Configuration Philosophy: New sections were added to explicitly define the M3 MVP scope, detailing what features are built versus deferred, and outlining the progressive disclosure philosophy for configuration with minimal examples.
- Graceful Shutdown Protocol: A new ShutdownStrategy protocol was introduced, including four strategies (cooperative with timeout as default), a new INTERRUPTED task status, Windows compatibility notes, and handling for in-flight LLM call costs during shutdown.
- Concurrent Workspace Isolation: A new WorkspaceIsolationStrategy protocol was defined, offering three strategies (planner + git worktrees as default) for managing concurrent file edits by multiple agents, along with a table clarifying state coordination mechanisms.
- Memory Backend Reclassification: Memory backends previously labeled as 'Future' (GraphRAG and Temporal Knowledge Graph) were reclassified as 'Research Directions (M5+)' to clarify they are not planned implementations for earlier milestones.
- Progressive Trust Clarification: The 'Progressive Trust' section was updated to mark 'Disabled' as the default for M3-M6 and to prominently state the security invariant that human approval is always required for elevated access.
- Inline MVP Markers and Examples: Twelve inline MVP markers were added across various sections to clearly indicate M3 scope, and minimal configuration examples were provided for cost controls and autonomy levels.
- Engineering Conventions Update: The Engineering Conventions table was updated with new rows for state coordination, workspace isolation, and graceful shutdown, reflecting the newly introduced protocols and strategies.
- Table of Contents Enhancement: The Table of Contents was updated to include new sections and sub-sections, providing a more detailed overview of the document's structure.

Changelog

DESIGN_SPEC.md
- Updated the Table of Contents to reflect new sections and sub-sections.
- Added a dedicated 'MVP Definition' section to clearly outline M3 scope and deferred features.
- Introduced a 'Configuration Philosophy' section detailing progressive disclosure and pluggability.
- Added MVP notes to sections on Conflict Resolution, Meeting Protocols, Execution Loops, and Crash Recovery to specify M3 scope.
- Introduced a comprehensive 'Graceful Shutdown Protocol' section, defining strategies and handling for process termination.
- Added a 'Concurrent Workspace Isolation' section, outlining strategies for multi-agent file access and state coordination.
- Renamed 'Backend 2' and 'Backend 3' in Memory & Persistence to 'Research Directions (M5+)' and 'Research' respectively.
- Added MVP notes to sections on HR & Workforce Management, CFO Agent Responsibilities, LLM Call Analytics, Sandbox Backends, and Approval Timeout.
- Included minimal configuration examples for Cost Controls and Autonomy Levels.
- Updated the 'Engineering Conventions' table with new entries for State Coordination, Workspace Isolation, and Graceful Shutdown.
- Clarified the default strategy and security invariant for 'Progressive Trust'.

Activity

No human activity has occurred on this pull request yet.

coderabbitai · 2026-03-06T12:29:44Z

📝 Walkthrough

Summary by CodeRabbit

Release Notes

Documentation
- Updated design specification clarifying which features are included in the MVP release versus future milestones
- Reorganized documentation with explicit milestone definitions (M3–M7+) for planned features
- Added detailed MVP scope across architecture, execution strategies, trust models, and persistence approaches

Walkthrough

DESIGN_SPEC.md undergoes restructuring with renumbered table of contents, introduces formal MVP definitions and scope (sections 1.4–1.5), and adds MVP annotations throughout multiple sections (conflict resolution, execution loop, crash recovery, sandbox backends, trust mechanisms) alongside a new INTERRUPTED terminal task state.

Changes

Cohort / File(s)	Summary
Design Specification Document `DESIGN_SPEC.md`	Table of contents restructured with sequence renumbering; MVP definitions introduced (sections 1.4–1.5) outlining core goals and deferred milestones (M4–M7+). MVP annotations inserted across conflict resolution (5.6), meeting protocol (5.7), execution loop (6.5), crash recovery (6.6), graceful shutdown (6.7), workspace isolation (6.8), memory/persistence (7.x), workforce management (8.x), sandbox backends (11.1.2), progressive trust (11.3), and approval timeout (12.4). New INTERRUPTED terminal task state introduced for shutdown scenarios. Multiple features marked as deferred to post-MVP milestones.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

- Add design specification, license, and project setup #2: Directly modifies DESIGN_SPEC.md by introducing the same MVP scope restructuring and annotations across multiple sections.
- docs: add crash recovery, sandboxing, analytics, and testing decisions #127: Updates DESIGN_SPEC.md with overlapping MVP changes to crash recovery strategies, sandbox backend tiers, and terminal task states.
- docs: expand design spec with pluggable strategy protocols #121: Modifies DESIGN_SPEC.md to configure pluggable protocol design and memory layer components with concurrent MVP scope annotations.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: addressing external review feedback by adding MVP scope definition and new protocols to the design spec.
Description check	✅ Passed	The description is well-related to the changeset, providing a summary of changes, specific sections modified, design decisions, and a test plan aligned with the actual changes to DESIGN_SPEC.md.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

gemini-code-assist

Code Review

This pull request significantly improves the design specification by clarifying the MVP scope, adding new protocols for graceful shutdown and workspace isolation, and introducing a configuration philosophy of progressive disclosure. The changes are well-structured and address the feedback from external reviews effectively. I have one minor suggestion to improve the consistency of the 'State Coordination vs Workspace Isolation' table.

Note: Security Review has been skipped due to the limited scope of the PR.

gemini-code-assist · 2026-03-06T12:30:44Z

DESIGN_SPEC.md

+| Framework state (tasks, assignments, budget) | Centralized single-writer (`TaskEngine`) | `model_copy(update=...)` via async queue |
+| Code and files (agent work output) | Workspace isolation (`WorkspaceIsolationStrategy`) | Git worktrees / branches |
+| Agent memory (personal) | Per-agent ownership | Each agent owns its memory exclusively |
+| Org memory (shared knowledge) | Single-writer (`OrgMemoryBackend`) | Write-access controlled by role level |


In the 'State Coordination vs Workspace Isolation' table, the 'Mechanism' for 'Org memory' seems to describe a policy rather than an implementation mechanism, which makes it inconsistent with other rows. For clarity and consistency, consider describing the technical mechanism, similar to how it's done for 'Framework state'.

Suggested change

-| Org memory (shared knowledge) | Single-writer (`OrgMemoryBackend`) | Write-access controlled by role level |

+| Org memory (shared knowledge) | Single-writer (`OrgMemoryBackend`) | `OrgMemoryBackend` protocol with role-based access control |
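
For comparison, the "Framework state" row in the quoted table names a concrete mechanism: a single writer applying copy-on-update changes fed through an async queue. A minimal sketch of that pattern follows. It is illustrative only: the `Task` fields and the `submit`/`drain`/`run` method names are assumptions, and `dataclasses.replace` is used as a standard-library stand-in for Pydantic's `model_copy(update=...)`.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Task:
    id: str
    status: str = "CREATED"
    assignee: str | None = None


class TaskEngine:
    """Only the run() loop mutates state; every other caller enqueues a change."""

    def __init__(self) -> None:
        self.tasks: dict[str, Task] = {}
        self._queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, task_id: str, **changes) -> None:
        await self._queue.put((task_id, changes))

    async def drain(self) -> None:
        await self._queue.join()

    async def run(self) -> None:
        while True:
            task_id, changes = await self._queue.get()
            current = self.tasks.get(task_id, Task(id=task_id))
            # Stand-in for model_copy(update=changes): build a new immutable value.
            self.tasks[task_id] = replace(current, **changes)
            self._queue.task_done()


async def demo() -> dict[str, Task]:
    engine = TaskEngine()
    writer = asyncio.create_task(engine.run())
    await asyncio.gather(
        engine.submit("t1", assignee="agent-a"),
        engine.submit("t1", status="IN_PROGRESS"),
    )
    await engine.drain()
    writer.cancel()
    await asyncio.gather(writer, return_exceptions=True)
    return engine.tasks
```

Since updates are serialized through one consumer, two agents touching the same task never race on a read-modify-write; each change is applied to the latest stored value.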

greptile-apps · 2026-03-06T12:32:31Z

Greptile Summary

This PR addresses three rounds of external review feedback by clarifying MVP scope and adding two new protocol sections to the design spec. It introduces §1.4 (explicit M3 MVP definition with deferred-feature list), §1.5 (progressive-disclosure config philosophy), §6.7 (pluggable ShutdownStrategy with cooperative + 30 s timeout as the M3 default), and §6.8 (pluggable WorkspaceIsolationStrategy with planner + git worktrees as the M4+ default). Twelve inline MVP callout boxes are scattered across §5–§12 and three new rows are appended to the §15.5 Engineering Conventions table.

Key changes and observations:

The FAILED terminal/non-terminal contradiction flagged by a previous reviewer is correctly resolved — §6.6 now reads "New non-terminal state" and explicitly contrasts FAILED (reassignable) with CANCELLED (terminal).
§1.4 "approval workflows (M7)" contradicts §12.4's MVP marker — basic wait-and-resume approval plumbing is tagged as M3 in §12.4, while §1.4 defers all "approval workflows" to M7; needs clarification (see inline comment).
§7.4 heading hierarchy: the new `#### Research Directions (M5+)` header sits at the same level as Backend 2 and Backend 3, making them document siblings rather than children; promoting it to `###` would correctly nest the backends beneath it.

Confidence Score: 4/5

Safe to merge — documentation-only changes with no code impact; two minor inconsistencies are easy to address.
All changes are confined to DESIGN_SPEC.md with no code or configuration side-effects. The FAILED state contradiction from prior review is correctly resolved. Two remaining issues (§1.4 milestone label for approval workflows contradicting §12.4's MVP callout, and §7.4 heading level) are low-severity documentation clarity problems that won't block implementation but should be fixed to prevent implementor confusion about M3 scope.
DESIGN_SPEC.md — specifically line 78 (approval workflows milestone) and line 1125 (Research Directions heading level).

Important Files Changed

Filename	Overview
DESIGN_SPEC.md	Comprehensive documentation update adding §1.4 MVP Definition, §1.5 Config Philosophy, §6.7 Graceful Shutdown, §6.8 Workspace Isolation, 12 inline MVP markers, and §15.5 Engineering Convention rows. Two issues found: §1.4 defers "approval workflows" to M7 while §12.4's new MVP marker puts basic approval-waiting in M3 (contradiction); §7.4 "Research Directions" header sits at the same #### level as the backend entries it introduces, breaking the intended hierarchy.

Sequence Diagram

sequenceDiagram
    participant OS as OS Signal
    participant Engine as TaskEngine
    participant Agent as In-flight Agent(s)
    participant Storage as Storage / Connections

    OS->>Engine: SIGTERM / SIGINT
    Engine->>Engine: Set shutdown_event (asyncio.Event)
    Engine->>Engine: Close drain gate — stop accepting new tasks
    Note over Agent: Checks shutdown_event at turn boundaries
    Agent-->>Engine: Exit cooperatively (within grace_seconds)

    alt All agents exit within grace_seconds (30s)
        Agent->>Engine: Cooperative exit
    else grace_seconds exceeded
        Engine->>Agent: task.cancel()
        Agent-->>Engine: Task → INTERRUPTED
    end

    Engine->>Storage: Persist cost records
    Engine->>Storage: Close provider connections
    Engine->>Storage: Flush logs
    Note over Engine,Storage: cleanup_seconds (5s) then process exits
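
The sequence above can be approximated with plain `asyncio`. This is a sketch under stated assumptions, not the project's implementation: the `agent` and `shutdown` functions are hypothetical, the grace period is shortened from the spec's 30 s so the demo finishes quickly, and `"cooperative_exit"` is only a demo label, because the excerpt does not say which status a cooperatively exiting task receives. The drain gate and cleanup phase are omitted.

```python
import asyncio


async def agent(name, turn_time, shutdown_event, outcome):
    """Hypothetical agent loop that checks the event at each turn boundary."""
    try:
        while not shutdown_event.is_set():
            await asyncio.sleep(turn_time)   # stands in for an LLM call or tool run
        outcome[name] = "cooperative_exit"   # demo label, not a spec status
    except asyncio.CancelledError:
        outcome[name] = "INTERRUPTED"        # force-cancelled after the grace period
        raise


async def shutdown(tasks, shutdown_event, grace_seconds):
    shutdown_event.set()                      # signal all agents
    _done, pending = await asyncio.wait(tasks, timeout=grace_seconds)
    for task in pending:                      # grace period exceeded
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)


async def demo():
    event = asyncio.Event()
    outcome = {}
    tasks = [
        asyncio.create_task(agent("fast", 0.05, event, outcome)),
        asyncio.create_task(agent("slow", 5.0, event, outcome)),
    ]
    await asyncio.sleep(0.01)                 # let both agents start a turn
    await shutdown(tasks, event, grace_seconds=0.2)
    return outcome
```

In the demo, the agent with short turns notices the event and exits on its own, while the agent stuck in a long turn is cancelled once the grace period runs out.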

Last reviewed commit: 072a6f2

DESIGN_SPEC.md

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@DESIGN_SPEC.md`:
- Around line 971-974: Remove the blank line between the two consecutive
blockquote lines so they remain a single blockquote; locate the paragraph
containing the `INTERRUPTED` TaskStatus note and the Windows compatibility note
referencing `loop.add_signal_handler()` and `signal.signal()` and merge them
into one continuous blockquote (or convert both to normal paragraphs) so MD028
is not triggered.
- Around line 1369-1375: The fenced YAML examples in the "Minimal config"
blockquote are flush with surrounding text and must be padded with blank quoted
lines: insert a standalone quoted blank line (a line containing just ">")
immediately before the opening ```yaml fence and another immediately after the
closing ``` fence in the "Minimal config" callout (the block containing the
"budget: total_monthly: 100.00" example), and do the same for the second
occurrence mentioned (the callout around lines 1712-1718) so both fenced blocks
have a blank `>` line above and below them.
- Around line 1125-1129: The current "Research Directions (M5+)" and "Backend 2:
GraphRAG Knowledge Graph (Research)" headings conflict with the later
extensibility note by calling Backends 2–3 “not planned implementations”; update
the wording under the "Research Directions (M5+)" section (and any Backend 2/3
subheadings) to match the later phrase "planned extensions" (or otherwise
normalize to the approved terminology used in the extensibility note) so the
intent for OrgMemoryBackend (the protocol) and future backend work is consistent
across the document.
- Around line 946-976: The Task lifecycle docs must reflect the new INTERRUPTED
state: add INTERRUPTED as a first-class recoverable TaskStatus in the canonical
status list/enum (TaskStatus) and update all references/diagrams in the
lifecycle/state model (sections §6.1 and §6.6) to include INTERRUPTED on the
non-happy path (distinct from FAILED and CANCELLED), adjust the state
transitions to show that shutdown leads to INTERRUPTED (and that INTERRUPTED is
eligible for reassignment on restart), and either remove the “defer
enum/lifecycle updates” note or mark those sections explicitly as
“proposed-only” so there is a single source of truth for M3 task-state
semantics.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 709fdccc-cee0-483d-94b9-86e72a121c62

📥 Commits

Reviewing files that changed from the base of the PR and between 5c11595 and e0b50d8.

📒 Files selected for processing (1)

DESIGN_SPEC.md

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Agent
GitHub Check: Greptile Review

🧰 Additional context used

🧠 Learnings (4)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-06T11:57:38.189Z
Learning: Use `/aurelio-review-pr` to handle external reviewer feedback on created PRs

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-06T11:57:38.189Z
Learning: Update DESIGN_SPEC.md to reflect approved deviations from the original specification

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-06T11:57:38.189Z
Learning: Always read DESIGN_SPEC.md before implementing any feature or planning any issue — it is the mandatory starting point for architecture, data models, and behavior

📚 Learning: 2026-03-06T11:57:38.189Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-06T11:57:38.189Z
Learning: Update DESIGN_SPEC.md to reflect approved deviations from the original specification

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-03-06T11:57:38.189Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-06T11:57:38.189Z
Learning: Always read DESIGN_SPEC.md before implementing any feature or planning any issue — it is the mandatory starting point for architecture, data models, and behavior

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-01-26T08:59:32.818Z

Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to README.md : Update README.md for significant feature changes

Applied to files:

DESIGN_SPEC.md

🪛 markdownlint-cli2 (0.21.0)

DESIGN_SPEC.md

[warning] 972-972: Blank line inside blockquote

(MD028, no-blanks-blockquote)

[warning] 974-974: Blank line inside blockquote

(MD028, no-blanks-blockquote)

[warning] 1370-1370: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

[warning] 1373-1373: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

[warning] 1713-1713: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

[warning] 1716-1716: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

coderabbitai · 2026-03-06T12:32:50Z

DESIGN_SPEC.md

+### 6.7 Graceful Shutdown Protocol
+
+When the process receives SIGTERM/SIGINT (user Ctrl+C, Docker stop, systemd shutdown), the framework needs to stop cleanly without losing work or leaking costs. Shutdown strategies are implemented behind a `ShutdownStrategy` protocol, making the system pluggable — new strategies can be added without modifying existing ones.
+
+> **MVP: Cooperative with Timeout only (Strategy 1).** Other strategies are future options enabled by the protocol interface.
+
+#### Strategy 1: Cooperative with Timeout (Default / MVP)
+
+The engine sets a shutdown event, stops accepting new tasks, and gives in-flight agents a grace period to finish their current turn. Agents check the shutdown event at turn boundaries (between LLM calls, before tool invocations) and exit cooperatively. After the grace period, remaining agents are force-cancelled and their tasks marked `INTERRUPTED`.
+
+```yaml
+graceful_shutdown:
+  strategy: "cooperative_timeout"    # cooperative_timeout, immediate, finish_tool, checkpoint
+  cooperative_timeout:
+    grace_seconds: 30                # time for agents to finish cooperatively
+    cleanup_seconds: 5               # time for final cleanup (persist cost records, close connections)
+```
+
+On shutdown signal:
+1. Set `shutdown_event` (`asyncio.Event`) — agents check this at turn boundaries
+2. Stop accepting new tasks (drain gate closes)
+3. Wait up to `grace_seconds` for agents to exit cooperatively
+4. Force-cancel remaining agents (`task.cancel()`) — tasks transition to `INTERRUPTED`
+5. Cleanup phase (`cleanup_seconds`): persist cost records, close provider connections, flush logs
+
+> **New non-terminal status:** `INTERRUPTED` is a new `TaskStatus` variant. Unlike `FAILED` (eligible for automatic reassignment) or `CANCELLED` (terminal), `INTERRUPTED` indicates the task was stopped due to process shutdown and is eligible for manual or automatic reassignment on restart.
+
+> **Windows compatibility:** `loop.add_signal_handler()` is not supported on Windows. The implementation uses `signal.signal()` as a fallback. SIGINT (Ctrl+C) works cross-platform; SIGTERM on Windows requires `os.kill()`.
+
+> **In-flight LLM calls:** Non-streaming API calls that are interrupted result in tokens billed but no response received (silent cost leak). The engine logs request start (with input token count) before each provider call, so interrupted calls have at minimum an input-cost audit record. Streaming calls are charged only for tokens sent before disconnect.
+


⚠️ Potential issue | 🟠 Major

Reflect INTERRUPTED in the canonical task lifecycle/state model.

§6.7 makes INTERRUPTED a first-class recoverable status, but §6.1 still shows only BLOCKED / CANCELLED on the non-happy path and §6.6 still defers enum/lifecycle updates to “when implemented.” That leaves two sources of truth for M3 task-state semantics. Please update the lifecycle/status definitions now, or explicitly mark this section as proposed-only.

Based on learnings: Update DESIGN_SPEC.md to reflect approved deviations from the original specification.

coderabbitai · 2026-03-06T12:32:50Z

DESIGN_SPEC.md

+> **New non-terminal status:** `INTERRUPTED` is a new `TaskStatus` variant. Unlike `FAILED` (eligible for automatic reassignment) or `CANCELLED` (terminal), `INTERRUPTED` indicates the task was stopped due to process shutdown and is eligible for manual or automatic reassignment on restart.
+
+> **Windows compatibility:** `loop.add_signal_handler()` is not supported on Windows. The implementation uses `signal.signal()` as a fallback. SIGINT (Ctrl+C) works cross-platform; SIGTERM on Windows requires `os.kill()`.
+


⚠️ Potential issue | 🟡 Minor

Remove the blank lines inside this blockquote callout.

Lines 972 and 974 trigger MD028, and several renderers will split these notes into separate quote blocks. Keep the consecutive notes in one blockquote or convert them into normal paragraphs.
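
The Windows compatibility note quoted in this comment describes a fallback from `loop.add_signal_handler()` to `signal.signal()`. One way that fallback could look is sketched below; the function name `install_shutdown_handlers` and the use of an `asyncio.Event` as the shutdown flag are assumptions for illustration. The sketch relies on the documented behavior that event loops without signal support raise `NotImplementedError`.

```python
import asyncio
import signal


def install_shutdown_handlers(loop, shutdown_event):
    """Register SIGINT/SIGTERM handlers, falling back where the loop lacks support."""

    def request_shutdown():
        shutdown_event.set()

    for sig in (signal.SIGINT, signal.SIGTERM):
        try:
            loop.add_signal_handler(sig, request_shutdown)
        except NotImplementedError:
            # Windows event loops do not implement add_signal_handler().
            # A signal.signal() handler runs outside the loop's callback
            # machinery, so hand the work back to the loop thread-safely.
            signal.signal(
                sig, lambda *_: loop.call_soon_threadsafe(request_shutdown)
            )
```

Both code paths end in the same `request_shutdown` callback, so the rest of the shutdown logic does not need to know which platform it is running on.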

coderabbitai · 2026-03-06T12:32:50Z

DESIGN_SPEC.md

+### Research Directions (M5+)
+
+The following backends illustrate why `OrgMemoryBackend` is a protocol — the architecture supports future upgrades without modifying existing code. These are **not planned implementations**; they are research directions that may inform future work if/when organizational memory needs outgrow the Hybrid Prompt + Retrieval approach.
+
+#### Backend 2: GraphRAG Knowledge Graph (Research)


⚠️ Potential issue | 🟡 Minor

Align the “research directions” wording with the later backend note.

This new heading says Backends 2–3 are “not planned implementations,” but the later extensibility note in the same section still describes them as “planned extensions.” That changes the roadmap meaning materially, so the terminology should be normalized here.

Based on learnings: Update DESIGN_SPEC.md to reflect approved deviations from the original specification.

coderabbitai · 2026-03-06T12:32:51Z

DESIGN_SPEC.md

+> **Minimal config:**
+> ```yaml
+> budget:
+>   total_monthly: 100.00
+> ```
+> All other fields below have sensible defaults.
+


⚠️ Potential issue | 🟡 Minor

Add blank quoted lines around these fenced YAML examples.

Both minimal-config callouts trip MD031 because the fenced blocks are flush with the surrounding blockquote text. Please pad each fence with blank > lines so they render consistently across markdown viewers.

Also applies to: 1712-1718

Copilot

Pull request overview

Updates DESIGN_SPEC.md to incorporate external review feedback by clarifying MVP (M3) scope, introducing new protocol-based designs (graceful shutdown + workspace isolation), and tightening several “future vs now” boundaries across the spec.

Changes:

- Adds explicit M3 MVP definition and configuration philosophy (incl. minimal config examples).
- Introduces new protocol sections for graceful shutdown (ShutdownStrategy) and concurrent workspace isolation (WorkspaceIsolationStrategy), with MVP callouts.
- Reframes org-memory backends 2–3 as “research directions” and expands engineering conventions to include new coordination/isolation/shutdown rows.

Copilot · 2026-03-06T12:33:48Z

DESIGN_SPEC.md

+4. Force-cancel remaining agents (`task.cancel()`) — tasks transition to `INTERRUPTED`
+5. Cleanup phase (`cleanup_seconds`): persist cost records, close provider connections, flush logs
+
+> **New non-terminal status:** `INTERRUPTED` is a new `TaskStatus` variant. Unlike `FAILED` (eligible for automatic reassignment) or `CANCELLED` (terminal), `INTERRUPTED` indicates the task was stopped due to process shutdown and is eligible for manual or automatic reassignment on restart.


INTERRUPTED is described as a new TaskStatus variant, but the current TaskStatus enum in code (src/ai_company/core/enums.py) only includes CREATED/ASSIGNED/IN_PROGRESS/IN_REVIEW/COMPLETED/BLOCKED/CANCELLED (and the spec already calls out that FAILED will be added later). To keep the spec consistent with the repo’s current state, consider adding the same kind of “will be added when implemented” note here (and/or referencing the transition map that will need updating) rather than stating it as an already-existing variant.

Suggested change

> **New non-terminal status:** `INTERRUPTED` is a new `TaskStatus` variant. Unlike `FAILED` (eligible for automatic reassignment) or `CANCELLED` (terminal), `INTERRUPTED` indicates the task was stopped due to process shutdown and is eligible for manual or automatic reassignment on restart.

> **Planned non-terminal status:** `INTERRUPTED` will be introduced as a new `TaskStatus` variant once implemented in code (and the task status transition map is updated). Unlike `FAILED` (eligible for automatic reassignment) or `CANCELLED` (terminal), `INTERRUPTED` will indicate the task was stopped due to process shutdown and is eligible for manual or automatic reassignment on restart.
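To make the status discussion concrete, here is a minimal sketch of how the planned variants could sit next to the existing ones. The member names come from the comment above; the string values, the `REASSIGNMENT_TRANSITIONS` map, and the choice of `ASSIGNED` as the reassignment target are assumptions for illustration, not the contents of `src/ai_company/core/enums.py`.

```python
from enum import Enum


class TaskStatus(str, Enum):
    # Members the review comment lists as present in the repo today.
    CREATED = "created"
    ASSIGNED = "assigned"
    IN_PROGRESS = "in_progress"
    IN_REVIEW = "in_review"
    COMPLETED = "completed"
    BLOCKED = "blocked"
    CANCELLED = "cancelled"
    # Planned in the spec, not yet in code (string values are assumed).
    FAILED = "failed"
    INTERRUPTED = "interrupted"


# Hypothetical slice of a transition map covering only what the spec text states:
# FAILED and INTERRUPTED are eligible for reassignment, CANCELLED is terminal.
REASSIGNMENT_TRANSITIONS: dict[TaskStatus, frozenset[TaskStatus]] = {
    TaskStatus.FAILED: frozenset({TaskStatus.ASSIGNED}),
    TaskStatus.INTERRUPTED: frozenset({TaskStatus.ASSIGNED}),
    TaskStatus.CANCELLED: frozenset(),
}


def can_transition(current: TaskStatus, target: TaskStatus) -> bool:
    """Check a move against the illustrative map; unknown sources are rejected."""
    return target in REASSIGNMENT_TRANSITIONS.get(current, frozenset())
```

A real transition map would also need the edges into `INTERRUPTED` (for example from a running task on force-cancel), which the quoted spec text does not enumerate.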

Copilot · 2026-03-06T12:33:49Z

DESIGN_SPEC.md

-#### Backend 2: GraphRAG Knowledge Graph (Future)
+---
+
+### Research Directions (M5+)


This introduces an unnumbered level-3 heading (`### Research Directions (M5+)`) inside the numbered §7.4 area, which breaks the otherwise consistent “`### 7.x …`” structure and also makes the Table of Contents note (“7.4 … (Research Directions)”) a bit misleading since Research Directions is now a sibling section rather than clearly part of 7.4. Consider making this a `####` subsection under 7.4, or renumbering it to `### 7.5 …` if it’s meant to be a peer section.

Suggested change

### Research Directions (M5+)

#### Research Directions (M5+)

Copilot · 2026-03-06T12:33:49Z

DESIGN_SPEC.md

+### Research Directions (M5+)
+
+The following backends illustrate why `OrgMemoryBackend` is a protocol — the architecture supports future upgrades without modifying existing code. These are **not planned implementations**; they are research directions that may inform future work if/when organizational memory needs outgrow the Hybrid Prompt + Retrieval approach.
+
+#### Backend 2: GraphRAG Knowledge Graph (Research)


This section says Backends 2–3 are “not planned implementations”, but later in the same §7.4 area the spec describes them as “planned extensions.” Please align the wording so it’s unambiguous whether GraphRAG/Temporal KG are only research ideas or actual roadmap items.

Copilot · 2026-03-06T12:33:49Z

DESIGN_SPEC.md

+> **Minimal config:**
+> ```yaml
+> autonomy:
+>   level: "semi"
+> ```
+> All presets below are built-in. Most users only set the level.


The “Minimal config” callout suggests users can configure autonomy.level: "semi", but the current validated YAML schema uses config.autonomy as a float (0.0–1.0) in RootConfig/CompanyConfig (see src/ai_company/config/schema.py + src/ai_company/core/company.py) and does not define an autonomy: object with presets. Either mark this autonomy-preset structure as future/planned (separate from today’s config), or update the minimal example to match the current config shape.

Suggested change

> **Minimal config:**
> ```yaml
> autonomy:
>   level: "semi"
> ```
> All presets below are built-in. Most users only set the level.

> **Planned minimal config (future, not yet implemented):**
> ```yaml
> autonomy:
>   level: "semi"
> ```
> This `autonomy.level` + presets structure is a design target for a future config schema.
>
> **Current implementation (v0.x) uses a numeric flag:**
> ```yaml
> config:
>   autonomy: 0.5 # float between 0.0 (no autonomy) and 1.0 (full autonomy)
> ```
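The mismatch the comment describes can be illustrated with a small stand-in for the validated config. The repository is described as using Pydantic; this sketch uses only the standard library, and the class name `CompanyConfigSketch` and the default of `0.5` are assumptions, not the real `CompanyConfig`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompanyConfigSketch:
    """Stand-in for a config whose `autonomy` field is a float in [0.0, 1.0]."""

    autonomy: float = 0.5  # assumed default, for illustration only

    def __post_init__(self) -> None:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(self.autonomy, bool) or not isinstance(self.autonomy, (int, float)):
            raise TypeError("autonomy must be a number between 0.0 and 1.0")
        if not 0.0 <= self.autonomy <= 1.0:
            raise ValueError("autonomy must be between 0.0 and 1.0")


# The current numeric shape validates.
current = CompanyConfigSketch(autonomy=0.5)

# The preset-style mapping from the spec callout would be rejected by this shape.
try:
    CompanyConfigSketch(autonomy={"level": "semi"})  # type: ignore[arg-type]
    preset_accepted = True
except TypeError:
    preset_accepted = False
```

Under these assumptions, a reader who copies the `autonomy.level` example into today's config would hit a validation error, which is why the suggestion labels it as planned.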

Copilot · 2026-03-06T12:33:49Z

DESIGN_SPEC.md

+**Minimal custom company** (all other settings use defaults):
+
+```yaml
+company:
+  name: "Acme Corp"
+  template: "startup"


The “Minimal custom company” YAML snippet doesn’t match any current config shape in the repo: built-in templates are defined under a top-level template: key (with template.company.type, template.company.budget_monthly, etc.), while validated runtime config uses company_name/company_type plus config.budget_monthly (see src/ai_company/config/schema.py). Please either (a) update the snippet to a real, loadable template/config example consistent with the existing schemas, or (b) explicitly label it as a conceptual/future higher-level config format so readers don’t try to use it as-is.

Suggested change

**Minimal custom company** (all other settings use defaults):

```yaml
company:
  name: "Acme Corp"
  template: "startup"

**Minimal custom company (runtime config example)** (all other settings use defaults):

```yaml
company_name: "Acme Corp"
company_type: "startup"
config:

…le, Copilot, and Gemini

- Fix "not planned" vs "planned extensions" contradiction in §7.4
- Fix FAILED labeled "terminal" in §6.6 (should be non-terminal)
- Fix INTERRUPTED wording to "Planned non-terminal status"
- Fix §5.6 MVP marker to say "Not in M3" (single-agent, no conflicts)
- Add missing SUSPENDED "new status" callout in §6.7 Strategy 4
- Remove mid-section --- and fix Research Directions heading level
- Mark §12.2 autonomy minimal config as planned (not yet implemented)
- Fix Progressive Trust deferred list annotation pattern
- Fix MD028: merge consecutive blockquotes in §6.7
- Fix MD031: add blank > lines around fenced code in §10.4 and §12.2
- Update org memory mechanism column for clarity

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

greptile-apps · 2026-03-06T12:45:25Z

DESIGN_SPEC.md

+- Memory layer integration, org memory backends (M5)
+- Web UI, WebSocket real-time updates (M6)
+- CLI commands beyond basic `start` (M6)
+- Security ops agent, approval workflows (M7)


§1.4 "approval workflows (M7)" contradicts §12.4 MVP marker

Line 78 places "approval workflows" in the M7 deferred bucket, but §12.4 (line 1766) adds:

> **MVP: Wait Forever only (Policy 1).** Other timeout policies are M5+.

This means the human-approval waiting mechanism (parking a task, receiving approval, resuming) is explicitly planned for M3 — not M7. A reader parsing §1.4 will conclude that no approval integration exists until M7, then reach §12.4 and find a contradicting MVP callout. This is especially likely to confuse implementors scoping M3 work.

The §1.4 entry most likely intends to defer the Security ops agent and the full approval workflow UI / escalation policies, while the basic "wait for human, resume on approval" plumbing ships in M3. Splitting the entry or adjusting the milestone annotation would remove the ambiguity:

Suggested change

- Security ops agent, approval workflows (M7)

- Security ops agent (M7)

- Approval workflows: full timeout/escalation policies (M5+); basic wait-and-resume ships in M3 (§12.4 Policy 1)
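The basic "wait for human, resume on approval" plumbing mentioned above can be sketched with `asyncio.Event`. This is one possible shape under the assumption that tasks run as asyncio coroutines; `ApprovalGate`, `run_task_with_approval`, and `demo` are hypothetical names, not APIs from the repository.

```python
import asyncio


class ApprovalGate:
    """Parks a task until a human decision arrives, with no timeout ("wait forever")."""

    def __init__(self) -> None:
        self._decided = asyncio.Event()
        self._approved = False

    def decide(self, approved: bool) -> None:
        """Record the human decision and wake the parked task."""
        self._approved = approved
        self._decided.set()

    async def wait(self) -> bool:
        # No timeout is passed: the task stays parked until decide() is called.
        await self._decided.wait()
        return self._approved


async def run_task_with_approval(gate: ApprovalGate) -> str:
    """Illustrative task body that needs approval before continuing."""
    approved = await gate.wait()
    return "resumed" if approved else "rejected"


async def demo() -> str:
    gate = ApprovalGate()
    task = asyncio.create_task(run_task_with_approval(gate))
    await asyncio.sleep(0)  # yield once so the task parks on the gate
    assert not task.done()  # still waiting for the human
    gate.decide(True)
    return await task
```

Timeout and escalation policies, which the suggestion defers to M5+, would wrap `gate.wait()` rather than change it, for example with `asyncio.wait_for`.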


greptile-apps · 2026-03-06T12:45:26Z

DESIGN_SPEC.md

+#### Research Directions (M5+)
+
+The following backends illustrate why `OrgMemoryBackend` is a protocol — the architecture supports future upgrades without modifying existing code. These are **not planned implementations**; they are research directions that may inform future work if/when organizational memory needs outgrow the Hybrid Prompt + Retrieval approach.
+
+#### Backend 2: GraphRAG Knowledge Graph (Research)


Heading hierarchy makes "Research Directions" a sibling of the backends it introduces

`#### Research Directions (M5+)` is at the same `####` level as `#### Backend 2:` and `#### Backend 3:`. In Markdown's heading hierarchy these are document *siblings*, not a parent containing the two backends. A reader navigating via rendered headings or a document outline sees all three at equal depth, which undermines the intent of "Research Directions" as a label/container for the subsequent entries.

Consider promoting the container to `###` so that Backends 2 and 3 sit under it as `####` children:

Suggested change

#### Research Directions (M5+)

The following backends illustrate why `OrgMemoryBackend` is a protocol — the architecture supports future upgrades without modifying existing code. These are **not planned implementations**; they are research directions that may inform future work if/when organizational memory needs outgrow the Hybrid Prompt + Retrieval approach.

#### Backend 2: GraphRAG Knowledge Graph (Research)

### Research Directions (M5+)

The following backends illustrate why `OrgMemoryBackend` is a protocol — the architecture supports future upgrades without modifying existing code. These are **not planned implementations**; they are research directions that may inform future work if/when organizational memory needs outgrow the Hybrid Prompt + Retrieval approach.

#### Backend 2: GraphRAG Knowledge Graph (Research)
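The sibling-versus-child distinction can be checked mechanically. The sketch below builds a heading outline from Markdown text and lists the headings nested under a given title; `heading_outline` and `children_of` are hypothetical helpers written for this illustration, and only ATX (`#`) headings outside fenced blocks are considered.

```python
import re

FENCE = "`" * 3  # built programmatically so this sketch nests cleanly in a fenced block
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def heading_outline(markdown: str) -> list[tuple[int, str]]:
    """Return (level, title) pairs for ATX headings, skipping fenced code."""
    outline: list[tuple[int, str]] = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        match = HEADING.match(line)
        if match and not in_fence:
            outline.append((len(match.group(1)), match.group(2).strip()))
    return outline


def children_of(outline: list[tuple[int, str]], title: str) -> list[str]:
    """Titles that sit strictly deeper than `title` until a peer or shallower heading."""
    for index, (level, text) in enumerate(outline):
        if text == title:
            nested: list[str] = []
            for next_level, next_text in outline[index + 1:]:
                if next_level <= level:
                    break
                nested.append(next_text)
            return nested
    return []


as_sibling = "#### Research Directions (M5+)\n\n#### Backend 2: GraphRAG Knowledge Graph (Research)\n"
as_parent = "### Research Directions (M5+)\n\n#### Backend 2: GraphRAG Knowledge Graph (Research)\n"

sibling_children = children_of(heading_outline(as_sibling), "Research Directions (M5+)")
parent_children = children_of(heading_outline(as_parent), "Research Directions (M5+)")
```

With both headings at `####`, the container has no children; promoting it to `###` places the backend heading under it.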


🤖 I have created a release *beep* *boop*

---

## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation: message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions: Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup: PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

🤖 I have created a release *beep* *boop*

---

## [0.1.0](v0.0.0...v0.1.0) (2026-03-11)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation: message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions: Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15))
* bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e))
* bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238))
* bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e))
* bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7))
* bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup: PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 6, 2026 12:29

Copilot started reviewing on behalf of Aureliolo March 6, 2026 12:29

gemini-code-assist bot reviewed Mar 6, 2026

greptile-apps bot reviewed Mar 6, 2026

DESIGN_SPEC.md Outdated

DESIGN_SPEC.md Outdated

coderabbitai bot reviewed Mar 6, 2026

Copilot AI reviewed Mar 6, 2026

Aureliolo merged commit 3b30b9a into main Mar 6, 2026
8 checks passed

Aureliolo deleted the docs/spec-mvp-scope branch March 6, 2026 12:42

greptile-apps bot reviewed Mar 6, 2026

This was referenced Mar 8, 2026

feat: implement meeting protocol system (#123) #167

Merged

fix: address post-merge review feedback from PRs #164-#167 #169

Closed

docs: add ADR-001 memory layer evaluation and selection #178

Merged

Aureliolo mentioned this pull request Mar 10, 2026

chore(main): release ai-company 0.1.1 #282

Merged

Aureliolo mentioned this pull request Mar 10, 2026

chore(main): release 0.1.0 #283

Merged

This was referenced Mar 15, 2026

chore(main): release 0.2.4 #431

Merged

chore(main): release 0.2.0 #442

Closed

chore(main): release 0.2.5 #447

Merged

chore(main): release 0.2.0 #460

Closed

chore(main): release 0.2.0 #471

Closed

	\| Org memory (shared knowledge) \| Single-writer (`OrgMemoryBackend`) \| Write-access controlled by role level \|
	\| Org memory (shared knowledge) \| Single-writer (`OrgMemoryBackend`) \| `OrgMemoryBackend` protocol with role-based access control \|

		> New non-terminal status: `INTERRUPTED` is a new `TaskStatus` variant. Unlike `FAILED` (eligible for automatic reassignment) or `CANCELLED` (terminal), `INTERRUPTED` indicates the task was stopped due to process shutdown and is eligible for manual or automatic reassignment on restart.

		> Windows compatibility: `loop.add_signal_handler()` is not supported on Windows. The implementation uses `signal.signal()` as a fallback. SIGINT (Ctrl+C) works cross-platform; SIGTERM on Windows requires `os.kill()`.
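The fallback described in the Windows compatibility note can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `install_shutdown_handlers` and `shutdown_event` are hypothetical names, and the sketch assumes it is called from the main thread (a requirement of both `loop.add_signal_handler()` and `signal.signal()`).

```python
import asyncio
import signal


def install_shutdown_handlers(
    loop: asyncio.AbstractEventLoop,
    shutdown_event: asyncio.Event,
) -> str:
    """Set shutdown_event on SIGINT/SIGTERM; return the mechanism used.

    Hypothetical helper for illustration only.
    """

    def request_shutdown() -> None:
        shutdown_event.set()

    signals = (signal.SIGINT, signal.SIGTERM)
    try:
        # Preferred path: the event loop dispatches the callback itself.
        for sig in signals:
            loop.add_signal_handler(sig, request_shutdown)
        return "loop"
    except NotImplementedError:
        # Windows event loops do not implement add_signal_handler().
        def handler(signum: int, frame: object) -> None:
            # Hand the request over to the loop instead of touching
            # asyncio objects directly from the signal handler.
            loop.call_soon_threadsafe(request_shutdown)

        for sig in signals:
            signal.signal(sig, handler)
        return "signal"
```

A caller would typically create the `asyncio.Event` inside the running loop, install the handlers once at startup, and then `await shutdown_event.wait()` before starting the shutdown strategy.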

	> New non-terminal status: `INTERRUPTED` is a new `TaskStatus` variant. Unlike `FAILED` (eligible for automatic reassignment) or `CANCELLED` (terminal), `INTERRUPTED` indicates the task was stopped due to process shutdown and is eligible for manual or automatic reassignment on restart.
	> Planned non-terminal status: `INTERRUPTED` will be introduced as a new `TaskStatus` variant once implemented in code (and the task status transition map is updated). Unlike `FAILED` (eligible for automatic reassignment) or `CANCELLED` (terminal), `INTERRUPTED` will indicate the task was stopped due to process shutdown and is eligible for manual or automatic reassignment on restart.
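To make the distinction between `FAILED`, `CANCELLED`, and `INTERRUPTED` concrete, here is a small sketch of how a status enum and transition map might look. Only the three status names come from the spec text; `ASSIGNED`, `IN_PROGRESS`, the string values, the `ALLOWED_TRANSITIONS` map, and the helper functions are assumptions made for illustration and may not match the real `TaskStatus` model.

```python
from enum import Enum


class TaskStatus(str, Enum):
    ASSIGNED = "assigned"        # hypothetical member, for illustration
    IN_PROGRESS = "in_progress"  # hypothetical member, for illustration
    FAILED = "failed"
    CANCELLED = "cancelled"
    INTERRUPTED = "interrupted"


# Hypothetical transition map: a status with no outgoing edges is terminal.
ALLOWED_TRANSITIONS: dict[TaskStatus, frozenset[TaskStatus]] = {
    TaskStatus.ASSIGNED: frozenset({TaskStatus.IN_PROGRESS, TaskStatus.CANCELLED}),
    TaskStatus.IN_PROGRESS: frozenset(
        {TaskStatus.FAILED, TaskStatus.CANCELLED, TaskStatus.INTERRUPTED}
    ),
    TaskStatus.FAILED: frozenset({TaskStatus.ASSIGNED}),       # reassignment
    TaskStatus.INTERRUPTED: frozenset({TaskStatus.ASSIGNED}),  # reassignment on restart
    TaskStatus.CANCELLED: frozenset(),                         # terminal
}


def is_terminal(status: TaskStatus) -> bool:
    return not ALLOWED_TRANSITIONS[status]


def can_transition(current: TaskStatus, target: TaskStatus) -> bool:
    return target in ALLOWED_TRANSITIONS[current]


def interrupt_in_flight(tasks: dict[str, TaskStatus]) -> dict[str, TaskStatus]:
    """Return a copy in which every in-progress task is marked INTERRUPTED."""
    return {
        task_id: TaskStatus.INTERRUPTED if status is TaskStatus.IN_PROGRESS else status
        for task_id, status in tasks.items()
    }
```

On shutdown, something like `interrupt_in_flight` would run before the process exits, so that a restart can find the `INTERRUPTED` tasks and reassign them.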

	- Security ops agent, approval workflows (M7)
	- Security ops agent (M7)
	- Approval workflows: full timeout/escalation policies (M5+); basic wait-and-resume ships in M3 (§12.4 Policy 1)

Conversation

Aureliolo commented Mar 6, 2026

Summary

Changes

Design Decisions (from external review analysis)

Test plan

github-actions bot commented Mar 6, 2026 • edited

Dependency Review

Scanned Files

gemini-code-assist bot commented Mar 6, 2026

Summary of Changes

Highlights

Footnotes

coderabbitai bot commented Mar 6, 2026 • edited

Summary by CodeRabbit

Release Notes

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

gemini-code-assist bot left a comment

Code Review

gemini-code-assist bot Mar 6, 2026

greptile-apps bot commented Mar 6, 2026 • edited

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

coderabbitai bot left a comment

coderabbitai bot Mar 6, 2026

coderabbitai bot Mar 6, 2026

coderabbitai bot Mar 6, 2026

coderabbitai bot Mar 6, 2026

Copilot AI left a comment

Pull request overview

Copilot AI Mar 6, 2026

Copilot AI Mar 6, 2026

Copilot AI Mar 6, 2026

Copilot AI Mar 6, 2026

Copilot AI Mar 6, 2026

greptile-apps bot Mar 6, 2026
