Skip to content

Add design specification, license, and project setup#2

Merged
Aureliolo merged 4 commits intomainfrom
feat/design-spec
Feb 27, 2026
Merged

Add design specification, license, and project setup#2
Aureliolo merged 4 commits intomainfrom
feat/design-spec

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

  • Design Specification (DESIGN_SPEC.md) — comprehensive 18-section high-level design for the AI Company framework covering agent system, company structure, communication architecture, task engine, memory, HR, model providers, budget management, tools, security, human interaction, templates, technical architecture, research/prior art, open questions, and backlog
  • BSL 1.1 License — Business Source License with non-commercial use grant, converting to Apache 2.0 on February 27, 2030
  • README — updated with project concept, planned features, tech stack, and documentation links
  • Claude Code skillaurelio-review-pr PR review pipeline for future use

All technology choices in the design spec are marked as candidates/TBD pending further research. No final decisions on frameworks, libraries, or specific models.

Test plan

  • Review DESIGN_SPEC.md for completeness and internal consistency
  • Verify LICENSE parameters (change date, use grant, change license)
  • Confirm README accurately reflects current project status
  • Validate all 18 sections are properly cross-referenced in table of contents

🤖 Generated with Claude Code

Comprehensive design spec covering:
- Agent identity system with roles, personalities, skills, seniority
- Company structure templates (solo founder to full enterprise)
- Communication architecture (event-driven bus, hierarchical, meetings, hybrid)
- Task/workflow engine with multiple execution patterns
- Memory system (working, episodic, semantic, procedural, social)
- Multi-provider LLM layer via LiteLLM (Claude, OpenRouter, Ollama)
- Budget management with CFO agent and cost optimization
- Tool/capability system with progressive trust
- Security operations agent and configurable autonomy levels
- API-first architecture (FastAPI + WebSocket) with web UI
- Template system and company builder
- Research analysis of MetaGPT, ChatDev, CrewAI, AutoGen, LangGraph
- Build-from-scratch recommendation leveraging LiteLLM, Mem0, MCP, A2A
- Open questions, risks, backlog, and scaling path
Business Source License 1.1 with parameters:
- Non-commercial use only (personal, education, research, evaluation)
- Commercial/production use requires separate license
- Converts to Apache 2.0 after 4 years per version
- Fix LICENSE Change Date to explicit "February 27, 2030"
- Extract HR into its own section (Section 8), renumber all sections
- Reconcile budget hierarchy to include QA department
- Correct A2A/MCP governance attribution to Linux Foundation
- Keep SecOps agent active in locked autonomy mode for audit logging
- Consolidate terminology on "Agent" (remove "Employee" as separate concept)
- Add communication loop prevention mechanism (Section 5.5)
- Mark model pricing and provider configs as illustrative examples
- Soften specific technology commitments to "candidates - TBD after research"
- Add language spec to workflow diagram code blocks
- Add aurelio-review-pr skill for Claude Code
Copilot AI review requested due to automatic review settings February 27, 2026 09:48
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 27, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eeaec53 and dfa99f3.

📒 Files selected for processing (2)
  • .claude/skills/aurelio-review-pr/skill.md
  • DESIGN_SPEC.md

📝 Walkthrough

Summary by CodeRabbit

  • Documentation

    • Added a comprehensive design spec describing architecture, agent framework, communication, persistence, governance, and implementation guidance.
    • Added a detailed PR review workflow document outlining a multi-phase review, feedback consolidation, and approval/implementation steps.
    • Expanded README with project concept, planned features, status, tech stack, and docs references.
  • Chores

    • Added Business Source License 1.1; production/commercial use requires a separate commercial license.

Walkthrough

Adds foundational documentation and a PR review workflow: a 10-phase PR review pipeline spec, a comprehensive design specification for multi-agent orchestration, a Business Source License 1.1, and an expanded README.

Changes

Cohort / File(s) Summary
Workflow Documentation
​.claude/skills/aurelio-review-pr/skill.md
New markdown describing a 10-phase PR review pipeline: identify PR/branch, fetch issue context, run local review agents in parallel, fetch external reviewer feedback, consolidate/triage with deduplication/conflict detection, AskUserQuestion approval step, implement approved fixes grouped by file, commit/push safeguards, post-push verification, and final summary. Specifies control flow rules, retry semantics, error signaling, and user prompts.
Design & Architecture
DESIGN_SPEC.md
New comprehensive design spec for a config-driven autonomous-agent "AI Company" framework: architecture, agent system, company structure, communication patterns, workflow engine, memory/persistence, HR/security/approval, model providers, cost/budgeting, templates, technical stack candidates, research notes, risks, and backlog.
Project Foundation
LICENSE, README.md
Adds Business Source License 1.1 (Change Date Feb 27, 2030 → Apache 2.0) and expands README with formal project title, concept, planned features, status, tech stack, and documentation references.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant User
participant LocalAgents as Local Review Agents
participant GH as GitHub API
participant Ext as External Reviewers
participant Repo
User->>GH: Identify PR / branch
GH-->>User: Return PR & issue context
User->>LocalAgents: Run local review agents (parallel)
LocalAgents-->>User: Local findings
User->>GH: Fetch external reviewer feedback (retry semantics)
GH-->>Ext: Request external comments
Ext-->>GH: External feedback (may include outside-diff comments)
GH-->>User: External feedback delivered
User->>User: Consolidate & triage feedback (dedupe, detect conflicts)
User->>User: AskUserQuestion (approve all / selective)
User->>LocalAgents: Implement approved fixes (grouped by file)
LocalAgents->>Repo: Create commits (lint/tests considerations)
Repo-->>User: Push/commit result (safeguards)
User->>GH: Verify external reviewer status post-push
GH-->>User: Final summary & report

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main changes: adding a design specification, license, and project setup files to establish the project foundation.
Description check ✅ Passed The description is well-structured and directly related to the changeset, covering all major file additions (DESIGN_SPEC.md, LICENSE, README updates) and the PR review skill with clear context.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/design-spec

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request establishes the foundational elements for the "AI Company" project by introducing its core design principles, licensing, and initial development tooling. It provides a detailed high-level design specification outlining the architecture and components of the AI agent orchestration framework, defines the project's open-source license, updates the main project README with key information, and adds a sophisticated PR review automation skill.

Highlights

  • Design Specification: A comprehensive 18-section high-level design for the AI Company framework has been added, covering agent system, company structure, communication, task engine, memory, HR, model providers, budget, tools, security, human interaction, templates, technical architecture, research, open questions, and backlog.
  • Licensing: The project now includes a Business Source License (BSL 1.1) with a non-commercial use grant, which will convert to Apache 2.0 on February 27, 2030.
  • Project Setup & Documentation: The README has been updated to include the project concept, planned features, tech stack, and links to documentation, reflecting the current project status.
  • Claude Code Skill: A new Claude Code skill, 'aurelio-review-pr', has been introduced to define a full PR review pipeline for future use, integrating local agents, external feedback, triage, and fix implementation.
Changelog
  • .claude/skills/aurelio-review-pr/skill.md
    • Added a new Claude skill for a comprehensive PR review pipeline.
    • Defined 10 phases for the PR review process, from finding the PR to implementing fixes and verifying external reviewer status.
    • Specified tools allowed for the skill, including Bash, Read, Edit, Write, Grep, Glob, Task, and AskUserQuestion.
    • Included detailed logic for issue linkage, context fetching, parallel local agent execution, external feedback collection, and triage.
    • Outlined rules for fix implementation, commit procedures, and loop prevention.
  • DESIGN_SPEC.md
    • Added a detailed 18-section high-level design specification for the AI Company framework.
    • Included sections on vision, core concepts, agent system, company structure, communication, task engine, memory, HR, model providers, cost management, tools, security, human interaction, templates, technical architecture, research, open questions, and future vision.
    • Provided detailed YAML/JSON examples for agent identity, department configuration, communication settings, task definition, memory levels, budget controls, tool access, autonomy levels, and company templates.
    • Compared the proposed architecture with existing AI agent frameworks and justified a build-from-scratch approach leveraging key libraries.
  • LICENSE
    • Added the Business Source License 1.1.
    • Specified "Aurelio" as the Licensor and "AI Company" as the Licensed Work.
    • Included an Additional Use Grant for non-production use (personal, educational, research, evaluation).
    • Set the Change Date to February 27, 2030, and the Change License to Apache License, Version 2.0.
  • README.md
    • Updated the project title to "AI Company".
    • Added a concept section explaining the framework for orchestrating autonomous AI agents.
    • Listed key planned features such as configurable company structure, deep agent identity, multi-provider support, smart cost management, configurable autonomy, persistent memory, HR system, real tool access, API-first design, and templates.
    • Updated the status to "Design phase" and linked to DESIGN_SPEC.md.
    • Included a planned tech stack section with Python, FastAPI, LiteLLM, Mem0, MCP, Vue 3, and SQLite/PostgreSQL.
    • Added a documentation section linking to the Design Specification.
Activity
  • No human activity has been recorded on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces foundational documents for the project, including a detailed design specification for an AI Company framework, a BSL 1.1 license, and an updated project README. While I've noted minor inconsistencies within the design specification and the aurelio-review-pr skill definition, a critical concern is that the aurelio-review-pr skill contains instructions that could lead to security vulnerabilities. Specifically, it lacks explicit validation for untrusted input extracted from PR bodies, potentially leading to command injection, and does not include safeguards against prompt injection when passing GitHub issue context to review agents. Addressing these inconsistencies and, more importantly, refining the skill definition to mitigate these security risks is crucial for clarity and project security.

Comment on lines +63 to +66
**Fetch issue context.** If an issue reference was found (regardless of warnings), fetch the issue for review context. If the PR body used a full URL (`https://github.com/OWNER/REPO/issues/N`), extract both `OWNER/REPO` and `N` and pass `--repo OWNER/REPO` to query the correct repository:

```bash
gh issue view N --repo OWNER/REPO --json title,body,labels,comments --jq '{title: .title, body: .body, labels: [.labels[].name], comments: [.comments[] | {author: .author.login, body: .body}]}'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The skill instructions direct the agent to extract OWNER/REPO and N from a URL provided in the PR body and use them in a shell command (gh issue view N --repo OWNER/REPO). Since the PR body is untrusted input, an attacker could provide a malicious URL (e.g., https://github.com/owner/repo;injection/issues/1) to perform command injection. The instructions should explicitly require the agent to validate that OWNER/REPO matches the expected repository format and that N is a numeric value before using them in any shell command.

| **comment-analyzer** | Comments or docstrings changed | `pr-review-toolkit:comment-analyzer` |
| **type-design-analyzer** | Type annotations or classes added/modified | `pr-review-toolkit:type-design-analyzer` |

Each agent should receive the list of changed files and focus on reviewing them. **If issue context was collected in Phase 2, include the issue title, body, and key comments in each agent's prompt** so they can verify the PR addresses the issue's requirements.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The skill instructions specify that untrusted data from GitHub (issue title, body, and comments) should be included in the prompts for review agents. This data is not sanitized or wrapped in delimiters, making the agents vulnerable to prompt injection attacks. An attacker could use malicious issue comments to manipulate the behavior of the review agents. The instructions should be updated to require the use of clear delimiters (e.g., XML tags) and to explicitly instruct the sub-agents to treat this content as untrusted data.

```bash
gh api repos/OWNER/REPO/pulls/NUMBER/comments --paginate
```
Extract: author, file path, line number, body, subject_type.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In the list of fields to extract for inline review comments, subject_type is mentioned. However, according to the GitHub API documentation, this field is not present in the response for listing PR review comments (/repos/OWNER/REPO/pulls/NUMBER/comments). This field is associated with commit comments. To ensure the implementation is based on the correct API schema, I suggest removing subject_type from this list.

DESIGN_SPEC.md Outdated
```yaml
memory:
level: "full" # none, session, project, full
backend: "sqlite" # sqlite, postgresql, mem0, file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There appears to be an inconsistency in how Mem0 is defined as part of the memory architecture.

  • Here in Section 7.3, mem0 is listed as a value for the backend configuration, suggesting it's a standalone storage option.
  • However, Section 15.2 describes the stack as Mem0 + SQLite and states Mem0 is for semantic/episodic memory while SQLite is for structured data. This implies Mem0 is a layer that works with a database, not as an alternative to it.

To avoid confusion during implementation, it would be helpful to clarify the relationship. Is Mem0 a memory system that can be configured with different storage backends (like SQLite), or is it a backend itself? The configuration schema should reflect this relationship clearly.

DESIGN_SPEC.md Outdated
Comment on lines +770 to +778
- role_level: "c_suite"
preferred_model: "opus"
fallback: "sonnet"
- role_level: "senior"
preferred_model: "sonnet"
fallback: "haiku"
- role_level: "junior"
preferred_model: "haiku"
fallback: "local-small"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There's an inconsistency in the casing of agent seniority levels across the design document.

  • Section 3.1 (level) and 3.2 (Level) use capitalized values (e.g., Senior, C-Suite).
  • The routing rules here in Section 9.4 use lowercase values (e.g., c_suite, senior, junior).

This could lead to bugs where routing rules don't match agent levels. It would be beneficial to standardize on a single casing convention (e.g., all lowercase) for these values throughout the specification.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR establishes the foundational documentation and legal framework for the AI Company project, a framework for orchestrating autonomous AI agents within a virtual company structure. The PR transitions the repository from a minimal placeholder to a fully specified design-phase project with clear licensing, technical vision, and development tooling.

Changes:

  • Added comprehensive 18-section design specification covering agent system, company structure, communication, tasks, memory, HR, providers, budget, tools, security, human interaction, templates, and technical architecture
  • Implemented Business Source License 1.1 with 4-year conversion to Apache 2.0
  • Updated README with project concept, key features, tech stack, and documentation links
  • Added Claude Code PR review skill for automated review workflow

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
DESIGN_SPEC.md Comprehensive high-level design covering all major system components, with technology choices marked as candidates pending research
LICENSE Business Source License 1.1 with non-production use grant, converting to Apache 2.0 on February 27, 2030
README.md Project overview with concept description, planned features, tech stack, and documentation links
.claude/skills/aurelio-review-pr/skill.md PR review pipeline skill with 10-phase workflow for local agents, external feedback, triage, and fixes

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Aureliolo
Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 27, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/aurelio-review-pr/skill.md:
- Around line 32-34: Several fenced code blocks (e.g., the block containing "gh
repo view --json nameWithOwner -q .nameWithOwner") lack surrounding blank lines
and nearby sentences lack terminal punctuation; update each affected fenced
block by inserting a blank line before and after the triple-backtick fence and
ensure the sentence immediately preceding or following each fence ends with
proper punctuation (period, question mark, or exclamation). Apply the same fixes
to the other similar fences called out (the blocks around the other listed
snippets) so all fenced code blocks have blank lines around them and all nearby
sentences end with punctuation for MD lint stability.

In `@DESIGN_SPEC.md`:
- Around line 76-97: The fenced block showing the company org tree (the
triple-backtick block starting with "Company ├── Departments[] ...") and
multiple other fenced blocks lack a language identifier and several headings
(e.g., "#### C-Suite / Executive") are missing surrounding blank lines; to fix,
add an appropriate fence language token (e.g., ```text, ```yaml, ```json, or
```bash) to every fenced code block such as the "Company" tree and the other
blocks listed, and ensure each heading has a blank line above and below it (for
example insert a blank line before "#### C-Suite / Executive" and a blank line
after the heading), applying this pattern across the noted ranges (lines with
MD040/MD022 warnings) so markdownlint MD040 and MD022 are resolved.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e040da and eeaec53.

📒 Files selected for processing (4)
  • .claude/skills/aurelio-review-pr/skill.md
  • DESIGN_SPEC.md
  • LICENSE
  • README.md
📜 Review details
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to README.md : Update README.md for significant feature changes

Applied to files:

  • README.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Use structured prompts with clear instructions including role definition, constraints, output format (JSON when needed), and context from story state

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Never defer work—do not suggest "this can be done later" or "consider for a future PR". Complete all requested changes fully.

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Never defer work. Do not suggest 'this can be done later' or 'consider for a future PR'. Complete all requested changes fully.

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: After every push, you MUST check that CI passes. If CI fails, fix the issue immediately and push again until all checks are green. Never walk away from a failing CI pipeline.

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Always create a PR for issue work. When implementing changes for a GitHub issue, create a branch and open a pull request. Do not wait to be asked.

Applied to files:

  • .claude/skills/aurelio-review-pr/skill.md
🪛 LanguageTool
DESIGN_SPEC.md

[typographical] ~1-~1: To join two clauses or introduce examples, consider using an em dash.
Context: # AI Company - High-Level Design Specification > A fra...

(DASH_RULE)


[typographical] ~193-~193: To join two clauses or introduce examples, consider using an em dash.
Context: ...### Engineering - Software Architect - System design, technology decisions, pat...

(DASH_RULE)


[typographical] ~194-~194: To join two clauses or introduce examples, consider using an em dash.
Context: ...Frontend Developer** (Junior/Mid/Senior) - UI implementation, components, state man...

(DASH_RULE)


[typographical] ~195-~195: To join two clauses or introduce examples, consider using an em dash.
Context: ...Backend Developer* (Junior/Mid/Senior) - APIs, business logic, databases - **Full...

(DASH_RULE)


[typographical] ~196-~196: To join two clauses or introduce examples, consider using an em dash.
Context: ...ll-Stack Developer** (Junior/Mid/Senior) - End-to-end implementation - **DevOps/SRE...

(DASH_RULE)


[typographical] ~197-~197: To join two clauses or introduce examples, consider using an em dash.
Context: ...implementation - DevOps/SRE Engineer - Infrastructure, CI/CD, monitoring, deplo...

(DASH_RULE)


[typographical] ~198-~198: To join two clauses or introduce examples, consider using an em dash.
Context: ...ring, deployment - Database Engineer - Schema design, query optimization, migra...

(DASH_RULE)


[typographical] ~199-~199: To join two clauses or introduce examples, consider using an em dash.
Context: ...tion, migrations - Security Engineer - Security audits, vulnerability assessmen...

(DASH_RULE)


[style] ~383-~383: Consider using a different adjective to strengthen your wording.
Context: ...#### Pattern 4: Hybrid (Recommended for Full Company) Combines all three: - **Messa...

(FULL_ENTIRE)


[style] ~654-~654: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...onality would complement the team? - What model/provider fits the budget? 3. Cand...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~664-~664: You have used the passive voice repeatedly in nearby sentences. To make your writing clearer and easier to read, consider using active voice.
Context: ... 3. Active tasks are reassigned 4. Team is notified ### 8.3 Performance Tracking ```yaml ...

(REP_PASSIVE_VOICE)


[grammar] ~682-~682: Use a hyphen to join words.
Context: ...ce: - Promotion criteria: sustained high quality scores, task complexity handled,...

(QB_NEW_EN_HYPHEN)


[typographical] ~1377-~1377: In American English, use a period after an abbreviation.
Context: ...pid prototyping | ### 16.2 What Exists vs What We Need | Feature | MetaGPT | Cha...

(MISSING_PERIOD_AFTER_ABBREVIATION)


[typographical] ~1381-~1381: To join two clauses or introduce examples, consider using an em dash.
Context: ...ulation | Partial | Partial | No | Yes - complete | | HR (hiring/firing) | No |...

(DASH_RULE)


[typographical] ~1385-~1385: To join two clauses or introduce examples, consider using an em dash.
Context: ...alities | Basic | Basic | Basic | Deep - traits, styles, evolution | | Dynamic ...

(DASH_RULE)


[typographical] ~1386-~1386: To join two clauses or introduce examples, consider using an em dash.
Context: ... team scaling | No | No | Manual | Yes - auto + manual | | Multiple company typ...

(DASH_RULE)


[typographical] ~1387-~1387: To join two clauses or introduce examples, consider using an em dash.
Context: ...company types | No | No | Manual | Yes - templates + builder | | Security ops a...

(DASH_RULE)


[typographical] ~1389-~1389: To join two clauses or introduce examples, consider using an em dash.
Context: ...ble autonomy | No | No | Limited | Yes - full spectrum | | Local + cloud provid...

(DASH_RULE)


[typographical] ~1390-~1390: To join two clauses or introduce examples, consider using an em dash.
Context: ...rs | Partial | Partial | Partial | **Yes - unified abstraction (LiteLLM candidate)*...

(DASH_RULE)


[typographical] ~1391-~1391: To join two clauses or introduce examples, consider using an em dash.
Context: ...racking per agent | No | No | No | Yes - full budget system | | Progressive tru...

(DASH_RULE)


[typographical] ~1398-~1398: In American English, use a period after an abbreviation.
Context: ...Planned (backlog) | ### 16.3 Build vs Fork Decision **Recommendation: Build ...

(MISSING_PERIOD_AFTER_ABBREVIATION)


[style] ~1432-~1432: Consider using the typographical ellipsis character here instead.
Context: ... | Agent A asks Agent B who asks Agent A... | | 8 | Optimal message bus for local-f...

(ELLIPSIS)


[typographical] ~1433-~1433: In American English, use a period after an abbreviation.
Context: ...architecture? | Medium | asyncio queues vs Redis vs embedded broker | | 9 | How to...

(MISSING_PERIOD_AFTER_ABBREVIATION)


[typographical] ~1433-~1433: In American English, use a period after an abbreviation.
Context: ...ure? | Medium | asyncio queues vs Redis vs embedded broker | | 9 | How to handle c...

(MISSING_PERIOD_AFTER_ABBREVIATION)


[typographical] ~1434-~1434: In American English, use a period after an abbreviation.
Context: ...y? | High | Sandboxing strategy, Docker vs WASM vs subprocess | | 10 | What's the ...

(MISSING_PERIOD_AFTER_ABBREVIATION)


[typographical] ~1434-~1434: In American English, use a period after an abbreviation.
Context: ...h | Sandboxing strategy, Docker vs WASM vs subprocess | | 10 | What's the minimum ...

(MISSING_PERIOD_AFTER_ABBREVIATION)


[style] ~1509-~1509: To make your text as clear as possible to all readers, do not use this foreign term. Possible alternatives are ‘in fact’ or ‘in reality’.
Context: ...rop | | OpenAI API format | OpenAI (de facto standard) | LLM API interface | Via pro...

(DE_FACTO)


[style] ~1509-~1509: This phrase is redundant (‘I’ stands for ‘interface’). Use simply “API”.
Context: ...at** | OpenAI (de facto standard) | LLM API interface | Via provider abstraction layer (LiteL...

(ACRONYM_TAUTOLOGY)


[typographical] ~1513-~1513: To join two clauses or introduce examples, consider using an em dash.
Context: ...s://github.com/FoundationAgents/MetaGPT) - Multi-agent SOP framework (64.5k stars) ...

(DASH_RULE)


[typographical] ~1514-~1514: To join two clauses or introduce examples, consider using an em dash.
Context: ...2.0](https://github.com/openbmb/ChatDev) - Zero-code multi-agent platform (31.2k st...

(DASH_RULE)


[typographical] ~1515-~1515: To join two clauses or introduce examples, consider using an em dash.
Context: ...AI](https://github.com/crewAIInc/crewAI) - Role-based agent collaboration framework...

(DASH_RULE)


[typographical] ~1516-~1516: To join two clauses or introduce examples, consider using an em dash.
Context: ...n](https://github.com/microsoft/autogen) - Microsoft async multi-agent framework - ...

(DASH_RULE)


[typographical] ~1517-~1517: To join two clauses or introduce examples, consider using an em dash.
Context: ...LLM](https://github.com/BerriAI/litellm) - Unified LLM API gateway (100+ providers)...

(DASH_RULE)


[typographical] ~1518-~1518: To join two clauses or introduce examples, consider using an em dash.
Context: ...- Mem0 - Universal memory layer for AI agents - [...

(DASH_RULE)


[typographical] ~1519-~1519: To join two clauses or introduce examples, consider using an em dash.
Context: ...ocol](https://github.com/a2aproject/A2A) - Agent-to-Agent protocol (Linux Foundatio...

(DASH_RULE)


[typographical] ~1520-~1520: To join two clauses or introduce examples, consider using an em dash.
Context: ...extprotocol.io/specification/2025-11-25) - Model Context Protocol - [Langfuse Agent...

(DASH_RULE)


[typographical] ~1521-~1521: To join two clauses or introduce examples, consider using an em dash.
Context: ...com/blog/2025-03-19-ai-agent-comparison) - Framework comparison - [Confluent Event-...

(DASH_RULE)


[typographical] ~1522-~1522: To join two clauses or introduce examples, consider using an em dash.
Context: .../blog/event-driven-multi-agent-systems/) - Multi-agent architecture patterns - [Mic...

(DASH_RULE)


[typographical] ~1523-~1523: To join two clauses or introduce examples, consider using an em dash.
Context: ....io/multi-agent-reference-architecture/) - Enterprise patterns - [OpenRouter](https...

(DASH_RULE)


[typographical] ~1524-~1524: To join two clauses or introduce examples, consider using an em dash.
Context: ...s - OpenRouter - Multi-model API gateway

(DASH_RULE)

README.md

[typographical] ~11-~11: To join two clauses or introduce examples, consider using an em dash.
Context: ...s (Planned) - Any Company Structure - From a 2-person startup to a 50+ enterpr...

(DASH_RULE)


[typographical] ~12-~12: To join two clauses or introduce examples, consider using an em dash.
Context: ...nfig/templates - Deep Agent Identity - Names, personalities, skills, seniority ...

(DASH_RULE)


[typographical] ~13-~13: To join two clauses or introduce examples, consider using an em dash.
Context: ...erformance tracking - Multi-Provider - Anthropic Claude, OpenRouter (400+ model...

(DASH_RULE)


[typographical] ~14-~14: To join two clauses or introduce examples, consider using an em dash.
Context: ... via LiteLLM - Smart Cost Management - Per-agent budget tracking, auto model ro...

(DASH_RULE)


[typographical] ~15-~15: To join two clauses or introduce examples, consider using an em dash.
Context: ...optimization - Configurable Autonomy - From fully autonomous to human-approves-...

(DASH_RULE)


[typographical] ~16-~16: To join two clauses or introduce examples, consider using an em dash.
Context: ...agent in between - Persistent Memory - Agents remember past decisions, code, re...

(DASH_RULE)


[typographical] ~17-~17: To join two clauses or introduce examples, consider using an em dash.
Context: ...relationships (via Mem0) - HR System - Hire, fire, promote agents. HR agent ana...

(DASH_RULE)


[typographical] ~18-~18: To join two clauses or introduce examples, consider using an em dash.
Context: ...tem, git, code execution, web, databases - role-based and sandboxed - API-First...

(DASH_RULE)


[grammar] ~20-~20: Please add a punctuation mark at the end of paragraph.
Context: ...built company templates and interactive builder ## Status Design phase. See [DESI...

(PUNCTUATION_PARAGRAPH_END)


[typographical] ~37-~37: To join two clauses or introduce examples, consider using an em dash.
Context: ...- Design Specification - Full high-level design

(DASH_RULE)

.claude/skills/aurelio-review-pr/skill.md

[style] ~107-~107: The phrase ‘Look for patterns’ is used very frequently. Consider using a less frequent alternative to set your writing apart from others.
Context: ... lines are outside the PR's diff range. Look for patterns like "Outside diff range comments (N)" ...

(LOOK_FOR_STYLE)


[style] ~121-~121: This word has been used in one of the immediately preceding sentences. Using a synonym could make your text more interesting to read, unless the repetition is intentional.
Context: ...tionable items, not just summaries). Important: Use gh api with --jq for filteri...

(EN_REPEATEDWORDS_IMPORTANT)


[style] ~123-~123: This word has been used in one of the immediately preceding sentences. Using a synonym could make your text more interesting to read, unless the repetition is intentional.
Context: ...omplex Python scripts to parse JSON. Important: When review bodies are large (e.g. C...

(EN_REPEATEDWORDS_IMPORTANT)


[grammar] ~139-~139: Please add a punctuation mark at the end of paragraph.
Context: ...heck against CLAUDE.md rules and actual code Deduplication: If multiple sources...

(PUNCTUATION_PARAGRAPH_END)


[grammar] ~172-~172: Please add a punctuation mark at the end of paragraph.
Context: ... rely on pre-push hooks and CI for full coverage ## Phase 8: Commit and push After all...

(PUNCTUATION_PARAGRAPH_END)


[style] ~181-~181: Consider using a different verb for a more formal wording.
Context: .... If commit or push fails due to hooks, fix the actual issue and create a NEW commi...

(FIX_RESOLVE)

🪛 markdownlint-cli2 (0.21.0)
DESIGN_SPEC.md

[warning] 76-76: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 178-178: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


[warning] 185-185: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


[warning] 192-192: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


[warning] 201-201: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


[warning] 207-207: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


[warning] 212-212: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


[warning] 218-218: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


[warning] 256-256: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 332-332: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 352-352: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 365-365: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 492-492: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 602-602: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 693-693: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 947-947: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 1020-1020: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 1039-1039: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 1133-1133: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 1180-1180: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 1238-1238: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 1487-1487: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

.claude/skills/aurelio-review-pr/skill.md

[warning] 32-32: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 102-102: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 104-104: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 110-110: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 112-112: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 116-116: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 118-118: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🔇 Additional comments (2)
README.md (1)

1-37: README update aligns well with this PR scope.

The document now matches the current project state and points readers to the design spec clearly.

Based on learnings: Applies to README.md : Update README.md for significant feature changes.

LICENSE (1)

6-25: BSL parameterization looks consistent and complete.

Line 23 and Line 24 correctly define the conversion trigger and target license, and the use-grant language is explicit.

…d Copilot

- Add input validation for OWNER/REPO and issue number extracted from PR bodies (command injection fix)
- Add XML delimiters and untrusted-data instructions for issue context passed to sub-agents (prompt injection fix)
- Remove incorrect subject_type field from PR comments extraction
- Clarify Mem0 relationship: it's a memory layer on top of backends, not a backend itself
- Standardize agent level casing in routing rules to match Section 3.1/3.2 conventions
- Add language identifiers to all fenced code blocks (MD040)
- Add blank lines around headings and fenced code blocks (MD022/MD031)
@Aureliolo Aureliolo merged commit 8669a09 into main Feb 27, 2026
1 check was pending
@Aureliolo Aureliolo deleted the feat/design-spec branch February 27, 2026 13:48
Aureliolo added a commit that referenced this pull request Mar 5, 2026
…eviewers

Source changes (tracker.py):
- Fix budget_used_percent/alert_level threshold disagreement (#1)
- Add logger.warning before ValueError in _validate_time_range (#2)
- Split build_summary into _build_agent_spendings, _build_dept_spendings,
  _build_budget_context helpers to meet <50 line guideline (#3)
- Replace _aggregate tuple return with _AggregateResult NamedTuple (#4)
- Rewrite _aggregate as single-pass loop (#5)
- Aggregate departments from AgentSpending objects, not raw records (#6)
- Add DEBUG log in __init__ for object creation (#7)
- Fix module docstring "Section 10.2 service layer" accuracy (#15)
- Add DEBUG entry logs for get_total_cost/get_agent_cost (#16)

Event constants (events.py):
- Add BUDGET_TRACKER_CREATED, BUDGET_TOTAL_COST_QUERIED,
  BUDGET_AGENT_COST_QUERIED, BUDGET_TIME_RANGE_INVALID

Test improvements (test_tracker.py):
- Assert budget_used_percent in all alert-level tests (#8)
- Assert budget_total_monthly in configured summary test (#9)
- Add start==end raises ValueError test (#10)
- Add end-only time filter test for get_total_cost (#11)
- Add multi-record token aggregation assertions (#12)
- Add pytestmark = pytest.mark.timeout(30) (#13)
- Move @pytest.mark.unit to class level (#14)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aureliolo added a commit that referenced this pull request Mar 6, 2026
…t, Gemini, and greptile

- Add produced_artifacts field to AgentRunResult (#1)
- Wrap _log_completion in try/except to preserve valid results (#2)
- Add test for inner TimeoutError propagation without engine timeout (#3)
- Extract _run_loop_with_timeout from _execute (50-line limit) (#4)
- Extract _validate_run_inputs from run() (50-line limit) (#5)
- Rename metrics docstrings from "completed task" to "agent run" + add termination_reason to metrics event (#6)
- Fix raise exc from build_exc chain direction (#7)
- Replace asyncio.wait_for with asyncio.wait for timeout disambiguation (#8)
- Add test for _apply_post_execution_transitions failure resilience (#9)
- Add test for timeout cost recording behavior (#10)
- Fix hardcoded from_status in transition logs (#11)
- Add agent-task ownership check in _validate_task (#12)
- Split test_invalid_timeout_raises into two test methods (#13)
- Add negative validation tests for cost_per_task/duration_seconds (#14)
- Add test_blank_task_id_rejected (#15)
- Update _execute docstring to mention timeout, transitions, metrics (#16)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aureliolo added a commit that referenced this pull request Mar 7, 2026
…, and Copilot

Source fixes:
- Move asyncio/types out of TYPE_CHECKING for PEP 649 compatibility (#1, #2)
- Guard is_closing() inside try/except in _process_cleanup.py (#4)
- Normalize all control chars (incl. newlines/tabs) in _sanitize_stderr (#6)
- Apply stderr sanitization to sandbox git path too (#3)
- Fix list_directory truncation to use raw scan cap, not post-classification count (#7)
- Narrow except to InvalidStateError + log task identity in shutdown (#8)
- Add loop.stop() fallback when request_shutdown() fails in signal handlers (#9)
- Include zombie diagnostic in _drain_after_kill stderr output (#10)
- Remove @staticmethod from _log_post_cancel_exceptions (#17)
- Add _process_cleanup.py to DESIGN_SPEC §15.3 (#12) and §11.1.1 (#16)

Test additions:
- New test_process_cleanup.py: 7 tests covering all transport states (#5)
- Add _sanitize_stderr truncation test (#14)
- Add _log_post_cancel_exceptions tests (4 tests) (#11)
- Add signal handler recovery tests (3 tests) (#15)
- Use ValidationError instead of ValueError in config tests (#13)
- Update existing tests for new sanitization behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aureliolo added a commit that referenced this pull request Mar 9, 2026
- Add routing optimization feature (#1): new suggest_routing_optimizations()
  method, RoutingSuggestion and RoutingOptimizationAnalysis models
- Add negative estimated_cost_usd validation (#2)
- Fix double snapshot in generate_report (#3)
- Fix deviation_factor to use spike_ratio when stddev=0 (#4)
- Convert DowngradeAnalysis.total_estimated_savings_per_1k to @computed_field (#5)
- Change str to NotBlankStr in SpendingReport tuple fields (#6)
- Add window_count upper bound validation (#7)
- Pre-group records by agent for O(N+M) complexity (#8)
- Update DESIGN_SPEC.md implementation snapshot (#9)
- Use projected alert level for auto-deny check (#11)
- Move approval log after ApprovalDecision construction (#12)
- Add ReportGenerator.__init__ debug log + event constant (#13)
- Fix _ALERT_LEVEL_ORDER comment (#14)
- Fix _classify_severity docstring for dual-use (#15)
- Add WARNING logs before ValueError raises (#16)
- Update evaluate_operation docstring (#17)
- Add sort-order validator to EfficiencyAnalysis.agents (#18)
- Add debug log when _find_most_used_model returns None (#19)
- Remove redundant stddev > 0 check in is_sigma_anomaly (#20)
- Document approval_warn_threshold_usd=0.0 behavior (#21)
- Extract helpers to _optimizer_helpers.py to stay under 800-line limit
Aureliolo added a commit that referenced this pull request Mar 10, 2026
…reptile

- Cap prompt_tokens instead of rejecting when heuristic exceeds actual (#2)
- Log policy_length instead of policy content to avoid leaks (#6)
- Sort _ACTION_VERBS for deterministic regex alternation (#8)
- Use PROMPT_POLICY_VALIDATION_FAILED event for advisory failures (#12)
- Add isinstance check and strip whitespace in TagBasedMemoryFilter (#13)
- Use MEMORY_FILTER_INIT event for filter-init log paths (#14, #16)
- Remove content_preview from store_guard warning log (#18)
- Track tools section conditionally for custom templates (#10)
- Reconcile enforced vs advisory wording in DESIGN_SPEC (#1)
- Inject fixed estimated_tokens in prompt-ratio test (#19)
- Parametrize timeout and org_policies tests (#20, #22)
- Add prompt_token_ratio assertion in lifecycle test (#21)
Aureliolo added a commit that referenced this pull request Mar 16, 2026
Security fixes:
- Mask sensitive values in validation error messages (#2)
- Sanitize SettingsEncryptionError in API response (#3)
- Don't cache decrypted plaintext secrets (#4)
- Validate namespace against SettingNamespace enum at controller (#8)
- Error on empty SYNTHORG_SETTINGS_KEY instead of silently ignoring (#9)

Persistence fixes:
- V9 migration: prefer settings_old as copy source when both exist (#1)
- Remove redundant idx_settings_namespace index (PK covers it) (#15)
- Fix updated_at format consistency (ISO 8601 with timezone) (#17)
- Align get_setting/set_setting types with protocol (NotBlankStr) (#10)

Performance:
- Use batch repo methods in get_all/get_namespace (eliminate N+1) (#6)

Code quality:
- Fix Any type to SettingsService on create_app parameter (#5)
- Add SETTINGS_NOTIFICATION_FAILED + SETTINGS_FETCH_FAILED events (#7,#16)
- Fix wrong event constants in repo error paths and notification (#7,#16)
- Remove duplicate SettingNotFoundError guard in get_entry (#21)
- Log SETTINGS_VALUE_RESOLVED for all resolution sources (#22)

Tests:
- Add V9 migration crash-safety tests (#11)
- Add tests for sensitive read without encryptor (#12)
- Add tests for get_all/get_namespace service methods (#13)
- Add tests for notification exception handling (#14)
- Use pydantic.ValidationError instead of broad Exception (#18)
- Add INTEGER, JSON, and sensitive-masking validation tests (#19)
- Use actual Unicode characters in encryption roundtrip test (#20)
Aureliolo added a commit that referenced this pull request Mar 16, 2026
Security fixes:
- Mask sensitive values in validation error messages (#2)
- Sanitize SettingsEncryptionError in API response (#3)
- Don't cache decrypted plaintext secrets (#4)
- Validate namespace against SettingNamespace enum at controller (#8)
- Error on empty SYNTHORG_SETTINGS_KEY instead of silently ignoring (#9)

Persistence fixes:
- V9 migration: prefer settings_old as copy source when both exist (#1)
- Remove redundant idx_settings_namespace index (PK covers it) (#15)
- Fix updated_at format consistency (ISO 8601 with timezone) (#17)
- Align get_setting/set_setting types with protocol (NotBlankStr) (#10)

Performance:
- Use batch repo methods in get_all/get_namespace (eliminate N+1) (#6)

Code quality:
- Fix Any type to SettingsService on create_app parameter (#5)
- Add SETTINGS_NOTIFICATION_FAILED + SETTINGS_FETCH_FAILED events (#7,#16)
- Fix wrong event constants in repo error paths and notification (#7,#16)
- Remove duplicate SettingNotFoundError guard in get_entry (#21)
- Log SETTINGS_VALUE_RESOLVED for all resolution sources (#22)

Tests:
- Add V9 migration crash-safety tests (#11)
- Add tests for sensitive read without encryptor (#12)
- Add tests for get_all/get_namespace service methods (#13)
- Add tests for notification exception handling (#14)
- Use pydantic.ValidationError instead of broad Exception (#18)
- Add INTEGER, JSON, and sensitive-masking validation tests (#19)
- Use actual Unicode characters in encryption roundtrip test (#20)
Aureliolo added a commit that referenced this pull request Mar 17, 2026
… reviewers

Source fixes:
- Use NotBlankStr | None for DualModeConfig.summarization_model (#1)
- Parallelize LLM calls with asyncio.TaskGroup in summarize_batch
  and _build_content (#2)
- Remove dead-code guards in _build_anchors (#3)
- Narrow except Exception to re-raise non-retryable ProviderErrors (#4)
- Fix double-logging on abstractive fallback (#9)
- Remove unnecessary import builtins (#10)
- Preserve key-value pairs verbatim in extractive mode (#5)
- Emit extracted facts one per line (#6)
- Strengthen ConsolidationResult validator with cross-field checks (#7)
- Check _backend.delete() return value in _process_group (#8)
- Fix mode_map type to dict[NotBlankStr, ArchivalMode] (#11)
- Move tie-breaking comment to _determine_group_mode (#12)
- Fix misleading DualModeConfig docstring (#13)
- Add missing mkdocstrings entries for retention/archival/simple_strategy (#14)
- Use O(M) lookup dict in _archive_entries (#15)
- Document 1000-entry query limit in run_consolidation docstring (#16)
- Add Raises section to AbstractiveSummarizer docstring (#17)

Test fixes:
- Fix imports in test_density.py to module level (#18)
- Strengthen fallback assertion to verify exact content (#19)
- Use exact call counts for summarizer/extractor (#20)
- Add tests: blank model rejection, MemoryError/RecursionError propagation (#21, #22)
- Add tests: validator rejects invalid archival state (#23)
- Add tests: 50/50 tie-breaking, None relevance handling (#24, #25)
- Assert actual preserved facts in extractive tests (#26)
- Prove archival index keyed by original_id not position (#27)
- Add test: empty string classifies as SPARSE (#28)
Aureliolo added a commit that referenced this pull request Mar 20, 2026
Security hardening:
- Service catch-all now respects configured error policy (DENY/ESCALATE)
  instead of unconditionally returning rule verdict (#1)
- Default error policy changed from USE_RULE_VERDICT to ESCALATE (#30)
- XML-delimited prompt to resist prompt injection (#6)
- Configurable reason visibility (full/generic/category) to prevent
  adversarial feedback loop via LLM reason leakage (#7)
- Configurable argument truncation strategy (whole_string/per_value/
  keys_and_values) to prevent padding bypass (#28)
- Reason field sanitized for newlines/control chars (#16)
- MemoryError/RecursionError guard added to engine._safe_evaluate (#2)

Code quality:
- Enum values and mappings derived from source enums (no drift) (#9)
- MappingProxyType for read-only lookup dicts (#10)
- _call_llm and _select_provider refactored under 50-line limit (#11)
- Warning log when _select_model falls back to provider name (#13)
- Debug log on JSON serialization fallback (#14)
- error_type/error_message in structured exception logging (#15)
- USE_RULE_VERDICT annotates reason with failure context (#8)
- Em-dashes replaced with ASCII dashes (#5)
- Module/method docstrings corrected (#4, #12, #24, #25)
- LlmFallbackConfig/LlmFallbackErrorPolicy re-exported (#3)
- agent_visible_reason field on SecurityVerdict for invoker (#7)

Tests:
- Parametrized verdict parsing and error policy tests (#27)
- Timeout test uses asyncio.Future instead of real sleep (#23)
- Truncation assertion tightened from 10000 to 600 (#20)
- Fixed mismatched matched_rules in integration helper (#21)
- Added: DENY+LOW confidence safety net test (#18)
- Added: MemoryError propagation test
- Added: reason sanitization test
- Added: USE_RULE_VERDICT annotation test
- Added: reason visibility config test
- Added: per-value truncation test
- Added: XML delimiter test
- Added: audit entry confidence assertion (#26)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aureliolo added a commit that referenced this pull request Mar 21, 2026
… Gemini

- Fix step order: tame third-party loggers before applying per-logger
  levels so user config overrides take precedence (#1)
- Replace contextlib.suppress(Exception) with try/except + stderr
  warning to match _clear_root_handlers error handling pattern (#2)
- Fix misleading docstring: clarify litellm attribute suppression is
  conditional while handler cleanup runs unconditionally (#3)
- Change PR body from Closes #72 to Related to #72 (already closed
  by PR #73) (#4)
- Add precondition to test_litellm_suppress_debug_info_enabled (#5)
- Add suppress_debug_info to CLAUDE.md observability description (#6)
- Add test for handler.close() failure warning to stderr (#7)
- Add test for removing multiple handlers from single logger (#8)
- Add autouse fixture to reset third-party logger state between
  tests (#9)
- Strengthen test_skips_litellm_when_not_imported to verify handler
  and level cleanup still works (#10)
- Add test for non-LiteLLM handler removal (httpx) (#11)
- Update docs/design/operations.md step number and override note

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aureliolo added a commit that referenced this pull request Mar 29, 2026
…mini

Security: sanitize CSS keyframe names from dynamic edge IDs (#2), replace
hardcoded rgba() with design token (#1). Correctness: guard rollback against
stale overwrites (#6), filter orphan edges in force view (#4), use
collision-safe edge IDs (#3), restrict drag to agent nodes only (#5), fix
pagination offset to use data.length (#8), clamp dashDuration to bounds (#10).
Robustness: add AbortController for fetch cancellation (#9), move announce()
out of state updater (#7), add .catch() on fetchCompanyData (#29), log dept
health errors (#28). Docs: fix WS channel table (#11-12), classify PATCH
endpoint correctly (#13), update ux-guidelines version (#14), update
operations.md summary (#15). Tests: fix misleading test description (#16),
add stale rollback regression test (#24). Polish: extract magic numbers (#21-
22), add JSDoc on CommunicationEdgeData fields (#18-19), simplify Pick<>
type (#20), use Tailwind in stories (#17), add console.warn to empty catch
blocks (#25).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aureliolo added a commit that referenced this pull request Mar 29, 2026
…mini

Security: sanitize CSS keyframe names from dynamic edge IDs (#2), replace
hardcoded rgba() with design token (#1). Correctness: guard rollback against
stale overwrites (#6), filter orphan edges in force view (#4), use
collision-safe edge IDs (#3), restrict drag to agent nodes only (#5), fix
pagination offset to use data.length (#8), clamp dashDuration to bounds (#10).
Robustness: add AbortController for fetch cancellation (#9), move announce()
out of state updater (#7), add .catch() on fetchCompanyData (#29), log dept
health errors (#28). Docs: fix WS channel table (#11-12), classify PATCH
endpoint correctly (#13), update ux-guidelines version (#14), update
operations.md summary (#15). Tests: fix misleading test description (#16),
add stale rollback regression test (#24). Polish: extract magic numbers (#21-
22), add JSDoc on CommunicationEdgeData fields (#18-19), simplify Pick<>
type (#20), use Tailwind in stories (#17), add console.warn to empty catch
blocks (#25).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aureliolo added a commit that referenced this pull request Apr 2, 2026
- Remove docs version banner (user request)
- Fix docs sidebar rectangles -- transparent background (user request)
- Add competitor entry validation in generate_comparison.py (#2)
- Fix compare.astro: import.meta.url path, type imports, validation (#3)
- Export TypeScript interfaces from ComparisonTable.tsx (#4)
- Update CLAUDE.md CI section with comparison generation step (#5)
- Add keyboard accessibility + aria-sort to sort headers (#6)
- Add role=img + aria-label to SupportIcon (#7)
- Add license and feature-set filter dropdowns (#8)
- Escape JSON-LD output to prevent XSS (#9)
- Add aria-pressed to category filter buttons (#10)
- Warn on unknown dimension/category/support keys (#11, #12, #15)
- Remove unused meta.version from competitors.yaml (#13)
- Fix reCAPTCHA: IntersectionObserver fallback + onerror handler (#14)
- Clarify YAML comments (#16, #17)
- Add 28 Python tests for generate_comparison.py (#18)
- Add 13 React tests for ComparisonTable.tsx (#18)
- Set up Vitest test infrastructure for site/
Aureliolo added a commit that referenced this pull request Apr 2, 2026
- Remove docs version banner (user request)
- Fix docs sidebar rectangles -- transparent background (user request)
- Add competitor entry validation in generate_comparison.py (#2)
- Fix compare.astro: import.meta.url path, type imports, validation (#3)
- Export TypeScript interfaces from ComparisonTable.tsx (#4)
- Update CLAUDE.md CI section with comparison generation step (#5)
- Add keyboard accessibility + aria-sort to sort headers (#6)
- Add role=img + aria-label to SupportIcon (#7)
- Add license and feature-set filter dropdowns (#8)
- Escape JSON-LD output to prevent XSS (#9)
- Add aria-pressed to category filter buttons (#10)
- Warn on unknown dimension/category/support keys (#11, #12, #15)
- Remove unused meta.version from competitors.yaml (#13)
- Fix reCAPTCHA: IntersectionObserver fallback + onerror handler (#14)
- Clarify YAML comments (#16, #17)
- Add 28 Python tests for generate_comparison.py (#18)
- Add 13 React tests for ComparisonTable.tsx (#18)
- Set up Vitest test infrastructure for site/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants