feat: implement Task, Project, Artifact, Budget, and Cost domain models#71
…ls (#57, #61)
Add remaining M1 domain models across two packages:
- core/: Task lifecycle with state machine validation, Project with team and deadline tracking, Artifact for produced work items, plus 6 new enums (TaskStatus, TaskType, Priority, Complexity, ArtifactType, ProjectStatus)
- budget/: BudgetConfig with alert thresholds and auto-downgrade settings, BudgetHierarchy for Company→Department→Team allocation, immutable CostRecord for per-API-call tracking, and SpendingSummary for aggregated reporting by agent, department, and time period
All models follow frozen Pydantic v2 patterns with comprehensive validation. 502 unit tests passing, mypy strict clean.
Closes #57, closes #61
📝 Walkthrough (CodeRabbit)
Adds frozen Pydantic domain models and enums for budgeting/cost tracking and for core project/task/artifact domains, a task-state transition map and validator, a shared rounding constant, expanded package-level exports, and extensive unit tests and factories for both domains.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces foundational domain models for managing tasks, projects, artifacts, and a detailed budget and cost tracking system. The changes establish core entities and their relationships, along with robust validation logic and state machine transitions, laying the groundwork for future agentic functionalities and financial oversight within the AI company framework.
Code Review
This pull request introduces a comprehensive set of domain models for Tasks, Projects, Artifacts, Budgets, and Costs, which is a significant and well-executed feature implementation. The use of Pydantic for data validation and creating immutable models is excellent, and the accompanying unit tests are thorough and well-written, providing high confidence in the new code.
My review focuses on a few opportunities to improve consistency in the validation logic across the new models. Specifically, I've suggested using collections.Counter for detecting duplicate items in lists, which is already used in some parts of the new code. Adopting this pattern everywhere will make the codebase more consistent and provide more informative error messages for developers.
Overall, this is a high-quality contribution that lays a solid foundation for the core domain.
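As a rough illustration of the collections.Counter pattern this review recommends, the sketch below factors the duplicate check into a standalone helper. The names find_duplicates and check_unique are hypothetical and do not appear in the PR; the error message format simply mirrors the suggestions that follow.

```python
from collections import Counter
from collections.abc import Iterable


def find_duplicates(items: Iterable[str]) -> list[str]:
    """Return the sorted values that occur more than once in *items*."""
    return sorted(item for item, count in Counter(items).items() if count > 1)


def check_unique(items: Iterable[str], field_name: str) -> None:
    """Raise ValueError naming every duplicated value (hypothetical helper)."""
    dupes = find_duplicates(items)
    if dupes:
        msg = f"Duplicate entries in {field_name}: {dupes}"
        raise ValueError(msg)
```

A shared helper like this would let each model validator reduce to a single call, though whether the project wants such a helper is a design choice the review does not address.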
src/ai_company/budget/config.py
Outdated
    def _validate_no_duplicate_source_aliases(self) -> Self:
        """Ensure each source alias maps to exactly one target."""
        sources = [source for source, _ in self.downgrade_map]
        if len(sources) != len(set(sources)):
            seen: set[str] = set()
            dupes: list[str] = []
            for s in sources:
                if s in seen:
                    dupes.append(s)
                seen.add(s)
            msg = f"Duplicate source aliases in downgrade_map: {sorted(set(dupes))}"
            raise ValueError(msg)
        return self
For consistency with other validators in the codebase (e.g., in hierarchy.py) and for improved conciseness, consider using collections.Counter to detect duplicate source aliases. It simplifies the logic for finding and reporting duplicates. You'll need to add from collections import Counter at the top of the file.
    def _validate_no_duplicate_source_aliases(self) -> Self:
        """Ensure each source alias maps to exactly one target."""
        from collections import Counter
        sources = [source for source, _ in self.downgrade_map]
        if len(sources) != len(set(sources)):
            dupes = sorted(item for item, count in Counter(sources).items() if count > 1)
            msg = f"Duplicate source aliases in downgrade_map: {dupes}"
            raise ValueError(msg)
        return self

    def _validate_unique_agent_ids(self) -> Self:
        """Ensure no duplicate agent_id values in by_agent."""
        ids = [a.agent_id for a in self.by_agent]
        if len(ids) != len(set(ids)):
            msg = "Duplicate agent_id values in by_agent"
            raise ValueError(msg)
        return self
This validation is correct, but for consistency with other parts of the codebase (like hierarchy.py) and to provide a more helpful error message, consider using collections.Counter to identify and list the specific duplicate agent_id values.
    @model_validator(mode="after")
    def _validate_unique_agent_ids(self) -> Self:
        """Ensure no duplicate agent_id values in by_agent."""
        from collections import Counter
        ids = [a.agent_id for a in self.by_agent]
        if len(ids) != len(set(ids)):
            dupes = sorted(item for item, count in Counter(ids).items() if count > 1)
            msg = f"Duplicate agent_id values in by_agent: {dupes}"
            raise ValueError(msg)
        return self
    def _validate_unique_department_names(self) -> Self:
        """Ensure no duplicate department_name values in by_department."""
        names = [d.department_name for d in self.by_department]
        if len(names) != len(set(names)):
            msg = "Duplicate department_name values in by_department"
            raise ValueError(msg)
        return self
Similar to the agent ID validation, this check for duplicate department names can be improved. Using collections.Counter would make it consistent with other validators and provide a more informative error message that lists the duplicate names.
    @model_validator(mode="after")
    def _validate_unique_department_names(self) -> Self:
        """Ensure no duplicate department_name values in by_department."""
        from collections import Counter
        names = [d.department_name for d in self.by_department]
        if len(names) != len(set(names)):
            dupes = sorted(item for item, count in Counter(names).items() if count > 1)
            msg = f"Duplicate department_name values in by_department: {dupes}"
            raise ValueError(msg)
        return self
src/ai_company/core/project.py
Outdated
    def _validate_no_duplicate_team_members(self) -> Self:
        """Ensure no duplicate agent IDs in team."""
        if len(self.team) != len(set(self.team)):
            msg = "Duplicate entries in team"
            raise ValueError(msg)
        return self
To improve consistency across the new models and provide more helpful error messages, consider using collections.Counter here to find and list the duplicate agent IDs in the team field. This pattern is already used effectively in the budget.hierarchy module.
    @model_validator(mode="after")
    def _validate_no_duplicate_team_members(self) -> Self:
        """Ensure no duplicate agent IDs in team."""
        if len(self.team) != len(set(self.team)):
            from collections import Counter
            dupes = sorted(item for item, count in Counter(self.team).items() if count > 1)
            msg = f"Duplicate entries in team: {dupes}"
            raise ValueError(msg)
        return self
src/ai_company/core/project.py
Outdated
    def _validate_no_duplicate_task_ids(self) -> Self:
        """Ensure no duplicate task IDs."""
        if len(self.task_ids) != len(set(self.task_ids)):
            msg = "Duplicate entries in task_ids"
            raise ValueError(msg)
        return self
For consistency and better debugging, this validator could be updated to use collections.Counter to list the specific duplicate task_ids. This provides a more informative error message, similar to the validation logic in other new models.
    @model_validator(mode="after")
    def _validate_no_duplicate_task_ids(self) -> Self:
        """Ensure no duplicate task IDs."""
        if len(self.task_ids) != len(set(self.task_ids)):
            from collections import Counter
            dupes = sorted(item for item, count in Counter(self.task_ids).items() if count > 1)
            msg = f"Duplicate entries in task_ids: {dupes}"
            raise ValueError(msg)
        return self
src/ai_company/core/task.py
Outdated
    def _validate_no_duplicate_dependencies(self) -> Self:
        """Ensure no duplicate task IDs in dependencies."""
        if len(self.dependencies) != len(set(self.dependencies)):
            msg = "Duplicate entries in dependencies"
            raise ValueError(msg)
        return self
To improve consistency with other models in this PR and provide more useful validation errors, consider using collections.Counter to find and report the specific duplicate task IDs in the dependencies list.
    @model_validator(mode="after")
    def _validate_no_duplicate_dependencies(self) -> Self:
        """Ensure no duplicate task IDs in dependencies."""
        if len(self.dependencies) != len(set(self.dependencies)):
            from collections import Counter
            dupes = sorted(item for item, count in Counter(self.dependencies).items() if count > 1)
            msg = f"Duplicate entries in dependencies: {dupes}"
            raise ValueError(msg)
        return self
src/ai_company/core/task.py
Outdated
    def _validate_no_duplicate_reviewers(self) -> Self:
        """Ensure no duplicate agent IDs in reviewers."""
        if len(self.reviewers) != len(set(self.reviewers)):
            msg = "Duplicate entries in reviewers"
            raise ValueError(msg)
        return self
For consistency and better error reporting, this validator for duplicate reviewers could also be updated to use collections.Counter. This would list the specific duplicate agent IDs, making it easier to debug.
    @model_validator(mode="after")
    def _validate_no_duplicate_reviewers(self) -> Self:
        """Ensure no duplicate agent IDs in reviewers."""
        if len(self.reviewers) != len(set(self.reviewers)):
            from collections import Counter
            dupes = sorted(item for item, count in Counter(self.reviewers).items() if count > 1)
            msg = f"Duplicate entries in reviewers: {dupes}"
            raise ValueError(msg)
        return self
Pull request overview
Implements the Task/Project/Artifact core domain models and Budget/Cost domain models described in DESIGN_SPEC Sections 6 and 10, with accompanying enums, validation logic, and unit test coverage.
Changes:
- Added core domain models for tasks, projects, artifacts, plus a task lifecycle transition map/validator and related enums.
- Added budget domain models for configuration, hierarchy allocation, per-call cost records, and aggregated spending summaries (plus budget alert enum).
- Added extensive unit test coverage and fixtures/factories for the new models.
Reviewed changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/unit/core/test_task_transitions.py | Unit tests for task lifecycle transition validation |
| tests/unit/core/test_task.py | Unit tests for Task + AcceptanceCriterion validation/serialization |
| tests/unit/core/test_project.py | Unit tests for Project validation/serialization |
| tests/unit/core/test_enums.py | Extended enum tests to cover new core enums |
| tests/unit/core/test_artifact.py | Unit tests for Artifact + ExpectedArtifact |
| tests/unit/core/conftest.py | Added factories and fixtures for new core models |
| tests/unit/budget/test_spending_summary.py | Unit tests for spending summary aggregation models |
| tests/unit/budget/test_hierarchy.py | Unit tests for budget hierarchy allocation models |
| tests/unit/budget/test_enums.py | Unit tests for budget-specific enums |
| tests/unit/budget/test_cost_record.py | Unit tests for immutable per-call CostRecord |
| tests/unit/budget/test_config.py | Unit tests for budget configuration + auto-downgrade settings |
| tests/unit/budget/conftest.py | Added factories and fixtures for budget models |
| tests/unit/budget/__init__.py | Budget unit test package init |
| src/ai_company/core/task_transitions.py | Task status transition map + validator |
| src/ai_company/core/task.py | Task + AcceptanceCriterion Pydantic models and validators |
| src/ai_company/core/project.py | Project Pydantic model and validators |
| src/ai_company/core/enums.py | Added task/project/artifact-related enums |
| src/ai_company/core/artifact.py | Artifact/ExpectedArtifact Pydantic models and validators |
| src/ai_company/core/__init__.py | Exported new core models/enums/utilities |
| src/ai_company/budget/spending_summary.py | Aggregated spending summary domain models |
| src/ai_company/budget/hierarchy.py | Budget allocation hierarchy models |
| src/ai_company/budget/enums.py | BudgetAlertLevel enum |
| src/ai_company/budget/cost_record.py | Immutable per-API-call cost record model |
| src/ai_company/budget/config.py | Budget configuration models (alerts, limits, auto-downgrade) |
| src/ai_company/budget/__init__.py | Budget module public exports |
        assert "normal" in BudgetAlertLevel.__members__.values()
        assert "warning" in BudgetAlertLevel.__members__.values()
BudgetAlertLevel.__members__.values() yields enum members (e.g., BudgetAlertLevel.NORMAL), not their string values, so these assertions will always fail. If the intent is to verify value-based lookups, assert that BudgetAlertLevel("normal") is BudgetAlertLevel.NORMAL (and similarly for "warning"), or compare against {m.value for m in BudgetAlertLevel} instead.
        values = {member.value for member in BudgetAlertLevel}
        assert "normal" in values
        assert "warning" in values
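Whether the original assertions actually fail depends on how BudgetAlertLevel is declared, which this excerpt does not show. The sketch below uses two hypothetical enums (PlainLevel and StrLevel, neither from the PR) to show that the membership test fails for a plain Enum but passes when the enum mixes in str, while comparing against .value behaves the same for both.

```python
from enum import Enum


class PlainLevel(Enum):
    """Hypothetical enum without a str mixin."""

    NORMAL = "normal"
    WARNING = "warning"


class StrLevel(str, Enum):
    """Hypothetical enum whose members are also str instances."""

    NORMAL = "normal"
    WARNING = "warning"


# Members of a plain Enum never compare equal to their raw values.
plain_hit = "normal" in PlainLevel.__members__.values()

# Members of a str-mixin enum compare equal to the matching string.
str_hit = "normal" in StrLevel.__members__.values()

# Comparing against .value works the same way for both kinds of enum.
plain_values = {member.value for member in PlainLevel}
str_values = {member.value for member in StrLevel}
```

Under these assumptions the value-based form in the suggestion is the safer assertion, because it does not rely on the enum's base class.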
    def validate_transition(current: TaskStatus, target: TaskStatus) -> None:
        """Validate that a state transition is allowed.

        Args:
            current: The current task status.
            target: The desired target status.

        Raises:
            ValueError: If the transition from *current* to *target*
                is not in :data:`VALID_TRANSITIONS`.
        """
        allowed = VALID_TRANSITIONS.get(current, frozenset())
        if target not in allowed:
            msg = (
                f"Invalid task status transition: {current.value!r} -> "
                f"{target.value!r}. Allowed from {current.value!r}: "
                f"{sorted(s.value for s in allowed)}"
            )
            raise ValueError(msg)
validate_transition / VALID_TRANSITIONS are defined and exported, but they are not invoked anywhere in src/ (only in unit tests). As a result, invalid lifecycle changes can still be created by constructing a new Task with an arbitrary status, so transitions are not actually enforced at the domain/API level yet. Consider adding a Task.transition_to(...) (or similar) helper that calls validate_transition and returns a new Task instance, and/or ensuring the engine layer uses this function consistently.
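One way the suggested Task.transition_to(...) helper might look is sketched below. To stay self-contained it uses a frozen dataclass as a stand-in for the PR's frozen Pydantic model, where model_copy(update=...) would likely play the role of dataclasses.replace. The three statuses and the transition map are hypothetical placeholders, not the PR's actual TaskStatus members or VALID_TRANSITIONS.

```python
from dataclasses import dataclass, replace
from enum import Enum


class TaskStatus(Enum):
    """Hypothetical subset of statuses; the real enum is not shown here."""

    CREATED = "created"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"


# Hypothetical transition map standing in for the PR's VALID_TRANSITIONS.
VALID_TRANSITIONS: dict[TaskStatus, frozenset[TaskStatus]] = {
    TaskStatus.CREATED: frozenset({TaskStatus.IN_PROGRESS}),
    TaskStatus.IN_PROGRESS: frozenset({TaskStatus.COMPLETED}),
    TaskStatus.COMPLETED: frozenset(),
}


def validate_transition(current: TaskStatus, target: TaskStatus) -> None:
    """Raise ValueError if *target* is not reachable from *current*."""
    allowed = VALID_TRANSITIONS.get(current, frozenset())
    if target not in allowed:
        msg = f"Invalid transition: {current.value!r} -> {target.value!r}"
        raise ValueError(msg)


@dataclass(frozen=True)
class Task:
    """Frozen stand-in for the PR's Pydantic Task model."""

    id: str
    status: TaskStatus = TaskStatus.CREATED

    def transition_to(self, target: TaskStatus) -> "Task":
        """Return a new Task in *target* status; the original is unchanged."""
        validate_transition(self.status, target)
        return replace(self, status=target)
```

This only guards transitions made through the helper. Constructing a Task directly with an arbitrary status would still bypass the check, which is the gap the comment above points out.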
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/core/test_enums.py (1)
1-28: ⚠️ Potential issue | 🟡 Minor
Add a module-level timeout marker for consistency with repository test conventions.
This test file is missing the explicit `pytestmark = pytest.mark.timeout(30)` marker present in other test modules. While the global pytest configuration applies a 30-second timeout, the repository convention is to declare it explicitly at the module level.
Proposed patch
     import pytest
     from ai_company.core.enums import (
    @@ -22,6 +22,8 @@ from ai_company.core.enums import (
         TaskType,
     )
    +pytestmark = pytest.mark.timeout(30)
    +
     # ── Member Counts ──────────────────────────────────────────────────
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/core/test_enums.py` around lines 1 - 28, Add the module-level pytest timeout marker by defining pytestmark = pytest.mark.timeout(30) near the top of the module (above the TestEnumMemberCounts class); since pytest is already imported in this file, insert the single-line assignment (pytestmark = pytest.mark.timeout(30)) at module scope so the TestEnumMemberCounts tests follow the repository convention.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/budget/hierarchy.py`:
- Around line 97-108: The validator _validate_team_budget_sum currently only
rejects totals above 100% but should enforce that team allocations sum to
exactly 100% (after rounding). Change the logic in _validate_team_budget_sum
(and the analogous _validate_department_budget_sum) to compute total =
sum(t.budget_percent for t in self.teams) (or departments), round it with
_BUDGET_ROUNDING_PRECISION and compare equality to max_budget_percent (100.0);
if the rounded total != max_budget_percent raise a ValueError that includes
self.department_name (or the appropriate name) and the rounded total in the
message so under- and over-allocation both fail validation.
In `@tests/unit/core/test_project.py`:
- Around line 1-7: Add a module-level pytest timeout marker to this test module
by defining pytestmark = pytest.mark.timeout(...) near the top of
tests/unit/core/test_project.py (after the imports) so the Project tests inherit
the standard timeout used across other test files; update the marker value to
the project's standard timeout constant or numeric value and ensure the pytest
import (pytest) is present so pytestmark resolves.
- Around line 159-166: Move the ProjectFactory import out of the test method and
place it at module level so the file follows the same pattern as other tests;
specifically, add "from tests.unit.core.conftest import ProjectFactory" at top
of the file and remove the inner import from inside
TestProjectFactory.test_factory, leaving the test to simply call
ProjectFactory.build() in test_factory.
In `@tests/unit/core/test_task_transitions.py`:
- Around line 1-6: Add the module-level pytest timeout marker to this test
module by declaring pytestmark = pytest.mark.timeout(30) at the top of
tests/unit/core/test_task_transitions.py so the whole module uses a 30-second
timeout; place the marker near the existing imports (after the pytest import) so
it applies to VALID_TRANSITIONS and validate_transition tests.
In `@tests/unit/core/test_task.py`:
- Around line 1-14: The test module is missing the module-level pytest timeout
marker; add a top-level pytestmark = pytest.mark.timeout(<seconds>) line at the
top of the file so all tests in this module (including AcceptanceCriterion and
Task tests importing ExpectedArtifact, ArtifactType, Complexity, Priority,
TaskStatus, TaskType) inherit the timeout; place it before any test code or
imports that run tests (directly under the module docstring) and use the
project-standard timeout value.
---
Outside diff comments:
In `@tests/unit/core/test_enums.py`:
- Around line 1-28: Add the module-level pytest timeout marker by defining
pytestmark = pytest.mark.timeout(30) near the top of the module (above the
TestEnumMemberCounts class); since pytest is already imported in this file,
insert the single-line assignment (pytestmark = pytest.mark.timeout(30)) at
module scope so the TestEnumMemberCounts tests follow the repository convention.
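The hierarchy.py item in the prompt above asks for allocations to sum to exactly 100% after rounding. A minimal sketch of that check follows, assuming a precision of 2 decimal places; the constants and the function name check_allocation_total are hypothetical stand-ins for _BUDGET_ROUNDING_PRECISION, max_budget_percent, and the model validators, whose real definitions are not shown in this conversation.

```python
MAX_BUDGET_PERCENT = 100.0  # assumed required total
ROUNDING_PRECISION = 2  # assumed stand-in for _BUDGET_ROUNDING_PRECISION


def check_allocation_total(percents: list[float], owner: str) -> None:
    """Raise ValueError unless *percents* sum to exactly 100% after rounding."""
    # Rounding absorbs float noise such as 99.99999999999999.
    total = round(sum(percents), ROUNDING_PRECISION)
    if total != MAX_BUDGET_PERCENT:
        msg = (
            f"Budget allocations for {owner!r} sum to {total}%, "
            f"expected {MAX_BUDGET_PERCENT}%"
        )
        raise ValueError(msg)
```

Requiring an exact total rejects under-allocation as well as over-allocation. Whether a partially allocated budget should be invalid is a product decision the reviewer's suggestion assumes rather than establishes.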
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (25)
src/ai_company/budget/__init__.py, src/ai_company/budget/config.py, src/ai_company/budget/cost_record.py, src/ai_company/budget/enums.py, src/ai_company/budget/hierarchy.py, src/ai_company/budget/spending_summary.py, src/ai_company/core/__init__.py, src/ai_company/core/artifact.py, src/ai_company/core/enums.py, src/ai_company/core/project.py, src/ai_company/core/task.py, src/ai_company/core/task_transitions.py, tests/unit/budget/__init__.py, tests/unit/budget/conftest.py, tests/unit/budget/test_config.py, tests/unit/budget/test_cost_record.py, tests/unit/budget/test_enums.py, tests/unit/budget/test_hierarchy.py, tests/unit/budget/test_spending_summary.py, tests/unit/core/conftest.py, tests/unit/core/test_artifact.py, tests/unit/core/test_enums.py, tests/unit/core/test_project.py, tests/unit/core/test_task.py, tests/unit/core/test_task_transitions.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Agent
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations`; Python 3.14+ has PEP 649 native lazy annotations
Include type hints on all public functions in Python
Use Google-style docstrings on all public classes and functions in Python
Create new objects instead of mutating existing ones — maintain immutability
Use Pydantic v2 `BaseModel`, `model_validator`, and `ConfigDict` for data models
Enforce 88-character line length in Python code
Keep functions under 50 lines of code
Keep Python files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries — user input, external APIs, config files
Files:
tests/unit/core/test_project.py, tests/unit/budget/test_cost_record.py, tests/unit/core/test_task_transitions.py, tests/unit/core/test_enums.py, src/ai_company/core/task_transitions.py, src/ai_company/budget/hierarchy.py, tests/unit/core/test_task.py, src/ai_company/core/task.py, src/ai_company/budget/spending_summary.py, tests/unit/budget/test_config.py, src/ai_company/core/artifact.py, src/ai_company/budget/enums.py, tests/unit/core/conftest.py, tests/unit/core/test_artifact.py, src/ai_company/budget/__init__.py, tests/unit/budget/test_enums.py, tests/unit/budget/test_hierarchy.py, src/ai_company/budget/config.py, src/ai_company/budget/cost_record.py, tests/unit/budget/conftest.py, src/ai_company/core/__init__.py, src/ai_company/core/project.py, src/ai_company/core/enums.py, tests/unit/budget/test_spending_summary.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, and `@pytest.mark.slow` markers on test functions
Set test timeout to 30 seconds per test
Files:
tests/unit/core/test_project.py, tests/unit/budget/test_cost_record.py, tests/unit/core/test_task_transitions.py, tests/unit/core/test_enums.py, tests/unit/core/test_task.py, tests/unit/budget/test_config.py, tests/unit/core/conftest.py, tests/unit/core/test_artifact.py, tests/unit/budget/test_enums.py, tests/unit/budget/test_hierarchy.py, tests/unit/budget/conftest.py, tests/unit/budget/test_spending_summary.py
{src,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Run `ruff check` and `ruff format` on all source and test Python files
Files:
tests/unit/core/test_project.py, tests/unit/budget/test_cost_record.py, tests/unit/core/test_task_transitions.py, tests/unit/core/test_enums.py, src/ai_company/core/task_transitions.py, src/ai_company/budget/hierarchy.py, tests/unit/core/test_task.py, src/ai_company/core/task.py, src/ai_company/budget/spending_summary.py, tests/unit/budget/test_config.py, src/ai_company/core/artifact.py, src/ai_company/budget/enums.py, tests/unit/core/conftest.py, tests/unit/core/test_artifact.py, src/ai_company/budget/__init__.py, tests/unit/budget/test_enums.py, tests/unit/budget/test_hierarchy.py, src/ai_company/budget/config.py, src/ai_company/budget/cost_record.py, tests/unit/budget/conftest.py, src/ai_company/core/__init__.py, src/ai_company/core/project.py, src/ai_company/core/enums.py, tests/unit/budget/test_spending_summary.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Maintain minimum 80% code coverage across test suite
Run `mypy` in strict mode on all source Python files for type checking
Files:
src/ai_company/core/task_transitions.py, src/ai_company/budget/hierarchy.py, src/ai_company/core/task.py, src/ai_company/budget/spending_summary.py, src/ai_company/core/artifact.py, src/ai_company/budget/enums.py, src/ai_company/budget/__init__.py, src/ai_company/budget/config.py, src/ai_company/budget/cost_record.py, src/ai_company/core/__init__.py, src/ai_company/core/project.py, src/ai_company/core/enums.py
🧠 Learnings (11)
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : All new code must have corresponding unit tests. When modifying existing code, update related tests. Tests should cover both happy paths and edge cases.
Applied to files:
tests/unit/core/test_project.py, tests/unit/core/test_task.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/unit/test_*.py : Write unit tests for new functionality using pytest. Place test files in `tests/unit/` with `test_*.py` naming convention.
Applied to files:
tests/unit/core/test_project.py
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to tests/unit/test_*.py : Write unit tests for new functionality using pytest in `tests/unit/` with `test_*.py` naming convention
Applied to files:
tests/unit/core/test_project.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures
Applied to files:
tests/unit/core/test_project.py, tests/unit/budget/conftest.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/test*.py : Agent tests should cover: successful generation with valid output, handling malformed LLM responses, error conditions (network errors, timeouts), output format validation, and integration with story state
Applied to files:
tests/unit/core/test_task_transitions.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: All new code must have corresponding unit tests. When modifying existing code, update related tests. Tests should cover both happy paths and edge cases.
Applied to files:
tests/unit/core/test_task.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Tests must use fake model names (e.g., `test-model:8b`, `fake-writer:latest`)—never use real model IDs from `RECOMMENDED_MODELS`.
Applied to files:
tests/unit/core/test_task.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use appropriate fixture scopes (`function`, `class`, `module`, `session`) and document complex fixtures with docstrings
Applied to files:
tests/unit/core/conftest.py
📚 Learning: 2026-02-27T18:32:19.627Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-27T18:32:19.627Z
Learning: Applies to src/ai_company/tools/**/*.py : Use MCP integration for tool registry with role-based access control in the `tools/` module
Applied to files:
tests/unit/core/conftest.py, src/ai_company/core/__init__.py
📚 Learning: 2026-02-27T18:32:19.626Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-27T18:32:19.626Z
Learning: Applies to **/*.py : Use Pydantic v2 `BaseModel`, `model_validator`, and `ConfigDict` for data models
Applied to files:
src/ai_company/budget/config.py
📚 Learning: 2026-02-27T18:32:19.627Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-27T18:32:19.627Z
Learning: Applies to src/ai_company/config/**/*.py : Use YAML for company configuration loading and validation in the `config/` module
Applied to files:
src/ai_company/budget/config.py
🧬 Code graph analysis (19)
tests/unit/core/test_project.py (3)
- src/ai_company/core/enums.py (1): ProjectStatus(180-187)
- src/ai_company/core/project.py (1): Project(10-103)
- tests/unit/core/conftest.py (2): ProjectFactory(134-135), sample_project(262-270)
tests/unit/budget/test_cost_record.py (1)
- tests/unit/budget/conftest.py (2): CostRecordFactory(68-69), sample_cost_record(112-122)
tests/unit/core/test_task_transitions.py (2)
- src/ai_company/core/enums.py (1): TaskStatus(122-140)
- src/ai_company/core/task_transitions.py (1): validate_transition(49-67)
tests/unit/core/test_enums.py (1)
- src/ai_company/core/enums.py (6): CompanyType(72-82), Complexity(163-169), Priority(154-160), ProjectStatus(180-187), TaskStatus(122-140), TaskType(143-151)
src/ai_company/core/task_transitions.py (1)
- src/ai_company/core/enums.py (1): TaskStatus(122-140)
src/ai_company/budget/hierarchy.py (1)
- src/ai_company/budget/spending_summary.py (2): _validate_department_name_not_blank(148-153), _validate_unique_department_names(207-213)
tests/unit/core/test_task.py (2)
- src/ai_company/core/artifact.py (1): ExpectedArtifact(13-37)
- src/ai_company/core/task.py (2): AcceptanceCriterion(11-36), Task(39-195)
src/ai_company/core/task.py (3)
- src/ai_company/core/artifact.py (2): ExpectedArtifact(13-37), _validate_non_blank_strings(82-88)
- src/ai_company/core/enums.py (4): Complexity(163-169), Priority(154-160), TaskStatus(122-140), TaskType(143-151)
- src/ai_company/core/project.py (2): _validate_non_blank_strings(65-77), _validate_no_empty_collection_entries(80-87)
src/ai_company/budget/spending_summary.py (2)
- src/ai_company/budget/enums.py (1): BudgetAlertLevel(6-16)
- src/ai_company/budget/hierarchy.py (2): _validate_department_name_not_blank(77-82), _validate_unique_department_names(136-143)
tests/unit/budget/test_config.py (1)
- src/ai_company/budget/config.py (3): AutoDowngradeConfig(59-115), BudgetAlertConfig(12-56), BudgetConfig(118-178)
src/ai_company/core/artifact.py (3)
- src/ai_company/core/enums.py (1): ArtifactType(172-177)
- src/ai_company/core/project.py (1): _validate_non_blank_strings(65-77)
- src/ai_company/core/task.py (1): _validate_non_blank_strings(126-138)
tests/unit/core/conftest.py (5)
- src/ai_company/core/artifact.py (2): Artifact(40-88), ExpectedArtifact(13-37)
- src/ai_company/core/enums.py (10): ArtifactType(172-177), Complexity(163-169), DepartmentName(108-119), MemoryType(48-54), Priority(154-160), ProficiencyLevel(99-105), SeniorityLevel(6-21), SkillCategory(85-96), TaskStatus(122-140), TaskType(143-151)
- src/ai_company/core/project.py (1): Project(10-103)
- src/ai_company/core/role.py (5): Authority(42-84), CustomRole(183-249), Role(123-180), SeniorityInfo(87-120), Skill(15-39)
- src/ai_company/core/task.py (2): AcceptanceCriterion(11-36), Task(39-195)
tests/unit/core/test_artifact.py (3)
- src/ai_company/core/artifact.py (2): Artifact(40-88), ExpectedArtifact(13-37)
- src/ai_company/core/enums.py (1): ArtifactType(172-177)
- tests/unit/core/conftest.py (2): ExpectedArtifactFactory(116-117), ArtifactFactory(120-121)
src/ai_company/budget/__init__.py (5)
- src/ai_company/budget/config.py (3): AutoDowngradeConfig(59-115), BudgetAlertConfig(12-56), BudgetConfig(118-178)
- src/ai_company/budget/cost_record.py (1): CostRecord(13-57)
- src/ai_company/budget/enums.py (1): BudgetAlertLevel(6-16)
- src/ai_company/budget/hierarchy.py (2): DepartmentBudget(47-108), TeamBudget(17-44)
- src/ai_company/budget/spending_summary.py (4): AgentSpending(65-106), DepartmentSpending(109-153), PeriodSpending(16-62), SpendingSummary(156-213)
tests/unit/budget/test_enums.py (1)
- src/ai_company/budget/enums.py (1): BudgetAlertLevel(6-16)
tests/unit/budget/test_hierarchy.py (1)
- src/ai_company/budget/hierarchy.py (3): BudgetHierarchy(111-156), DepartmentBudget(47-108), TeamBudget(17-44)
tests/unit/budget/conftest.py (5)
- src/ai_company/budget/config.py (3): AutoDowngradeConfig(59-115), BudgetAlertConfig(12-56), BudgetConfig(118-178)
- src/ai_company/budget/cost_record.py (1): CostRecord(13-57)
- src/ai_company/budget/enums.py (1): BudgetAlertLevel(6-16)
- src/ai_company/budget/hierarchy.py (3): BudgetHierarchy(111-156), DepartmentBudget(47-108), TeamBudget(17-44)
- src/ai_company/budget/spending_summary.py (4): AgentSpending(65-106), DepartmentSpending(109-153), PeriodSpending(16-62), SpendingSummary(156-213)
src/ai_company/core/__init__.py (5)
- src/ai_company/core/artifact.py (2): Artifact(40-88), ExpectedArtifact(13-37)
- src/ai_company/core/enums.py (12): ArtifactType(172-177), CompanyType(72-82), Complexity(163-169), CostTier(57-69), CreativityLevel(40-45), DepartmentName(108-119), Priority(154-160), ProficiencyLevel(99-105), ProjectStatus(180-187), SeniorityLevel(6-21), TaskStatus(122-140), TaskType(143-151)
- src/ai_company/core/project.py (1): Project(10-103)
- src/ai_company/core/task.py (2): AcceptanceCriterion(11-36), Task(39-195)
- src/ai_company/core/task_transitions.py (1): validate_transition(49-67)
src/ai_company/core/project.py (3)
- src/ai_company/core/enums.py (1): ProjectStatus(180-187)
- src/ai_company/core/artifact.py (1): _validate_non_blank_strings(82-88)
- src/ai_company/core/task.py (2): _validate_non_blank_strings(126-138), _validate_no_empty_collection_entries(141-148)
🔇 Additional comments (25)
src/ai_company/budget/cost_record.py (1)
13-57: Solid domain-model implementation. Immutability and validation rules are well-scoped for append-only cost records.
tests/unit/budget/test_enums.py (1)
10-33: Good enum coverage for budget alert levels. Member existence and value assertions are clear and targeted.
src/ai_company/budget/enums.py (1)
6-16: Enum definition looks clean and consistent. The values and naming align well with the budget alert domain.
tests/unit/core/conftest.py (1)
116-270: Factory and fixture additions are coherent with new core models. The new test data builders are aligned with task/project/artifact validation constraints.
src/ai_company/budget/__init__.py (1)
8-40: Public export surface is well-structured. Re-exports are explicit and make the budget package API easier to consume.
tests/unit/core/test_artifact.py (1)
1-15: The 30-second timeout requirement is already satisfied through the global pytest configuration in `pyproject.toml` (`timeout = 30` in `[tool.pytest.ini_options]`). This setting applies to all tests in the project, including this file. Adding a module-level marker would be redundant.
Likely an incorrect or invalid review comment.
tests/unit/budget/test_cost_record.py (1)
1-206: LGTM! Comprehensive test coverage for CostRecord. The test suite thoroughly covers validation rules (empty/whitespace strings, negative values, token-cost consistency), immutability enforcement, JSON serialization roundtrip, and factory usage. Proper pytest markers (`@pytest.mark.unit`) and timeout configuration are in place.
tests/unit/core/test_task_transitions.py (1)
11-132: LGTM! Thorough state machine test coverage. Excellent coverage of the task lifecycle transitions: all valid transitions tested individually, invalid transitions rejected with proper error message verification, and completeness checks ensure the transition map covers all `TaskStatus` members with no self-transitions.
src/ai_company/core/enums.py (1)
120-187: LGTM! Well-structured domain enumerations. All new enums follow the established `StrEnum` pattern with consistent lowercase string values. `TaskStatus` includes helpful documentation of valid transitions referencing the transitions module. The enum definitions align with DESIGN_SPEC requirements from issues #57 and #61.
tests/unit/budget/test_hierarchy.py (3)
249-258: Same potential error message mismatch for BudgetHierarchy. This test also matches `"exceeding 100%"`; verify consistency with the actual error message in the `_validate_department_budget_sum` validator.
1-288: Overall test structure is solid. Good coverage of validation rules, boundary conditions, immutability, and factory usage across all three hierarchy models. The float precision tests (lines 163-173, 260-270) are particularly valuable for catching rounding issues.
152-161: No issue found; test assertion is correct. The error message in `hierarchy.py` line 105 contains `f"exceeding {max_budget_percent:.0f}%"`, which produces `"exceeding 100%"` with the percent sign. The test assertion at line 154 correctly matches this pattern.
Likely an incorrect or invalid review comment.
tests/unit/budget/test_config.py (1)
1-250: LGTM! Excellent test coverage for budget configuration models. The test suite comprehensively covers all three configuration models with:
- Default value verification
- Boundary condition testing (0, 100 for percentages)
- Cross-field validation (threshold ordering, limit vs. monthly constraints)
- Edge cases like zero-monthly budget skipping limit validation
- Immutability enforcement
- JSON roundtrip serialization
tests/unit/core/test_project.py (1)
26-212: LGTM! Well-organized test coverage for Project model. The helper function `_make_project` promotes DRY test code. Coverage includes construction, string validation, duplicate detection, budget constraints, immutability, serialization, and fixture verification, aligned with retrieved learnings about covering happy paths and edge cases.
tests/unit/core/test_task.py (2)
227-280: Excellent assignment/status consistency testing. These tests thoroughly verify the state machine invariants: `CREATED` must have no assignee, active states (`ASSIGNED`, `IN_PROGRESS`, `IN_REVIEW`, `COMPLETED`) require an assignee, and `BLOCKED`/`CANCELLED` allow either. This coverage is critical for lifecycle correctness.
37-389: LGTM! Comprehensive Task model test coverage. The test suite covers all 16 Task fields, validation rules, the AcceptanceCriterion sub-model, dependency/reviewer uniqueness, assignment-status consistency, and serialization. The tests align with the retrieved learning about covering happy paths and edge cases.
src/ai_company/core/task_transitions.py (1)
1-67: LGTM! Clean and well-documented state machine implementation. The transition map uses `frozenset` for immutability, aligns with the DESIGN_SPEC Section 6.1 lifecycle, and the `validate_transition` function provides clear error messages including allowed transitions for debugging. The module docstring with the ASCII state diagram is excellent for maintainability.
src/ai_company/budget/config.py (1)
1-177: LGTM! The budget configuration models are well-structured with proper Pydantic v2 patterns:
- Frozen models enforce immutability correctly
- Cross-field validations (threshold ordering, alias validation, limit constraints) are properly implemented
- Google-style docstrings are comprehensive and accurate
src/ai_company/core/artifact.py (1)
1-88: LGTM! The artifact models follow established validation patterns consistent with `Task` and `Project` models. Frozen configuration, `min_length=1` constraints, and whitespace validation are properly implemented.
src/ai_company/core/project.py (1)
1-103: LGTM! The `Project` model is well-designed with comprehensive validation:
- Consistent validation patterns with `Task` model
- Proper duplicate detection for team members and task IDs
- Frozen model with appropriate defaults
src/ai_company/core/__init__.py (1)
11-97: LGTM! The public API surface is correctly extended to expose all new domain models and enums. The `__all__` list maintains alphabetical ordering for consistency.
tests/unit/budget/test_spending_summary.py (1)
1-272: LGTM! Comprehensive test coverage for spending summary models including:
- Valid construction and defaults
- Validation error cases (ordering, duplicates, blank strings)
- Immutability enforcement
- JSON serialization round-trips
- Factory-based instance creation
Test markers and timeout are properly configured per coding guidelines.
src/ai_company/core/task.py (1)
1-195: LGTM! The `Task` model is comprehensive with well-designed validation:
- All 16 fields per DESIGN_SPEC 6.2 are present
- Status/assignment consistency enforces state machine invariants
- Self-dependency and duplicate detection prevent invalid configurations
- Validation patterns are consistent with `Project` and `Artifact` models
src/ai_company/budget/spending_summary.py (1)
1-213: LGTM! The spending summary models are well-designed with:
- Proper period ordering validation
- Non-blank string validations consistent with `hierarchy.py` patterns
- Unique identifier enforcement in aggregations
- Appropriate use of `BudgetAlertLevel` for alert status
tests/unit/budget/conftest.py (1)
1-189: LGTM! Well-structured test configuration with:
- Factories for all budget-related models using `polyfactory`
- Nested factory references properly configured (e.g., `alerts = BudgetAlertConfigFactory`)
- Comprehensive fixtures providing realistic test data
- Independent fixtures per learning: "Each test should be independent"
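The transition-map design praised in the `task_transitions.py` comment above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: the status names come from the review text, but the specific transitions in `VALID_TRANSITIONS` and the error wording are assumptions, since the real map is not shown on this page.

```python
from enum import Enum


class TaskStatus(str, Enum):
    """Lifecycle states named in the review; values are assumed lowercase."""

    CREATED = "created"
    ASSIGNED = "assigned"
    IN_PROGRESS = "in_progress"
    IN_REVIEW = "in_review"
    COMPLETED = "completed"
    BLOCKED = "blocked"
    CANCELLED = "cancelled"


S = TaskStatus

# Hypothetical transition table for illustration only; the authoritative
# map lives in src/ai_company/core/task_transitions.py and may differ.
VALID_TRANSITIONS: dict[TaskStatus, frozenset[TaskStatus]] = {
    S.CREATED: frozenset({S.ASSIGNED}),
    S.ASSIGNED: frozenset({S.IN_PROGRESS, S.BLOCKED, S.CANCELLED}),
    S.IN_PROGRESS: frozenset({S.IN_REVIEW, S.BLOCKED, S.CANCELLED}),
    S.IN_REVIEW: frozenset({S.COMPLETED, S.IN_PROGRESS, S.BLOCKED, S.CANCELLED}),
    S.BLOCKED: frozenset({S.ASSIGNED}),
    S.COMPLETED: frozenset(),
    S.CANCELLED: frozenset(),
}


def validate_transition(current: TaskStatus, target: TaskStatus) -> None:
    """Raise ValueError unless current -> target is in the table."""
    # Explicit key access: a status missing from the table fails loudly
    # with KeyError instead of being treated as "no transitions allowed".
    allowed = VALID_TRANSITIONS[current]
    if target not in allowed:
        names = sorted(status.value for status in allowed)
        msg = (
            f"Invalid task status transition: {current.value!r} -> "
            f"{target.value!r}. Allowed from {current.value!r}: {names}"
        )
        raise ValueError(msg)
```

The two completeness properties the tests are said to check translate to `set(VALID_TRANSITIONS) == set(TaskStatus)` and `all(s not in t for s, t in VALID_TRANSITIONS.items())`.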
    def _validate_team_budget_sum(self) -> Self:
        """Ensure team budget percentages do not exceed 100%."""
        max_budget_percent = 100.0
        total = sum(t.budget_percent for t in self.teams)
        if round(total, _BUDGET_ROUNDING_PRECISION) > max_budget_percent:
            msg = (
                f"Team budget allocations in department "
                f"{self.department_name!r} sum to {total:.2f}%, "
                f"exceeding {max_budget_percent:.0f}%"
            )
            raise ValueError(msg)
        return self
Enforce exact 100% totals to match allocation acceptance criteria.
Both validators only reject totals above 100%, so under-allocation still passes. The linked budget objective requires percentages to validate to 100%.
🔧 Proposed patch
@@
- def _validate_team_budget_sum(self) -> Self:
- """Ensure team budget percentages do not exceed 100%."""
+ def _validate_team_budget_sum(self) -> Self:
+ """Ensure team budget percentages sum to 100%."""
max_budget_percent = 100.0
total = sum(t.budget_percent for t in self.teams)
- if round(total, _BUDGET_ROUNDING_PRECISION) > max_budget_percent:
+ rounded_total = round(total, _BUDGET_ROUNDING_PRECISION)
+ if self.teams and rounded_total != max_budget_percent:
msg = (
f"Team budget allocations in department "
- f"{self.department_name!r} sum to {total:.2f}%, "
- f"exceeding {max_budget_percent:.0f}%"
+ f"{self.department_name!r} must sum to "
+ f"{max_budget_percent:.0f}%, got {total:.2f}%"
)
raise ValueError(msg)
return self
@@
- def _validate_department_budget_sum(self) -> Self:
- """Ensure department budget percentages do not exceed 100%."""
+ def _validate_department_budget_sum(self) -> Self:
+ """Ensure department budget percentages sum to 100%."""
max_budget_percent = 100.0
total = sum(d.budget_percent for d in self.departments)
- if round(total, _BUDGET_ROUNDING_PRECISION) > max_budget_percent:
+ rounded_total = round(total, _BUDGET_ROUNDING_PRECISION)
+ if self.departments and rounded_total != max_budget_percent:
msg = (
- f"Department budget allocations sum to {total:.2f}%, "
- f"exceeding {max_budget_percent:.0f}%"
+ f"Department budget allocations must sum to "
+ f"{max_budget_percent:.0f}%, got {total:.2f}%"
)
raise ValueError(msg)
         return self
Also applies to: 145-156
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/budget/hierarchy.py` around lines 97 - 108, The validator
_validate_team_budget_sum currently only rejects totals above 100% but should
enforce that team allocations sum to exactly 100% (after rounding). Change the
logic in _validate_team_budget_sum (and the analogous
_validate_department_budget_sum) to compute total = sum(t.budget_percent for t
in self.teams) (or departments), round it with _BUDGET_ROUNDING_PRECISION and
compare equality to max_budget_percent (100.0); if the rounded total !=
max_budget_percent raise a ValueError that includes self.department_name (or the
appropriate name) and the rounded total in the message so under- and
over-allocation both fail validation.
…, and Copilot
- Add explicit key access in validate_transition (no more .get() fallback)
- Add Task.with_transition() method to enforce state machine at model level
- Add strict=True on BudgetAlertConfig/AutoDowngradeConfig int fields
- Add self-downgrade validation to AutoDowngradeConfig
- Use Counter for duplicate detection across 7 validators (consistency)
- Extract shared BUDGET_ROUNDING_PRECISION to ai_company.constants
- Document zero-budget bypass semantics in BudgetConfig
- Fix docstrings: spec references, TaskStatus authoritative source note
- Add pytestmark timeout markers to 5 core test files
- Move factory import to module level in test_project.py
/gemini review
Code Review
This pull request introduces a comprehensive set of domain models for tasks, projects, budgets, and artifacts, complete with extensive unit tests. The code is well-structured, using Pydantic for data validation and creating immutable models, which is excellent practice. My feedback focuses on a few opportunities to improve maintainability and performance by consolidating validation logic within the Pydantic models and ensuring consistency in code style, such as sorting the __all__ export list.
    @model_validator(mode="after")
    def _validate_no_empty_aliases(self) -> Self:
        """Ensure no empty or whitespace-only alias strings."""
        for source, target in self.downgrade_map:
            if not source.strip():
                msg = "Empty or whitespace-only source alias in downgrade_map"
                raise ValueError(msg)
            if not target.strip():
                msg = "Empty or whitespace-only target alias in downgrade_map"
                raise ValueError(msg)
        return self

    @model_validator(mode="after")
    def _validate_no_self_downgrade(self) -> Self:
        """Ensure no alias downgrades to itself."""
        for source, target in self.downgrade_map:
            if source == target:
                msg = f"Self-downgrade in downgrade_map: {source!r} -> {target!r}"
                raise ValueError(msg)
        return self

    @model_validator(mode="after")
    def _validate_no_duplicate_source_aliases(self) -> Self:
        """Ensure each source alias maps to exactly one target."""
        sources = [source for source, _ in self.downgrade_map]
        if len(sources) != len(set(sources)):
            dupes = sorted(s for s, c in Counter(sources).items() if c > 1)
            msg = f"Duplicate source aliases in downgrade_map: {dupes}"
            raise ValueError(msg)
        return self
The three separate model_validator decorators for downgrade_map iterate over the map three times. This can be inefficient for large maps. Combining them into a single validator that performs all checks in one loop would be more performant and groups all downgrade_map validation logic together, improving maintainability.
    @model_validator(mode="after")
    def _validate_downgrade_map(self) -> Self:
        """Validate the downgrade_map for correctness."""
        sources: list[str] = []
        for source, target in self.downgrade_map:
            if not source.strip():
                raise ValueError("Empty or whitespace-only source alias in downgrade_map")
            if not target.strip():
                raise ValueError("Empty or whitespace-only target alias in downgrade_map")
            if source == target:
                raise ValueError(f"Self-downgrade in downgrade_map: {source!r} -> {target!r}")
            sources.append(source)
        if len(sources) != len(set(sources)):
            dupes = sorted(s for s, c in Counter(sources).items() if c > 1)
            raise ValueError(f"Duplicate source aliases in downgrade_map: {dupes}")
        return self
__all__ = [
    "BUILTIN_ROLES",
    "SENIORITY_INFO",
    "VALID_TRANSITIONS",
    "AcceptanceCriterion",
    "AgentIdentity",
    "AgentStatus",
    "Artifact",
    "ArtifactType",
    "Authority",
    "Company",
    "CompanyConfig",
    "CompanyType",
    "Complexity",
    "CostTier",
    "CreativityLevel",
    "CustomRole",
    "Department",
    "DepartmentName",
    "ExpectedArtifact",
    "HRRegistry",
    "MemoryConfig",
    "MemoryType",
    "ModelConfig",
    "PersonalityConfig",
    "Priority",
    "ProficiencyLevel",
    "Project",
    "ProjectStatus",
    "RiskTolerance",
    "Role",
    "SeniorityInfo",
    "SeniorityLevel",
    "Skill",
    "SkillCategory",
    "SkillSet",
    "Task",
    "TaskStatus",
    "TaskType",
    "Team",
    "ToolPermissions",
    "get_builtin_role",
    "get_seniority_info",
    "validate_transition",
]
The __all__ list is not alphabetically sorted. Sorting it makes the code easier to read and maintain, especially as the number of exported symbols grows.
__all__ = [
"AcceptanceCriterion",
"AgentIdentity",
"AgentStatus",
"Artifact",
"ArtifactType",
"Authority",
"BUILTIN_ROLES",
"Company",
"CompanyConfig",
"CompanyType",
"Complexity",
"CostTier",
"CreativityLevel",
"CustomRole",
"Department",
"DepartmentName",
"ExpectedArtifact",
"HRRegistry",
"MemoryConfig",
"MemoryType",
"ModelConfig",
"PersonalityConfig",
"Priority",
"ProficiencyLevel",
"Project",
"ProjectStatus",
"RiskTolerance",
"Role",
"SENIORITY_INFO",
"SeniorityInfo",
"SeniorityLevel",
"Skill",
"SkillCategory",
"SkillSet",
"Task",
"TaskStatus",
"TaskType",
"Team",
"ToolPermissions",
"VALID_TRANSITIONS",
"get_builtin_role",
"get_seniority_info",
"validate_transition",
]
    @model_validator(mode="after")
    def _validate_non_blank_strings(self) -> Self:
        """Ensure string identifier fields are not whitespace-only."""
        for field_name in ("id", "name"):
            if not getattr(self, field_name).strip():
                msg = f"{field_name} must not be whitespace-only"
                raise ValueError(msg)
        if self.lead is not None and not self.lead.strip():
            msg = "lead must not be whitespace-only"
            raise ValueError(msg)
        if self.deadline is not None and not self.deadline.strip():
            msg = "deadline must not be whitespace-only"
            raise ValueError(msg)
        return self

    @model_validator(mode="after")
    def _validate_no_empty_collection_entries(self) -> Self:
        """Ensure no empty or whitespace-only entries in string tuples."""
        for field_name in ("team", "task_ids"):
            for value in getattr(self, field_name):
                if not value.strip():
                    msg = f"Empty or whitespace-only entry in {field_name}"
                    raise ValueError(msg)
        return self

    @model_validator(mode="after")
    def _validate_no_duplicate_team_members(self) -> Self:
        """Ensure no duplicate agent IDs in team."""
        if len(self.team) != len(set(self.team)):
            dupes = sorted(m for m, c in Counter(self.team).items() if c > 1)
            msg = f"Duplicate entries in team: {dupes}"
            raise ValueError(msg)
        return self

    @model_validator(mode="after")
    def _validate_no_duplicate_task_ids(self) -> Self:
        """Ensure no duplicate task IDs."""
        if len(self.task_ids) != len(set(self.task_ids)):
            dupes = sorted(t for t, c in Counter(self.task_ids).items() if c > 1)
            msg = f"Duplicate entries in task_ids: {dupes}"
            raise ValueError(msg)
        return self
The four separate model_validators can be combined into a single, more efficient validator. This avoids multiple function calls and groups all validation logic for the model in one place, improving performance and maintainability.
    @model_validator(mode="after")
    def _validate_fields(self) -> Self:
        """Validate string fields, collections, and uniqueness."""
        # Non-blank strings
        for field_name in ("id", "name"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be whitespace-only")
        if self.lead is not None and not self.lead.strip():
            raise ValueError("lead must not be whitespace-only")
        if self.deadline is not None and not self.deadline.strip():
            raise ValueError("deadline must not be whitespace-only")
        # No empty collection entries
        for field_name in ("team", "task_ids"):
            for value in getattr(self, field_name):
                if not value.strip():
                    raise ValueError(f"Empty or whitespace-only entry in {field_name}")
        # No duplicate team members
        if len(self.team) != len(set(self.team)):
            dupes = sorted(m for m, c in Counter(self.team).items() if c > 1)
            raise ValueError(f"Duplicate entries in team: {dupes}")
        # No duplicate task IDs
        if len(self.task_ids) != len(set(self.task_ids)):
            dupes = sorted(t for t, c in Counter(self.task_ids).items() if c > 1)
            raise ValueError(f"Duplicate entries in task_ids: {dupes}")
        return self
    @model_validator(mode="after")
    def _validate_non_blank_strings(self) -> Self:
        """Ensure string identifier fields are not whitespace-only."""
        for field_name in ("id", "title", "description", "project", "created_by"):
            if not getattr(self, field_name).strip():
                msg = f"{field_name} must not be whitespace-only"
                raise ValueError(msg)
        if self.assigned_to is not None and not self.assigned_to.strip():
            msg = "assigned_to must not be whitespace-only"
            raise ValueError(msg)
        if self.deadline is not None and not self.deadline.strip():
            msg = "deadline must not be whitespace-only"
            raise ValueError(msg)
        return self

    @model_validator(mode="after")
    def _validate_no_empty_collection_entries(self) -> Self:
        """Ensure no empty or whitespace-only entries in string tuples."""
        for field_name in ("reviewers", "dependencies"):
            for value in getattr(self, field_name):
                if not value.strip():
                    msg = f"Empty or whitespace-only entry in {field_name}"
                    raise ValueError(msg)
        return self

    @model_validator(mode="after")
    def _validate_no_self_dependency(self) -> Self:
        """Ensure a task does not depend on itself."""
        if self.id in self.dependencies:
            msg = f"Task {self.id!r} cannot depend on itself"
            raise ValueError(msg)
        return self

    @model_validator(mode="after")
    def _validate_no_duplicate_dependencies(self) -> Self:
        """Ensure no duplicate task IDs in dependencies."""
        if len(self.dependencies) != len(set(self.dependencies)):
            dupes = sorted(d for d, c in Counter(self.dependencies).items() if c > 1)
            msg = f"Duplicate entries in dependencies: {dupes}"
            raise ValueError(msg)
        return self

    @model_validator(mode="after")
    def _validate_no_duplicate_reviewers(self) -> Self:
        """Ensure no duplicate agent IDs in reviewers."""
        if len(self.reviewers) != len(set(self.reviewers)):
            dupes = sorted(r for r, c in Counter(self.reviewers).items() if c > 1)
            msg = f"Duplicate entries in reviewers: {dupes}"
            raise ValueError(msg)
        return self
The first five validators can be combined into a single, more efficient validator. This avoids multiple function calls and groups related validation logic, improving performance and maintainability.
    @model_validator(mode="after")
    def _validate_fields(self) -> Self:
        """Validate string fields, collections, and uniqueness of dependencies/reviewers."""
        # Non-blank strings
        for field_name in ("id", "title", "description", "project", "created_by"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be whitespace-only")
        if self.assigned_to is not None and not self.assigned_to.strip():
            raise ValueError("assigned_to must not be whitespace-only")
        if self.deadline is not None and not self.deadline.strip():
            raise ValueError("deadline must not be whitespace-only")
        # No empty collection entries
        for field_name in ("reviewers", "dependencies"):
            for value in getattr(self, field_name):
                if not value.strip():
                    raise ValueError(f"Empty or whitespace-only entry in {field_name}")
        # Dependency validation
        if self.id in self.dependencies:
            raise ValueError(f"Task {self.id!r} cannot depend on itself")
        if len(self.dependencies) != len(set(self.dependencies)):
            dupes = sorted(d for d, c in Counter(self.dependencies).items() if c > 1)
            raise ValueError(f"Duplicate entries in dependencies: {dupes}")
        # Reviewer validation
        if len(self.reviewers) != len(set(self.reviewers)):
            dupes = sorted(r for r, c in Counter(self.reviewers).items() if c > 1)
            raise ValueError(f"Duplicate entries in reviewers: {dupes}")
        return self
Pull request overview
Copilot reviewed 26 out of 27 changed files in this pull request and generated 1 comment.
src/ai_company/core/task.py
Outdated
            ValueError: If the transition is not valid.
        """
        validate_transition(self.status, target)
        return self.model_copy(update={"status": target, **overrides})
Task.with_transition() uses model_copy(update=...), which does not re-run Pydantic validation on the updated fields. This can create Tasks that violate the model invariants (e.g., transitioning to ASSIGNED without providing assigned_to, or overriding fields with whitespace-only values) while still returning a Task instance. Consider reconstructing the model via Task.model_validate(...) / Task(**data) from a merged model_dump() + overrides so validators (including assignment/status consistency) are enforced on the returned object.
-        return self.model_copy(update={"status": target, **overrides})
+        data = self.model_dump()
+        data.update({"status": target, **overrides})
+        return Task.model_validate(data)
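The behaviour this comment describes can be reproduced on a small stand-in model. The sketch below assumes Pydantic v2 is installed; `MiniTask`, `Status`, and `revalidated` are hypothetical names used only for illustration and are far smaller than the PR's `Task`.

```python
from enum import Enum
from typing import Any, Optional

from pydantic import BaseModel, ConfigDict, ValidationError, model_validator


class Status(str, Enum):
    CREATED = "created"
    ASSIGNED = "assigned"


class MiniTask(BaseModel):
    """Hypothetical two-field stand-in for the PR's Task model."""

    model_config = ConfigDict(frozen=True)

    id: str
    status: Status = Status.CREATED
    assigned_to: Optional[str] = None

    @model_validator(mode="after")
    def _check_assignment(self) -> "MiniTask":
        # Invariant: an ASSIGNED task must name an assignee.
        if self.status is Status.ASSIGNED and self.assigned_to is None:
            raise ValueError("ASSIGNED requires assigned_to")
        return self


task = MiniTask(id="t-1")

# model_copy(update=...) copies the instance and overwrites fields
# without running validators, so the invariant is silently broken.
unchecked = task.model_copy(update={"status": Status.ASSIGNED})


def revalidated(source: MiniTask, **changes: Any) -> MiniTask:
    """Merge changes into a dump of the model and validate the result."""
    data = source.model_dump()
    data.update(changes)
    return MiniTask.model_validate(data)
```

Calling `revalidated(task, status=Status.ASSIGNED)` raises `ValidationError`, while passing `assigned_to` alongside the new status returns a valid instance. One trade-off of the dump-and-validate route is that nested models are rebuilt from plain dicts, which costs more than a shallow copy.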
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/budget/config.py`:
- Around line 63-124: The validators (_validate_no_empty_aliases,
_validate_no_self_downgrade, _validate_no_duplicate_source_aliases) check for
whitespace-only aliases but do not normalize values, so aliases like " opus "
could bypass matching; add a normalization step in AutoDowngradeConfig that
strips whitespace from both source and target alias strings before other
validations run (e.g., implement a model_validator on AutoDowngradeConfig named
_normalize_downgrade_map with mode="before", or a field_validator on
downgrade_map with mode="before", that transforms downgrade_map into a tuple of
stripped (source.strip(), target.strip()) pairs), ensuring subsequent validators
operate on normalized aliases.
In `@src/ai_company/core/task.py`:
- Around line 118-121: The deadline field currently allows any non-empty
string. Add strict ISO 8601 validation on the Task model by replacing or
augmenting the Field approach for deadline with a Pydantic v2 validator (e.g., a
`@field_validator`("deadline") or a model_validator) that attempts to parse the
value as an ISO 8601 datetime (using datetime.fromisoformat or
dateutil.parser.isoparse) and raises a ValueError for invalid strings, which
Pydantic reports as a ValidationError; apply the same validator pattern to the
other datetime-like field(s) in the same model (the later time fields
referenced in the diff) so invalid dates are rejected at model validation time.
- Around line 201-219: The with_transition method currently uses
model_copy(update=...) which skips Pydantic v2 validators; change it to rebuild
the instance through validation so validators (including
_validate_assignment_consistency) run and prevent invalid states. model_validate
is a classmethod that takes the data to validate and has no update argument, so
merge self.model_dump() with the overrides and pass the result to
Task.model_validate(...); set the "status" key to the target after merging
(i.e., force "status": target so callers cannot override status via
**overrides), call validate_transition(self.status, target) as before, then
return the validated Task.
In `@tests/unit/budget/test_config.py`:
- Around line 233-246: Add a symmetric unit test that verifies
per_agent_daily_limit equal to total_monthly is accepted: create a new test
function (e.g., test_per_agent_daily_equals_monthly_accepted) that constructs
BudgetConfig(total_monthly=10.0, per_agent_daily_limit=10.0) and asserts
cfg.per_agent_daily_limit == cfg.total_monthly; mirror the style and docstring
of test_per_task_equals_monthly_accepted to ensure consistent boundary coverage
for per_agent_daily_limit.
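Two of the inline suggestions above (ISO 8601 deadlines and alias normalization) reduce to small pure functions that a validator could call. The sketch below uses only the standard library; `parse_deadline` and `normalize_downgrade_map` are hypothetical helper names, not functions from the PR.

```python
from collections.abc import Iterable
from datetime import datetime


def parse_deadline(value: str) -> datetime:
    """Parse an ISO 8601 deadline, raising ValueError on anything else."""
    try:
        # fromisoformat accepts a wider range of ISO 8601 forms from
        # Python 3.11 onward; earlier versions accept a narrower subset.
        return datetime.fromisoformat(value)
    except ValueError as exc:
        raise ValueError(f"deadline must be ISO 8601, got {value!r}") from exc


def normalize_downgrade_map(
    pairs: Iterable[tuple[str, str]],
) -> tuple[tuple[str, str], ...]:
    """Strip surrounding whitespace from each (source, target) alias pair."""
    return tuple((source.strip(), target.strip()) for source, target in pairs)
```

Running the normalization before the emptiness, self-downgrade, and duplicate checks means an alias such as " opus " is compared as "opus" everywhere, so it cannot slip past the duplicate check as a distinct key.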
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (15)
src/ai_company/budget/config.py
src/ai_company/budget/hierarchy.py
src/ai_company/budget/spending_summary.py
src/ai_company/constants.py
src/ai_company/core/company.py
src/ai_company/core/enums.py
src/ai_company/core/project.py
src/ai_company/core/task.py
src/ai_company/core/task_transitions.py
tests/unit/budget/test_config.py
tests/unit/core/test_artifact.py
tests/unit/core/test_enums.py
tests/unit/core/test_project.py
tests/unit/core/test_task.py
tests/unit/core/test_task_transitions.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations`; Python 3.14+ has PEP 649 native lazy annotations
Include type hints on all public functions in Python
Use Google-style docstrings on all public classes and functions in Python
Create new objects instead of mutating existing ones — maintain immutability
Use Pydantic v2 `BaseModel`, `model_validator`, and `ConfigDict` for data models
Enforce 88-character line length in Python code
Keep functions under 50 lines of code
Keep Python files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries — user input, external APIs, config files
Files:
src/ai_company/core/company.py, src/ai_company/core/enums.py, tests/unit/core/test_task.py, src/ai_company/core/project.py, src/ai_company/core/task_transitions.py, tests/unit/core/test_project.py, tests/unit/budget/test_config.py, tests/unit/core/test_enums.py, tests/unit/core/test_task_transitions.py, src/ai_company/budget/spending_summary.py, src/ai_company/budget/config.py, tests/unit/core/test_artifact.py, src/ai_company/core/task.py, src/ai_company/budget/hierarchy.py, src/ai_company/constants.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Maintain minimum 80% code coverage across test suite
Run `mypy` in strict mode on all source Python files for type checking
Files:
src/ai_company/core/company.py, src/ai_company/core/enums.py, src/ai_company/core/project.py, src/ai_company/core/task_transitions.py, src/ai_company/budget/spending_summary.py, src/ai_company/budget/config.py, src/ai_company/core/task.py, src/ai_company/budget/hierarchy.py, src/ai_company/constants.py
{src,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Run `ruff check` and `ruff format` on all source and test Python files
Files:
src/ai_company/core/company.py, src/ai_company/core/enums.py, tests/unit/core/test_task.py, src/ai_company/core/project.py, src/ai_company/core/task_transitions.py, tests/unit/core/test_project.py, tests/unit/budget/test_config.py, tests/unit/core/test_enums.py, tests/unit/core/test_task_transitions.py, src/ai_company/budget/spending_summary.py, src/ai_company/budget/config.py, tests/unit/core/test_artifact.py, src/ai_company/core/task.py, src/ai_company/budget/hierarchy.py, src/ai_company/constants.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, and `@pytest.mark.slow` markers on test functions
Set test timeout to 30 seconds per test
Files:
tests/unit/core/test_task.py, tests/unit/core/test_project.py, tests/unit/budget/test_config.py, tests/unit/core/test_enums.py, tests/unit/core/test_task_transitions.py, tests/unit/core/test_artifact.py
🧠 Learnings (9)
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : All new code must have corresponding unit tests. When modifying existing code, update related tests. Tests should cover both happy paths and edge cases.
Applied to files:
tests/unit/core/test_task.py, tests/unit/core/test_project.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: All new code must have corresponding unit tests. When modifying existing code, update related tests. Tests should cover both happy paths and edge cases.
Applied to files:
tests/unit/core/test_task.py, tests/unit/core/test_project.py
📚 Learning: 2026-02-27T18:32:19.627Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-27T18:32:19.627Z
Learning: Applies to tests/**/*.py : Set test timeout to 30 seconds per test
Applied to files:
tests/unit/core/test_task.py, tests/unit/core/test_project.py, tests/unit/core/test_enums.py, tests/unit/core/test_task_transitions.py
📚 Learning: 2026-02-27T18:32:19.627Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-27T18:32:19.627Z
Learning: Applies to tests/**/*.py : Use `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, and `pytest.mark.slow` markers on test functions
Applied to files:
tests/unit/core/test_task.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/*.py : Validate all LLM outputs by checking required fields, verifying data types, handling malformed responses, and using Pydantic models for structured data
Applied to files:
src/ai_company/core/project.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures
Applied to files:
tests/unit/core/test_project.py
📚 Learning: 2026-01-24T09:54:45.426Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/agents.instructions.md:0-0
Timestamp: 2026-01-24T09:54:45.426Z
Learning: Applies to agents/test*.py : Agent tests should cover: successful generation with valid output, handling malformed LLM responses, error conditions (network errors, timeouts), output format validation, and integration with story state
Applied to files:
tests/unit/core/test_task_transitions.py, tests/unit/core/test_artifact.py
📚 Learning: 2026-02-27T18:32:19.627Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-27T18:32:19.627Z
Learning: Applies to src/ai_company/config/**/*.py : Use YAML for company configuration loading and validation in the `config/` module
Applied to files:
src/ai_company/budget/config.py
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to tests/**/*.py : Tests must use fake model names (e.g., `test-model:8b`, `fake-writer:latest`)—never use real model IDs from `RECOMMENDED_MODELS`.
Applied to files:
tests/unit/core/test_artifact.py
🧬 Code graph analysis (11)
tests/unit/core/test_task.py (2)
- src/ai_company/core/artifact.py (1): ExpectedArtifact(13-37)
- src/ai_company/core/task.py (3): AcceptanceCriterion(13-38), Task(41-219), with_transition(201-219)
src/ai_company/core/project.py (3)
- src/ai_company/core/enums.py (1): ProjectStatus(183-190)
- src/ai_company/core/task.py (2): _validate_non_blank_strings(128-140), _validate_no_empty_collection_entries(143-150)
- src/ai_company/core/artifact.py (1): _validate_non_blank_strings(82-88)
src/ai_company/core/task_transitions.py (1)
- src/ai_company/core/enums.py (1): TaskStatus(122-143)
tests/unit/core/test_project.py (3)
- src/ai_company/core/enums.py (1): ProjectStatus(183-190)
- src/ai_company/core/project.py (1): Project(11-106)
- tests/unit/core/conftest.py (2): ProjectFactory(134-135), sample_project(262-270)
tests/unit/budget/test_config.py (2)
- src/ai_company/budget/config.py (3): AutoDowngradeConfig(63-124), BudgetAlertConfig(13-60), BudgetConfig(127-192)
- tests/unit/budget/conftest.py (4): AutoDowngradeConfigFactory(37-40), BudgetAlertConfigFactory(30-34), BudgetConfigFactory(43-49), sample_budget_config(97-108)
tests/unit/core/test_enums.py (1)
- src/ai_company/core/enums.py (7): ArtifactType(175-180), CompanyType(72-82), Complexity(166-172), Priority(157-163), ProjectStatus(183-190), TaskStatus(122-143), TaskType(146-154)
tests/unit/core/test_task_transitions.py (2)
- src/ai_company/core/enums.py (1): TaskStatus(122-143)
- src/ai_company/core/task_transitions.py (1): validate_transition(49-73)
src/ai_company/budget/spending_summary.py (2)
- src/ai_company/budget/enums.py (1): BudgetAlertLevel(6-16)
- src/ai_company/budget/hierarchy.py (2): _validate_department_name_not_blank(76-81), _validate_unique_department_names(135-142)
tests/unit/core/test_artifact.py (3)
- src/ai_company/core/artifact.py (2): Artifact(40-88), ExpectedArtifact(13-37)
- src/ai_company/core/enums.py (1): ArtifactType(175-180)
- tests/unit/core/conftest.py (2): ExpectedArtifactFactory(116-117), ArtifactFactory(120-121)
src/ai_company/core/task.py (3)
- src/ai_company/core/artifact.py (1): ExpectedArtifact(13-37)
- src/ai_company/core/enums.py (4): Complexity(166-172), Priority(157-163), TaskStatus(122-143), TaskType(146-154)
- src/ai_company/core/task_transitions.py (1): validate_transition(49-73)
src/ai_company/budget/hierarchy.py (2)
- src/ai_company/budget/spending_summary.py (2): _validate_department_name_not_blank(150-155), _validate_unique_department_names(210-217)
- src/ai_company/core/company.py (1): _validate_unique_team_names(91-98)
🔇 Additional comments (40)
src/ai_company/constants.py (1)
1-4: LGTM! Clean extraction of the shared constant with clear documentation explaining its purpose. Centralizing this avoids magic numbers scattered across the codebase.
src/ai_company/core/company.py (2)
9-9: LGTM! Good refactor to import the shared constant, eliminating duplication and ensuring consistent rounding precision across the codebase.
248-248: LGTM! Correctly uses the shared `BUDGET_ROUNDING_PRECISION` constant for budget sum validation.
src/ai_company/budget/spending_summary.py (5)
1-16: LGTM! Well-structured module with clear docstring explaining its purpose in relation to DESIGN_SPEC Section 10.3. Imports are clean and the `noqa: TC003` comment appropriately documents why `datetime` must be imported at runtime.
18-65: LGTM! `PeriodSpending` correctly validates that the period start is strictly before the end, with a clear error message including ISO-formatted timestamps. The model is properly frozen and all numeric fields have appropriate `ge=0` constraints.
67-109: LGTM! `AgentSpending` properly validates non-blank `agent_id` with both `min_length=1` constraint and whitespace-only check. Frozen model with appropriate field constraints.
111-156: LGTM! `DepartmentSpending` follows the same validation pattern as `AgentSpending`, consistent with the similar validator in `hierarchy.py` (lines 75-80 in the relevant snippets).
158-217: LGTM! `SpendingSummary` properly validates uniqueness of agent IDs and department names using the `Counter` pattern consistent with other modules. The `budget_used_percent` field correctly has no upper bound, allowing reporting of overspending scenarios.
src/ai_company/budget/hierarchy.py (4)
1-14: LGTM! Clean module setup implementing DESIGN_SPEC Section 10.1 with appropriate imports.
16-44: LGTM! `TeamBudget` correctly validates non-blank `team_name` with both `min_length=1` and whitespace check. Budget percent constrained to 0-100 range. Frozen for immutability.
110-155: LGTM! `BudgetHierarchy` correctly validates unique department names and enforces the budget sum ceiling at 100%. The docstring clearly documents that under-allocation is permitted for company-level reserves.
46-108: Design is correct: under-allocation is intentionally allowed for reserves. Tests explicitly verify under-allocation is accepted: `test_team_budget_sum_under_100_accepted()` confirms teams may sum to 50%, and `test_department_budget_sum_under_100_accepted()` confirms departments may sum to 80%. The validator enforces only the maximum (≤ 100%), not a minimum, and docstrings document the intentional reserve behavior at both department and company levels.
src/ai_company/budget/config.py (3)
1-11: LGTM! Clean module setup with appropriate imports. The docstring clearly references the design spec section being implemented.
13-60: Well-structured threshold configuration. Good use of `strict=True` on the integer fields to reject float values, and the `model_validator` correctly enforces the ordering constraint. The defaults (75, 90, 100) are sensible and satisfy the validator.
127-192: LGTM! The top-level config correctly composes nested configs with `default_factory` and validates that per-task/per-agent limits don't exceed the monthly budget. The zero-budget bypass semantics are well-documented in the validator docstring (lines 171-173).
tests/unit/budget/test_config.py (4)
1-18: LGTM! Proper module setup with `pytestmark = pytest.mark.timeout(30)` applying the 30-second timeout to all tests as per coding guidelines.
23-91: Comprehensive coverage for BudgetAlertConfig. Good test coverage including boundary values (0, 100), ordering constraints, strict int validation, and immutability. The `match` parameter on `pytest.raises` helps ensure the correct validation is triggered.
96-187: Thorough validation tests for AutoDowngradeConfig. The tests cover all three validator paths: empty aliases, self-downgrade, and duplicate sources. Boundary tests for threshold (0, 100) are also included.
254-263: Good JSON roundtrip test. The test validates that nested configurations survive serialization/deserialization, which is important for config persistence scenarios.
tests/unit/core/test_artifact.py (2)
1-54: LGTM! Comprehensive test coverage for `ExpectedArtifact` including construction, validation, immutability, and serialization. The timeout marker and unit markers are properly applied.
59-201: LGTM! Thorough test coverage for `Artifact` model covering all validation scenarios (whitespace rejection for id, path, task_id, created_by), default values, datetime handling, immutability, factory usage, and JSON round-trip serialization.
tests/unit/core/test_enums.py (2)
61-78: LGTM! Member count tests correctly verify the expected number of members for each new enum, providing a regression guard against accidental additions or removals.
105-145: LGTM! String value assertions comprehensively cover all members of the new enums, ensuring serialization consistency and matching the enum definitions.
tests/unit/core/test_task_transitions.py (3)
1-52: LGTM! Comprehensive coverage of all valid state transitions per DESIGN_SPEC 6.1, including the timeout marker that was previously flagged. Each transition is tested individually for clarity.
54-102: LGTM! Thorough invalid transition testing including terminal state enforcement and error message verification. The `test_error_message_includes_allowed` test ensures helpful error messages.
104-134: LGTM! Excellent completeness tests that guard against configuration drift: they verify that all statuses have entries, terminal states have no outgoing transitions, all targets are valid TaskStatus values, and no self-transitions exist.
src/ai_company/core/enums.py (2)
121-143: LGTM! Well-documented `TaskStatus` enum with comprehensive docstring referencing the authoritative transition map. The seven lifecycle states align with DESIGN_SPEC 6.1 requirements.
145-190: LGTM! The remaining enums (`TaskType`, `Priority`, `Complexity`, `ArtifactType`, `ProjectStatus`) are cleanly defined with appropriate docstrings and lowercase string values consistent with the existing enum patterns.
tests/unit/core/test_project.py (2)
1-24: LGTM! Good test setup with module-level factory import (addressing prior review feedback), timeout marker, and a clean helper function for creating test instances with sensible defaults.
29-213: LGTM! Comprehensive test coverage including construction, defaults, string validation, duplicate detection, budget constraints, immutability, factory usage, JSON serialization, and fixture verification. All validation scenarios from the `Project` model are well-tested.
tests/unit/core/test_task.py (5)
1-34: LGTM! Good test setup with timeout marker (addressing prior review), comprehensive imports, and a clean helper function for creating test instances.
39-77: LGTM! Thorough `AcceptanceCriterion` testing including defaults, validation, immutability, factory usage, and JSON round-trip.
82-133: LGTM! Comprehensive `Task` construction tests covering minimal valid task, all fields set, and default values verification.
138-283: LGTM! Thorough validation testing including string field whitespace rejection, self-dependency prevention, duplicate detection for dependencies and reviewers, and assignment/status consistency rules. The tests correctly verify that BLOCKED and CANCELLED states allow flexible assignee handling.
288-433: LGTM! Excellent coverage of budget validation, immutability, factory usage, JSON serialization, fixture verification, and the `with_transition` method. The `with_transition` tests properly verify state machine enforcement, immutability preservation, and error handling for invalid transitions.
src/ai_company/core/task_transitions.py (2)
1-46: LGTM! Well-structured transition map using `frozenset` for immutability, with comprehensive coverage of all `TaskStatus` members. The module docstring clearly documents the state machine per DESIGN_SPEC 6.1.
49-73: LGTM! The `validate_transition` function is well-implemented with proper Google-style docstring, type hints, and descriptive error messages that include allowed transitions for debugging. The defensive check for missing entries (lines 60-65) guards against configuration drift.
src/ai_company/core/project.py (2)
1-63: LGTM! Well-structured `Project` model with frozen configuration, appropriate field constraints (`min_length`, `ge`), and sensible defaults. The use of `tuple[str, ...]` for collections maintains immutability.
65-106: LGTM! Validators follow established patterns from `Task` and `Artifact` models. The use of `Counter` for duplicate detection is consistent across the codebase, and error messages are descriptive.
src/ai_company/core/task.py (1)
152-176: Good integrity checks for dependency/reviewer collections. The self-dependency and duplicate detection validators are clear and enforce domain invariants well.
…d Copilot
- Fix with_transition to use model_validate instead of model_copy, ensuring validators run on new instances and blocking status override via kwargs
- Add ISO 8601 validation for deadline fields in Task and Project models
- Consolidate multiple validators into fewer, more efficient validators
- Normalize whitespace in AutoDowngradeConfig downgrade_map aliases
- Add symmetric boundary test for per_agent_daily_limit == total_monthly
- Add tests for deadline validation, status override rejection, and validator enforcement on transitions
🤖 I have created a release *beep* *boop* --- ## [0.1.0](v0.0.0...v0.1.0) (2026-03-11) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive 
trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation: message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions: Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes
* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance
* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring
* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation
* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests
* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD
* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15))
* bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e))
* bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238))
* bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e))
* bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7))
* bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance
* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup: PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------
Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Summary
* core/: Task model with all 16 fields from DESIGN_SPEC 6.2, state machine validation (7 lifecycle states with enforced transitions), AcceptanceCriterion structured data, ExpectedArtifact/Artifact models, and a Project model with team, deadline, and budget tracking
* budget/: BudgetConfig with alert thresholds and auto-downgrade settings, BudgetHierarchy for Company→Department→Team allocation, immutable CostRecord for per-API-call tracking, and SpendingSummary for aggregated reporting

Closes #57, closes #61
Details
Task State Machine (DESIGN_SPEC 6.1)
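A minimal sketch of how a transition map and validator of this kind could look. The state names and the allowed edges below are assumptions chosen for illustration; they are not taken from DESIGN_SPEC 6.1, and `validate_transition` is a hypothetical helper name.

```python
from enum import Enum


class TaskStatus(str, Enum):
    """Seven lifecycle states (names are assumptions, not from the spec)."""

    CREATED = "created"
    ASSIGNED = "assigned"
    IN_PROGRESS = "in_progress"
    IN_REVIEW = "in_review"
    BLOCKED = "blocked"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


# Allowed transitions keyed by the current state (hypothetical edges).
# Terminal states map to an empty set, so nothing can follow them.
VALID_TRANSITIONS: dict[TaskStatus, frozenset[TaskStatus]] = {
    TaskStatus.CREATED: frozenset({TaskStatus.ASSIGNED, TaskStatus.CANCELLED}),
    TaskStatus.ASSIGNED: frozenset(
        {TaskStatus.IN_PROGRESS, TaskStatus.BLOCKED, TaskStatus.CANCELLED}
    ),
    TaskStatus.IN_PROGRESS: frozenset(
        {TaskStatus.IN_REVIEW, TaskStatus.BLOCKED, TaskStatus.CANCELLED}
    ),
    TaskStatus.IN_REVIEW: frozenset(
        {TaskStatus.COMPLETED, TaskStatus.IN_PROGRESS, TaskStatus.CANCELLED}
    ),
    TaskStatus.BLOCKED: frozenset({TaskStatus.ASSIGNED, TaskStatus.CANCELLED}),
    TaskStatus.COMPLETED: frozenset(),
    TaskStatus.CANCELLED: frozenset(),
}


def validate_transition(current: TaskStatus, target: TaskStatus) -> None:
    """Raise ValueError if moving from `current` to `target` is not allowed."""
    allowed = VALID_TRANSITIONS[current]
    if target not in allowed:
        names = sorted(status.value for status in allowed)
        raise ValueError(
            f"Invalid transition {current.value} -> {target.value}; "
            f"allowed targets: {names}"
        )
```

Keeping the map as module-level data, separate from the model class, lets other layers consult the same rules without importing the model itself.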
Budget Hierarchy (DESIGN_SPEC 10.1)
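A rough sketch of a Company→Department→Team allocation with validation. The PR uses frozen Pydantic v2 models; this example swaps in frozen standard-library dataclasses to stay dependency-free. The field names (`total_monthly`, `budget_percent`), the percentage-based allocation, and the `team_allocation` helper are all assumptions for illustration, not the PR's actual schema.

```python
from dataclasses import dataclass


def _check_percent(value: float, label: str) -> None:
    """Reject percentages outside the 0 to 100 range."""
    if not 0.0 <= value <= 100.0:
        raise ValueError(f"{label} must be between 0 and 100, got {value}")


@dataclass(frozen=True)
class TeamBudget:
    name: str
    budget_percent: float  # share of the parent department's budget

    def __post_init__(self) -> None:
        _check_percent(self.budget_percent, f"team {self.name!r} budget_percent")


@dataclass(frozen=True)
class DepartmentBudget:
    name: str
    budget_percent: float  # share of the company budget
    teams: tuple[TeamBudget, ...] = ()

    def __post_init__(self) -> None:
        _check_percent(self.budget_percent, f"department {self.name!r} budget_percent")
        if sum(team.budget_percent for team in self.teams) > 100.0:
            raise ValueError(f"teams in {self.name!r} exceed 100% of the department")


@dataclass(frozen=True)
class BudgetHierarchy:
    total_monthly: float
    departments: tuple[DepartmentBudget, ...] = ()

    def __post_init__(self) -> None:
        if sum(dept.budget_percent for dept in self.departments) > 100.0:
            raise ValueError("departments exceed 100% of the company budget")

    def team_allocation(self, department: str, team: str) -> float:
        """Absolute monthly amount available to one team."""
        for dept in self.departments:
            if dept.name != department:
                continue
            for member in dept.teams:
                if member.name == team:
                    return (
                        self.total_monthly
                        * dept.budget_percent
                        * member.budget_percent
                        / 10_000
                    )
        raise KeyError(f"{department}/{team}")


hierarchy = BudgetHierarchy(
    total_monthly=1000.0,
    departments=(
        DepartmentBudget(
            "engineering",
            50.0,
            teams=(TeamBudget("backend", 40.0), TeamBudget("frontend", 60.0)),
        ),
        DepartmentBudget("marketing", 30.0),
    ),
)
```

With these illustrative numbers, `hierarchy.team_allocation("engineering", "backend")` resolves 40% of 50% of the 1000.0 company total.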
Key Design Decisions
* task_transitions.py as a separate module for state machine reuse by the engine layer (M3)
* tuple[tuple[str, str], ...] for downgrade_map (immutability over dict)
* datetime for timestamps (type-safe, Pydantic handles ISO 8601)

Test plan
* uv run ruff check src/ tests/: all checks passed
* uv run mypy src/: no issues in 29 source files
* uv run pytest tests/unit/ -m unit: 502 passed

🤖 Generated with Claude Code