refactor: harden personality profiles, department validation, and template rendering (#158)
Conversation
Department structure (#78):
- Add ReportingLine, ReviewRequirements, ApprovalChain, DepartmentPolicies, WorkflowHandoff, and EscalationPath frozen Pydantic models
- Extend Department with reporting_lines and policies fields
- Extend Company with workflow_handoffs and escalation_paths fields
- Add ActionType enum for approval chain action types
- Wire through config schema, defaults, template schema, and renderer

Agent personality profiles (#79):
- Extend PersonalityConfig with Big Five (OCEAN) float dimensions and behavioral enums (DecisionMakingStyle, CollaborationPreference, CommunicationVerbosity, ConflictApproach)
- Add personality compatibility scoring module (compute_compatibility, compute_team_compatibility) with weighted composite algorithm
- Expand presets from 4 to 15 with Big Five and behavioral dimensions
- Enhance prompt template with new personality variables (v1.1.0)
- Support inline personality dict override in template renderer

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…plate rendering

Pre-reviewed by 10 agents, 32 findings addressed:
- Add EscalationPath same-department validator with case-insensitive check
- Add max_length constraints to PersonalityConfig description and communication_style
- Add NaN rejection tests for Big Five dimensions
- Wrap PERSONALITY_PRESETS and _AUTO_NAMES with MappingProxyType for immutability
- Add logging before KeyError in get_personality_preset
- Validate inline personality dicts eagerly via PersonalityConfig construction
- Route workflow_handoffs/escalation_paths through _validate_list
- Add type validation for reporting_lines (list) and policies (dict) in _build_departments
- Extract _DEFAULT_PROVIDER and _DEFAULT_DEPARTMENT constants in renderer
- Replace range(len()) nested loops with itertools.combinations in personality scoring
- Fix stale docstrings in templates/schema.py and config/schema.py
- Update DESIGN_SPEC.md §3.1 and §15.3 with personality dimensions
- Consolidate duplicated metadata construction in build_error_prompt
- Reorder test factories to fix DepartmentPoliciesFactory forward reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
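The MappingProxyType hardening and the pre-KeyError logging mentioned above can be sketched as follows. The preset names and values here are hypothetical placeholders, not the repository's actual PERSONALITY_PRESETS contents:

```python
from types import MappingProxyType

# Hypothetical preset table; keys and values are illustrative only.
_PRESETS = {
    "analyst": {"openness": 0.8, "conscientiousness": 0.9},
    "diplomat": {"openness": 0.7, "agreeableness": 0.9},
}

# MappingProxyType exposes a read-only view: lookups work normally,
# but assignment raises TypeError, so the registry cannot be mutated.
PERSONALITY_PRESETS = MappingProxyType(_PRESETS)


def get_personality_preset(name: str) -> dict[str, float]:
    """Return a copy of a preset, failing loudly for unknown names."""
    if name not in PERSONALITY_PRESETS:
        # The real module would log via the observability logger here,
        # per the commit notes, before raising.
        raise KeyError(f"unknown personality preset: {name!r}")
    return dict(PERSONALITY_PRESETS[name])
```

Returning `dict(...)` also prevents callers from mutating the inner preset dicts through the lookup path.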
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in the CodeRabbit review settings.
📝 Walkthrough

Adds personality modeling (Big Five floats + behavioral enums), compatibility scoring (pairwise & team), inline personality support in templates, cross-department governance models (reporting lines, approval chains, handoffs, escalations), new observability event constants, prompt/template integrations, and many tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Personality as core.personality
    participant A as PersonalityConfig A
    participant B as PersonalityConfig B
    participant Observability
    Client->>Personality: compute_compatibility(A, B)
    Personality->>A: read Big Five & behavioral fields
    Personality->>B: read Big Five & behavioral fields
    Personality->>Personality: _big_five_score(...)
    Personality->>Personality: _collaboration_score(...)
    Personality->>Personality: _conflict_score(...)
    Personality->>Personality: aggregate weights & clamp
    Personality->>Observability: emit PERSONALITY_COMPATIBILITY_COMPUTED (components, score)
    Personality-->>Client: return compatibility_score
```
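The weighted-composite scoring shown in the diagram can be sketched with a stdlib-only stand-in. The dimension names follow the Big Five, but the weights, the similarity measure, and the clamping step here are illustrative assumptions, not the project's actual `_big_five_score` implementation:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Big Five dimensions in [0.0, 1.0]; a stand-in for PersonalityConfig."""

    openness: float
    conscientiousness: float
    extraversion: float
    agreeableness: float
    neuroticism: float


def compute_compatibility(a: Profile, b: Profile) -> float:
    """Weighted composite of per-dimension similarity, clamped to [0, 1].

    Each dimension contributes 1 - |difference|, scaled by an assumed
    weight; the weights below sum to 1.0 so identical profiles score 1.0.
    """
    weights = {
        "openness": 0.2,
        "conscientiousness": 0.25,
        "extraversion": 0.15,
        "agreeableness": 0.25,
        "neuroticism": 0.15,
    }
    score = sum(
        w * (1.0 - abs(getattr(a, dim) - getattr(b, dim)))
        for dim, w in weights.items()
    )
    return max(0.0, min(1.0, score))
```

The function is symmetric by construction, which matters because `compute_team_compatibility` averages unordered pairs.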
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly enhances the robustness and configurability of agent personality profiles, department structures, and template rendering within the system. It introduces detailed validation for inter-departmental workflows and escalations, strengthens personality configuration with new dimensions and constraints, and refines template processing for greater reliability and flexibility. The changes aim to improve the system's ability to define and manage complex organizational structures and agent behaviors.
Code Review
This pull request introduces a comprehensive personality system for agents, including Big Five (OCEAN-variant) dimensions for internal compatibility scoring and behavioral enums (DecisionMakingStyle, CollaborationPreference, CommunicationVerbosity, ConflictApproach) for prompt injection. It adds a new PersonalityConfig model to ai_company.core.agent with validation for these new fields, and a personality.py module for calculating pairwise and team-level compatibility scores. The system prompt template (prompt_template.py) is updated to include these new behavioral enums, and the DESIGN_SPEC.md and CLAUDE.md documentation are updated to reflect these changes. Additionally, the PR extends company and department configuration with new models for ReportingLine, ReviewRequirements, ApprovalChain, DepartmentPolicies, WorkflowHandoff, and EscalationPath, allowing for more detailed organizational structure and inter-departmental workflows. Template rendering logic is updated to support inline personality configurations and pass through these new department and company-level settings. Review comments suggest refactoring the duplicated validation logic in ai_company.core.company into a shared utility, and optimizing compute_team_compatibility in ai_company.core.personality for memory efficiency by summing over a generator instead of an intermediate list of pairs.
```python
    @model_validator(mode="after")
    def _validate_different_departments(self) -> Self:
        """Reject escalations within the same department."""
        if (
            self.from_department.strip().casefold()
            == self.to_department.strip().casefold()
        ):
            msg = (
                f"Escalation must be between different departments: "
                f"{self.from_department!r} == {self.to_department!r}"
            )
            raise ValueError(msg)
        return self
```
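The validator above can be exercised in isolation with a stdlib-only stand-in; this sketch uses a frozen dataclass with `__post_init__` instead of a pydantic `model_validator`, but the comparison logic is the same as in the hunk:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationPath:
    """Stand-in for the pydantic model; validation runs at construction."""

    from_department: str
    to_department: str

    def __post_init__(self) -> None:
        # Case-insensitive, whitespace-tolerant comparison, as in the diff:
        # "Engineering" and "  engineering " count as the same department.
        if (
            self.from_department.strip().casefold()
            == self.to_department.strip().casefold()
        ):
            raise ValueError(
                f"Escalation must be between different departments: "
                f"{self.from_department!r} == {self.to_department!r}"
            )
```

`casefold()` is preferred over `lower()` here because it also normalizes characters like ß that `lower()` leaves alone.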
```python
def compute_team_compatibility(
    members: tuple[PersonalityConfig, ...],
) -> float:
    """Compute average pairwise compatibility for a team.

    Args:
        members: Tuple of personality profiles for team members.

    Returns:
        Average pairwise score (1.0 for teams with fewer than 2 members).
    """
    if len(members) <= 1:
        logger.debug(
            PERSONALITY_TEAM_SCORE_COMPUTED,
            team_size=len(members),
            score=1.0,
        )
        return 1.0

    pairs = list(itertools.combinations(members, 2))
    total = sum(compute_compatibility(a, b) for a, b in pairs)
    result = total / len(pairs)

    logger.debug(
        PERSONALITY_TEAM_SCORE_COMPUTED,
        team_size=len(members),
        pair_count=len(pairs),
        score=result,
    )
    return result
```
This implementation can be made more memory-efficient by avoiding the creation of an intermediate list of all pairs, which can be large for teams with many members. The number of pairs can be calculated mathematically, and the sum can be performed over a generator.
```python
def compute_team_compatibility(
    members: tuple[PersonalityConfig, ...],
) -> float:
    """Compute average pairwise compatibility for a team.

    Args:
        members: Tuple of personality profiles for team members.

    Returns:
        Average pairwise score (1.0 for teams with fewer than 2 members).
    """
    num_members = len(members)
    if num_members <= 1:
        logger.debug(
            PERSONALITY_TEAM_SCORE_COMPUTED,
            team_size=num_members,
            score=1.0,
        )
        return 1.0

    # Calculate the pair count mathematically and sum over a generator to
    # avoid storing all pairs in memory, which matters for large teams.
    num_pairs = num_members * (num_members - 1) // 2
    total_score = sum(
        compute_compatibility(a, b) for a, b in itertools.combinations(members, 2)
    )
    result = total_score / num_pairs

    logger.debug(
        PERSONALITY_TEAM_SCORE_COMPUTED,
        team_size=num_members,
        pair_count=num_pairs,
        score=result,
    )
    return result
```
Pull request overview
This PR expands the company template/config surface area to support richer department structures (reporting lines, policies, workflow handoffs, escalation paths) and significantly deepens agent personality modeling (Big Five dimensions, behavioral enums, presets, and compatibility scoring), while updating prompt rendering and documentation accordingly.
Changes:
- Add new department internal-structure models and cross-department workflow models, and wire them through config + template rendering.
- Extend PersonalityConfig with Big Five + behavioral enums, expand presets, and introduce compatibility scoring utilities + observability events.
- Update prompt template/context to include new personality dimensions; update docs and add/extend unit tests.
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/templates/test_schema.py | Adds template schema tests for inline personality, reporting lines, and policies defaults/acceptance. |
| tests/unit/templates/test_renderer.py | Adds renderer-level tests for inline personality precedence, passthrough fields, and validation helpers. |
| tests/unit/templates/test_presets.py | Adds preset validation tests (count, Big Five keys, and PersonalityConfig constructability). |
| tests/unit/observability/test_events.py | Updates event module discovery list to include the new personality event domain. |
| tests/unit/engine/test_prompt.py | Verifies new personality dimensions appear in system prompts; asserts prompt template version bump. |
| tests/unit/core/test_personality.py | New tests for pairwise and team-level personality compatibility scoring behavior. |
| tests/unit/core/test_enums.py | Adds coverage for newly introduced enums (decision-making, collaboration, verbosity, conflict, action types). |
| tests/unit/core/test_company.py | Adds coverage for new company/department workflow and policy models and validators. |
| tests/unit/core/test_agent.py | Adds coverage for Big Five defaults/bounds/NaN rejection and new enum defaults. |
| tests/unit/core/conftest.py | Adds factories for new company policy/workflow models for test construction. |
| src/ai_company/templates/schema.py | Extends template schema to accept inline personality, reporting lines/policies, workflow handoffs, escalation paths. |
| src/ai_company/templates/renderer.py | Validates/threads new template fields into RootConfig-compatible dicts; adds inline personality validation. |
| src/ai_company/templates/presets.py | Expands preset library, makes preset/name registries immutable, and adds logging for unknown presets. |
| src/ai_company/observability/events/personality.py | Introduces personality-related event constants for scoring observability. |
| src/ai_company/engine/prompt_template.py | Bumps template version and renders new personality dimensions into the prompt. |
| src/ai_company/engine/prompt.py | Injects new personality dimensions into prompt context; refactors error prompt metadata construction. |
| src/ai_company/core/personality.py | Adds compatibility scoring implementation for personality profiles and teams. |
| src/ai_company/core/enums.py | Adds behavioral enums and ActionType enum used by policy/workflow features. |
| src/ai_company/core/company.py | Adds reporting lines, department policies, workflow handoffs, escalation paths to core company models. |
| src/ai_company/core/agent.py | Extends PersonalityConfig with Big Five floats, behavioral enums, max lengths, and NaN rejection. |
| src/ai_company/core/__init__.py | Re-exports newly added core enums/models for public API convenience. |
| src/ai_company/config/schema.py | Extends RootConfig to include workflow handoffs and escalation paths. |
| src/ai_company/config/defaults.py | Adds defaults for new RootConfig list fields (handoffs/escalations). |
| DESIGN_SPEC.md | Updates design spec with new personality dimensions, modules, and template examples. |
| CLAUDE.md | Updates repository structure description for templates/presets scope. |
src/ai_company/core/enums.py
Outdated
```python
class ActionType(StrEnum):
    """Convenience constants for common approval action types.

    Fields use ``NotBlankStr`` for extensibility — custom values are valid.
```
The ActionType docstring says “Fields use NotBlankStr for extensibility”, but this is an enum and has no fields using NotBlankStr. Consider rewording to clarify that models use NotBlankStr for action_type values (and ActionType is just a convenience set of common strings).
```diff
-    Fields use ``NotBlankStr`` for extensibility — custom values are valid.
+    Models typically use ``NotBlankStr`` for ``action_type`` fields, so these
+    are optional helper constants and custom string values remain valid.
```
src/ai_company/templates/renderer.py
Outdated
```python
    }

    for key in ("workflow_handoffs", "escalation_paths"):
        if rendered_data.get(key):
```
rendered_data.get(key) is used as a truthiness guard. This will silently skip validation when the user explicitly provides a falsy-but-invalid value (e.g., ""), and it will also treat an explicit empty list as “not provided”. Prefer checking key presence / is not None and always running _validate_list when the field is present so invalid values can’t be masked by defaults.
```diff
-        if rendered_data.get(key):
+        if key in rendered_data and rendered_data[key] is not None:
```
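The difference between the truthiness guard and the presence check can be demonstrated in isolation. This helper is an illustrative stand-in (not the renderer's actual `_validate_list`), raising TypeError where the real code would raise a template error:

```python
def validate_list_fields(rendered_data: dict, keys: tuple[str, ...]) -> None:
    """Validate list-typed fields whenever the key is present, not just truthy."""
    for key in keys:
        # Presence check: an explicit empty list passes, an explicit
        # invalid falsy value (e.g. "") is rejected instead of skipped.
        if key in rendered_data and rendered_data[key] is not None:
            value = rendered_data[key]
            if not isinstance(value, list):
                raise TypeError(
                    f"{key!r} must be a list, got {type(value).__name__}"
                )
```

With `if rendered_data.get(key):` both an empty string and an empty list fall through silently; with the presence check, only the invalid one fails.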
src/ai_company/templates/renderer.py
Outdated
```python
        inline_personality = agent.get("personality")
        preset_name = agent.get("personality_preset")
        if preset_name:
            if inline_personality and isinstance(inline_personality, dict):
```
Inline personality handling uses if inline_personality and isinstance(..., dict). This ignores explicit empty dicts (which should still validate) and also silently ignores invalid falsy values (e.g., empty string) instead of raising a template error. Consider branching on whether the key exists / value is not None, and raise TemplateRenderError when personality is provided but not a mapping.
```diff
-        if inline_personality and isinstance(inline_personality, dict):
+        has_inline_personality = "personality" in agent and agent["personality"] is not None
+        if has_inline_personality:
+            if not isinstance(inline_personality, dict):
+                msg = (
+                    f"Personality for agent {name!r} must be a mapping, "
+                    f"got {type(inline_personality).__name__}"
+                )
+                raise TemplateRenderError(msg)
```
src/ai_company/templates/renderer.py
Outdated
| """ | ||
| try: | ||
| PersonalityConfig(**personality) | ||
| except Exception as exc: |
_validate_inline_personality catches Exception, which will also intercept KeyboardInterrupt/SystemExit and other unexpected exceptions. Narrow this to the expected validation failures (e.g., pydantic.ValidationError, TypeError, ValueError) and let genuinely unexpected exceptions propagate normally.
```diff
-    except Exception as exc:
+    except (ValidationError, TypeError, ValueError) as exc:
```
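The narrowed-except pattern can be shown with a stdlib-only sketch; the validation body here is an illustrative stand-in for `PersonalityConfig(**personality)`, and `TemplateRenderError` is redefined locally so the example is self-contained:

```python
class TemplateRenderError(Exception):
    """Stand-in for the renderer's error type."""


def validate_inline_personality(personality: dict, agent_name: str) -> None:
    """Wrap only expected validation failures in a template error."""
    try:
        # Stand-in for PersonalityConfig(**personality): reject
        # non-numeric dimension values.
        for key, value in personality.items():
            if not isinstance(value, (int, float)):
                raise ValueError(f"{key} must be numeric")
    except (TypeError, ValueError) as exc:
        # Only expected validation failures are wrapped; KeyboardInterrupt,
        # SystemExit, and other unexpected exceptions propagate unchanged.
        raise TemplateRenderError(
            f"invalid inline personality for agent {agent_name!r}: {exc}"
        ) from exc
```

A bare `except Exception:` would have swallowed genuinely unexpected failures into the same template error, hiding bugs behind a misleading message.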
```python
        reporting_lines = dept.get("reporting_lines")
        if reporting_lines:
            if not isinstance(reporting_lines, list):
                dept_name = dept.get("name", "")
                msg = f"Department {dept_name!r} 'reporting_lines' must be a list"
                raise TemplateRenderError(msg)
            dept_dict["reporting_lines"] = reporting_lines
        policies = dept.get("policies")
        if policies:
            if not isinstance(policies, dict):
                dept_name = dept.get("name", "")
                msg = f"Department {dept_name!r} 'policies' must be a mapping"
                raise TemplateRenderError(msg)
            dept_dict["policies"] = policies
```
Department extension validation is guarded by if reporting_lines: / if policies:. This means explicit falsy-but-invalid values (e.g., "", 0, [], {}) can bypass type validation and be silently dropped. Prefer validating whenever the key is present (or value is not None), and allow empty lists/dicts as valid values.
src/ai_company/core/personality.py
Outdated
```python
    if len(members) <= 1:
        logger.debug(
            PERSONALITY_TEAM_SCORE_COMPUTED,
            team_size=len(members),
            score=1.0,
        )
        return 1.0

    pairs = list(itertools.combinations(members, 2))
    total = sum(compute_compatibility(a, b) for a, b in pairs)
    result = total / len(pairs)

    logger.debug(
        PERSONALITY_TEAM_SCORE_COMPUTED,
        team_size=len(members),
        pair_count=len(pairs),
```
compute_team_compatibility materializes all pair combinations into a list solely to compute len(pairs) for logging. This adds unnecessary memory overhead; you can iterate the combinations generator directly and compute pair_count as n*(n-1)//2 (or increment a counter while summing) to keep it O(1) memory.
```diff
-    if len(members) <= 1:
-        logger.debug(
-            PERSONALITY_TEAM_SCORE_COMPUTED,
-            team_size=len(members),
-            score=1.0,
-        )
-        return 1.0
-
-    pairs = list(itertools.combinations(members, 2))
-    total = sum(compute_compatibility(a, b) for a, b in pairs)
-    result = total / len(pairs)
-
-    logger.debug(
-        PERSONALITY_TEAM_SCORE_COMPUTED,
-        team_size=len(members),
-        pair_count=len(pairs),
+    team_size = len(members)
+    if team_size <= 1:
+        logger.debug(
+            PERSONALITY_TEAM_SCORE_COMPUTED,
+            team_size=team_size,
+            score=1.0,
+        )
+        return 1.0
+
+    pair_count = team_size * (team_size - 1) // 2
+    total = sum(
+        compute_compatibility(a, b)
+        for a, b in itertools.combinations(members, 2)
+    )
+    result = total / pair_count
+
+    logger.debug(
+        PERSONALITY_TEAM_SCORE_COMPUTED,
+        team_size=team_size,
+        pair_count=pair_count,
```
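The arithmetic behind this suggestion — computing C(n, 2) directly instead of materializing the pair list — can be checked in isolation. `pairwise_mean` and the score function below are illustrative stand-ins, not project code:

```python
import itertools
from collections.abc import Callable, Sequence


def pairwise_mean(
    members: Sequence[float],
    score: Callable[[float, float], float],
) -> float:
    """Average a score over all unordered pairs, without storing the pairs."""
    n = len(members)
    if n <= 1:
        return 1.0
    # C(n, 2): number of unordered pairs, computed without building a list.
    pair_count = n * (n - 1) // 2
    total = sum(score(a, b) for a, b in itertools.combinations(members, 2))
    return total / pair_count
```

Because `itertools.combinations` is consumed lazily by `sum`, memory stays O(1) in the number of pairs while the result is identical to the list-based version.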
Greptile Summary

This PR successfully hardens personality profiles, department validation, and template rendering. Key changes:
All exception handling, immutability wrapping, and template routing have been properly implemented. The code passes 2590 tests at 96.36% coverage with strict mypy/ruff compliance.

Confidence Score: 4/5
Class Diagram

```mermaid
%%{init: {'theme': 'neutral'}}%%
classDiagram
    class PersonalityConfig {
        +float openness
        +float conscientiousness
        +float extraversion
        +float agreeableness
        +float stress_response
        +DecisionMakingStyle decision_making
        +CollaborationPreference collaboration
        +CommunicationVerbosity verbosity
        +ConflictApproach conflict_approach
        +str description
        +str communication_style
    }
    class Company {
        +tuple~Department~ departments
        +tuple~WorkflowHandoff~ workflow_handoffs
        +tuple~EscalationPath~ escalation_paths
        +_validate_departments()
    }
    class Department {
        +NotBlankStr name
        +NotBlankStr head
        +float budget_percent
        +tuple~Team~ teams
        +tuple~ReportingLine~ reporting_lines
        +DepartmentPolicies policies
        +_validate_unique_team_names()
        +_validate_unique_subordinates()
    }
    class DepartmentPolicies {
        +ReviewRequirements review_requirements
        +tuple~ApprovalChain~ approval_chains
        +_validate_unique_action_types()
    }
    class ReportingLine {
        +NotBlankStr subordinate
        +NotBlankStr supervisor
        +_validate_not_self_report()
    }
    class ApprovalChain {
        +NotBlankStr action_type
        +tuple~NotBlankStr~ approvers
        +int min_approvals
        +_validate_approvers()
    }
    class ReviewRequirements {
        +int min_reviewers
        +tuple~NotBlankStr~ required_reviewer_roles
        +bool self_review_allowed
    }
    class WorkflowHandoff {
        +NotBlankStr from_department
        +NotBlankStr to_department
        +NotBlankStr trigger
        +tuple~NotBlankStr~ artifacts
        +_validate_different_departments()
    }
    class EscalationPath {
        +NotBlankStr from_department
        +NotBlankStr to_department
        +NotBlankStr condition
        +int priority_boost
        +_validate_different_departments()
    }
    Company "1" *-- "many" Department
    Company "1" *-- "many" WorkflowHandoff
    Company "1" *-- "many" EscalationPath
    Department "1" *-- "many" ReportingLine
    Department "1" *-- "1" DepartmentPolicies
    DepartmentPolicies "1" *-- "1" ReviewRequirements
    DepartmentPolicies "1" *-- "many" ApprovalChain
```

Last reviewed commit: 57f6d50
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/observability/test_events.py (1)
Lines 93-112: 🧹 Nitpick | 🔵 Trivial — Add exact-value coverage for the new personality events.
This only proves the module is discoverable. A rename inside ai_company.observability.events.personality would still pass, unlike the other public event domains covered in this file.

💡 Proposed fix
```diff
 from ai_company.observability.events.prompt import PROMPT_BUILD_START
+from ai_company.observability.events.personality import (
+    PERSONALITY_COMPATIBILITY_COMPUTED,
+    PERSONALITY_TEAM_SCORE_COMPUTED,
+)
 from ai_company.observability.events.provider import (
     PROVIDER_CALL_START,
     PROVIDER_REGISTRY_BUILT,
 )
@@
     def test_all_domain_modules_discovered(self) -> None:
         """Every expected domain module is found by pkgutil discovery."""
         expected = {
             "budget",
             "config",
@@
         discovered = {info.name for info in pkgutil.iter_modules(events.__path__)}
         assert discovered == expected
+
+    def test_personality_events_exist(self) -> None:
+        assert PERSONALITY_COMPATIBILITY_COMPUTED == (
+            "personality.compatibility.computed"
+        )
+        assert PERSONALITY_TEAM_SCORE_COMPUTED == (
+            "personality.team_score.computed"
+        )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/observability/test_events.py` around lines 93 - 112, The test test_all_domain_modules_discovered only checks the personality module is discoverable but not that its public event names match the expected API; import ai_company.observability.events.personality (events.personality) in this test and add an exact-value assertion that the module's exported event identifiers (use the module's __all__ if present, otherwise filter dir(events.personality) for public names) equal the expected set of personality event names so a rename would fail the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/core/company.py`:
- Around line 409-416: The workflow_handoffs and escalation_paths validators
currently only check from_department != to_department and can accept misspelled
or unknown department names; update the Company model validation to first build
a normalized set of declared department names (e.g., lowercased/trimmed) from
the Company.departments list, then validate each WorkflowHandoff and
EscalationPath entry (referencing workflow_handoffs and escalation_paths) by
normalizing their from_department and to_department and rejecting entries whose
endpoints are not present in the normalized department set; also apply the same
normalization when checking for same-department and dedupe logic so comparisons
are consistent and unknown/misspelled departments are rejected.
- Around line 286-297: The validator _validate_unique_subordinates currently
compares raw subordinate strings so names like "Alice" and " alice " slip
through; normalize subordinate names (e.g., s.strip().casefold()) before
building subs and before passing to len(set(...)) and Counter(...) to detect
duplicates. Update the subs list creation (reporting_lines -> subordinate) to
use normalized values for comparison and duplicate detection, while optionally
mapping back to the original names for the error message; keep the function name
_validate_unique_subordinates, use reporting_lines and subordinate to locate the
code, and reuse Counter(...) on the normalized list.
In `@src/ai_company/templates/renderer.py`:
- Around line 417-421: The try/except around PersonalityConfig(**personality) is
too broad; replace the bare Exception catch with the specific ValidationError
(the alias already imported) so only pydantic validation failures are handled.
Update the except clause to "except ValidationError as exc:" and keep raising
TemplateRenderError(msg) from exc, preserving the constructed msg that includes
agent_name and the original exception details.
---
Outside diff comments:
In `@tests/unit/observability/test_events.py`:
- Around line 93-112: The test test_all_domain_modules_discovered only checks
the personality module is discoverable but not that its public event names match
the expected API; import ai_company.observability.events.personality
(events.personality) in this test and add an exact-value assertion that the
module's exported event identifiers (use the module's __all__ if present,
otherwise filter dir(events.personality) for public names) equal the expected
set of personality event names so a rename would fail the test.
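The two normalization findings above (case-insensitive duplicate subordinates, unknown department endpoints) reduce to the same stdlib pattern. The function names in this sketch are illustrative, not the project's actual validators:

```python
from collections import Counter


def _norm(name: str) -> str:
    """Normalize a name for comparison: trim whitespace, casefold."""
    return name.strip().casefold()


def find_duplicate_subordinates(subordinates: list[str]) -> list[str]:
    """Return normalized names that appear more than once.

    Catches pairs like "Alice" and " alice " that raw string comparison
    would treat as distinct.
    """
    counts = Counter(_norm(s) for s in subordinates)
    return [name for name, n in counts.items() if n > 1]


def unknown_endpoints(
    departments: list[str],
    edges: list[tuple[str, str]],
) -> list[tuple[str, str]]:
    """Return (from, to) edges whose endpoints are not declared departments."""
    declared = {_norm(d) for d in departments}
    return [
        (frm, to)
        for frm, to in edges
        if _norm(frm) not in declared or _norm(to) not in declared
    ]
```

Applying `_norm` consistently on both the declared set and each edge keeps the membership, same-department, and dedupe checks in agreement, as the prompt asks.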
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: bbdcc797-2948-4a43-b1b4-880c6a0842f7
📒 Files selected for processing (25)
- CLAUDE.md
- DESIGN_SPEC.md
- src/ai_company/config/defaults.py
- src/ai_company/config/schema.py
- src/ai_company/core/__init__.py
- src/ai_company/core/agent.py
- src/ai_company/core/company.py
- src/ai_company/core/enums.py
- src/ai_company/core/personality.py
- src/ai_company/engine/prompt.py
- src/ai_company/engine/prompt_template.py
- src/ai_company/observability/events/personality.py
- src/ai_company/templates/presets.py
- src/ai_company/templates/renderer.py
- src/ai_company/templates/schema.py
- tests/unit/core/conftest.py
- tests/unit/core/test_agent.py
- tests/unit/core/test_company.py
- tests/unit/core/test_enums.py
- tests/unit/core/test_personality.py
- tests/unit/engine/test_prompt.py
- tests/unit/observability/test_events.py
- tests/unit/templates/test_presets.py
- tests/unit/templates/test_renderer.py
- tests/unit/templates/test_schema.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations` — Python 3.14 has PEP 649
Use `except A, B:` syntax (no parentheses) for exception handling — ruff enforces this on Python 3.14
Include type hints on all public functions; enforce with mypy strict mode
Use Google-style docstrings on all public classes and functions — enforced by ruff D rules
Use immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections, use `copy.deepcopy()` at construction + `MappingProxyType` wrapping. For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` and `copy.deepcopy()` at system boundaries
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `@computed_field` for derived values; use `NotBlankStr` from `core.types` for all identifier/name fields
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code; prefer structured concurrency over bare `create_task`
Maintain line length of 88 characters — enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate input at system boundaries: user input, external APIs, and config files
Files:
- src/ai_company/core/enums.py
- tests/unit/templates/test_renderer.py
- tests/unit/core/test_personality.py
- src/ai_company/observability/events/personality.py
- tests/unit/engine/test_prompt.py
- src/ai_company/templates/renderer.py
- tests/unit/core/conftest.py
- tests/unit/core/test_enums.py
- src/ai_company/config/schema.py
- src/ai_company/core/agent.py
- src/ai_company/engine/prompt.py
- tests/unit/templates/test_presets.py
- src/ai_company/config/defaults.py
- src/ai_company/core/company.py
- tests/unit/core/test_company.py
- src/ai_company/templates/presets.py
- tests/unit/observability/test_events.py
- src/ai_company/engine/prompt_template.py
- src/ai_company/core/personality.py
- tests/unit/templates/test_schema.py
- src/ai_company/templates/schema.py
- tests/unit/core/test_agent.py
- src/ai_company/core/__init__.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Never use `import logging` / `logging.getLogger()` / `print()` in application code — use `get_logger()` from the observability module
Use variable name `logger` (not `_logger`, not `log`) for logging instances
Always use event name constants from `ai_company.observability.events` domain-specific modules; import directly (e.g., `from ai_company.observability.events.<domain> import EVENT_CONSTANT`)
Use structured logging format: `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG level for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging
Files:
- src/ai_company/core/enums.py
- src/ai_company/observability/events/personality.py
- src/ai_company/templates/renderer.py
- src/ai_company/config/schema.py
- src/ai_company/core/agent.py
- src/ai_company/engine/prompt.py
- src/ai_company/config/defaults.py
- src/ai_company/core/company.py
- src/ai_company/templates/presets.py
- src/ai_company/engine/prompt_template.py
- src/ai_company/core/personality.py
- src/ai_company/templates/schema.py
- src/ai_company/core/__init__.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names like
`example-provider`, `example-large-001`, `large`/`medium`/`small` as aliases, or `test-provider`, `test-small-001` in tests
Files:
- src/ai_company/core/enums.py
- src/ai_company/observability/events/personality.py
- src/ai_company/templates/renderer.py
- src/ai_company/config/schema.py
- src/ai_company/core/agent.py
- src/ai_company/engine/prompt.py
- src/ai_company/config/defaults.py
- src/ai_company/core/company.py
- src/ai_company/templates/presets.py
- src/ai_company/engine/prompt_template.py
- src/ai_company/core/personality.py
- src/ai_company/templates/schema.py
- src/ai_company/core/__init__.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Maintain 80% minimum code coverage — enforced in CI
Use `asyncio_mode = "auto"` in pytest configuration — no manual `@pytest.mark.asyncio` needed
Set 30-second timeout per test
Use `pytest-xdist` via `-n auto` for parallel test execution
Prefer `@pytest.mark.parametrize` for testing similar cases
Files:
tests/unit/templates/test_renderer.py, tests/unit/core/test_personality.py, tests/unit/engine/test_prompt.py, tests/unit/core/conftest.py, tests/unit/core/test_enums.py, tests/unit/templates/test_presets.py, tests/unit/core/test_company.py, tests/unit/observability/test_events.py, tests/unit/templates/test_schema.py, tests/unit/core/test_agent.py
src/ai_company/{providers,engine}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`RetryExhaustedError` signals that all retries failed — the engine layer catches this to trigger fallback chains
Files:
src/ai_company/engine/prompt.pysrc/ai_company/engine/prompt_template.py
🧠 Learnings (2)
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` domain-specific modules; import directly (e.g., `from ai_company.observability.events.<domain> import EVENT_CONSTANT`)
Applied to files:
src/ai_company/observability/events/personality.py, tests/unit/observability/test_events.py, src/ai_company/core/__init__.py
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves
Applied to files:
src/ai_company/core/agent.py
🧬 Code graph analysis (13)
tests/unit/templates/test_renderer.py (4)
src/ai_company/templates/renderer.py (4)
_expand_single_agent (361-401), _build_departments (424-465), _build_config_dict (245-296), _validate_list (299-317)
src/ai_company/core/enums.py (1)
CompanyType (72-82)
src/ai_company/templates/schema.py (3)
CompanyTemplate (190-293), TemplateAgentConfig (77-116), TemplateMetadata (156-187)
src/ai_company/templates/errors.py (1)
TemplateRenderError (14-20)
tests/unit/core/test_personality.py (3)
src/ai_company/core/agent.py (1)
PersonalityConfig (26-139)
src/ai_company/core/enums.py (2)
CollaborationPreference (244-249), ConflictApproach (260-267)
src/ai_company/core/personality.py (2)
compute_compatibility (56-85), compute_team_compatibility (88-117)
tests/unit/engine/test_prompt.py (4)
src/ai_company/core/enums.py (5)
CollaborationPreference (244-249), CommunicationVerbosity (252-257), ConflictApproach (260-267), CreativityLevel (40-45), DecisionMakingStyle (235-241)
tests/unit/engine/conftest.py (1)
sample_agent_with_personality (52-79)
src/ai_company/core/agent.py (3)
AgentIdentity (263-321), ModelConfig (162-191), PersonalityConfig (26-139)
src/ai_company/engine/prompt.py (1)
build_system_prompt (143-209)
src/ai_company/templates/renderer.py (3)
src/ai_company/core/agent.py (1)
PersonalityConfig (26-139)
src/ai_company/templates/presets.py (1)
get_personality_preset (291-313)
src/ai_company/templates/errors.py (1)
TemplateRenderError (14-20)
tests/unit/core/conftest.py (1)
src/ai_company/core/company.py (7)
ApprovalChain (67-98), Department (237-297), DepartmentPolicies (101-128), EscalationPath (169-203), HRRegistry (333-368), ReportingLine (16-38), ReviewRequirements (41-64)
tests/unit/core/test_enums.py (1)
src/ai_company/core/enums.py (5)
ActionType (270-281), CollaborationPreference (244-249), CommunicationVerbosity (252-257), ConflictApproach (260-267), DecisionMakingStyle (235-241)
src/ai_company/config/schema.py (1)
src/ai_company/core/company.py (2)
EscalationPath (169-203), WorkflowHandoff (134-166)
src/ai_company/core/agent.py (1)
src/ai_company/core/enums.py (4)
CollaborationPreference (244-249), CommunicationVerbosity (252-257), ConflictApproach (260-267), DecisionMakingStyle (235-241)
tests/unit/templates/test_presets.py (2)
src/ai_company/core/agent.py (1)
PersonalityConfig (26-139)
src/ai_company/templates/presets.py (1)
get_personality_preset (291-313)
src/ai_company/templates/presets.py (1)
src/ai_company/observability/_logger.py (1)
get_logger (8-28)
src/ai_company/core/personality.py (3)
src/ai_company/core/enums.py (2)
CollaborationPreference (244-249), ConflictApproach (260-267)
src/ai_company/observability/_logger.py (1)
get_logger (8-28)
src/ai_company/core/agent.py (1)
PersonalityConfig (26-139)
tests/unit/templates/test_schema.py (2)
src/ai_company/templates/schema.py (3)
TemplateAgentConfig (77-116), TemplateDepartmentConfig (119-153), CompanyTemplate (190-293)
tests/unit/templates/conftest.py (1)
make_template_dict (123-125)
tests/unit/core/test_agent.py (2)
src/ai_company/core/enums.py (6)
CollaborationPreference (244-249), CommunicationVerbosity (252-257), ConflictApproach (260-267), CreativityLevel (40-45), DecisionMakingStyle (235-241), RiskTolerance (32-37)
src/ai_company/core/agent.py (1)
PersonalityConfig (26-139)
🔇 Additional comments (70)
src/ai_company/engine/prompt_template.py (1)
13-13: Good template contract update. The version bump is paired with the new personality fields, so downstream code can detect the prompt-shape change cleanly.
Also applies to: 89-95
src/ai_company/engine/prompt.py (1)
304-307: Nice consolidation. Wiring the new personality enum values through `_build_core_context()` and reusing `_build_metadata()` in the error path keeps the prompt contract centralized instead of duplicating field mappings.
Also applies to: 652-659
src/ai_company/observability/events/personality.py (1)
1-6: LGTM. The new domain module keeps personality events discoverable, and the constant values match the existing dot-separated naming scheme.
tests/unit/templates/test_presets.py (1)
44-67: LGTM! The new tests comprehensively validate the expanded personality presets:
- `test_preset_count_at_least_15` ensures the preset library meets minimum requirements.
- `test_all_presets_produce_valid_personality_config` validates schema compatibility.
- `test_presets_include_big_five` ensures all presets include the required Big Five dimensions.
Good use of the existing test class structure and consistent assertion messages.
tests/unit/templates/test_schema.py (3)
99-109: LGTM! Good coverage for the new `personality` field in `TemplateAgentConfig`:
- Verifies inline personality dict acceptance.
- Confirms the default `None` value when not provided.
148-168: LGTM! Thorough test coverage for the new `TemplateDepartmentConfig` fields:
- `reporting_lines` defaults to an empty tuple.
- `policies` defaults to `None`.
- Both fields accept valid input correctly.
353-382: LGTM! Good coverage for `CompanyTemplate` workflow fields:
- `workflow_handoffs` and `escalation_paths` default to empty tuples.
- `workflow_handoffs` accepts valid handoff dicts.
src/ai_company/core/enums.py (1)
233-281: LGTM! The new enums are well-designed:
- Consistent with the existing `StrEnum` patterns in this file.
- Docstrings follow Google-style conventions.
- `ActionType` correctly documents extensibility via `NotBlankStr`.
- All values use lowercase snake_case consistently.
src/ai_company/templates/schema.py (3)
90-112: LGTM! The `personality` field addition is well-integrated:
- Provides an inline alternative to `personality_preset`.
- Uses `dict[str, Any] | None` appropriately for raw config that will be validated during rendering.
- Docstring updated to reflect the new field.
122-153: LGTM! Good additions to `TemplateDepartmentConfig`:
- `reporting_lines` and `policies` fields enable richer department structure configuration.
- Default values (empty tuple, `None`) are sensible.
- Docstring accurately reflects the new attributes.
211-255: LGTM! The `workflow_handoffs` and `escalation_paths` fields complete the cross-department workflow support at the template level:
- Consistent with the existing pattern of using `tuple[dict[str, Any], ...]` for template-level raw config.
- Defaults to empty tuples appropriately.
- Docstring updated to document the new attributes.
tests/unit/core/test_enums.py (3)
6-16: LGTM! Imports for new enums are correctly added and alphabetically ordered.
84-97: LGTM! Member count tests correctly validate the expected number of members for each new enum. This helps catch accidental additions or removals.
168-235: LGTM! Excellent use of `@pytest.mark.parametrize` for testing enum string values, as recommended by coding guidelines. Tests cover all members of each new enum with explicit value assertions.
tests/unit/core/test_personality.py (1)
1-145: LGTM! Comprehensive test suite for personality compatibility scoring:
Pairwise tests cover:
- Identical profiles scoring high
- Opposing profiles scoring low
- Symmetry/commutativity
- Range bounds validation
- Constructive vs. destructive conflict approaches
- Same vs. opposite collaboration preferences
- Default profile compatibility
Team tests cover:
- Single-member and empty team edge cases
- Identical member teams
- Score range validation
- Verification that team scoring averages all pairwise combinations
Good use of appropriate floating-point comparison tolerance at line 145.
src/ai_company/config/schema.py (2)
10-15: LGTM! Correct import of the `EscalationPath` and `WorkflowHandoff` typed models from the core domain, enabling proper type checking and validation for the config schema.
379-380: LGTM! Good addition of cross-department workflow fields to `RootConfig`:
- Uses typed models (`WorkflowHandoff`, `EscalationPath`) rather than raw dicts, ensuring proper validation including same-department rejection.
- Default empty tuples are appropriate.
- Docstring updated to document the new fields.
Also applies to: 432-439
tests/unit/engine/test_prompt.py (3)
11-18: LGTM! Correct imports for the new personality dimension enums needed by the new tests.
230-265: LGTM! Comprehensive tests for the new personality dimensions in prompts:
- `test_new_personality_dimensions_in_prompt` validates that default dimension values appear in rendered prompts.
- `test_new_personality_dimensions_with_custom_values` verifies that explicitly set dimension values render correctly.
Both tests properly use the new enums (`CommunicationVerbosity`, `DecisionMakingStyle`, `CollaborationPreference`, `ConflictApproach`).
465-468: LGTM! An explicit version assertion ensures the prompt template version is correctly bumped to `"1.1.0"` as part of the personality dimensions feature.
tests/unit/core/test_agent.py (7)
19-27: LGTM! New enum imports are correctly added to support the extended `PersonalityConfig` tests.
100-107: LGTM! Test correctly verifies that all Big Five dimensions default to 0.5, aligning with the model definition in src/ai_company/core/agent.py.
109-124: LGTM! Good use of `@pytest.mark.parametrize` to test boundary conditions (0.0 and 1.0) for all Big Five dimensions.
126-154: LGTM! Proper validation tests for out-of-bounds values on Big Five dimensions.
156-175: LGTM! Tests for the extended behavioral enums correctly verify both default values and custom value assignment.
177-200: LGTM! NaN rejection tests and `max_length` validation tests provide essential coverage for the new field constraints.
202-212: LGTM! A backward compatibility test ensures that existing code constructing `PersonalityConfig` without the new fields continues to work with expected defaults.
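The NaN-rejection behavior those tests exercise can be illustrated with a plain-Python validator. This is a sketch of the idea only — the actual model implements it as a Pydantic `field_validator` with `mode="after"`, and the helper name here is invented:

```python
import math


def validate_big_five(name: str, value: float) -> float:
    """Reject NaN explicitly, then enforce the 0.0-1.0 range.

    NaN fails every ordered comparison, so a bare range check alone can
    behave surprisingly; an explicit math.isnan() guard makes the intent clear.
    """
    if math.isnan(value):
        raise ValueError(f"{name} must not be NaN")
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0.0 and 1.0")
    return value


validate_big_five("openness", 0.5)  # passes
# validate_big_five("openness", math.nan) would raise ValueError
```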
src/ai_company/templates/renderer.py (4)
41-45: LGTM! Good use of internal constants for default values, improving maintainability and eliminating magic strings.
277-296: LGTM! Clean handling of the optional `workflow_handoffs` and `escalation_paths` fields via the existing `_validate_list` helper.
383-401: LGTM! Inline personality now correctly takes precedence over `personality_preset`, with proper validation via `_validate_inline_personality`.
450-463: LGTM! Proper type validation for `reporting_lines` (list) and `policies` (dict) with clear error messages including the department name.
7-17: LGTM! New domain model imports are correctly added to support the expanded test coverage.
437-456: LGTM! Tests for Company with `workflow_handoffs` and `escalation_paths` properly verify the new cross-department coordination features.
462-486: LGTM! ReportingLine tests cover valid construction, self-report rejection (including a case-insensitive check), and immutability.
492-511: LGTM! ReviewRequirements tests verify defaults and custom value handling.
517-558: LGTM! ApprovalChain tests comprehensively cover validation rules including empty-approvers rejection, min_approvals bounds, and immutability.
564-582: LGTM! DepartmentPolicies tests verify defaults and unique action_type validation across approval chains.
588-619: LGTM! WorkflowHandoff tests cover valid construction, same-department rejection (including a case-insensitive check), and artifacts handling.
625-692: LGTM! EscalationPath tests comprehensively cover valid construction, priority_boost boundaries, same-department rejection, and case-insensitive validation.
698-738: LGTM! TestDepartmentExtended properly tests the new `reporting_lines` and `policies` fields on Department, including backward compatibility and duplicate subordinate rejection.
src/ai_company/templates/presets.py (5)
1-5: LGTM! Docstring correctly updated to reflect the expanded preset count and new dimensions.
8-14: LGTM! Proper imports for `MappingProxyType` (immutability) and the observability event constant.
19-262: LGTM! Well-structured personality presets with comprehensive Big Five dimensions and behavioral enums, wrapped in `MappingProxyType` for immutability as per coding guidelines.
265-288: LGTM! Auto-name pools wrapped in `MappingProxyType` for immutability.
307-312: LGTM! Logging the warning with structured event and context before raising `KeyError` improves observability for debugging preset lookup failures.
tests/unit/templates/test_renderer.py (4)
295-317: LGTM! Test correctly verifies that an inline personality dict takes precedence over the preset when both are provided.
320-384: LGTM! Comprehensive passthrough tests for `reporting_lines`, `policies`, and `workflow_handoffs` verify that these fields are correctly propagated through the rendering pipeline.
387-399: LGTM! Test verifies that an unknown `personality_preset` raises `TemplateRenderError` with an appropriate message.
402-418: LGTM! Error-handling tests for `_validate_list` verify proper exceptions for non-list and non-dict-item inputs.
src/ai_company/core/personality.py (6)
1-21: LGTM! Module structure follows coding guidelines with proper imports, logger setup, and the TYPE_CHECKING pattern for PersonalityConfig.
23-53: LGTM! Weight constants are well-documented and sum correctly (Big Five weights: 1.0, component weights: 1.0). Immutable data structures (`MappingProxyType`, `frozenset`) are properly used.
56-85: LGTM! `compute_compatibility` correctly combines weighted component scores with proper clamping and structured debug logging.
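A hedged sketch of what such a weighted composite might look like. The component names and weights below are illustrative stand-ins, not the module's actual constants:

```python
# Illustrative weights summing to 1.0; the real module defines its own values.
COMPONENT_WEIGHTS = {"big_five": 0.6, "collaboration": 0.2, "conflict": 0.2}


def weighted_composite(components: dict[str, float]) -> float:
    """Combine per-component scores in [0, 1] using fixed weights, then clamp.

    Clamping guards against floating-point drift pushing the sum just
    outside the [0.0, 1.0] contract.
    """
    total = sum(COMPONENT_WEIGHTS[name] * score for name, score in components.items())
    return min(1.0, max(0.0, total))
```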
88-117: LGTM! `compute_team_compatibility` correctly handles edge cases (0 or 1 members return 1.0) and uses `itertools.combinations` for pairwise scoring as intended by the PR refactoring objectives.
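The edge-case handling and pairwise averaging described here might look roughly like this (the `toy` scoring function is a stand-in for the real pairwise scorer):

```python
from itertools import combinations
from statistics import fmean


def team_compatibility(members: list[str], score) -> float:
    """Average pairwise compatibility; 0- or 1-member teams are trivially 1.0."""
    if len(members) < 2:
        return 1.0
    # combinations() yields each unordered pair exactly once, replacing
    # the nested range(len()) loops the refactor removed.
    return fmean(score(a, b) for a, b in combinations(members, 2))


def toy(a: str, b: str) -> float:
    """Toy scorer: identical names are fully compatible, otherwise 0.5."""
    return 1.0 if a == b else 0.5
```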
123-150: LGTM! The Big Five scoring logic is sound. The extraversion tent function correctly peaks at a 0.3 difference, reflecting that moderate extraversion differences are ideal for balanced teams.
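One plausible shape for such a tent function, assuming the score peaks when the extraversion gap equals 0.3 and falls off linearly on either side. This is a sketch under that assumption, not the module's exact formula:

```python
def extraversion_tent(a: float, b: float, peak: float = 0.3) -> float:
    """Score 1.0 when |a - b| == peak, decreasing linearly toward both extremes."""
    diff = abs(a - b)
    if diff <= peak:
        # Rising edge: identical extraversion scores 0.0, the peak gap scores 1.0.
        return diff / peak
    # Falling edge: beyond the peak, compatibility decays toward 0.0 at gap 1.0.
    return max(0.0, 1.0 - (diff - peak) / (1.0 - peak))
```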
153-175: LGTM! The collaboration and conflict scoring helpers implement reasonable compatibility heuristics with clear adjacency- and pair-based logic.
src/ai_company/core/agent.py (7)
3-8: LGTM! Proper imports for `math` (NaN checking) and `field_validator` (Big Five validation).
12-16: LGTM! New behavioral enum imports correctly added from `ai_company.core.enums`.
26-48: LGTM! Docstring comprehensively updated to document all new Big Five dimensions and behavioral enum attributes.
56-73: LGTM! `max_length` constraints added for `communication_style` (100) and `description` (500) to prevent unbounded string fields.
75-105: LGTM! Big Five dimensions properly defined with `ge=0.0`, `le=1.0` constraints and sensible defaults of 0.5. Based on learnings: frozen Pydantic models are used for config/identity.
107-123: LGTM! Behavioral enum fields added with appropriate defaults that represent neutral/collaborative team values.
125-139: LGTM! The NaN rejection validator correctly uses `mode="after"` to run after Pydantic's built-in `ge`/`le` validation, ensuring NaN values are explicitly rejected for all Big Five dimensions.
tests/unit/core/conftest.py (4)
19-29: LGTM! New domain model imports correctly added to support the new factory definitions.
104-122: LGTM! Factories for `ReportingLine`, `ReviewRequirements`, `ApprovalChain`, and `DepartmentPolicies` are correctly defined with defaults that satisfy model validators (e.g., different subordinate/supervisor names, non-empty approvers).
125-128: LGTM! `DepartmentFactory` correctly extended with a `policies` field wired to `DepartmentPoliciesFactory`.
139-148: LGTM! `WorkflowHandoffFactory` and `EscalationPathFactory` correctly use different department names (`engineering`→`qa`, `engineering`→`executive`) to satisfy the same-department validation rules.
DESIGN_SPEC.md (4)
156-164: Clear separation between internal scoring traits and prompt-facing behaviors. Good clarification. The two-tier model is explicit, and the YAML example stays consistent with it.
Also applies to: 185-195
2181-2181: Preset rename improves the template example. `strategic_planner` fits the richer personality model better and keeps the startup template aligned with the updated preset catalog.
2321-2322: The new personality surface is easier to discover now. Adding these modules to the project tree makes the public personality and observability entry points much easier to find from the spec.
Also applies to: 2364-2364, 2457-2457
2509-2509: The scoring convention is specific enough to implement and test. Capturing the weighting model and team-averaging rule here removes ambiguity for future implementations.
…iewers Review findings from 8 local agents (code-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, docs-consistency, issue-resolution-verifier) and 3 external reviewers (Gemini, Copilot, Greptile). Key changes: - Add allow_inf_nan=False to PersonalityConfig for numeric consistency - Extract shared _reject_same_department() to DRY duplicate validation - Replace list materialization with generator in compatibility scoring - Fix dict mutation in build_error_prompt metadata construction - Add startup validation loop for personality presets - Add role validation and explicit None checks in template renderer - Narrow exception catches from Exception to (ValidationError, TypeError) - Add logging before raises in structural validation paths - Add mutual exclusion validator for personality_preset vs inline personality - Fix ActionType docstring accuracy - Add comprehensive test coverage for new validation paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…date dept refs - Normalize subordinate names with strip().casefold() in duplicate detection - Validate workflow_handoff and escalation_path department references against declared departments, rejecting unknown department names - Normalize department name comparison for case-insensitive dedup - Update CompanyFactory to set empty handoffs/escalations (matches empty depts) - Add tests for case-insensitive subordinate dedup, unknown dept rejection Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/ai_company/core/company.py (2)
288-299: ⚠️ Potential issue | 🟠 Major — Normalize subordinate names before duplicate detection.
This validator compares raw strings while `ReportingLine._validate_not_self_report` uses `strip().casefold()`. Names like `"Alice"` and `" alice "` bypass the duplicate guard, potentially allowing the same subordinate to have multiple supervisors.
💡 Proposed fix

```diff
 @model_validator(mode="after")
 def _validate_unique_subordinates(self) -> Self:
     """Ensure no duplicate subordinates in reporting lines."""
-    subs = [r.subordinate for r in self.reporting_lines]
+    subs = [r.subordinate.strip().casefold() for r in self.reporting_lines]
     if len(subs) != len(set(subs)):
         dupes = sorted(s for s, c in Counter(subs).items() if c > 1)
         msg = (
             f"Duplicate subordinates in reporting lines "
             f"for department {self.name!r}: {dupes}"
         )
         raise ValueError(msg)
     return self
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/core/company.py` around lines 288 - 299, The duplicate detection in _validate_unique_subordinates is comparing raw subordinate strings but ReportingLine._validate_not_self_report normalizes names with strip().casefold(); update _validate_unique_subordinates to normalize each subordinate before checking duplicates (e.g., build subs = [r.subordinate.strip().casefold() for r in self.reporting_lines]) so "Alice" and " alice " are treated the same, then compute dupes from the normalized list and include the original department name (self.name) in the raised ValueError as before.
420-438: ⚠️ Potential issue | 🟠 Major — Validate handoff/escalation endpoints against known departments.
The `_validate_departments` validator ensures unique department names and budget limits, but does not verify that `workflow_handoffs` and `escalation_paths` reference departments that actually exist. A typo like `"enginering"` would create dangling routes without raising an error.
💡 Proposed fix

```diff
 @model_validator(mode="after")
 def _validate_departments(self) -> Self:
     """Validate department names are unique and budgets do not exceed 100%."""
     # Unique department names
-    names = [d.name for d in self.departments]
+    names = [d.name.strip().casefold() for d in self.departments]
     if len(names) != len(set(names)):
         dupes = sorted(n for n, c in Counter(names).items() if c > 1)
         msg = f"Duplicate department names: {dupes}"
         raise ValueError(msg)
+    known_departments = set(names)
+    for handoff in self.workflow_handoffs:
+        from_dept = handoff.from_department.strip().casefold()
+        to_dept = handoff.to_department.strip().casefold()
+        if from_dept not in known_departments or to_dept not in known_departments:
+            msg = (
+                f"Workflow handoff references unknown department: "
+                f"{handoff.from_department!r} -> {handoff.to_department!r}"
+            )
+            raise ValueError(msg)
+    for escalation in self.escalation_paths:
+        from_dept = escalation.from_department.strip().casefold()
+        to_dept = escalation.to_department.strip().casefold()
+        if from_dept not in known_departments or to_dept not in known_departments:
+            msg = (
+                f"Escalation path references unknown department: "
+                f"{escalation.from_department!r} -> {escalation.to_department!r}"
+            )
+            raise ValueError(msg)
+    # Budget sum
     max_budget_percent = 100.0
```
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/core/company.py` around lines 420 - 438, The _validate_departments model_validator currently checks unique names and budget totals but does not ensure that any department names referenced by workflow_handoffs or escalation_paths actually exist; to fix, after computing names = [d.name for d in self.departments] add validation that every department name used in self.workflow_handoffs (both source and target/step entries) and every name in self.escalation_paths is present in the names set, collecting any unknown names and raising ValueError listing them (include a clear message like "Unknown department references in workflow_handoffs/escalation_paths: [...]"); perform this check inside _validate_departments so typos like "enginering" cause validation failures.
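The unknown-department check described in that prompt can be sketched as a small helper. The function name and signature are illustrative; in the PR the real check lives inside the `_validate_departments` model validator:

```python
def unknown_department_refs(declared: list[str], referenced: list[str]) -> list[str]:
    """Return referenced names with no case-insensitive match among declared departments.

    Normalizing with strip().casefold() keeps this check consistent with the
    same-department validators elsewhere in the model.
    """
    known = {d.strip().casefold() for d in declared}
    return sorted({r for r in referenced if r.strip().casefold() not in known})


# A typo like "enginering" surfaces immediately instead of creating a dangling route.
unknown_department_refs(["Engineering", "QA"], ["engineering", "enginering"])
```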
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/templates/renderer.py`:
- Around line 466-493: Add logging calls before raising TemplateRenderError for
the two type checks: when reporting_lines is not a list and when policies is not
a dict. Use the existing logger and TEMPLATE_RENDER_VARIABLE_ERROR (or
logger.error/warning as appropriate) and include context fields like
department=dept_name, field="reporting_lines" or field="policies", and a short
detail describing the expected type (e.g., "must be a list" / "must be a
mapping") so the log contains the same context as the raised
TemplateRenderError.
In `@tests/unit/templates/test_schema.py`:
- Around line 361-391: Add a missing unit test that mirrors
test_workflow_handoffs_accepted for escalation_paths: create a new test function
(e.g., test_escalation_paths_accepted) that uses make_template_dict to construct
a CompanyTemplate with a non-empty escalation_paths tuple and assert its length
is 1; reference the existing CompanyTemplate, make_template_dict, and
escalation_paths symbols so the test mirrors test_workflow_handoffs_accepted and
complements test_escalation_paths_default_empty.
---
Duplicate comments:
In `@src/ai_company/core/company.py`:
- Around line 288-299: The duplicate detection in _validate_unique_subordinates
is comparing raw subordinate strings but ReportingLine._validate_not_self_report
normalizes names with strip().casefold(); update _validate_unique_subordinates
to normalize each subordinate before checking duplicates (e.g., build subs =
[r.subordinate.strip().casefold() for r in self.reporting_lines]) so "Alice" and
" alice " are treated the same, then compute dupes from the normalized list and
include the original department name (self.name) in the raised ValueError as
before.
- Around line 420-438: The _validate_departments model_validator currently
checks unique names and budget totals but does not ensure that any department
names referenced by workflow_handoffs or escalation_paths actually exist; to
fix, after computing names = [d.name for d in self.departments] add validation
that every department name used in self.workflow_handoffs (both source and
target/step entries) and every name in self.escalation_paths is present in the
names set, collecting any unknown names and raising ValueError listing them
(include a clear message like "Unknown department references in
workflow_handoffs/escalation_paths: [...]"); perform this check inside
_validate_departments so typos like "enginering" cause validation failures.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b4a39e1e-f914-4b0a-9ede-4f236e9a245a
📒 Files selected for processing (12)
src/ai_company/core/agent.py, src/ai_company/core/company.py, src/ai_company/core/enums.py, src/ai_company/core/personality.py, src/ai_company/engine/prompt.py, src/ai_company/templates/presets.py, src/ai_company/templates/renderer.py, src/ai_company/templates/schema.py, tests/unit/config/test_schema.py, tests/unit/engine/test_prompt.py, tests/unit/templates/test_renderer.py, tests/unit/templates/test_schema.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations` — Python 3.14 has PEP 649
Use `except A, B:` syntax (no parentheses) for exception handling — ruff enforces this on Python 3.14
Include type hints on all public functions; enforce with mypy strict mode
Use Google-style docstrings on all public classes and functions — enforced by ruff D rules
Use immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections, use `copy.deepcopy()` at construction + `MappingProxyType` wrapping. For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` and `copy.deepcopy()` at system boundaries
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `@computed_field` for derived values; use `NotBlankStr` from `core.types` for all identifier/name fields
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code; prefer structured concurrency over bare `create_task`
Maintain line length of 88 characters — enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate input at system boundaries: user input, external APIs, and config files
Files:
src/ai_company/engine/prompt.py, src/ai_company/core/company.py, tests/unit/engine/test_prompt.py, src/ai_company/core/personality.py, src/ai_company/templates/presets.py, tests/unit/templates/test_renderer.py, src/ai_company/templates/schema.py, src/ai_company/core/agent.py, tests/unit/templates/test_schema.py, tests/unit/config/test_schema.py, src/ai_company/templates/renderer.py, src/ai_company/core/enums.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Never use `import logging` / `logging.getLogger()` / `print()` in application code — use `get_logger()` from the observability module
Use the variable name `logger` (not `_logger`, not `log`) for logging instances
Always use event name constants from `ai_company.observability.events` domain-specific modules; import directly (e.g., `from ai_company.observability.events.<domain> import EVENT_CONSTANT`)
Use structured logging format: `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG level for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging
Files:
src/ai_company/engine/prompt.py, src/ai_company/core/company.py, src/ai_company/core/personality.py, src/ai_company/templates/presets.py, src/ai_company/templates/schema.py, src/ai_company/core/agent.py, src/ai_company/templates/renderer.py, src/ai_company/core/enums.py
src/ai_company/{providers,engine}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`RetryExhaustedError` signals that all retries failed — the engine layer catches this to trigger fallback chains
Files:
src/ai_company/engine/prompt.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names like `example-provider`, `example-large-001`, or `large`/`medium`/`small` as aliases, or `test-provider`, `test-small-001` in tests
Files:
src/ai_company/engine/prompt.pysrc/ai_company/core/company.pysrc/ai_company/core/personality.pysrc/ai_company/templates/presets.pysrc/ai_company/templates/schema.pysrc/ai_company/core/agent.pysrc/ai_company/templates/renderer.pysrc/ai_company/core/enums.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Maintain 80% minimum code coverage — enforced in CI
Use `asyncio_mode = "auto"` in pytest configuration — no manual `@pytest.mark.asyncio` needed
Set 30-second timeout per test
Use `pytest-xdist` via `-n auto` for parallel test execution
Prefer `@pytest.mark.parametrize` for testing similar cases
Files:
tests/unit/engine/test_prompt.py, tests/unit/templates/test_renderer.py, tests/unit/templates/test_schema.py, tests/unit/config/test_schema.py
🧠 Learnings (4)
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves
Applied to files:
src/ai_company/core/agent.py
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to **/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values; use `NotBlankStr` from `core.types` for all identifier/name fields
Applied to files:
src/ai_company/core/agent.py
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions
Applied to files:
src/ai_company/templates/renderer.py
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to **/*.py : Validate input at system boundaries: user input, external APIs, and config files
Applied to files:
src/ai_company/templates/renderer.py
🧬 Code graph analysis (8)
tests/unit/engine/test_prompt.py (5)
src/ai_company/core/enums.py (7)
CollaborationPreference (244-249), CommunicationVerbosity (252-257), ConflictApproach (260-267), CreativityLevel (40-45), DecisionMakingStyle (235-241), RiskTolerance (32-37), SeniorityLevel (6-21)
src/ai_company/engine/errors.py (1)
PromptBuildError (8-9)
src/ai_company/engine/prompt.py (3)
DefaultTokenEstimator (107-123), SystemPrompt (56-82), build_error_prompt (646-673)
tests/unit/engine/conftest.py (1)
sample_agent_with_personality (52-79)
src/ai_company/core/agent.py (3)
AgentIdentity (263-321), ModelConfig (162-191), PersonalityConfig (26-139)
src/ai_company/core/personality.py (3)
src/ai_company/core/enums.py (2)
CollaborationPreference (244-249), ConflictApproach (260-267)
src/ai_company/observability/_logger.py (1)
get_logger (8-28)
src/ai_company/core/agent.py (1)
PersonalityConfig (26-139)
src/ai_company/templates/presets.py (2)
src/ai_company/core/agent.py (1)
PersonalityConfig (26-139)
src/ai_company/observability/_logger.py (1)
get_logger (8-28)
tests/unit/templates/test_renderer.py (4)
src/ai_company/templates/renderer.py (4)
_expand_single_agent (364-417), _build_departments (445-494), _build_config_dict (247-299), _validate_list (302-320)
src/ai_company/core/enums.py (1)
CompanyType (72-82)
src/ai_company/templates/schema.py (3)
CompanyTemplate (201-304), TemplateAgentConfig (77-127), TemplateMetadata (167-198)
src/ai_company/templates/errors.py (1)
TemplateRenderError (14-20)
src/ai_company/core/agent.py (1)
src/ai_company/core/enums.py (5)
CollaborationPreference (244-249), CommunicationVerbosity (252-257), ConflictApproach (260-267), CreativityLevel (40-45), DecisionMakingStyle (235-241)
tests/unit/templates/test_schema.py (3)
src/ai_company/templates/schema.py (3)
TemplateAgentConfig (77-127), TemplateDepartmentConfig (130-164), CompanyTemplate (201-304)
tests/unit/templates/conftest.py (1)
make_template_dict (123-125)
tests/unit/config/test_schema.py (1)
test_workflow_handoffs_accepted (369-380)
tests/unit/config/test_schema.py (2)
tests/unit/templates/test_schema.py (1)
test_workflow_handoffs_accepted (375-390)
src/ai_company/config/schema.py (1)
RootConfig (360-499)
src/ai_company/templates/renderer.py (3)
src/ai_company/core/agent.py (1)
PersonalityConfig (26-139)
src/ai_company/templates/errors.py (1)
TemplateRenderError (14-20)
src/ai_company/templates/presets.py (1)
get_personality_preset (292-314)
🔇 Additional comments (41)
src/ai_company/engine/prompt.py (5)
48-50: LGTM! Thread-safety documentation is clear and appropriately warns against post-initialization modifications.
160-175: LGTM! Well-structured Args/Returns documentation following Google-style docstrings.
318-321: LGTM! New personality dimensions (verbosity, decision_making, collaboration, conflict_approach) are correctly injected into the core template context, aligning with the PersonalityConfig model fields.
666-672: LGTM! The metadata merge correctly combines identity metadata with the provided `agent_id`, ensuring the explicit `agent_id` parameter takes precedence over the identity's ID.
204-212: Exception syntax is correct for Python 3.14. The `except MemoryError, RecursionError:` syntax without parentheses is valid per PEP 758, which allows optional parentheses for multiple exception types when no `as` clause is present. This code correctly catches both exception types.
src/ai_company/templates/renderer.py (5)
41-46: LGTM! Extracting default constants improves maintainability and consistency across the codebase.
228-244: LGTM! Error logging added before raising exceptions for missing/invalid template keys, aligning with the coding guideline to log at WARNING or ERROR with context before raising.
280-299: LGTM! Clean extraction of workflow_handoffs and escalation_paths with proper validation via `_validate_list`. The conditional inclusion pattern correctly handles None and missing keys.
369-417: LGTM! Solid implementation:
- Required role validation with clear error message
- Inline personality type validation before construction
- Proper error handling with `_validate_inline_personality`
- Uses `_DEFAULT_PROVIDER` and `_DEFAULT_DEPARTMENT` constants
420-443: LGTM! Exception handling is appropriately narrowed to `(ValidationError, TypeError)`, addressing the previous review comment. Logging at WARNING level with context before raising aligns with coding guidelines. Based on learnings: "Handle errors explicitly, never silently swallow exceptions."
tests/unit/templates/test_schema.py (2)
99-126: LGTM! Comprehensive test coverage for inline personality support:
- Tests inline personality dict with valid fields
- Verifies None default when not provided
- Tests mutual exclusion of `personality` and `personality_preset`
156-177: LGTM! Good coverage of TemplateDepartmentConfig's new fields with tests for default values and explicit value acceptance.
src/ai_company/core/enums.py (1)
233-282: LGTM! Well-structured enum additions:
- Clear docstrings explaining purpose
- Consistent lowercase string values matching existing conventions
- ActionType helpfully notes values are optional constants, not required types
tests/unit/config/test_schema.py (1)
369-394: LGTM! Good test coverage for the new cross-department governance fields. The `# type: ignore[arg-type]` comments are appropriate since the tests pass raw dicts that Pydantic coerces to the typed models.
11-26: LGTM! Import additions align with the new personality enums and the `build_error_prompt` function being tested.
231-267: LGTM! Comprehensive tests for new personality dimensions:
- Verifies default values appear in prompts
- Verifies custom enum values render correctly
466-470: LGTM! Version assertion updated to match the new template version.
815-850: LGTM! Well-structured tests for `build_error_prompt`:
- Tests pass-through when existing prompt is provided
- Tests placeholder creation with correct metadata when no prompt exists
- Verifies the `agent_id` parameter is used in metadata
src/ai_company/templates/presets.py (5)
1-16: LGTM! Clean module setup:
- Updated docstring reflecting Big Five additions
- MappingProxyType import for immutability
- PersonalityConfig import for startup validation
- Observability event constant import
20-263: LGTM! Comprehensive personality presets with:
- Full Big Five dimensions (openness, conscientiousness, extraversion, agreeableness, stress_response)
- Behavioral enums (decision_making, collaboration, verbosity, conflict_approach)
- MappingProxyType wrapping enforcing immutability
- Well-differentiated preset profiles
266-289: LGTM! Auto-name pools wrapped with MappingProxyType for immutability, with comprehensive role-to-name mappings using gender-neutral names.
304-314: LGTM! Error logging with structured event constant before raising KeyError aligns with coding guidelines for error paths.
317-321: LGTM! Startup validation ensures all presets are valid at import time, catching key typos or invalid values immediately. The cleanup via `del` prevents polluting the module namespace with loop variables.
src/ai_company/core/personality.py (6)
1-21: LGTM! Well-structured module setup:
- Clear docstring explaining purpose
- TYPE_CHECKING import avoids runtime circular dependency
- Proper observability setup following coding guidelines
23-54: LGTM! Weight configuration is well-documented and mathematically sound:
- Component weights sum to 1.0 (0.6 + 0.2 + 0.2)
- Big Five dimension weights sum to 1.0
- Immutable collaboration order mapping
- Clear constructive/destructive conflict pairing definitions
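The weighted composite the comment describes can be sketched as follows. The three weights are the ones quoted in the review (0.6 + 0.2 + 0.2); the function name and signature are illustrative, not the PR's actual API:

```python
# Component weights quoted in the review comment: they sum to 1.0.
BIG_FIVE_WEIGHT = 0.6
COLLABORATION_WEIGHT = 0.2
CONFLICT_WEIGHT = 0.2


def composite_score(big_five: float, collaboration: float, conflict: float) -> float:
    """Combine the three component scores and clamp to [0.0, 1.0] for safety."""
    raw = (
        BIG_FIVE_WEIGHT * big_five
        + COLLABORATION_WEIGHT * collaboration
        + CONFLICT_WEIGHT * conflict
    )
    return max(0.0, min(1.0, raw))
```

Because the weights sum to 1.0, a composite of three in-range component scores is already in range; the clamp is a belt-and-suspenders guard matching the "clamps result to valid range" note below.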
56-86: LGTM! Clean implementation of pairwise compatibility:
- Combines three scoring components with appropriate weights
- Clamps result to valid range for safety
- Logs detailed breakdown at DEBUG level for diagnostics
88-121: LGTM! Team compatibility correctly uses `itertools.combinations` per PR objectives (replacing `range(len())` nested loops). Edge case for teams with ≤1 members returns 1.0 with proper logging.
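A minimal sketch of that shape, taking a caller-supplied pairwise scorer (the names and signature here are illustrative, not the PR's actual ones):

```python
from itertools import combinations
from statistics import mean
from typing import Callable, Sequence


def team_compatibility(
    members: Sequence[str],
    pairwise: Callable[[str, str], float],
) -> float:
    """Average pairwise compatibility over all unique member pairs."""
    if len(members) <= 1:
        # Degenerate team: nothing to compare, treat as perfectly compatible.
        return 1.0
    return mean(pairwise(a, b) for a, b in combinations(members, 2))
```

`combinations(members, 2)` visits each unordered pair exactly once, which is why it is the idiomatic replacement for nested `range(len())` loops.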
126-153: LGTM! Big Five scoring implements nuanced logic:
- Similarity scoring for openness, conscientiousness, agreeableness, stress_response
- Complementarity scoring for extraversion (optimal difference of 0.3)
- Tent function correctly peaks at optimal_diff
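One plausible shape for such a tent function — only the 0.3 peak comes from the review; the linear slopes are an assumption:

```python
def tent_score(diff: float, optimal_diff: float = 0.3) -> float:
    """Piecewise-linear score that peaks at the optimal difference.

    Rises from 0.0 at diff=0 to 1.0 at diff=optimal_diff, then falls
    back toward 0.0 at the maximum possible difference of 1.0.
    """
    if diff <= optimal_diff:
        return diff / optimal_diff
    return max(0.0, 1.0 - (diff - optimal_diff) / (1.0 - optimal_diff))
```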
156-178: LGTM! Collaboration and conflict scoring logic is clear:
- Adjacency-based collaboration scoring (same=1.0, adjacent=0.5, distant=0.2)
- Conflict scoring prioritizes constructive approaches and penalizes destructive pairs
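The adjacency scoring above reduces to a distance lookup along a one-dimensional ordering. The preference names below are hypothetical placeholders; the real members live in the PR's `core.enums`:

```python
# Hypothetical one-dimensional ordering of collaboration preferences.
COLLABORATION_ORDER = {"independent": 0, "flexible": 1, "highly_collaborative": 2}


def collaboration_score(a: str, b: str) -> float:
    """Same preference scores 1.0, adjacent 0.5, anything more distant 0.2."""
    distance = abs(COLLABORATION_ORDER[a] - COLLABORATION_ORDER[b])
    return {0: 1.0, 1: 0.5}.get(distance, 0.2)
```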
src/ai_company/templates/schema.py (3)
109-127: LGTM! Mutual exclusion validator correctly implemented. The new `personality` field and the `_validate_personality_mutual_exclusion` validator properly enforce that only one of `personality_preset` or `personality` can be specified. This aligns with the inline personality support added across the template system.
157-164: LGTM! Department fields properly extended. The new `reporting_lines` and `policies` fields appropriately use loose typing (`dict[str, str]` and `dict[str, Any]`) at the template schema level, with validation deferred to render time when these are converted to the stricter core domain models.
259-266: LGTM! Cross-department workflow fields properly added. The `workflow_handoffs` and `escalation_paths` fields use immutable tuples with flexible dict typing, which is appropriate for template definitions. Downstream validation occurs when the renderer constructs the stricter `WorkflowHandoff` and `EscalationPath` domain models.
50-50: LGTM! `allow_inf_nan=False` properly configured. Setting `allow_inf_nan=False` in the model config ensures Pydantic rejects infinite values for all float fields, which is appropriate for personality scoring dimensions.
75-123: LGTM! Big Five dimensions and behavioral enums well-structured. The Big Five (OCEAN) dimensions are properly constrained to `[0.0, 1.0]` with sensible defaults of `0.5`. The behavioral enums provide meaningful defaults that align with collaborative agent behavior. The docstring clearly distinguishes between internal scoring dimensions and prompt-injected labels.
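The PR enforces these constraints through Pydantic; as a stdlib-only sketch of the same checks (bounded [0.0, 1.0] dimensions with 0.5 defaults plus NaN rejection), a frozen dataclass can mimic the behavior:

```python
import math
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BigFiveSketch:
    """Illustrative stand-in for the PR's Pydantic-backed dimensions."""

    openness: float = 0.5
    conscientiousness: float = 0.5
    extraversion: float = 0.5
    agreeableness: float = 0.5
    stress_response: float = 0.5

    def __post_init__(self) -> None:
        for field in fields(self):
            value = getattr(self, field.name)
            if math.isnan(value):
                raise ValueError(f"NaN is not allowed for {field.name!r}")
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{field.name!r} must be in [0.0, 1.0], got {value}")
```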
125-139: LGTM! Explicit NaN rejection validator. While `allow_inf_nan=False` already rejects NaN at the Pydantic level, this explicit validator provides a clearer, domain-specific error message ("NaN is not allowed for Big Five dimensions") which improves debuggability. Good defense-in-depth approach.
src/ai_company/core/company.py (3)
16-38: LGTM! ReportingLine model correctly validates self-reporting. The `_validate_not_self_report` validator properly uses case-insensitive comparison with `strip().casefold()` to prevent agents from reporting to themselves, handling variations like `"Alice"` vs `" alice "`.
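The normalization that validator relies on reduces to a single comparison (the helper name is illustrative):

```python
def is_self_report(subordinate: str, supervisor: str) -> bool:
    """True when both names refer to the same person, ignoring case and padding."""
    return subordinate.strip().casefold() == supervisor.strip().casefold()
```

`casefold()` is preferred over `lower()` for this kind of check because it also normalizes characters like the German ß.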
67-128: LGTM! Approval chain validation is robust. The `ApprovalChain` model correctly validates that approvers is non-empty and that `min_approvals` doesn't exceed the number of approvers. The `DepartmentPolicies` model properly enforces unique `action_type` values across approval chains using a Counter-based duplicate check.
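The Counter-based duplicate check mentioned here boils down to a short helper (the input shape is illustrative — the real model iterates over chain objects):

```python
from collections import Counter


def duplicate_action_types(action_types: list[str]) -> list[str]:
    """Return the action types that appear more than once, sorted."""
    counts = Counter(action_types)
    return sorted(t for t, n in counts.items() if n > 1)
```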
134-205: LGTM! Cross-department models with shared validation helper. The extracted `_reject_same_department` helper provides consistent case-insensitive validation for both `WorkflowHandoff` and `EscalationPath`, eliminating code duplication. The `priority_boost` constraint (0-3) on `EscalationPath` is appropriately bounded.
tests/unit/templates/test_renderer.py (3)
300-316: LGTM! Inline personality test properly validates passthrough. The test correctly verifies that inline personality dictionaries are passed through to the expanded agent config, checking both `communication_style` and `traits` fields.
318-382: LGTM! Department passthrough tests provide good coverage. The tests properly verify that `reporting_lines`, `policies`, and `workflow_handoffs` are passed through from rendered data to the output dicts. Using internal functions (`_build_departments`, `_build_config_dict`) for unit testing these specific paths is appropriate.
385-508: LGTM! Comprehensive error case coverage. The error tests properly verify that:
- Invalid inline personality raises `TemplateRenderError`
- Non-dict personality values are rejected
- Missing `role` field raises an error
- Type validation for `reporting_lines` (list) and `policies` (dict)
- Unknown personality presets are rejected
- `_validate_list` rejects non-list values and non-dict items
All tests use appropriate `match` patterns to verify specific error messages.
 dept_name = dept.get("name", "")
 head_role = dept.get("head_role")
 if not head_role:
     logger.warning(
         TEMPLATE_RENDER_VARIABLE_ERROR,
         department=dept_name,
         field="head_role",
         detail="No head_role specified; using department name as placeholder",
     )
     head_role = dept_name or ""
 dept_dict: dict[str, Any] = {
-    "name": dept.get("name", ""),
-    "head": dept.get("head_role", dept.get("name", "")),
+    "name": dept_name,
+    "head": head_role,
     "budget_percent": budget_pct,
 }
 reporting_lines = dept.get("reporting_lines")
 if reporting_lines is not None:
     if not isinstance(reporting_lines, list):
         msg = f"Department {dept_name!r} 'reporting_lines' must be a list"
         raise TemplateRenderError(msg)
     dept_dict["reporting_lines"] = reporting_lines
 policies = dept.get("policies")
 if policies is not None:
     if not isinstance(policies, dict):
         msg = f"Department {dept_name!r} 'policies' must be a mapping"
         raise TemplateRenderError(msg)
     dept_dict["policies"] = policies
 departments.append(dept_dict)
🧹 Nitpick | 🔵 Trivial
Consider logging before raising for type validation errors.
For consistency with coding guidelines requiring "all error paths must log at WARNING or ERROR with context before raising," consider adding logging before raising TemplateRenderError for reporting_lines and policies type validation failures.
♻️ Suggested improvement
 reporting_lines = dept.get("reporting_lines")
 if reporting_lines is not None:
     if not isinstance(reporting_lines, list):
         msg = f"Department {dept_name!r} 'reporting_lines' must be a list"
+        logger.warning(
+            TEMPLATE_RENDER_YAML_ERROR,
+            department=dept_name,
+            error=msg,
+        )
         raise TemplateRenderError(msg)
     dept_dict["reporting_lines"] = reporting_lines
 policies = dept.get("policies")
 if policies is not None:
     if not isinstance(policies, dict):
         msg = f"Department {dept_name!r} 'policies' must be a mapping"
+        logger.warning(
+            TEMPLATE_RENDER_YAML_ERROR,
+            department=dept_name,
+            error=msg,
+        )
         raise TemplateRenderError(msg)
     dept_dict["policies"] = policies
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/templates/renderer.py` around lines 466 - 493, Add logging
calls before raising TemplateRenderError for the two type checks: when
reporting_lines is not a list and when policies is not a dict. Use the existing
logger and TEMPLATE_RENDER_VARIABLE_ERROR (or logger.error/warning as
appropriate) and include context fields like department=dept_name,
field="reporting_lines" or field="policies", and a short detail describing the
expected type (e.g., "must be a list" / "must be a mapping") so the log contains
the same context as the raised TemplateRenderError.
Pull request overview
Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.
src/ai_company/core/agent.py
Outdated
| """ | ||
|
|
||
- model_config = ConfigDict(frozen=True)
+ model_config = ConfigDict(frozen=True, allow_inf_nan=False)
PersonalityConfig is used to validate inline personality dicts and presets, but its model_config doesn’t set extra='forbid'. With Pydantic’s default extra handling, unknown keys in personality dicts (e.g., typos like conciensciousness) may be silently ignored, meaning _validate_inline_personality() and the import-time preset validation won’t reliably catch schema mistakes. Consider forbidding extra fields on PersonalityConfig, or explicitly checking incoming dict keys against PersonalityConfig.model_fields in the renderer/preset validation paths to ensure typos are rejected.
- model_config = ConfigDict(frozen=True, allow_inf_nan=False)
+ model_config = ConfigDict(frozen=True, allow_inf_nan=False, extra="forbid")
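Without a Pydantic dependency, the typo-catching behavior the reviewer wants can be sketched by checking incoming keys against a known field set. The field list below is illustrative, not the model's actual schema; with Pydantic, `extra="forbid"` does this natively:

```python
# Hypothetical subset of PersonalityConfig's field names.
KNOWN_PERSONALITY_FIELDS = frozenset(
    {
        "description",
        "communication_style",
        "openness",
        "conscientiousness",
        "extraversion",
        "agreeableness",
        "stress_response",
    }
)


def reject_unknown_keys(personality: dict) -> None:
    """Raise on typos such as 'conciensciousness' instead of silently ignoring them."""
    unknown = sorted(set(personality) - KNOWN_PERSONALITY_FIELDS)
    if unknown:
        raise ValueError(f"Unknown personality fields: {unknown}")
```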
 # Unique department names (normalized for case-insensitive comparison)
 names = [d.name.strip().casefold() for d in self.departments]
 if len(names) != len(set(names)):
     dupes = sorted(n for n, c in Counter(names).items() if c > 1)
     msg = f"Duplicate department names: {dupes}"
     raise ValueError(msg)
Company._validate_departments() now normalizes department names with strip().casefold() for uniqueness and for validating handoff/escalation references. This changes validation semantics vs other layers (e.g., RootConfig / CompanyTemplate validators still appear to treat department names case-sensitively), which can allow configs/templates to validate initially but fail later when constructing Company. Consider aligning department-name uniqueness/reference validation across config/template/company layers (all case-insensitive or all case-sensitive), and if keeping normalization here, consider reporting duplicates using the original department names for clearer error messages.
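One way to get both properties the reviewer asks for — case-insensitive detection while reporting the original names — is to group by the normalized key but keep the raw values (helper name illustrative):

```python
from collections import defaultdict


def duplicate_department_names(names: list[str]) -> list[list[str]]:
    """Group names that collide after strip().casefold(), keeping originals."""
    groups: defaultdict[str, list[str]] = defaultdict(list)
    for name in names:
        groups[name.strip().casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

An error message built from these groups can then say `Duplicate department names: ['Engineering', 'engineering']` rather than the folded form.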
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/core/company.py`:
- Around line 134-141: Add module-level observability and log validation
failures before raising: import get_logger from ai_company.observability and
create logger = get_logger(__name__), then update each validation function
(e.g., _reject_same_department and the other validator blocks around lines
164-205, 288-299, 420-451) to call logger.warning or logger.error with
contextual information (include from_dept, to_dept, label or relevant
parameters) immediately before raising ValueError so failures are recorded with
context.
- Around line 86-98: The _validate_approvers method currently allows duplicate
entries; update it to reject duplicate approvers by checking uniqueness (e.g.,
compare len(self.approvers) to len(set(self.approvers))) and raise a ValueError
listing or describing the duplicate(s) when found; also change the min_approvals
check in _validate_approvers to compare against the number of unique approvers
(use len(set(self.approvers))) so min_approvals cannot exceed distinct
approvers, and keep the existing error messages clear and contextual (refer to
_validate_approvers and the ApprovalChain approvers/min_approvals validation).
In `@tests/unit/core/test_company.py`:
- Around line 770-780: The test test_duplicate_subordinates_case_insensitive
only checks case differences and misses the whitespace-normalization regression;
update it to parametrize both normalization scenarios so it asserts duplicates
are rejected when subordinates differ by case OR by surrounding whitespace.
Specifically, convert test_duplicate_subordinates_case_insensitive into a
`@pytest.mark.parametrize` over pairs of subordinate strings (e.g.,
("Alice","alice") and ("Alice "," alice")) and still instantiate Department with
ReportingLine(subordinate=..., supervisor=...) entries and expect
ValidationError when creating Department; this ensures the
Department/ReportingLine normalization (strip() + casefold()) is actually
exercised.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 310521b7-50f9-46b0-8896-df5e1a38bfcb
📒 Files selected for processing (3)
src/ai_company/core/company.py
tests/unit/core/conftest.py
tests/unit/core/test_company.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations` — Python 3.14 has PEP 649
Use `except A, B:` syntax (no parentheses) for exception handling — ruff enforces this on Python 3.14
Include type hints on all public functions; enforce with mypy strict mode
Use Google-style docstrings on all public classes and functions — enforced by ruff D rules
Use immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections, use `copy.deepcopy()` at construction + `MappingProxyType` wrapping. For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` and `copy.deepcopy()` at system boundaries
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `@computed_field` for derived values; use `NotBlankStr` from `core.types` for all identifier/name fields
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code; prefer structured concurrency over bare `create_task`
Maintain line length of 88 characters — enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate input at system boundaries: user input, external APIs, and config files
Files:
tests/unit/core/conftest.py
src/ai_company/core/company.py
tests/unit/core/test_company.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Maintain 80% minimum code coverage — enforced in CI
Use `asyncio_mode = "auto"` in pytest configuration — no manual `@pytest.mark.asyncio` needed
Set 30-second timeout per test
Use `pytest-xdist` via `-n auto` for parallel test execution
Prefer `@pytest.mark.parametrize` for testing similar cases
Files:
tests/unit/core/conftest.py
tests/unit/core/test_company.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Never use `import logging`/`logging.getLogger()`/`print()` in application code — use `get_logger()` from the observability module
Use variable name `logger` (not `_logger`, not `log`) for logging instances
Always use event name constants from `ai_company.observability.events` domain-specific modules; import directly (e.g., `from ai_company.observability.events.<domain> import EVENT_CONSTANT`)
Use structured logging format: `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG level for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging
Files:
src/ai_company/core/company.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names like
`example-provider`, `example-large-001`, `large`/`medium`/`small` as aliases, or `test-provider`, `test-small-001` in tests
Files:
src/ai_company/core/company.py
🧬 Code graph analysis (2)
tests/unit/core/conftest.py (1)
src/ai_company/core/company.py (11)
ApprovalChain (67-98), Company (373-452), CompanyConfig (302-332), Department (239-299), DepartmentPolicies (101-128), EscalationPath (175-205), HRRegistry (335-370), ReportingLine (16-38), ReviewRequirements (41-64), Team (208-236), WorkflowHandoff (144-172)
tests/unit/core/test_company.py (1)
src/ai_company/core/company.py (11)
ApprovalChain (67-98), Company (373-452), CompanyConfig (302-332), Department (239-299), DepartmentPolicies (101-128), EscalationPath (175-205), HRRegistry (335-370), ReportingLine (16-38), ReviewRequirements (41-64), Team (208-236), WorkflowHandoff (144-172)
 @model_validator(mode="after")
 def _validate_approvers(self) -> Self:
     """Ensure approvers is non-empty and min_approvals is within bounds."""
     if not self.approvers:
         msg = "Approval chain must have at least one approver"
         raise ValueError(msg)
     if self.min_approvals > len(self.approvers):
         msg = (
             f"min_approvals ({self.min_approvals}) exceeds "
             f"number of approvers ({len(self.approvers)})"
         )
         raise ValueError(msg)
     return self
Reject duplicate approvers in ApprovalChain.
approvers=("lead", "lead") currently passes validation, so the same person can satisfy multiple slots and min_approvals stops representing distinct approvals.
Proposed fix
 @model_validator(mode="after")
 def _validate_approvers(self) -> Self:
     """Ensure approvers is non-empty and min_approvals is within bounds."""
     if not self.approvers:
         msg = "Approval chain must have at least one approver"
         raise ValueError(msg)
+    normalized_approvers = [a.strip().casefold() for a in self.approvers]
+    if len(normalized_approvers) != len(set(normalized_approvers)):
+        dupes = sorted(
+            a for a, count in Counter(normalized_approvers).items() if count > 1
+        )
+        msg = (
+            f"Duplicate approvers in approval chain "
+            f"{self.action_type!r}: {dupes}"
+        )
+        raise ValueError(msg)
     if self.min_approvals > len(self.approvers):
         msg = (
             f"min_approvals ({self.min_approvals}) exceeds "
             f"number of approvers ({len(self.approvers)})"
         )
As per coding guidelines, "Validate input at system boundaries: user input, external APIs, and config files".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/core/company.py` around lines 86 - 98, The _validate_approvers
method currently allows duplicate entries; update it to reject duplicate
approvers by checking uniqueness (e.g., compare len(self.approvers) to
len(set(self.approvers))) and raise a ValueError listing or describing the
duplicate(s) when found; also change the min_approvals check in
_validate_approvers to compare against the number of unique approvers (use
len(set(self.approvers))) so min_approvals cannot exceed distinct approvers, and
keep the existing error messages clear and contextual (refer to
_validate_approvers and the ApprovalChain approvers/min_approvals validation).
     """
     try:
         PersonalityConfig(**personality)
     except (ValidationError, TypeError) as exc:
PEP 758 except syntax — missing from project convention
CLAUDE.md mandates PEP 758 bare-comma except syntax (no parentheses), enforced by ruff on Python 3.14:
- except (ValidationError, TypeError) as exc:
+ except ValidationError, TypeError as exc:
This is the only parenthesised multi-exception catch introduced in this PR.
Rule Used: CLAUDE.md (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/templates/renderer.py
Line: 435
Comment:
**PEP 758 except syntax — missing from project convention**
CLAUDE.md mandates PEP 758 bare-comma except syntax (no parentheses), enforced by ruff on Python 3.14:
```suggestion
except ValidationError, TypeError as exc:
```
This is the only parenthesised multi-exception catch introduced in this PR.
**Rule Used:** CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=6816cd03-d0e1-4fd0-9d04-2417487a584c))
How can I resolve this? If you propose a fix, please make it concise.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
…eptile
- Remove dead _reject_nan validator (allow_inf_nan=False handles it)
- Add extra='forbid' to PersonalityConfig to reject unknown keys
- Add logger + COMPANY_VALIDATION_ERROR event to company.py validators
- Reject duplicate approvers in ApprovalChain (case-insensitive)
- Guard module-level del in presets.py against empty dict
- Add whitespace-insensitive subordinate dedup test
- Add test_escalation_paths_accepted to template schema tests
- Add test_duplicate_approvers tests to ApprovalChain tests
- Register company events module in discovery test
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/observability/test_events.py (1)
93-113: 🧹 Nitpick | 🔵 Trivial — Discovery alone does not pin the new domain constants.
This update proves the `company` and `personality` modules are present, but a typo in one of their exported event strings would still pass `_DOT_PATTERN` and uniqueness checks. Please add domain-specific value assertions alongside the existing `test_*_events_exist` blocks.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/observability/test_events.py` around lines 93 - 113, The discovery test only verifies modules exist but not that their exported event string constants have the exact expected values; update the tests by adding domain-specific value assertions alongside the existing test_*_events_exist checks (e.g., assert the specific event string constants exported from the company and personality modules equal the canonical values used elsewhere) so that typos in exported strings fail; locate references in tests/unit/observability/test_events.py (functions test_all_domain_modules_discovered and the existing test_*_events_exist blocks) and add explicit equality assertions for the key constants from the company and personality modules to pin their exact values.
♻️ Duplicate comments (1)
src/ai_company/core/company.py (1)
33-41: ⚠️ Potential issue | 🟠 Major — Some new validation paths still bypass structured error logging.
`ReportingLine._validate_not_self_report`, `DepartmentPolicies._validate_unique_action_types`, and `Department._validate_unique_subordinates` still raise directly, so those failures never emit `COMPANY_VALIDATION_ERROR` the way the other validators in this module now do.
As per coding guidelines, "All error paths must log at WARNING or ERROR with context before raising" and "Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`".
Also applies to: 132-139, 301-311
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/core/company.py` around lines 33 - 41, The three validators (ReportingLine._validate_not_self_report, DepartmentPolicies._validate_unique_action_types, Department._validate_unique_subordinates) currently raise ValueError directly and must instead log the failure with context before raising; add the module-level imports "from ai_company.observability import get_logger" and "from ai_company.observability import COMPANY_VALIDATION_ERROR" and initialize "logger = get_logger(__name__)", then in each validator build the same error message you already construct, call logger.warning/ error with COMPANY_VALIDATION_ERROR and the message (include relevant fields like subordinate, supervisor, action type, or subordinate list) and then raise the ValueError as before so the error path is logged consistently prior to raising.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/templates/presets.py`:
- Around line 317-322: The startup validation loop over PERSONALITY_PRESETS
currently constructs PersonalityConfig(**_preset_dict) but does not include the
preset key in error reporting; update the loop to validate each preset inside a
try/except that catches pydantic.ValidationError (or Exception) and re-raises or
logs a new error that includes the preset name (_preset_name) and the original
exception details, then continue or fail as desired; reference the existing loop
over PERSONALITY_PRESETS and the PersonalityConfig constructor to locate where
to add the try/except and use _preset_name in the error message so failures
identify which preset failed.
- Around line 20-21: PERSONALITY_PRESETS currently only wraps the outer mapping,
leaving inner preset dicts mutable and allowing callers to mutate shared state
and bypass startup validation; fix this by making each preset payload immutable
or by storing validated PersonalityConfig instances: either wrap every inner
dict with MappingProxyType when building PERSONALITY_PRESETS (e.g., replace each
value dict with MappingProxyType(value_dict)) or construct and store validated
PersonalityConfig objects for each preset and ensure any public accessor returns
a copy (deepcopy or pydantic .copy()/dict()) rather than the original; update
the module initialization where PERSONALITY_PRESETS is created and any code that
reads presets to return copies so presets cannot be mutated at runtime.
---
Outside diff comments:
In `@tests/unit/observability/test_events.py`:
- Around line 93-113: The discovery test only verifies modules exist but not
that their exported event string constants have the exact expected values;
update the tests by adding domain-specific value assertions alongside the
existing test_*_events_exist checks (e.g., assert the specific event string
constants exported from the company and personality modules equal the canonical
values used elsewhere) so that typos in exported strings fail; locate references
in tests/unit/observability/test_events.py (functions
test_all_domain_modules_discovered and the existing test_*_events_exist blocks)
and add explicit equality assertions for the key constants from the company and
personality modules to pin their exact values.
---
Duplicate comments:
In `@src/ai_company/core/company.py`:
- Around line 33-41: The three validators
(ReportingLine._validate_not_self_report,
DepartmentPolicies._validate_unique_action_types,
Department._validate_unique_subordinates) currently raise ValueError directly
and must instead log the failure with context before raising; add the
module-level imports "from ai_company.observability import get_logger" and "from
ai_company.observability import COMPANY_VALIDATION_ERROR" and initialize "logger
= get_logger(__name__)", then in each validator build the same error message you
already construct, call logger.warning or logger.error with COMPANY_VALIDATION_ERROR and
the message (include relevant fields like subordinate, supervisor, action type,
or subordinate list) and then raise the ValueError as before so the error path
is logged consistently prior to raising.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 09147f9d-f9ec-47d9-a8a8-9d3c31fd9f1a
📒 Files selected for processing (7)
- src/ai_company/core/agent.py
- src/ai_company/core/company.py
- src/ai_company/observability/events/company.py
- src/ai_company/templates/presets.py
- tests/unit/core/test_company.py
- tests/unit/observability/test_events.py
- tests/unit/templates/test_schema.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Python 3.14+ required with PEP 649 native lazy annotations
Do NOT use `from __future__ import annotations` — Python 3.14 has PEP 649 native lazy annotations
Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14
Enforce 88 character line length (ruff configuration)
Files:
- src/ai_company/observability/events/company.py
- tests/unit/templates/test_schema.py
- src/ai_company/core/agent.py
- tests/unit/observability/test_events.py
- src/ai_company/templates/presets.py
- tests/unit/core/test_company.py
- src/ai_company/core/company.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: All public functions and classes require type hints; mypy strict mode enforced
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Create new objects instead of mutating existing ones — enforce immutability for non-Pydantic internal collections (registries, BaseTool) using copy.deepcopy() at construction and MappingProxyType wrapping
Use copy.deepcopy() at system boundaries for dict/list fields in frozen Pydantic models (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; separate mutable-via-copy models using model_copy(update=...) for runtime state that evolves
Never mix static config fields with mutable runtime fields in one Pydantic model
Use Pydantic v2 with BaseModel, model_validator, computed_field, and ConfigDict
Use `@computed_field` for derived values instead of storing and validating redundant fields (e.g., TokenUsage.total_tokens)
Use NotBlankStr from core.types for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files) in Python code
Use variable name `logger` (not `_logger`, not `log`) for logger instances
Use structured kwargs in logging: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Log at DEBUG level for object creation, internal flow, entry/exit of key functions
Pure data models, enu...
Files:
- src/ai_company/observability/events/company.py
- src/ai_company/core/agent.py
- src/ai_company/templates/presets.py
- src/ai_company/core/company.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Never use `import logging`, `logging.getLogger()`, or `print()` in application code
Always use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget)
Import event constants directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
Files:
- src/ai_company/observability/events/company.py
- src/ai_company/core/agent.py
- src/ai_company/templates/presets.py
- src/ai_company/core/company.py
{src,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small aliases
Files:
- src/ai_company/observability/events/company.py
- tests/unit/templates/test_schema.py
- src/ai_company/core/agent.py
- tests/unit/observability/test_events.py
- src/ai_company/templates/presets.py
- tests/unit/core/test_company.py
- src/ai_company/core/company.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Use asyncio_mode = 'auto' in pytest — no manual `@pytest.mark.asyncio` needed on async tests
Enforce 30 second timeout per test
Prefer@pytest.mark.parametrizefor testing similar cases in Python tests
Use test-provider, test-small-001, etc. instead of vendor names in tests
Files:
- tests/unit/templates/test_schema.py
- tests/unit/observability/test_events.py
- tests/unit/core/test_company.py
src/ai_company/!(observability)/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Every module with business logic MUST have:
`from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Files:
- src/ai_company/core/agent.py
- src/ai_company/templates/presets.py
- src/ai_company/core/company.py
🧠 Learnings (8)
📚 Learnt from: CR | Repo: Aureliolo/ai-company PR: 0 | File: CLAUDE.md:0-0 | Timestamp: 2026-03-07T17:01:58.720Z (shared by all eight entries below)
- Applies to src/ai_company/**/*.py: Import event constants directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
  Applied to files: src/ai_company/observability/events/company.py, tests/unit/observability/test_events.py
- Applies to src/ai_company/**/*.py: Always use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget)
  Applied to files: src/ai_company/observability/events/company.py, tests/unit/observability/test_events.py, src/ai_company/core/company.py
- Applies to src/**/*.py: Use frozen Pydantic models for config/identity; separate mutable-via-copy models using model_copy(update=...) for runtime state that evolves
  Applied to files: src/ai_company/core/agent.py
- Applies to src/**/*.py: Never mix static config fields with mutable runtime fields in one Pydantic model
  Applied to files: src/ai_company/core/agent.py
- Applies to src/ai_company/!(observability)/**/*.py: Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
  Applied to files: src/ai_company/core/company.py
- Applies to src/ai_company/**/*.py: Never use `import logging`, `logging.getLogger()`, or `print()` in application code
  Applied to files: src/ai_company/core/company.py
- Applies to src/**/*.py: All error paths must log at WARNING or ERROR with context before raising
  Applied to files: src/ai_company/core/company.py
- Applies to src/**/*.py: Validate at system boundaries (user input, external APIs, config files) in Python code
  Applied to files: src/ai_company/core/company.py
🧬 Code graph analysis (5)
tests/unit/templates/test_schema.py (2)
- src/ai_company/templates/schema.py (3): TemplateAgentConfig (77-127), TemplateDepartmentConfig (130-164), CompanyTemplate (201-304)
- tests/unit/templates/conftest.py (1): make_template_dict (123-125)
src/ai_company/core/agent.py (1)
- src/ai_company/core/enums.py (5): CollaborationPreference (244-249), CommunicationVerbosity (252-257), ConflictApproach (260-267), CreativityLevel (40-45), DecisionMakingStyle (235-241)
src/ai_company/templates/presets.py (2)
- src/ai_company/core/agent.py (1): PersonalityConfig (25-122)
- src/ai_company/observability/_logger.py (1): get_logger (8-28)
tests/unit/core/test_company.py (1)
- src/ai_company/core/company.py (11): ApprovalChain (71-110), Company (386-469), CompanyConfig (315-345), Department (252-312), DepartmentPolicies (113-140), EscalationPath (188-218), HRRegistry (348-383), ReportingLine (20-42), ReviewRequirements (45-68), Team (221-249), WorkflowHandoff (157-185)
src/ai_company/core/company.py (1)
- src/ai_company/observability/_logger.py (1): get_logger (8-28)
🔇 Additional comments (1)
src/ai_company/core/agent.py (1)
49-122: Nice boundary hardening of `PersonalityConfig`.
`extra="forbid"`, `allow_inf_nan=False`, and the explicit bounds/max-length guards close off the obvious malformed preset and inline-config cases without needing extra custom validators.
…validation

- Add logging before raise in _validate_unique_action_types (Greptile)
- Freeze inner preset dicts with MappingProxyType (CodeRabbit)
- Name the failing preset in startup validation error (CodeRabbit)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# Conflicts:
#   src/ai_company/config/defaults.py
#   src/ai_company/config/schema.py
#   tests/unit/observability/test_events.py
Pull request overview
Copilot reviewed 27 out of 27 changed files in this pull request and generated 3 comments.
```diff
 agent_dict: dict[str, Any] = {
     "name": name,
     "role": role,
-    "department": agent.get("department", "engineering"),
+    "department": agent.get("department", _DEFAULT_DEPARTMENT),
     "level": agent.get("level", "mid"),
 }
```
department defaulting doesn’t handle an explicit null/None value. If the rendered agent dict contains department: null, agent.get('department', _DEFAULT_DEPARTMENT) will return None, and RootConfig/AgentConfig will later reject it (department is NotBlankStr). Consider normalizing None (and possibly whitespace-only strings) to _DEFAULT_DEPARTMENT before building agent_dict to match the template schema docstring (“None defaults to engineering during rendering”).
```diff
 inline_personality = agent.get("personality")
 preset_name = agent.get("personality_preset")
-if preset_name:
+if inline_personality is not None:
+    if not isinstance(inline_personality, dict):
+        msg = (
+            f"Personality for agent {name!r} must be a mapping, "
+            f"got {type(inline_personality).__name__}"
+        )
+        raise TemplateRenderError(msg)
+    _validate_inline_personality(inline_personality, name)
+    agent_dict["personality"] = inline_personality
+elif preset_name:
```
Mutual exclusivity between personality and personality_preset is enforced in TemplateAgentConfig, but _expand_single_agent() silently prefers personality when both keys are present in rendered YAML. For defense-in-depth and clearer errors (especially if rendered YAML is produced outside the schema path), consider explicitly rejecting the case where both are set and raising TemplateRenderError with a clear message.
```python
def test_preset_count_at_least_15(self) -> None:
    assert len(PERSONALITY_PRESETS) >= 15
```
test_preset_count_at_least_15 hard-codes a minimum preset count, which can make future refactors (e.g., consolidating/renaming presets) fail tests without any behavioral regression. Consider asserting on required preset names (or on qualitative invariants like “non-empty, unique keys, validates as PersonalityConfig”) rather than a specific quantity.
```diff
-def test_preset_count_at_least_15(self) -> None:
-    assert len(PERSONALITY_PRESETS) >= 15
+def test_presets_have_nonempty_unique_names(self) -> None:
+    # Ensure we have at least one preset, and that names are well-formed and unique.
+    assert PERSONALITY_PRESETS, "No personality presets defined"
+    for name in PERSONALITY_PRESETS:
+        assert isinstance(name, str), "Preset name must be a string"
+        assert name.strip(), "Preset name must be non-empty"
+    assert len(PERSONALITY_PRESETS) == len(
+        set(PERSONALITY_PRESETS)
+    ), "Duplicate preset names found"
```
```python
def _validate_not_self_report(self) -> Self:
    """Reject self-reporting relationships."""
    if self.subordinate.strip().casefold() == self.supervisor.strip().casefold():
        msg = (
            f"Agent cannot report to themselves: "
            f"{self.subordinate!r} == {self.supervisor!r}"
        )
        raise ValueError(msg)
    return self
```
The _validate_not_self_report validator raises without logging, violating the CLAUDE.md convention that "all error paths must log at WARNING or ERROR with context before raising."
All other validators in this file (e.g., _validate_approvers, _validate_unique_action_types, _reject_same_department) correctly log via logger.warning(COMPANY_VALIDATION_ERROR, ...) before raising. Apply the same pattern here:
```diff
+@model_validator(mode="after")
 def _validate_not_self_report(self) -> Self:
     """Reject self-reporting relationships."""
     if self.subordinate.strip().casefold() == self.supervisor.strip().casefold():
         msg = (
             f"Agent cannot report to themselves: "
             f"{self.subordinate!r} == {self.supervisor!r}"
         )
+        logger.warning(COMPANY_VALIDATION_ERROR, error=msg)
         raise ValueError(msg)
     return self
```
The same pattern is needed for _validate_unique_subordinates (company.py:303–313) and _validate_personality_mutual_exclusion (templates/schema.py:119–127). Those two also raise without logging.
Rule Used: CLAUDE.md (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/core/company.py
Line: 34-42
Comment:
The `_validate_not_self_report` validator raises without logging, violating the CLAUDE.md convention that "all error paths must log at WARNING or ERROR with context before raising."
All other validators in this file (e.g., `_validate_approvers`, `_validate_unique_action_types`, `_reject_same_department`) correctly log via `logger.warning(COMPANY_VALIDATION_ERROR, ...)` before raising. Apply the same pattern here:
```suggestion
@model_validator(mode="after")
def _validate_not_self_report(self) -> Self:
"""Reject self-reporting relationships."""
if self.subordinate.strip().casefold() == self.supervisor.strip().casefold():
msg = (
f"Agent cannot report to themselves: "
f"{self.subordinate!r} == {self.supervisor!r}"
)
logger.warning(COMPANY_VALIDATION_ERROR, error=msg)
raise ValueError(msg)
return self
```
The same pattern is needed for `_validate_unique_subordinates` (company.py:303–313) and `_validate_personality_mutual_exclusion` (templates/schema.py:119–127). Those two also raise without logging.
**Rule Used:** CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=6816cd03-d0e1-4fd0-9d04-2417487a584c))
How can I resolve this? If you propose a fix, please make it concise.

Address 13 findings from post-merge bot reviews on PRs #157-#162:

Communication layer (PR #157):
- Wake blocked receive() callers on unsubscribe via None sentinel
- Distinguish shutdown vs timeout in receive() logging
- Add AgentMessenger.receive() facade method
- Validate MessageHandler.handle() is async at registration
- Await cancelled tasks in _await_with_shutdown to prevent warnings

Observability (PR #158):
- Add log-before-raise to all validators missing it (company.py, schema.py) — 14 raise sites across 11 validators

Parallel execution (PR #161):
- Log suppressed ExceptionGroup instead of silent pass
- Add PARALLEL_AGENT_CANCELLED structured event for cancellations
- Fix progress.in_progress semantics (increment after semaphore)
- Use PARALLEL_LOCK_RELEASE_ERROR for lock release failures
- Remove duplicate plan_parsing.py from DESIGN_SPEC file tree

Template inheritance (PR #162):
- Update DESIGN_SPEC merge key docs to include merge_id
- Preserve merge_id in _expand_single_agent (confirmed bug fix)
- Defer deepcopy in _apply_child_agent past _remove early-return

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 I have created a release *beep* *boop*

---

## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
🤖 I have created a release *beep* *boop*

---

## [0.1.0](v0.0.0...v0.1.0) (2026-03-11)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15))
* bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e))
* bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238))
* bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e))
* bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7))
* bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

--------

Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Summary
- Add `EscalationPath` same-department validator (case-insensitive check), plus type validation for `reporting_lines` (list) and `policies` (dict) in `_build_departments`
- Add `max_length` constraints to `description` (500) and `communication_style` (100), and NaN rejection tests for all Big Five dimensions
- Validate inline personality dicts eagerly via `PersonalityConfig` construction, route `workflow_handoffs`/`escalation_paths` through `_validate_list`, and extract `_DEFAULT_PROVIDER` and `_DEFAULT_DEPARTMENT` constants
- Wrap `PERSONALITY_PRESETS` and `_AUTO_NAMES` with `MappingProxyType` for immutability
- Replace `range(len())` nested loops with `itertools.combinations` in personality scoring, consolidate duplicated metadata construction in `build_error_prompt`, and use loop-based handoff/escalation validation
- Fix stale docstrings in `templates/schema.py` and `config/schema.py`; update DESIGN_SPEC.md §3.1 and §15.3 with personality dimensions and new files
- Fix `DepartmentPoliciesFactory` forward reference; add `from_department`/`to_department` to `EscalationPathFactory`

Closes #78, closes #79
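A minimal sketch of the `itertools.combinations` change in personality scoring. The function names `compute_compatibility` and `compute_team_compatibility` come from this PR, but the trait-dict representation, the mean-absolute-distance score, and the NaN guard shown here are illustrative assumptions — the repository's actual algorithm is a weighted composite over Big Five and behavioral dimensions:

```python
import math
from itertools import combinations


def compute_compatibility(a: dict[str, float], b: dict[str, float]) -> float:
    """Illustrative pairwise score: 1 minus mean absolute trait distance.

    NaN values are rejected up front, mirroring the PR's NaN rejection
    tests for Big Five dimensions (placeholder formula, not the real one).
    """
    if any(math.isnan(v) for v in (*a.values(), *b.values())):
        raise ValueError("personality dimensions must not be NaN")
    traits = a.keys() & b.keys()
    if not traits:
        return 0.0
    return 1.0 - sum(abs(a[t] - b[t]) for t in traits) / len(traits)


def compute_team_compatibility(profiles: list[dict[str, float]]) -> float:
    """Average pairwise compatibility over all unordered pairs.

    combinations(profiles, 2) yields each pair exactly once, replacing
    the index-based range(len()) nested loops the PR removed.
    """
    pairs = list(combinations(profiles, 2))
    if not pairs:
        return 1.0  # a team of 0 or 1 is trivially compatible
    return sum(compute_compatibility(a, b) for a, b in pairs) / len(pairs)
```

The design point is that `combinations` encodes the "each unordered pair once" invariant directly, where `for i in range(len(...)): for j in range(i + 1, len(...))` has to get the `i + 1` offset right by hand.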
Review coverage
Pre-reviewed by 10 agents; 32 findings were addressed.
Test plan