
refactor: harden personality profiles, department validation, and template rendering#158

Merged
Aureliolo merged 7 commits into main from feat/dept-personalities
Mar 7, 2026

Conversation

@Aureliolo
Owner

Summary

  • Harden department validation: Add EscalationPath same-department validator (case-insensitive), type validation for reporting_lines (list) and policies (dict) in _build_departments
  • Strengthen personality config: Add max_length constraints to description (500) and communication_style (100), NaN rejection tests for all Big Five dimensions
  • Improve template rendering: Validate inline personality dicts eagerly via PersonalityConfig construction, route workflow_handoffs/escalation_paths through _validate_list, extract _DEFAULT_PROVIDER and _DEFAULT_DEPARTMENT constants
  • Enforce immutability: Wrap PERSONALITY_PRESETS and _AUTO_NAMES with MappingProxyType
  • Simplify code: Replace range(len()) nested loops with itertools.combinations in personality scoring, consolidate duplicated metadata in build_error_prompt, loop-based handoff/escalation validation
  • Fix docs: Update stale docstrings in templates/schema.py and config/schema.py, update DESIGN_SPEC.md §3.1 and §15.3 with personality dimensions and new files
  • Fix test stability: Reorder test factories to resolve DepartmentPoliciesFactory forward reference, add from_department/to_department to EscalationPathFactory

Closes #78, closes #79
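
The `range(len())` → `itertools.combinations` simplification noted above can be illustrated with a minimal, self-contained sketch (the list contents here are placeholders, not the project's scoring data):

```python
import itertools

values = ["a", "b", "c", "d"]

# Before: index-based nested loops enumerating each unordered pair once.
pairs_loops = [
    (values[i], values[j])
    for i in range(len(values))
    for j in range(i + 1, len(values))
]

# After: itertools.combinations yields the same pairs in the same order,
# without manual index bookkeeping.
pairs_combos = list(itertools.combinations(values, 2))

assert pairs_loops == pairs_combos
```

The two forms are equivalent; the `combinations` version states the intent (every unordered pair, once) directly.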

Review coverage

Pre-reviewed by 10 agents, 32 findings addressed:

  • code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter
  • comment-analyzer, type-design-analyzer, logging-audit, resilience-audit
  • security-reviewer, docs-consistency
  • Plus code-simplifier polish pass (2 simplifications applied)

Test plan

  • All 2590 tests pass
  • 96.36% coverage (threshold: 80%)
  • ruff lint + format clean
  • mypy strict passes
  • All pre-commit hooks pass

Aureliolo and others added 2 commits March 7, 2026 16:43
, #79)

Department structure (#78):
- Add ReportingLine, ReviewRequirements, ApprovalChain, DepartmentPolicies,
  WorkflowHandoff, and EscalationPath frozen Pydantic models
- Extend Department with reporting_lines and policies fields
- Extend Company with workflow_handoffs and escalation_paths fields
- Add ActionType enum for approval chain action types
- Wire through config schema, defaults, template schema, and renderer

Agent personality profiles (#79):
- Extend PersonalityConfig with Big Five (OCEAN) float dimensions and
  behavioral enums (DecisionMakingStyle, CollaborationPreference,
  CommunicationVerbosity, ConflictApproach)
- Add personality compatibility scoring module (compute_compatibility,
  compute_team_compatibility) with weighted composite algorithm
- Expand presets from 4 to 15 with Big Five and behavioral dimensions
- Enhance prompt template with new personality variables (v1.1.0)
- Support inline personality dict override in template renderer

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…plate rendering

Pre-reviewed by 10 agents, 32 findings addressed:
- Add EscalationPath same-department validator with case-insensitive check
- Add max_length constraints to PersonalityConfig description and communication_style
- Add NaN rejection tests for Big Five dimensions
- Wrap PERSONALITY_PRESETS and _AUTO_NAMES with MappingProxyType for immutability
- Add logging before KeyError in get_personality_preset
- Validate inline personality dicts eagerly via PersonalityConfig construction
- Route workflow_handoffs/escalation_paths through _validate_list
- Add type validation for reporting_lines (list) and policies (dict) in _build_departments
- Extract _DEFAULT_PROVIDER and _DEFAULT_DEPARTMENT constants in renderer
- Replace range(len()) nested loops with itertools.combinations in personality scoring
- Fix stale docstrings in templates/schema.py and config/schema.py
- Update DESIGN_SPEC.md §3.1 and §15.3 with personality dimensions
- Consolidate duplicated metadata construction in build_error_prompt
- Reorder test factories to fix DepartmentPoliciesFactory forward reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
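
The `MappingProxyType` wrapping listed above works as follows; the preset data here is hypothetical, the real registries live in `src/ai_company/templates/presets.py`:

```python
from types import MappingProxyType

# Hypothetical preset data standing in for PERSONALITY_PRESETS.
_presets = {"analytical": {"openness": 0.8}}
PERSONALITY_PRESETS = MappingProxyType(_presets)

# Reads behave like a normal dict:
assert PERSONALITY_PRESETS["analytical"]["openness"] == 0.8

# Writes through the proxy raise TypeError, so callers cannot mutate the registry:
try:
    PERSONALITY_PRESETS["rogue"] = {}
except TypeError:
    pass

# Caveat: the proxy is a read-only *view*. Code that still holds a reference
# to the underlying dict (_presets) can mutate it, and nested values remain
# mutable unless they are frozen as well.
```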
Copilot AI review requested due to automatic review settings March 7, 2026 16:26
@github-actions
Contributor

github-actions bot commented Mar 7, 2026

Dependency Review

✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai

coderabbitai bot commented Mar 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds personality modeling (Big Five floats + behavioral enums), compatibility scoring (pairwise & team), inline personality support in templates, cross-department governance models (reporting lines, approval chains, handoffs, escalations), new observability event constants, prompt/template integrations, and many tests.

Changes

Cohort / File(s) Summary
Personality core & scoring
src/ai_company/core/personality.py, src/ai_company/core/agent.py, src/ai_company/core/enums.py
New personality module with compute_compatibility and compute_team_compatibility; PersonalityConfig extended with Big Five floats and behavioral enums; new enums (DecisionMakingStyle, CollaborationPreference, CommunicationVerbosity, ConflictApproach, ActionType).
Company models & governance
src/ai_company/core/company.py, src/ai_company/observability/events/company.py
Adds ReportingLine, ReviewRequirements, ApprovalChain, DepartmentPolicies, WorkflowHandoff, EscalationPath; extends Department and Company with reporting_lines, policies, workflow_handoffs, escalation_paths and validators; adds COMPANY_VALIDATION_ERROR constant.
Templates: presets, renderer, schema
src/ai_company/templates/presets.py, src/ai_company/templates/renderer.py, src/ai_company/templates/schema.py
Presets made immutable and expanded with personality data; import-time preset validation; renderer supports inline personality dicts (validated), stricter role/department defaults, and passes reporting_lines/policies/workflow_handoffs/escalation_paths; schema adds inline personality and mutual-exclusion validator.
Config, defaults & exports
src/ai_company/config/defaults.py, src/ai_company/config/schema.py, src/ai_company/core/__init__.py
Adds workflow_handoffs and escalation_paths to defaults and RootConfig; re-exports new models/enums from core package namespace.
Prompt & template versioning
src/ai_company/engine/prompt.py, src/ai_company/engine/prompt_template.py
Injects personality fields (verbosity, decision_making, collaboration, conflict_approach) into system prompt context; build_error_prompt merges metadata; bumps PROMPT_TEMPLATE_VERSION to "1.1.0".
Observability — personality events
src/ai_company/observability/events/personality.py
Adds PERSONALITY_COMPATIBILITY_COMPUTED and PERSONALITY_TEAM_SCORE_COMPUTED event constants.
Templates renderer robustness
src/ai_company/templates/renderer.py
New helpers and defaults (_validate_inline_personality, _DEFAULT_PROVIDER, _DEFAULT_DEPARTMENT), enhanced YAML parsing/type validation, inclusion and validation of workflow/escalation fields in built config.
Tests — wide coverage
tests/unit/...
Extensive new and updated tests for personality scoring, agent and company model validations, templates (presets/renderer/schema), prompt/version/error-prompt behavior, factories, and observability discovery.
Docs
CLAUDE.md, DESIGN_SPEC.md
Minor CLAUDE.md wording change; DESIGN_SPEC.md updated to document personality dimensions, behavioral enums, compatibility scoring, and department governance additions.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Personality as core.personality
    participant A as PersonalityConfig A
    participant B as PersonalityConfig B
    participant Observability

    Client->>Personality: compute_compatibility(A, B)
    Personality->>A: read Big Five & behavioral fields
    Personality->>B: read Big Five & behavioral fields
    Personality->>Personality: _big_five_score(...)
    Personality->>Personality: _collaboration_score(...)
    Personality->>Personality: _conflict_score(...)
    Personality->>Personality: aggregate weights & clamp
    Personality->>Observability: emit PERSONALITY_COMPATIBILITY_COMPUTED (components, score)
    Personality-->>Client: return compatibility_score
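The flow in the diagram can be sketched in plain Python. The component functions, weights, and field layout below are illustrative assumptions, not the actual implementation in src/ai_company/core/personality.py:

```python
BIG_FIVE = ("openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism")


def _big_five_score(a: dict[str, float], b: dict[str, float]) -> float:
    """1.0 for identical Big Five profiles, falling toward 0.0 as they diverge."""
    return 1.0 - sum(abs(a[d] - b[d]) for d in BIG_FIVE) / len(BIG_FIVE)


def compute_compatibility(a: dict, b: dict) -> float:
    """Weighted composite of component scores, clamped to [0.0, 1.0].

    The weights (0.6 / 0.2 / 0.2) are placeholders, not the project's values.
    """
    score = (
        0.6 * _big_five_score(a["big_five"], b["big_five"])
        + 0.2 * (1.0 if a["collaboration"] == b["collaboration"] else 0.5)
        + 0.2 * (1.0 if a["conflict"] == b["conflict"] else 0.5)
    )
    # Clamp so weight drift can never push the result outside [0, 1].
    return max(0.0, min(1.0, score))
```

An identical pair scores 1.0 by construction, matching the "aggregate weights & clamp" step in the diagram.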

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 74.17%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Title check: The title clearly summarizes the main changes: hardening personality profiles, department validation, and template rendering through refactoring.
  • Description check: The description is directly related to the changeset, detailing specific improvements and validations across personality, department, and template rendering subsystems.
  • Linked Issues check: The PR addresses both linked issues (#78, #79): implements department internal structures (reporting lines, policies, workflows, escalations) and extends personality profiles with Big Five dimensions, behavioral enums, and compatibility scoring.
  • Out of Scope Changes check: All changes are directly aligned with the linked issues' requirements. No out-of-scope modifications were detected; changes focus on personality hardening, department validation, and template rendering as specified.



@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness and configurability of agent personality profiles, department structures, and template rendering within the system. It introduces detailed validation for inter-departmental workflows and escalations, strengthens personality configuration with new dimensions and constraints, and refines template processing for greater reliability and flexibility. The changes aim to improve the system's ability to define and manage complex organizational structures and agent behaviors.

Highlights

  • Department Validation: Added EscalationPath same-department validator (case-insensitive), type validation for reporting_lines (list) and policies (dict) in _build_departments.
  • Personality Configuration: Strengthened personality config by adding max_length constraints to description (500) and communication_style (100), and NaN rejection tests for all Big Five dimensions.
  • Template Rendering: Improved template rendering by validating inline personality dicts eagerly via PersonalityConfig construction, routing workflow_handoffs/escalation_paths through _validate_list, and extracting _DEFAULT_PROVIDER and _DEFAULT_DEPARTMENT constants.
  • Immutability Enforcement: Enforced immutability by wrapping PERSONALITY_PRESETS and _AUTO_NAMES with MappingProxyType.
  • Code Simplification: Simplified code by replacing range(len()) nested loops with itertools.combinations in personality scoring, consolidating duplicated metadata in build_error_prompt, and improving loop-based handoff/escalation validation.
  • Documentation Updates: Fixed documentation by updating stale docstrings in templates/schema.py and config/schema.py, and updating DESIGN_SPEC.md §3.1 and §15.3 with personality dimensions and new files.
  • Test Stability: Fixed test stability by reordering test factories to resolve DepartmentPoliciesFactory forward reference and adding from_department/to_department to EscalationPathFactory.
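The NaN-rejection point above is worth spelling out: NaN compares false against every bound, so a bare range check would reject it with a misleading message. A standalone sketch of the guard (the helper name is illustrative, not the project's API):

```python
import math


def validate_dimension(name: str, value: float) -> float:
    """Validate one Big Five dimension in [0.0, 1.0], rejecting NaN explicitly."""
    # NaN also fails `0.0 <= value <= 1.0`, but only because every comparison
    # with NaN is False; an explicit isnan check gives a precise error message.
    if math.isnan(value):
        raise ValueError(f"{name} must not be NaN")
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be within [0.0, 1.0], got {value}")
    return value
```

In a Pydantic model the same check would live in a `field_validator`; the principle is identical.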
Changelog
  • CLAUDE.md
    • Updated the description for the templates/ directory.
  • DESIGN_SPEC.md
    • Documented personality dimensions, updated an agent example, added new file paths, and detailed personality compatibility scoring.
  • src/ai_company/config/defaults.py
    • Added default empty lists for workflow handoffs and escalation paths.
  • src/ai_company/config/schema.py
    • Imported new company models and added workflow handoffs and escalation paths to RootConfig.
  • src/ai_company/core/__init__.py
    • Expanded imports for new company models and enums.
  • src/ai_company/core/agent.py
    • Introduced Big Five and behavioral enums to PersonalityConfig, added validation for description length, communication style length, and NaN values in Big Five dimensions.
  • src/ai_company/core/company.py
    • Defined new models for reporting lines, review requirements, approval chains, department policies, workflow handoffs, and escalation paths, integrating them into Department and Company models.
  • src/ai_company/core/enums.py
    • Added new StrEnum classes for decision-making style, collaboration preference, communication verbosity, conflict approach, and action types.
  • src/ai_company/core/personality.py
    • Added a new module to compute personality compatibility scores.
  • src/ai_company/engine/prompt.py
    • Integrated new personality dimensions into prompt generation and consolidated error prompt metadata.
  • src/ai_company/engine/prompt_template.py
    • Updated the prompt template version and included new personality dimensions.
  • src/ai_company/observability/events/personality.py
    • Added a new module for personality event constants.
  • src/ai_company/templates/presets.py
    • Expanded personality presets with Big Five and behavioral enum values, enforced immutability, and improved error logging.
  • src/ai_company/templates/renderer.py
    • Refactored template rendering to validate inline personality configs, use new default constants, and validate department reporting lines and policies.
  • src/ai_company/templates/schema.py
    • Updated template schema models to support inline personality, reporting lines, policies, workflow handoffs, and escalation paths.
  • tests/unit/core/conftest.py
    • Added new factories for company structure models and reordered DepartmentPoliciesFactory.
  • tests/unit/core/test_agent.py
    • Added comprehensive unit tests for PersonalityConfig's new features and validations.
  • tests/unit/core/test_company.py
    • Implemented extensive unit tests for new company structure models and their validation logic.
  • tests/unit/core/test_enums.py
    • Added unit tests for the new StrEnum classes.
  • tests/unit/core/test_personality.py
    • Added unit tests for personality compatibility scoring functions.
  • tests/unit/engine/test_prompt.py
    • Added tests for new personality dimensions in prompts and updated the template version check.
  • tests/unit/observability/test_events.py
    • Updated the test to include the new personality event module.
  • tests/unit/templates/test_presets.py
    • Added tests for personality presets' validity and content.
  • tests/unit/templates/test_renderer.py
    • Added tests for inline personality overrides, department configuration passthrough, and template rendering error handling.
  • tests/unit/templates/test_schema.py
    • Added tests for updated template schema models.
Activity
  • The pull request was pre-reviewed by 10 agents, and 32 findings were addressed.
  • All 2590 tests passed successfully.
  • Code coverage is at 96.36%, exceeding the 80% threshold.
  • Ruff lint and format checks passed cleanly.
  • Mypy strict type checks passed.
  • All pre-commit hooks passed.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a comprehensive personality system for agents, including Big Five (OCEAN-variant) dimensions for internal compatibility scoring and behavioral enums (DecisionMakingStyle, CollaborationPreference, CommunicationVerbosity, ConflictApproach) for prompt injection. It adds a new PersonalityConfig model to ai_company.core.agent.py with validation for these new fields, and a personality.py module for calculating pairwise and team-level compatibility scores. The system prompt template (prompt_template.py) is updated to include these new behavioral enums, and the DESIGN_SPEC.md and CLAUDE.md documentation are updated to reflect these changes.

Additionally, the PR extends company and department configuration with new models for ReportingLine, ReviewRequirements, ApprovalChain, DepartmentPolicies, WorkflowHandoff, and EscalationPath, allowing for more detailed organizational structure and inter-departmental workflows. Template rendering logic is updated to support inline personality configurations and pass through these new department and company-level settings.

Review comments suggest refactoring duplicated validation logic in ai_company.core.company.py into a shared utility and optimizing the compute_team_compatibility function in ai_company.core.personality.py for memory efficiency by using a generator instead of an intermediate list for pairs.

Comment on lines +191 to +203
    @model_validator(mode="after")
    def _validate_different_departments(self) -> Self:
        """Reject escalations within the same department."""
        if (
            self.from_department.strip().casefold()
            == self.to_department.strip().casefold()
        ):
            msg = (
                f"Escalation must be between different departments: "
                f"{self.from_department!r} == {self.to_department!r}"
            )
            raise ValueError(msg)
        return self
Contributor


medium

This validation logic is identical to the one in WorkflowHandoff (lines 154-166). To adhere to the DRY (Don't Repeat Yourself) principle, consider extracting this into a shared mixin class or a standalone validator function that both models can use. This will improve maintainability.
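
One way to realize this suggestion is a standalone helper that both models call from their `@model_validator(mode="after")` hooks; the function name below is hypothetical:

```python
def ensure_cross_department(from_department: str, to_department: str) -> None:
    """Raise if both sides resolve to the same department (case/whitespace-insensitive).

    Hypothetical shared helper: WorkflowHandoff and EscalationPath would each
    call this from their model validator instead of duplicating the comparison.
    """
    if from_department.strip().casefold() == to_department.strip().casefold():
        msg = (
            f"Must reference different departments: "
            f"{from_department!r} == {to_department!r}"
        )
        raise ValueError(msg)
```

A mixin base class achieves the same deduplication but couples the two models through inheritance; a plain function keeps them independent.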

Comment on lines +88 to +117
def compute_team_compatibility(
members: tuple[PersonalityConfig, ...],
) -> float:
"""Compute average pairwise compatibility for a team.

Args:
members: Tuple of personality profiles for team members.

Returns:
Average pairwise score (1.0 for teams with fewer than 2 members).
"""
if len(members) <= 1:
logger.debug(
PERSONALITY_TEAM_SCORE_COMPUTED,
team_size=len(members),
score=1.0,
)
return 1.0

pairs = list(itertools.combinations(members, 2))
total = sum(compute_compatibility(a, b) for a, b in pairs)
result = total / len(pairs)

logger.debug(
PERSONALITY_TEAM_SCORE_COMPUTED,
team_size=len(members),
pair_count=len(pairs),
score=result,
)
return result
Contributor


medium

This implementation can be made more memory-efficient by avoiding the creation of an intermediate list of all pairs, which can be large for teams with many members. The number of pairs can be calculated mathematically, and the sum can be performed over a generator.

def compute_team_compatibility(
    members: tuple[PersonalityConfig, ...],
) -> float:
    """Compute average pairwise compatibility for a team.

    Args:
        members: Tuple of personality profiles for team members.

    Returns:
        Average pairwise score (1.0 for teams with fewer than 2 members).
    """
    num_members = len(members)
    if num_members <= 1:
        logger.debug(
            PERSONALITY_TEAM_SCORE_COMPUTED,
            team_size=num_members,
            score=1.0,
        )
        return 1.0

    # Calculate pairs count mathematically and use a generator to avoid
    # storing all pairs in memory, which is more efficient for large teams.
    num_pairs = num_members * (num_members - 1) // 2
    total_score = sum(
        compute_compatibility(a, b) for a, b in itertools.combinations(members, 2)
    )
    result = total_score / num_pairs

    logger.debug(
        PERSONALITY_TEAM_SCORE_COMPUTED,
        team_size=num_members,
        pair_count=num_pairs,
        score=result,
    )
    return result


Copilot AI left a comment


Pull request overview

This PR expands the company template/config surface area to support richer department structures (reporting lines, policies, workflow handoffs, escalation paths) and significantly deepens agent personality modeling (Big Five dimensions, behavioral enums, presets, and compatibility scoring), while updating prompt rendering and documentation accordingly.

Changes:

  • Add new department internal-structure models and cross-department workflow models, and wire them through config + template rendering.
  • Extend PersonalityConfig with Big Five + behavioral enums, expand presets, and introduce compatibility scoring utilities + observability events.
  • Update prompt template/context to include new personality dimensions; update docs and add/extend unit tests.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/unit/templates/test_schema.py Adds template schema tests for inline personality, reporting lines, and policies defaults/acceptance.
tests/unit/templates/test_renderer.py Adds renderer-level tests for inline personality precedence, passthrough fields, and validation helpers.
tests/unit/templates/test_presets.py Adds preset validation tests (count, Big Five keys, and PersonalityConfig constructability).
tests/unit/observability/test_events.py Updates event module discovery list to include the new personality event domain.
tests/unit/engine/test_prompt.py Verifies new personality dimensions appear in system prompts; asserts prompt template version bump.
tests/unit/core/test_personality.py New tests for pairwise and team-level personality compatibility scoring behavior.
tests/unit/core/test_enums.py Adds coverage for newly introduced enums (decision-making, collaboration, verbosity, conflict, action types).
tests/unit/core/test_company.py Adds coverage for new company/department workflow and policy models and validators.
tests/unit/core/test_agent.py Adds coverage for Big Five defaults/bounds/NaN rejection and new enum defaults.
tests/unit/core/conftest.py Adds factories for new company policy/workflow models for test construction.
src/ai_company/templates/schema.py Extends template schema to accept inline personality, reporting lines/policies, workflow handoffs, escalation paths.
src/ai_company/templates/renderer.py Validates/threads new template fields into RootConfig-compatible dicts; adds inline personality validation.
src/ai_company/templates/presets.py Expands preset library, makes preset/name registries immutable, and adds logging for unknown presets.
src/ai_company/observability/events/personality.py Introduces personality-related event constants for scoring observability.
src/ai_company/engine/prompt_template.py Bumps template version and renders new personality dimensions into the prompt.
src/ai_company/engine/prompt.py Injects new personality dimensions into prompt context; refactors error prompt metadata construction.
src/ai_company/core/personality.py Adds compatibility scoring implementation for personality profiles and teams.
src/ai_company/core/enums.py Adds behavioral enums and ActionType enum used by policy/workflow features.
src/ai_company/core/company.py Adds reporting lines, department policies, workflow handoffs, escalation paths to core company models.
src/ai_company/core/agent.py Extends PersonalityConfig with Big Five floats, behavioral enums, max lengths, and NaN rejection.
src/ai_company/core/__init__.py Re-exports newly added core enums/models for public API convenience.
src/ai_company/config/schema.py Extends RootConfig to include workflow handoffs and escalation paths.
src/ai_company/config/defaults.py Adds defaults for new RootConfig list fields (handoffs/escalations).
DESIGN_SPEC.md Updates design spec with new personality dimensions, modules, and template examples.
CLAUDE.md Updates repository structure description for templates/presets scope.


class ActionType(StrEnum):
    """Convenience constants for common approval action types.

    Fields use ``NotBlankStr`` for extensibility — custom values are valid.

Copilot AI Mar 7, 2026


The ActionType docstring says “Fields use NotBlankStr for extensibility”, but this is an enum and has no fields using NotBlankStr. Consider rewording to clarify that models use NotBlankStr for action_type values (and ActionType is just a convenience set of common strings).

Suggested change
- Fields use ``NotBlankStr`` for extensibility — custom values are valid.
+ Models typically use ``NotBlankStr`` for ``action_type`` fields, so these
+ are optional helper constants and custom string values remain valid.

}

for key in ("workflow_handoffs", "escalation_paths"):
    if rendered_data.get(key):

Copilot AI Mar 7, 2026


rendered_data.get(key) is used as a truthiness guard. This will silently skip validation when the user explicitly provides a falsy-but-invalid value (e.g., ""), and it will also treat an explicit empty list as “not provided”. Prefer checking key presence / is not None and always running _validate_list when the field is present so invalid values can’t be masked by defaults.

Suggested change
- if rendered_data.get(key):
+ if key in rendered_data and rendered_data[key] is not None:

inline_personality = agent.get("personality")
preset_name = agent.get("personality_preset")
if preset_name:
if inline_personality and isinstance(inline_personality, dict):

Copilot AI Mar 7, 2026


Inline personality handling uses if inline_personality and isinstance(..., dict). This ignores explicit empty dicts (which should still validate) and also silently ignores invalid falsy values (e.g., empty string) instead of raising a template error. Consider branching on whether the key exists / value is not None, and raise TemplateRenderError when personality is provided but not a mapping.

Suggested change
- if inline_personality and isinstance(inline_personality, dict):
+ has_inline_personality = "personality" in agent and agent["personality"] is not None
+ if has_inline_personality:
+     if not isinstance(inline_personality, dict):
+         msg = (
+             f"Personality for agent {name!r} must be a mapping, "
+             f"got {type(inline_personality).__name__}"
+         )
+         raise TemplateRenderError(msg)

    """
    try:
        PersonalityConfig(**personality)
    except Exception as exc:

Copilot AI Mar 7, 2026


_validate_inline_personality catches Exception, which will also intercept KeyboardInterrupt/SystemExit and other unexpected exceptions. Narrow this to the expected validation failures (e.g., pydantic.ValidationError, TypeError, ValueError) and let genuinely unexpected exceptions propagate normally.

Suggested change
- except Exception as exc:
+ except (ValidationError, TypeError, ValueError) as exc:

Comment on lines +450 to +463
reporting_lines = dept.get("reporting_lines")
if reporting_lines:
    if not isinstance(reporting_lines, list):
        dept_name = dept.get("name", "")
        msg = f"Department {dept_name!r} 'reporting_lines' must be a list"
        raise TemplateRenderError(msg)
    dept_dict["reporting_lines"] = reporting_lines
policies = dept.get("policies")
if policies:
    if not isinstance(policies, dict):
        dept_name = dept.get("name", "")
        msg = f"Department {dept_name!r} 'policies' must be a mapping"
        raise TemplateRenderError(msg)
    dept_dict["policies"] = policies

Copilot AI Mar 7, 2026


Department extension validation is guarded by if reporting_lines: / if policies:. This means explicit falsy-but-invalid values (e.g., "", 0, [], {}) can bypass type validation and be silently dropped. Prefer validating whenever the key is present (or value is not None), and allow empty lists/dicts as valid values.

Comment on lines +99 to +114
if len(members) <= 1:
    logger.debug(
        PERSONALITY_TEAM_SCORE_COMPUTED,
        team_size=len(members),
        score=1.0,
    )
    return 1.0

pairs = list(itertools.combinations(members, 2))
total = sum(compute_compatibility(a, b) for a, b in pairs)
result = total / len(pairs)

logger.debug(
    PERSONALITY_TEAM_SCORE_COMPUTED,
    team_size=len(members),
    pair_count=len(pairs),
Copilot AI Mar 7, 2026

compute_team_compatibility materializes all pair combinations into a list solely to compute len(pairs) for logging. This adds unnecessary memory overhead; you can iterate the combinations generator directly and compute pair_count as n*(n-1)//2 (or increment a counter while summing) to keep it O(1) memory.

Suggested change
-if len(members) <= 1:
-    logger.debug(
-        PERSONALITY_TEAM_SCORE_COMPUTED,
-        team_size=len(members),
-        score=1.0,
-    )
-    return 1.0
-
-pairs = list(itertools.combinations(members, 2))
-total = sum(compute_compatibility(a, b) for a, b in pairs)
-result = total / len(pairs)
-
-logger.debug(
-    PERSONALITY_TEAM_SCORE_COMPUTED,
-    team_size=len(members),
-    pair_count=len(pairs),
+team_size = len(members)
+if team_size <= 1:
+    logger.debug(
+        PERSONALITY_TEAM_SCORE_COMPUTED,
+        team_size=team_size,
+        score=1.0,
+    )
+    return 1.0
+
+pair_count = team_size * (team_size - 1) // 2
+total = sum(
+    compute_compatibility(a, b)
+    for a, b in itertools.combinations(members, 2)
+)
+result = total / pair_count
+logger.debug(
+    PERSONALITY_TEAM_SCORE_COMPUTED,
+    team_size=team_size,
+    pair_count=pair_count,
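A standalone sketch of the O(1)-memory averaging the suggestion describes (`score` stands in for `compute_compatibility`; names are illustrative):

```python
import itertools


def mean_pairwise(members: list[float], score) -> float:
    # Average a pairwise score over all unordered pairs without
    # materializing the combinations into a list.
    team_size = len(members)
    if team_size <= 1:
        return 1.0
    pair_count = team_size * (team_size - 1) // 2  # C(n, 2), computed in O(1)
    total = sum(score(a, b) for a, b in itertools.combinations(members, 2))
    return total / pair_count


members = [0.1, 0.4, 0.9]
# The closed form matches what materializing the pairs would give.
assert len(list(itertools.combinations(members, 2))) == 3 * 2 // 2
```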
@greptile-apps

greptile-apps bot commented Mar 7, 2026

Greptile Summary

This PR successfully hardens the ai_company core by adding Big Five personality dimensions and behavioral enums to PersonalityConfig, introducing a full suite of company structure models (ReportingLine, ApprovalChain, DepartmentPolicies, WorkflowHandoff, EscalationPath), and implementing personality compatibility scoring using itertools.combinations. It fits naturally into the existing Pydantic-frozen model pattern and expands the template rendering pipeline to validate inline personality dicts eagerly and route workflow/escalation data through RootConfig.

Key changes:

  • PersonalityConfig gains 5 Big Five float fields (ge=0, le=1, allow_inf_nan=False) and 4 behavioral enum fields, with max_length on description and communication_style, and extra="forbid"
  • core/personality.py implements weighted pairwise and team compatibility scoring with itertools.combinations; all sub-scores are bounded and the result is clamped to [0, 1]
  • core/company.py adds 6 new frozen models covering department governance, all validated with case-insensitive uniqueness checks. Most validators correctly log before raising via logger.warning(COMPANY_VALIDATION_ERROR, ...), but three validators (ReportingLine._validate_not_self_report, Department._validate_unique_subordinates, and TemplateAgentConfig._validate_personality_mutual_exclusion) raise without logging, violating the CLAUDE.md convention
  • templates/presets.py expands from 4 to 15 presets, wraps both mappings in MappingProxyType, and adds import-time PersonalityConfig validation with proper guard against empty-dict edge cases
  • PROMPT_TEMPLATE_VERSION bumped to 1.1.0; four new personality dimensions are injected into system prompts
  • Exception handling in renderer.py improved to catch only specific exceptions (ValidationError, TypeError) rather than broad Exception, preventing masking of programming errors

All exception handling, immutability wrapping, and template routing have been properly implemented. The code passes 2590 tests at 96.36% coverage with strict mypy/ruff compliance.
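The log-before-raise convention flagged for the three validators could be sketched like this (stdlib `logging` stands in for the repo's `get_logger()`; the event constant and function shape are illustrative, not the repo's API):

```python
import logging

logger = logging.getLogger(__name__)

COMPANY_VALIDATION_ERROR = "company.validation.error"  # illustrative event name


def validate_not_self_report(subordinate: str, supervisor: str) -> None:
    # Case-insensitive comparison via casefold(), then a WARNING with
    # context *before* raising, per the CLAUDE.md convention the summary
    # says three new validators currently skip.
    if subordinate.strip().casefold() == supervisor.strip().casefold():
        logger.warning(
            "%s: subordinate=%r supervisor=%r",  # the repo's structured logger takes kwargs instead
            COMPANY_VALIDATION_ERROR,
            subordinate,
            supervisor,
        )
        msg = f"Subordinate {subordinate!r} cannot report to themselves"
        raise ValueError(msg)
```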

Confidence Score: 4/5

  • Safe to merge after adding logger.warning calls before raises in three new validators, consistent with the CLAUDE.md logging convention.
  • The PR is well-structured with 2590 passing tests, 96% coverage, and strict mypy/ruff compliance. All core logic (scoring math, validation rules, template rendering) is correct. The only recurring gap is that three new validators raise without logging (ReportingLine._validate_not_self_report, Department._validate_unique_subordinates, and TemplateAgentConfig._validate_personality_mutual_exclusion), violating the CLAUDE.md convention. Those three omissions are quick one-line fixes that do not affect correctness or security.
  • src/ai_company/core/company.py (lines 34–42 and 303–313) and src/ai_company/templates/schema.py (lines 119–127) — add logger.warning before each raise statement.

Class Diagram

%%{init: {'theme': 'neutral'}}%%
classDiagram
    class PersonalityConfig {
        +float openness
        +float conscientiousness
        +float extraversion
        +float agreeableness
        +float stress_response
        +DecisionMakingStyle decision_making
        +CollaborationPreference collaboration
        +CommunicationVerbosity verbosity
        +ConflictApproach conflict_approach
        +str description
        +str communication_style
    }

    class Company {
        +tuple~Department~ departments
        +tuple~WorkflowHandoff~ workflow_handoffs
        +tuple~EscalationPath~ escalation_paths
        +_validate_departments()
    }

    class Department {
        +NotBlankStr name
        +NotBlankStr head
        +float budget_percent
        +tuple~Team~ teams
        +tuple~ReportingLine~ reporting_lines
        +DepartmentPolicies policies
        +_validate_unique_team_names()
        +_validate_unique_subordinates()
    }

    class DepartmentPolicies {
        +ReviewRequirements review_requirements
        +tuple~ApprovalChain~ approval_chains
        +_validate_unique_action_types()
    }

    class ReportingLine {
        +NotBlankStr subordinate
        +NotBlankStr supervisor
        +_validate_not_self_report()
    }

    class ApprovalChain {
        +NotBlankStr action_type
        +tuple~NotBlankStr~ approvers
        +int min_approvals
        +_validate_approvers()
    }

    class ReviewRequirements {
        +int min_reviewers
        +tuple~NotBlankStr~ required_reviewer_roles
        +bool self_review_allowed
    }

    class WorkflowHandoff {
        +NotBlankStr from_department
        +NotBlankStr to_department
        +NotBlankStr trigger
        +tuple~NotBlankStr~ artifacts
        +_validate_different_departments()
    }

    class EscalationPath {
        +NotBlankStr from_department
        +NotBlankStr to_department
        +NotBlankStr condition
        +int priority_boost
        +_validate_different_departments()
    }

    Company "1" *-- "many" Department
    Company "1" *-- "many" WorkflowHandoff
    Company "1" *-- "many" EscalationPath
    Department "1" *-- "many" ReportingLine
    Department "1" *-- "1" DepartmentPolicies
    DepartmentPolicies "1" *-- "1" ReviewRequirements
    DepartmentPolicies "1" *-- "many" ApprovalChain

Last reviewed commit: 57f6d50


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/unit/observability/test_events.py (1)

93-112: 🧹 Nitpick | 🔵 Trivial

Add exact-value coverage for the new personality events.

This only proves the module is discoverable. A rename inside ai_company.observability.events.personality would still pass, unlike the other public event domains covered in this file.

💡 Proposed fix
 from ai_company.observability.events.prompt import PROMPT_BUILD_START
+from ai_company.observability.events.personality import (
+    PERSONALITY_COMPATIBILITY_COMPUTED,
+    PERSONALITY_TEAM_SCORE_COMPUTED,
+)
 from ai_company.observability.events.provider import (
     PROVIDER_CALL_START,
     PROVIDER_REGISTRY_BUILT,
 )
@@
     def test_all_domain_modules_discovered(self) -> None:
         """Every expected domain module is found by pkgutil discovery."""
         expected = {
             "budget",
             "config",
@@
         discovered = {info.name for info in pkgutil.iter_modules(events.__path__)}
         assert discovered == expected
+
+    def test_personality_events_exist(self) -> None:
+        assert PERSONALITY_COMPATIBILITY_COMPUTED == (
+            "personality.compatibility.computed"
+        )
+        assert PERSONALITY_TEAM_SCORE_COMPUTED == (
+            "personality.team_score.computed"
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/observability/test_events.py` around lines 93 - 112, The test
test_all_domain_modules_discovered only checks the personality module is
discoverable but not that its public event names match the expected API; import
ai_company.observability.events.personality (events.personality) in this test
and add an exact-value assertion that the module's exported event identifiers
(use the module's __all__ if present, otherwise filter dir(events.personality)
for public names) equal the expected set of personality event names so a rename
would fail the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/core/company.py`:
- Around line 409-416: The workflow_handoffs and escalation_paths validators
currently only check from_department != to_department and can accept misspelled
or unknown department names; update the Company model validation to first build
a normalized set of declared department names (e.g., lowercased/trimmed) from
the Company.departments list, then validate each WorkflowHandoff and
EscalationPath entry (referencing workflow_handoffs and escalation_paths) by
normalizing their from_department and to_department and rejecting entries whose
endpoints are not present in the normalized department set; also apply the same
normalization when checking for same-department and dedupe logic so comparisons
are consistent and unknown/misspelled departments are rejected.
- Around line 286-297: The validator _validate_unique_subordinates currently
compares raw subordinate strings so names like "Alice" and " alice " slip
through; normalize subordinate names (e.g., s.strip().casefold()) before
building subs and before passing to len(set(...)) and Counter(...) to detect
duplicates. Update the subs list creation (reporting_lines -> subordinate) to
use normalized values for comparison and duplicate detection, while optionally
mapping back to the original names for the error message; keep the function name
_validate_unique_subordinates, use reporting_lines and subordinate to locate the
code, and reuse Counter(...) on the normalized list.

In `@src/ai_company/templates/renderer.py`:
- Around line 417-421: The try/except around PersonalityConfig(**personality) is
too broad; replace the bare Exception catch with the specific ValidationError
(the alias already imported) so only pydantic validation failures are handled.
Update the except clause to "except ValidationError as exc:" and keep raising
TemplateRenderError(msg) from exc, preserving the constructed msg that includes
agent_name and the original exception details.
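The normalization the first two prompts ask for could be sketched in one place (standalone; function names are illustrative, not the repo's API):

```python
from collections import Counter


def _norm(name: str) -> str:
    # Shared normalization: trim whitespace, then casefold for
    # case-insensitive comparison ("Alice" == " alice ").
    return name.strip().casefold()


def validate_handoff_endpoints(
    departments: list[str], handoffs: list[tuple[str, str]]
) -> None:
    # Build the normalized set of declared department names once, then check
    # every endpoint against it with the same normalization, so misspelled or
    # unknown departments are rejected alongside same-department handoffs.
    known = {_norm(d) for d in departments}
    for src, dst in handoffs:
        if _norm(src) == _norm(dst):
            raise ValueError(f"Handoff may not stay within department {src!r}")
        for name in (src, dst):
            if _norm(name) not in known:
                raise ValueError(f"Unknown department {name!r} in handoff")


def find_duplicate_subordinates(subordinates: list[str]) -> list[str]:
    # Normalize before counting so near-duplicates collide; the caller can
    # map the normalized names back to originals for the error message.
    counts = Counter(_norm(s) for s in subordinates)
    return [name for name, n in counts.items() if n > 1]
```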

---

Outside diff comments:
In `@tests/unit/observability/test_events.py`:
- Around line 93-112: The test test_all_domain_modules_discovered only checks
the personality module is discoverable but not that its public event names match
the expected API; import ai_company.observability.events.personality
(events.personality) in this test and add an exact-value assertion that the
module's exported event identifiers (use the module's __all__ if present,
otherwise filter dir(events.personality) for public names) equal the expected
set of personality event names so a rename would fail the test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: bbdcc797-2948-4a43-b1b4-880c6a0842f7

📥 Commits

Reviewing files that changed from the base of the PR and between f566fb4 and 62044ae.

📒 Files selected for processing (25)
  • CLAUDE.md
  • DESIGN_SPEC.md
  • src/ai_company/config/defaults.py
  • src/ai_company/config/schema.py
  • src/ai_company/core/__init__.py
  • src/ai_company/core/agent.py
  • src/ai_company/core/company.py
  • src/ai_company/core/enums.py
  • src/ai_company/core/personality.py
  • src/ai_company/engine/prompt.py
  • src/ai_company/engine/prompt_template.py
  • src/ai_company/observability/events/personality.py
  • src/ai_company/templates/presets.py
  • src/ai_company/templates/renderer.py
  • src/ai_company/templates/schema.py
  • tests/unit/core/conftest.py
  • tests/unit/core/test_agent.py
  • tests/unit/core/test_company.py
  • tests/unit/core/test_enums.py
  • tests/unit/core/test_personality.py
  • tests/unit/engine/test_prompt.py
  • tests/unit/observability/test_events.py
  • tests/unit/templates/test_presets.py
  • tests/unit/templates/test_renderer.py
  • tests/unit/templates/test_schema.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
Use except A, B: syntax (no parentheses) for exception handling — ruff enforces this on Python 3.14
Include type hints on all public functions; enforce with mypy strict mode
Use Google-style docstrings on all public classes and functions — enforced by ruff D rules
Use immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections, use copy.deepcopy() at construction + MappingProxyType wrapping. For dict/list fields in frozen Pydantic models, rely on frozen=True and copy.deepcopy() at system boundaries
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use @computed_field for derived values; use NotBlankStr from core.types for all identifier/name fields
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code; prefer structured concurrency over bare create_task
Maintain line length of 88 characters — enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate input at system boundaries: user input, external APIs, and config files

Files:

  • src/ai_company/core/enums.py
  • tests/unit/templates/test_renderer.py
  • tests/unit/core/test_personality.py
  • src/ai_company/observability/events/personality.py
  • tests/unit/engine/test_prompt.py
  • src/ai_company/templates/renderer.py
  • tests/unit/core/conftest.py
  • tests/unit/core/test_enums.py
  • src/ai_company/config/schema.py
  • src/ai_company/core/agent.py
  • src/ai_company/engine/prompt.py
  • tests/unit/templates/test_presets.py
  • src/ai_company/config/defaults.py
  • src/ai_company/core/company.py
  • tests/unit/core/test_company.py
  • src/ai_company/templates/presets.py
  • tests/unit/observability/test_events.py
  • src/ai_company/engine/prompt_template.py
  • src/ai_company/core/personality.py
  • tests/unit/templates/test_schema.py
  • src/ai_company/templates/schema.py
  • tests/unit/core/test_agent.py
  • src/ai_company/core/__init__.py
src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code — use get_logger() from observability module
Use variable name logger (not _logger, not log) for logging instances
Always use event name constants from ai_company.observability.events domain-specific modules; import directly (e.g., from ai_company.observability.events.<domain> import EVENT_CONSTANT)
Use structured logging format: logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG level for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging

Files:

  • src/ai_company/core/enums.py
  • src/ai_company/observability/events/personality.py
  • src/ai_company/templates/renderer.py
  • src/ai_company/config/schema.py
  • src/ai_company/core/agent.py
  • src/ai_company/engine/prompt.py
  • src/ai_company/config/defaults.py
  • src/ai_company/core/company.py
  • src/ai_company/templates/presets.py
  • src/ai_company/engine/prompt_template.py
  • src/ai_company/core/personality.py
  • src/ai_company/templates/schema.py
  • src/ai_company/core/__init__.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names like example-provider, example-large-001, large/medium/small as aliases, or test-provider, test-small-001 in tests

Files:

  • src/ai_company/core/enums.py
  • src/ai_company/observability/events/personality.py
  • src/ai_company/templates/renderer.py
  • src/ai_company/config/schema.py
  • src/ai_company/core/agent.py
  • src/ai_company/engine/prompt.py
  • src/ai_company/config/defaults.py
  • src/ai_company/core/company.py
  • src/ai_company/templates/presets.py
  • src/ai_company/engine/prompt_template.py
  • src/ai_company/core/personality.py
  • src/ai_company/templates/schema.py
  • src/ai_company/core/__init__.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Maintain 80% minimum code coverage — enforced in CI
Use asyncio_mode = "auto" in pytest configuration — no manual @pytest.mark.asyncio needed
Set 30-second timeout per test
Use pytest-xdist via -n auto for parallel test execution
Prefer @pytest.mark.parametrize for testing similar cases

Files:

  • tests/unit/templates/test_renderer.py
  • tests/unit/core/test_personality.py
  • tests/unit/engine/test_prompt.py
  • tests/unit/core/conftest.py
  • tests/unit/core/test_enums.py
  • tests/unit/templates/test_presets.py
  • tests/unit/core/test_company.py
  • tests/unit/observability/test_events.py
  • tests/unit/templates/test_schema.py
  • tests/unit/core/test_agent.py
src/ai_company/{providers,engine}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains

Files:

  • src/ai_company/engine/prompt.py
  • src/ai_company/engine/prompt_template.py
🧠 Learnings (2)
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from `ai_company.observability.events` domain-specific modules; import directly (e.g., `from ai_company.observability.events.<domain> import EVENT_CONSTANT`)

Applied to files:

  • src/ai_company/observability/events/personality.py
  • tests/unit/observability/test_events.py
  • src/ai_company/core/__init__.py
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves

Applied to files:

  • src/ai_company/core/agent.py
🧬 Code graph analysis (13)
tests/unit/templates/test_renderer.py (4)
src/ai_company/templates/renderer.py (4)
  • _expand_single_agent (361-401)
  • _build_departments (424-465)
  • _build_config_dict (245-296)
  • _validate_list (299-317)
src/ai_company/core/enums.py (1)
  • CompanyType (72-82)
src/ai_company/templates/schema.py (3)
  • CompanyTemplate (190-293)
  • TemplateAgentConfig (77-116)
  • TemplateMetadata (156-187)
src/ai_company/templates/errors.py (1)
  • TemplateRenderError (14-20)
tests/unit/core/test_personality.py (3)
src/ai_company/core/agent.py (1)
  • PersonalityConfig (26-139)
src/ai_company/core/enums.py (2)
  • CollaborationPreference (244-249)
  • ConflictApproach (260-267)
src/ai_company/core/personality.py (2)
  • compute_compatibility (56-85)
  • compute_team_compatibility (88-117)
tests/unit/engine/test_prompt.py (4)
src/ai_company/core/enums.py (5)
  • CollaborationPreference (244-249)
  • CommunicationVerbosity (252-257)
  • ConflictApproach (260-267)
  • CreativityLevel (40-45)
  • DecisionMakingStyle (235-241)
tests/unit/engine/conftest.py (1)
  • sample_agent_with_personality (52-79)
src/ai_company/core/agent.py (3)
  • AgentIdentity (263-321)
  • ModelConfig (162-191)
  • PersonalityConfig (26-139)
src/ai_company/engine/prompt.py (1)
  • build_system_prompt (143-209)
src/ai_company/templates/renderer.py (3)
src/ai_company/core/agent.py (1)
  • PersonalityConfig (26-139)
src/ai_company/templates/presets.py (1)
  • get_personality_preset (291-313)
src/ai_company/templates/errors.py (1)
  • TemplateRenderError (14-20)
tests/unit/core/conftest.py (1)
src/ai_company/core/company.py (7)
  • ApprovalChain (67-98)
  • Department (237-297)
  • DepartmentPolicies (101-128)
  • EscalationPath (169-203)
  • HRRegistry (333-368)
  • ReportingLine (16-38)
  • ReviewRequirements (41-64)
tests/unit/core/test_enums.py (1)
src/ai_company/core/enums.py (5)
  • ActionType (270-281)
  • CollaborationPreference (244-249)
  • CommunicationVerbosity (252-257)
  • ConflictApproach (260-267)
  • DecisionMakingStyle (235-241)
src/ai_company/config/schema.py (1)
src/ai_company/core/company.py (2)
  • EscalationPath (169-203)
  • WorkflowHandoff (134-166)
src/ai_company/core/agent.py (1)
src/ai_company/core/enums.py (4)
  • CollaborationPreference (244-249)
  • CommunicationVerbosity (252-257)
  • ConflictApproach (260-267)
  • DecisionMakingStyle (235-241)
tests/unit/templates/test_presets.py (2)
src/ai_company/core/agent.py (1)
  • PersonalityConfig (26-139)
src/ai_company/templates/presets.py (1)
  • get_personality_preset (291-313)
src/ai_company/templates/presets.py (1)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/core/personality.py (3)
src/ai_company/core/enums.py (2)
  • CollaborationPreference (244-249)
  • ConflictApproach (260-267)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/core/agent.py (1)
  • PersonalityConfig (26-139)
tests/unit/templates/test_schema.py (2)
src/ai_company/templates/schema.py (3)
  • TemplateAgentConfig (77-116)
  • TemplateDepartmentConfig (119-153)
  • CompanyTemplate (190-293)
tests/unit/templates/conftest.py (1)
  • make_template_dict (123-125)
tests/unit/core/test_agent.py (2)
src/ai_company/core/enums.py (6)
  • CollaborationPreference (244-249)
  • CommunicationVerbosity (252-257)
  • ConflictApproach (260-267)
  • CreativityLevel (40-45)
  • DecisionMakingStyle (235-241)
  • RiskTolerance (32-37)
src/ai_company/core/agent.py (1)
  • PersonalityConfig (26-139)
🔇 Additional comments (70)
src/ai_company/engine/prompt_template.py (1)

13-13: Good template contract update.

The version bump is paired with the new personality fields, so downstream code can detect the prompt-shape change cleanly.

Also applies to: 89-95

src/ai_company/engine/prompt.py (1)

304-307: Nice consolidation.

Wiring the new personality enum values through _build_core_context() and reusing _build_metadata() in the error path keeps the prompt contract centralized instead of duplicating field mappings.

Also applies to: 652-659

src/ai_company/observability/events/personality.py (1)

1-6: LGTM.

The new domain module keeps personality events discoverable, and the constant values match the existing dot-separated naming scheme.

tests/unit/templates/test_presets.py (1)

44-67: LGTM!

The new tests comprehensively validate the expanded personality presets:

  • test_preset_count_at_least_15 ensures the preset library meets minimum requirements.
  • test_all_presets_produce_valid_personality_config validates schema compatibility.
  • test_presets_include_big_five ensures all presets include the required Big Five dimensions.

Good use of the existing test class structure and consistent assertion messages.

tests/unit/templates/test_schema.py (3)

99-109: LGTM!

Good coverage for the new personality field in TemplateAgentConfig:

  • Verifies inline personality dict acceptance.
  • Confirms default None value when not provided.

148-168: LGTM!

Thorough test coverage for new TemplateDepartmentConfig fields:

  • reporting_lines defaults to empty tuple.
  • policies defaults to None.
  • Both fields accept valid input correctly.

353-382: LGTM!

Good coverage for CompanyTemplate workflow fields:

  • workflow_handoffs and escalation_paths default to empty tuples.
  • workflow_handoffs accepts valid handoff dicts.
src/ai_company/core/enums.py (1)

233-281: LGTM!

The new enums are well-designed:

  • Consistent with existing StrEnum patterns in this file.
  • Docstrings follow Google-style conventions.
  • ActionType correctly documents extensibility via NotBlankStr.
  • All values use lowercase snake_case consistently.
src/ai_company/templates/schema.py (3)

90-112: LGTM!

The personality field addition is well-integrated:

  • Provides an inline alternative to personality_preset.
  • Uses dict[str, Any] | None appropriately for raw config that will be validated during rendering.
  • Docstring updated to reflect the new field.

122-153: LGTM!

Good additions to TemplateDepartmentConfig:

  • reporting_lines and policies fields enable richer department structure configuration.
  • Default values (empty tuple, None) are sensible.
  • Docstring accurately reflects the new attributes.

211-255: LGTM!

The workflow_handoffs and escalation_paths fields complete the cross-department workflow support at the template level:

  • Consistent with the existing pattern of using tuple[dict[str, Any], ...] for template-level raw config.
  • Defaults to empty tuples appropriately.
  • Docstring updated to document new attributes.
tests/unit/core/test_enums.py (3)

6-16: LGTM!

Imports for new enums are correctly added and alphabetically ordered.


84-97: LGTM!

Member count tests correctly validate the expected number of members for each new enum. This helps catch accidental additions or removals.


168-235: LGTM!

Excellent use of @pytest.mark.parametrize for testing enum string values, as recommended by coding guidelines. Tests cover all members of each new enum with explicit value assertions.

tests/unit/core/test_personality.py (1)

1-145: LGTM!

Comprehensive test suite for personality compatibility scoring:

Pairwise tests cover:

  • Identical profiles scoring high
  • Opposing profiles scoring low
  • Symmetry/commutativity
  • Range bounds validation
  • Constructive vs. destructive conflict approaches
  • Same vs. opposite collaboration preferences
  • Default profile compatibility

Team tests cover:

  • Single-member and empty team edge cases
  • Identical member teams
  • Score range validation
  • Verification that team scoring averages all pairwise combinations

Good use of appropriate floating-point comparison tolerance at line 145.

src/ai_company/config/schema.py (2)

10-15: LGTM!

Correct import of EscalationPath and WorkflowHandoff typed models from the core domain, enabling proper type checking and validation for the config schema.


379-380: LGTM!

Good addition of cross-department workflow fields to RootConfig:

  • Uses typed models (WorkflowHandoff, EscalationPath) rather than raw dicts, ensuring proper validation including same-department rejection.
  • Default empty tuples are appropriate.
  • Docstring updated to document the new fields.

Also applies to: 432-439

tests/unit/engine/test_prompt.py (3)

11-18: LGTM!

Correct imports for the new personality dimension enums needed by the new tests.


230-265: LGTM!

Comprehensive tests for new personality dimensions in prompts:

  • test_new_personality_dimensions_in_prompt validates that default dimension values appear in rendered prompts.
  • test_new_personality_dimensions_with_custom_values verifies that explicitly set dimension values render correctly.

Both tests properly use the new enums (CommunicationVerbosity, DecisionMakingStyle, CollaborationPreference, ConflictApproach).


465-468: LGTM!

Explicit version assertion ensures the prompt template version is correctly bumped to "1.1.0" as part of the personality dimensions feature.

tests/unit/core/test_agent.py (7)

19-27: LGTM!

New enum imports are correctly added to support the extended PersonalityConfig tests.


100-107: LGTM!

Test correctly verifies that all Big Five dimensions default to 0.5, aligning with the model definition in src/ai_company/core/agent.py.


109-124: LGTM!

Good use of @pytest.mark.parametrize to test boundary conditions (0.0 and 1.0) for all Big Five dimensions.


126-154: LGTM!

Proper validation tests for out-of-bounds values on Big Five dimensions.


156-175: LGTM!

Tests for extended behavioral enums correctly verify both default values and custom value assignment.


177-200: LGTM!

NaN rejection tests and max_length validation tests provide essential coverage for the new field constraints.


202-212: LGTM!

Backward compatibility test ensures that existing code constructing PersonalityConfig without new fields continues to work with expected defaults.

src/ai_company/templates/renderer.py (4)

41-45: LGTM!

Good use of internal constants for default values, improving maintainability and eliminating magic strings.


277-296: LGTM!

Clean handling of optional workflow_handoffs and escalation_paths fields via the existing _validate_list helper.


383-401: LGTM!

Inline personality now correctly takes precedence over personality_preset, with proper validation via _validate_inline_personality.


450-463: LGTM!

Proper type validation for reporting_lines (list) and policies (dict) with clear error messages including the department name.

tests/unit/core/test_company.py (9)

7-17: LGTM!

New domain model imports are correctly added to support the expanded test coverage.


437-456: LGTM!

Tests for Company with workflow_handoffs and escalation_paths properly verify the new cross-department coordination features.


462-486: LGTM!

ReportingLine tests cover valid construction, self-report rejection (including case-insensitive check), and immutability.


492-511: LGTM!

ReviewRequirements tests verify defaults and custom value handling.


517-558: LGTM!

ApprovalChain tests comprehensively cover validation rules including empty approvers rejection, min_approvals bounds, and immutability.


564-582: LGTM!

DepartmentPolicies tests verify defaults and unique action_type validation across approval chains.


588-619: LGTM!

WorkflowHandoff tests cover valid construction, same-department rejection (including case-insensitive check), and artifacts handling.


625-692: LGTM!

EscalationPath tests comprehensively cover valid construction, priority_boost boundaries, same-department rejection, and case-insensitive validation.


698-738: LGTM!

TestDepartmentExtended properly tests the new reporting_lines and policies fields on Department, including backward compatibility and duplicate subordinate rejection.

src/ai_company/templates/presets.py (5)

1-5: LGTM!

Docstring correctly updated to reflect the expanded preset count and new dimensions.


8-14: LGTM!

Proper imports for MappingProxyType (immutability) and the observability event constant.


19-262: LGTM!

Well-structured personality presets with comprehensive Big Five dimensions and behavioral enums, wrapped in MappingProxyType for immutability as per coding guidelines.


265-288: LGTM!

Auto-name pools wrapped in MappingProxyType for immutability.


307-312: LGTM!

Logging the warning with structured event and context before raising KeyError improves observability for debugging preset lookup failures.

tests/unit/templates/test_renderer.py (4)

295-317: LGTM!

Test correctly verifies that inline personality dict takes precedence over preset when both are provided.


320-384: LGTM!

Comprehensive passthrough tests for reporting_lines, policies, and workflow_handoffs verify that these fields are correctly propagated through the rendering pipeline.


387-399: LGTM!

Test verifies that unknown personality_preset raises TemplateRenderError with appropriate message.


402-418: LGTM!

Error handling tests for _validate_list verify proper exceptions for non-list and non-dict-item inputs.

src/ai_company/core/personality.py (6)

1-21: LGTM!

Module structure follows coding guidelines with proper imports, logger setup, and TYPE_CHECKING pattern for PersonalityConfig.


23-53: LGTM!

Weight constants are well-documented and sum correctly (Big Five weights: 1.0, component weights: 1.0). Immutable data structures (MappingProxyType, frozenset) are properly used.


56-85: LGTM!

compute_compatibility correctly combines weighted component scores with proper clamping and structured debug logging.


88-117: LGTM!

compute_team_compatibility correctly handles edge cases (0 or 1 members return 1.0) and uses itertools.combinations for pairwise scoring as intended by the PR refactoring objectives.
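The combinations-based averaging is easy to check in isolation. A minimal, library-free sketch (the pair scorer here is a toy stand-in; the real one weighs Big Five, collaboration, and conflict components):

```python
from itertools import combinations


def team_compatibility(members: list[float], pair_score) -> float:
    """Average pairwise compatibility; teams of 0 or 1 are trivially compatible."""
    if len(members) <= 1:
        return 1.0
    pairs = list(combinations(members, 2))
    return sum(pair_score(a, b) for a, b in pairs) / len(pairs)


def toy_pair_score(a: float, b: float) -> float:
    # Toy scorer: closeness of a single numeric trait in [0, 1].
    return 1.0 - abs(a - b)


print(round(team_compatibility([0.2, 0.4, 0.6], toy_pair_score), 4))
```

For n members this scores n·(n−1)/2 pairs, which is exactly what the replaced `range(len())` nested loops computed, minus the index bookkeeping.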


123-150: LGTM!

Big Five scoring logic is sound. The extraversion tent function correctly peaks at 0.3 difference, reflecting that moderate extraversion differences are ideal for balanced teams.
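As a sanity check on that shape, one way such a tent function might look — a sketch only; the exact slopes in personality.py may differ, but the peak at a 0.3 difference is the point:

```python
def extraversion_tent(diff: float, optimal_diff: float = 0.3) -> float:
    """Score peaks at 1.0 when the extraversion difference equals optimal_diff.

    Rises linearly from 0 for identical extraversion, then falls linearly
    back to 0 at the maximum possible difference of 1.0.
    """
    if diff <= optimal_diff:
        return diff / optimal_diff
    return max(0.0, 1.0 - (diff - optimal_diff) / (1.0 - optimal_diff))
```

Identical extraversion scores 0 under this shape, encoding the claim that a moderate spread of extraversion is better for a team than uniformity.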


153-175: LGTM!

Collaboration and conflict scoring helpers implement reasonable compatibility heuristics with clear adjacency and pair-based logic.

src/ai_company/core/agent.py (7)

3-8: LGTM!

Proper imports for math (NaN checking) and field_validator (Big Five validation).


12-16: LGTM!

New behavioral enum imports correctly added from ai_company.core.enums.


26-48: LGTM!

Docstring comprehensively updated to document all new Big Five dimensions and behavioral enum attributes.


56-73: LGTM!

max_length constraints added for communication_style (100) and description (500) to prevent unbounded string fields.


75-105: LGTM!

Big Five dimensions properly defined with ge=0.0, le=1.0 constraints and sensible defaults of 0.5. Based on learnings: frozen Pydantic models are used for config/identity.


107-123: LGTM!

Behavioral enum fields added with appropriate defaults that represent neutral/collaborative team values.


125-139: LGTM!

NaN rejection validator correctly uses mode="after" to run after Pydantic's built-in ge/le validation, ensuring NaN values are explicitly rejected for all Big Five dimensions.
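The reason the explicit check earns its keep: NaN compares false against every bound, so a plain range check trips on the bound rather than naming the real problem. A library-free sketch of the validator's logic (the actual code is a Pydantic field_validator with mode="after"):

```python
import math

nan = float("nan")
# NaN fails both bound comparisons, so "0.0 <= value <= 1.0" would report a
# range violation instead of the real problem.
assert not nan >= 0.0 and not nan <= 1.0


def check_dimension(value: float) -> float:
    """Reject NaN with a domain-specific message before range-checking."""
    if math.isnan(value):
        raise ValueError("NaN is not allowed for Big Five dimensions")
    if not 0.0 <= value <= 1.0:
        raise ValueError("Big Five dimensions must be in [0.0, 1.0]")
    return value
```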

tests/unit/core/conftest.py (4)

19-29: LGTM!

New domain model imports correctly added to support the new factory definitions.


104-122: LGTM!

Factories for ReportingLine, ReviewRequirements, ApprovalChain, and DepartmentPolicies are correctly defined with defaults that satisfy model validators (e.g., different subordinate/supervisor names, non-empty approvers).


125-128: LGTM!

DepartmentFactory correctly extended with policies field wired to DepartmentPoliciesFactory.


139-148: LGTM!

WorkflowHandoffFactory and EscalationPathFactory correctly use different department names (engineering→qa, engineering→executive) to satisfy the same-department validation rules.

DESIGN_SPEC.md (4)

156-164: Clear separation between internal scoring traits and prompt-facing behaviors.

Good clarification. The two-tier model is explicit, and the YAML example stays consistent with it.

Also applies to: 185-195


2181-2181: Preset rename improves the template example.

strategic_planner fits the richer personality model better and keeps the startup template aligned with the updated preset catalog.


2321-2322: The new personality surface is easier to discover now.

Adding these modules to the project tree makes the public personality and observability entry points much easier to find from the spec.

Also applies to: 2364-2364, 2457-2457


2509-2509: The scoring convention is specific enough to implement and test.

Capturing the weighting model and team-averaging rule here removes ambiguity for future implementations.

Aureliolo and others added 2 commits March 7, 2026 17:47
…iewers

Review findings from 8 local agents (code-reviewer, pr-test-analyzer,
silent-failure-hunter, comment-analyzer, type-design-analyzer,
logging-audit, docs-consistency, issue-resolution-verifier) and 3
external reviewers (Gemini, Copilot, Greptile).

Key changes:
- Add allow_inf_nan=False to PersonalityConfig for numeric consistency
- Extract shared _reject_same_department() to DRY duplicate validation
- Replace list materialization with generator in compatibility scoring
- Fix dict mutation in build_error_prompt metadata construction
- Add startup validation loop for personality presets
- Add role validation and explicit None checks in template renderer
- Narrow exception catches from Exception to (ValidationError, TypeError)
- Add logging before raises in structural validation paths
- Add mutual exclusion validator for personality_preset vs inline personality
- Fix ActionType docstring accuracy
- Add comprehensive test coverage for new validation paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…date dept refs

- Normalize subordinate names with strip().casefold() in duplicate detection
- Validate workflow_handoff and escalation_path department references against
  declared departments, rejecting unknown department names
- Normalize department name comparison for case-insensitive dedup
- Update CompanyFactory to set empty handoffs/escalations (matches empty depts)
- Add tests for case-insensitive subordinate dedup, unknown dept rejection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 7, 2026 16:51

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (2)
src/ai_company/core/company.py (2)

288-299: ⚠️ Potential issue | 🟠 Major

Normalize subordinate names before duplicate detection.

This validator compares raw strings while ReportingLine._validate_not_self_report uses strip().casefold(). Names like "Alice" and " alice " bypass the duplicate guard, potentially allowing the same subordinate to have multiple supervisors.

💡 Proposed fix
     @model_validator(mode="after")
     def _validate_unique_subordinates(self) -> Self:
         """Ensure no duplicate subordinates in reporting lines."""
-        subs = [r.subordinate for r in self.reporting_lines]
+        subs = [r.subordinate.strip().casefold() for r in self.reporting_lines]
         if len(subs) != len(set(subs)):
             dupes = sorted(s for s, c in Counter(subs).items() if c > 1)
             msg = (
                 f"Duplicate subordinates in reporting lines "
                 f"for department {self.name!r}: {dupes}"
             )
             raise ValueError(msg)
         return self
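The normalization in the proposed fix can be exercised standalone. A minimal sketch of the duplicate check (function name is illustrative):

```python
from collections import Counter


def find_duplicate_subordinates(subordinates: list[str]) -> list[str]:
    """Normalize with strip().casefold() so 'Alice' and ' alice ' collide."""
    normalized = [s.strip().casefold() for s in subordinates]
    return sorted(s for s, c in Counter(normalized).items() if c > 1)


print(find_duplicate_subordinates(["Alice", " alice ", "Bob"]))  # ['alice']
```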
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/core/company.py` around lines 288 - 299, The duplicate
detection in _validate_unique_subordinates is comparing raw subordinate strings
but ReportingLine._validate_not_self_report normalizes names with
strip().casefold(); update _validate_unique_subordinates to normalize each
subordinate before checking duplicates (e.g., build subs =
[r.subordinate.strip().casefold() for r in self.reporting_lines]) so "Alice" and
" alice " are treated the same, then compute dupes from the normalized list and
include the original department name (self.name) in the raised ValueError as
before.

420-438: ⚠️ Potential issue | 🟠 Major

Validate handoff/escalation endpoints against known departments.

The _validate_departments validator ensures unique department names and budget limits, but does not verify that workflow_handoffs and escalation_paths reference departments that actually exist. A typo like "enginering" would create dangling routes without raising an error.

💡 Proposed fix
     @model_validator(mode="after")
     def _validate_departments(self) -> Self:
         """Validate department names are unique and budgets do not exceed 100%."""
         # Unique department names
-        names = [d.name for d in self.departments]
+        names = [d.name.strip().casefold() for d in self.departments]
         if len(names) != len(set(names)):
             dupes = sorted(n for n, c in Counter(names).items() if c > 1)
             msg = f"Duplicate department names: {dupes}"
             raise ValueError(msg)
 
+        known_departments = set(names)
+        for handoff in self.workflow_handoffs:
+            from_dept = handoff.from_department.strip().casefold()
+            to_dept = handoff.to_department.strip().casefold()
+            if from_dept not in known_departments or to_dept not in known_departments:
+                msg = (
+                    f"Workflow handoff references unknown department: "
+                    f"{handoff.from_department!r} -> {handoff.to_department!r}"
+                )
+                raise ValueError(msg)
+        for escalation in self.escalation_paths:
+            from_dept = escalation.from_department.strip().casefold()
+            to_dept = escalation.to_department.strip().casefold()
+            if from_dept not in known_departments or to_dept not in known_departments:
+                msg = (
+                    f"Escalation path references unknown department: "
+                    f"{escalation.from_department!r} -> {escalation.to_department!r}"
+                )
+                raise ValueError(msg)
+
         # Budget sum
         max_budget_percent = 100.0
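The endpoint check in the proposed fix reduces to a set-membership test over normalized names. A standalone sketch (helper name is illustrative):

```python
def find_unknown_endpoints(
    routes: list[tuple[str, str]], departments: list[str]
) -> list[str]:
    """Return route endpoints that match no declared department."""
    known = {d.strip().casefold() for d in departments}
    return [
        name
        for from_dept, to_dept in routes
        for name in (from_dept, to_dept)
        if name.strip().casefold() not in known
    ]


# The typo "enginering" is caught instead of creating a dangling route.
print(find_unknown_endpoints([("enginering", "QA")], ["Engineering", "QA"]))
```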
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/core/company.py` around lines 420 - 438, The
_validate_departments model_validator currently checks unique names and budget
totals but does not ensure that any department names referenced by
workflow_handoffs or escalation_paths actually exist; to fix, after computing
names = [d.name for d in self.departments] add validation that every department
name used in self.workflow_handoffs (both source and target/step entries) and
every name in self.escalation_paths is present in the names set, collecting any
unknown names and raising ValueError listing them (include a clear message like
"Unknown department references in workflow_handoffs/escalation_paths: [...]");
perform this check inside _validate_departments so typos like "enginering" cause
validation failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/templates/renderer.py`:
- Around line 466-493: Add logging calls before raising TemplateRenderError for
the two type checks: when reporting_lines is not a list and when policies is not
a dict. Use the existing logger and TEMPLATE_RENDER_VARIABLE_ERROR (or
logger.error/warning as appropriate) and include context fields like
department=dept_name, field="reporting_lines" or field="policies", and a short
detail describing the expected type (e.g., "must be a list" / "must be a
mapping") so the log contains the same context as the raised
TemplateRenderError.

In `@tests/unit/templates/test_schema.py`:
- Around line 361-391: Add a missing unit test that mirrors
test_workflow_handoffs_accepted for escalation_paths: create a new test function
(e.g., test_escalation_paths_accepted) that uses make_template_dict to construct
a CompanyTemplate with a non-empty escalation_paths tuple and assert its length
is 1; reference the existing CompanyTemplate, make_template_dict, and
escalation_paths symbols so the test mirrors test_workflow_handoffs_accepted and
complements test_escalation_paths_default_empty.

---

Duplicate comments:
In `@src/ai_company/core/company.py`:
- Around line 288-299: The duplicate detection in _validate_unique_subordinates
is comparing raw subordinate strings but ReportingLine._validate_not_self_report
normalizes names with strip().casefold(); update _validate_unique_subordinates
to normalize each subordinate before checking duplicates (e.g., build subs =
[r.subordinate.strip().casefold() for r in self.reporting_lines]) so "Alice" and
" alice " are treated the same, then compute dupes from the normalized list and
include the original department name (self.name) in the raised ValueError as
before.
- Around line 420-438: The _validate_departments model_validator currently
checks unique names and budget totals but does not ensure that any department
names referenced by workflow_handoffs or escalation_paths actually exist; to
fix, after computing names = [d.name for d in self.departments] add validation
that every department name used in self.workflow_handoffs (both source and
target/step entries) and every name in self.escalation_paths is present in the
names set, collecting any unknown names and raising ValueError listing them
(include a clear message like "Unknown department references in
workflow_handoffs/escalation_paths: [...]"); perform this check inside
_validate_departments so typos like "enginering" cause validation failures.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b4a39e1e-f914-4b0a-9ede-4f236e9a245a

📥 Commits

Reviewing files that changed from the base of the PR and between 62044ae and 058b885.

📒 Files selected for processing (12)
  • src/ai_company/core/agent.py
  • src/ai_company/core/company.py
  • src/ai_company/core/enums.py
  • src/ai_company/core/personality.py
  • src/ai_company/engine/prompt.py
  • src/ai_company/templates/presets.py
  • src/ai_company/templates/renderer.py
  • src/ai_company/templates/schema.py
  • tests/unit/config/test_schema.py
  • tests/unit/engine/test_prompt.py
  • tests/unit/templates/test_renderer.py
  • tests/unit/templates/test_schema.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
Use except A, B: syntax (no parentheses) for exception handling — ruff enforces this on Python 3.14
Include type hints on all public functions; enforce with mypy strict mode
Use Google-style docstrings on all public classes and functions — enforced by ruff D rules
Use immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections, use copy.deepcopy() at construction + MappingProxyType wrapping. For dict/list fields in frozen Pydantic models, rely on frozen=True and copy.deepcopy() at system boundaries
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use @computed_field for derived values; use NotBlankStr from core.types for all identifier/name fields
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code; prefer structured concurrency over bare create_task
Maintain line length of 88 characters — enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate input at system boundaries: user input, external APIs, and config files

Files:

  • src/ai_company/engine/prompt.py
  • src/ai_company/core/company.py
  • tests/unit/engine/test_prompt.py
  • src/ai_company/core/personality.py
  • src/ai_company/templates/presets.py
  • tests/unit/templates/test_renderer.py
  • src/ai_company/templates/schema.py
  • src/ai_company/core/agent.py
  • tests/unit/templates/test_schema.py
  • tests/unit/config/test_schema.py
  • src/ai_company/templates/renderer.py
  • src/ai_company/core/enums.py
src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code — use get_logger() from observability module
Use variable name logger (not _logger, not log) for logging instances
Always use event name constants from ai_company.observability.events domain-specific modules; import directly (e.g., from ai_company.observability.events.<domain> import EVENT_CONSTANT)
Use structured logging format: logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG level for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging

Files:

  • src/ai_company/engine/prompt.py
  • src/ai_company/core/company.py
  • src/ai_company/core/personality.py
  • src/ai_company/templates/presets.py
  • src/ai_company/templates/schema.py
  • src/ai_company/core/agent.py
  • src/ai_company/templates/renderer.py
  • src/ai_company/core/enums.py
src/ai_company/{providers,engine}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains

Files:

  • src/ai_company/engine/prompt.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names like example-provider, example-large-001, large/medium/small as aliases, or test-provider, test-small-001 in tests

Files:

  • src/ai_company/engine/prompt.py
  • src/ai_company/core/company.py
  • src/ai_company/core/personality.py
  • src/ai_company/templates/presets.py
  • src/ai_company/templates/schema.py
  • src/ai_company/core/agent.py
  • src/ai_company/templates/renderer.py
  • src/ai_company/core/enums.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Maintain 80% minimum code coverage — enforced in CI
Use asyncio_mode = "auto" in pytest configuration — no manual @pytest.mark.asyncio needed
Set 30-second timeout per test
Use pytest-xdist via -n auto for parallel test execution
Prefer @pytest.mark.parametrize for testing similar cases

Files:

  • tests/unit/engine/test_prompt.py
  • tests/unit/templates/test_renderer.py
  • tests/unit/templates/test_schema.py
  • tests/unit/config/test_schema.py
🧠 Learnings (4)
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves

Applied to files:

  • src/ai_company/core/agent.py
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to **/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values; use `NotBlankStr` from `core.types` for all identifier/name fields

Applied to files:

  • src/ai_company/core/agent.py
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/ai_company/templates/renderer.py
📚 Learning: 2026-03-07T14:50:05.694Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T14:50:05.694Z
Learning: Applies to **/*.py : Validate input at system boundaries: user input, external APIs, and config files

Applied to files:

  • src/ai_company/templates/renderer.py
🧬 Code graph analysis (8)
tests/unit/engine/test_prompt.py (5)
src/ai_company/core/enums.py (7)
  • CollaborationPreference (244-249)
  • CommunicationVerbosity (252-257)
  • ConflictApproach (260-267)
  • CreativityLevel (40-45)
  • DecisionMakingStyle (235-241)
  • RiskTolerance (32-37)
  • SeniorityLevel (6-21)
src/ai_company/engine/errors.py (1)
  • PromptBuildError (8-9)
src/ai_company/engine/prompt.py (3)
  • DefaultTokenEstimator (107-123)
  • SystemPrompt (56-82)
  • build_error_prompt (646-673)
tests/unit/engine/conftest.py (1)
  • sample_agent_with_personality (52-79)
src/ai_company/core/agent.py (3)
  • AgentIdentity (263-321)
  • ModelConfig (162-191)
  • PersonalityConfig (26-139)
src/ai_company/core/personality.py (3)
src/ai_company/core/enums.py (2)
  • CollaborationPreference (244-249)
  • ConflictApproach (260-267)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/core/agent.py (1)
  • PersonalityConfig (26-139)
src/ai_company/templates/presets.py (2)
src/ai_company/core/agent.py (1)
  • PersonalityConfig (26-139)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
tests/unit/templates/test_renderer.py (4)
src/ai_company/templates/renderer.py (4)
  • _expand_single_agent (364-417)
  • _build_departments (445-494)
  • _build_config_dict (247-299)
  • _validate_list (302-320)
src/ai_company/core/enums.py (1)
  • CompanyType (72-82)
src/ai_company/templates/schema.py (3)
  • CompanyTemplate (201-304)
  • TemplateAgentConfig (77-127)
  • TemplateMetadata (167-198)
src/ai_company/templates/errors.py (1)
  • TemplateRenderError (14-20)
src/ai_company/core/agent.py (1)
src/ai_company/core/enums.py (5)
  • CollaborationPreference (244-249)
  • CommunicationVerbosity (252-257)
  • ConflictApproach (260-267)
  • CreativityLevel (40-45)
  • DecisionMakingStyle (235-241)
tests/unit/templates/test_schema.py (3)
src/ai_company/templates/schema.py (3)
  • TemplateAgentConfig (77-127)
  • TemplateDepartmentConfig (130-164)
  • CompanyTemplate (201-304)
tests/unit/templates/conftest.py (1)
  • make_template_dict (123-125)
tests/unit/config/test_schema.py (1)
  • test_workflow_handoffs_accepted (369-380)
tests/unit/config/test_schema.py (2)
tests/unit/templates/test_schema.py (1)
  • test_workflow_handoffs_accepted (375-390)
src/ai_company/config/schema.py (1)
  • RootConfig (360-499)
src/ai_company/templates/renderer.py (3)
src/ai_company/core/agent.py (1)
  • PersonalityConfig (26-139)
src/ai_company/templates/errors.py (1)
  • TemplateRenderError (14-20)
src/ai_company/templates/presets.py (1)
  • get_personality_preset (292-314)
🔇 Additional comments (41)
src/ai_company/engine/prompt.py (5)

48-50: LGTM!

Thread-safety documentation is clear and appropriately warns against post-initialization modifications.


160-175: LGTM!

Well-structured Args/Returns documentation following Google-style docstrings.


318-321: LGTM!

New personality dimensions (verbosity, decision_making, collaboration, conflict_approach) are correctly injected into the core template context, aligning with the PersonalityConfig model fields.


666-672: LGTM!

The metadata merge correctly combines identity metadata with the provided agent_id, ensuring the explicit agent_id parameter takes precedence over the identity's ID.


204-212: Exception syntax is correct for Python 3.14.

The except MemoryError, RecursionError: syntax without parentheses is valid per PEP 758, which allows optional parentheses for multiple exception types when no as clause is present. This code correctly catches both exception types.

src/ai_company/templates/renderer.py (5)

41-46: LGTM!

Extracting default constants improves maintainability and consistency across the codebase.


228-244: LGTM!

Error logging added before raising exceptions for missing/invalid template keys, aligning with the coding guideline to log at WARNING or ERROR with context before raising.


280-299: LGTM!

Clean extraction of workflow_handoffs and escalation_paths with proper validation via _validate_list. The conditional inclusion pattern correctly handles None and missing keys.


369-417: LGTM!

Solid implementation:

  • Required role validation with clear error message
  • Inline personality type validation before construction
  • Proper error handling with _validate_inline_personality
  • Uses _DEFAULT_PROVIDER and _DEFAULT_DEPARTMENT constants

420-443: LGTM!

Exception handling is appropriately narrowed to (ValidationError, TypeError), addressing the previous review comment. Logging at WARNING level with context before raising aligns with coding guidelines. Based on learnings: "Handle errors explicitly, never silently swallow exceptions."

tests/unit/templates/test_schema.py (2)

99-126: LGTM!

Comprehensive test coverage for inline personality support:

  • Tests inline personality dict with valid fields
  • Verifies None default when not provided
  • Tests mutual exclusion of personality and personality_preset

156-177: LGTM!

Good coverage of TemplateDepartmentConfig's new fields with tests for default values and explicit value acceptance.

src/ai_company/core/enums.py (1)

233-282: LGTM!

Well-structured enum additions:

  • Clear docstrings explaining purpose
  • Consistent lowercase string values matching existing conventions
  • ActionType's docstring helpfully notes that its values are optional constants, not required types
tests/unit/config/test_schema.py (1)

369-394: LGTM!

Good test coverage for the new cross-department governance fields. The # type: ignore[arg-type] comments are appropriate since the tests pass raw dicts that Pydantic coerces to the typed models.

tests/unit/engine/test_prompt.py (4)

11-26: LGTM!

Import additions align with the new personality enums and build_error_prompt function being tested.


231-267: LGTM!

Comprehensive tests for new personality dimensions:

  • Verifies default values appear in prompts
  • Verifies custom enum values render correctly

466-470: LGTM!

Version assertion updated to match the new template version.


815-850: LGTM!

Well-structured tests for build_error_prompt:

  • Tests pass-through when existing prompt is provided
  • Tests placeholder creation with correct metadata when no prompt exists
  • Verifies agent_id parameter is used in metadata
src/ai_company/templates/presets.py (5)

1-16: LGTM!

Clean module setup:

  • Updated docstring reflecting Big Five additions
  • MappingProxyType import for immutability
  • PersonalityConfig import for startup validation
  • Observability event constant import

20-263: LGTM!

Comprehensive personality presets with:

  • Full Big Five dimensions (openness, conscientiousness, extraversion, agreeableness, stress_response)
  • Behavioral enums (decision_making, collaboration, verbosity, conflict_approach)
  • MappingProxyType wrapping enforcing immutability
  • Well-differentiated preset profiles
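MappingProxyType gives a read-only view, which is what makes the preset table safe to export. A small sketch (preset name and shape are illustrative):

```python
from types import MappingProxyType

_presets = {"deep_thinker": {"openness": 0.9}}
PERSONALITY_PRESETS = MappingProxyType(_presets)

# Writes through the proxy raise TypeError; reads work as with a plain dict.
try:
    PERSONALITY_PRESETS["rogue"] = {}
except TypeError as exc:
    print(f"rejected: {exc}")
```

One caveat: the proxy is a view, not a copy — mutating the underlying dict still shows through, so the `_presets` reference must not escape the module.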

266-289: LGTM!

Auto-name pools wrapped with MappingProxyType for immutability, with comprehensive role-to-name mappings using gender-neutral names.


304-314: LGTM!

Error logging with structured event constant before raising KeyError aligns with coding guidelines for error paths.


317-321: LGTM!

Startup validation ensures all presets are valid at import time, catching key typos or invalid values immediately. The cleanup via del prevents polluting the module namespace with loop variables.

src/ai_company/core/personality.py (6)

1-21: LGTM!

Well-structured module setup:

  • Clear docstring explaining purpose
  • TYPE_CHECKING import avoids runtime circular dependency
  • Proper observability setup following coding guidelines

23-54: LGTM!

Weight configuration is well-documented and mathematically sound:

  • Component weights sum to 1.0 (0.6 + 0.2 + 0.2)
  • Big Five dimension weights sum to 1.0
  • Immutable collaboration order mapping
  • Clear constructive/destructive conflict pairing definitions

56-86: LGTM!

Clean implementation of pairwise compatibility:

  • Combines three scoring components with appropriate weights
  • Clamps result to valid range for safety
  • Logs detailed breakdown at DEBUG level for diagnostics

88-121: LGTM!

Team compatibility correctly uses itertools.combinations per the PR objectives (replacing range(len()) nested loops). The edge case of teams with ≤1 member returns 1.0 with proper logging.


126-153: LGTM!

Big Five scoring implements nuanced logic:

  • Similarity scoring for openness, conscientiousness, agreeableness, stress_response
  • Complementarity scoring for extraversion (optimal difference of 0.3)
  • Tent function correctly peaks at optimal_diff

156-178: LGTM!

Collaboration and conflict scoring logic is clear:

  • Adjacency-based collaboration scoring (same=1.0, adjacent=0.5, distant=0.2)
  • Conflict scoring prioritizes constructive approaches and penalizes destructive pairs
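The adjacency heuristic reads directly off an ordering of the preferences. A sketch with placeholder preference names (the real values come from the CollaborationPreference enum):

```python
# Hypothetical ordering of collaboration preferences along one axis.
_ORDER = {"independent": 0, "balanced": 1, "highly_collaborative": 2}


def collaboration_score(a: str, b: str) -> float:
    """Same preference scores 1.0, adjacent 0.5, anything farther 0.2."""
    distance = abs(_ORDER[a] - _ORDER[b])
    return {0: 1.0, 1: 0.5}.get(distance, 0.2)
```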
src/ai_company/templates/schema.py (3)

109-127: LGTM! Mutual exclusion validator correctly implemented.

The new personality field and the _validate_personality_mutual_exclusion validator properly enforce that only one of personality_preset or personality can be specified. This aligns with the inline personality support added across the template system.


157-164: LGTM! Department fields properly extended.

The new reporting_lines and policies fields appropriately use loose typing (dict[str, str] and dict[str, Any]) at the template schema level, with validation deferred to render time when these are converted to the stricter core domain models.


259-266: LGTM! Cross-department workflow fields properly added.

The workflow_handoffs and escalation_paths fields use immutable tuples with flexible dict typing, which is appropriate for template definitions. Downstream validation occurs when the renderer constructs the stricter WorkflowHandoff and EscalationPath domain models.

src/ai_company/core/agent.py (3)

50-50: LGTM! allow_inf_nan=False properly configured.

Setting allow_inf_nan=False in the model config ensures Pydantic rejects infinite values for all float fields, which is appropriate for personality scoring dimensions.


75-123: LGTM! Big Five dimensions and behavioral enums well-structured.

The Big Five (OCEAN) dimensions are properly constrained to [0.0, 1.0] with sensible defaults of 0.5. The behavioral enums provide meaningful defaults that align with collaborative agent behavior. The docstring clearly distinguishes between internal scoring dimensions and prompt-injected labels.


125-139: LGTM! Explicit NaN rejection validator.

While allow_inf_nan=False already rejects NaN at the Pydantic level, this explicit validator provides a clearer, domain-specific error message ("NaN is not allowed for Big Five dimensions") which improves debuggability. Good defense-in-depth approach.

src/ai_company/core/company.py (3)

16-38: LGTM! ReportingLine model correctly validates self-reporting.

The _validate_not_self_report validator properly uses case-insensitive comparison with strip().casefold() to prevent agents from reporting to themselves, handling variations like "Alice" vs " alice ".


67-128: LGTM! Approval chain validation is robust.

The ApprovalChain model correctly validates that approvers is non-empty and that min_approvals doesn't exceed the number of approvers. The DepartmentPolicies model properly enforces unique action_type values across approval chains using a Counter-based duplicate check.
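The Counter-based duplicate check reads roughly like this (a standalone sketch with a hypothetical `find_duplicate_action_types` helper, not the model's actual validator):

```python
from collections import Counter

def find_duplicate_action_types(chains: list[dict]) -> list[str]:
    """Return action_type values that appear more than once, sorted."""
    counts = Counter(c["action_type"] for c in chains)
    return sorted(a for a, n in counts.items() if n > 1)

chains = [
    {"action_type": "deploy"},
    {"action_type": "merge"},
    {"action_type": "deploy"},
]
print(find_duplicate_action_types(chains))  # ['deploy']
```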


134-205: LGTM! Cross-department models with shared validation helper.

The extracted _reject_same_department helper provides consistent case-insensitive validation for both WorkflowHandoff and EscalationPath, eliminating code duplication. The priority_boost constraint (0-3) on EscalationPath is appropriately bounded.

tests/unit/templates/test_renderer.py (3)

300-316: LGTM! Inline personality test properly validates passthrough.

The test correctly verifies that inline personality dictionaries are passed through to the expanded agent config, checking both communication_style and traits fields.


318-382: LGTM! Department passthrough tests provide good coverage.

The tests properly verify that reporting_lines, policies, and workflow_handoffs are passed through from rendered data to the output dicts. Using internal functions (_build_departments, _build_config_dict) for unit testing these specific paths is appropriate.


385-508: LGTM! Comprehensive error case coverage.

The error tests properly verify that:

  • Invalid inline personality raises TemplateRenderError
  • Non-dict personality values are rejected
  • Missing role field raises an error
  • Type validation for reporting_lines (list) and policies (dict)
  • Unknown personality presets are rejected
  • _validate_list rejects non-list values and non-dict items

All tests use appropriate match patterns to verify specific error messages.
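Worth noting for readers of these tests: pytest's `match=` argument applies `re.search` to the stringified exception. A self-contained simulation of that behaviour (the `build_policies` helper and its message are illustrative, not the renderer's actual code):

```python
import re

def build_policies(dept_name: str, policies: object) -> dict:
    """Illustrative stand-in for the renderer's type check."""
    if not isinstance(policies, dict):
        msg = f"Department {dept_name!r} 'policies' must be a mapping"
        raise ValueError(msg)
    return policies

# pytest.raises(ValueError, match=r"'policies' must be a mapping")
# performs re.search on str(exc), as simulated below:
try:
    build_policies("eng", ["not", "a", "mapping"])
except ValueError as exc:
    assert re.search(r"'policies' must be a mapping", str(exc))
```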

Comment on lines +466 to 493
        dept_name = dept.get("name", "")
        head_role = dept.get("head_role")
        if not head_role:
            logger.warning(
                TEMPLATE_RENDER_VARIABLE_ERROR,
                department=dept_name,
                field="head_role",
                detail="No head_role specified; using department name as placeholder",
            )
            head_role = dept_name or ""
        dept_dict: dict[str, Any] = {
-           "name": dept.get("name", ""),
-           "head": dept.get("head_role", dept.get("name", "")),
+           "name": dept_name,
+           "head": head_role,
            "budget_percent": budget_pct,
        }
        reporting_lines = dept.get("reporting_lines")
        if reporting_lines is not None:
            if not isinstance(reporting_lines, list):
                msg = f"Department {dept_name!r} 'reporting_lines' must be a list"
                raise TemplateRenderError(msg)
            dept_dict["reporting_lines"] = reporting_lines
        policies = dept.get("policies")
        if policies is not None:
            if not isinstance(policies, dict):
                msg = f"Department {dept_name!r} 'policies' must be a mapping"
                raise TemplateRenderError(msg)
            dept_dict["policies"] = policies
        departments.append(dept_dict)

🧹 Nitpick | 🔵 Trivial

Consider logging before raising for type validation errors.

For consistency with coding guidelines requiring "all error paths must log at WARNING or ERROR with context before raising," consider adding logging before raising TemplateRenderError for reporting_lines and policies type validation failures.

♻️ Suggested improvement
         reporting_lines = dept.get("reporting_lines")
         if reporting_lines is not None:
             if not isinstance(reporting_lines, list):
                 msg = f"Department {dept_name!r} 'reporting_lines' must be a list"
+                logger.warning(
+                    TEMPLATE_RENDER_YAML_ERROR,
+                    department=dept_name,
+                    error=msg,
+                )
                 raise TemplateRenderError(msg)
             dept_dict["reporting_lines"] = reporting_lines
         policies = dept.get("policies")
         if policies is not None:
             if not isinstance(policies, dict):
                 msg = f"Department {dept_name!r} 'policies' must be a mapping"
+                logger.warning(
+                    TEMPLATE_RENDER_YAML_ERROR,
+                    department=dept_name,
+                    error=msg,
+                )
                 raise TemplateRenderError(msg)
             dept_dict["policies"] = policies
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/templates/renderer.py` around lines 466 - 493, Add logging
calls before raising TemplateRenderError for the two type checks: when
reporting_lines is not a list and when policies is not a dict. Use the existing
logger and TEMPLATE_RENDER_VARIABLE_ERROR (or logger.error/warning as
appropriate) and include context fields like department=dept_name,
field="reporting_lines" or field="policies", and a short detail describing the
expected type (e.g., "must be a list" / "must be a mapping") so the log contains
the same context as the raised TemplateRenderError.

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.



"""

model_config = ConfigDict(frozen=True)
model_config = ConfigDict(frozen=True, allow_inf_nan=False)
Copilot AI Mar 7, 2026

PersonalityConfig is used to validate inline personality dicts and presets, but its model_config doesn’t set extra='forbid'. With Pydantic’s default extra handling, unknown keys in personality dicts (e.g., typos like conciensciousness) may be silently ignored, meaning _validate_inline_personality() and the import-time preset validation won’t reliably catch schema mistakes. Consider forbidding extra fields on PersonalityConfig, or explicitly checking incoming dict keys against PersonalityConfig.model_fields in the renderer/preset validation paths to ensure typos are rejected.

Suggested change
-   model_config = ConfigDict(frozen=True, allow_inf_nan=False)
+   model_config = ConfigDict(frozen=True, allow_inf_nan=False, extra="forbid")
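A quick demonstration of the failure mode being described, using stand-in models (names are illustrative, not the project's classes):

```python
from pydantic import BaseModel, ConfigDict

class Strict(BaseModel):
    """With extra='forbid', unknown keys raise a ValidationError."""

    model_config = ConfigDict(frozen=True, extra="forbid")
    conscientiousness: float = 0.5

class Lax(BaseModel):
    """Default extra handling ('ignore') silently drops unknown keys."""

    conscientiousness: float = 0.5

typo = {"conciensciousness": 0.9}  # note the misspelling
print(Lax(**typo).conscientiousness)  # 0.5 — typo silently ignored
try:
    Strict(**typo)
except Exception as exc:
    print("rejected:", type(exc).__name__)
```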

Comment on lines +423 to 428
        # Unique department names (normalized for case-insensitive comparison)
        names = [d.name.strip().casefold() for d in self.departments]
        if len(names) != len(set(names)):
            dupes = sorted(n for n, c in Counter(names).items() if c > 1)
            msg = f"Duplicate department names: {dupes}"
            raise ValueError(msg)
Copilot AI Mar 7, 2026

Company._validate_departments() now normalizes department names with strip().casefold() for uniqueness and for validating handoff/escalation references. This changes validation semantics vs other layers (e.g., RootConfig / CompanyTemplate validators still appear to treat department names case-sensitively), which can allow configs/templates to validate initially but fail later when constructing Company. Consider aligning department-name uniqueness/reference validation across config/template/company layers (all case-insensitive or all case-sensitive), and if keeping normalization here, consider reporting duplicates using the original department names for clearer error messages.
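One way to report duplicates by original spelling while still comparing normalized names, as suggested above (a hypothetical standalone helper, not the model's actual validator):

```python
from collections import Counter

def duplicate_department_names(names: list[str]) -> list[str]:
    """Compare names case/whitespace-insensitively, report originals."""
    counts = Counter(n.strip().casefold() for n in names)
    return sorted({n for n in names if counts[n.strip().casefold()] > 1})

print(duplicate_department_names(["Engineering", "engineering ", "Sales"]))
# ['Engineering', 'engineering '] — original spellings, not casefolded keys
```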

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/core/company.py`:
- Around line 134-141: Add module-level observability and log validation
failures before raising: import get_logger from ai_company.observability and
create logger = get_logger(__name__), then update each validation function
(e.g., _reject_same_department and the other validator blocks around lines
164-205, 288-299, 420-451) to call logger.warning or logger.error with
contextual information (include from_dept, to_dept, label or relevant
parameters) immediately before raising ValueError so failures are recorded with
context.
- Around line 86-98: The _validate_approvers method currently allows duplicate
entries; update it to reject duplicate approvers by checking uniqueness (e.g.,
compare len(self.approvers) to len(set(self.approvers))) and raise a ValueError
listing or describing the duplicate(s) when found; also change the min_approvals
check in _validate_approvers to compare against the number of unique approvers
(use len(set(self.approvers))) so min_approvals cannot exceed distinct
approvers, and keep the existing error messages clear and contextual (refer to
_validate_approvers and the ApprovalChain approvers/min_approvals validation).

In `@tests/unit/core/test_company.py`:
- Around line 770-780: The test test_duplicate_subordinates_case_insensitive
only checks case differences and misses the whitespace-normalization regression;
update it to parametrize both normalization scenarios so it asserts duplicates
are rejected when subordinates differ by case OR by surrounding whitespace.
Specifically, convert test_duplicate_subordinates_case_insensitive into a
`@pytest.mark.parametrize` over pairs of subordinate strings (e.g.,
("Alice","alice") and ("Alice "," alice")) and still instantiate Department with
ReportingLine(subordinate=..., supervisor=...) entries and expect
ValidationError when creating Department; this ensures the
Department/ReportingLine normalization (strip() + casefold()) is actually
exercised.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 310521b7-50f9-46b0-8896-df5e1a38bfcb

📥 Commits

Reviewing files that changed from the base of the PR and between 058b885 and 5e1c69c.

📒 Files selected for processing (3)
  • src/ai_company/core/company.py
  • tests/unit/core/conftest.py
  • tests/unit/core/test_company.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
Use except A, B: syntax (no parentheses) for exception handling — ruff enforces this on Python 3.14
Include type hints on all public functions; enforce with mypy strict mode
Use Google-style docstrings on all public classes and functions — enforced by ruff D rules
Use immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections, use copy.deepcopy() at construction + MappingProxyType wrapping. For dict/list fields in frozen Pydantic models, rely on frozen=True and copy.deepcopy() at system boundaries
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use @computed_field for derived values; use NotBlankStr from core.types for all identifier/name fields
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code; prefer structured concurrency over bare create_task
Maintain line length of 88 characters — enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate input at system boundaries: user input, external APIs, and config files

Files:

  • tests/unit/core/conftest.py
  • src/ai_company/core/company.py
  • tests/unit/core/test_company.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Maintain 80% minimum code coverage — enforced in CI
Use asyncio_mode = "auto" in pytest configuration — no manual @pytest.mark.asyncio needed
Set 30-second timeout per test
Use pytest-xdist via -n auto for parallel test execution
Prefer @pytest.mark.parametrize for testing similar cases

Files:

  • tests/unit/core/conftest.py
  • tests/unit/core/test_company.py
src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code — use get_logger() from observability module
Use variable name logger (not _logger, not log) for logging instances
Always use event name constants from ai_company.observability.events domain-specific modules; import directly (e.g., from ai_company.observability.events.<domain> import EVENT_CONSTANT)
Use structured logging format: logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG level for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging

Files:

  • src/ai_company/core/company.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names like example-provider, example-large-001, large/medium/small as aliases, or test-provider, test-small-001 in tests

Files:

  • src/ai_company/core/company.py
🧬 Code graph analysis (2)
tests/unit/core/conftest.py (1)
src/ai_company/core/company.py (11)
  • ApprovalChain (67-98)
  • Company (373-452)
  • CompanyConfig (302-332)
  • Department (239-299)
  • DepartmentPolicies (101-128)
  • EscalationPath (175-205)
  • HRRegistry (335-370)
  • ReportingLine (16-38)
  • ReviewRequirements (41-64)
  • Team (208-236)
  • WorkflowHandoff (144-172)
tests/unit/core/test_company.py (1)
src/ai_company/core/company.py (11)
  • ApprovalChain (67-98)
  • Company (373-452)
  • CompanyConfig (302-332)
  • Department (239-299)
  • DepartmentPolicies (101-128)
  • EscalationPath (175-205)
  • HRRegistry (335-370)
  • ReportingLine (16-38)
  • ReviewRequirements (41-64)
  • Team (208-236)
  • WorkflowHandoff (144-172)

Comment on lines +86 to +98
    @model_validator(mode="after")
    def _validate_approvers(self) -> Self:
        """Ensure approvers is non-empty and min_approvals is within bounds."""
        if not self.approvers:
            msg = "Approval chain must have at least one approver"
            raise ValueError(msg)
        if self.min_approvals > len(self.approvers):
            msg = (
                f"min_approvals ({self.min_approvals}) exceeds "
                f"number of approvers ({len(self.approvers)})"
            )
            raise ValueError(msg)
        return self

⚠️ Potential issue | 🟠 Major

Reject duplicate approvers in ApprovalChain.

approvers=("lead", "lead") currently passes validation, so the same person can satisfy multiple slots and min_approvals stops representing distinct approvals.

Proposed fix
     @model_validator(mode="after")
     def _validate_approvers(self) -> Self:
         """Ensure approvers is non-empty and min_approvals is within bounds."""
         if not self.approvers:
             msg = "Approval chain must have at least one approver"
             raise ValueError(msg)
+        normalized_approvers = [a.strip().casefold() for a in self.approvers]
+        if len(normalized_approvers) != len(set(normalized_approvers)):
+            dupes = sorted(
+                a for a, count in Counter(normalized_approvers).items() if count > 1
+            )
+            msg = (
+                f"Duplicate approvers in approval chain "
+                f"{self.action_type!r}: {dupes}"
+            )
+            raise ValueError(msg)
         if self.min_approvals > len(self.approvers):
             msg = (
                 f"min_approvals ({self.min_approvals}) exceeds "
                 f"number of approvers ({len(self.approvers)})"
             )

As per coding guidelines, "Validate input at system boundaries: user input, external APIs, and config files".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/core/company.py` around lines 86 - 98, The _validate_approvers
method currently allows duplicate entries; update it to reject duplicate
approvers by checking uniqueness (e.g., compare len(self.approvers) to
len(set(self.approvers))) and raise a ValueError listing or describing the
duplicate(s) when found; also change the min_approvals check in
_validate_approvers to compare against the number of unique approvers (use
len(set(self.approvers))) so min_approvals cannot exceed distinct approvers, and
keep the existing error messages clear and contextual (refer to
_validate_approvers and the ApprovalChain approvers/min_approvals validation).

"""
try:
PersonalityConfig(**personality)
except (ValidationError, TypeError) as exc:

PEP 758 except syntax — parentheses are actually required here

CLAUDE.md mandates PEP 758 bare-comma except syntax (no parentheses) on Python 3.14, but PEP 758 only permits the bare form when the handler has no `as` clause. Because this catch binds the exception with `as exc`, the parenthesised `except (ValidationError, TypeError) as exc:` is the only syntactically valid spelling; dropping the parentheses here would be a SyntaxError.

This is the only parenthesised multi-exception catch introduced in this PR (src/ai_company/templates/renderer.py, line 435), and no change is needed.

Rule Used: CLAUDE.md (source)
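For reference, a version-independent check (plain `compile` calls, no project code) showing that the `as` clause is what forces the parentheses — PEP 758's bare-comma form only covers handlers without `as`:

```python
# Parenthesised multi-exception catch with `as`: valid on every Python 3 version.
ok = "try:\n    pass\nexcept (ValueError, TypeError) as exc:\n    pass\n"
compile(ok, "<example>", "exec")

# Bare-comma form with `as`: a SyntaxError even on Python 3.14 — PEP 758
# only permits `except A, B:` when no `as` clause follows.
bad = "try:\n    pass\nexcept ValueError, TypeError as exc:\n    pass\n"
try:
    compile(bad, "<example>", "exec")
    print("unexpectedly compiled")
except SyntaxError:
    print("SyntaxError: parentheses required when using 'as'")
```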


…eptile

- Remove dead _reject_nan validator (allow_inf_nan=False handles it)
- Add extra='forbid' to PersonalityConfig to reject unknown keys
- Add logger + COMPANY_VALIDATION_ERROR event to company.py validators
- Reject duplicate approvers in ApprovalChain (case-insensitive)
- Guard module-level del in presets.py against empty dict
- Add whitespace-insensitive subordinate dedup test
- Add test_escalation_paths_accepted to template schema tests
- Add test_duplicate_approvers tests to ApprovalChain tests
- Register company events module in discovery test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/unit/observability/test_events.py (1)

93-113: 🧹 Nitpick | 🔵 Trivial

Discovery alone does not pin the new domain constants.

This update proves the company and personality modules are present, but a typo in one of their exported event strings would still pass _DOT_PATTERN and uniqueness checks. Please add domain-specific value assertions alongside the existing test_*_events_exist blocks.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/observability/test_events.py` around lines 93 - 113, The discovery
test only verifies modules exist but not that their exported event string
constants have the exact expected values; update the tests by adding
domain-specific value assertions alongside the existing test_*_events_exist
checks (e.g., assert the specific event string constants exported from the
company and personality modules equal the canonical values used elsewhere) so
that typos in exported strings fail; locate references in
tests/unit/observability/test_events.py (functions
test_all_domain_modules_discovered and the existing test_*_events_exist blocks)
and add explicit equality assertions for the key constants from the company and
personality modules to pin their exact values.
♻️ Duplicate comments (1)
src/ai_company/core/company.py (1)

33-41: ⚠️ Potential issue | 🟠 Major

Some new validation paths still bypass structured error logging.

ReportingLine._validate_not_self_report, DepartmentPolicies._validate_unique_action_types, and Department._validate_unique_subordinates still raise directly, so those failures never emit COMPANY_VALIDATION_ERROR the way the other validators in this module now do.

As per coding guidelines, "All error paths must log at WARNING or ERROR with context before raising" and "Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)".

Also applies to: 132-139, 301-311

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/core/company.py` around lines 33 - 41, The three validators
(ReportingLine._validate_not_self_report,
DepartmentPolicies._validate_unique_action_types,
Department._validate_unique_subordinates) currently raise ValueError directly
and must instead log the failure with context before raising; add the
module-level imports "from ai_company.observability import get_logger" and "from
ai_company.observability import COMPANY_VALIDATION_ERROR" and initialize "logger
= get_logger(__name__)", then in each validator build the same error message you
already construct, call logger.warning/error with COMPANY_VALIDATION_ERROR and
the message (include relevant fields like subordinate, supervisor, action type,
or subordinate list) and then raise the ValueError as before so the error path
is logged consistently prior to raising.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/templates/presets.py`:
- Around line 317-322: The startup validation loop over PERSONALITY_PRESETS
currently constructs PersonalityConfig(**_preset_dict) but does not include the
preset key in error reporting; update the loop to validate each preset inside a
try/except that catches pydantic.ValidationError (or Exception) and re-raises or
logs a new error that includes the preset name (_preset_name) and the original
exception details, then continue or fail as desired; reference the existing loop
over PERSONALITY_PRESETS and the PersonalityConfig constructor to locate where
to add the try/except and use _preset_name in the error message so failures
identify which preset failed.
- Around line 20-21: PERSONALITY_PRESETS currently only wraps the outer mapping,
leaving inner preset dicts mutable and allowing callers to mutate shared state
and bypass startup validation; fix this by making each preset payload immutable
or by storing validated PersonalityConfig instances: either wrap every inner
dict with MappingProxyType when building PERSONALITY_PRESETS (e.g., replace each
value dict with MappingProxyType(value_dict)) or construct and store validated
PersonalityConfig objects for each preset and ensure any public accessor returns
a copy (deepcopy or pydantic .copy()/dict()) rather than the original; update
the module initialization where PERSONALITY_PRESETS is created and any code that
reads presets to return copies so presets cannot be mutated at runtime.
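A sketch of the double-wrapping idea described above (preset names and payloads here are illustrative):

```python
from types import MappingProxyType

_raw_presets = {
    "analyst": {"openness": 0.8, "conscientiousness": 0.7},
    "diplomat": {"agreeableness": 0.9},
}
# Wrap both the outer mapping and each inner payload so neither layer
# can be mutated after import.
PERSONALITY_PRESETS = MappingProxyType(
    {name: MappingProxyType(payload) for name, payload in _raw_presets.items()}
)

try:
    PERSONALITY_PRESETS["analyst"]["openness"] = 0.0
except TypeError as exc:
    print(exc)  # mappingproxy does not support item assignment
```

Reads still work as with a plain dict; only mutation is blocked, which also keeps the import-time validation meaningful.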


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 09147f9d-f9ec-47d9-a8a8-9d3c31fd9f1a

📥 Commits

Reviewing files that changed from the base of the PR and between 5e1c69c and e7e928e.

📒 Files selected for processing (7)
  • src/ai_company/core/agent.py
  • src/ai_company/core/company.py
  • src/ai_company/observability/events/company.py
  • src/ai_company/templates/presets.py
  • tests/unit/core/test_company.py
  • tests/unit/observability/test_events.py
  • tests/unit/templates/test_schema.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Python 3.14+ required with PEP 649 native lazy annotations
Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations
Use PEP 758 except syntax: use except A, B: (no parentheses) — ruff enforces this on Python 3.14
Enforce 88 character line length (ruff configuration)

Files:

  • src/ai_company/observability/events/company.py
  • tests/unit/templates/test_schema.py
  • src/ai_company/core/agent.py
  • tests/unit/observability/test_events.py
  • src/ai_company/templates/presets.py
  • tests/unit/core/test_company.py
  • src/ai_company/core/company.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: All public functions and classes require type hints; mypy strict mode enforced
Use Google-style docstrings on all public classes and functions; enforced by ruff D rules
Create new objects instead of mutating existing ones — enforce immutability for non-Pydantic internal collections (registries, BaseTool) using copy.deepcopy() at construction and MappingProxyType wrapping
Use copy.deepcopy() at system boundaries for dict/list fields in frozen Pydantic models (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; separate mutable-via-copy models using model_copy(update=...) for runtime state that evolves
Never mix static config fields with mutable runtime fields in one Pydantic model
Use Pydantic v2 with BaseModel, model_validator, computed_field, and ConfigDict
Use @computed_field for derived values instead of storing and validating redundant fields (e.g., TokenUsage.total_tokens)
Use NotBlankStr from core.types for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files) in Python code
Use variable name logger (not _logger, not log) for logger instances
Use structured kwargs in logging: always logger.info(EVENT, key=value) — never logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Log at DEBUG level for object creation, internal flow, entry/exit of key functions
Pure data models, enu...

Files:

  • src/ai_company/observability/events/company.py
  • src/ai_company/core/agent.py
  • src/ai_company/templates/presets.py
  • src/ai_company/core/company.py
src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Never use import logging, logging.getLogger(), or print() in application code
Always use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget)
Import event constants directly: from ai_company.observability.events.<domain> import EVENT_CONSTANT

Files:

  • src/ai_company/observability/events/company.py
  • src/ai_company/core/agent.py
  • src/ai_company/templates/presets.py
  • src/ai_company/core/company.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small aliases

Files:

  • src/ai_company/observability/events/company.py
  • tests/unit/templates/test_schema.py
  • src/ai_company/core/agent.py
  • tests/unit/observability/test_events.py
  • src/ai_company/templates/presets.py
  • tests/unit/core/test_company.py
  • src/ai_company/core/company.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Use asyncio_mode = 'auto' in pytest — no manual @pytest.mark.asyncio needed on async tests
Enforce 30 second timeout per test
Prefer @pytest.mark.parametrize for testing similar cases in Python tests
Use test-provider, test-small-001, etc. instead of vendor names in tests

Files:

  • tests/unit/templates/test_schema.py
  • tests/unit/observability/test_events.py
  • tests/unit/core/test_company.py
src/ai_company/!(observability)/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)

Files:

  • src/ai_company/core/agent.py
  • src/ai_company/templates/presets.py
  • src/ai_company/core/company.py
🧠 Learnings (8)
📚 Learning: 2026-03-07T17:01:58.720Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T17:01:58.720Z
Learning: Applies to src/ai_company/**/*.py : Import event constants directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/ai_company/observability/events/company.py
  • tests/unit/observability/test_events.py
📚 Learning: 2026-03-07T17:01:58.720Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T17:01:58.720Z
Learning: Applies to src/ai_company/**/*.py : Always use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget)

Applied to files:

  • src/ai_company/observability/events/company.py
  • tests/unit/observability/test_events.py
  • src/ai_company/core/company.py
📚 Learning: 2026-03-07T17:01:58.720Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T17:01:58.720Z
Learning: Applies to src/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models using model_copy(update=...) for runtime state that evolves

Applied to files:

  • src/ai_company/core/agent.py
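The config/state split described in this learning can be sketched with a minimal stdlib analogue. The project itself uses frozen Pydantic models with `model_copy(update=...)`; this sketch substitutes frozen dataclasses and `dataclasses.replace`, and all names here are illustrative, not the project's actual classes:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentIdentity:
    """Immutable identity/config: never mutated after construction."""

    name: str
    department: str


@dataclass(frozen=True)
class AgentRuntimeState:
    """Runtime state that evolves via copy, kept separate from identity."""

    tasks_completed: int = 0

    def with_completed_task(self) -> "AgentRuntimeState":
        # Evolve by copying, mirroring Pydantic's model_copy(update=...).
        return replace(self, tasks_completed=self.tasks_completed + 1)


identity = AgentIdentity(name="agent-a", department="engineering")
state = AgentRuntimeState().with_completed_task()
```

Keeping the two models separate means a stale reference to an old state object can never silently mutate identity fields.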
📚 Learning: 2026-03-07T17:01:58.720Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T17:01:58.720Z
Learning: Applies to src/**/*.py : Never mix static config fields with mutable runtime fields in one Pydantic model

Applied to files:

  • src/ai_company/core/agent.py
📚 Learning: 2026-03-07T17:01:58.720Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T17:01:58.720Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • src/ai_company/core/company.py
📚 Learning: 2026-03-07T17:01:58.720Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T17:01:58.720Z
Learning: Applies to src/ai_company/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code

Applied to files:

  • src/ai_company/core/company.py
📚 Learning: 2026-03-07T17:01:58.720Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T17:01:58.720Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising

Applied to files:

  • src/ai_company/core/company.py
📚 Learning: 2026-03-07T17:01:58.720Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T17:01:58.720Z
Learning: Applies to src/**/*.py : Validate at system boundaries (user input, external APIs, config files) in Python code

Applied to files:

  • src/ai_company/core/company.py
🧬 Code graph analysis (5)
tests/unit/templates/test_schema.py (2)
src/ai_company/templates/schema.py (3)
  • TemplateAgentConfig (77-127)
  • TemplateDepartmentConfig (130-164)
  • CompanyTemplate (201-304)
tests/unit/templates/conftest.py (1)
  • make_template_dict (123-125)
src/ai_company/core/agent.py (1)
src/ai_company/core/enums.py (5)
  • CollaborationPreference (244-249)
  • CommunicationVerbosity (252-257)
  • ConflictApproach (260-267)
  • CreativityLevel (40-45)
  • DecisionMakingStyle (235-241)
src/ai_company/templates/presets.py (2)
src/ai_company/core/agent.py (1)
  • PersonalityConfig (25-122)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
tests/unit/core/test_company.py (1)
src/ai_company/core/company.py (11)
  • ApprovalChain (71-110)
  • Company (386-469)
  • CompanyConfig (315-345)
  • Department (252-312)
  • DepartmentPolicies (113-140)
  • EscalationPath (188-218)
  • HRRegistry (348-383)
  • ReportingLine (20-42)
  • ReviewRequirements (45-68)
  • Team (221-249)
  • WorkflowHandoff (157-185)
src/ai_company/core/company.py (1)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
🔇 Additional comments (1)
src/ai_company/core/agent.py (1)

49-122: Nice boundary hardening of PersonalityConfig.

extra="forbid", allow_inf_nan=False, and the explicit bounds/max-length guards close off the obvious malformed preset and inline-config cases without needing extra custom validators.

…validation

- Add logging before raise in _validate_unique_action_types (Greptile)
- Freeze inner preset dicts with MappingProxyType (CodeRabbit)
- Name the failing preset in startup validation error (CodeRabbit)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
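The MappingProxyType freeze mentioned in the commit message can be sketched as below. Preset names and dimension values are illustrative, not the project's actual data:

```python
from types import MappingProxyType

# Freeze both the outer registry and each inner preset dict so callers
# cannot mutate shared preset data at runtime.
_RAW_PRESETS = {
    "analytical": {"openness": 0.8, "conscientiousness": 0.9},
    "creative": {"openness": 0.95, "conscientiousness": 0.6},
}

PERSONALITY_PRESETS = MappingProxyType(
    {name: MappingProxyType(values) for name, values in _RAW_PRESETS.items()}
)
```

Without freezing the inner dicts, `MappingProxyType` on the outer mapping alone would still allow `PERSONALITY_PRESETS["analytical"]["openness"] = 0.0`.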
Copilot AI review requested due to automatic review settings March 7, 2026 17:22

# Conflicts:
#	src/ai_company/config/defaults.py
#	src/ai_company/config/schema.py
#	tests/unit/observability/test_events.py

Copilot AI left a comment


Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 3 comments.



Comment on lines 390 to 395
agent_dict: dict[str, Any] = {
    "name": name,
    "role": role,
-   "department": agent.get("department", "engineering"),
+   "department": agent.get("department", _DEFAULT_DEPARTMENT),
    "level": agent.get("level", "mid"),
}

Copilot AI Mar 7, 2026


department defaulting doesn’t handle an explicit null/None value. If the rendered agent dict contains department: null, agent.get('department', _DEFAULT_DEPARTMENT) will return None, and RootConfig/AgentConfig will later reject it (department is NotBlankStr). Consider normalizing None (and possibly whitespace-only strings) to _DEFAULT_DEPARTMENT before building agent_dict to match the template schema docstring (“None defaults to engineering during rendering”).
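The normalization Copilot suggests could look like the following sketch. `_DEFAULT_DEPARTMENT` is the constant the PR extracts; the helper name is hypothetical:

```python
_DEFAULT_DEPARTMENT = "engineering"  # constant name assumed from the PR


def normalize_department(agent: dict) -> str:
    """Fall back to the default for missing, None, or blank departments."""
    value = agent.get("department")
    if not isinstance(value, str) or not value.strip():
        return _DEFAULT_DEPARTMENT
    return value
```

This keeps an explicit `department: null` in rendered YAML from reaching the `NotBlankStr` validator downstream.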

Comment on lines +397 to +408
inline_personality = agent.get("personality")
preset_name = agent.get("personality_preset")
-if preset_name:
+if inline_personality is not None:
+    if not isinstance(inline_personality, dict):
+        msg = (
+            f"Personality for agent {name!r} must be a mapping, "
+            f"got {type(inline_personality).__name__}"
+        )
+        raise TemplateRenderError(msg)
+    _validate_inline_personality(inline_personality, name)
+    agent_dict["personality"] = inline_personality
+elif preset_name:

Copilot AI Mar 7, 2026


Mutual exclusivity between personality and personality_preset is enforced in TemplateAgentConfig, but _expand_single_agent() silently prefers personality when both keys are present in rendered YAML. For defense-in-depth and clearer errors (especially if rendered YAML is produced outside the schema path), consider explicitly rejecting the case where both are set and raising TemplateRenderError with a clear message.
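The explicit rejection Copilot proposes could be sketched as follows. The error class here is a stand-in for the project's `TemplateRenderError`, and the helper name is hypothetical:

```python
class TemplateRenderError(ValueError):
    """Stand-in for the project's template rendering error type."""


def reject_conflicting_personality_keys(agent: dict, name: str) -> None:
    """Fail loudly when both personality sources are present."""
    if agent.get("personality") is not None and agent.get("personality_preset"):
        msg = (
            f"Agent {name!r} sets both 'personality' and 'personality_preset'; "
            "specify at most one"
        )
        raise TemplateRenderError(msg)
```

Calling this at the top of `_expand_single_agent()` would give defense-in-depth for rendered YAML produced outside the schema path, instead of silently preferring the inline dict.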

Comment on lines +44 to +45
def test_preset_count_at_least_15(self) -> None:
    assert len(PERSONALITY_PRESETS) >= 15

Copilot AI Mar 7, 2026


test_preset_count_at_least_15 hard-codes a minimum preset count, which can make future refactors (e.g., consolidating/renaming presets) fail tests without any behavioral regression. Consider asserting on required preset names (or on qualitative invariants like “non-empty, unique keys, validates as PersonalityConfig”) rather than a specific quantity.

Suggested change:

-def test_preset_count_at_least_15(self) -> None:
-    assert len(PERSONALITY_PRESETS) >= 15
+def test_presets_have_nonempty_unique_names(self) -> None:
+    # Ensure we have at least one preset, and that names are well-formed and unique.
+    assert PERSONALITY_PRESETS, "No personality presets defined"
+    for name in PERSONALITY_PRESETS:
+        assert isinstance(name, str), "Preset name must be a string"
+        assert name.strip(), "Preset name must be non-empty"
+    assert len(PERSONALITY_PRESETS) == len(
+        set(PERSONALITY_PRESETS)
+    ), "Duplicate preset names found"
@Aureliolo Aureliolo merged commit 10b2299 into main Mar 7, 2026
10 checks passed
@Aureliolo Aureliolo deleted the feat/dept-personalities branch March 7, 2026 17:27
Comment on lines +34 to +42
def _validate_not_self_report(self) -> Self:
    """Reject self-reporting relationships."""
    if self.subordinate.strip().casefold() == self.supervisor.strip().casefold():
        msg = (
            f"Agent cannot report to themselves: "
            f"{self.subordinate!r} == {self.supervisor!r}"
        )
        raise ValueError(msg)
    return self


The _validate_not_self_report validator raises without logging, violating the CLAUDE.md convention that "all error paths must log at WARNING or ERROR with context before raising."

All other validators in this file (e.g., _validate_approvers, _validate_unique_action_types, _reject_same_department) correctly log via logger.warning(COMPANY_VALIDATION_ERROR, ...) before raising. Apply the same pattern here:

Suggested change:

 @model_validator(mode="after")
 def _validate_not_self_report(self) -> Self:
     """Reject self-reporting relationships."""
     if self.subordinate.strip().casefold() == self.supervisor.strip().casefold():
         msg = (
             f"Agent cannot report to themselves: "
             f"{self.subordinate!r} == {self.supervisor!r}"
         )
+        logger.warning(COMPANY_VALIDATION_ERROR, error=msg)
         raise ValueError(msg)
     return self

The same pattern is needed for _validate_unique_subordinates (company.py:303–313) and _validate_personality_mutual_exclusion (templates/schema.py:119–127). Those two also raise without logging.

Rule Used: CLAUDE.md (source)


Aureliolo added a commit that referenced this pull request Mar 7, 2026
Address 13 findings from post-merge bot reviews on PRs #157-#162:

Communication layer (PR #157):
- Wake blocked receive() callers on unsubscribe via None sentinel
- Distinguish shutdown vs timeout in receive() logging
- Add AgentMessenger.receive() facade method
- Validate MessageHandler.handle() is async at registration
- Await cancelled tasks in _await_with_shutdown to prevent warnings

Observability (PR #158):
- Add log-before-raise to all validators missing it (company.py,
  schema.py) — 14 raise sites across 11 validators

Parallel execution (PR #161):
- Log suppressed ExceptionGroup instead of silent pass
- Add PARALLEL_AGENT_CANCELLED structured event for cancellations
- Fix progress.in_progress semantics (increment after semaphore)
- Use PARALLEL_LOCK_RELEASE_ERROR for lock release failures
- Remove duplicate plan_parsing.py from DESIGN_SPEC file tree

Template inheritance (PR #162):
- Update DESIGN_SPEC merge key docs to include merge_id
- Preserve merge_id in _expand_single_agent (confirmed bug fix)
- Defer deepcopy in _apply_child_agent past _remove early-return

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aureliolo added a commit that referenced this pull request Mar 10, 2026
🤖 I have created a release *beep* *boop*
---


## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10)


### Features

* add autonomy levels and approval timeout policies
([#42](#42),
[#126](#126))
([#197](#197))
([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and
approval decisions
([#186](#186))
([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot)
([#63](#63))
([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking
([#67](#67))
([#185](#185))
([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline
([#269](#269))
([435bdfe](435bdfe)),
closes [#267](#267)
* add coordination error taxonomy classification pipeline
([#146](#146))
([#181](#181))
([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies
([#175](#175))
([ce924fa](ce924fa)),
closes [#173](#173)
* add design specification, license, and project setup
([8669a09](8669a09))
* add env var substitution and config file auto-discovery
([#77](#77))
([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup
([#140](#140))
([09619cb](09619cb)),
closes [#139](#139)
* add HR engine and performance tracking
([#45](#45),
[#47](#47))
([#193](#193))
([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill
([#119](#119))
([deecc39](deecc39))
* add memory retrieval, ranking, and context injection pipeline
([#41](#41))
([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events
([#180](#180))
([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events
([#32](#32))
([46cfdd4](46cfdd4))
* add pluggable PersistenceBackend protocol with SQLite implementation
([#36](#36))
([f753779](f753779))
* add progressive trust and promotion/demotion subsystems
([#43](#43),
[#49](#49))
([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience
([#100](#100))
([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker
integration ([#40](#40))
([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival
([#125](#125),
[#48](#48))
([4a0832b](4a0832b))
* design unified provider interface
([#86](#86))
([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance
([#80](#80),
[#81](#81),
[#84](#84))
([15a9134](15a9134))
* implement agent runtime state vs immutable config split
([#115](#115))
([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator
([#11](#11))
([#143](#143))
([f2eb73a](f2eb73a))
* implement basic tool system (registry, invocation, results)
([#15](#15))
([c51068b](c51068b))
* implement built-in file system tools
([#18](#18))
([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and
messenger ([#157](#157))
([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets
([#85](#85))
([cbf1496](cbf1496))
* implement conflict resolution protocol
([#122](#122))
([#166](#166))
([e03f9f2](e03f9f2))
* implement core entity and role system models
([#69](#69))
([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy
([#149](#149))
([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call
categorization
([#134](#134),
[#135](#135))
([#159](#159))
([9b2699f](9b2699f))
* implement enterprise logging system with structlog
([#73](#73))
([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy
([#130](#130))
([6592515](6592515))
* implement hierarchical delegation and loop prevention
([#12](#12),
[#17](#17))
([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry
([#88](#88))
([ae3f18b](ae3f18b)),
closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation
([#174](#174))
([aa0eefe](aa0eefe))
* implement meeting protocol system
([#123](#123))
([ee7caca](ee7caca))
* implement message and communication domain models
([#74](#74))
([560a5d2](560a5d2))
* implement model routing engine
([#99](#99))
([d3c250b](d3c250b))
* implement parallel agent execution
([#22](#22))
([#161](#161))
([65940b3](65940b3))
* implement per-call cost tracking service
([#7](#7))
([#102](#102))
([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction
([#105](#105))
([934dd85](934dd85))
* implement single-task execution lifecycle
([#21](#21))
([#144](#144))
([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation
([#131](#131))
([#153](#153))
([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies
([#172](#172))
([c7f1b26](c7f1b26)),
closes [#26](#26)
[#30](#30)
* implement task decomposition and routing engine
([#14](#14))
([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models
([#71](#71))
([81eabf1](81eabf1))
* implement tool permission checking
([#16](#16))
([833c190](833c190))
* implement YAML config loader with Pydantic validation
([#59](#59))
([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation
([#75](#75))
([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout
([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout
([#62](#62))
([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6)
([#189](#189))
([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field
([#118](#118))
([c0bab18](c0bab18)),
closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all
([#137](#137))
([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes
([#64](#64))
([f581749](f581749))
* wire all modules into observability system
([#97](#97))
([f7a0617](f7a0617))


### Bug Fixes

* address Greptile post-merge review findings from PRs
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175)
([#176](#176))
([c5ca929](c5ca929))
* address post-merge review feedback from PRs
[#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167)
([#170](#170))
([3bf897a](3bf897a)),
closes [#169](#169)
* enforce strict mypy on test files
([#89](#89))
([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner
([#50](#50),
[#53](#53))
([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements
([#150](#150))
([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience
([#155](#155))
([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes
([#164](#164))
([c02832a](c02832a))
* pre-PR review fixes for post-merge findings
([#183](#183))
([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries
([#117](#117))
([7e5e861](7e5e861))


### Performance

* harden non-inferable principle implementation
([#195](#195))
([02b5f4e](02b5f4e)),
closes [#188](#188)


### Refactoring

* adopt NotBlankStr across all models
([#108](#108))
([#120](#120))
([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models
([#111](#111))
([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and
review fixes
([#182](#182))
([c107bf9](c107bf9))
* harden personality profiles, department validation, and template
rendering ([#158](#158))
([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop
([#124](#124))
([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules
([#136](#136))
([e9cba89](e9cba89))


### Documentation

* add ADR-001 memory layer evaluation and selection
([#178](#178))
([db3026f](db3026f)),
closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC
([#145](#145))
([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation
([#65](#65))
([55c1025](55c1025)),
closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions
([#127](#127))
([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols
([#128](#128))
([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols
([#121](#121))
([6832db6](6832db6))
* finalize 23 design decisions (ADR-002)
([#190](#190))
([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency
review agent
([#114](#114))
([99766ee](99766ee))


### Tests

* add e2e single agent integration tests
([#24](#24))
([#156](#156))
([f566fb4](f566fb4))
* add provider adapter integration tests
([#90](#90))
([40a61f4](40a61f4))


### CI/CD

* add Release Please for automated versioning and GitHub Releases
([#278](#278))
([a488758](a488758))
* bump actions/checkout from 4 to 6
([#95](#95))
([1897247](1897247))
* bump actions/upload-artifact from 4 to 7
([#94](#94))
([27b1517](27b1517))
* harden CI/CD pipeline
([#92](#92))
([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers
([#277](#277))
([aba48af](aba48af))


### Maintenance

* add /worktree skill for parallel worktree management
([#171](#171))
([951e337](951e337))
* add design spec context loading to research-link skill
([8ef9685](8ef9685))
* add post-merge-cleanup skill
([#70](#70))
([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md
([#103](#103))
([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md
([#101](#101))
([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1
([#191](#191))
([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group
([#96](#96))
([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5
([a49ee46](a49ee46))
* fix M0 audit items
([#66](#66))
([c7724b5](c7724b5))
* pin setup-uv action to full SHA
([#281](#281))
([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests,
hookify rules
([#148](#148))
([c57a6a9](c57a6a9))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Aureliolo added a commit that referenced this pull request Mar 11, 2026
🤖 I have created a release *beep* *boop*
---


## [0.1.0](v0.0.0...v0.1.0) (2026-03-11)


### Features

* add autonomy levels and approval timeout policies
([#42](#42),
[#126](#126))
([#197](#197))
([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and
approval decisions
([#186](#186))
([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot)
([#63](#63))
([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking
([#67](#67))
([#185](#185))
([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline
([#269](#269))
([435bdfe](435bdfe)),
closes [#267](#267)
* add coordination error taxonomy classification pipeline
([#146](#146))
([#181](#181))
([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies
([#175](#175))
([ce924fa](ce924fa)),
closes [#173](#173)
* add design specification, license, and project setup
([8669a09](8669a09))
* add env var substitution and config file auto-discovery
([#77](#77))
([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup
([#140](#140))
([09619cb](09619cb)),
closes [#139](#139)
* add HR engine and performance tracking
([#45](#45),
[#47](#47))
([#193](#193))
([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill
([#119](#119))
([deecc39](deecc39))
* add mandatory JWT + API key authentication
([#256](#256))
([c279cfe](c279cfe))
* add memory retrieval, ranking, and context injection pipeline
([#41](#41))
([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events
([#180](#180))
([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events
([#32](#32))
([46cfdd4](46cfdd4))
* add pluggable output scan response policies
([#263](#263))
([b9907e8](b9907e8))
* add pluggable PersistenceBackend protocol with SQLite implementation
([#36](#36))
([f753779](f753779))
* add progressive trust and promotion/demotion subsystems
([#43](#43),
[#49](#49))
([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience
([#100](#100))
([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker
integration ([#40](#40))
([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival
([#125](#125),
[#48](#48))
([4a0832b](4a0832b))
* design unified provider interface
([#86](#86))
([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance
([#80](#80),
[#81](#81),
[#84](#84))
([15a9134](15a9134))
* implement agent runtime state vs immutable config split
([#115](#115))
([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator
([#11](#11))
([#143](#143))
([f2eb73a](f2eb73a))
* implement AuditRepository for security audit log persistence
([#279](#279))
([94bc29f](94bc29f))
* implement basic tool system (registry, invocation, results)
([#15](#15))
([c51068b](c51068b))
* implement built-in file system tools
([#18](#18))
([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and
messenger ([#157](#157))
([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets
([#85](#85))
([cbf1496](cbf1496))
* implement conflict resolution protocol
([#122](#122))
([#166](#166))
([e03f9f2](e03f9f2))
* implement core entity and role system models
([#69](#69))
([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy
([#149](#149))
([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call
categorization
([#134](#134),
[#135](#135))
([#159](#159))
([9b2699f](9b2699f))
* implement enterprise logging system with structlog
([#73](#73))
([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy
([#130](#130))
([6592515](6592515))
* implement hierarchical delegation and loop prevention
([#12](#12),
[#17](#17))
([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry
([#88](#88))
([ae3f18b](ae3f18b)),
closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation
([#174](#174))
([aa0eefe](aa0eefe))
* implement meeting protocol system
([#123](#123))
([ee7caca](ee7caca))
* implement message and communication domain models
([#74](#74))
([560a5d2](560a5d2))
* implement model routing engine
([#99](#99))
([d3c250b](d3c250b))
* implement parallel agent execution
([#22](#22))
([#161](#161))
([65940b3](65940b3))
* implement per-call cost tracking service
([#7](#7))
([#102](#102))
([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction
([#105](#105))
([934dd85](934dd85))
* implement single-task execution lifecycle
([#21](#21))
([#144](#144))
([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation
([#131](#131))
([#153](#153))
([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies
([#172](#172))
([c7f1b26](c7f1b26)),
closes [#26](#26)
[#30](#30)
* implement task decomposition and routing engine
([#14](#14))
([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models
([#71](#71))
([81eabf1](81eabf1))
* implement tool permission checking
([#16](#16))
([833c190](833c190))
* implement YAML config loader with Pydantic validation
([#59](#59))
([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation
([#75](#75))
([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout
([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout
([#62](#62))
([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6)
([#189](#189))
([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field
([#118](#118))
([c0bab18](c0bab18)),
closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all
([#137](#137))
([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes
([#64](#64))
([f581749](f581749))
* wire all modules into observability system
([#97](#97))
([f7a0617](f7a0617))


### Bug Fixes

* address Greptile post-merge review findings from PRs
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175)
([#176](#176))
([c5ca929](c5ca929))
* address post-merge review feedback from PRs
[#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167)
([#170](#170))
([3bf897a](3bf897a)),
closes [#169](#169)
* enforce strict mypy on test files
([#89](#89))
([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner
([#50](#50),
[#53](#53))
([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements
([#150](#150))
([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience
([#155](#155))
([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes
([#164](#164))
([c02832a](c02832a))
* pre-PR review fixes for post-merge findings
([#183](#183))
([26b3108](26b3108))
* resolve circular imports, bump litellm, fix release tag format
([#286](#286))
([a6659b5](a6659b5))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries
([#117](#117))
([7e5e861](7e5e861))


### Performance

* harden non-inferable principle implementation
([#195](#195))
([02b5f4e](02b5f4e)),
closes [#188](#188)


### Refactoring

* adopt NotBlankStr across all models
([#108](#108))
([#120](#120))
([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models
([#111](#111))
([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and
review fixes
([#182](#182))
([c107bf9](c107bf9))
* harden personality profiles, department validation, and template
rendering ([#158](#158))
([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop
([#124](#124))
([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules
([#136](#136))
([e9cba89](e9cba89))


### Documentation

* add ADR-001 memory layer evaluation and selection
([#178](#178))
([db3026f](db3026f)),
closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC
([#145](#145))
([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation
([#65](#65))
([55c1025](55c1025)),
closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions
([#127](#127))
([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols
([#128](#128))
([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols
([#121](#121))
([6832db6](6832db6))
* finalize 23 design decisions (ADR-002)
([#190](#190))
([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency
review agent
([#114](#114))
([99766ee](99766ee))


### Tests

* add e2e single agent integration tests
([#24](#24))
([#156](#156))
([f566fb4](f566fb4))
* add provider adapter integration tests
([#90](#90))
([40a61f4](40a61f4))


### CI/CD

* add Release Please for automated versioning and GitHub Releases
([#278](#278))
([a488758](a488758))
* bump actions/checkout from 4 to 6
([#95](#95))
([1897247](1897247))
* bump actions/upload-artifact from 4 to 7
([#94](#94))
([27b1517](27b1517))
* bump anchore/scan-action from 6.5.1 to 7.3.2
([#271](#271))
([80a1c15](80a1c15))
* bump docker/build-push-action from 6.19.2 to 7.0.0
([#273](#273))
([dd0219e](dd0219e))
* bump docker/login-action from 3.7.0 to 4.0.0
([#272](#272))
([33d6238](33d6238))
* bump docker/metadata-action from 5.10.0 to 6.0.0
([#270](#270))
([baee04e](baee04e))
* bump docker/setup-buildx-action from 3.12.0 to 4.0.0
([#274](#274))
([5fc06f7](5fc06f7))
* bump sigstore/cosign-installer from 3.9.1 to 4.1.0
([#275](#275))
([29dd16c](29dd16c))
* harden CI/CD pipeline
([#92](#92))
([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers
([#277](#277))
([aba48af](aba48af))


### Maintenance

* add /worktree skill for parallel worktree management
([#171](#171))
([951e337](951e337))
* add design spec context loading to research-link skill
([8ef9685](8ef9685))
* add post-merge-cleanup skill
([#70](#70))
([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md
([#103](#103))
([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md
([#101](#101))
([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1
([#191](#191))
([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group
([#96](#96))
([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5
([a49ee46](a49ee46))
* fix M0 audit items
([#66](#66))
([c7724b5](c7724b5))
* **main:** release ai-company 0.1.1
([#282](#282))
([2f4703d](2f4703d))
* pin setup-uv action to full SHA
([#281](#281))
([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests,
hookify rules
([#148](#148))
([c57a6a9](c57a6a9))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Successfully merging this pull request may close these issues:

- Research and implement comprehensive agent personality profiles
- Design department internal structure (teams, reporting lines, workflows)