feat: implement meeting protocol system (#123) by Aureliolo · Pull Request #167 · Aureliolo/synthorg

Aureliolo · 2026-03-08T09:03:02Z

Summary

Implement 3 meeting protocols (DESIGN_SPEC §5.7): round-robin, position papers, structured phases — all behind the MeetingProtocol protocol interface
MeetingOrchestrator lifecycle manager: validates inputs, selects protocol, executes meeting, creates tasks from action items, records audit trail
Domain models: MeetingAgenda, MeetingMinutes, MeetingContribution, ActionItem, MeetingRecord (all frozen Pydantic v2)
Shared utilities: TokenTracker (mutable budget tracking), build_agenda_prompt (DRY prompt builder)
20 structured event constants in observability/events/meeting.py
Error hierarchy: MeetingError → MeetingBudgetExhaustedError, MeetingProtocolNotFoundError, MeetingParticipantError, MeetingAgentError
Config: MeetingProtocolConfig with per-protocol settings, auto_create_tasks at top level
Parallel execution via asyncio.TaskGroup in position papers and structured phases protocols
Budget enforcement: pre-divided budgets for parallel agents, MeetingBudgetExhaustedError on exhaustion
Documentation: updated DESIGN_SPEC.md §5.4, §5.7, §15.3 and CLAUDE.md package structure

Closes #123

Pre-PR Review

Pre-reviewed by 9 agents, 38 findings addressed:

code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, docs-consistency

Key fixes from review:

Extracted shared _TokenTracker and _prompts modules (eliminated duplication across 3 protocols)
Fixed non-deterministic parallel ordering in position_papers/structured_phases
Added budget pre-division for parallel agents (race condition prevention)
Split large run() methods into focused helpers (<50 lines each)
Added validation logging, ExceptionGroup handling, task creator resilience
Fixed TaskCreator type alias (str → Priority)
Added participant uniqueness + leader exclusion validators on MeetingMinutes
Strengthened test coverage: budget exhaustion, task creation, protocol interface

Test plan

🤖 Generated with Claude Code

Add pluggable meeting protocol system with three strategies:
- Round-Robin: sequential turns with full transcript context
- Position Papers: parallel position papers with synthesis
- Structured Phases: input gathering, conflict detection, discussion, synthesis

Includes MeetingOrchestrator lifecycle manager, frozen Pydantic models, token budget enforcement, event constants, and 138 meeting-specific tests.

Pre-reviewed by 9 agents, 38 findings addressed:
- Extract shared _TokenTracker and _prompts modules (DRY)
- Move auto_create_tasks to top-level MeetingProtocolConfig
- Fix non-deterministic parallel ordering in position_papers/structured_phases
- Add budget pre-division for parallel agents
- Split large run() methods into focused helpers
- Add validation logging, ExceptionGroup handling, task creator resilience
- Fix TaskCreator type alias (str → Priority)
- Add participant uniqueness + leader exclusion validators on MeetingMinutes
- Use NotBlankStr for decisions tuple
- Strengthen test coverage: budget exhaustion, task creation, protocol interface
- Add meeting event constant tests
- Update DESIGN_SPEC.md §5.4, §5.7, §15.3 and CLAUDE.md package structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-03-08T09:03:10Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-03-08T09:03:17Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1115c337-1bdd-45a3-a94c-2fd09e1d0823

📥 Commits

Reviewing files that changed from the base of the PR and between 3101925 and 2aa2a1f.

📒 Files selected for processing (9)

CLAUDE.md
DESIGN_SPEC.md
src/ai_company/communication/__init__.py
src/ai_company/communication/config.py
src/ai_company/communication/enums.py
src/ai_company/communication/meeting/orchestrator.py
src/ai_company/communication/meeting/position_papers.py
src/ai_company/communication/meeting/structured_phases.py
tests/unit/observability/test_events.py

📝 Walkthrough

Summary by CodeRabbit

New Features
- Full meeting system with three configurable protocols, end-to-end orchestration, minutes, and audit records
- Per-meeting token budgeting and enforcement with visible token accounting
- Optional automatic task creation from meeting action items (top-level config)
- New message type for meeting contributions
Observability
- New set of meeting event names for telemetry (lifecycle, phases, agent calls, tokens, tasks, conflicts)
Documentation
- Project structure and logging guidance updated to include meeting protocols
Tests
- Comprehensive unit and integration tests covering all meeting flows, budgets, and error paths

Walkthrough

Adds a new meeting subsystem: three pluggable meeting protocols (RoundRobin, PositionPapers, StructuredPhases), a MeetingOrchestrator with token-budget enforcement and task-creation hooks, domain models/errors/enums, token/prompt utilities, observability event constants, public re-exports, and extensive unit/integration tests. No existing public signatures were removed.

Changes

Cohort / File(s)	Summary
Design & docs `CLAUDE.md`, `DESIGN_SPEC.md`	Documented meeting protocol subsystem, public API surface, telemetry events, and global `auto_create_tasks` config.
Public exports `src/ai_company/communication/__init__.py`, `src/ai_company/communication/meeting/__init__.py`	Re-exports meeting subsystem types (configs, enums, models, protocols, orchestrator, task creator, detectors); expands package public surface.
Configuration `src/ai_company/communication/config.py`, `src/ai_company/communication/meeting/config.py`	Adds `protocol_config` to MeetingTypeConfig and new immutable Pydantic configs: `RoundRobinConfig`, `PositionPapersConfig`, `StructuredPhasesConfig`, `MeetingProtocolConfig` (includes `auto_create_tasks`).
Core enums & message type `src/ai_company/communication/enums.py`, `src/ai_company/communication/meeting/enums.py`	Adds `MEETING_CONTRIBUTION` to `MessageType`; introduces `MeetingProtocolType`, `MeetingPhase`, `MeetingStatus` enums.
Protocol interfaces & types `src/ai_company/communication/meeting/protocol.py`	Defines `AgentCaller`/`TaskCreator` type aliases and runtime-checkable `MeetingProtocol` and `ConflictDetector` Protocols.
Models & errors `src/ai_company/communication/meeting/models.py`, `src/ai_company/communication/meeting/errors.py`	Adds domain models (`AgentResponse`, `MeetingAgenda*`, `MeetingContribution`, `ActionItem`, `MeetingMinutes`, `MeetingRecord`) with validators and meeting-specific error hierarchy.
Orchestrator `src/ai_company/communication/meeting/orchestrator.py`	Implements `MeetingOrchestrator` to validate input, resolve protocol, run meetings, persist in-memory `MeetingRecord`s, handle budget/failures, emit events, and optionally create tasks.
Protocol implementations `src/ai_company/communication/meeting/round_robin.py`, `.../position_papers.py`, `.../structured_phases.py`	Implements three meeting protocols with token budgeting, deterministic ordering, conflict-detection hook, phase orchestration, observability events, and MeetingMinutes generation.
Utilities `src/ai_company/communication/meeting/_prompts.py`, `src/ai_company/communication/meeting/_token_tracker.py`	Adds `build_agenda_prompt` and `TokenTracker` (budget, record, used/remaining/is_exhausted) with validation and concurrency notes.
Observability `src/ai_company/observability/events/meeting.py`	Adds `MEETING_*` event constants for lifecycle, phases, agent interactions, contributions, conflicts, synthesis, tasks, validation, and token recording.
Tests — unit & integration `tests/unit/communication/meeting/...`, `tests/integration/communication/test_meeting_integration.py`	Adds comprehensive unit and integration tests covering configs, enums, models, errors, protocols, orchestrator, token tracker, prompts, conflict behavior, budget exhaustion, and audit records.
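
As a rough illustration of the `TokenTracker` row above (budget, record, used/remaining/is_exhausted), a tracker with that surface might look like the sketch below. The class name `TokenTrackerSketch`, the dataclass layout, and the validation rules are assumptions; the real `_token_tracker.py` may differ.

```python
from dataclasses import dataclass


@dataclass
class TokenTrackerSketch:
    """Hypothetical mutable tracker; member names follow the summary table."""

    budget: int
    input_tokens: int = 0
    output_tokens: int = 0

    def __post_init__(self) -> None:
        if self.budget <= 0:
            raise ValueError("budget must be positive")

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Add one agent call's token usage to the running totals."""
        if input_tokens < 0 or output_tokens < 0:
            raise ValueError("token counts must be non-negative")
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def used(self) -> int:
        return self.input_tokens + self.output_tokens

    @property
    def remaining(self) -> int:
        # Clamp at zero so an overshoot never reports a negative balance.
        return max(0, self.budget - self.used)

    @property
    def is_exhausted(self) -> bool:
        return self.used >= self.budget


tracker = TokenTrackerSketch(budget=1000)
tracker.record(input_tokens=300, output_tokens=200)
```

The tracker is deliberately mutable and has no locking, which is one reason a protocol running agents in parallel would pre-divide the budget rather than share a single instance across tasks.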

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Orchestrator as MeetingOrchestrator
    participant Protocol as MeetingProtocol
    participant Tracker as TokenTracker
    participant Caller as AgentCaller
    participant Agent as Agent

    Client->>Orchestrator: run_meeting(config, agenda, leader, participants, budget)
    Orchestrator->>Orchestrator: _validate_inputs()
    Orchestrator->>Orchestrator: _resolve_protocol()
    Orchestrator->>Protocol: run(meeting_id, agenda, leader, participants, agent_caller, budget)
    Protocol->>Tracker: TokenTracker(budget)
    Protocol->>Protocol: build_agenda_prompt()

    loop contributions (parallel or sequential)
        Protocol->>Caller: call(agent_id, prompt, max_tokens)
        Caller->>Agent: invoke
        Agent-->>Caller: AgentResponse (input/output tokens)
        Caller-->>Protocol: AgentResponse
        Protocol->>Tracker: record(input_tokens, output_tokens)
        Protocol->>Protocol: append MeetingContribution
    end

    alt budget available for synthesis/summary
        Protocol->>Caller: call(leader/synthesizer, synthesis_prompt, max_tokens)
        Caller-->>Protocol: synthesis AgentResponse
        Protocol->>Tracker: record(...)
    else budget exhausted
        Protocol-->>Orchestrator: raise MeetingBudgetExhaustedError
    end

    Protocol-->>Orchestrator: MeetingMinutes
    Orchestrator->>Orchestrator: _record_success() / _make_failure_record()
    alt auto_create_tasks enabled
        Orchestrator->>Orchestrator: _create_tasks(action_items)
    end
    Orchestrator-->>Client: MeetingRecord
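
The `run(...)` call in the diagram relies on a runtime-checkable protocol interface. The sketch below shows how such an interface can be declared and satisfied structurally. Only the argument order is taken from the diagram; the parameter types, the `AgentCaller` shape, and the `EchoRoundRobin` class are assumptions made for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Protocol, runtime_checkable

# Assumed shape: (agent_id, prompt, max_tokens) -> awaitable reply text.
AgentCaller = Callable[[str, str, int], Awaitable[str]]


@runtime_checkable
class MeetingProtocolSketch(Protocol):
    """Hypothetical interface mirroring the run() call in the diagram."""

    async def run(
        self,
        meeting_id: str,
        agenda: str,
        leader_id: str,
        participant_ids: tuple[str, ...],
        agent_caller: AgentCaller,
        budget: int,
    ) -> list[str]: ...


class EchoRoundRobin:
    """Toy sequential protocol; it does not inherit from the Protocol class."""

    async def run(self, meeting_id, agenda, leader_id, participant_ids, agent_caller, budget):
        share = budget // len(participant_ids)
        # Sequential turns: each participant is awaited before the next starts.
        return [await agent_caller(pid, agenda, share) for pid in participant_ids]


async def echo_caller(agent_id: str, prompt: str, max_tokens: int) -> str:
    return f"{agent_id}: {prompt}"


protocol = EchoRoundRobin()
minutes = asyncio.run(
    protocol.run("m-1", "Pick an API style", "lead", ("a", "b"), echo_caller, 100)
)
```

Note that `isinstance` against a runtime-checkable protocol only verifies that a `run` attribute exists, not that its signature matches, so signature conformance still depends on static type checking.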

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

feat: implement meeting protocol system (#123) #167: Implements the same meeting-protocol subsystem (same modules/classes and meeting event constants); direct code-level overlap.
feat: implement communication foundation — message bus, dispatcher, and messenger #157: Related changes to communication package exports and observability event modules; overlaps public re-export surface and event definitions.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.83% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title 'feat: implement meeting protocol system (`#123`)' directly summarizes the main objective: implementing a meeting protocol system. It is concise, clear, and specific to the primary change.
Description check	✅ Passed	The PR description thoroughly explains the implementation, covering all three protocols, the MeetingOrchestrator, domain models, utilities, error hierarchy, config changes, and documentation updates. It clearly relates to the changeset.
Linked Issues check	✅ Passed	The PR fully implements the meeting protocol system per issue `#123`: three concrete protocols (round-robin, position papers, structured phases) with MeetingProtocol interface, MeetingOrchestrator for lifecycle management, token budget enforcement, action items → tasks conversion, domain models, error hierarchy, event constants, and comprehensive test coverage (>80%).
Out of Scope Changes check	✅ Passed	All changes align with the meeting protocol system implementation. Documentation updates to CLAUDE.md and DESIGN_SPEC.md are directly related. The addition of MEETING_CONTRIBUTION to MessageType enum is a necessary extension to support the new meeting system. No unrelated changes detected.


gemini-code-assist · 2026-03-08T09:03:34Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a comprehensive meeting protocol system, enabling structured multi-agent conversations within the AI company framework. It provides three distinct protocols—round-robin, position papers, and structured phases—managed by a central orchestrator. The system ensures robust lifecycle management, budget enforcement, and detailed audit trails, significantly enhancing the platform's ability to facilitate complex agent interactions and decision-making.

Highlights

Meeting Protocol Implementation: Implemented three distinct meeting protocols: round-robin, position papers, and structured phases, all adhering to a unified MeetingProtocol interface.
Meeting Orchestrator: Introduced a MeetingOrchestrator to manage the full meeting lifecycle, including input validation, protocol selection, execution, task creation from action items, and audit trail recording.
New Domain Models: Defined several new frozen Pydantic v2 domain models, such as MeetingAgenda, MeetingMinutes, MeetingContribution, ActionItem, and MeetingRecord, to represent meeting-related data.
Shared Utilities and Error Handling: Added shared utilities like TokenTracker for mutable budget tracking and build_agenda_prompt for DRY prompt construction. A new MeetingError hierarchy was also established for robust error management.
Parallel Execution and Budget Enforcement: Enabled parallel execution for position papers and structured phases protocols using asyncio.TaskGroup, coupled with pre-divided budgets for agents and MeetingBudgetExhaustedError for strict token budget enforcement.
Observability and Configuration: Integrated 20 new structured event constants for meeting activities into observability/events/meeting.py and introduced MeetingProtocolConfig for flexible, per-protocol settings, including a top-level auto_create_tasks option.
Pre-PR Review Enhancements: Incorporated 38 findings from a pre-PR review by 9 agents, leading to key fixes such as extracting shared modules, resolving non-deterministic parallel ordering, implementing budget pre-division, refactoring large methods, enhancing validation logging, and strengthening test coverage.

Changelog

CLAUDE.md
- Updated the description for the communication/ directory to include 'meeting protocol'.
- Revised logging guidelines to reference new MEETING_STARTED event.
- Expanded the package structure documentation to include the new communication/meeting/ and observability/events/meeting.py modules.
DESIGN_SPEC.md
- Updated the 'Current state (M4 in-progress)' section to reflect the implementation of meeting protocols.
- Marked the 'Meetings' section as 'M4 complete' for all three protocols.
- Adjusted the structured_phases configuration example to move auto_create_tasks to a top-level setting.
- Added detailed package structure for the new communication/meeting/ and observability/events/meeting.py modules.
src/ai_company/communication/__init__.py
- Imported numerous new classes and enums from the ai_company.communication.meeting package.
- Added these new imports to the __all__ export list for public access.
src/ai_company/communication/config.py
- Imported MeetingProtocolConfig.
- Added a protocol_config field to MeetingTypeConfig.
src/ai_company/communication/enums.py
- Added MEETING_CONTRIBUTION to the MessageType enum.
src/ai_company/communication/meeting/__init__.py
- Created the meeting package, defining its public interfaces and documentation.
src/ai_company/communication/meeting/_prompts.py
- Added shared prompt builders for meeting protocols, including build_agenda_prompt.
src/ai_company/communication/meeting/_token_tracker.py
- Implemented a TokenTracker class for mutable token budget tracking within meeting executions.
src/ai_company/communication/meeting/config.py
- Defined configuration models for RoundRobinConfig, PositionPapersConfig, StructuredPhasesConfig, and the top-level MeetingProtocolConfig.
src/ai_company/communication/meeting/enums.py
- Defined new enumerations for MeetingProtocolType, MeetingPhase, and MeetingStatus.
src/ai_company/communication/meeting/errors.py
- Established a hierarchy of meeting-specific exceptions, all inheriting from CommunicationError.
src/ai_company/communication/meeting/models.py
- Defined Pydantic models for AgentResponse, MeetingAgendaItem, MeetingAgenda, MeetingContribution, ActionItem, MeetingMinutes, and MeetingRecord.
src/ai_company/communication/meeting/orchestrator.py
- Implemented the MeetingOrchestrator class to manage the full meeting lifecycle, including protocol selection, execution, and audit trail recording.
src/ai_company/communication/meeting/position_papers.py
- Implemented the PositionPapersProtocol for parallel input gathering and synthesis in meetings.
src/ai_company/communication/meeting/protocol.py
- Defined the MeetingProtocol interface and AgentCaller, TaskCreator type aliases for meeting protocol implementations.
src/ai_company/communication/meeting/round_robin.py
- Implemented the RoundRobinProtocol for sequential turns with full transcript context in meetings.
src/ai_company/communication/meeting/structured_phases.py
- Implemented the StructuredPhasesProtocol for phased input gathering, conditional discussion, and synthesis in meetings.
src/ai_company/observability/events/meeting.py
- Defined 20 structured event constants for meeting lifecycle and phase tracking.
tests/integration/communication/test_meeting_integration.py
- Added integration tests for the meeting protocol system, covering full lifecycle, protocol switching, budget enforcement, and error recovery.
tests/unit/communication/meeting/conftest.py
- Added Pytest fixtures and factories to support meeting protocol unit tests.
tests/unit/communication/meeting/test_config.py
- Added unit tests for meeting protocol configuration models.
tests/unit/communication/meeting/test_enums.py
- Added unit tests for meeting protocol enumerations.
tests/unit/communication/meeting/test_errors.py
- Added unit tests for meeting protocol error hierarchy.
tests/unit/communication/meeting/test_models.py
- Added unit tests for meeting protocol domain models.
tests/unit/communication/meeting/test_orchestrator.py
- Added unit tests for the meeting orchestrator.
tests/unit/communication/meeting/test_position_papers.py
- Added unit tests for the position papers protocol.
tests/unit/communication/meeting/test_protocol.py
- Added unit tests for the meeting protocol interface.
tests/unit/communication/meeting/test_round_robin.py
- Added unit tests for the round-robin protocol.
tests/unit/communication/meeting/test_structured_phases.py
- Added unit tests for the structured phases protocol.
tests/unit/communication/test_enums.py
- Updated MessageType enum test to reflect the addition of MEETING_CONTRIBUTION.
tests/unit/observability/test_events.py
- Updated event discovery test to include the new meeting module.
- Added a new test to verify the meeting event constants.

Activity

The pull request author, Aureliolo, implemented the meeting protocol system.
A comprehensive pre-PR review was conducted by 9 agents, resulting in 38 findings being addressed.
Key fixes from the pre-PR review included extracting shared _TokenTracker and _prompts modules to eliminate duplication, fixing non-deterministic parallel ordering in position papers and structured phases, adding budget pre-division for parallel agents to prevent race conditions, splitting large run() methods into focused helpers, adding validation logging and ExceptionGroup handling, improving task creator resilience, fixing TaskCreator type alias, adding participant uniqueness and leader exclusion validators on MeetingMinutes, and strengthening test coverage for budget exhaustion, task creation, and protocol interfaces.
All 3394 tests passed successfully, maintaining a 96.22% coverage (above the 80% minimum).
Ruff lint and mypy strict checks passed cleanly across all 379 files.
All pre-commit hooks passed, ensuring code quality and consistency.


Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a comprehensive and well-designed meeting protocol system. It includes three distinct protocols (round-robin, position papers, and structured phases), a robust orchestrator for managing the meeting lifecycle, and a full suite of supporting models, configurations, error types, and tests. The implementation demonstrates strong adherence to best practices, especially in its use of dependency injection, asynchronous patterns with asyncio.TaskGroup, and resilient error handling. The code is well-structured and thoroughly tested. I have one suggestion to improve the robustness of the conflict detection mechanism in the structured phases protocol by using a more deterministic method than string parsing.

Note: Security Review did not run due to the size of the PR.

gemini-code-assist · 2026-03-08T09:05:34Z

src/ai_company/communication/meeting/structured_phases.py

+        contributions.append(conflict_contribution)
+        turn_number += 1
+
+        conflicts_detected = "CONFLICTS: YES" in conflict_response.content.upper()


Parsing the LLM's free-text response to detect conflicts is a bit brittle. A minor change in the model's output format (e.g., extra whitespace, different phrasing) could break this logic. For more robust conflict detection, consider one of these approaches:

Structured JSON Output: Instruct the LLM to respond with a JSON object, which can be parsed reliably.
{ "conflicts_detected": true, "analysis": "Disagreement on API style." }

Tool Calling: Define a record_conflict_decision tool that the LLM must call. The tool would take a boolean conflicts_detected argument, ensuring a deterministic signal.

This change would improve the robustness of the conflict detection mechanism.
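
A minimal sketch of the structured-JSON approach, assuming the model is instructed to reply with an object shaped like the example above. The function name `parse_conflict_verdict` and the strict error behaviour are illustrative choices, not part of the PR.

```python
import json


def parse_conflict_verdict(raw: str) -> bool:
    """Return the conflicts_detected flag from a JSON reply.

    Raises ValueError when the reply is not a JSON object carrying a
    boolean flag, so the caller can decide how to fall back.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("conflict reply is not valid JSON") from exc
    if not isinstance(payload, dict):
        raise ValueError("conflict reply must be a JSON object")
    flag = payload.get("conflicts_detected")
    # Reject "yes", 1, null, etc. Only a real JSON boolean counts.
    if not isinstance(flag, bool):
        raise ValueError("conflicts_detected must be a JSON boolean")
    return flag


reply = '{"conflicts_detected": true, "analysis": "Disagreement on API style."}'
conflicts_detected = parse_conflict_verdict(reply)
```

Failing loudly on a malformed reply keeps a formatting slip from silently being read as "no conflicts", which is the main weakness of the substring check.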

Copilot

Pull request overview

Implements the meeting protocol subsystem described in DESIGN_SPEC §5.7, adding pluggable meeting execution strategies plus orchestration, configuration, observability events, and comprehensive unit/integration tests.

Changes:

Added ai_company.communication.meeting package with 3 protocol implementations, shared prompt/token utilities, and a MeetingOrchestrator.
Introduced meeting domain models/config/enums/errors and wired meeting settings into MeetingTypeConfig plus communication package re-exports.
Added meeting observability event constants and expanded unit/integration test coverage for protocols/orchestrator.

Reviewed changes

Copilot reviewed 31 out of 32 changed files in this pull request and generated 8 comments.


File	Description
tests/unit/observability/test_events.py	Registers and validates meeting event constants in the observability event suite.
tests/unit/communication/test_enums.py	Updates message-type enum member count for new meeting message type.
tests/unit/communication/meeting/conftest.py	Adds meeting test fixtures and a mock `AgentCaller`.
tests/unit/communication/meeting/test_config.py	Unit tests for meeting protocol config models.
tests/unit/communication/meeting/test_enums.py	Unit tests for meeting protocol enums (type/phase/status).
tests/unit/communication/meeting/test_errors.py	Unit tests for meeting error hierarchy + context behavior.
tests/unit/communication/meeting/test_models.py	Unit tests for meeting domain Pydantic models/validators.
tests/unit/communication/meeting/test_orchestrator.py	Unit tests for orchestrator validation, execution, audit trail, and error handling.
tests/unit/communication/meeting/test_position_papers.py	Unit tests for parallel position-papers protocol behavior.
tests/unit/communication/meeting/test_protocol.py	Unit tests verifying the runtime-checkable `MeetingProtocol` interface and type aliases.
tests/unit/communication/meeting/test_round_robin.py	Unit tests for round-robin protocol behavior and budget handling.
tests/unit/communication/meeting/test_structured_phases.py	Unit tests for structured-phases protocol behavior and conflict gating.
tests/unit/communication/meeting/__init__.py	Marks meeting unit test package.
tests/integration/communication/test_meeting_integration.py	End-to-end integration tests for protocol switching, budgets, and communication re-exports.
src/ai_company/observability/events/meeting.py	Adds meeting lifecycle/phase/task/token event constants.
src/ai_company/communication/meeting/_prompts.py	Shared agenda prompt builder for meeting protocols.
src/ai_company/communication/meeting/_token_tracker.py	Shared mutable token budget tracker used by protocol implementations.
src/ai_company/communication/meeting/config.py	Adds per-protocol config models + top-level `MeetingProtocolConfig`.
src/ai_company/communication/meeting/enums.py	Adds meeting protocol/phase/status enums.
src/ai_company/communication/meeting/errors.py	Adds meeting-specific error hierarchy rooted at `CommunicationError`.
src/ai_company/communication/meeting/models.py	Adds frozen Pydantic domain models for agendas, minutes, records, action items, etc.
src/ai_company/communication/meeting/orchestrator.py	Adds orchestrator to validate inputs, pick protocol, execute meeting, create tasks, and record audit trail.
src/ai_company/communication/meeting/position_papers.py	Implements async position papers protocol using `asyncio.TaskGroup`.
src/ai_company/communication/meeting/protocol.py	Defines `MeetingProtocol` runtime-checkable protocol and `AgentCaller`/`TaskCreator` type aliases.
src/ai_company/communication/meeting/round_robin.py	Implements round-robin protocol with transcript context, summary reserve, and budget handling.
src/ai_company/communication/meeting/structured_phases.py	Implements structured-phases protocol with parallel input gathering and conditional discussion.
src/ai_company/communication/meeting/__init__.py	Re-exports meeting subsystem public API.
src/ai_company/communication/enums.py	Adds `MessageType.MEETING_CONTRIBUTION`.
src/ai_company/communication/config.py	Adds `protocol_config: MeetingProtocolConfig` to `MeetingTypeConfig`.
src/ai_company/communication/__init__.py	Re-exports meeting types from top-level communication package.
DESIGN_SPEC.md	Updates spec status and package tree to include meeting subsystem and config shape.
CLAUDE.md	Updates package structure and logging event-constant guidance to include meetings.


Copilot · 2026-03-08T09:09:41Z

tests/integration/communication/test_meeting_integration.py

+"""Integration tests for the meeting protocol system.
+
+Tests full meeting lifecycle with protocol switching, budget
+enforcement, and orchestrator coordination.
+"""
+
+import pytest
+
+from ai_company.communication.meeting.config import (
+    MeetingProtocolConfig,


Existing integration tests typically include pytestmark = pytest.mark.timeout(30) (e.g. tests/integration/communication/test_delegation_integration.py:39) to avoid hangs. This new integration test module doesn’t set a timeout marker, so consider adding one for consistency and to keep CI behavior uniform.

Copilot · 2026-03-08T09:09:41Z

tests/unit/communication/test_enums.py

    def test_member_count(self) -> None:
-        assert len(MessageType) == 8
+        assert len(MessageType) == 9

    def test_values(self) -> None:
        assert MessageType.TASK_UPDATE.value == "task_update"


len(MessageType) was updated to 9, but test_values() still only asserts the original 8 enum values. Add an assertion for MessageType.MEETING_CONTRIBUTION.value == "meeting_contribution" so this test actually validates the newly added member.
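
The requested assertion might look like the sketch below. The `MessageType` class here is a two-member stand-in so the snippet runs on its own; the real enum has nine members, and its base class is not shown in this thread.

```python
from enum import Enum


class MessageType(Enum):
    """Stand-in with two members only; the real enum has nine."""

    TASK_UPDATE = "task_update"
    MEETING_CONTRIBUTION = "meeting_contribution"


def test_values() -> None:
    assert MessageType.TASK_UPDATE.value == "task_update"
    # New assertion covering the member added in this PR.
    assert MessageType.MEETING_CONTRIBUTION.value == "meeting_contribution"


test_values()
```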

Copilot · 2026-03-08T09:09:41Z

src/ai_company/communication/meeting/orchestrator.py

+        if status == MeetingStatus.BUDGET_EXHAUSTED:
+            logger.warning(
+                MEETING_FAILED,
+                meeting_id=meeting_id,
+                status=status,
+                error=error_msg,
+            )


When the meeting ends due to budget exhaustion, _make_failure_record() logs the lifecycle event as MEETING_FAILED. Since MEETING_BUDGET_EXHAUSTED exists as a distinct lifecycle event, log that event for MeetingStatus.BUDGET_EXHAUSTED (and reserve MEETING_FAILED for true failures) so observability can distinguish the two outcomes reliably.
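
One way to implement this is to pick the event constant from the status before logging. In the sketch below the event-name strings, the reduced `MeetingStatus` enum, and the `failure_event` helper are hypothetical; the real constants live in `observability/events/meeting.py`.

```python
from enum import Enum

# Hypothetical event-name values, used here only for illustration.
MEETING_FAILED = "meeting.failed"
MEETING_BUDGET_EXHAUSTED = "meeting.budget_exhausted"


class MeetingStatus(Enum):
    """Reduced stand-in for the PR's status enum."""

    FAILED = "failed"
    BUDGET_EXHAUSTED = "budget_exhausted"


def failure_event(status: MeetingStatus) -> str:
    """Choose the lifecycle event that matches a non-success status."""
    if status is MeetingStatus.BUDGET_EXHAUSTED:
        return MEETING_BUDGET_EXHAUSTED
    return MEETING_FAILED


event = failure_event(MeetingStatus.BUDGET_EXHAUSTED)
```

`_make_failure_record()` could then pass the selected event to a single `logger.warning(...)` call instead of hard-coding `MEETING_FAILED`.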

Copilot · 2026-03-08T09:09:42Z

src/ai_company/communication/meeting/orchestrator.py

+        if not participant_ids:
+            logger.warning(
+                MEETING_VALIDATION_FAILED,
+                meeting_id=meeting_id,
+                error="at least one participant is required",
+            )
+            msg = "At least one participant is required"


_validate_inputs() doesn’t reject duplicate entries in participant_ids. If duplicates are passed, protocols will perform duplicate agent calls and MeetingMinutes validation will later fail (unique participant_ids), turning an input problem into a runtime FAILED record. Validate uniqueness here (and raise MeetingParticipantError) to keep failures deterministic and consistent with the method’s “validation errors are raised directly” contract.
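
A sketch of the uniqueness check, assuming participant IDs are plain strings. `MeetingParticipantError` is redefined here as a stand-in (the PR's class inherits from `CommunicationError`), and `validate_participants` is a hypothetical helper rather than the actual `_validate_inputs()` body.

```python
from collections import Counter
from collections.abc import Sequence


class MeetingParticipantError(ValueError):
    """Stand-in for the PR's error class of the same name."""


def validate_participants(participant_ids: Sequence[str]) -> None:
    """Reject empty or duplicated participant lists before any agent call."""
    if not participant_ids:
        raise MeetingParticipantError("At least one participant is required")
    counts = Counter(participant_ids)
    duplicates = sorted(pid for pid, n in counts.items() if n > 1)
    if duplicates:
        raise MeetingParticipantError(
            "Duplicate participant IDs: " + ", ".join(duplicates)
        )


validate_participants(["alice", "bob"])  # passes silently
```

Running this check up front means a duplicate ID is rejected before any tokens are spent, instead of surfacing later as a `MeetingMinutes` validation failure.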

Copilot · 2026-03-08T09:09:42Z

src/ai_company/communication/meeting/config.py

+        skip_discussion_if_no_conflicts: Skip discussion when no
+            conflicts are detected.
+        max_discussion_tokens: Token budget for the discussion round.
+        auto_create_tasks: Whether to auto-create tasks from action items.


StructuredPhasesConfig’s docstring lists auto_create_tasks, but the model doesn’t define that field (it lives on MeetingProtocolConfig). Update the docstring to avoid implying a per-protocol flag that doesn’t exist.

Suggested change

auto_create_tasks: Whether to auto-create tasks from action items.

Copilot · 2026-03-08T09:09:42Z

DESIGN_SPEC.md

 Meetings (§5.1 Pattern 3) follow configurable protocols that determine how agents interact during structured multi-agent conversations. Different meeting types naturally suit different protocols. All protocols implement a `MeetingProtocol` protocol, making the system extensible — new protocols can be registered and selected per meeting type. Cost bounds are enforced by `duration_tokens` in meeting config (§5.4).

-> **MVP: Not in M3.** Meetings are an M4 feature. Round-Robin (Protocol 1) is the initial default.
+> **Current state (M4 complete):** All 3 meeting protocols are implemented in `communication/meeting/`: `RoundRobinProtocol`, `PositionPapersProtocol`, and `StructuredPhasesProtocol`. The `MeetingOrchestrator` runs meetings end-to-end with token budget enforcement via `TokenBudgetTracker`. All protocols implement the `MeetingProtocol` protocol interface.


This section says budget enforcement is done via TokenBudgetTracker, but the implementation added in communication/meeting/_token_tracker.py defines TokenTracker. Update the name here to match the actual type to keep the spec accurate/searchable.

Suggested change

> **Current state (M4 complete):** All 3 meeting protocols are implemented in `communication/meeting/`: `RoundRobinProtocol`, `PositionPapersProtocol`, and `StructuredPhasesProtocol`. The `MeetingOrchestrator` runs meetings end-to-end with token budget enforcement via `TokenBudgetTracker`. All protocols implement the `MeetingProtocol` protocol interface.

> **Current state (M4 complete):** All 3 meeting protocols are implemented in `communication/meeting/`: `RoundRobinProtocol`, `PositionPapersProtocol`, and `StructuredPhasesProtocol`. The `MeetingOrchestrator` runs meetings end-to-end with token budget enforcement via `TokenTracker`. All protocols implement the `MeetingProtocol` protocol interface.

Copilot · 2026-03-08T09:09:42Z

DESIGN_SPEC.md

+│       │   ├── meeting/             # Meeting protocol subsystem
+│       │   │   ├── __init__.py    # Package exports
+│       │   │   ├── _prompts.py    # LLM prompt templates for meeting phases
+│       │   │   ├── _token_tracker.py # TokenBudgetTracker for duration_tokens enforcement


The package tree comment for _token_tracker.py calls it TokenBudgetTracker, but the module defines TokenTracker. Please align the spec’s naming with the code so readers can find the right symbol quickly.

Suggested change

│ │ │ ├── _token_tracker.py # TokenBudgetTracker for duration_tokens enforcement

│ │ │ ├── _token_tracker.py # TokenTracker for duration_tokens enforcement

Copilot · 2026-03-08T09:09:43Z

tests/unit/communication/meeting/test_round_robin.py

+"""Tests for round-robin meeting protocol."""
+
+import pytest
+
+from ai_company.communication.meeting.config import RoundRobinConfig


Most existing tests/unit/communication/** modules set pytestmark = pytest.mark.timeout(30) (e.g. tests/unit/communication/test_enums.py:14). This new meeting test module doesn’t, so it won’t be covered by the same hang-protection. Consider adding the same module-level timeout marker here (and across the new meeting test files) for consistency.

greptile-apps · 2026-03-08T09:13:31Z

Greptile Summary

This PR implements the full §5.7 meeting protocol system: three strategies (RoundRobinProtocol, PositionPapersProtocol, StructuredPhasesProtocol) behind a MeetingProtocol interface, a MeetingOrchestrator lifecycle manager, frozen Pydantic v2 domain models, a shared TokenTracker, 20 observability event constants, and a structured error hierarchy. The pre-review addressed the most critical findings (synthesis budget reserves, ExceptionGroup handling, event-constant corrections, validation logging). Three new issues remain:

decisions and action_items are never populated by any protocol — all three protocols place the synthesizer's free-text output in summary only. MeetingMinutes.decisions and MeetingMinutes.action_items always default to empty tuples, making the auto_create_tasks feature silently non-functional regardless of configuration. A parsing step is needed (or the feature should be clearly documented as caller-responsibility).
Discussion phase can still exhaust the synthesis reserve in structured_phases.py — the 20% _SYNTHESIS_RESERVE_FRACTION applied before input gathering is shared across the conflict-check agent, the discussion round, and synthesis. When discussion consumes the remaining reserve, _run_synthesis receives an exhausted tracker and raises MeetingBudgetExhaustedError. A nested sub-reserve for synthesis inside _run_discussion_round (mirroring the input-gathering fix) is needed.
MEETING_CONFLICT_DETECTED event emitted twice per conflict check — once at INFO (without raw_response) and once at DEBUG (with raw_response) using the same event constant string. This produces duplicate entries in log aggregators and inflates event-count dashboards.

Confidence Score: 3/5

Safe to merge for infrastructure, but the auto_create_tasks feature is silently broken and the synthesis budget exhaustion in structured phases remains a real runtime failure path.
The core protocol machinery, token tracking, error hierarchy, and observability plumbing are well-implemented and thoroughly tested. However, two logic issues affect runtime correctness: MeetingMinutes.action_items is never populated by any protocol (making auto_create_tasks inoperative and leaving decisions as a dead field), and the discussion phase in StructuredPhasesProtocol can still consume the synthesis reserve and trigger MeetingBudgetExhaustedError under realistic token budgets — the pre-review fix only protects synthesis from input-gathering exhaustion, not from discussion exhaustion. The duplicate event logging is a minor observability concern.
src/ai_company/communication/meeting/structured_phases.py (synthesis budget exhaustion + duplicate event logging) and src/ai_company/communication/meeting/models.py / all three protocol files (unpopulated action_items / decisions).

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/ai_company/communication/meeting/orchestrator.py | Lifecycle manager for meetings; correctly differentiates BUDGET_EXHAUSTED vs FAILED records, handles ExceptionGroup from parallel TaskGroup, validates inputs with proper logging. The `_create_tasks` method is silently inoperative because no protocol ever populates `MeetingMinutes.action_items`. |
| src/ai_company/communication/meeting/structured_phases.py | Three-phase protocol with parallel input gathering and conditional discussion. Two issues: (1) duplicate `MEETING_CONFLICT_DETECTED` event at both INFO and DEBUG per conflict check, and (2) discussion phase can fully consume the 20% synthesis reserve, leaving `_run_synthesis` with zero remaining tokens and raising `MeetingBudgetExhaustedError` despite the reserve fix from a prior review. |
| src/ai_company/communication/meeting/models.py | Well-structured frozen Pydantic v2 models with solid validators (timing, participant uniqueness, leader exclusion, status consistency). `MeetingMinutes.decisions` and `MeetingMinutes.action_items` are declared but never populated by any protocol implementation, making them dead fields. |
| src/ai_company/communication/meeting/position_papers.py | Parallel position-paper collection with deterministic slot-indexed ordering and 20% synthesis reserve. Clean separation of concerns with `_collect_position_papers` and `_run_synthesis`. No new issues; synthesis reserve is correctly applied before dividing the parallel budget. |
| src/ai_company/communication/meeting/round_robin.py | Sequential turn-taking with 20% summary reserve, per-turn token-available calculation, max_turns_per_agent and max_total_turns caps, and explicit budget-exhaustion warning. Logic is sound and no new issues found. |
| src/ai_company/communication/meeting/_token_tracker.py | Simple, correct token budget tracker with clear concurrency note in the module docstring. Validates non-negative inputs and positive budget. No issues. |
| src/ai_company/communication/meeting/protocol.py | Clean protocol interface definitions with `runtime_checkable` protocols for `MeetingProtocol` and `ConflictDetector`. `TaskCreator` type alias correctly uses `Priority` enum. No issues. |
| src/ai_company/observability/events/meeting.py | Twenty well-named event constants organized into logical groups. All constants are typed as `Final[str]`. No issues. |
| tests/integration/communication/test_meeting_integration.py | Good coverage of lifecycle, protocol switching, budget exhaustion, and task creation paths. No test verifies that `action_items` are ever non-empty or that `auto_create_tasks` actually creates tasks, which is consistent with the implementation gap noted in `models.py`. |
| src/ai_company/communication/meeting/errors.py | Clean four-level error hierarchy under `CommunicationError`. All exceptions are distinct and well-named. No issues. |

Sequence Diagram

sequenceDiagram
    participant Caller
    participant Orchestrator as MeetingOrchestrator
    participant Registry as ProtocolRegistry
    participant Protocol as MeetingProtocol
    participant Tracker as TokenTracker
    participant Agent as AgentCaller

    Caller->>Orchestrator: run_meeting(meeting_type_name, protocol_config, agenda, leader_id, participant_ids, token_budget)
    Orchestrator->>Orchestrator: _validate_inputs()
    Orchestrator->>Registry: _resolve_protocol(protocol_type)
    Registry-->>Orchestrator: protocol instance
    Orchestrator->>Orchestrator: log MEETING_STARTED
    Orchestrator->>Protocol: _execute_protocol() → protocol.run()

    alt RoundRobinProtocol
        Protocol->>Tracker: TokenTracker(budget=token_budget)
        Note over Protocol: Reserve 20% for summary
        loop max_turns_per_agent × participants
            Protocol->>Agent: agent_caller(participant_id, turn_prompt, tokens_available)
            Agent-->>Protocol: AgentResponse
            Protocol->>Tracker: record(input_tokens, output_tokens)
        end
        Protocol->>Agent: agent_caller(leader_id, summary_prompt, remaining)
        Agent-->>Protocol: summary AgentResponse
    else PositionPapersProtocol
        Protocol->>Tracker: TokenTracker(budget=token_budget)
        Note over Protocol: Reserve 20% for synthesis
        par parallel position papers
            Protocol->>Agent: agent_caller(participant_1, position_prompt, slice)
            Protocol->>Agent: agent_caller(participant_2, position_prompt, slice)
            Protocol->>Agent: agent_caller(participant_N, position_prompt, slice)
        end
        Protocol->>Agent: agent_caller(synthesizer_id, synthesis_prompt, remaining)
        Agent-->>Protocol: synthesis AgentResponse
    else StructuredPhasesProtocol
        Protocol->>Tracker: TokenTracker(budget=token_budget)
        Note over Protocol: Phase 1: Agenda broadcast (no LLM)
        Note over Protocol: Reserve 20% for later phases
        par Phase 2: parallel input gathering
            Protocol->>Agent: agent_caller(participant_1, input_prompt, slice)
            Protocol->>Agent: agent_caller(participant_N, input_prompt, slice)
        end
        Protocol->>Agent: agent_caller(leader_id, conflict_check_prompt, remaining)
        Agent-->>Protocol: conflict AgentResponse
        alt conflicts detected
            loop participants
                Protocol->>Agent: agent_caller(pid, discussion_prompt, slice)
            end
        end
        Protocol->>Agent: agent_caller(leader_id, synthesis_prompt, remaining)
        Agent-->>Protocol: synthesis AgentResponse
    end

    Protocol-->>Orchestrator: MeetingMinutes
    Orchestrator->>Orchestrator: _create_tasks(action_items)
    Orchestrator->>Orchestrator: _record_success() → log MEETING_COMPLETED
    Orchestrator-->>Caller: MeetingRecord(status=COMPLETED)

Comments Outside Diff (3)

src/ai_company/communication/meeting/models.py, line 839-867 (link)

decisions and action_items fields are never populated — auto_create_tasks is silently inoperative

All three protocol implementations (RoundRobinProtocol, PositionPapersProtocol, StructuredPhasesProtocol) construct MeetingMinutes without ever passing decisions or action_items, so both fields always default to empty tuples ().

This means MeetingOrchestrator._create_tasks() always short-circuits on the not minutes.action_items guard:
```
# orchestrator.py _create_tasks — this early-return always fires
if (
    self._task_creator is None
    or not protocol_config.auto_create_tasks
    or not minutes.action_items          # ← always True; protocols never populate this
):
    return
```
The synthesis prompts in all three protocols do ask agents to produce "action items with assignees", but the free-text summary field captures the raw LLM output — no parsing step extracts structured ActionItem objects from it.

Concrete evidence:
- round_robin.py MeetingMinutes(... summary=summary ...) — no action_items kwarg
- position_papers.py MeetingMinutes(... summary=synthesis_contribution.content ...) — no action_items kwarg
- structured_phases.py MeetingMinutes(... summary=summary ...) — no action_items kwarg
Setting auto_create_tasks=True in MeetingProtocolConfig (the default) will never actually create any tasks. A parsing layer that extracts ActionItem objects from the synthesis response is needed, or the docstring for auto_create_tasks should be updated to reflect that the feature requires caller-side population of action_items.
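As a rough illustration of what such a parsing layer might look like: the sketch below assumes the synthesis prompt is changed to emit one line per action item in the form `- ACTION [assignee]: description`. That line format, the `ParsedActionItem` dataclass, and `parse_action_items` are all hypothetical; the PR's real `ActionItem` model and prompt output are not shown here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedActionItem:
    """Hypothetical stand-in for the PR's ActionItem model."""

    assignee_id: str
    description: str


# Assumed line format: "- ACTION [assignee]: description" (bullet may be - or *).
_ACTION_LINE = re.compile(
    r"^\s*[-*]\s*ACTION\s*\[(?P<assignee>[^\]]+)\]\s*:\s*(?P<desc>.+?)\s*$"
)


def parse_action_items(summary: str) -> tuple[ParsedActionItem, ...]:
    """Extract structured action items from free-text synthesis output."""
    items = []
    for line in summary.splitlines():
        match = _ACTION_LINE.match(line)
        if match:
            items.append(
                ParsedActionItem(match["assignee"].strip(), match["desc"])
            )
    # Lines that do not match the assumed format are ignored, not errors.
    return tuple(items)
```

A protocol could then pass the parsed tuple as `action_items` when constructing `MeetingMinutes`, while still keeping the raw text in `summary`.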
src/ai_company/communication/meeting/structured_phases.py, line 2785-2795 (link)

Same MEETING_CONFLICT_DETECTED event constant emitted twice per conflict check

The same event constant is logged once at INFO and once at DEBUG for every conflict-check response:
```
logger.info(
    MEETING_CONFLICT_DETECTED,
    meeting_id=meeting_id,
    conflicts_found=conflicts_detected,   # subset of fields
)
logger.debug(
    MEETING_CONFLICT_DETECTED,           # same constant
    meeting_id=meeting_id,
    conflicts_found=conflicts_detected,
    raw_response=conflict_response.content,
)
```
In any log aggregator that groups or counts by event name, this produces two meeting.conflict.detected entries for every meeting — one without raw_response and one with. Dashboards counting conflict-detection events will double-count, and filtering for meeting.conflict.detected at DEBUG level will return both entries.

The INFO entry is already sufficient for operational visibility. The DEBUG call repeats the same conflicts_found field and adds only raw_response, so whenever DEBUG is enabled every conflict check is recorded twice under one event name. Consider dropping the DEBUG call, or, if the raw response is wanted at DEBUG, log it under a different, more specific event (e.g., MEETING_AGENT_RESPONDED, which already exists and is fired in _collect_input).
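A sketch of the second option, under stated assumptions: `RecordingLogger` is a tiny hypothetical stand-in for the structured logger, and the string value given for `MEETING_AGENT_RESPONDED` is a guess. Only the `meeting.conflict.detected` event name comes from this review.

```python
from collections import Counter

MEETING_CONFLICT_DETECTED = "meeting.conflict.detected"
MEETING_AGENT_RESPONDED = "meeting.agent.responded"  # assumed value


class RecordingLogger:
    """Minimal stand-in logger that stores (level, event, fields) tuples."""

    def __init__(self):
        self.entries = []

    def info(self, event, **fields):
        self.entries.append(("info", event, fields))

    def debug(self, event, **fields):
        self.entries.append(("debug", event, fields))


def log_conflict_check(logger, meeting_id, conflicts_detected, raw_response):
    # One entry per conflict check under the conflict event name.
    logger.info(
        MEETING_CONFLICT_DETECTED,
        meeting_id=meeting_id,
        conflicts_found=conflicts_detected,
    )
    # Raw response goes out under a different event, so counts stay accurate.
    logger.debug(
        MEETING_AGENT_RESPONDED,
        meeting_id=meeting_id,
        raw_response=raw_response,
    )


logger = RecordingLogger()
log_conflict_check(logger, "m-1", True, "CONFLICTS: yes")
event_counts = Counter(event for _, event, _ in logger.entries)
```

Grouping by event name now yields exactly one `meeting.conflict.detected` entry per check, regardless of the configured log level.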
src/ai_company/communication/meeting/structured_phases.py, line 2854-2861 (link)

Discussion phase can exhaust the synthesis reserve, still causing MeetingBudgetExhaustedError

The _SYNTHESIS_RESERVE_FRACTION is subtracted from tracker.remaining before input gathering, protecting synthesis from input-gathering exhaustion. However, once _run_discussion is entered, the reserved portion of the budget is shared between the conflict-check agent call and the discussion round — with no nested sub-reserve carved out for synthesis.

In practice: after input gathering consumes ~80% of the total budget, the conflict-check and discussion phases together consume the remaining ~20%. Since discussion_budget = min(self._config.max_discussion_tokens, tracker.remaining) returns tracker.remaining whenever the remaining tokens are less than max_discussion_tokens, all participants in _run_discussion_round can collectively consume the entire remaining balance, leaving nothing for _run_synthesis, which then raises MeetingBudgetExhaustedError.

The same reserve pattern from _run_input_gathering (line 2639) should be applied inside _run_discussion_round before dividing the discussion budget across participants, so synthesis retains a guaranteed token allowance regardless of how much discussion consumes.
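The nested reserve could be computed along these lines. This is a sketch rather than the PR's code: `split_discussion_budget` is a hypothetical helper, and reusing the same 20% fraction for the nested reserve is an assumption.

```python
_SYNTHESIS_RESERVE_FRACTION = 0.20  # 20%, as described in this review


def split_discussion_budget(remaining, max_discussion_tokens, num_participants):
    """Return (tokens_per_participant, synthesis_reserve).

    The reserve is carved out of the remaining balance first, so the
    discussion round can never spend tokens set aside for synthesis.
    """
    synthesis_reserve = int(remaining * _SYNTHESIS_RESERVE_FRACTION)
    discussion_budget = min(max_discussion_tokens, remaining - synthesis_reserve)
    tokens_per_participant = discussion_budget // max(1, num_participants)
    return tokens_per_participant, synthesis_reserve
```

For example, with 1000 tokens remaining, a 5000-token discussion cap, and 4 participants, each participant would get 200 tokens and 200 would stay reserved for `_run_synthesis`.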

Last reviewed commit: 2aa2a1f


coderabbitai

Actionable comments posted: 27

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@DESIGN_SPEC.md`:
- Line 653: Update the spec text to use the current symbol names: replace
references to TokenBudgetTracker with TokenTracker and AgendaItem with
MeetingAgendaItem wherever they appear (including the section currently
mentioning all 3 meeting protocols and the noted repeat at lines ~2421-2425);
ensure any inline code formatting or API references match the implemented
symbols RoundRobinProtocol, PositionPapersProtocol, StructuredPhasesProtocol,
MeetingOrchestrator, and TokenTracker so readers are directed to existing APIs.

In `@src/ai_company/communication/config.py`:
- Around line 113-116: Update the Google-style docstring for the public class
MeetingTypeConfig to include the new public attribute `protocol_config` (type
MeetingProtocolConfig) and a short description such as "Meeting protocol
configuration" so generated docs and IDE hints reflect the new field; ensure the
docstring follows the existing Google-style sections (Args/Attributes) and
mentions `duration_tokens` and the new `protocol_config` entry consistently with
other attributes.

In `@src/ai_company/communication/meeting/_prompts.py`:
- Around line 23-27: The agenda item formatter in _prompts.py currently drops
MeetingAgendaItem.presenter_id when building entries from agenda.items; update
the loop that builds entry (for i, item in enumerate(agenda.items, 1)) to
include the optional presenter assignment when present (e.g., append a presenter
indicator using item.presenter_id or resolved presenter name) so presenter
ownership is preserved in the shared agenda prompt and any downstream protocol
that consumes MeetingAgendaItem can route presenter-specific flows correctly.
- Around line 3-9: The public annotation for MeetingAgenda in
build_agenda_prompt is only available under TYPE_CHECKING which causes runtime
resolution errors; replace the TYPE_CHECKING-only import with a real import of
MeetingAgenda from ai_company.communication.meeting.models at module level
(remove the conditional block or add a normal import alongside it) so that
build_agenda_prompt can be resolved at runtime by typing.get_type_hints() and
other consumers.

In `@src/ai_company/communication/meeting/_token_tracker.py`:
- Around line 6-38: TokenTracker currently mutates shared state via record(),
violating the repo convention of copy-on-write runtime state; replace in-place
mutation by making TokenTracker immutable and having record return a new
instance with updated input_tokens/output_tokens (or convert to a frozen
Pydantic model and use model_copy(update=...)). Update the TokenTracker
dataclass to be frozen (or use Pydantic BaseModel frozen) and change the record
method signature to return TokenTracker with the new totals; keep the used,
remaining, and is_exhausted properties as-is but computed from the new instance
to preserve behavior. Ensure all call sites that previously relied on in-place
updates now reassign the returned TokenTracker from record(...) to propagate
state.
- Around line 25-38: The record method currently accepts any integers and can
silently under/over-count tokens; update record(input_tokens, output_tokens) to
validate both deltas are non-negative ints, compute projected_used = self.used +
input_tokens + output_tokens and raise a ValueError (or custom exception) before
mutating if projected_used > self.budget or if either delta < 0; keep remaining
and is_exhausted as-is but rely on this check to prevent negative/overspend
states and perform the validation atomically so
self.input_tokens/self.output_tokens/self.used are only updated after the checks
pass.

In `@src/ai_company/communication/meeting/config.py`:
- Around line 61-66: Remove the stale attribute mention from
StructuredPhasesConfig's docstring: delete the "auto_create_tasks" entry under
Attributes so the docstring only documents skip_discussion_if_no_conflicts and
max_discussion_tokens; reference that the auto_create_tasks setting now lives on
MeetingProtocolConfig.auto_create_tasks so do not reintroduce it into
StructuredPhasesConfig's public docs.

In `@src/ai_company/communication/meeting/models.py`:
- Around line 265-289: The validator _validate_status_consistency should reject
blank/whitespace error messages for failed states: when self.status is
MeetingStatus.FAILED or MeetingStatus.BUDGET_EXHAUSTED, check that
self.error_message is not None and that self.error_message.strip() is non-empty
(or enforce min_length=1 on the error_message Field) and raise the existing
ValueError if the check fails so empty strings/whitespace are not accepted.
- Around line 184-218: The MeetingMinutes model currently allows independently
set total_input_tokens and total_output_tokens that can diverge from
per-contribution counts in contributions; update MeetingMinutes to enforce
consistency by deriving or validating these totals from MeetingContribution
entries: either convert total_input_tokens and total_output_tokens into computed
fields (like total_tokens is computed) or add a model validator that sums
MeetingContribution.total_input_tokens and .total_output_tokens and
raises/normalizes when they differ from the stored
total_input_tokens/total_output_tokens; refer to MeetingContribution,
MeetingMinutes, contributions, total_input_tokens, total_output_tokens, and
total_tokens when making the change.

In `@src/ai_company/communication/meeting/orchestrator.py`:
- Around line 332-378: Extend _validate_inputs to also reject blank/whitespace
IDs and duplicate participant IDs before execution: after the existing
token_budget check, validate leader_id.strip() is non-empty and that every
participant_id.strip() is non-empty (treat trimmed values as the canonical IDs),
logging MEETING_VALIDATION_FAILED and raising MeetingParticipantError with
context={"meeting_id": meeting_id, "leader_id": leader_id} (or include offending
participant id) when any are blank; then check for duplicates among the trimmed
participant_ids (e.g., by comparing length of set(trimmed_participant_ids) to
list length), log MEETING_VALIDATION_FAILED with error="duplicate
participant_ids" and raise MeetingParticipantError with context={"meeting_id":
meeting_id, "duplicates": list_of_duplicates}; keep the existing
leader-in-participants check using the trimmed canonical IDs so leader_id
present in participant_ids still fails as before.
- Around line 85-88: The constructor currently assigns the caller-owned
protocol_registry directly to self._protocol_registry which allows external
mutation; in __init__ make a defensive deep copy of the passed protocol_registry
(use copy.deepcopy) and wrap that copy in types.MappingProxyType before
assigning to self._protocol_registry so the orchestrator holds an immutable
snapshot for its lifetime; update any imports (copy and MappingProxyType) if
needed and keep the attribute name self._protocol_registry and its type
annotation list[MeetingRecord] unchanged.
- Around line 346-348: Before raising the ValueError in the token_budget <= 0
branch, emit a warning using the existing logger with context and the
MEETING_VALIDATION_FAILED event: log a message that includes the token_budget
value and MEETING_VALIDATION_FAILED (use logger.warning or self.logger.warning
consistent with surrounding code) and any other relevant identifiers, then raise
ValueError(msg); ensure MEETING_VALIDATION_FAILED is referenced/imported the
same way other validation errors are logged in this module so observability
captures this public error path.

In `@src/ai_company/communication/meeting/position_papers.py`:
- Around line 156-167: The MeetingMinutes returned for POSITION_PAPERS only sets
summary, leaving decisions and action_items empty so
MeetingOrchestrator._create_tasks() cannot generate tasks; update the return in
position_papers flow to populate MeetingMinutes.decisions and .action_items from
the synthesis contribution (e.g., use synthesis_contribution.decisions and
synthesis_contribution.action_items or parse structured fields from
synthesis_contribution.content if needed), converting them to the expected tuple
types, and keep summary as synthesis_contribution.content so tasks are created
correctly.
- Around line 203-215: The current pre-division uses tokens_per_agent = max(1,
tracker.remaining // max(1, n)) which can over-allocate when tracker.remaining <
number of parallel workers; change the logic before spawning _collect_paper
tasks to compute an effective fan-out: set n_effective = min(n,
tracker.remaining) (or if tracker.remaining == 0 raise/return early), then
compute tokens_per_agent = tracker.remaining // max(1, n_effective); if
tokens_per_agent == 0 either reduce n_effective to tracker.remaining or fail
fast; update all places that use n (e.g., the TaskGroup spawn and participant
loop that starts _collect_paper) to use n_effective so no worker receives a zero
or implicit extra token.

In `@src/ai_company/communication/meeting/round_robin.py`:
- Around line 125-126: The code always subtracts a summary reserve
(summary_reserve = int(token_budget * _SUMMARY_RESERVE_FRACTION)) from
discussion_budget even when summaries are disabled; change the logic in the
block that computes summary_reserve/discussion_budget so that summary_reserve is
set to 0 when leader_summarizes is False (and only compute int(token_budget *
_SUMMARY_RESERVE_FRACTION) when leader_summarizes is True), then set
discussion_budget = token_budget - summary_reserve; update any code paths that
use summary_reserve to assume zero when summaries are disabled (references:
summary_reserve, _SUMMARY_RESERVE_FRACTION, discussion_budget,
leader_summarizes).

In `@src/ai_company/communication/meeting/structured_phases.py`:
- Around line 388-392: The conflict detection call currently passes
tracker.remaining directly to agent_caller (conflict_response = await
agent_caller(leader_id, conflict_prompt, tracker.remaining)), which can exhaust
the whole meeting budget; change it to cap the tokens by
StructuredPhasesConfig.max_discussion_tokens (e.g., compute allowed =
min(tracker.remaining, StructuredPhasesConfig.max_discussion_tokens) and pass
allowed) and ensure the value is non-negative so phase 3 discussion/synthesis
retain their token budget; update any related variable names in the same scope
and keep the agent_caller invocation using leader_id and conflict_prompt.
- Around line 278-279: The current calculation forces tokens_per_agent = 1 even
when tracker.remaining < num_participants, causing total allocation to
oversubscribe the remaining budget; change the logic around
participant_ids/tokens_per_agent so you either (A) cap fan-out by computing
max_agents = min(len(participant_ids), tracker.remaining) and use
tokens_per_agent = tracker.remaining // max(1, max_agents) (and reduce/choose a
subset of participant_ids to max_agents), or (B) fail fast when
tracker.remaining < len(participant_ids) by raising or returning an error;
update the code that sets tokens_per_agent and any subsequent use of
participant_ids accordingly (refer to participant_ids, tokens_per_agent,
tracker.remaining).
- Around line 238-250: The MeetingMinutes being returned from the
structured-phases flow omits decisions and action_items even though the
synthesis collects them; update the MeetingMinutes constructor call in
structured_phases.py to include decisions and action_items (e.g.,
decisions=tuple(decisions) and action_items=tuple(action_items)), ensuring the
local variables (decisions, action_items) are defined or defaulted to empty
lists/tuples before the return so downstream task creation sees the populated
fields; keep protocol_type as MeetingProtocolType.STRUCTURED_PHASES and preserve
existing fields like summary.

In `@tests/integration/communication/test_meeting_integration.py`:
- Around line 121-145: The test test_round_robin_full_lifecycle currently only
asserts len(record.minutes.contributions) > 0 which doesn't verify the
RoundRobinConfig was applied; update the assertion to check that the
contributions match the expected deterministic number given
RoundRobinConfig(max_turns_per_agent=1) (i.e., two participant turns plus one
meeting summary => assert len(record.minutes.contributions) == 3) and optionally
assert the participant contributions contain the expected participant_ids
("agent-a", "agent-b") in the expected order to ensure max_turns_per_agent was
honored; locate these checks around the existing MeetingProtocolConfig /
RoundRobinConfig usage and the record.minutes.contributions assertions.

In `@tests/unit/communication/meeting/conftest.py`:
- Around line 58-79: The fake caller _caller currently ignores the max_tokens
argument so budget tests can't observe or enforce token limits; modify _caller
to record the requested max_tokens (e.g., push it into a shared list or dict
keyed by agent_id/call index) and/or clamp the returned
input_tokens/output_tokens to not exceed max_tokens before constructing the
AgentResponse; update references inside _caller (max_tokens, input_tokens,
output_tokens, call_counts, _responses, default_content, AgentResponse) so tests
can inspect the captured requested budgets or rely on the clamped token counts.

In `@tests/unit/communication/meeting/test_orchestrator.py`:
- Around line 374-402: The tests test_task_creator_not_called_when_disabled and
its paired test never exercise task creation because the mocked protocol/minutes
return no ActionItem; update the tests to use a stub protocol or mocked minutes
that include at least one ActionItem so the orchestrator path through
_create_tasks() is executed (for example, provide a custom protocol/minutes
object returned by the orchestrator run that contains an ActionItem), then
verify behavior when MeetingProtocolConfig(auto_create_tasks=False) and when
task_creator raises—this will exercise the gate and exception-handling paths;
reference the test functions test_task_creator_not_called_when_disabled, the
MeetingProtocolConfig(auto_create_tasks=False) usage, the task_creator stub
passed into _make_orchestrator, and the orchestrator.run_meeting call to locate
where to inject the stubbed minutes or protocol.
- Around line 292-310: The test test_budget_exhaustion_produces_record currently
accepts MeetingStatus.FAILED which permits regressions; change the assertion to
require the dedicated budget-exhaustion status only by asserting record.status
== MeetingStatus.BUDGET_EXHAUSTED (keep the existing check that
record.error_message is not None) so the test verifies
orchestrator.run_meeting(..., token_budget=1) produces the explicit
BUDGET_EXHAUSTED outcome.

In `@tests/unit/communication/meeting/test_position_papers.py`:
- Around line 60-118: The tests test_contributions_structure and
test_parallel_execution_all_participants currently only assert counts and set
membership which won't catch nondeterministic ordering regressions; update them
to assert a deterministic order by inspecting minutes.contributions (filtering
by MeetingPhase.POSITION_PAPER) and then check the sequence of each
contribution's turn_number or agent_id matches the expected participant_ids
order (or a sorted/explicit expected sequence), e.g., build the ordered list
from [c for c in minutes.contributions if c.phase ==
MeetingPhase.POSITION_PAPER] and assert their turn_number or agent_id sequence
equals the expected ordering to lock in the deterministic contract.

In `@tests/unit/communication/meeting/test_round_robin.py`:
- Around line 220-247: The test currently treats a call that would push total
tokens past token_budget as valid. Update the test
(test_budget_exhaustion_stops_turns and the similar block at 249-280) so that
RoundRobinProtocol is required to enforce the budget pre-turn: invoke
protocol.run with make_mock_agent_caller and token_budget=50, then assert either
that the protocol raises MeetingBudgetExhaustedError before executing the turn
that would cross the budget, or that the mock caller was not invoked for that
crossing turn. Reference RoundRobinProtocol.run, make_mock_agent_caller,
MeetingBudgetExhaustedError and the token_budget argument when changing the
expectation so the test fails if an over-budget turn is allowed to complete.
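
A minimal sketch of what pre-turn enforcement means in practice, under stated assumptions: `run_round_robin` is a hypothetical synchronous loop (the real RoundRobinProtocol.run is not shown in this review), the error class is a local stand-in, and a fixed `estimated_turn_cost` is assumed as the way to know a turn's cost before making the call.

```python
from collections.abc import Callable, Sequence


class MeetingBudgetExhaustedError(Exception):
    """Stand-in for the project's error of the same name."""


def run_round_robin(
    agent_ids: Sequence[str],
    token_budget: int,
    estimated_turn_cost: int,
    call_agent: Callable[[str], int],
) -> int:
    """Run turns in order and return the tokens used.

    Raises MeetingBudgetExhaustedError before any turn that would exceed
    token_budget, so the agent for that turn is never called.
    """
    used = 0
    for agent_id in agent_ids:
        # Pre-turn check: refuse the turn instead of letting it overshoot.
        if used + estimated_turn_cost > token_budget:
            msg = f"turn for {agent_id} would exceed budget {token_budget}"
            raise MeetingBudgetExhaustedError(msg)
        used += call_agent(agent_id)
    return used
```

A test against this shape can record which agents the caller saw and assert that the crossing turn's agent is absent, which covers both options named in the prompt.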

In `@tests/unit/communication/meeting/test_structured_phases.py`:
- Around line 105-112: The test mocks use a hardcoded key "leader-agent" in the
responses mapping, which breaks if the fixture leader_id changes; update each
responses dict (e.g., the one passed to make_mock_agent_caller in
test_structured_phases.py) to use the injected leader_id variable as the key
instead of the literal string, and apply the same change to the other response
dictionaries in this file (the blocks at the other ranges noted) so mocks match
the fixture-provided leader_id.
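
To illustrate keying the mock responses by the fixture value, here is a small sketch. `make_stub_caller` is a simplified, synchronous stand-in for make_mock_agent_caller (whose real signature is not shown here), and `build_responses` is a hypothetical helper; only the idea of using `leader_id` as the dictionary key comes from the review.

```python
from collections.abc import Callable, Sequence


def make_stub_caller(responses: dict[str, str]) -> Callable[[str], str]:
    """Return a caller that looks up a canned response by agent id."""

    def caller(agent_id: str) -> str:
        # A KeyError here means the mock and the fixture ids have drifted apart.
        return responses[agent_id]

    return caller


def build_responses(
    leader_id: str,
    participant_ids: Sequence[str],
) -> dict[str, str]:
    """Key the leader's response by the injected id, not a literal string."""
    responses = {leader_id: "Leader synthesis"}
    responses.update({pid: f"Input from {pid}" for pid in participant_ids})
    return responses
```

If the fixture later returns a different `leader_id`, the responses mapping follows it automatically instead of silently missing the leader's entry.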

In `@tests/unit/communication/test_enums.py`:
- Around line 19-20: Update the test to assert the new enum member directly
instead of relying only on the enum length: in
tests/unit/communication/test_enums.py modify the test_values() (or add to
test_member_count) to include an explicit assertion that
MessageType.MEETING_CONTRIBUTION exists (e.g., assert
MessageType.MEETING_CONTRIBUTION in MessageType or assert hasattr(MessageType,
"MEETING_CONTRIBUTION")) so the test fails if that public enum value is removed
or renamed.
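
A sketch of the explicit membership assertion, assuming a simplified `MessageType`. The enum below is a stand-in: only the member name MEETING_CONTRIBUTION comes from the review, while the other member and all string values are invented for the example.

```python
from enum import Enum


class MessageType(Enum):
    """Stand-in enum; only MEETING_CONTRIBUTION is named in the review."""

    TASK_UPDATE = "task_update"  # hypothetical existing member
    MEETING_CONTRIBUTION = "meeting_contribution"  # value is an assumption


def test_meeting_contribution_member_exists() -> None:
    """Fail if the public member is removed or renamed."""
    # A name-based check fails cleanly with an AssertionError.
    assert "MEETING_CONTRIBUTION" in MessageType.__members__
    assert hasattr(MessageType, "MEETING_CONTRIBUTION")
    # A length check alone would still pass if one member were swapped
    # for another, so it is not a substitute for the assertions above.
    assert len(MessageType) == 2
```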

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 38d2d420-bcd8-46b8-a5e3-718f69959a38

📥 Commits

Reviewing files that changed from the base of the PR and between c02832a and af5d72a.

📒 Files selected for processing (32)

CLAUDE.md
DESIGN_SPEC.md
src/ai_company/communication/__init__.py
src/ai_company/communication/config.py
src/ai_company/communication/enums.py
src/ai_company/communication/meeting/__init__.py
src/ai_company/communication/meeting/_prompts.py
src/ai_company/communication/meeting/_token_tracker.py
src/ai_company/communication/meeting/config.py
src/ai_company/communication/meeting/enums.py
src/ai_company/communication/meeting/errors.py
src/ai_company/communication/meeting/models.py
src/ai_company/communication/meeting/orchestrator.py
src/ai_company/communication/meeting/position_papers.py
src/ai_company/communication/meeting/protocol.py
src/ai_company/communication/meeting/round_robin.py
src/ai_company/communication/meeting/structured_phases.py
src/ai_company/observability/events/meeting.py
tests/integration/communication/test_meeting_integration.py
tests/unit/communication/meeting/__init__.py
tests/unit/communication/meeting/conftest.py
tests/unit/communication/meeting/test_config.py
tests/unit/communication/meeting/test_enums.py
tests/unit/communication/meeting/test_errors.py
tests/unit/communication/meeting/test_models.py
tests/unit/communication/meeting/test_orchestrator.py
tests/unit/communication/meeting/test_position_papers.py
tests/unit/communication/meeting/test_protocol.py
tests/unit/communication/meeting/test_round_robin.py
tests/unit/communication/meeting/test_structured_phases.py
tests/unit/communication/test_enums.py
tests/unit/observability/test_events.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Agent
GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (4)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations in Python 3.14+ code — Python 3.14 has native PEP 649 lazy annotations
Use except A, B: (no parentheses) for exception syntax in Python 3.14 — PEP 758 except syntax enforced by ruff
All public functions and classes must have type hints; mypy strict mode is enforced
All public classes and functions must have Google-style docstrings; ruff D rules enforce this
Create new objects instead of mutating existing ones; use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement of non-Pydantic internal collections (registries, BaseTool)
For non-Pydantic internal collections and dict/list fields in frozen Pydantic models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use @computed_field for derived values instead of storing redundant fields
Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly; never silently swallow errors
Validate at system boundaries (user input, external APIs, config files)
Line length must be 88 characters (enforced by ruff)
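
The immutability rule above (copy.deepcopy() at construction plus MappingProxyType wrapping for non-Pydantic internal collections) might be sketched as follows. `ToolRegistry` and its methods are hypothetical names chosen for illustration and are not taken from the project.

```python
import copy
from collections.abc import Mapping
from types import MappingProxyType
from typing import Any


class ToolRegistry:
    """Hypothetical read-only registry of tool specs."""

    def __init__(self, tools: Mapping[str, dict[str, Any]]) -> None:
        # Deep-copy at construction so later caller-side mutation cannot
        # leak in, then wrap so the top-level mapping rejects item assignment.
        self._tools = MappingProxyType(copy.deepcopy(dict(tools)))

    @property
    def tools(self) -> Mapping[str, dict[str, Any]]:
        """Read-only view; nested dicts are not frozen by the proxy."""
        return self._tools

    def with_tool(self, name: str, spec: dict[str, Any]) -> "ToolRegistry":
        """Return a new registry instead of mutating this one."""
        return ToolRegistry({**self._tools, name: spec})
```

Note that MappingProxyType only guards the outer mapping, which is one reason the guideline pairs it with deep copies at system boundaries.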

Files:

tests/unit/communication/test_enums.py
src/ai_company/communication/meeting/_prompts.py
src/ai_company/communication/meeting/enums.py
tests/unit/communication/meeting/test_position_papers.py
tests/unit/communication/meeting/test_round_robin.py
src/ai_company/communication/enums.py
tests/unit/communication/meeting/test_protocol.py
src/ai_company/communication/config.py
src/ai_company/communication/meeting/round_robin.py
tests/unit/communication/meeting/conftest.py
tests/unit/communication/meeting/test_errors.py
tests/integration/communication/test_meeting_integration.py
tests/unit/communication/meeting/test_structured_phases.py
src/ai_company/communication/meeting/_token_tracker.py
tests/unit/communication/meeting/test_config.py
src/ai_company/communication/meeting/position_papers.py
src/ai_company/communication/meeting/protocol.py
src/ai_company/communication/meeting/__init__.py
src/ai_company/communication/meeting/errors.py
src/ai_company/communication/meeting/config.py
src/ai_company/observability/events/meeting.py
tests/unit/communication/meeting/test_enums.py
src/ai_company/communication/meeting/orchestrator.py
tests/unit/communication/meeting/test_models.py
src/ai_company/communication/meeting/models.py
src/ai_company/communication/__init__.py
tests/unit/communication/meeting/test_orchestrator.py
src/ai_company/communication/meeting/structured_phases.py
tests/unit/observability/test_events.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark tests with @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, or @pytest.mark.slow as appropriate
Prefer @pytest.mark.parametrize for testing similar cases
Use vendor-agnostic test names: test-provider, test-small-001, etc. instead of real vendor names

Files:

tests/unit/communication/test_enums.py
tests/unit/communication/meeting/test_position_papers.py
tests/unit/communication/meeting/test_round_robin.py
tests/unit/communication/meeting/test_protocol.py
tests/unit/communication/meeting/conftest.py
tests/unit/communication/meeting/test_errors.py
tests/integration/communication/test_meeting_integration.py
tests/unit/communication/meeting/test_structured_phases.py
tests/unit/communication/meeting/test_config.py
tests/unit/communication/meeting/test_enums.py
tests/unit/communication/meeting/test_models.py
tests/unit/communication/meeting/test_orchestrator.py
tests/unit/observability/test_events.py

src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic must import from ai_company.observability import get_logger and create logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code
Always use variable name logger (not _logger, not log) for the logger instance
Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
Always use structured logging with kwargs: logger.info(EVENT, key=value) — never use formatted strings like logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
DEBUG logging is for object creation, internal flow, and entry/exit of key functions

Files:

src/ai_company/communication/meeting/_prompts.py
src/ai_company/communication/meeting/enums.py
src/ai_company/communication/enums.py
src/ai_company/communication/config.py
src/ai_company/communication/meeting/round_robin.py
src/ai_company/communication/meeting/_token_tracker.py
src/ai_company/communication/meeting/position_papers.py
src/ai_company/communication/meeting/protocol.py
src/ai_company/communication/meeting/__init__.py
src/ai_company/communication/meeting/errors.py
src/ai_company/communication/meeting/config.py
src/ai_company/observability/events/meeting.py
src/ai_company/communication/meeting/orchestrator.py
src/ai_company/communication/meeting/models.py
src/ai_company/communication/__init__.py
src/ai_company/communication/meeting/structured_phases.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases

Files:

src/ai_company/communication/meeting/_prompts.py
src/ai_company/communication/meeting/enums.py
src/ai_company/communication/enums.py
src/ai_company/communication/config.py
src/ai_company/communication/meeting/round_robin.py
src/ai_company/communication/meeting/_token_tracker.py
src/ai_company/communication/meeting/position_papers.py
src/ai_company/communication/meeting/protocol.py
src/ai_company/communication/meeting/__init__.py
src/ai_company/communication/meeting/errors.py
src/ai_company/communication/meeting/config.py
src/ai_company/observability/events/meeting.py
src/ai_company/communication/meeting/orchestrator.py
src/ai_company/communication/meeting/models.py
src/ai_company/communication/__init__.py
src/ai_company/communication/meeting/structured_phases.py

🧠 Learnings (9)