feat: implement meeting protocol system (#123) (PR #167)
Add pluggable meeting protocol system with three strategies:

- Round-Robin: sequential turns with full transcript context
- Position Papers: parallel position papers with synthesis
- Structured Phases: input gathering, conflict detection, discussion, synthesis

Includes MeetingOrchestrator lifecycle manager, frozen Pydantic models, token budget enforcement, event constants, and 138 meeting-specific tests.
Pre-reviewed by 9 agents, 38 findings addressed:

- Extract shared _TokenTracker and _prompts modules (DRY)
- Move auto_create_tasks to top-level MeetingProtocolConfig
- Fix non-deterministic parallel ordering in position_papers/structured_phases
- Add budget pre-division for parallel agents
- Split large run() methods into focused helpers
- Add validation logging, ExceptionGroup handling, task creator resilience
- Fix TaskCreator type alias (str → Priority)
- Add participant uniqueness + leader exclusion validators on MeetingMinutes
- Use NotBlankStr for decisions tuple
- Strengthen test coverage: budget exhaustion, task creation, protocol interface
- Add meeting event constant tests
- Update DESIGN_SPEC.md §5.4, §5.7, §15.3 and CLAUDE.md package structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
Caution: Review failed. The pull request is closed.
📝 Summary by CodeRabbit
Walkthrough: Adds a new meeting subsystem: three pluggable meeting protocols (RoundRobin, PositionPapers, StructuredPhases), a MeetingOrchestrator with token-budget enforcement and task-creation hooks, domain models/errors/enums, token/prompt utilities, observability event constants, public re-exports, and extensive unit/integration tests. No existing public signatures were removed.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client
    participant Orchestrator as MeetingOrchestrator
    participant Protocol as MeetingProtocol
    participant Tracker as TokenTracker
    participant Caller as AgentCaller
    participant Agent as Agent
    Client->>Orchestrator: run_meeting(config, agenda, leader, participants, budget)
    Orchestrator->>Orchestrator: _validate_inputs()
    Orchestrator->>Orchestrator: _resolve_protocol()
    Orchestrator->>Protocol: run(meeting_id, agenda, leader, participants, agent_caller, budget)
    Protocol->>Tracker: TokenTracker(budget)
    Protocol->>Protocol: build_agenda_prompt()
    loop contributions (parallel or sequential)
        Protocol->>Caller: call(agent_id, prompt, max_tokens)
        Caller->>Agent: invoke
        Agent-->>Caller: AgentResponse (input/output tokens)
        Caller-->>Protocol: AgentResponse
        Protocol->>Tracker: record(input_tokens, output_tokens)
        Protocol->>Protocol: append MeetingContribution
    end
    alt budget available for synthesis/summary
        Protocol->>Caller: call(leader/synthesizer, synthesis_prompt, max_tokens)
        Caller-->>Protocol: synthesis AgentResponse
        Protocol->>Tracker: record(...)
    else budget exhausted
        Protocol-->>Orchestrator: raise MeetingBudgetExhaustedError
    end
    Protocol-->>Orchestrator: MeetingMinutes
    Orchestrator->>Orchestrator: _record_success() / _make_failure_record()
    alt auto_create_tasks enabled
        Orchestrator->>Orchestrator: _create_tasks(action_items)
    end
    Orchestrator-->>Client: MeetingRecord
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
🚥 Pre-merge checks: ❌ 1 failed (1 warning) | ✅ 4 passed
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces a comprehensive meeting protocol system, enabling structured multi-agent conversations within the AI company framework. It provides three distinct protocols (round-robin, position papers, and structured phases) managed by a central orchestrator. The system ensures robust lifecycle management, budget enforcement, and detailed audit trails, significantly enhancing the platform's ability to facilitate complex agent interactions and decision-making.
Code Review
This pull request introduces a comprehensive and well-designed meeting protocol system. It includes three distinct protocols (round-robin, position papers, and structured phases), a robust orchestrator for managing the meeting lifecycle, and a full suite of supporting models, configurations, error types, and tests. The implementation demonstrates strong adherence to best practices, especially in its use of dependency injection, asynchronous patterns with asyncio.TaskGroup, and resilient error handling. The code is well-structured and thoroughly tested. I have one suggestion to improve the robustness of the conflict detection mechanism in the structured phases protocol by using a more deterministic method than string parsing.
Note: Security Review did not run due to the size of the PR.
```python
contributions.append(conflict_contribution)
turn_number += 1

conflicts_detected = "CONFLICTS: YES" in conflict_response.content.upper()
```
Parsing the LLM's free-text response to detect conflicts is a bit brittle. A minor change in the model's output format (e.g., extra whitespace, different phrasing) could break this logic. For more robust conflict detection, consider one of these approaches:
- **Structured JSON Output:** Instruct the LLM to respond with a JSON object, which can be parsed reliably:

  ```json
  { "conflicts_detected": true, "analysis": "Disagreement on API style." }
  ```

- **Tool Calling:** Define a `record_conflict_decision` tool that the LLM must call. The tool would take a boolean `conflicts_detected` argument, ensuring a deterministic signal.
This change would improve the robustness of the conflict detection mechanism.
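The first option can be sketched as a small parser. `parse_conflict_response` and its fail-closed fallback are illustrative choices, not code from this PR:

```python
import json


def parse_conflict_response(raw: str) -> bool:
    """Parse the LLM's JSON verdict; treat malformed output as 'no conflicts'."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return False  # design choice: fail closed rather than crash the phase
    if not isinstance(payload, dict):
        return False
    return bool(payload.get("conflicts_detected", False))
```

Whether malformed output should fail closed (as here), raise, or trigger a retry is a policy decision for the protocol.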
Pull request overview
Implements the meeting protocol subsystem described in DESIGN_SPEC §5.7, adding pluggable meeting execution strategies plus orchestration, configuration, observability events, and comprehensive unit/integration tests.
Changes:
- Added `ai_company.communication.meeting` package with 3 protocol implementations, shared prompt/token utilities, and a `MeetingOrchestrator`.
- Introduced meeting domain models/config/enums/errors and wired meeting settings into `MeetingTypeConfig`, plus communication package re-exports.
- Added meeting observability event constants and expanded unit/integration test coverage for protocols/orchestrator.
Reviewed changes
Copilot reviewed 31 out of 32 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/observability/test_events.py | Registers and validates meeting event constants in the observability event suite. |
| tests/unit/communication/test_enums.py | Updates message-type enum member count for new meeting message type. |
| tests/unit/communication/meeting/conftest.py | Adds meeting test fixtures and a mock AgentCaller. |
| tests/unit/communication/meeting/test_config.py | Unit tests for meeting protocol config models. |
| tests/unit/communication/meeting/test_enums.py | Unit tests for meeting protocol enums (type/phase/status). |
| tests/unit/communication/meeting/test_errors.py | Unit tests for meeting error hierarchy + context behavior. |
| tests/unit/communication/meeting/test_models.py | Unit tests for meeting domain Pydantic models/validators. |
| tests/unit/communication/meeting/test_orchestrator.py | Unit tests for orchestrator validation, execution, audit trail, and error handling. |
| tests/unit/communication/meeting/test_position_papers.py | Unit tests for parallel position-papers protocol behavior. |
| tests/unit/communication/meeting/test_protocol.py | Unit tests verifying the runtime-checkable MeetingProtocol interface and type aliases. |
| tests/unit/communication/meeting/test_round_robin.py | Unit tests for round-robin protocol behavior and budget handling. |
| tests/unit/communication/meeting/test_structured_phases.py | Unit tests for structured-phases protocol behavior and conflict gating. |
| tests/unit/communication/meeting/__init__.py | Marks meeting unit test package. |
| tests/integration/communication/test_meeting_integration.py | End-to-end integration tests for protocol switching, budgets, and communication re-exports. |
| src/ai_company/observability/events/meeting.py | Adds meeting lifecycle/phase/task/token event constants. |
| src/ai_company/communication/meeting/_prompts.py | Shared agenda prompt builder for meeting protocols. |
| src/ai_company/communication/meeting/_token_tracker.py | Shared mutable token budget tracker used by protocol implementations. |
| src/ai_company/communication/meeting/config.py | Adds per-protocol config models + top-level MeetingProtocolConfig. |
| src/ai_company/communication/meeting/enums.py | Adds meeting protocol/phase/status enums. |
| src/ai_company/communication/meeting/errors.py | Adds meeting-specific error hierarchy rooted at CommunicationError. |
| src/ai_company/communication/meeting/models.py | Adds frozen Pydantic domain models for agendas, minutes, records, action items, etc. |
| src/ai_company/communication/meeting/orchestrator.py | Adds orchestrator to validate inputs, pick protocol, execute meeting, create tasks, and record audit trail. |
| src/ai_company/communication/meeting/position_papers.py | Implements async position papers protocol using asyncio.TaskGroup. |
| src/ai_company/communication/meeting/protocol.py | Defines MeetingProtocol runtime-checkable protocol and AgentCaller/TaskCreator type aliases. |
| src/ai_company/communication/meeting/round_robin.py | Implements round-robin protocol with transcript context, summary reserve, and budget handling. |
| src/ai_company/communication/meeting/structured_phases.py | Implements structured-phases protocol with parallel input gathering and conditional discussion. |
| src/ai_company/communication/meeting/__init__.py | Re-exports meeting subsystem public API. |
| src/ai_company/communication/enums.py | Adds MessageType.MEETING_CONTRIBUTION. |
| src/ai_company/communication/config.py | Adds protocol_config: MeetingProtocolConfig to MeetingTypeConfig. |
| src/ai_company/communication/__init__.py | Re-exports meeting types from top-level communication package. |
| DESIGN_SPEC.md | Updates spec status and package tree to include meeting subsystem and config shape. |
| CLAUDE.md | Updates package structure and logging event-constant guidance to include meetings. |
```python
"""Integration tests for the meeting protocol system.

Tests full meeting lifecycle with protocol switching, budget
enforcement, and orchestrator coordination.
"""

import pytest

from ai_company.communication.meeting.config import (
    MeetingProtocolConfig,
```
Existing integration tests typically include pytestmark = pytest.mark.timeout(30) (e.g. tests/integration/communication/test_delegation_integration.py:39) to avoid hangs. This new integration test module doesn’t set a timeout marker, so consider adding one for consistency and to keep CI behavior uniform.
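For reference, the cited convention is a one-line module-level marker (a sketch; the 30-second value mirrors the delegation tests named above, and the `pytest-timeout` plugin must be installed for it to take effect):

```python
import pytest

# Module-level marker: every test in this file gets a 30s timeout,
# matching the hang-protection used by the existing integration suites.
pytestmark = pytest.mark.timeout(30)
```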
```diff
 def test_member_count(self) -> None:
-    assert len(MessageType) == 8
+    assert len(MessageType) == 9

 def test_values(self) -> None:
     assert MessageType.TASK_UPDATE.value == "task_update"
```
len(MessageType) was updated to 9, but test_values() still only asserts the original 8 enum values. Add an assertion for MessageType.MEETING_CONTRIBUTION.value == "meeting_contribution" so this test actually validates the newly added member.
```python
if status == MeetingStatus.BUDGET_EXHAUSTED:
    logger.warning(
        MEETING_FAILED,
        meeting_id=meeting_id,
        status=status,
        error=error_msg,
    )
```
When the meeting ends due to budget exhaustion, _make_failure_record() logs the lifecycle event as MEETING_FAILED. Since MEETING_BUDGET_EXHAUSTED exists as a distinct lifecycle event, log that event for MeetingStatus.BUDGET_EXHAUSTED (and reserve MEETING_FAILED for true failures) so observability can distinguish the two outcomes reliably.
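A sketch of the suggested event selection, with stand-in constant values (the real constants live in the observability events package and may differ):

```python
from enum import Enum

# Stand-ins for the real event constants and status enum named in the review.
MEETING_FAILED = "meeting.failed"
MEETING_BUDGET_EXHAUSTED = "meeting.budget_exhausted"


class MeetingStatus(Enum):
    FAILED = "failed"
    BUDGET_EXHAUSTED = "budget_exhausted"


def failure_event(status: MeetingStatus) -> str:
    """Pick the lifecycle event so observability can tell the two outcomes apart."""
    if status is MeetingStatus.BUDGET_EXHAUSTED:
        return MEETING_BUDGET_EXHAUSTED
    return MEETING_FAILED
```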
```python
if not participant_ids:
    logger.warning(
        MEETING_VALIDATION_FAILED,
        meeting_id=meeting_id,
        error="at least one participant is required",
    )
    msg = "At least one participant is required"
```
_validate_inputs() doesn’t reject duplicate entries in participant_ids. If duplicates are passed, protocols will perform duplicate agent calls and MeetingMinutes validation will later fail (unique participant_ids), turning an input problem into a runtime FAILED record. Validate uniqueness here (and raise MeetingParticipantError) to keep failures deterministic and consistent with the method’s “validation errors are raised directly” contract.
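The uniqueness check might be sketched like this (a hypothetical helper; the real fix would log `MEETING_VALIDATION_FAILED` and raise `MeetingParticipantError` as described):

```python
from collections import Counter


def find_duplicate_participants(participant_ids: list[str]) -> list[str]:
    """Return participant IDs that appear more than once, in first-seen order."""
    counts = Counter(participant_ids)
    return [pid for pid, n in counts.items() if n > 1]
```

The orchestrator would call this in `_validate_inputs()` and raise when the result is non-empty, keeping the failure deterministic instead of surfacing later as a FAILED record.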
```python
skip_discussion_if_no_conflicts: Skip discussion when no
    conflicts are detected.
max_discussion_tokens: Token budget for the discussion round.
auto_create_tasks: Whether to auto-create tasks from action items.
```
StructuredPhasesConfig’s docstring lists auto_create_tasks, but the model doesn’t define that field (it lives on MeetingProtocolConfig). Update the docstring to avoid implying a per-protocol flag that doesn’t exist.
Suggested change:

```diff
-    auto_create_tasks: Whether to auto-create tasks from action items.
```
DESIGN_SPEC.md (outdated)
Meetings (§5.1 Pattern 3) follow configurable protocols that determine how agents interact during structured multi-agent conversations. Different meeting types naturally suit different protocols. All protocols implement a `MeetingProtocol` protocol, making the system extensible — new protocols can be registered and selected per meeting type. Cost bounds are enforced by `duration_tokens` in meeting config (§5.4).

> **MVP: Not in M3.** Meetings are an M4 feature. Round-Robin (Protocol 1) is the initial default.
> **Current state (M4 complete):** All 3 meeting protocols are implemented in `communication/meeting/`: `RoundRobinProtocol`, `PositionPapersProtocol`, and `StructuredPhasesProtocol`. The `MeetingOrchestrator` runs meetings end-to-end with token budget enforcement via `TokenBudgetTracker`. All protocols implement the `MeetingProtocol` protocol interface.
This section says budget enforcement is done via TokenBudgetTracker, but the implementation added in communication/meeting/_token_tracker.py defines TokenTracker. Update the name here to match the actual type to keep the spec accurate/searchable.
Suggested change:

```diff
-> **Current state (M4 complete):** All 3 meeting protocols are implemented in `communication/meeting/`: `RoundRobinProtocol`, `PositionPapersProtocol`, and `StructuredPhasesProtocol`. The `MeetingOrchestrator` runs meetings end-to-end with token budget enforcement via `TokenBudgetTracker`. All protocols implement the `MeetingProtocol` protocol interface.
+> **Current state (M4 complete):** All 3 meeting protocols are implemented in `communication/meeting/`: `RoundRobinProtocol`, `PositionPapersProtocol`, and `StructuredPhasesProtocol`. The `MeetingOrchestrator` runs meetings end-to-end with token budget enforcement via `TokenTracker`. All protocols implement the `MeetingProtocol` protocol interface.
```
DESIGN_SPEC.md (outdated)
```text
│ │ ├── meeting/ # Meeting protocol subsystem
│ │ │ ├── __init__.py # Package exports
│ │ │ ├── _prompts.py # LLM prompt templates for meeting phases
│ │ │ ├── _token_tracker.py # TokenBudgetTracker for duration_tokens enforcement
```
The package tree comment for _token_tracker.py calls it TokenBudgetTracker, but the module defines TokenTracker. Please align the spec’s naming with the code so readers can find the right symbol quickly.
Suggested change:

```diff
-│ │ │ ├── _token_tracker.py # TokenBudgetTracker for duration_tokens enforcement
+│ │ │ ├── _token_tracker.py # TokenTracker for duration_tokens enforcement
```
```python
"""Tests for round-robin meeting protocol."""

import pytest

from ai_company.communication.meeting.config import RoundRobinConfig
```
Most existing tests/unit/communication/** modules set pytestmark = pytest.mark.timeout(30) (e.g. tests/unit/communication/test_enums.py:14). This new meeting test module doesn’t, so it won’t be covered by the same hang-protection. Consider adding the same module-level timeout marker here (and across the new meeting test files) for consistency.
Greptile Summary: This PR implements the full §5.7 meeting protocol system: three strategies (RoundRobinProtocol, PositionPapersProtocol, StructuredPhasesProtocol).
Confidence Score: 3/5
Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller
    participant Orchestrator as MeetingOrchestrator
    participant Registry as ProtocolRegistry
    participant Protocol as MeetingProtocol
    participant Tracker as TokenTracker
    participant Agent as AgentCaller
    Caller->>Orchestrator: run_meeting(meeting_type_name, protocol_config, agenda, leader_id, participant_ids, token_budget)
    Orchestrator->>Orchestrator: _validate_inputs()
    Orchestrator->>Registry: _resolve_protocol(protocol_type)
    Registry-->>Orchestrator: protocol instance
    Orchestrator->>Orchestrator: log MEETING_STARTED
    Orchestrator->>Protocol: _execute_protocol() → protocol.run()
    alt RoundRobinProtocol
        Protocol->>Tracker: TokenTracker(budget=token_budget)
        Note over Protocol: Reserve 20% for summary
        loop max_turns_per_agent × participants
            Protocol->>Agent: agent_caller(participant_id, turn_prompt, tokens_available)
            Agent-->>Protocol: AgentResponse
            Protocol->>Tracker: record(input_tokens, output_tokens)
        end
        Protocol->>Agent: agent_caller(leader_id, summary_prompt, remaining)
        Agent-->>Protocol: summary AgentResponse
    else PositionPapersProtocol
        Protocol->>Tracker: TokenTracker(budget=token_budget)
        Note over Protocol: Reserve 20% for synthesis
        par parallel position papers
            Protocol->>Agent: agent_caller(participant_1, position_prompt, slice)
            Protocol->>Agent: agent_caller(participant_2, position_prompt, slice)
            Protocol->>Agent: agent_caller(participant_N, position_prompt, slice)
        end
        Protocol->>Agent: agent_caller(synthesizer_id, synthesis_prompt, remaining)
        Agent-->>Protocol: synthesis AgentResponse
    else StructuredPhasesProtocol
        Protocol->>Tracker: TokenTracker(budget=token_budget)
        Note over Protocol: Phase 1: Agenda broadcast (no LLM)
        Note over Protocol: Reserve 20% for later phases
        par Phase 2: parallel input gathering
            Protocol->>Agent: agent_caller(participant_1, input_prompt, slice)
            Protocol->>Agent: agent_caller(participant_N, input_prompt, slice)
        end
        Protocol->>Agent: agent_caller(leader_id, conflict_check_prompt, remaining)
        Agent-->>Protocol: conflict AgentResponse
        alt conflicts detected
            loop participants
                Protocol->>Agent: agent_caller(pid, discussion_prompt, slice)
            end
        end
        Protocol->>Agent: agent_caller(leader_id, synthesis_prompt, remaining)
        Agent-->>Protocol: synthesis AgentResponse
    end
    Protocol-->>Orchestrator: MeetingMinutes
    Orchestrator->>Orchestrator: _create_tasks(action_items)
    Orchestrator->>Orchestrator: _record_success() → log MEETING_COMPLETED
    Orchestrator-->>Caller: MeetingRecord(status=COMPLETED)
```
Actionable comments posted: 27
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@DESIGN_SPEC.md`:
- Line 653: Update the spec text to use the current symbol names: replace
references to TokenBudgetTracker with TokenTracker and AgendaItem with
MeetingAgendaItem wherever they appear (including the section currently
mentioning all 3 meeting protocols and the noted repeat at lines ~2421-2425);
ensure any inline code formatting or API references match the implemented
symbols RoundRobinProtocol, PositionPapersProtocol, StructuredPhasesProtocol,
MeetingOrchestrator, and TokenTracker so readers are directed to existing APIs.
In `@src/ai_company/communication/config.py`:
- Around line 113-116: Update the Google-style docstring for the public class
MeetingTypeConfig to include the new public attribute `protocol_config` (type
MeetingProtocolConfig) and a short description such as "Meeting protocol
configuration" so generated docs and IDE hints reflect the new field; ensure the
docstring follows the existing Google-style sections (Args/Attributes) and
mentions `duration_tokens` and the new `protocol_config` entry consistently with
other attributes.
In `@src/ai_company/communication/meeting/_prompts.py`:
- Around line 23-27: The agenda item formatter in _prompts.py currently drops
MeetingAgendaItem.presenter_id when building entries from agenda.items; update
the loop that builds entry (for i, item in enumerate(agenda.items, 1)) to
include the optional presenter assignment when present (e.g., append a presenter
indicator using item.presenter_id or resolved presenter name) so presenter
ownership is preserved in the shared agenda prompt and any downstream protocol
that consumes MeetingAgendaItem can route presenter-specific flows correctly.
- Around line 3-9: The public annotation for MeetingAgenda in
build_agenda_prompt is only available under TYPE_CHECKING which causes runtime
resolution errors; replace the TYPE_CHECKING-only import with a real import of
MeetingAgenda from ai_company.communication.meeting.models at module level
(remove the conditional block or add a normal import alongside it) so that
build_agenda_prompt can be resolved at runtime by typing.get_type_hints() and
other consumers.
In `@src/ai_company/communication/meeting/_token_tracker.py`:
- Around line 6-38: TokenTracker currently mutates shared state via record(),
violating the repo convention of copy-on-write runtime state; replace in-place
mutation by making TokenTracker immutable and having record return a new
instance with updated input_tokens/output_tokens (or convert to a frozen
Pydantic model and use model_copy(update=...)). Update the TokenTracker
dataclass to be frozen (or use Pydantic BaseModel frozen) and change the record
method signature to return TokenTracker with the new totals; keep the used,
remaining, and is_exhausted properties as-is but computed from the new instance
to preserve behavior. Ensure all call sites that previously relied on in-place
updates now reassign the returned TokenTracker from record(...) to propagate
state.
- Around line 25-38: The record method currently accepts any integers and can
silently under/over-count tokens; update record(input_tokens, output_tokens) to
validate both deltas are non-negative ints, compute projected_used = self.used +
input_tokens + output_tokens and raise a ValueError (or custom exception) before
mutating if projected_used > self.budget or if either delta < 0; keep remaining
and is_exhausted as-is but rely on this check to prevent negative/overspend
states and perform the validation atomically so
self.input_tokens/self.output_tokens/self.used are only updated after the checks
pass.
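Taken together, the two `_token_tracker.py` suggestions amount to something like the following sketch. A plain frozen dataclass stands in for the real module; field and property names mirror the snippets above:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TokenTracker:
    budget: int
    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def used(self) -> int:
        return self.input_tokens + self.output_tokens

    @property
    def remaining(self) -> int:
        return self.budget - self.used

    @property
    def is_exhausted(self) -> bool:
        return self.remaining <= 0

    def record(self, input_tokens: int, output_tokens: int) -> "TokenTracker":
        """Validate the deltas, then return an updated copy (copy-on-write)."""
        if input_tokens < 0 or output_tokens < 0:
            raise ValueError("token deltas must be non-negative")
        if self.used + input_tokens + output_tokens > self.budget:
            raise ValueError("recording would exceed the meeting token budget")
        return replace(
            self,
            input_tokens=self.input_tokens + input_tokens,
            output_tokens=self.output_tokens + output_tokens,
        )
```

Call sites would then reassign, e.g. `tracker = tracker.record(in_toks, out_toks)`, so the updated state propagates.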
In `@src/ai_company/communication/meeting/config.py`:
- Around line 61-66: Remove the stale attribute mention from
StructuredPhasesConfig's docstring: delete the "auto_create_tasks" entry under
Attributes so the docstring only documents skip_discussion_if_no_conflicts and
max_discussion_tokens; reference that the auto_create_tasks setting now lives on
MeetingProtocolConfig.auto_create_tasks so do not reintroduce it into
StructuredPhasesConfig's public docs.
In `@src/ai_company/communication/meeting/models.py`:
- Around line 265-289: The validator _validate_status_consistency should reject
blank/whitespace error messages for failed states: when self.status is
MeetingStatus.FAILED or MeetingStatus.BUDGET_EXHAUSTED, check that
self.error_message is not None and that self.error_message.strip() is non-empty
(or enforce min_length=1 on the error_message Field) and raise the existing
ValueError if the check fails so empty strings/whitespace are not accepted.
- Around line 184-218: The MeetingMinutes model currently allows independently
set total_input_tokens and total_output_tokens that can diverge from
per-contribution counts in contributions; update MeetingMinutes to enforce
consistency by deriving or validating these totals from MeetingContribution
entries: either convert total_input_tokens and total_output_tokens into computed
fields (like total_tokens is computed) or add a model validator that sums
MeetingContribution.total_input_tokens and .total_output_tokens and
raises/normalizes when they differ from the stored
total_input_tokens/total_output_tokens; refer to MeetingContribution,
MeetingMinutes, contributions, total_input_tokens, total_output_tokens, and
total_tokens when making the change.
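The totals-consistency validator could look roughly like this sketch. Plain dataclasses stand in for the frozen Pydantic models, and the field names are simplified from the ones listed above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    input_tokens: int
    output_tokens: int


@dataclass(frozen=True)
class Minutes:
    contributions: tuple[Contribution, ...]
    total_input_tokens: int
    total_output_tokens: int

    def __post_init__(self) -> None:
        # Reject stored totals that diverge from the per-contribution sums.
        in_sum = sum(c.input_tokens for c in self.contributions)
        out_sum = sum(c.output_tokens for c in self.contributions)
        if (in_sum, out_sum) != (self.total_input_tokens, self.total_output_tokens):
            raise ValueError("token totals do not match contribution sums")
```

Deriving the totals as computed fields instead (as `total_tokens` already is) would remove the possibility of divergence entirely.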
In `@src/ai_company/communication/meeting/orchestrator.py`:
- Around line 332-378: Extend _validate_inputs to also reject blank/whitespace
IDs and duplicate participant IDs before execution: after the existing
token_budget check, validate leader_id.strip() is non-empty and that every
participant_id.strip() is non-empty (treat trimmed values as the canonical IDs),
logging MEETING_VALIDATION_FAILED and raising MeetingParticipantError with
context={"meeting_id": meeting_id, "leader_id": leader_id} (or include offending
participant id) when any are blank; then check for duplicates among the trimmed
participant_ids (e.g., by comparing length of set(trimmed_participant_ids) to
list length), log MEETING_VALIDATION_FAILED with error="duplicate
participant_ids" and raise MeetingParticipantError with context={"meeting_id":
meeting_id, "duplicates": list_of_duplicates}; keep the existing
leader-in-participants check using the trimmed canonical IDs so leader_id
present in participant_ids still fails as before.
- Around line 85-88: The constructor currently assigns the caller-owned
protocol_registry directly to self._protocol_registry which allows external
mutation; in __init__ make a defensive deep copy of the passed protocol_registry
(use copy.deepcopy) and wrap that copy in types.MappingProxyType before
assigning to self._protocol_registry so the orchestrator holds an immutable
snapshot for its lifetime; update any imports (copy and MappingProxyType) if
needed and keep the attribute name self._protocol_registry and its type
annotation list[MeetingRecord] unchanged.
- Around line 346-348: Before raising the ValueError in the token_budget <= 0
branch, emit a warning using the existing logger with context and the
MEETING_VALIDATION_FAILED event: log a message that includes the token_budget
value and MEETING_VALIDATION_FAILED (use logger.warning or self.logger.warning
consistent with surrounding code) and any other relevant identifiers, then raise
ValueError(msg); ensure MEETING_VALIDATION_FAILED is referenced/imported the
same way other validation errors are logged in this module so observability
captures this public error path.
In `@src/ai_company/communication/meeting/position_papers.py`:
- Around line 156-167: The MeetingMinutes returned for POSITION_PAPERS only sets
summary, leaving decisions and action_items empty so
MeetingOrchestrator._create_tasks() cannot generate tasks; update the return in
position_papers flow to populate MeetingMinutes.decisions and .action_items from
the synthesis contribution (e.g., use synthesis_contribution.decisions and
synthesis_contribution.action_items or parse structured fields from
synthesis_contribution.content if needed), converting them to the expected tuple
types, and keep summary as synthesis_contribution.content so tasks are created
correctly.
- Around line 203-215: The current pre-division uses tokens_per_agent = max(1,
tracker.remaining // max(1, n)) which can over-allocate when tracker.remaining <
number of parallel workers; change the logic before spawning _collect_paper
tasks to compute an effective fan-out: set n_effective = min(n,
tracker.remaining) (or if tracker.remaining == 0 raise/return early), then
compute tokens_per_agent = tracker.remaining // max(1, n_effective); if
tokens_per_agent == 0 either reduce n_effective to tracker.remaining or fail
fast; update all places that use n (e.g., the TaskGroup spawn and participant
loop that starts _collect_paper) to use n_effective so no worker receives a zero
or implicit extra token.
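The capped pre-division might look like this sketch (`divide_budget` is a hypothetical helper; the real code works against `tracker.remaining` inline):

```python
def divide_budget(remaining: int, n_participants: int) -> tuple[int, int]:
    """Return (n_effective, tokens_per_agent) without over-allocating.

    Caps fan-out so n_effective * tokens_per_agent never exceeds `remaining`.
    """
    if remaining <= 0 or n_participants <= 0:
        return (0, 0)  # caller should fail fast / raise budget-exhausted
    n_effective = min(n_participants, remaining)
    return (n_effective, remaining // n_effective)
```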
In `@src/ai_company/communication/meeting/round_robin.py`:
- Around line 125-126: The code always subtracts a summary reserve
(summary_reserve = int(token_budget * _SUMMARY_RESERVE_FRACTION)) from
discussion_budget even when summaries are disabled; change the logic in the
block that computes summary_reserve/discussion_budget so that summary_reserve is
set to 0 when leader_summarizes is False (and only compute int(token_budget *
_SUMMARY_RESERVE_FRACTION) when leader_summarizes is True), then set
discussion_budget = token_budget - summary_reserve; update any code paths that
use summary_reserve to assume zero when summaries are disabled (references:
summary_reserve, _SUMMARY_RESERVE_FRACTION, discussion_budget,
leader_summarizes).
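A sketch of the conditional reserve (`split_budget` is a hypothetical helper; the constant name follows the comment above):

```python
_SUMMARY_RESERVE_FRACTION = 0.2  # mirrors the constant named in the comment


def split_budget(token_budget: int, leader_summarizes: bool) -> tuple[int, int]:
    """Return (discussion_budget, summary_reserve); reserve nothing when disabled."""
    summary_reserve = (
        int(token_budget * _SUMMARY_RESERVE_FRACTION) if leader_summarizes else 0
    )
    return token_budget - summary_reserve, summary_reserve
```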
In `@src/ai_company/communication/meeting/structured_phases.py`:
- Around line 388-392: The conflict detection call currently passes
tracker.remaining directly to agent_caller (conflict_response = await
agent_caller(leader_id, conflict_prompt, tracker.remaining)), which can exhaust
the whole meeting budget; change it to cap the tokens by
StructuredPhasesConfig.max_discussion_tokens (e.g., compute allowed =
min(tracker.remaining, StructuredPhasesConfig.max_discussion_tokens) and pass
allowed) and ensure the value is non-negative so phase 3 discussion/synthesis
retain their token budget; update any related variable names in the same scope
and keep the agent_caller invocation using leader_id and conflict_prompt.
- Around line 278-279: The current calculation forces tokens_per_agent = 1 even
when tracker.remaining < num_participants, causing total allocation to
oversubscribe the remaining budget; change the logic around
participant_ids/tokens_per_agent so you either (A) cap fan-out by computing
max_agents = min(len(participant_ids), tracker.remaining) and use
tokens_per_agent = tracker.remaining // max(1, max_agents) (and reduce/choose a
subset of participant_ids to max_agents), or (B) fail fast when
tracker.remaining < len(participant_ids) by raising or returning an error;
update the code that sets tokens_per_agent and any subsequent use of
participant_ids accordingly (refer to participant_ids, tokens_per_agent,
tracker.remaining).
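Option (A) could look like the following sketch — the function name and the choice to truncate `participant_ids` rather than error are assumptions about how the fix might be shaped:

```python
def allocate_parallel_budget(
    participant_ids: list[str], remaining: int
) -> tuple[list[str], int]:
    """Cap fan-out so the total allocation never exceeds `remaining`."""
    max_agents = min(len(participant_ids), remaining)
    if max_agents == 0:
        # Option (B) alternative: fail fast when no budget is left at all.
        raise ValueError("no budget remaining for any participant")
    tokens_per_agent = remaining // max_agents
    return participant_ids[:max_agents], tokens_per_agent


# With 3 participants but only 2 tokens left, fan-out drops to 2 agents.
agents, per_agent = allocate_parallel_budget(["a", "b", "c"], remaining=2)
print(agents, per_agent)  # → ['a', 'b'] 1
```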
- Around line 238-250: The MeetingMinutes being returned from the
structured-phases flow omits decisions and action_items even though the
synthesis collects them; update the MeetingMinutes constructor call in
structured_phases.py to include decisions and action_items (e.g.,
decisions=tuple(decisions) and action_items=tuple(action_items)), ensuring the
local variables (decisions, action_items) are defined or defaulted to empty
lists/tuples before the return so downstream task creation sees the populated
fields; keep protocol_type as MeetingProtocolType.STRUCTURED_PHASES and preserve
existing fields like summary.
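The defaulting-before-return pattern, sketched against a trimmed stand-in model (the real `MeetingMinutes` is a frozen Pydantic model with more fields):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeetingMinutes:
    """Trimmed stand-in for the real frozen Pydantic model."""

    summary: str
    decisions: tuple[str, ...] = ()
    action_items: tuple[str, ...] = ()


# Default the locals before the return so the constructor always sees them.
decisions: list[str] = ["adopt round-robin for standups"]
action_items: list[str] = []

minutes = MeetingMinutes(
    summary="phases complete",
    decisions=tuple(decisions),
    action_items=tuple(action_items),
)
print(minutes.decisions)  # → ('adopt round-robin for standups',)
```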
In `@tests/integration/communication/test_meeting_integration.py`:
- Around line 121-145: The test test_round_robin_full_lifecycle currently only
asserts len(record.minutes.contributions) > 0 which doesn't verify the
RoundRobinConfig was applied; update the assertion to check that the
contributions match the expected deterministic number given
RoundRobinConfig(max_turns_per_agent=1) (i.e., two participant turns plus one
meeting summary => assert len(record.minutes.contributions) == 3) and optionally
assert the participant contributions contain the expected participant_ids
("agent-a", "agent-b") in the expected order to ensure max_turns_per_agent was
honored; locate these checks around the existing MeetingProtocolConfig /
RoundRobinConfig usage and the record.minutes.contributions assertions.
In `@tests/unit/communication/meeting/conftest.py`:
- Around line 58-79: The fake caller _caller currently ignores the max_tokens
argument so budget tests can't observe or enforce token limits; modify _caller
to record the requested max_tokens (e.g., push it into a shared list or dict
keyed by agent_id/call index) and/or clamp the returned
input_tokens/output_tokens to not exceed max_tokens before constructing the
AgentResponse; update references inside _caller (max_tokens, input_tokens,
output_tokens, call_counts, _responses, default_content, AgentResponse) so tests
can inspect the captured requested budgets or rely on the clamped token counts.
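A self-contained sketch of a fake caller that both records and clamps, using a minimal stand-in for `AgentResponse` (field names and token defaults are assumptions about the fixture's shape):

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentResponse:
    """Minimal stand-in for the real response model."""

    content: str
    input_tokens: int
    output_tokens: int


def make_recording_caller(default_content: str = "ok"):
    """Fake agent caller that records and honors each requested max_tokens."""
    requested: list[tuple[str, int]] = []

    async def _caller(agent_id: str, prompt: str, max_tokens: int) -> AgentResponse:
        requested.append((agent_id, max_tokens))
        # Clamp so the fake never reports more tokens than were allowed.
        return AgentResponse(
            content=default_content,
            input_tokens=min(10, max_tokens),
            output_tokens=min(40, max_tokens),
        )

    return _caller, requested


caller, requested = make_recording_caller()
response = asyncio.run(caller("agent-a", "discuss", 25))
print(requested)               # → [('agent-a', 25)]
print(response.output_tokens)  # → 25
```

Tests can then assert on `requested` to verify budget pre-division, or rely on the clamped counts flowing into the tracker.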
In `@tests/unit/communication/meeting/test_orchestrator.py`:
- Around line 374-402: The tests test_task_creator_not_called_when_disabled and
its paired test never exercise task creation because the mocked protocol/minutes
return no ActionItem; update the tests to use a stub protocol or mocked minutes
that include at least one ActionItem so the orchestrator path through
_create_tasks() is executed (for example, provide a custom protocol/minutes
object returned by the orchestrator run that contains an ActionItem), then
verify behavior when MeetingProtocolConfig(auto_create_tasks=False) and when
task_creator raises—this will exercise the gate and exception-handling paths;
reference the test functions test_task_creator_not_called_when_disabled, the
MeetingProtocolConfig(auto_create_tasks=False) usage, the task_creator stub
passed into _make_orchestrator, and the orchestrator.run_meeting call to locate
where to inject the stubbed minutes or protocol.
- Around line 292-310: The test test_budget_exhaustion_produces_record currently
accepts MeetingStatus.FAILED which permits regressions; change the assertion to
require the dedicated budget-exhaustion status only by asserting record.status
== MeetingStatus.BUDGET_EXHAUSTED (keep the existing check that
record.error_message is not None) so the test verifies
orchestrator.run_meeting(..., token_budget=1) produces the explicit
BUDGET_EXHAUSTED outcome.
In `@tests/unit/communication/meeting/test_position_papers.py`:
- Around line 60-118: The tests test_contributions_structure and
test_parallel_execution_all_participants currently only assert counts and set
membership which won't catch nondeterministic ordering regressions; update them
to assert a deterministic order by inspecting minutes.contributions (filtering
by MeetingPhase.POSITION_PAPER) and then check the sequence of each
contribution's turn_number or agent_id matches the expected participant_ids
order (or a sorted/explicit expected sequence), e.g., build the ordered list
from [c for c in minutes.contributions if c.phase ==
MeetingPhase.POSITION_PAPER] and assert their turn_number or agent_id sequence
equals the expected ordering to lock in the deterministic contract.
In `@tests/unit/communication/meeting/test_round_robin.py`:
- Around line 220-247: The test currently treats a call that would push total
tokens past token_budget as valid; update the test
(test_budget_exhaustion_stops_turns and the similar block at 249-280) so the
RoundRobinProtocol enforces the budget pre-turn: invoke protocol.run with
make_mock_agent_caller and token_budget=50 but assert that the protocol either
raises MeetingBudgetExhaustedError before executing the turn that would cross
the budget or verify the mock caller was not invoked for that crossing turn;
specifically reference RoundRobinProtocol.run, make_mock_agent_caller,
MeetingBudgetExhaustedError and the token_budget argument when changing the
expectation so the test fails if an over-budget turn is allowed to complete.
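The pre-turn guard the test should force could be sketched as follows — the guard's placement and the way the next turn's cost is estimated are assumptions, not the protocol's actual internals:

```python
class MeetingBudgetExhaustedError(RuntimeError):
    """Stand-in for the real error type in meeting/errors.py."""


def check_turn_budget(used: int, next_turn_cost: int, token_budget: int) -> None:
    """Raise before running a turn whose cost would cross the budget."""
    if used + next_turn_cost > token_budget:
        raise MeetingBudgetExhaustedError(
            f"turn of {next_turn_cost} tokens would exceed budget "
            f"({used}/{token_budget} used)"
        )


check_turn_budget(used=30, next_turn_cost=15, token_budget=50)  # within budget
try:
    check_turn_budget(used=45, next_turn_cost=15, token_budget=50)
except MeetingBudgetExhaustedError as exc:
    print(exc)  # raised before the over-budget turn runs
```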
In `@tests/unit/communication/meeting/test_structured_phases.py`:
- Around line 105-112: The test mocks use a hardcoded key "leader-agent" in the
responses mapping which breaks if the fixture leader_id changes; update each
responses dict (e.g., the one passed to make_mock_agent_caller in
test_structured_phases.py) to use the injected leader_id variable as the key
instead of the literal string, and apply the same change to the other response
dictionaries in this file (the blocks at the other ranges noted) so mocks match
the fixture-provided leader_id.
In `@tests/unit/communication/test_enums.py`:
- Around line 19-20: Update the test to assert the new enum member directly
instead of relying only on the enum length: in
tests/unit/communication/test_enums.py modify the test_values() (or add to
test_member_count) to include an explicit assertion that
MessageType.MEETING_CONTRIBUTION exists (e.g., assert
MessageType.MEETING_CONTRIBUTION in MessageType or assert hasattr(MessageType,
"MEETING_CONTRIBUTION")) so the test fails if that public enum value is removed
or renamed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 38d2d420-bcd8-46b8-a5e3-718f69959a38
📒 Files selected for processing (32)
CLAUDE.md, DESIGN_SPEC.md, src/ai_company/communication/__init__.py, src/ai_company/communication/config.py, src/ai_company/communication/enums.py, src/ai_company/communication/meeting/__init__.py, src/ai_company/communication/meeting/_prompts.py, src/ai_company/communication/meeting/_token_tracker.py, src/ai_company/communication/meeting/config.py, src/ai_company/communication/meeting/enums.py, src/ai_company/communication/meeting/errors.py, src/ai_company/communication/meeting/models.py, src/ai_company/communication/meeting/orchestrator.py, src/ai_company/communication/meeting/position_papers.py, src/ai_company/communication/meeting/protocol.py, src/ai_company/communication/meeting/round_robin.py, src/ai_company/communication/meeting/structured_phases.py, src/ai_company/observability/events/meeting.py, tests/integration/communication/test_meeting_integration.py, tests/unit/communication/meeting/__init__.py, tests/unit/communication/meeting/conftest.py, tests/unit/communication/meeting/test_config.py, tests/unit/communication/meeting/test_enums.py, tests/unit/communication/meeting/test_errors.py, tests/unit/communication/meeting/test_models.py, tests/unit/communication/meeting/test_orchestrator.py, tests/unit/communication/meeting/test_position_papers.py, tests/unit/communication/meeting/test_protocol.py, tests/unit/communication/meeting/test_round_robin.py, tests/unit/communication/meeting/test_structured_phases.py, tests/unit/communication/test_enums.py, tests/unit/observability/test_events.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not use `from __future__ import annotations` in Python 3.14+ code — Python 3.14 has native PEP 649 lazy annotations
Use `except A, B:` (no parentheses) for exception syntax in Python 3.14 — PEP 758 except syntax enforced by ruff
All public functions and classes must have type hints; mypy strict mode is enforced
All public classes and functions must have Google-style docstrings; ruff D rules enforce this
Create new objects instead of mutating existing ones; use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement of non-Pydantic internal collections (registries, BaseTool)
For non-Pydantic internal collections and dict/list fields in frozen Pydantic models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use `@computed_field` for derived values instead of storing redundant fields
Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly; never silently swallow errors
Validate at system boundaries (user input, external APIs, config files)
Line length must be 88 characters (enforced by ruff)
Files:
tests/unit/communication/test_enums.py, src/ai_company/communication/meeting/_prompts.py, src/ai_company/communication/meeting/enums.py, tests/unit/communication/meeting/test_position_papers.py, tests/unit/communication/meeting/test_round_robin.py, src/ai_company/communication/enums.py, tests/unit/communication/meeting/test_protocol.py, src/ai_company/communication/config.py, src/ai_company/communication/meeting/round_robin.py, tests/unit/communication/meeting/conftest.py, tests/unit/communication/meeting/test_errors.py, tests/integration/communication/test_meeting_integration.py, tests/unit/communication/meeting/test_structured_phases.py, src/ai_company/communication/meeting/_token_tracker.py, tests/unit/communication/meeting/test_config.py, src/ai_company/communication/meeting/position_papers.py, src/ai_company/communication/meeting/protocol.py, src/ai_company/communication/meeting/__init__.py, src/ai_company/communication/meeting/errors.py, src/ai_company/communication/meeting/config.py, src/ai_company/observability/events/meeting.py, tests/unit/communication/meeting/test_enums.py, src/ai_company/communication/meeting/orchestrator.py, tests/unit/communication/meeting/test_models.py, src/ai_company/communication/meeting/models.py, src/ai_company/communication/__init__.py, tests/unit/communication/meeting/test_orchestrator.py, src/ai_company/communication/meeting/structured_phases.py, tests/unit/observability/test_events.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, or `@pytest.mark.slow` as appropriate
Prefer `@pytest.mark.parametrize` for testing similar cases
Use vendor-agnostic test names: test-provider, test-small-001, etc. instead of real vendor names
Files:
tests/unit/communication/test_enums.py, tests/unit/communication/meeting/test_position_papers.py, tests/unit/communication/meeting/test_round_robin.py, tests/unit/communication/meeting/test_protocol.py, tests/unit/communication/meeting/conftest.py, tests/unit/communication/meeting/test_errors.py, tests/integration/communication/test_meeting_integration.py, tests/unit/communication/meeting/test_structured_phases.py, tests/unit/communication/meeting/test_config.py, tests/unit/communication/meeting/test_enums.py, tests/unit/communication/meeting/test_models.py, tests/unit/communication/meeting/test_orchestrator.py, tests/unit/observability/test_events.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic must import `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
Never use `import logging` / `logging.getLogger()` / `print()` in application code
Always use variable name `logger` (not `_logger`, not `log`) for the logger instance
Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
Always use structured logging with kwargs: `logger.info(EVENT, key=value)` — never use formatted strings like `logger.info("msg %s", val)`
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
DEBUG logging is for object creation, internal flow, and entry/exit of key functions
Files:
src/ai_company/communication/meeting/_prompts.py, src/ai_company/communication/meeting/enums.py, src/ai_company/communication/enums.py, src/ai_company/communication/config.py, src/ai_company/communication/meeting/round_robin.py, src/ai_company/communication/meeting/_token_tracker.py, src/ai_company/communication/meeting/position_papers.py, src/ai_company/communication/meeting/protocol.py, src/ai_company/communication/meeting/__init__.py, src/ai_company/communication/meeting/errors.py, src/ai_company/communication/meeting/config.py, src/ai_company/observability/events/meeting.py, src/ai_company/communication/meeting/orchestrator.py, src/ai_company/communication/meeting/models.py, src/ai_company/communication/__init__.py, src/ai_company/communication/meeting/structured_phases.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases
Files:
src/ai_company/communication/meeting/_prompts.py, src/ai_company/communication/meeting/enums.py, src/ai_company/communication/enums.py, src/ai_company/communication/config.py, src/ai_company/communication/meeting/round_robin.py, src/ai_company/communication/meeting/_token_tracker.py, src/ai_company/communication/meeting/position_papers.py, src/ai_company/communication/meeting/protocol.py, src/ai_company/communication/meeting/__init__.py, src/ai_company/communication/meeting/errors.py, src/ai_company/communication/meeting/config.py, src/ai_company/observability/events/meeting.py, src/ai_company/communication/meeting/orchestrator.py, src/ai_company/communication/meeting/models.py, src/ai_company/communication/__init__.py, src/ai_company/communication/meeting/structured_phases.py
🧠 Learnings (9)
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic must import `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
Applied to files:
CLAUDE.md, src/ai_company/observability/events/meeting.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
Applied to files:
CLAUDE.md, src/ai_company/observability/events/meeting.py, tests/unit/observability/test_events.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use structured logging with kwargs: `logger.info(EVENT, key=value)` — never use formatted strings like `logger.info("msg %s", val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use variable name `logger` (not `_logger`, not `log`) for the logger instance
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : All state transitions must log at INFO level
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : DEBUG logging is for object creation, internal flow, and entry/exit of key functions
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields
Applied to files:
src/ai_company/communication/meeting/config.py
🧬 Code graph analysis (18)
tests/unit/communication/test_enums.py (1)
- src/ai_company/communication/enums.py (1): MessageType (6-20)

src/ai_company/communication/meeting/_prompts.py (3)
- src/ai_company/communication/meeting/models.py (1): MeetingAgenda (77-96)
- tests/integration/communication/test_meeting_integration.py (1): agenda (98-114)
- src/ai_company/tools/base.py (1): description (115-117)

tests/unit/communication/meeting/test_round_robin.py (2)
- src/ai_company/communication/meeting/errors.py (1): MeetingBudgetExhaustedError (14-15)
- src/ai_company/communication/meeting/round_robin.py (2): get_protocol_type (94-96), run (98-196)

tests/unit/communication/meeting/test_protocol.py (3)
- src/ai_company/communication/meeting/enums.py (1): MeetingProtocolType (6-17)
- src/ai_company/communication/meeting/models.py (2): MeetingAgenda (77-96), MeetingMinutes (153-235)
- tests/unit/communication/meeting/conftest.py (3): meeting_id (114-116), leader_id (102-104), participant_ids (108-110)

src/ai_company/communication/config.py (1)
- src/ai_company/communication/meeting/config.py (1): MeetingProtocolConfig (81-117)

src/ai_company/communication/meeting/round_robin.py (6)
- src/ai_company/communication/meeting/_prompts.py (1): build_agenda_prompt (9-28)
- src/ai_company/communication/meeting/_token_tracker.py (5): TokenTracker (7-38), is_exhausted (31-33), used (21-23), record (35-38), remaining (26-28)
- src/ai_company/communication/meeting/config.py (1): RoundRobinConfig (9-33)
- src/ai_company/communication/meeting/enums.py (2): MeetingPhase (20-46), MeetingProtocolType (6-17)
- src/ai_company/communication/meeting/errors.py (1): MeetingBudgetExhaustedError (14-15)
- src/ai_company/communication/meeting/models.py (4): MeetingAgenda (77-96), MeetingContribution (99-128), MeetingMinutes (153-235), total_tokens (216-218)

tests/unit/communication/meeting/conftest.py (2)
- src/ai_company/communication/meeting/config.py (4): MeetingProtocolConfig (81-117), PositionPapersConfig (36-55), RoundRobinConfig (9-33), StructuredPhasesConfig (58-78)
- src/ai_company/communication/meeting/models.py (3): AgentResponse (23-52), MeetingAgenda (77-96), MeetingAgendaItem (55-74)

tests/unit/communication/meeting/test_errors.py (2)
- src/ai_company/communication/errors.py (1): CommunicationError (12-44)
- src/ai_company/communication/meeting/errors.py (5): MeetingAgentError (26-27), MeetingBudgetExhaustedError (14-15), MeetingError (10-11), MeetingParticipantError (22-23), MeetingProtocolNotFoundError (18-19)

tests/integration/communication/test_meeting_integration.py (12)
- src/ai_company/communication/meeting/config.py (4): MeetingProtocolConfig (81-117), PositionPapersConfig (36-55), RoundRobinConfig (9-33), StructuredPhasesConfig (58-78)
- src/ai_company/communication/meeting/enums.py (2): MeetingProtocolType (6-17), MeetingStatus (49-72)
- src/ai_company/communication/meeting/models.py (3): AgentResponse (23-52), MeetingAgenda (77-96), MeetingAgendaItem (55-74)
- src/ai_company/communication/meeting/orchestrator.py (3): MeetingOrchestrator (58-405), run_meeting (90-168), get_records (170-176)
- src/ai_company/communication/meeting/position_papers.py (1): PositionPapersProtocol (71-363)
- src/ai_company/communication/meeting/protocol.py (1): MeetingProtocol (39-78)
- src/ai_company/communication/meeting/round_robin.py (1): RoundRobinProtocol (79-415)
- src/ai_company/communication/meeting/structured_phases.py (1): StructuredPhasesProtocol (120-620)
- src/ai_company/tools/base.py (1): description (115-117)
- src/ai_company/communication/config.py (1): MeetingTypeConfig (79-133)
- src/ai_company/communication/meeting/errors.py (1): MeetingError (10-11)
- src/ai_company/communication/enums.py (1): MessageType (6-20)

tests/unit/communication/meeting/test_config.py (2)
- src/ai_company/communication/meeting/config.py (4): MeetingProtocolConfig (81-117), PositionPapersConfig (36-55), RoundRobinConfig (9-33), StructuredPhasesConfig (58-78)
- src/ai_company/communication/meeting/enums.py (1): MeetingProtocolType (6-17)

src/ai_company/communication/meeting/position_papers.py (7)
- src/ai_company/communication/meeting/_prompts.py (1): build_agenda_prompt (9-28)
- src/ai_company/communication/meeting/_token_tracker.py (5): TokenTracker (7-38), used (21-23), remaining (26-28), record (35-38), is_exhausted (31-33)
- src/ai_company/communication/meeting/config.py (1): PositionPapersConfig (36-55)
- src/ai_company/communication/meeting/enums.py (2): MeetingPhase (20-46), MeetingProtocolType (6-17)
- src/ai_company/communication/meeting/errors.py (1): MeetingBudgetExhaustedError (14-15)
- src/ai_company/communication/meeting/models.py (4): MeetingAgenda (77-96), MeetingContribution (99-128), MeetingMinutes (153-235), total_tokens (216-218)
- src/ai_company/communication/meeting/protocol.py (1): run (47-70)

src/ai_company/communication/meeting/protocol.py (3)
- src/ai_company/communication/meeting/enums.py (1): MeetingProtocolType (6-17)
- src/ai_company/communication/meeting/models.py (3): AgentResponse (23-52), MeetingAgenda (77-96), MeetingMinutes (153-235)
- src/ai_company/core/enums.py (1): Priority (162-168)

src/ai_company/communication/meeting/errors.py (1)
- src/ai_company/communication/errors.py (1): CommunicationError (12-44)

src/ai_company/communication/meeting/config.py (1)
- src/ai_company/communication/meeting/enums.py (1): MeetingProtocolType (6-17)

tests/unit/communication/meeting/test_enums.py (2)
- src/ai_company/communication/meeting/enums.py (3): MeetingPhase (20-46), MeetingProtocolType (6-17), MeetingStatus (49-72)
- tests/unit/communication/test_enums.py (6): test_member_count (19-20), test_member_count (38-39), test_member_count (56-57), test_member_count (67-68), test_member_count (78-79), test_member_count (99-100)

tests/unit/communication/meeting/test_models.py (1)
- src/ai_company/communication/meeting/models.py (8): ActionItem (131-150), AgentResponse (23-52), MeetingAgenda (77-96), MeetingAgendaItem (55-74), MeetingContribution (99-128), MeetingMinutes (153-235), MeetingRecord (238-290), total_tokens (216-218)

src/ai_company/communication/meeting/models.py (3)
- src/ai_company/communication/meeting/enums.py (3): MeetingPhase (20-46), MeetingProtocolType (6-17), MeetingStatus (49-72)
- src/ai_company/core/enums.py (1): Priority (162-168)
- src/ai_company/core/types.py (1): validate_unique_strings (25-37)

src/ai_company/communication/meeting/structured_phases.py (7)
- src/ai_company/communication/meeting/_prompts.py (1): build_agenda_prompt (9-28)
- src/ai_company/communication/meeting/_token_tracker.py (5): TokenTracker (7-38), is_exhausted (31-33), used (21-23), remaining (26-28), record (35-38)
- src/ai_company/communication/meeting/config.py (1): StructuredPhasesConfig (58-78)
- src/ai_company/communication/meeting/enums.py (2): MeetingPhase (20-46), MeetingProtocolType (6-17)
- src/ai_company/communication/meeting/errors.py (1): MeetingBudgetExhaustedError (14-15)
- src/ai_company/communication/meeting/models.py (4): MeetingAgenda (77-96), MeetingContribution (99-128), MeetingMinutes (153-235), total_tokens (216-218)
- src/ai_company/observability/_logger.py (1): get_logger (8-28)
🔇 Additional comments (1)
src/ai_company/observability/events/meeting.py (1)
5-40: Good centralization of meeting event names. Keeping the meeting telemetry surface in its own events module will make the new protocols and orchestrator logs consistent and easy to query.
Based on learnings, "Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget)".
DESIGN_SPEC.md
Outdated
> Meetings (§5.1 Pattern 3) follow configurable protocols that determine how agents interact during structured multi-agent conversations. Different meeting types naturally suit different protocols. All protocols implement a `MeetingProtocol` protocol, making the system extensible — new protocols can be registered and selected per meeting type. Cost bounds are enforced by `duration_tokens` in meeting config (§5.4).
>
> > **MVP: Not in M3.** Meetings are an M4 feature. Round-Robin (Protocol 1) is the initial default.
> > **Current state (M4 complete):** All 3 meeting protocols are implemented in `communication/meeting/`: `RoundRobinProtocol`, `PositionPapersProtocol`, and `StructuredPhasesProtocol`. The `MeetingOrchestrator` runs meetings end-to-end with token budget enforcement via `TokenBudgetTracker`. All protocols implement the `MeetingProtocol` protocol interface.
Update the spec to the current meeting symbol names.
These sections still refer to TokenBudgetTracker and AgendaItem, but the implemented symbols are TokenTracker and MeetingAgendaItem. Leaving the old names here sends readers to non-existent APIs.
Also applies to: 2421-2425
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@DESIGN_SPEC.md` at line 653, Update the spec text to use the current symbol
names: replace references to TokenBudgetTracker with TokenTracker and AgendaItem
with MeetingAgendaItem wherever they appear (including the section currently
mentioning all 3 meeting protocols and the noted repeat at lines ~2421-2425);
ensure any inline code formatting or API references match the implemented
symbols RoundRobinProtocol, PositionPapersProtocol, StructuredPhasesProtocol,
MeetingOrchestrator, and TokenTracker so readers are directed to existing APIs.
protocol_config: MeetingProtocolConfig = Field(
    default_factory=MeetingProtocolConfig,
    description="Meeting protocol configuration",
)
Update MeetingTypeConfig docs for the new public field.
protocol_config is now part of the public model, but the MeetingTypeConfig docstring still stops at duration_tokens. Please document the new attribute there so generated docs and IDE help stay accurate. As per coding guidelines, "All public classes and functions must have Google-style docstrings; ruff D rules enforce this".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/config.py` around lines 113 - 116, Update the
Google-style docstring for the public class MeetingTypeConfig to include the new
public attribute `protocol_config` (type MeetingProtocolConfig) and a short
description such as "Meeting protocol configuration" so generated docs and IDE
hints reflect the new field; ensure the docstring follows the existing
Google-style sections (Args/Attributes) and mentions `duration_tokens` and the
new `protocol_config` entry consistently with other attributes.
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from ai_company.communication.meeting.models import MeetingAgenda


def build_agenda_prompt(agenda: MeetingAgenda) -> str:
🧩 Analysis chain
🌐 Web query:
In Python 3.14 with PEP 649 lazy annotations, what happens when a function annotation references a type imported only inside `if TYPE_CHECKING:` and `typing.get_type_hints()` evaluates that annotation?
💡 Result:
With PEP 649 lazy annotations (Python 3.14+), the annotation expression isn’t evaluated at function definition time, so referencing a name that’s only imported under if TYPE_CHECKING: is “harmless” until you actually introspect/evaluate the annotations. However, typing.get_type_hints() does try to resolve/evaluate annotations, and if a referenced forward name can’t be resolved (because the TYPE_CHECKING import never ran at runtime), it raises an exception such as NameError at the moment get_type_hints() evaluates that annotation. [1]
This matches the general PEP 649/749 model: evaluation is deferred, but when the deferred evaluation runs, missing names can still produce runtime errors like NameError. [2]
Implication: if you want get_type_hints() to succeed, you must ensure the referenced type is available in the evaluation namespace (e.g., do a real runtime import, or pass an appropriate globalns/localns), or use annotationlib.get_annotations(..., format=Format.FORWARDREF or Format.STRING) when you need to inspect annotations that may contain undefined symbols without failing. [1]
Sources:
[1] Python 3.14 typing docs for get_type_hints() / TYPE_CHECKING behavior and the NameError warning for unresolvable forward references (https://docs.python.org/bn-in/3.14/library/typing.html)
[2] PEP 749 (implementation notes/examples showing deferred evaluation can still raise NameError when evaluated) (https://peps.python.org/pep-0749/)
🏁 Script executed:
# Check if there's a circular import between models.py and _prompts.py
cd src/ai_company/communication/meeting
echo "=== Checking models.py imports ==="
head -20 models.py | grep -E "^(import|from)" || echo "No imports in first 20 lines"
echo -e "\n=== Checking _prompts.py imports ==="
head -20 _prompts.py | grep -E "^(import|from)" || echo "No imports in first 20 lines"
echo -e "\n=== Checking if models.py imports from _prompts.py ==="
grep -n "_prompts" models.py || echo "No imports from _prompts in models.py"

Repository: Aureliolo/ai-company
Length of output: 477
🏁 Script executed:
# Read full _prompts.py to see the complete function implementation
cat -n src/ai_company/communication/meeting/_prompts.py

Repository: Aureliolo/ai-company
Length of output: 1097
🏁 Script executed:
# Check MeetingAgenda and MeetingAgendaItem definitions in models.py
cat -n src/ai_company/communication/meeting/models.py | head -100

Repository: Aureliolo/ai-company
Length of output: 3382
🏁 Script executed:
# Find usages of build_agenda_prompt
rg "build_agenda_prompt" --type py -B 2 -A 2

Repository: Aureliolo/ai-company
Length of output: 3462
🏁 Script executed:
# Check for test files
find src/ai_company/communication/meeting -name "*test*" -o -name "test_*"

Repository: Aureliolo/ai-company
Length of output: 46
🏁 Script executed:
# Check if there's documentation about what presenter_id should do
grep -rn "presenter_id" src/ai_company/communication/meeting/ --type pyRepository: Aureliolo/ai-company
Length of output: 181
🏁 Script executed:
# Search for presenter_id usage in meeting module
rg "presenter_id" src/ai_company/communication/meeting/Repository: Aureliolo/ai-company
Length of output: 266
🏁 Script executed:
# Check if there are any test files for the prompts
find . -path "*/test*" -name "*prompt*" -o -path "*/test*" -name "*meeting*" | grep -E "(test|spec)" | head -20Repository: Aureliolo/ai-company
Length of output: 239
🏁 Script executed:
# Check the docstring and usage context more broadly
rg "presenter" src/ai_company/communication/meeting/ -B 2 -A 2Repository: Aureliolo/ai-company
Length of output: 888
🏁 Script executed:
# Check unit tests for meeting module to see what's expected
ls -la tests/unit/communication/meeting/Repository: Aureliolo/ai-company
Length of output: 918
🏁 Script executed:
# Look for tests specifically for _prompts.py
cat tests/unit/communication/meeting/*prompt* 2>/dev/null || echo "No prompt tests found, checking all tests"
ls tests/unit/communication/meeting/Repository: Aureliolo/ai-company
Length of output: 296
🏁 Script executed:
# Check integration test for meeting to see how presenter_id is used
cat tests/integration/communication/test_meeting_integration.py | head -100Repository: Aureliolo/ai-company
Length of output: 2998
Use a real import for MeetingAgenda here.
build_agenda_prompt() exposes MeetingAgenda in a public annotation, but the name only exists under TYPE_CHECKING. With Python 3.14 lazy annotation evaluation, runtime consumers such as typing.get_type_hints() will raise NameError when trying to resolve the annotation at runtime, since the TYPE_CHECKING block never executes. No circular dependency exists between these modules, so a real import is safe.
Safer import pattern

```diff
-from typing import TYPE_CHECKING
-
-if TYPE_CHECKING:
-    from ai_company.communication.meeting.models import MeetingAgenda
+from ai_company.communication.meeting.models import MeetingAgenda
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
from ai_company.communication.meeting.models import MeetingAgenda


def build_agenda_prompt(agenda: MeetingAgenda) -> str:
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/meeting/_prompts.py` around lines 3 - 9, The
public annotation for MeetingAgenda in build_agenda_prompt is only available
under TYPE_CHECKING which causes runtime resolution errors; replace the
TYPE_CHECKING-only import with a real import of MeetingAgenda from
ai_company.communication.meeting.models at module level (remove the conditional
block or add a normal import alongside it) so that build_agenda_prompt can be
resolved at runtime by typing.get_type_hints() and other consumers.
```python
for i, item in enumerate(agenda.items, 1):
    entry = f"  {i}. {item.title}"
    if item.description:
        entry += f" — {item.description}"
    parts.append(entry)
```
Don't drop presenter_id from the shared agenda prompt.
MeetingAgendaItem includes an optional presenter assignment in src/ai_company/communication/meeting/models.py, but this formatter strips it. Any protocol that uses this helper loses agenda ownership and can prompt the wrong meeting flow for presenter-specific items.
📝 Suggested formatter fix

```diff
 for i, item in enumerate(agenda.items, 1):
     entry = f"  {i}. {item.title}"
+    if item.presenter_id:
+        entry += f" (presenter: {item.presenter_id})"
     if item.description:
         entry += f" — {item.description}"
     parts.append(entry)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/meeting/_prompts.py` around lines 23 - 27, The
agenda item formatter in _prompts.py currently drops
MeetingAgendaItem.presenter_id when building entries from agenda.items; update
the loop that builds entry (for i, item in enumerate(agenda.items, 1)) to
include the optional presenter assignment when present (e.g., append a presenter
indicator using item.presenter_id or resolved presenter name) so presenter
ownership is preserved in the shared agenda prompt and any downstream protocol
that consumes MeetingAgendaItem can route presenter-specific flows correctly.
```python
@dataclasses.dataclass
class TokenTracker:
    """Mutable token budget tracker scoped to a single meeting execution.

    Attributes:
        budget: Total token budget for the meeting.
        input_tokens: Total prompt tokens consumed so far.
        output_tokens: Total response tokens generated so far.
    """

    budget: int
    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def used(self) -> int:
        """Total tokens consumed so far."""
        return self.input_tokens + self.output_tokens

    @property
    def remaining(self) -> int:
        """Tokens remaining in the budget."""
        return max(0, self.budget - self.used)

    @property
    def is_exhausted(self) -> bool:
        """Whether the budget is fully consumed."""
        return self.remaining == 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Record token usage from an agent call."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
```
🛠️ Refactor suggestion | 🟠 Major
Prefer copy-on-write state over mutating TokenTracker in place.
This helper introduces shared mutable state via a dataclass and record(), which goes against the repo’s runtime-state convention and makes retries or parallel branches harder to reason about. Returning an updated tracker, or switching to a frozen Pydantic model plus model_copy(update=...), would keep budget accounting consistent with the rest of the codebase.
As per coding guidelines, "Create new objects instead of mutating existing ones" and "Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/meeting/_token_tracker.py` around lines 6 - 38,
TokenTracker currently mutates shared state via record(), violating the repo
convention of copy-on-write runtime state; replace in-place mutation by making
TokenTracker immutable and having record return a new instance with updated
input_tokens/output_tokens (or convert to a frozen Pydantic model and use
model_copy(update=...)). Update the TokenTracker dataclass to be frozen (or use
Pydantic BaseModel frozen) and change the record method signature to return
TokenTracker with the new totals; keep the used, remaining, and is_exhausted
properties as-is but computed from the new instance to preserve behavior. Ensure
all call sites that previously relied on in-place updates now reassign the
returned TokenTracker from record(...) to propagate state.
```python
async def test_contributions_structure(
    self,
    simple_agenda: MeetingAgenda,
    leader_id: str,
    participant_ids: tuple[str, ...],
    meeting_id: str,
) -> None:
    caller = make_mock_agent_caller()
    protocol = PositionPapersProtocol(config=PositionPapersConfig())

    minutes = await protocol.run(
        meeting_id=meeting_id,
        agenda=simple_agenda,
        leader_id=leader_id,
        participant_ids=participant_ids,
        agent_caller=caller,
        token_budget=10000,
    )

    # 3 position papers + 1 synthesis = 4 contributions
    assert len(minutes.contributions) == 4

    # Position papers
    papers = [
        c for c in minutes.contributions if c.phase == MeetingPhase.POSITION_PAPER
    ]
    assert len(papers) == 3

    # Synthesis
    synthesis = [
        c for c in minutes.contributions if c.phase == MeetingPhase.SYNTHESIS
    ]
    assert len(synthesis) == 1

async def test_parallel_execution_all_participants(
    self,
    simple_agenda: MeetingAgenda,
    leader_id: str,
    participant_ids: tuple[str, ...],
    meeting_id: str,
) -> None:
    caller = make_mock_agent_caller()
    protocol = PositionPapersProtocol(config=PositionPapersConfig())

    minutes = await protocol.run(
        meeting_id=meeting_id,
        agenda=simple_agenda,
        leader_id=leader_id,
        participant_ids=participant_ids,
        agent_caller=caller,
        token_budget=10000,
    )

    paper_agents = {
        c.agent_id
        for c in minutes.contributions
        if c.phase == MeetingPhase.POSITION_PAPER
    }
    assert paper_agents == set(participant_ids)
```
🧹 Nitpick | 🔵 Trivial
Pin the deterministic ordering contract in the parallel-path tests.
These assertions only check counts and set membership. If minutes.contributions regresses back to nondeterministic ordering, this file will still stay green. Please add an ordered assertion on turn_number or the expected participant sequence to lock in the regression fix.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/communication/meeting/test_position_papers.py` around lines 60 -
118, The tests test_contributions_structure and
test_parallel_execution_all_participants currently only assert counts and set
membership which won't catch nondeterministic ordering regressions; update them
to assert a deterministic order by inspecting minutes.contributions (filtering
by MeetingPhase.POSITION_PAPER) and then check the sequence of each
contribution's turn_number or agent_id matches the expected participant_ids
order (or a sorted/explicit expected sequence), e.g., build the ordered list
from [c for c in minutes.contributions if c.phase ==
MeetingPhase.POSITION_PAPER] and assert their turn_number or agent_id sequence
equals the expected ordering to lock in the deterministic contract.
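The gap between membership checks and ordered checks can be shown with plain tuples, which stand in for the real `MeetingContribution` objects:

```python
# (agent_id, turn_number) stand-ins; the real test would filter
# minutes.contributions by MeetingPhase.POSITION_PAPER first.
contributions = [("agent-a", 0), ("agent-b", 1), ("agent-c", 2)]
participant_ids = ("agent-a", "agent-b", "agent-c")

# A set-membership check passes even if the order regresses:
assert {agent for agent, _ in contributions} == set(participant_ids)

# Ordered checks pin the deterministic contract:
assert [agent for agent, _ in contributions] == list(participant_ids)
assert [turn for _, turn in contributions] == [0, 1, 2]
print("ordered contract holds")
```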
```python
async def test_budget_exhaustion_stops_turns(
    self,
    simple_agenda: MeetingAgenda,
    leader_id: str,
    meeting_id: str,
) -> None:
    # Each call uses 30 tokens, budget is 50 (20% reserve = 40 discussion).
    # agent-a: 30 used (< 40), agent-b: 60 used (>= 40, stops after).
    # Budget check is pre-turn, so the call that crosses is completed.
    # With leader_summarizes=True (default), budget exhaustion raises
    # MeetingBudgetExhaustedError when summary cannot be generated.
    caller = make_mock_agent_caller(input_tokens=10, output_tokens=20)
    config = RoundRobinConfig(
        max_turns_per_agent=5,
        max_total_turns=100,
    )
    protocol = RoundRobinProtocol(config=config)
    participants = ("agent-a", "agent-b", "agent-c")

    with pytest.raises(MeetingBudgetExhaustedError, match="budget exhausted"):
        await protocol.run(
            meeting_id=meeting_id,
            agenda=simple_agenda,
            leader_id=leader_id,
            participant_ids=participants,
            agent_caller=caller,
            token_budget=50,
        )
```
These tests lock in over-budget execution as valid behavior.
With token_budget=50 and a 30-token mock response, both cases expect the second turn to complete even though the meeting has already exceeded its declared budget. That contradicts the duration-token bound in the PR objective and will keep budget-overrun bugs green.
Also applies to: 249-280
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/communication/meeting/test_round_robin.py` around lines 220 - 247,
The test currently treats a call that would push total tokens past token_budget
as valid; update the test (test_budget_exhaustion_stops_turns and the similar
block at 249-280) so the RoundRobinProtocol enforces the budget pre-turn: invoke
protocol.run with make_mock_agent_caller and token_budget=50 but assert that the
protocol either raises MeetingBudgetExhaustedError before executing the turn
that would cross the budget or verify the mock caller was not invoked for that
crossing turn; specifically reference RoundRobinProtocol.run,
make_mock_agent_caller, MeetingBudgetExhaustedError and the token_budget
argument when changing the expectation so the test fails if an over-budget turn
is allowed to complete.
```python
# Leader response says no conflicts
responses = {
    "leader-agent": [
        "CONFLICTS: NO\nAll participants agree.",
        "Final synthesis with decisions.",
    ],
}
caller = make_mock_agent_caller(responses=responses)
```
Use the injected leader_id instead of hardcoding "leader-agent".
These mocked response maps are keyed to a literal leader ID even though the test already receives leader_id. If that fixture changes or becomes parameterized, the custom responses stop matching and these tests fail for the wrong reason.
🧪 Proposed fix

```diff
     responses = {
-        "leader-agent": [
+        leader_id: [
             "CONFLICTS: NO\nAll participants agree.",
             "Final synthesis with decisions.",
         ],
     }
```

Apply the same replacement to the other response dictionaries in this file.
Also applies to: 143-149, 178-184, 215-221, 248-255, 330-333
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/communication/meeting/test_structured_phases.py` around lines 105
- 112, The test mocks use a hardcoded key "leader-agent" in the responses
mapping which breaks if the fixture leader_id changes; update each responses
dict (e.g., the one passed to make_mock_agent_caller in
test_structured_phases.py) to use the injected leader_id variable as the key
instead of the literal string, and apply the same change to the other response
dictionaries in this file (the blocks at the other ranges noted) so mocks match
the fixture-provided leader_id.
```diff
 def test_member_count(self) -> None:
-    assert len(MessageType) == 8
+    assert len(MessageType) == 9
```
Assert the new enum member directly.
Line 20 updates the count, but test_values() still never checks MessageType.MEETING_CONTRIBUTION. This can pass even if the new public value regresses.
🔎 Suggested test update

```diff
 def test_values(self) -> None:
     assert MessageType.TASK_UPDATE.value == "task_update"
     assert MessageType.QUESTION.value == "question"
     assert MessageType.ANNOUNCEMENT.value == "announcement"
     assert MessageType.REVIEW_REQUEST.value == "review_request"
     assert MessageType.APPROVAL.value == "approval"
     assert MessageType.DELEGATION.value == "delegation"
     assert MessageType.STATUS_REPORT.value == "status_report"
     assert MessageType.ESCALATION.value == "escalation"
+    assert MessageType.MEETING_CONTRIBUTION.value == "meeting_contribution"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/communication/test_enums.py` around lines 19 - 20, Update the test
to assert the new enum member directly instead of relying only on the enum
length: in tests/unit/communication/test_enums.py modify the test_values() (or
add to test_member_count) to include an explicit assertion that
MessageType.MEETING_CONTRIBUTION exists (e.g., assert
MessageType.MEETING_CONTRIBUTION in MessageType or assert hasattr(MessageType,
"MEETING_CONTRIBUTION")) so the test fails if that public enum value is removed
or renamed.
- Add ConflictDetector protocol + KeywordConflictDetector (pluggable strategy)
- Refactor structured_phases.py for immutability (sub-methods return contributions)
- Add inverse status constraints on MeetingRecord
- Add TokenTracker validation (positive budget, non-negative tokens)
- Fix MeetingMinutes participant validators (duplicate IDs, leader-in-participants)
- Add _format_exception for nested ExceptionGroup handling
- Fix orchestrator budget exhaustion logging and task creation error handling
- Update DESIGN_SPEC.md class names (TokenTracker, MeetingProtocolConfig)
- Add test_token_tracker.py, test_prompts.py, and expand all test files
Actionable comments posted: 4
♻️ Duplicate comments (17)
tests/unit/communication/meeting/test_position_papers.py (1)
63-121: ⚠️ Potential issue | 🟡 Minor

Assert the deterministic POSITION_PAPER order.

These tests still pass if minutes.contributions regresses back to nondeterministic ordering, because they only check counts and set membership. Add an ordered assertion over the paper contributions (agent_id and/or turn_number) to pin the regression fix.

🧪 Suggested assertions

```diff
     papers = [
         c for c in minutes.contributions if c.phase == MeetingPhase.POSITION_PAPER
     ]
     assert len(papers) == 3
+    assert [c.agent_id for c in papers] == list(participant_ids)
+    assert [c.turn_number for c in papers] == list(range(len(participant_ids)))
     ...
     assert paper_agents == set(participant_ids)
+    assert [
+        c.agent_id
+        for c in minutes.contributions
+        if c.phase == MeetingPhase.POSITION_PAPER
+    ] == list(participant_ids)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/communication/meeting/test_position_papers.py` around lines 63 - 121, The tests currently only assert counts and set membership, which allows nondeterministic ordering to pass; update the two tests test_contributions_structure and test_parallel_execution_all_participants to assert a deterministic ordering for POSITION_PAPER contributions by extracting the list of contributions where c.phase == MeetingPhase.POSITION_PAPER from minutes.contributions and then assert their order (e.g., by comparing the sequence of c.agent_id or c.turn_number to the expected order derived from participant_ids or expected turn numbers). Ensure you reference the minutes.contributions list, MeetingPhase.POSITION_PAPER, and the contribution fields agent_id and/or turn_number when implementing the ordered assertion so the regression to nondeterministic ordering is pinned.

src/ai_company/communication/meeting/position_papers.py (2)
156-168: ⚠️ Potential issue | 🟠 Major

Return structured decisions/action items, not only summary.

MeetingMinutes exposes decisions and action_items, and auto_create_tasks depends on them. This return path only copies synthesis_contribution.content, so successful position-papers meetings still emit empty structured outputs and won't create tasks. Given MeetingContribution is free-text only, this likely needs a parser/helper here or a richer synthesis result type.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/position_papers.py` around lines 156 - 168, The return currently only sets MeetingMinutes.summary from synthesis_contribution.content, leaving MeetingMinutes.decisions and .action_items empty; update the position-papers finalization in the function that builds MeetingMinutes (the block returning MeetingMinutes with protocol_type=MeetingProtocolType.POSITION_PAPERS) to parse/derive structured decisions and action items from the synthesis_contribution (or from combined contributions) and populate MeetingMinutes.decisions and MeetingMinutes.action_items accordingly so auto_create_tasks can consume them; you can add a small helper (e.g., parse_decisions_and_actions_from_text) to extract items from synthesis_contribution.content or change the synthesis step to return a richer object and map its fields into MeetingMinutes.decisions and .action_items.
203-205: ⚠️ Potential issue | 🟠 Major

Don't overcommit the paper-phase budget.

max(1, tracker.remaining // n) guarantees every worker at least one token. When tracker.remaining < n, the slices already sum to more than the meeting budget before any agent call starts.

💸 Suggested guard

```diff
-    tokens_per_agent = max(1, tracker.remaining // max(1, n))
+    if n > 0 and tracker.remaining < n:
+        msg = "Token budget exhausted before collecting position papers"
+        raise MeetingBudgetExhaustedError(
+            msg,
+            context={
+                "meeting_id": meeting_id,
+                "budget": tracker.budget,
+                "used": tracker.used,
+                "participant_count": n,
+            },
+        )
+    tokens_per_agent = tracker.remaining // max(1, n)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/position_papers.py` around lines 203 - 205, The current pre-division line tokens_per_agent = max(1, tracker.remaining // max(1, n)) can overcommit when tracker.remaining < n; instead compute a base and remainder and distribute without exceeding tracker.remaining: set base = tracker.remaining // n and rem = tracker.remaining % n, then assign each agent tokens = base + (1 if agent_index < rem else 0) (this yields some agents 0 when tracker.remaining < n and guarantees the total sum <= tracker.remaining). Update the logic around tokens_per_agent and any loop that consumes it (referencing tokens_per_agent, tracker.remaining and n) to use this distribution scheme so no more tokens are allocated than available.

DESIGN_SPEC.md (1)
2418-2425: ⚠️ Potential issue | 🟡 Minor

Use the current model name in the project tree.

AgendaItem is stale here; the implemented symbol is MeetingAgendaItem. Leaving the old name in §15.3 sends readers to a non-existent API.

📝 Suggested fix

```diff
-│   │   │   ├── models.py    # MeetingRecord, AgendaItem, ActionItem, etc.
+│   │   │   ├── models.py    # MeetingRecord, MeetingAgendaItem, ActionItem, etc.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@DESIGN_SPEC.md` around lines 2418 - 2425, Update the project tree and §15.3 references to use the actual implemented symbol name: replace the stale "AgendaItem" entry with "MeetingAgendaItem" (and update any other mentions of AgendaItem to MeetingAgendaItem) so readers are directed to the existing API; ensure the package list line under the meeting subsystem and the §15.3 text use MeetingAgendaItem consistently.

src/ai_company/communication/meeting/_token_tracker.py (2)
16-18: 🛠️ Refactor suggestion | 🟠 Major

TokenTracker still breaks the repo's copy-on-write state model.

A shared mutable dataclass here reintroduces in-place runtime state at the center of budget accounting. Prefer an immutable tracker that returns a new instance from record() (or a frozen Pydantic model with model_copy(update=...)). As per coding guidelines, "Create new objects instead of mutating existing ones" and "Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields".

Also applies to: 67-68

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/_token_tracker.py` around lines 16 - 18, TokenTracker is a mutable dataclass that violates the repo's copy-on-write model; change it to an immutable/frozen model and make its state-update methods return a new instance rather than mutating self. Specifically, convert TokenTracker to a frozen dataclass or a frozen Pydantic model and update the record() (and any other state-changing helpers referenced around TokenTracker) to perform a copy/update and return the new TokenTracker instance (e.g., via dataclasses.replace or model_copy(update=...)) so callers always receive a new object instead of mutating shared state.
41-68: ⚠️ Potential issue | 🟠 Major

Reject budget overruns before updating the totals.

record() validates negative deltas, but it still accepts a call that pushes used past budget. If an agent returns more tokens than its assigned slice or max_tokens, the protocols silently overspend and remaining hides it by clamping to zero. Compute the projected total first and raise MeetingBudgetExhaustedError before mutating state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/_token_tracker.py` around lines 41 - 68, The record() method currently updates input_tokens/output_tokens before checking budget, allowing silent overruns; modify record to compute projected_used = self.input_tokens + self.output_tokens + input_tokens + output_tokens and if projected_used > self.budget raise MeetingBudgetExhaustedError before mutating state, otherwise proceed to increment self.input_tokens and self.output_tokens; keep the existing non-negative validation and use the existing property names remaining, is_exhausted, budget, used, input_tokens, output_tokens and the MeetingBudgetExhaustedError exception.

src/ai_company/communication/meeting/structured_phases.py (3)
441-445: ⚠️ Potential issue | 🟠 Major

Cap the conflict-check call with max_discussion_tokens.

This leader call currently spends from tracker.remaining, so the pre-discussion conflict check can consume the entire remaining meeting budget before the actual discussion round or synthesis runs. It needs to draw from the same StructuredPhasesConfig.max_discussion_tokens allowance as the discussion phase.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/structured_phases.py` around lines 441 - 445, The pre-discussion conflict check currently calls agent_caller(leader_id, conflict_prompt, tracker.remaining) and can consume the whole meeting budget; change it to cap the token allowance with StructuredPhasesConfig.max_discussion_tokens (e.g., compute allowed_tokens = min(tracker.remaining, StructuredPhasesConfig.max_discussion_tokens)) and pass allowed_tokens to agent_caller so conflict_response uses the same max_discussion_tokens limit as the discussion phase.
282-295: ⚠️ Potential issue | 🟠 Major

Persist structured decisions and action items from synthesis.

The synthesis step asks the leader for decisions and action items, but MeetingMinutes is returned with only summary. action_items therefore stays empty for structured-phases meetings, so downstream task creation never runs on this protocol. Parse the synthesis output (or switch to a structured response) and populate decisions/action_items before returning.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/structured_phases.py` around lines 282 - 295, The MeetingMinutes returned by finalize in structured_phases omits decisions and action_items, so downstream task creation never runs; update the finalize flow (inside structured_phases where MeetingMinutes(...) is constructed) to parse the synthesis output produced by the leader synthesis step (or switch that step to request a structured JSON response) and populate the MeetingMinutes.decisions and MeetingMinutes.action_items fields before returning; locate the synthesis result variable used earlier in the phase (the leader synthesis response) and extract/validate decisions and action items, converting them into the same types used by MeetingMinutes and include them in the MeetingMinutes(...) constructor along with summary.
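One possible parsing approach, sketched with assumed section markers; the `DECISIONS:` / `ACTION ITEMS:` headings below are illustrative and may not match the project's actual synthesis prompt format:

```python
def parse_synthesis(text: str) -> tuple[list[str], list[str]]:
    # Walk the synthesis line by line, switching the target list whenever
    # a section heading is seen, and collecting "- " bullet entries.
    decisions: list[str] = []
    action_items: list[str] = []
    current: list[str] | None = None
    for line in text.splitlines():
        stripped = line.strip()
        upper = stripped.upper()
        if upper.startswith("DECISIONS"):
            current = decisions
        elif upper.startswith("ACTION ITEMS"):
            current = action_items
        elif stripped.startswith("-") and current is not None:
            current.append(stripped.lstrip("- ").strip())
    return decisions, action_items


text = """Summary of the meeting.
DECISIONS:
- Adopt protocol A
ACTION ITEMS:
- agent-b drafts the spec
"""
decisions, actions = parse_synthesis(text)
print(decisions, actions)  # ['Adopt protocol A'] ['agent-b drafts the spec']
```

A structured (e.g., JSON) response from the leader would avoid this parsing entirely, which is the alternative the comment mentions.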
322-323: ⚠️ Potential issue | 🟠 Major

Don't oversubscribe the parallel input budget.

max(1, tracker.remaining // num_participants) gives every participant a token even when the remaining budget is smaller than the participant count, so the TaskGroup can launch more aggregate budget than the meeting has left. Fail fast or cap fan-out instead of forcing a minimum slice of 1.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/structured_phases.py` around lines 322 - 323, The current calculation in structured_phases.py that sets tokens_per_agent = max(1, tracker.remaining // max(1, num_participants)) can oversubscribe the TaskGroup when tracker.remaining < num_participants; change the logic to either fail fast or cap fan-out, e.g., compute allowed_agents = min(num_participants, tracker.remaining) and then set tokens_per_agent = max(1, tracker.remaining // allowed_agents) (or alternatively raise an error when tracker.remaining < num_participants); update the code paths that spawn the TaskGroup (referencing num_participants, tokens_per_agent, tracker.remaining and the TaskGroup spawn logic) to use allowed_agents so aggregate token usage never exceeds tracker.remaining.

src/ai_company/communication/meeting/round_robin.py (2)
125-126: ⚠️ Potential issue | 🟠 Major

Don't reserve summary budget when summaries are disabled.

With leader_summarizes=False, this still withholds 20% of the caller's budget from discussion. The protocol can stop early even though summary generation is disabled.

Suggested change

```diff
-        summary_reserve = int(token_budget * _SUMMARY_RESERVE_FRACTION)
+        summary_reserve = (
+            int(token_budget * _SUMMARY_RESERVE_FRACTION)
+            if self._config.leader_summarizes
+            else 0
+        )
         discussion_budget = token_budget - summary_reserve
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/round_robin.py` around lines 125 - 126, The code always reserves summary_reserve = int(token_budget * _SUMMARY_RESERVE_FRACTION) and subtracts it from discussion_budget even when leader_summarizes is False; change the logic in src/ai_company/communication/meeting/round_robin.py so that summary_reserve is only computed and withheld when leader_summarizes is True (otherwise set summary_reserve = 0 and discussion_budget = token_budget), updating the variables summary_reserve and discussion_budget accordingly where they are used so the protocol doesn't prematurely stop discussion when summaries are disabled.
185-197: ⚠️ Potential issue | 🟠 Major

Return structured decisions and action items from the summary phase.

The leader is prompted to list decisions and action items, but the returned MeetingMinutes only keeps the free-form summary. That leaves action_items empty on the successful round-robin path, so auto_create_tasks cannot work for round-robin meetings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/round_robin.py` around lines 185 - 197, The MeetingMinutes returned at the end of the round-robin flow only sets summary and omits the structured decisions/action_items captured during the summary phase; update the return in round_robin.py (the MeetingMinutes construction) to include the parsed decisions and action_items variables produced during the summary step (e.g., decisions and action_items or parsed_decisions/parsed_action_items) so MeetingMinutes.decisions and MeetingMinutes.action_items are populated for the successful round-robin path and allow auto_create_tasks to work.

tests/unit/communication/meeting/test_structured_phases.py (1)
107-112: ⚠️ Potential issue | 🟡 Minor

Use the injected leader_id in these mock response maps.

These tests already receive leader_id, but the response dictionaries are still keyed by the literal "leader-agent". If that fixture changes or becomes parameterized, the mocks stop matching and the failures no longer reflect protocol behavior.

Also applies to: 145-149, 180-184, 217-221, 249-251, 331-333, 385-390, 431-435

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/communication/meeting/test_structured_phases.py` around lines 107 - 112, Mocks in these tests hard-code the key "leader-agent" in the responses dict instead of using the injected leader_id fixture, which will break if leader_id changes; update each responses map (the dict named responses in the test) to use the leader_id variable as the key (e.g., replace "leader-agent": [...] with leader_id: [...]) for all occurrences so the mock matches the injected fixture value and any parameterizations of leader_id.
455-483: ⚠️ Potential issue | 🟠 Major

These two tests never enter the task-creation path.

Both tests use the default protocol/mocks, which currently produce no `ActionItem`s. That means the `auto_create_tasks=False` gate and the failing `task_creator` branch never execute, so these tests can pass even if `_create_tasks()` is broken. Reuse the stub-protocol pattern from `test_task_creator_called_with_correct_args()` and return minutes with at least one action item.

Also applies to: 485-511
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/communication/meeting/test_orchestrator.py` around lines 455 - 483, The tests test_task_creator_not_called_when_disabled (and the duplicate at 485-511) never exercise the task-creation path because the default stub protocol returns no ActionItem minutes; update these tests to reuse the stub-protocol used in test_task_creator_called_with_correct_args() so the protocol returns Minutes containing at least one ActionItem, then assert auto_create_tasks=False prevents _create_tasks() from calling the provided _creator. Specifically, adjust the protocol stub used when constructing orchestrator via _make_orchestrator to produce Minutes with an ActionItem (matching the stub pattern in test_task_creator_called_with_correct_args()), keep MeetingProtocolConfig(auto_create_tasks=False), pass the same _creator, and call orchestrator.run_meeting so the task-creation branch is actually reached and can be validated.

src/ai_company/communication/meeting/orchestrator.py (2)
86-96: ⚠️ Potential issue | 🟠 Major

Freeze the protocol registry to prevent external mutation.

The constructor stores the caller-owned `protocol_registry` directly. External mutations after construction can change protocol resolution behavior unexpectedly.

♻️ Proposed fix with defensive copy
```diff
+from copy import deepcopy
+from types import MappingProxyType
 ...
     def __init__(
         self,
         *,
         protocol_registry: Mapping[MeetingProtocolType, MeetingProtocol],
         agent_caller: AgentCaller,
         task_creator: TaskCreator | None = None,
     ) -> None:
-        self._protocol_registry = protocol_registry
+        self._protocol_registry: Mapping[MeetingProtocolType, MeetingProtocol] = (
+            MappingProxyType(deepcopy(dict(protocol_registry)))
+        )
         self._agent_caller = agent_caller
         self._task_creator = task_creator
         self._records: list[MeetingRecord] = []
```

As per coding guidelines, "Create new objects instead of mutating existing ones; use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement of non-Pydantic internal collections (registries, BaseTool)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/orchestrator.py` around lines 86 - 96, The constructor currently stores the caller-owned protocol_registry directly (parameter protocol_registry -> attribute self._protocol_registry), allowing external mutation; fix by creating a defensive copy and exposing it as read-only: deepcopy the incoming protocol_registry (use copy.deepcopy) and then wrap the result with types.MappingProxyType before assigning to self._protocol_registry so the internal registry cannot be mutated after construction (update imports as needed to include copy and MappingProxyType).
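The proposed pattern can be exercised standalone. In this sketch the `Orchestrator` class and the string-valued registry are placeholders for the real orchestrator and its `MeetingProtocolType`/`MeetingProtocol` types:

```python
from copy import deepcopy
from types import MappingProxyType


class Orchestrator:
    """Minimal sketch: snapshot a caller-owned registry at construction."""

    def __init__(self, protocol_registry: dict[str, str]) -> None:
        # deepcopy detaches from the caller's dict; MappingProxyType
        # makes the internal view read-only.
        self._registry = MappingProxyType(deepcopy(dict(protocol_registry)))

    def resolve(self, key: str) -> str:
        return self._registry[key]


caller_owned = {"round_robin": "RoundRobinProtocol"}
orch = Orchestrator(caller_owned)
caller_owned["round_robin"] = "mutated!"  # later external mutation

print(orch.resolve("round_robin"))  # still the snapshot value
```

Attempting to assign through the proxy raises `TypeError`, so even code holding a reference to the internal mapping cannot change protocol resolution after construction.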
358-409: ⚠️ Potential issue | 🟠 Major

Validate blank and duplicate IDs at the system boundary.

This validation only checks for empty `participant_ids` and leader inclusion. Blank/whitespace IDs and duplicate participants will pass through and fail later in `MeetingMinutes` validation after tokens have been spent. Validate these constraints here to fail fast.

🛡️ Proposed fix to add boundary validation
```diff
     def _validate_inputs(
         self,
         meeting_id: str,
         leader_id: str,
         participant_ids: tuple[str, ...],
         token_budget: int,
     ) -> None:
         """Validate meeting inputs.

         Raises:
             MeetingParticipantError: If participants are empty or leader is in participants.
             ValueError: If token_budget is not positive.
         """
         if token_budget <= 0:
             logger.warning(
                 MEETING_VALIDATION_FAILED,
                 meeting_id=meeting_id,
                 error=f"token_budget must be positive, got {token_budget}",
             )
             msg = f"token_budget must be positive, got {token_budget}"
             raise ValueError(msg)

+        # Validate leader_id is not blank
+        if not leader_id or not leader_id.strip():
+            logger.warning(
+                MEETING_VALIDATION_FAILED,
+                meeting_id=meeting_id,
+                error="leader_id must not be blank",
+            )
+            msg = "leader_id must not be blank"
+            raise MeetingParticipantError(
+                msg,
+                context={"meeting_id": meeting_id},
+            )
+
         if not participant_ids:
             logger.warning(
                 MEETING_VALIDATION_FAILED,
                 meeting_id=meeting_id,
                 error="at least one participant is required",
             )
             msg = "At least one participant is required"
             raise MeetingParticipantError(
                 msg,
                 context={"meeting_id": meeting_id},
             )
+
+        # Validate no blank participant IDs
+        blank_ids = [p for p in participant_ids if not p or not p.strip()]
+        if blank_ids:
+            logger.warning(
+                MEETING_VALIDATION_FAILED,
+                meeting_id=meeting_id,
+                error="participant_ids contain blank values",
+            )
+            msg = "participant_ids must not contain blank values"
+            raise MeetingParticipantError(
+                msg,
+                context={"meeting_id": meeting_id},
+            )
+
+        # Validate no duplicate participant IDs
+        if len(participant_ids) != len(set(participant_ids)):
+            from collections import Counter
+            dupes = [v for v, c in Counter(participant_ids).items() if c > 1]
+            logger.warning(
+                MEETING_VALIDATION_FAILED,
+                meeting_id=meeting_id,
+                error="duplicate participant_ids",
+                duplicates=dupes,
+            )
+            msg = f"Duplicate participant_ids: {dupes}"
+            raise MeetingParticipantError(
+                msg,
+                context={"meeting_id": meeting_id, "duplicates": dupes},
+            )
+
         if leader_id in participant_ids:
             ...
```

As per coding guidelines, "Validate at system boundaries (user input, external APIs, config files)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/orchestrator.py` around lines 358 - 409, The _validate_inputs method currently only checks for empty participant_ids, leader inclusion, and token_budget; extend it to also reject blank/whitespace participant IDs and duplicate participant IDs at the system boundary: iterate over participant_ids in _validate_inputs, trim each ID and if any are empty/only whitespace log MEETING_VALIDATION_FAILED and raise MeetingParticipantError with context {"meeting_id": meeting_id} and a clear message about blank IDs, and if len(set(participant_ids)) != len(participant_ids) log and raise MeetingParticipantError indicating duplicate participant IDs (include meeting_id and optionally leader_id in context), keeping existing token_budget and leader checks intact and using the same logger and error patterns used elsewhere in _validate_inputs.

src/ai_company/communication/meeting/models.py (2)
201-218: 🛠️ Refactor suggestion | 🟠 Major

Derive token totals from contributions to prevent inconsistency.

`total_input_tokens` and `total_output_tokens` are independent stored fields that can diverge from the sum of per-contribution token counts. Since `total_tokens` is already a `@computed_field`, consider making these derived as well to enforce consistency and avoid redundant state.

♻️ Proposed refactor to use computed fields
```diff
-    total_input_tokens: int = Field(
-        default=0,
-        ge=0,
-        description="Total prompt tokens",
-    )
-    total_output_tokens: int = Field(
-        default=0,
-        ge=0,
-        description="Total response tokens",
-    )
+    @computed_field  # type: ignore[prop-decorator]
+    @property
+    def total_input_tokens(self) -> int:
+        """Total prompt tokens consumed."""
+        return sum(c.input_tokens for c in self.contributions)
+
+    @computed_field  # type: ignore[prop-decorator]
+    @property
+    def total_output_tokens(self) -> int:
+        """Total response tokens generated."""
+        return sum(c.output_tokens for c in self.contributions)
```

As per coding guidelines, "Use `@computed_field` for derived values instead of storing redundant fields".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/models.py` around lines 201 - 218, The stored fields total_input_tokens and total_output_tokens can diverge from per-contribution sums; change them to computed fields that sum over the meeting's contributions (e.g., sum(c.input_tokens) for total_input_tokens and sum(c.output_tokens) for total_output_tokens) instead of keeping default stored ints, and ensure total_tokens remains (or is updated to be) the computed sum of those computed fields to remove redundant state; update the Field definitions/annotations for total_input_tokens and total_output_tokens to use `@computed_field` and reference the contributions relationship when calculating the totals (keeping the existing total_tokens property as-is or adjust to return the sum of the newly computed fields).
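The consistency argument can be demonstrated without Pydantic. This sketch uses plain dataclass properties in place of `@computed_field`, with hypothetical `Contribution`/`Minutes` stand-ins for the real models:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Contribution:
    agent_id: str
    input_tokens: int
    output_tokens: int


@dataclass(frozen=True)
class Minutes:
    """Totals are derived from contributions, never stored separately."""

    contributions: tuple[Contribution, ...] = field(default_factory=tuple)

    @property
    def total_input_tokens(self) -> int:
        return sum(c.input_tokens for c in self.contributions)

    @property
    def total_output_tokens(self) -> int:
        return sum(c.output_tokens for c in self.contributions)

    @property
    def total_tokens(self) -> int:
        return self.total_input_tokens + self.total_output_tokens


minutes = Minutes(
    contributions=(
        Contribution("a1", input_tokens=100, output_tokens=40),
        Contribution("a2", input_tokens=80, output_tokens=60),
    )
)
print(minutes.total_tokens)  # 280
```

Because the totals are computed on access, there is no second copy of the numbers that can drift out of sync with the contribution list.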
265-268: ⚠️ Potential issue | 🟡 Minor

Enforce non-blank `error_message` for failed statuses.

The validator at lines 284-296 requires `error_message` for `FAILED` and `BUDGET_EXHAUSTED` statuses, but using `str | None` still allows empty or whitespace-only strings to pass. Use `NotBlankStr | None` to ensure meaningful error descriptions.

🛠️ Proposed fix
```diff
-    error_message: str | None = Field(
+    error_message: NotBlankStr | None = Field(
         default=None,
         description="Error description on failure",
     )
```

Based on learnings, "Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/models.py` around lines 265 - 268, The error_message field uses type str | None which allows empty/whitespace strings; change its annotation to NotBlankStr | None (import NotBlankStr from core.types) so validators for statuses FAILED and BUDGET_EXHAUSTED enforce non-blank messages, i.e., replace "error_message: str | None = Field(...)" with "error_message: NotBlankStr | None = Field(...)" and add the necessary import.
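The project's `NotBlankStr` type itself is not shown in this diff, so here is a stand-in validator sketching the constraint it enforces; the function name is illustrative, not the real API:

```python
from __future__ import annotations


def validate_not_blank(value: str | None) -> str | None:
    """Reject empty or whitespace-only strings; None stays allowed (optional field)."""
    if value is not None and not value.strip():
        raise ValueError("error_message must not be blank when set")
    return value


print(validate_not_blank("budget exhausted"))  # budget exhausted
print(validate_not_blank(None))  # None
```

With a plain `str | None` annotation, `"   "` would pass silently; a constrained type applies this check at model construction so failed meetings always carry a meaningful error description.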
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/communication/meeting/position_papers.py`:
- Around line 212-229: The current code sets max_tokens using budget_slice but
ignores prompt token cost so agent_caller(...) can generate within max_tokens
while the prompt pushes tracker.used past token_budget; fix by estimating the
prompt token count before calling agent_caller (e.g., via the project's
tokenizer or a conservative heuristic) and compute output_allowance = max(0,
min(self._config.max_tokens_per_position, budget_slice -
estimated_prompt_tokens)); pass output_allowance as max_tokens to agent_caller,
record both estimated_prompt_tokens + response.output_tokens to tracker (instead
of waiting for response.input_tokens), and apply the same change to the other
occurrence around the block handling lines ~338-347 to ensure the
duration_tokens/token_budget bound is respected.
In `@src/ai_company/communication/meeting/protocol.py`:
- Around line 61-100: The MeetingProtocol interface currently only exposes run()
and get_protocol_type(), which prevents callers from composing standard
lifecycle phases; update the MeetingProtocol class to declare the standard
lifecycle methods start_meeting(...), run_round(...), summarize(...), and
extract_action_items(...) (with appropriate async signatures and the same domain
types used by run: meeting_id: str, agenda: MeetingAgenda, leader_id: str,
participant_ids: tuple[str, ...], agent_caller: AgentCaller, token_budget: int,
returning MeetingMinutes or specific partial results as appropriate) so
implementations must implement these phases, and keep run() as a convenience
orchestrator that can call those methods; reference MeetingProtocol, run,
get_protocol_type, start_meeting, run_round, summarize, and extract_action_items
when making the change.
In `@src/ai_company/communication/meeting/structured_phases.py`:
- Around line 467-472: The logger call using MEETING_CONFLICT_DETECTED currently
writes the full meeting analysis via raw_response=conflict_response.content;
replace this with non-sensitive metadata only (e.g., a boolean or redacted
summary) so sensitive participant text is not stored. Update the logger.info
call that includes MEETING_CONFLICT_DETECTED, meeting_id, conflicts_found
(conflicts_detected) to remove or redact raw_response and instead pass a
redact-safe field like conflict_summary (e.g., "redacted" or a short
summary/flag) or a hashed/length-only marker derived from
conflict_response.content; ensure references are to logger.info,
MEETING_CONFLICT_DETECTED, meeting_id, conflicts_found, and conflict_response so
reviewers can find and verify the change.
In `@tests/unit/communication/meeting/test_token_tracker.py`:
- Around line 36-47: Update the two tests to expect over-budget recording to
fail instead of being allowed: wrap the calls to TokenTracker.record in
pytest.raises(ValueError) (or the specific exception your TokenTracker raises)
so test_record_exceeds_budget and test_remaining_never_negative assert that
calling TokenTracker.record would raise when the requested tokens exceed the
budget; after the exception assert that tracker.used remains at its previous
value (e.g., 30 for test_record_exceeds_budget or 0 for the other),
tracker.remaining is 0, and tracker.is_exhausted is True; this changes
expectations for the TokenTracker.record method and uses the TokenTracker,
record, used, remaining, and is_exhausted symbols to locate the code to update.
---
Duplicate comments:
In `@DESIGN_SPEC.md`:
- Around line 2418-2425: Update the project tree and §15.3 references to use the
actual implemented symbol name: replace the stale "AgendaItem" entry with
"MeetingAgendaItem" (and update any other mentions of AgendaItem to
MeetingAgendaItem) so readers are directed to the existing API; ensure the
package list line under the meeting subsystem and the §15.3 text use
MeetingAgendaItem consistently.
In `@src/ai_company/communication/meeting/_token_tracker.py`:
- Around line 16-18: TokenTracker is a mutable dataclass that violates the
repo's copy-on-write model; change it to an immutable/frozen model and make its
state-update methods return a new instance rather than mutating self.
Specifically, convert TokenTracker to a frozen dataclass or a frozen Pydantic
model and update the record() (and any other state-changing helpers referenced
around TokenTracker) to perform a copy/update and return the new TokenTracker
instance (e.g., via dataclasses.replace or model_copy(update=...)) so callers
always receive a new object instead of mutating shared state.
- Around line 41-68: The record() method currently updates
input_tokens/output_tokens before checking budget, allowing silent overruns;
modify MeetingTokenTracker.record to compute projected_used = (self.used or
self.input_tokens + self.output_tokens) + input_tokens + output_tokens (or
simply self.input_tokens + self.output_tokens + input_tokens + output_tokens)
and if projected_used > self.budget raise MeetingBudgetExhaustedError before
mutating state, otherwise proceed to increment self.input_tokens and
self.output_tokens; keep the existing non-negative validation and use the
existing property names remaining, is_exhausted, budget, used, input_tokens,
output_tokens and the MeetingBudgetExhaustedError exception.
In `@src/ai_company/communication/meeting/models.py`:
- Around line 201-218: The stored fields total_input_tokens and
total_output_tokens can diverge from per-contribution sums; change them to
computed fields that sum over the meeting's contributions (e.g.,
sum(c.input_tokens) for total_input_tokens and sum(c.output_tokens) for
total_output_tokens) instead of keeping default stored ints, and ensure
total_tokens remains (or is updated to be) the computed sum of those computed
fields to remove redundant state; update the Field definitions/annotations for
total_input_tokens and total_output_tokens to use `@computed_field` and reference
the contributions relationship when calculating the totals (keeping the existing
total_tokens property as-is or adjust to return the sum of the newly computed
fields).
- Around line 265-268: The error_message field uses type str | None which allows
empty/whitespace strings; change its annotation to NotBlankStr | None (import
NotBlankStr from core.types) so validators for statuses FAILED and
BUDGET_EXHAUSTED enforce non-blank messages, i.e., replace "error_message: str |
None = Field(...)" with "error_message: NotBlankStr | None = Field(...)" and add
the necessary import.
In `@src/ai_company/communication/meeting/orchestrator.py`:
- Around line 86-96: The constructor currently stores the caller-owned
protocol_registry directly (parameter protocol_registry -> attribute
self._protocol_registry), allowing external mutation; fix by creating a
defensive copy and exposing it as read-only: deepcopy the incoming
protocol_registry (use copy.deepcopy) and then wrap the result with
types.MappingProxyType before assigning to self._protocol_registry so the
internal registry cannot be mutated after construction (update imports as needed
to include copy and MappingProxyType).
- Around line 358-409: The _validate_inputs method currently only checks for
empty participant_ids, leader inclusion, and token_budget; extend it to also
reject blank/whitespace participant IDs and duplicate participant IDs at the
system boundary: iterate over participant_ids in _validate_inputs, trim each ID
and if any are empty/only whitespace log MEETING_VALIDATION_FAILED and raise
MeetingParticipantError with context {"meeting_id": meeting_id} and a clear
message about blank IDs, and if len(set(participant_ids)) !=
len(participant_ids) log and raise MeetingParticipantError indicating duplicate
participant IDs (include meeting_id and optionally leader_id in context),
keeping existing token_budget and leader checks intact and using the same logger
and error patterns used elsewhere in _validate_inputs.
In `@src/ai_company/communication/meeting/position_papers.py`:
- Around line 156-168: The return currently only sets MeetingMinutes.summary
from synthesis_contribution.content, leaving MeetingMinutes.decisions and
.action_items empty; update the position-papers finalization in the function
that builds MeetingMinutes (the block returning MeetingMinutes with
protocol_type=MeetingProtocolType.POSITION_PAPERS) to parse/derive structured
decisions and action items from the synthesis_contribution (or from combined
contributions) and populate MeetingMinutes.decisions and
MeetingMinutes.action_items accordingly so auto_create_tasks can consume them;
you can add a small helper (e.g., parse_decisions_and_actions_from_text) to
extract items from synthesis_contribution.content or change the synthesis step
to return a richer object and map its fields into MeetingMinutes.decisions and
.action_items.
- Around line 203-205: The current pre-division line tokens_per_agent = max(1,
tracker.remaining // max(1, n)) can overcommit when tracker.remaining < n;
instead compute a base and remainder and distribute without exceeding
tracker.remaining: set base = tracker.remaining // n and rem = tracker.remaining
% n, then assign each agent tokens = base + (1 if agent_index < rem else 0)
(this yields some agents 0 when tracker.remaining < n and guarantees the total
sum <= tracker.remaining). Update the logic around tokens_per_agent and any loop
that consumes it (referencing tokens_per_agent, tracker.remaining and n) to use
this distribution scheme so no more tokens are allocated than available.
In `@src/ai_company/communication/meeting/round_robin.py`:
- Around line 125-126: The code always reserves summary_reserve =
int(token_budget * _SUMMARY_RESERVE_FRACTION) and subtracts it from
discussion_budget even when leader_summarizes is False; change the logic in
src/ai_company/communication/meeting/round_robin.py so that summary_reserve is
only computed and withheld when leader_summarizes is True (otherwise set
summary_reserve = 0 and discussion_budget = token_budget), updating the
variables summary_reserve and discussion_budget accordingly where they are used
so the protocol doesn't prematurely stop discussion when summaries are disabled.
- Around line 185-197: The MeetingMinutes returned at the end of the round-robin
flow only sets summary and omits the structured decisions/action_items captured
during the summary phase; update the return in round_robin.py (the
MeetingMinutes construction) to include the parsed decisions and action_items
variables produced during the summary step (e.g., decisions and action_items or
parsed_decisions/parsed_action_items) so MeetingMinutes.decisions and
MeetingMinutes.action_items are populated for the successful round-robin path
and allow auto_create_tasks to work.
In `@src/ai_company/communication/meeting/structured_phases.py`:
- Around line 441-445: The pre-discussion conflict check currently calls
agent_caller(leader_id, conflict_prompt, tracker.remaining) and can consume the
whole meeting budget; change it to cap the token allowance with
StructuredPhasesConfig.max_discussion_tokens (e.g., compute allowed_tokens =
min(tracker.remaining, StructuredPhasesConfig.max_discussion_tokens)) and pass
allowed_tokens to agent_caller so conflict_response uses the same
max_discussion_tokens limit as the discussion phase.
- Around line 282-295: The MeetingMinutes returned by finalize in
structured_phases omits decisions and action_items, so downstream task creation
never runs; update the finalize flow (inside structured_phases where
MeetingMinutes(...) is constructed) to parse the synthesis output produced by
the leader synthesis step (or switch that step to request a structured JSON
response) and populate the MeetingMinutes.decisions and
MeetingMinutes.action_items fields before returning; locate the synthesis result
variable used earlier in the phase (the leader synthesis response) and
extract/validate decisions and action items, converting them into the same types
used by MeetingMinutes and include them in the MeetingMinutes(...) constructor
along with summary.
- Around line 322-323: The current calculation in structured_phases.py that sets
tokens_per_agent = max(1, tracker.remaining // max(1, num_participants)) can
oversubscribe the TaskGroup when tracker.remaining < num_participants; change
the logic to either fail fast or cap fan-out—e.g., compute allowed_agents =
min(num_participants, tracker.remaining) and then set tokens_per_agent = max(1,
tracker.remaining // allowed_agents) (or alternatively raise an error when
tracker.remaining < num_participants); update the code paths that spawn the
TaskGroup (referencing num_participants, tokens_per_agent, tracker.remaining and
the TaskGroup spawn logic) to use allowed_agents so aggregate token usage never
exceeds tracker.remaining.
In `@tests/unit/communication/meeting/test_orchestrator.py`:
- Around line 455-483: The tests test_task_creator_not_called_when_disabled (and
the duplicate at 485-511) never exercise the task-creation path because the
default stub protocol returns no ActionItem minutes; update these tests to reuse
the stub-protocol used in test_task_creator_called_with_correct_args() so the
protocol returns Minutes containing at least one ActionItem, then assert
auto_create_tasks=False prevents _create_tasks() from calling the provided
_creator. Specifically, adjust the protocol stub used when constructing
orchestrator via _make_orchestrator to produce Minutes with an ActionItem
(matching the stub pattern in test_task_creator_called_with_correct_args()),
keep MeetingProtocolConfig(auto_create_tasks=False), pass the same _creator, and
call orchestrator.run_meeting so the task-creation branch is actually reached
and can be validated.
In `@tests/unit/communication/meeting/test_position_papers.py`:
- Around line 63-121: The tests currently only assert counts and set membership,
which allows nondeterministic ordering to pass; update the two tests
test_contributions_structure and test_parallel_execution_all_participants to
assert a deterministic ordering for POSITION_PAPER contributions by extracting
the list of contributions where c.phase == MeetingPhase.POSITION_PAPER from
minutes.contributions and then assert their order (e.g., by comparing the
sequence of c.agent_id or c.turn_number to the expected order derived from
participant_ids or expected turn numbers). Ensure you reference the
minutes.contributions list, MeetingPhase.POSITION_PAPER, and the contribution
fields agent_id and/or turn_number when implementing the ordered assertion so
the regression to nondeterministic ordering is pinned.
In `@tests/unit/communication/meeting/test_structured_phases.py`:
- Around line 107-112: Mocks in these tests hard-code the key "leader-agent" in
the responses dict instead of using the injected leader_id fixture, which will
break if leader_id changes; update each responses map (the dict named responses
in the test) to use the leader_id variable as the key (e.g., replace
"leader-agent": [...] with leader_id: [...]) for all occurrences so the mock
matches the injected fixture value and any parameterizations of leader_id.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7f1dc386-0c0c-49ef-a80e-d278593c8af8
📒 Files selected for processing (19)
- DESIGN_SPEC.md
- src/ai_company/communication/__init__.py
- src/ai_company/communication/meeting/__init__.py
- src/ai_company/communication/meeting/_token_tracker.py
- src/ai_company/communication/meeting/config.py
- src/ai_company/communication/meeting/enums.py
- src/ai_company/communication/meeting/models.py
- src/ai_company/communication/meeting/orchestrator.py
- src/ai_company/communication/meeting/position_papers.py
- src/ai_company/communication/meeting/protocol.py
- src/ai_company/communication/meeting/round_robin.py
- src/ai_company/communication/meeting/structured_phases.py
- tests/unit/communication/meeting/test_models.py
- tests/unit/communication/meeting/test_orchestrator.py
- tests/unit/communication/meeting/test_position_papers.py
- tests/unit/communication/meeting/test_prompts.py
- tests/unit/communication/meeting/test_protocol.py
- tests/unit/communication/meeting/test_structured_phases.py
- tests/unit/communication/meeting/test_token_tracker.py
```python
max_tokens = min(
    self._config.max_tokens_per_position,
    budget_slice,
)

logger.debug(
    MEETING_AGENT_CALLED,
    meeting_id=meeting_id,
    agent_id=participant_id,
    phase=MeetingPhase.POSITION_PAPER,
)

response = await agent_caller(
    participant_id,
    prompt,
    max_tokens,
)
tracker.record(response.input_tokens, response.output_tokens)
```
`max_tokens` here does not cap total token spend.

`agent_caller(..., max_tokens)` only limits generated tokens. Prompt tokens are discovered afterward via `response.input_tokens` and recorded unconditionally, so a large prompt can push `tracker.used` past `token_budget` even when every slice looked safe. That breaks the hard `duration_tokens` bound from §5.4; reserve prompt cost up front or derive the output allowance from `budget_slice - estimated_prompt_tokens` before dispatch.
Also applies to: 338-347
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/meeting/position_papers.py` around lines 212 -
229, The current code sets max_tokens using budget_slice but ignores prompt
token cost so agent_caller(...) can generate within max_tokens while the prompt
pushes tracker.used past token_budget; fix by estimating the prompt token count
before calling agent_caller (e.g., via the project's tokenizer or a conservative
heuristic) and compute output_allowance = max(0,
min(self._config.max_tokens_per_position, budget_slice -
estimated_prompt_tokens)); pass output_allowance as max_tokens to agent_caller,
record both estimated_prompt_tokens + response.output_tokens to tracker (instead
of waiting for response.input_tokens), and apply the same change to the other
occurrence around the block handling lines ~338-347 to ensure the
duration_tokens/token_budget bound is respected.
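A minimal sketch of the suggested pre-dispatch allowance. `estimate_tokens` is a hypothetical ~4-characters-per-token heuristic standing in for the project's real tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Conservative heuristic: ~4 characters per token, rounded up."""
    return max(1, -(-len(text) // 4))  # ceiling division


def output_allowance(
    prompt: str,
    budget_slice: int,
    max_tokens_per_position: int,
) -> int:
    """Cap generation so prompt + output stays within the budget slice."""
    estimated_prompt_tokens = estimate_tokens(prompt)
    return max(0, min(max_tokens_per_position, budget_slice - estimated_prompt_tokens))


# A 400-character prompt costs ~100 estimated tokens, so only
# 150 of the 250-token slice remain for generation.
print(output_allowance("x" * 400, budget_slice=250, max_tokens_per_position=500))  # 150
```

An allowance of 0 signals that the prompt alone would blow the slice, which the caller can treat as budget exhaustion before spending any tokens.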
```python
class MeetingProtocol(Protocol):
    """Strategy interface for meeting protocol implementations.

    Each implementation defines a different structure for how agents
    interact during a meeting (round-robin turns, parallel position
    papers, structured phases with discussion).
    """

    async def run(  # noqa: PLR0913
        self,
        *,
        meeting_id: str,
        agenda: MeetingAgenda,
        leader_id: str,
        participant_ids: tuple[str, ...],
        agent_caller: AgentCaller,
        token_budget: int,
    ) -> MeetingMinutes:
        """Execute the meeting protocol and produce minutes.

        Args:
            meeting_id: Unique identifier for this meeting.
            agenda: The meeting agenda.
            leader_id: ID of the agent leading the meeting.
            participant_ids: IDs of participating agents.
            agent_caller: Callback to invoke agents.
            token_budget: Maximum tokens for the entire meeting.

        Returns:
            Complete meeting minutes.
        """
        ...

    def get_protocol_type(self) -> MeetingProtocolType:
        """Return the protocol type this implementation handles.

        Returns:
            The meeting protocol type enum value.
        """
        ...
```
The MeetingProtocol surface no longer matches the linked objective.
#123 called for standard lifecycle operations (`start_meeting`, `run_round`, `summarize`, `extract_action_items`). With only `run()`/`get_protocol_type()`, each protocol becomes a black box and callers cannot compose or enforce those common phases consistently. Either expose the standard operations here or update the requirement/spec to match the abstraction you're actually shipping.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/meeting/protocol.py` around lines 61 - 100, The
MeetingProtocol interface currently only exposes run() and get_protocol_type(),
which prevents callers from composing standard lifecycle phases; update the
MeetingProtocol class to declare the standard lifecycle methods
start_meeting(...), run_round(...), summarize(...), and
extract_action_items(...) (with appropriate async signatures and the same domain
types used by run: meeting_id: str, agenda: MeetingAgenda, leader_id: str,
participant_ids: tuple[str, ...], agent_caller: AgentCaller, token_budget: int,
returning MeetingMinutes or specific partial results as appropriate) so
implementations must implement these phases, and keep run() as a convenience
orchestrator that can call those methods; reference MeetingProtocol, run,
get_protocol_type, start_meeting, run_round, summarize, and extract_action_items
when making the change.
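One possible shape for the expanded interface, sketched with placeholder types and signatures; the phase names follow the #123 objective, but the exact arguments and return types are assumptions, not the shipped API:

```python
import asyncio
from typing import Any, Protocol


class MeetingProtocolSketch(Protocol):
    """Hypothetical lifecycle surface following the #123 naming."""

    async def start_meeting(self, meeting_id: str, agenda: Any) -> None: ...
    async def run_round(self, round_number: int) -> list[str]: ...
    async def summarize(self) -> str: ...
    async def extract_action_items(self) -> list[str]: ...


class EchoProtocol:
    """Trivial conforming implementation used only for the demo."""

    async def start_meeting(self, meeting_id: str, agenda: Any) -> None:
        self._meeting_id = meeting_id

    async def run_round(self, round_number: int) -> list[str]:
        return [f"{self._meeting_id}:turn-{round_number}"]

    async def summarize(self) -> str:
        return "summary"

    async def extract_action_items(self) -> list[str]:
        return ["follow up"]


async def demo() -> list[str]:
    # Structural typing: EchoProtocol satisfies the Protocol without inheriting.
    proto: MeetingProtocolSketch = EchoProtocol()
    await proto.start_meeting("m-1", agenda=None)
    return await proto.run_round(1)


print(asyncio.run(demo()))  # ['m-1:turn-1']
```

With the phases exposed, `run()` can remain as a convenience orchestrator that calls them in order, while callers that need custom sequencing compose the phases directly.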
```python
logger.info(
    MEETING_CONFLICT_DETECTED,
    meeting_id=meeting_id,
    conflicts_found=conflicts_detected,
    raw_response=conflict_response.content,
)
```
Don't log raw meeting analysis at INFO level.
`raw_response=conflict_response.content` writes the leader's full conflict analysis into application logs. That can include participant arguments or other sensitive meeting content, while observability only needs a boolean or a redacted summary. Log metadata instead of the full text.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/meeting/structured_phases.py` around lines 467 -
472, The logger call using MEETING_CONFLICT_DETECTED currently writes the full
meeting analysis via raw_response=conflict_response.content; replace this with
non-sensitive metadata only (e.g., a boolean or redacted summary) so sensitive
participant text is not stored. Update the logger.info call that includes
MEETING_CONFLICT_DETECTED, meeting_id, conflicts_found (conflicts_detected) to
remove or redact raw_response and instead pass a redact-safe field like
conflict_summary (e.g., "redacted" or a short summary/flag) or a
hashed/length-only marker derived from conflict_response.content; ensure
references are to logger.info, MEETING_CONFLICT_DETECTED, meeting_id,
conflicts_found, and conflict_response so reviewers can find and verify the
change.
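One way to keep a correlation handle in the logs without storing the sensitive text is to log a truncated digest plus the length. This `redact_marker` helper is a sketch of that idea, not the project's code:

```python
import hashlib


def redact_marker(text: str) -> dict[str, object]:
    """Return log-safe metadata about sensitive text: a short digest
    for correlation plus the length, never the content itself."""
    return {
        "content_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest()[:12],
        "content_chars": len(text),
    }


# Hypothetical call site, mirroring the logger call under review:
# logger.info(
#     MEETING_CONFLICT_DETECTED,
#     meeting_id=meeting_id,
#     conflicts_found=conflicts_detected,
#     **redact_marker(conflict_response.content),
# )
```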
```python
def test_record_exceeds_budget(self) -> None:
    tracker = TokenTracker(budget=50)
    tracker.record(30, 30)
    assert tracker.used == 60
    assert tracker.remaining == 0
    assert tracker.is_exhausted is True


def test_remaining_never_negative(self) -> None:
    tracker = TokenTracker(budget=10)
    tracker.record(100, 100)
    assert tracker.remaining == 0
    assert tracker.is_exhausted is True
```
These tests lock in over-budget recording as valid behavior.
TokenTracker is supposed to underpin duration_tokens enforcement, but these expectations bless used > budget and only clamp remaining to zero after overspending. That makes exhaustion detectable only after the cap is already violated; once the tracker/enforcers are tightened, these cases should flip to expecting a failure instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/communication/meeting/test_token_tracker.py` around lines 36 - 47,
Update the two tests to expect over-budget recording to fail instead of being
allowed: wrap the calls to TokenTracker.record in pytest.raises(ValueError) (or
the specific exception your TokenTracker raises) so test_record_exceeds_budget
and test_remaining_never_negative assert that calling TokenTracker.record would
raise when the requested tokens exceed the budget; after the exception assert
that tracker.used remains at its previous value (e.g., 30 for
test_record_exceeds_budget or 0 for the other), tracker.remaining is 0, and
tracker.is_exhausted is True; this changes expectations for the
TokenTracker.record method and uses the TokenTracker, record, used, remaining,
and is_exhausted symbols to locate the code to update.
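If the tracker is tightened as suggested, a strict variant might look like the sketch below: `record()` rejects any call that would overspend, so `used` can never exceed `budget`. This is an illustration of the proposed behavior, not the project's `TokenTracker`:

```python
class StrictTokenTracker:
    """Budget tracker that rejects recordings which would overspend,
    so exhaustion is detected before the cap is violated."""

    def __init__(self, budget: int) -> None:
        self.budget = budget
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.budget - self.used

    @property
    def is_exhausted(self) -> bool:
        return self.remaining <= 0

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        total = prompt_tokens + completion_tokens
        if total > self.remaining:
            # Refuse the recording; state stays at its previous value.
            raise ValueError(
                f"recording {total} tokens would exceed the remaining "
                f"budget of {self.remaining}"
            )
        self.used += total
```

Under this contract, the two tests above would wrap the over-budget `record()` calls in `pytest.raises(ValueError)` and then assert that `used` is unchanged.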
Pull request overview
Copilot reviewed 33 out of 34 changed files in this pull request and generated 4 comments.
```python
budget_group = exc.subgroup(MeetingBudgetExhaustedError)
if budget_group is not None and len(budget_group.exceptions) == len(
    exc.exceptions
):
```
ExceptionGroup handling can misclassify mixed nested ExceptionGroups as fully budget-exhausted. subgroup(MeetingBudgetExhaustedError) preserves group structure, so the top-level len(...exceptions) comparison may still match even when some nested exceptions are non-budget errors, incorrectly returning BUDGET_EXHAUSTED. Use exc.split(MeetingBudgetExhaustedError) (or equivalent) and only mark BUDGET_EXHAUSTED when the remainder group is None.
Suggested change:

```diff
-budget_group = exc.subgroup(MeetingBudgetExhaustedError)
-if budget_group is not None and len(budget_group.exceptions) == len(
-    exc.exceptions
-):
+matched, remainder = exc.split(MeetingBudgetExhaustedError)
+if matched is not None and remainder is None:
```
```python
if not participant_ids:
    logger.warning(
        MEETING_VALIDATION_FAILED,
        meeting_id=meeting_id,
        error="at least one participant is required",
    )
    msg = "At least one participant is required"
    raise MeetingParticipantError(
        msg,
        context={"meeting_id": meeting_id},
    )
if leader_id in participant_ids:
    logger.warning(
```
run_meeting() documents that input validation errors are raised directly, but _validate_inputs() does not reject duplicate participant_ids. Duplicates currently get detected later by MeetingMinutes validation, which then gets caught as a generic failure and turned into a FAILED record instead of a MeetingParticipantError. Add a uniqueness check here (e.g., via validate_unique_strings(...)) so invalid participant lists are handled consistently as validation errors.
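A sketch of the up-front uniqueness check, using plain `ValueError` in place of the project's `MeetingParticipantError` and a hypothetical `validate_participants` helper name:

```python
def find_duplicates(ids: tuple[str, ...]) -> list[str]:
    """Return duplicate IDs in first-seen order."""
    seen: set[str] = set()
    dups: list[str] = []
    for p in ids:
        if p in seen and p not in dups:
            dups.append(p)
        seen.add(p)
    return dups


def validate_participants(
    meeting_id: str, leader_id: str, participant_ids: tuple[str, ...]
) -> None:
    """Reject invalid participant lists before the meeting runs, so
    callers get a validation error rather than a late generic failure."""
    if not participant_ids:
        raise ValueError(f"{meeting_id}: at least one participant is required")
    if leader_id in participant_ids:
        raise ValueError(f"{meeting_id}: leader must not also be a participant")
    dups = find_duplicates(participant_ids)
    if dups:
        raise ValueError(f"{meeting_id}: duplicate participant_ids {dups}")
```

Running this check inside `_validate_inputs()` keeps the documented contract: invalid input raises directly instead of surfacing later as a FAILED meeting record.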
```python
assert all(r is not None for r in results), (  # noqa: S101
    f"Expected {n} position papers but some slots are None"
)
assert all(c is not None for c in contrib_results), (  # noqa: S101
    f"Expected {n} contributions but some slots are None"
)
```
These invariants are enforced with assert. Because Python can run with assertions disabled (-O), this could allow None entries to slip through and cause confusing downstream failures. Prefer an explicit runtime check and raise a normal exception (with context) if any slot is unexpectedly unfilled.
Suggested change:

```diff
-assert all(r is not None for r in results), (  # noqa: S101
-    f"Expected {n} position papers but some slots are None"
-)
-assert all(c is not None for c in contrib_results), (  # noqa: S101
-    f"Expected {n} contributions but some slots are None"
-)
+missing_paper_indices = [i for i, r in enumerate(results) if r is None]
+if missing_paper_indices:
+    raise RuntimeError(
+        f"Expected {n} position papers but some slots are None "
+        f"at indices {missing_paper_indices}"
+    )
+missing_contrib_indices = [i for i, c in enumerate(contrib_results) if c is None]
+if missing_contrib_indices:
+    raise RuntimeError(
+        f"Expected {n} contributions but some slots are None "
+        f"at indices {missing_contrib_indices}"
+    )
```
```python
assert all(r is not None for r in result_inputs), (  # noqa: S101
    f"Expected {num_participants} inputs but some slots are None"
)
assert all(c is not None for c in result_contributions), (  # noqa: S101
    f"Expected {num_participants} contributions but some slots are None"
)
```
These invariants are enforced with assert. If assertions are disabled (python -O), result_inputs / result_contributions could still contain None and the function would proceed, leading to harder-to-debug type errors later. Prefer explicit runtime checks and raise a normal exception if any slot is unexpectedly unfilled.
Suggested change:

```diff
-assert all(r is not None for r in result_inputs), (  # noqa: S101
-    f"Expected {num_participants} inputs but some slots are None"
-)
-assert all(c is not None for c in result_contributions), (  # noqa: S101
-    f"Expected {num_participants} contributions but some slots are None"
-)
+missing_input_indices = [i for i, r in enumerate(result_inputs) if r is None]
+if missing_input_indices:
+    raise RuntimeError(
+        f"Expected {num_participants} inputs but found None for indices "
+        f"{missing_input_indices}"
+    )
+missing_contribution_indices = [
+    i for i, c in enumerate(result_contributions) if c is None
+]
+if missing_contribution_indices:
+    raise RuntimeError(
+        f"Expected {num_participants} contributions but found None for indices "
+        f"{missing_contribution_indices}"
+    )
```
Critical (C1-C4):
- Parse decisions/action_items from LLM synthesis in all 3 meeting protocols
- Validate winning_agent_id exists in find_losers() before computing losers

Major (M1-M17):
- Guard summary budget reserve when leader_summarizes=False
- Add synthesis sub-reserve in structured phases discussion
- Reject duplicate participant_ids in meeting orchestrator
- Freeze protocol registry with MappingProxyType
- Warn when token tracker exceeds budget
- Add hierarchy tiebreaker to pick_highest_seniority()
- Wire hierarchy into debate/hybrid authority fallbacks
- Fast-path get_lowest_common_manager(a, a) → a
- Validate _SENIORITY_ORDER matches enum members at import
- Remove dead max_tokens_per_argument config field
- Verify task IDs match plan subtask IDs in DecompositionResult
- Return CANCELLED for mixed completed+cancelled terminal states
- Fix double-logging in rollup compute() for empty case
- Copy subtask dependencies from plan to created Tasks
- Reject duplicate subtask IDs in RoutingResult
- Wake all pending waiters on unsubscribe (not just one)

Minor (m1-m15):
- Remove duplicate MEETING_CONFLICT_DETECTED log events
- Replace assert with explicit raises in meeting protocols
- Include presenter_id in formatted agenda prompt
- Validate token aggregates in MeetingMinutes
- Require non-empty error_message for FAILED/BUDGET_EXHAUSTED
- Move _MIN_POSITIONS to local constant in service.py
- Precompute seniority rank dict for O(1) lookups
- Remove dead asyncio.QueueFull catch on unbounded queue
- Fix racy state check in _log_receive_null (acquire lock)
- Type channel_name as NotBlankStr in messenger
- Document unsubscribe as None return path in receive()
- Preserve traceback context in parallel.py re-raise
- Validate parent_task.id matches plan.parent_task_id
- Add logging before raises in routing model validators

Trivial (t1-t4):
- Use centralized event constant in routing scorer
- Add task_structure/coordination_topology to Task docstring
- Fix DESIGN_SPEC.md model/function names to match code
- Fix StructuredPhasesConfig docstring

Tests (T1-T5):
- Assert MEETING_CONTRIBUTION enum value
- Add timeout markers to all meeting test modules
- Add 3+ participant test for authority/debate strategies
- Remove dead max_tokens_per_argument test references
- Update HybridResolver tests for new hierarchy parameter

Closes #169
## Summary

- **46 findings** from external reviewers (Copilot, Greptile, CodeRabbit, Gemini) on PRs #164-#167, all addressed
- **4 critical**: Parse decisions/action_items from LLM synthesis in all 3 meeting protocols; validate `winning_agent_id` in `find_losers()`
- **17 major**: Token budget guards, duplicate participant rejection, frozen registry, hierarchy tiebreakers, dependency copying, wake-all-waiters on unsubscribe, and more
- **15 minor**: Duplicate log removal, assert→raise, traceback preservation, O(1) seniority lookups, NotBlankStr typing, routing validation ordering
- **4 trivial**: Centralized event constants, docstring/spec fixes
- **6 test/doc gaps**: New `test_parsing.py` (18 tests), expanded tests across 7 modules, timeout markers, spec name corrections
- **Pre-PR review fixes**: List-item regex crossing line boundaries, parent_task_id validation ordering, circular exception cause, dead code removal

## Test plan

- [x] All 3634 unit tests pass
- [x] Ruff lint clean
- [x] Mypy strict clean
- [x] Pre-commit hooks pass
- [ ] CI pipeline (lint + type-check + test + coverage)

Closes #169
Summary

- `MeetingProtocol` protocol interface
- `MeetingAgenda`, `MeetingMinutes`, `MeetingContribution`, `ActionItem`, `MeetingRecord` (all frozen Pydantic v2)
- `TokenTracker` (mutable budget tracking), `build_agenda_prompt` (DRY prompt builder)
- `observability/events/meeting.py`
- `MeetingError` → `MeetingBudgetExhaustedError`, `MeetingProtocolNotFoundError`, `MeetingParticipantError`, `MeetingAgentError`
- `MeetingProtocolConfig` with per-protocol settings, `auto_create_tasks` at top level
- `asyncio.TaskGroup` in position papers and structured phases protocols
- Raises `MeetingBudgetExhaustedError` on exhaustion

Closes #123
Pre-PR Review

Pre-reviewed by 9 agents, 38 findings addressed.

Key fixes from review:
- Extracted shared `_TokenTracker` and `_prompts` modules (eliminated duplication across 3 protocols)
- Split large `run()` methods into focused helpers (<50 lines each)
- Fixed `TaskCreator` type alias (str → `Priority`)
- Added participant uniqueness and leader exclusion validators on `MeetingMinutes`

Test plan

- Unit tests pass (`pytest tests/ -n auto`)

🤖 Generated with Claude Code