fix: address post-merge review feedback from PRs #164-#167#170
Conversation
Critical (C1-C4):
- Parse decisions/action_items from LLM synthesis in all 3 meeting protocols
- Validate winning_agent_id exists in find_losers() before computing losers

Major (M1-M17):
- Guard summary budget reserve when leader_summarizes=False
- Add synthesis sub-reserve in structured phases discussion
- Reject duplicate participant_ids in meeting orchestrator
- Freeze protocol registry with MappingProxyType
- Warn when token tracker exceeds budget
- Add hierarchy tiebreaker to pick_highest_seniority()
- Wire hierarchy into debate/hybrid authority fallbacks
- Fast-path get_lowest_common_manager(a, a) → a
- Validate _SENIORITY_ORDER matches enum members at import
- Remove dead max_tokens_per_argument config field
- Verify task IDs match plan subtask IDs in DecompositionResult
- Return CANCELLED for mixed completed+cancelled terminal states
- Fix double-logging in rollup compute() for empty case
- Copy subtask dependencies from plan to created Tasks
- Reject duplicate subtask IDs in RoutingResult
- Wake all pending waiters on unsubscribe (not just one)

Minor (m1-m15):
- Remove duplicate MEETING_CONFLICT_DETECTED log events
- Replace assert with explicit raises in meeting protocols
- Include presenter_id in formatted agenda prompt
- Validate token aggregates in MeetingMinutes
- Require non-empty error_message for FAILED/BUDGET_EXHAUSTED
- Move _MIN_POSITIONS to local constant in service.py
- Precompute seniority rank dict for O(1) lookups
- Remove dead asyncio.QueueFull catch on unbounded queue
- Fix racy state check in _log_receive_null (acquire lock)
- Type channel_name as NotBlankStr in messenger
- Document unsubscribe as None return path in receive()
- Preserve traceback context in parallel.py re-raise
- Validate parent_task.id matches plan.parent_task_id
- Add logging before raises in routing model validators

Trivial (t1-t4):
- Use centralized event constant in routing scorer
- Add task_structure/coordination_topology to Task docstring
- Fix DESIGN_SPEC.md model/function names to match code
- Fix StructuredPhasesConfig docstring

Tests (T1-T5):
- Assert MEETING_CONTRIBUTION enum value
- Add timeout markers to all meeting test modules
- Add 3+ participant test for authority/debate strategies
- Remove dead max_tokens_per_argument test references
- Update HybridResolver tests for new hierarchy parameter

Closes #169

Follow-up:
- Fix list-item regex crossing line boundaries (\s* → [^\S\n]*)
- Move parent_task_id validation before empty-agents early return
- Fix circular exception cause in parallel.py re-raise
- Remove unused COMM_UNSUBSCRIBE_SENTINEL_FAILED constant
- Use NotBlankStr for error_message field (replaces manual check)
- Add logger + logging before raises in parsing/position_papers/structured_phases
- Fix import ordering in rollup.py
- Remove dead max_tokens_per_argument from DESIGN_SPEC.md examples
- Correct M3 status in README.md
- Improve docstrings across bus_memory, helpers, hybrid_strategy, orchestrator
- Add test_parsing.py (18 tests) + expand tests in 7 existing modules
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
⚠️ Review failed: the pull request is closed. Review profile: ASSERTIVE. Files selected for processing: 19.
📝 Walkthrough (summary by CodeRabbit)

This PR addresses 40+ post-merge review feedback items from four recently merged PRs by fixing critical validation gaps, implementing missing parsing logic for meeting decisions/action items, adding hierarchy-based tiebreaking to conflict resolution, improving waiter handling in the message bus, and enhancing decomposition/routing validation with proper ID consistency checks.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 4 passed, ❌ 1 failed (warning).
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly refines the multi-agent system by incorporating extensive feedback from external reviews. The changes focus on improving the robustness and functionality of core communication mechanisms, particularly within meeting protocols and conflict resolution. Key updates include more reliable message bus operations, enhanced parsing of LLM outputs for structured decision-making, and stricter data integrity checks across various models. These improvements aim to make the system more stable, predictable, and easier to debug, while also clarifying design specifications and updating development status.
Pull request overview
Addresses post-merge review feedback from PRs #164–#167 by tightening validation, improving meeting protocol outputs (decisions/action items parsing), and expanding test coverage to prevent regressions across communication + engine subsystems.
Changes:
- Add meeting summary/synthesis parsing helpers and populate `decisions`/`action_items` in all meeting protocols.
- Harden routing/decomposition/conflict-resolution invariants (ID validation, hierarchy tiebreakers, dependency propagation, multi-waiter unsubscribe wakeups).
- Expand/adjust unit + integration tests (including consistent `pytest.mark.timeout(30)` in meeting tests).
Reviewed changes
Copilot reviewed 55 out of 55 changed files in this pull request and generated 6 comments.
Summary per file:
| File | Description |
|---|---|
| `src/ai_company/communication/meeting/_parsing.py` | New parsing utilities for decisions/action items from LLM text |
| `src/ai_company/communication/meeting/{round_robin,position_papers,structured_phases}.py` | Populate minutes with parsed decisions/action items; adjust budget handling/invariants |
| `src/ai_company/communication/meeting/_token_tracker.py` | Add logging when token usage exceeds budget / invalid counts |
| `src/ai_company/communication/meeting/orchestrator.py` | Freeze protocol registry and reject duplicate participants |
| `src/ai_company/communication/bus_memory.py` | Wake all pending receivers on unsubscribe; make receive-null logging non-racy |
| `src/ai_company/engine/{routing,decomposition}/...` | Add/adjust validation + propagation (parent_task_id, dependencies, rollups) and event constant usage |
| `src/ai_company/core/enums.py` | Guard seniority ordering and make comparisons O(1) |
| `tests/**` | Add coverage for the above behaviors + timeout markers |
| `README.md`, `DESIGN_SPEC.md` | Spec/doc alignment with implemented behavior and naming |
```python
from ai_company.observability import get_logger

logger = get_logger(__name__)
```
get_logger() result is assigned to logger but never used in this module, which should trigger Ruff's unused-variable check. Either remove the observability import/logger, or actually use it (e.g., to log when a section header is found/missing).
```diff
-        self._protocol_registry = protocol_registry
+        self._protocol_registry: MappingProxyType[
+            MeetingProtocolType, MeetingProtocol
+        ] = MappingProxyType(copy.deepcopy(dict(protocol_registry)))
```
Using copy.deepcopy() on the protocol registry will also deep-copy protocol implementation instances. That’s risky/unnecessary (custom protocol objects may not be deepcopy-safe and it can duplicate internal state/resources). Prefer making the mapping immutable via MappingProxyType(dict(protocol_registry)) (shallow copy) and, if needed, validate the registry contents rather than deep-copying arbitrary objects.
```diff
-        ] = MappingProxyType(copy.deepcopy(dict(protocol_registry)))
+        ] = MappingProxyType(dict(protocol_registry))
```
```python
            # points. The asymmetry with the lock-guarded increment
            # is intentional — the decrement must happen after
            # _await_with_shutdown completes.
            self._waiters[key] = max(0, self._waiters.get(key, 0) - 1)
```
receive() decrements _waiters[key] in the finally block even if unsubscribe() has already pop()'d that key. This re-inserts the key with a 0 count, causing _waiters to accumulate stale entries over time (and can skew future sentinel counts if code changes). Update the decrement logic to avoid re-adding missing keys and/or delete the key when the count reaches 0 (ideally under the lock).
```diff
-            self._waiters[key] = max(0, self._waiters.get(key, 0) - 1)
+            current = self._waiters.get(key)
+            if current is None:
+                # Key was removed (for example, by unsubscribe); do not recreate it.
+                pass
+            elif current <= 1:
+                # Last waiter for this key; remove entry entirely.
+                self._waiters.pop(key, None)
+            else:
+                self._waiters[key] = current - 1
```
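As a standalone sketch of the pattern the suggestion encodes (hypothetical helper and plain dict, not the bus's real attributes or locking):

```python
def decrement_waiter(waiters: dict[str, int], key: str) -> None:
    """Decrement a waiter count without resurrecting removed keys.

    If unsubscribe() already popped the key, leave it absent; if this is
    the last waiter, drop the entry instead of leaving a zero behind.
    """
    current = waiters.get(key)
    if current is None:
        return  # key removed concurrently; do not recreate it
    if current <= 1:
        waiters.pop(key, None)  # last waiter: remove entry entirely
    else:
        waiters[key] = current - 1
```

In the real bus this would run under the same lock as the increment so the check-then-act sequence stays atomic.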
```python
        if not all(r is not None for r in result_inputs):
            msg = f"Expected {num_participants} inputs but some slots are None"
            logger.error(msg, meeting_id=meeting_id)
            raise RuntimeError(msg)
        if not all(c is not None for c in result_contributions):
            msg = f"Expected {num_participants} contributions but some slots are None"
            logger.error(msg, meeting_id=meeting_id)
            raise RuntimeError(msg)
```
These invariant failures are logged with logger.error(msg, ...), which makes the event name dynamic and bypasses the centralized observability.events.meeting constants pattern used elsewhere in this module. Log with a stable meeting event constant (e.g., MEETING_FAILED/MEETING_VALIDATION_FAILED) and include msg in a structured field (like error=msg).
```python
        if not all(r is not None for r in results):
            msg = f"Expected {n} position papers but some slots are None"
            logger.error(msg, meeting_id=meeting_id)
            raise RuntimeError(msg)
        if not all(c is not None for c in contrib_results):
            msg = f"Expected {n} contributions but some slots are None"
            logger.error(msg, meeting_id=meeting_id)
            raise RuntimeError(msg)
```
These invariant failures are logged with logger.error(msg, ...), which makes the event name dynamic and bypasses the centralized observability.events.meeting constants pattern used elsewhere in this module. Log with a stable meeting event constant (e.g., MEETING_FAILED/MEETING_VALIDATION_FAILED) and include msg in a structured field (like error=msg).
```diff
         if input_tokens < 0 or output_tokens < 0:
             msg = (
                 f"Token counts must be non-negative, got "
-                f"input_tokens={input_tokens}, output_tokens={output_tokens}"
+                f"input_tokens={input_tokens}, "
+                f"output_tokens={output_tokens}"
             )
+            logger.warning(
+                MEETING_BUDGET_EXHAUSTED,
+                error=msg,
+                input_tokens=input_tokens,
+                output_tokens=output_tokens,
+            )
             raise ValueError(msg)
```
TokenTracker.record() logs MEETING_BUDGET_EXHAUSTED even when the problem is invalid input (negative token counts). That event name implies a normal budget exhaustion scenario and can confuse monitoring/alerts. Consider logging MEETING_VALIDATION_FAILED (or a dedicated token-tracking/invalid-usage event) for negative counts, while keeping MEETING_BUDGET_EXHAUSTED for actual over-budget conditions.
Code Review
This pull request introduces automated extraction of decisions and action items from LLM-generated meeting summaries, while also addressing numerous findings and significantly improving the codebase through stricter data validation, robust concurrency handling in the message bus, and new features like hierarchy-based tie-breaking. However, a critical security vulnerability exists due to insufficient safeguards against Indirect Prompt Injection, where raw agent responses in prompts could allow a malicious agent to manipulate the LLM's output to create unauthorized tasks. Additionally, there is a high-severity concern that the new LLM response parser may not correctly handle multi-line list items, potentially leading to truncated data.
```python
def parse_action_items(
    summary_text: str,
) -> tuple[ActionItem, ...]:
    """Parse action items from an LLM summary/synthesis response.

    Looks for an "Action Items" section header, then extracts
    bulleted or numbered list items. Attempts to detect assignee
    information within each item.

    Args:
        summary_text: The full summary/synthesis text from the LLM.

    Returns:
        Tuple of ActionItem instances (may be empty).
    """
    section = _extract_section(summary_text, _ACTION_ITEMS_HEADER_RE)
    if not section:
        return ()

    items: list[ActionItem] = []
    for match in _LIST_ITEM_RE.finditer(section):
        raw_text = match.group(1).strip()
        if not raw_text:
            continue

        description, assignee_id = _parse_assignee(raw_text)
        if not description:
            continue

        items.append(
            ActionItem(
                description=description,
                assignee_id=assignee_id,
            )
        )

    return tuple(items)
```
The parse_action_items function extracts assignee_id from free-form LLM text without any validation against the meeting's participant list. This allows an attacker to use prompt injection to assign tasks to arbitrary agents who may not even be part of the meeting, potentially bypassing intended workflow boundaries.
```python
        synthesis_text = synthesis_contribution.content
        decisions = parse_decisions(synthesis_text)
        action_items = parse_action_items(synthesis_text)
```
Action items are parsed directly from LLM synthesis output. Since the synthesis prompt (built in _build_synthesis_prompt) concatenates raw responses from other agents, it is vulnerable to indirect prompt injection. A malicious agent can inject instructions to cause the synthesizer to output unauthorized action items.
Recommendation: Validate that the assignee_id in each action item is either the meeting leader or one of the participants.
```python
synthesis_text = synthesis_contribution.content
decisions = parse_decisions(synthesis_text)
raw_action_items = parse_action_items(synthesis_text)
# Validate assignees are participants or the leader
allowed_assignees = set(participant_ids) | {leader_id}
action_items = [
    item for item in raw_action_items
    if item.assignee_id is None or item.assignee_id in allowed_assignees
]
```

```python
        all_contributions = (*contributions, summary_contribution)
        decisions = parse_decisions(summary)
        action_items = parse_action_items(summary)
```
The round-robin protocol is vulnerable to indirect prompt injection because the transcript (containing raw agent responses) is included in the summary prompt. This allows an agent to manipulate the final list of action items.
Recommendation: Validate that the assignee_id in each action item is either the meeting leader or one of the participants.
```python
all_contributions = (*contributions, summary_contribution)
decisions = parse_decisions(summary)
raw_action_items = parse_action_items(summary)
# Validate assignees are participants or the leader
allowed_assignees = set(participant_ids) | {leader_id}
action_items = [
    item for item in raw_action_items
    if item.assignee_id is None or item.assignee_id in allowed_assignees
]
```

```python
        decisions = parse_decisions(summary)
        action_items = parse_action_items(summary)
```
The structured phases protocol is vulnerable to indirect prompt injection because participant inputs are concatenated into the synthesis prompt. This allows a participant to inject instructions that manipulate the resulting action items.
Recommendation: Validate that the assignee_id in each action item is either the meeting leader or one of the participants.
```python
decisions = parse_decisions(summary)
raw_action_items = parse_action_items(summary)
# Validate assignees are participants or the leader
allowed_assignees = set(participant_ids) | {leader_id}
action_items = [
    item for item in raw_action_items
    if item.assignee_id is None or item.assignee_id in allowed_assignees
]
```

```python
_LIST_ITEM_RE = re.compile(
    r"^[^\S\n]*(?:\d+[\.\)][^\S\n]*|-[^\S\n]*|\*[^\S\n]*|\u2022[^\S\n]*)(.+)",
    re.MULTILINE,
)
```
The _LIST_ITEM_RE regex only captures single-line list items. If a decision or action item from the LLM spans multiple lines, only the first line will be captured. This could lead to truncated and incomplete items, which could be critical for decisions and action items.
For example, with this input:

```text
# Decisions
1. This is a decision
that spans multiple lines.
2. This is a single line decision.
```

The parser would extract "This is a decision" and "This is a single line decision.", losing the second line of the first decision.
Consider updating the regex to handle multi-line list items, for example by using a non-greedy match with the re.DOTALL flag that continues until the next list item marker or the end of the section.
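One possible shape for such a multi-line-aware pattern, as a sketch (the names and exact pattern are illustrative, not the module's real constants):

```python
import re

# An item starts at a bullet/number marker and runs lazily until the
# next marker line or the end of the section.
_MULTILINE_ITEM_RE = re.compile(
    r"^[^\S\n]*(?:\d+[.)]|[-*\u2022])[^\S\n]*"   # list marker
    r"(.+?)"                                     # item body (lazy)
    r"(?=^[^\S\n]*(?:\d+[.)]|[-*\u2022])|\Z)",   # next marker or end
    re.MULTILINE | re.DOTALL,
)


def extract_items(section: str) -> list[str]:
    """Collapse internal whitespace so wrapped items come back whole."""
    return [
        " ".join(match.group(1).split())
        for match in _MULTILINE_ITEM_RE.finditer(section)
    ]
```

With the example above (header already stripped by section extraction), the wrapped first item is kept intact.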
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
src/ai_company/engine/decomposition/rollup.py (1)
Lines 44-59: ⚠️ Potential issue | 🟡 Minor

Keep `DECOMPOSITION_ROLLUP_COMPUTED` consistent on the empty-input branch. This branch returns a valid zeroed rollup, but it logs the same event at WARNING and without `derived_status`, unlike the normal path. That makes empty inputs look like operational faults and gives the event two different payload shapes. As per coding guidelines: "Log at WARNING or ERROR with context for all error paths before raising exceptions" and "Log at DEBUG for object creation, internal flow, and entry/exit of key functions".

Proposed adjustment:

```diff
     if total == 0:
-        logger.warning(
-            DECOMPOSITION_ROLLUP_COMPUTED,
-            parent_task_id=parent_task_id,
-            total=0,
-            reason="rollup computed with no subtask statuses",
-        )
-        return SubtaskStatusRollup(
+        rollup = SubtaskStatusRollup(
             parent_task_id=parent_task_id,
             total=0,
             completed=0,
             failed=0,
             in_progress=0,
             blocked=0,
             cancelled=0,
         )
+        logger.debug(
+            DECOMPOSITION_ROLLUP_COMPUTED,
+            parent_task_id=parent_task_id,
+            total=0,
+            derived_status=rollup.derived_parent_status.value,
+            reason="rollup computed with no subtask statuses",
+        )
+        return rollup
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/decomposition/rollup.py` around lines 44-59: the empty-input branch logs DECOMPOSITION_ROLLUP_COMPUTED at WARNING without derived_status and thus differs from the normal path and misreports an operational fault; update the total == 0 branch to log the same event shape as the normal path (include derived_status, or the same field name used elsewhere) and use the same log level as non-error creation (change to DEBUG if the normal path logs creation at DEBUG) before returning the zeroed SubtaskStatusRollup(parent_task_id=..., total=0, completed=0, failed=0, in_progress=0, blocked=0, cancelled=0) so the event payload and severity remain consistent with SubtaskStatusRollup creation.

src/ai_company/communication/conflict_resolution/debate_strategy.py (1)
Lines 255-262: ⚠️ Potential issue | 🟡 Minor

Keep the fallback reasoning aligned with the actual tiebreaker. Lines 255-262 can now choose a winner via hierarchy when seniority ties, but the returned reasoning still says the winner "has highest seniority". That makes the audit trail false for equal-level conflicts. Please distinguish pure seniority wins from hierarchy tiebreak wins in `JudgeDecision.reasoning`.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/conflict_resolution/debate_strategy.py` around lines 255-262: the reasoning message claims "highest seniority" even when pick_highest_seniority resolved an equal-seniority tie via hierarchy; update the JudgeDecision.reasoning to reflect which tiebreaker was used. After calling pick_highest_seniority(conflict, hierarchy=self._hierarchy), check whether the win was resolved from multiple agents with equal seniority (e.g., detect if the conflict has multiple agents at best.agent_level, or if pick_highest_seniority can return/indicate a tie-break flag); if it was a hierarchy tiebreak, set reasoning to something like "Debate fallback: hierarchy tiebreak among equal-seniority agents — {best.agent_id} ({best.agent_level}) selected", otherwise keep "authority-based judging — {best.agent_id} ({best.agent_level}) has highest seniority". Ensure this logic is implemented where JudgeDecision is constructed so the audit trail accurately distinguishes pure seniority wins from hierarchy tiebreak wins.

src/ai_company/communication/conflict_resolution/_helpers.py (1)
Lines 21-79: 🛠️ Refactor suggestion | 🟠 Major

Break `find_losers()` into validation and extraction helpers. The new winner-integrity checks pushed this helper past the repo's 50-line ceiling, and it now mixes validation, logging, and the happy-path loser selection. Pulling winner validation into a small helper would keep the unhappy paths isolated and the core computation trivial. As per coding guidelines, "Keep functions under 50 lines and files under 800 lines".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/conflict_resolution/_helpers.py` around lines 21-79: split validation from extraction by adding a small helper (e.g. ensure_winner_in_conflict or validate_winner_present(conflict, winner_id)) that performs the winner existence check, logs the CONFLICT_STRATEGY_ERROR, and raises ConflictStrategyError with the same context when missing; then simplify find_losers to call that helper and only perform the tuple comprehension (losers = tuple(pos for pos in conflict.positions if pos.agent_id != winner_id)); keep the "no losers" warning/raise in find_losers as the only remaining unhappy-path logic; update imports/refs accordingly.

src/ai_company/communication/meeting/orchestrator.py (1)
Lines 364-431: 🛠️ Refactor suggestion | 🟠 Major

Split `_validate_inputs()` into smaller validators. This method now bundles token-budget validation, empty-participant checks, duplicate detection, and leader-membership checks into one 68-line branchy helper. Extracting the participant-specific checks and shared log-and-raise scaffolding will make future validation changes safer and bring it back under the repo limit. As per coding guidelines, "Keep functions under 50 lines and files under 800 lines".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/meeting/orchestrator.py` around lines 364-431: the _validate_inputs method is too large and mixes token_budget checks with several participant checks; split it into smaller focused validators: create a _validate_token_budget(meeting_id, token_budget) that performs the positive check and logs/raises ValueError, and create a _validate_participants(meeting_id, leader_id, participant_ids) that contains the empty-participants, duplicate detection (use Counter), and leader-in-participants checks and raises MeetingParticipantError with the same context payloads; factor the common logging-and-raise pattern into a helper (e.g., _log_and_raise or _log_participant_error) used by both validators, then have _validate_inputs call these two new helpers to preserve behavior and messages (keep the names _validate_inputs, _validate_token_budget, _validate_participants, and the logging helper to locate changes).

src/ai_company/communication/conflict_resolution/hybrid_strategy.py (1)
Lines 228-240: ⚠️ Potential issue | 🟠 Major

Preserve authority-strategy validation in the fallback path. `pick_highest_seniority()` only compares seniority plus raw ancestor counts. That means equal-seniority conflicts the authority strategy would reject now silently resolve here — e.g. an agent missing from the hierarchy looks like depth 0 because `HierarchyResolver.get_ancestors()` returns `()`, and peers with no common manager collapse to an order-dependent winner. Since this branch is the hybrid's authority fallback, it should reuse the same hierarchy validation/tiebreak semantics as `AuthorityResolver` before building the hybrid resolution.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/conflict_resolution/hybrid_strategy.py` around lines 228-240: the hybrid fallback currently calls pick_highest_seniority(conflict, hierarchy=self._hierarchy) without reusing the AuthorityResolver's validation/tiebreak semantics; update the hybrid fallback to first run the same hierarchy validation used by AuthorityResolver (e.g., invoke the AuthorityResolver validation method or replicate its checks: ensure both agents exist in self._hierarchy, detect equal-seniority ties and missing-ancestor cases via HierarchyResolver.get_ancestors() semantics) and only then call pick_highest_seniority to build the ConflictResolution; if the AuthorityResolver would have rejected/abstained (tie or missing agent), the hybrid must not silently pick a winner but follow AuthorityResolver's outcome path (reject/abstain or escalate) before creating the RESOLVED_BY_HYBRID ConflictResolution.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Line 29: The sentence currently contradicts itself by saying "M4: Multi-Agent"
is in progress while also claiming "M3 Single Agent" is in progress; update the
README sentence to make milestone statuses consistent — e.g., mark "M3: Single
Agent" as complete if M4 is in progress, or mark "M4: Multi-Agent" as
planned/not started if M3 is still in progress, and rephrase the line that
mentions "M4: Multi-Agent" and "M3 Single Agent" so it unambiguously lists each
milestone with its correct status (using the exact labels "M4: Multi-Agent" and
"M3: Single Agent").
In `@src/ai_company/communication/bus_memory.py`:
- Around line 462-472: The decrement of self._waiters[key] after await in the
block with _await_with_shutdown can re-insert a key with value 0 even if
unsubscribe() removed it; to fix, change the post-await decrement to only modify
the dict if the key still exists and its current value is >0 (or decrement and
then remove the key when the resulting value is 0) so you never leave
zero-valued orphan entries in self._waiters — update the code around key,
self._waiters, and the finally block that runs after _await_with_shutdown() to
check existence and remove zero entries atomically.
In `@src/ai_company/communication/delegation/hierarchy.py`:
- Around line 231-232: get_lowest_common_manager currently returns agent_a when
agent_a == agent_b without verifying the agent exists; change this fast-path in
get_lowest_common_manager to first check membership in the hierarchy's
known-agent set (or build/maintain such a set during Hierarchy construction),
and only return agent_a if it is present; otherwise return None (or the existing
"no manager" sentinel). Locate get_lowest_common_manager and the class that
builds the hierarchy (e.g., Hierarchy.__init__ or similar) to add/verify the
known-agent collection and use it in the equality fast-path.
In `@src/ai_company/communication/meeting/_parsing.py`:
- Around lines 24-27: _ANY_HEADER_RE currently treats any line that ends with a
colon as a header, which can prematurely split sections (matches things like
"Note:" inside bodies); update the _ANY_HEADER_RE usage so the colon-terminated
alternative only counts as a header when it is followed by a non-blank next line
(use a positive lookahead to assert the next line starts with a non-whitespace
character), keeping the existing markdown-hash header branch intact; modify the
regex assigned to _ANY_HEADER_RE accordingly and keep re.MULTILINE.
In `@src/ai_company/communication/meeting/_token_tracker.py`:
- Around line 77-82: The negative-token validation path is currently logging
MEETING_BUDGET_EXHAUSTED which conflates caller validation errors with true
budget overruns; update the logging there (the logger.warning call in
_token_tracker.py that currently passes MEETING_BUDGET_EXHAUSTED) to emit a
distinct validation event (e.g., MEETING_INVALID_TOKEN_COUNT or
MEETING_TOKEN_VALIDATION_ERROR) and keep the same context fields (error=msg,
input_tokens, output_tokens) so dashboards/alerts can distinguish invalid input
from genuine exhaustion; add the new constant and swap its use in the validation
branch where negative counts are detected.
In `@src/ai_company/communication/meeting/models.py`:
- Around line 237-262: The validator _validate_token_aggregates currently
returns early when contributions is empty, allowing non-zero
total_input_tokens/total_output_tokens; change it so that when
self.contributions is empty you assert both totals are zero (raise ValueError if
total_input_tokens != 0 or total_output_tokens != 0) otherwise proceed to sum
contributions as implemented; update the error messages to reference the field
names (total_input_tokens/total_output_tokens) and expected zero when raising in
the empty-contributions case.
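The suggested validator logic, extracted into a plain function for clarity (the real code is a Pydantic validator on MeetingMinutes; the tuple representation of contributions here is an assumption):

```python
def validate_token_aggregates(
    contributions: list[tuple[int, int]],  # (input_tokens, output_tokens)
    total_input_tokens: int,
    total_output_tokens: int,
) -> None:
    if not contributions:
        # No contributions means both totals must be exactly zero.
        if total_input_tokens != 0 or total_output_tokens != 0:
            raise ValueError(
                "total_input_tokens/total_output_tokens must be 0 "
                "when there are no contributions"
            )
        return
    expected_in = sum(c[0] for c in contributions)
    expected_out = sum(c[1] for c in contributions)
    if (total_input_tokens, total_output_tokens) != (expected_in, expected_out):
        raise ValueError(
            "total_input_tokens/total_output_tokens do not match the "
            "sum over contributions"
        )
```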
In `@src/ai_company/communication/meeting/position_papers.py`:
- Around line 282-289: Replace the free-form logger.error calls in the two
invariant checks so they use stable event constants from
ai_company.observability.events (e.g., POSITION_PAPERS_MISSING and
CONTRIBUTIONS_MISSING) and structured kwargs rather than formatted strings: for
the results check (variable names results, n, meeting_id) call
logger.error(POSITION_PAPERS_MISSING, detail=msg, meeting_id=meeting_id) (after
importing the constant) and similarly for contrib_results use
logger.error(CONTRIBUTIONS_MISSING, detail=msg, meeting_id=meeting_id); keep
raising RuntimeError(msg) but ensure logging uses the event constant and
structured fields instead of logger.error(msg, ...).
- Around line 153-156: The synthesis output can be missing explicit "Decisions"
and "Action Items" headers so parse_decisions and parse_action_items receive
empty results; update _build_synthesis_prompt() to require the model to emit
clearly labeled, parser-friendly sections named exactly "Decisions:" and "Action
Items:" (or another agreed exact header text) and include examples/format
constraints (e.g., bullet list under each header) so
synthesis_contribution.content (synthesis_text) always contains those headers
for parse_decisions and parse_action_items to consume.
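A minimal sketch of such a format-constraint suffix (the exact wording is an assumption; what matters to the parsers is the literal "Decisions:" and "Action Items:" headers):

```python
def build_synthesis_format_instructions() -> str:
    """Hypothetical format constraints appended to the synthesis prompt."""
    return (
        "End your response with exactly these two labeled sections:\n"
        "\n"
        "Decisions:\n"
        "- <one agreed decision per bullet>\n"
        "\n"
        "Action Items:\n"
        "- <owner>: <task>\n"
    )
```
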
In `@src/ai_company/communication/meeting/round_robin.py`:
- Around line 165-166: The code calls parse_decisions(summary) and
parse_action_items(summary) but the prompt produced by _build_summary_prompt()
does not require "Decisions:" or "Action Items:" headers, so leaders can return
lists that the parsers miss; update _build_summary_prompt() to explicitly
require distinct "Decisions:" and "Action Items:" section headers (with
examples) and then add a small guard where decisions = parse_decisions(summary)
/ action_items = parse_action_items(summary) are invoked to validate the summary
contains those headers (e.g., check for the literal "Decisions:" and "Action
Items:") and if missing, either request the model to reformat or log/raise a
clear parsing error so empty results are not silently accepted.
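A small guard along those lines (names hypothetical) that fails loudly instead of silently accepting empty parse results:

```python
REQUIRED_HEADERS = ("Decisions:", "Action Items:")

def ensure_summary_headers(summary: str) -> None:
    """Raise ValueError if the leader summary lacks the parser-required headers."""
    missing = [h for h in REQUIRED_HEADERS if h not in summary]
    if missing:
        raise ValueError(f"summary missing required headers: {missing}")
```
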
In `@src/ai_company/engine/decomposition/models.py`:
- Around line 248-253: Update the public class docstring for SubtaskStatusRollup
to explicitly document the mixed terminal-state rule: when completed + cancelled
== total the rollup resolves to TaskStatus.CANCELLED (i.e., any mix of completed
and cancelled subtasks is considered CANCELLED), in addition to the existing
description that pure completed maps to COMPLETED, pure cancelled maps to
CANCELLED, and the remainder maps to IN_PROGRESS; reference the attributes
completed, cancelled, total and the TaskStatus.CANCELLED enum so callers can
rely on this contract.
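The documented contract reduces to a small decision table (a simplified stand-in for `SubtaskStatusRollup.derived_parent_status`, using strings in place of `TaskStatus` members):

```python
def derive_parent_status(completed: int, cancelled: int, total: int) -> str:
    """Simplified rollup rule: terminal mixes resolve to CANCELLED."""
    if total > 0 and completed == total:
        return "COMPLETED"
    if total > 0 and completed + cancelled == total:
        # Any mix of completed and cancelled terminal subtasks is CANCELLED.
        return "CANCELLED"
    return "IN_PROGRESS"
```
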
In `@tests/unit/engine/test_decomposition_models.py`:
- Around line 230-249: The test test_task_id_mismatch_rejected currently only
asserts that a ValueError is raised for ID mismatches; update it to also assert
the exception message contains the specific missing and extra IDs so regressions
that drop diagnostics are caught. Modify the pytest.raises match to look for the
missing plan ID ("sub-2") and the extra created ID ("sub-99") (or add an
explicit str(e) assertion inside the context) when constructing
DecompositionResult for the given DecompositionPlan and created_tasks, ensuring
the validator's diagnostic strings mention both IDs.
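The diagnostic shape the test should pin down might look like this (a self-contained stand-in for the `DecompositionResult` ID check):

```python
def check_task_ids(plan_ids: list[str], created_ids: list[str]) -> None:
    """Raise ValueError naming both missing and extra IDs."""
    missing = sorted(set(plan_ids) - set(created_ids))
    extra = sorted(set(created_ids) - set(plan_ids))
    if missing or extra:
        raise ValueError(f"task ID mismatch: missing={missing} extra={extra}")
```

A test can then assert that both offending IDs (e.g. "sub-2" and "sub-99") appear in `str(exc.value)`, so dropping either diagnostic becomes a test failure.
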
---
Outside diff comments:
In `@src/ai_company/communication/conflict_resolution/_helpers.py`:
- Around line 21-79: Split validation from extraction: add a small helper (e.g.
ensure_winner_in_conflict or validate_winner_present(conflict, winner_id)) that
performs the winner existence check, logs the CONFLICT_STRATEGY_ERROR and raises
ConflictStrategyError with the same context when missing; then simplify
find_losers to call that helper and only perform the tuple comprehension (losers
= tuple(pos for pos in conflict.positions if pos.agent_id != winner_id)), keep
the "no losers" warning/raise in find_losers as the only remaining unhappy-path
logic; update imports/refs accordingly.
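A sketch of the suggested split (positions modeled as plain dicts for illustration; the real code logs CONFLICT_STRATEGY_ERROR and raises ConflictStrategyError):

```python
def validate_winner_present(positions: list[dict], winner_id: str) -> None:
    """Raise if winner_id is not among the conflict's positions."""
    if winner_id not in {p["agent_id"] for p in positions}:
        raise ValueError(f"winner {winner_id!r} not found in conflict positions")

def find_losers(positions: list[dict], winner_id: str) -> tuple[dict, ...]:
    """Return every position except the winner's."""
    validate_winner_present(positions, winner_id)
    losers = tuple(p for p in positions if p["agent_id"] != winner_id)
    if not losers:
        # The only unhappy path left in find_losers itself.
        raise ValueError("conflict has no losing positions")
    return losers
```
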
In `@src/ai_company/communication/conflict_resolution/debate_strategy.py`:
- Around line 255-262: The reasoning message claims "highest seniority" even
when pick_highest_seniority resolved an equal-seniority tie via hierarchy;
update the JudgeDecision.reasoning to reflect which tiebreaker was used. After
calling pick_highest_seniority(conflict, hierarchy=self._hierarchy), check
whether the win was resolved from multiple agents with equal seniority (e.g.,
detect if conflict has multiple agents at best.agent_level or if
pick_highest_seniority can return/indicate a tie-break flag); if it was a
hierarchy tiebreak, set reasoning to something like "Debate fallback: hierarchy
tiebreak among equal-seniority agents — {best.agent_id} ({best.agent_level})
selected", otherwise keep "authority-based judging — {best.agent_id}
({best.agent_level}) has highest seniority". Ensure this logic is implemented
where JudgeDecision is constructed so the audit trail accurately distinguishes
pure seniority wins from hierarchy tiebreak wins.
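One way to surface the tiebreaker in the audit trail (a sketch; it assumes the caller can learn from pick_highest_seniority, or recompute, whether the top seniority was tied):

```python
def build_fallback_reasoning(winner_id: str, level: str, tied: bool) -> str:
    """Distinguish pure seniority wins from hierarchy tiebreaks."""
    if tied:
        return (
            "Debate fallback: hierarchy tiebreak among equal-seniority agents - "
            f"{winner_id} ({level}) selected"
        )
    return f"authority-based judging - {winner_id} ({level}) has highest seniority"

def top_seniority_tied(candidates: list[tuple[str, int]]) -> bool:
    """True when more than one agent shares the best seniority rank."""
    best = max(rank for _, rank in candidates)
    return sum(1 for _, rank in candidates if rank == best) > 1
```
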
In `@src/ai_company/communication/conflict_resolution/hybrid_strategy.py`:
- Around line 228-240: The hybrid fallback currently calls
pick_highest_seniority(conflict, hierarchy=self._hierarchy) without reusing the
AuthorityResolver's validation/tiebreak semantics; update the hybrid fallback to
first run the same hierarchy validation used by AuthorityResolver (e.g., invoke
the AuthorityResolver validation method or replicate its checks: ensure both
agents exist in self._hierarchy, detect equal seniority ties and
missing-ancestor cases via HierarchyResolver.get_ancestors() semantics) and only
then call pick_highest_seniority to build the ConflictResolution; if the
AuthorityResolver would have rejected/abstained (tie or missing agent), the
hybrid must not silently pick a winner but follow AuthorityResolver’s outcome
path (reject/abstain or escalate) before creating the RESOLVED_BY_HYBRID
ConflictResolution.
In `@src/ai_company/communication/meeting/orchestrator.py`:
- Around line 364-431: The _validate_inputs method is too large and mixes
token_budget checks with several participant checks; split it into smaller
focused validators: create a _validate_token_budget(meeting_id, token_budget)
that performs the positive check and logs/raises ValueError, and create a
_validate_participants(meeting_id, leader_id, participant_ids) that contains the
empty-participants, duplicate detection (use Counter), and
leader-in-participants checks and raises MeetingParticipantError with the same
context payloads; factor the common logging-and-raise pattern into a helper
(e.g., _log_and_raise or _log_participant_error) used by both validators, then
have _validate_inputs call these two new helpers to preserve behavior and
messages (keep names _validate_inputs, _validate_token_budget,
_validate_participants, and the logging helper to locate changes).
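The split might look like this (a minimal sketch; the real validators log structured events before raising, and use MeetingParticipantError rather than plain ValueError):

```python
from collections import Counter

def validate_token_budget(meeting_id: str, token_budget: int) -> None:
    """Budget must be strictly positive."""
    if token_budget <= 0:
        raise ValueError(f"{meeting_id}: token_budget must be positive")

def validate_participants(
    meeting_id: str, leader_id: str, participant_ids: list[str]
) -> None:
    """Reject empty lists, duplicate IDs, and a leader who is not a participant."""
    if not participant_ids:
        raise ValueError(f"{meeting_id}: participant_ids is empty")
    duplicates = sorted(p for p, n in Counter(participant_ids).items() if n > 1)
    if duplicates:
        raise ValueError(f"{meeting_id}: duplicate participant_ids: {duplicates}")
    if leader_id not in participant_ids:
        raise ValueError(f"{meeting_id}: leader {leader_id!r} not a participant")
```
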
In `@src/ai_company/engine/decomposition/rollup.py`:
- Around line 44-59: The empty-input branch logs DECOMPOSITION_ROLLUP_COMPUTED
at WARNING without the derived_status and thus differs from the normal path and
misreports an operational fault; update the total == 0 branch to log the same
event shape as the normal path (include derived_status="zeroed" or the same
field name used elsewhere) and use the same log level as non-error creation
(change to DEBUG if normal path logs creation at DEBUG) before returning the
zeroed SubtaskStatusRollup(parent_task_id=..., total=0, completed=0, failed=0,
in_progress=0, blocked=0, cancelled=0) so the event payload and severity remain
consistent with SubtaskStatusRollup creation.
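The consistency goal is that the empty branch emits the same event name, field set, and level as the populated branch. A sketch with an injected log callable (dict stands in for SubtaskStatusRollup; the single code path handles empty input naturally):

```python
from collections import Counter
from typing import Callable

def compute_rollup(
    parent_task_id: str,
    subtask_statuses: list[str],
    log: Callable[[str, str, dict], None],
) -> dict:
    """Count subtask statuses; empty input yields a zeroed rollup, not a warning."""
    counts = Counter(subtask_statuses)
    rollup = {
        "parent_task_id": parent_task_id,
        "total": len(subtask_statuses),
        "completed": counts.get("completed", 0),
        "failed": counts.get("failed", 0),
        "in_progress": counts.get("in_progress", 0),
        "blocked": counts.get("blocked", 0),
        "cancelled": counts.get("cancelled", 0),
    }
    # One path for both cases: identical event shape and severity.
    log("DEBUG", "DECOMPOSITION_ROLLUP_COMPUTED", rollup)
    return rollup
```
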
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7ba8cc0a-97c6-4a72-be1a-9f6a3c561cfd
📒 Files selected for processing (55)
DESIGN_SPEC.md, README.md, src/ai_company/communication/bus_memory.py, src/ai_company/communication/conflict_resolution/_helpers.py, src/ai_company/communication/conflict_resolution/config.py, src/ai_company/communication/conflict_resolution/debate_strategy.py, src/ai_company/communication/conflict_resolution/hybrid_strategy.py, src/ai_company/communication/conflict_resolution/service.py, src/ai_company/communication/delegation/hierarchy.py, src/ai_company/communication/meeting/_parsing.py, src/ai_company/communication/meeting/_prompts.py, src/ai_company/communication/meeting/_token_tracker.py, src/ai_company/communication/meeting/config.py, src/ai_company/communication/meeting/models.py, src/ai_company/communication/meeting/orchestrator.py, src/ai_company/communication/meeting/position_papers.py, src/ai_company/communication/meeting/round_robin.py, src/ai_company/communication/meeting/structured_phases.py, src/ai_company/communication/messenger.py, src/ai_company/core/enums.py, src/ai_company/core/task.py, src/ai_company/engine/decomposition/models.py, src/ai_company/engine/decomposition/rollup.py, src/ai_company/engine/decomposition/service.py, src/ai_company/engine/parallel.py, src/ai_company/engine/routing/models.py, src/ai_company/engine/routing/scorer.py, src/ai_company/engine/routing/service.py, src/ai_company/observability/events/communication.py, src/ai_company/observability/events/task_routing.py, tests/integration/communication/test_meeting_integration.py, tests/unit/communication/conflict_resolution/test_authority_strategy.py, tests/unit/communication/conflict_resolution/test_config.py, tests/unit/communication/conflict_resolution/test_debate_strategy.py, tests/unit/communication/conflict_resolution/test_helpers.py, tests/unit/communication/conflict_resolution/test_hybrid_strategy.py, tests/unit/communication/delegation/test_hierarchy.py, tests/unit/communication/meeting/test_config.py, tests/unit/communication/meeting/test_enums.py, tests/unit/communication/meeting/test_errors.py, tests/unit/communication/meeting/test_models.py, tests/unit/communication/meeting/test_orchestrator.py, tests/unit/communication/meeting/test_parsing.py, tests/unit/communication/meeting/test_position_papers.py, tests/unit/communication/meeting/test_prompts.py, tests/unit/communication/meeting/test_protocol.py, tests/unit/communication/meeting/test_round_robin.py, tests/unit/communication/meeting/test_structured_phases.py, tests/unit/communication/meeting/test_token_tracker.py, tests/unit/communication/test_bus_memory.py, tests/unit/communication/test_enums.py, tests/unit/engine/test_decomposition_models.py, tests/unit/engine/test_decomposition_service.py, tests/unit/engine/test_routing_models.py, tests/unit/engine/test_routing_service.py
💤 Files with no reviewable changes (2)
- src/ai_company/communication/conflict_resolution/config.py
- src/ai_company/observability/events/communication.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not use 'from __future__ import annotations' in Python 3.14+ code
Use 'except A, B:' syntax without parentheses in exception handling for Python 3.14 (PEP 758)
Include type hints on all public functions; enforce with mypy strict mode
Use Google-style docstrings, required on all public classes and functions, enforced by ruff D rules
Create new objects rather than mutating existing ones; use copy.deepcopy() for non-Pydantic internal collections and MappingProxyType for read-only enforcement
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state
Use Pydantic v2 with BaseModel, model_validator, computed_field, and ConfigDict; use @computed_field for derived values instead of storing redundant fields
Use NotBlankStr from core.types for all identifier/name fields instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code
Enforce maximum line length of 88 characters
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly; never silently swallow exceptions
Validate at system boundaries: user input, external APIs, and config files
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark unit tests with @pytest.mark.unit
Mark integration tests with @pytest.mark.integration
Mark end-to-end tests with @pytest.mark.e2e
Mark slow tests with @pytest.mark.slow
Prefer @pytest.mark.parametrize for testing similar cases
Use 'test-provider', 'test-small-001', etc. in test code instead of real vendor names
{src,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names like 'example-provider', 'example-large-001', 'example-medium-001', 'example-small-001', 'large'/'medium'/'small'
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Every module with business logic must import logger via 'from ai_company.observability import get_logger' and instantiate as 'logger = get_logger(name)'
Never use 'import logging' or 'logging.getLogger()' or 'print()' in application code
Always use 'logger' as the variable name for logging, not '_logger' or 'log'
Use event name constants from domain-specific modules under ai_company.observability.events for logging events
Always use structured logging with kwargs (logger.info(EVENT, key=value)), never formatted strings (logger.info('msg %s', val))
Log at WARNING or ERROR with context for all error paths before raising exceptions
Log at INFO for all state transitions
Log at DEBUG for object creation, internal flow, and entry/exit of key functions
Pure data models, enums, and re-exports do not require logging
All provider calls go through BaseCompletionProvider which applies retry and rate limiting automatically
DESIGN_SPEC.md
📄 CodeRabbit inference engine (CLAUDE.md)
Update DESIGN_SPEC.md to reflect approved deviations from the specification
🧠 Learnings (6)
All learnt from CLAUDE.md in repo Aureliolo/ai-company (timestamp 2026-03-08T09:48:46.483Z):
- Fix all valid issues found by review agents, including pre-existing issues adjacent to PR changes; never defer or skip as out of scope.
- Enforce a 30-second timeout per test (applied to the meeting unit and integration test files).
- Applies to tests/**/*.py: Mark slow tests with pytest.mark.slow (applied to several meeting test files).
- Applies to src/**/*.py: Use event name constants from domain-specific modules under ai_company.observability.events for logging events (applied to src/ai_company/engine/routing/scorer.py).
- Applies to tests/**/*.py: Mark integration tests with pytest.mark.integration (applied to tests/integration/communication/test_meeting_integration.py).
- Applies to DESIGN_SPEC.md: Update DESIGN_SPEC.md to reflect approved deviations from the specification (applied to DESIGN_SPEC.md).
🧬 Code graph analysis (22)
tests/unit/communication/meeting/test_models.py (1)
- src/ai_company/communication/meeting/models.py (4): MeetingRecord (265-324), MeetingContribution (99-128), MeetingMinutes (153-262), MeetingAgenda (77-96)
src/ai_company/communication/conflict_resolution/hybrid_strategy.py (2)
- src/ai_company/communication/delegation/hierarchy.py (1): HierarchyResolver (16-270)
- src/ai_company/communication/conflict_resolution/_helpers.py (1): pick_highest_seniority (136-163)
tests/unit/communication/delegation/test_hierarchy.py (1)
- src/ai_company/communication/delegation/hierarchy.py (1): get_lowest_common_manager (212-246)
src/ai_company/communication/meeting/orchestrator.py (3)
- src/ai_company/communication/meeting/enums.py (1): MeetingProtocolType (6-17)
- src/ai_company/communication/meeting/protocol.py (1): MeetingProtocol (61-100)
- src/ai_company/communication/meeting/errors.py (1): MeetingParticipantError (22-23)
tests/unit/communication/meeting/test_parsing.py (1)
- src/ai_company/communication/meeting/_parsing.py (2): parse_action_items (121-157), parse_decisions (70-93)
tests/unit/engine/test_decomposition_models.py (1)
- src/ai_company/engine/decomposition/models.py (3): DecompositionPlan (66-122), DecompositionResult (125-176), derived_parent_status (228-255)
src/ai_company/engine/routing/models.py (3)
- src/ai_company/core/agent.py (1): AgentIdentity (246-304)
- src/ai_company/core/enums.py (1): CoordinationTopology (325-336)
- src/ai_company/observability/_logger.py (1): get_logger (8-28)
src/ai_company/communication/messenger.py (3)
- src/ai_company/communication/bus_memory.py (2): subscribe (326-374), unsubscribe (376-417)
- src/ai_company/communication/bus_protocol.py (2): subscribe (83-104), unsubscribe (106-121)
- src/ai_company/communication/subscription.py (1): Subscription (9-22)
tests/unit/communication/meeting/test_orchestrator.py (3)
- tests/unit/communication/meeting/conftest.py (3): simple_agenda (83-98), leader_id (102-104), participant_ids (108-110)
- src/ai_company/communication/meeting/models.py (1): MeetingAgenda (77-96)
- src/ai_company/communication/meeting/errors.py (1): MeetingParticipantError (22-23)
src/ai_company/engine/decomposition/models.py (1)
- src/ai_company/core/enums.py (1): TaskStatus (165-191)
src/ai_company/communication/meeting/_parsing.py (2)
- src/ai_company/communication/meeting/models.py (1): ActionItem (131-150)
- src/ai_company/observability/_logger.py (1): get_logger (8-28)
src/ai_company/communication/conflict_resolution/_helpers.py (2)
- src/ai_company/communication/delegation/hierarchy.py (1): get_ancestors (195-210)
- src/ai_company/communication/errors.py (1): ConflictStrategyError (107-108)
tests/unit/communication/test_bus_memory.py (1)
- src/ai_company/communication/bus_memory.py (3): subscribe (326-374), receive (419-475), unsubscribe (376-417)
src/ai_company/engine/parallel.py (1)
- src/ai_company/engine/errors.py (1): ParallelExecutionError (42-43)
tests/unit/communication/test_enums.py (1)
- src/ai_company/communication/enums.py (1): MessageType (6-20)
src/ai_company/communication/conflict_resolution/debate_strategy.py (4)
- src/ai_company/communication/conflict_resolution/models.py (1): Conflict (67-111)
- src/ai_company/communication/conflict_resolution/protocol.py (1): JudgeDecision (18-27)
- src/ai_company/communication/conflict_resolution/_helpers.py (1): pick_highest_seniority (136-163)
- tests/unit/communication/conflict_resolution/conftest.py (1): hierarchy (143-145)
src/ai_company/engine/decomposition/rollup.py (1)
- src/ai_company/engine/decomposition/models.py (1): SubtaskStatusRollup (179-255)
tests/unit/communication/conflict_resolution/test_config.py (1)
- src/ai_company/communication/conflict_resolution/config.py (1): DebateConfig (9-22)
src/ai_company/communication/meeting/position_papers.py (1)
- src/ai_company/communication/meeting/_parsing.py (2): parse_action_items (121-157), parse_decisions (70-93)
src/ai_company/communication/meeting/_token_tracker.py (1)
- src/ai_company/observability/_logger.py (1): get_logger (8-28)
tests/unit/engine/test_routing_service.py (1)
- src/ai_company/engine/routing/service.py (2): TaskRoutingService (33-192), route (51-114)
src/ai_company/communication/meeting/round_robin.py (2)
- src/ai_company/communication/meeting/_parsing.py (2): parse_action_items (121-157), parse_decisions (70-93)
- src/ai_company/communication/meeting/models.py (1): ActionItem (131-150)
🔇 Additional comments (44)
src/ai_company/engine/parallel.py (1)
181-183: LGTM: explicit exception chaining is preserved. This is a formatting-only change; ParallelExecutionError still keeps release_error as its cause and the control flow remains unchanged.
tests/integration/communication/test_meeting_integration.py (1)
37-38: LGTM! The module-level timeout marker correctly enforces a 30-second limit per test, aligning with the repository's test timeout policy. Based on learnings: "Enforce 30-second timeout per test".
src/ai_company/observability/events/task_routing.py (1)
14-14: LGTM! The new event constant follows the established naming convention and typing pattern used by other constants in this module. It properly centralizes the event key for invalid scorer configuration logging.
src/ai_company/engine/routing/models.py (4)
7-20: LGTM! Logger setup and imports follow the coding guidelines: using get_logger(__name__), importing event constants from observability.events, and naming the logger variable logger.
72-88: LGTM! The addition of structured warning logging before raising the exception follows the coding guidelines for error-path logging. The log includes relevant context (subtask_id, error) using kwargs.
112-157: LGTM! The enhanced validation is well-structured:
- Uses Counter for efficient O(n) duplicate detection
- Provides deterministic error messages via sorted()
- Logs warnings with relevant context before raising exceptions
- Covers all three validation cases: duplicates in decisions, duplicates in unroutable, and overlap between them
191-206: LGTM! The warning logging before raising follows the coding guidelines. The error message includes the field name, providing sufficient context.
src/ai_company/engine/routing/service.py (1)
75-87: LGTM! This early consistency check addresses a critical fix from the PR objectives. The validation:
- Detects mismatches between parent_task.id and plan.parent_task_id before routing proceeds
- Logs a structured warning with all relevant context
- Raises a clear ValueError with both IDs for debugging
src/ai_company/engine/routing/scorer.py (1)
12-14: LGTM! Replacing the literal string with TASK_ROUTING_SCORER_INVALID_CONFIG improves consistency by using centralized event constants from the observability module. Based on learnings: "Use event name constants from domain-specific modules under ai_company.observability.events for logging events." Also applies to: 61-66
tests/unit/engine/test_routing_models.py (2)
138-163: LGTM! The test properly validates duplicate detection within decisions. It uses the correct marker, creates a realistic scenario with two decisions having the same subtask_id, and asserts the expected error message.
165-173: LGTM! The test properly validates duplicate detection within the unroutable list. It's appropriately minimal since unroutable only contains string IDs and doesn't require the agent fixture.
tests/unit/engine/test_routing_service.py (1)
296-307: LGTM! The test properly validates the parent task ID mismatch check. It creates a mismatched scenario using different IDs for the task and decomposition plan, then asserts the expected ValueError.
tests/unit/communication/test_bus_memory.py (1)
291-315: LGTM! The test correctly validates that multiple concurrent receive() calls are all woken when unsubscribe() is invoked. The structure mirrors the existing single-receiver test, and the assertions properly verify that all three receivers returned None. This provides good coverage for the new per-(channel, subscriber) waiter tracking introduced in bus_memory.py.
src/ai_company/communication/messenger.py (2)
17-17: LGTM! The NotBlankStr import and its usage in the public API methods (subscribe, unsubscribe, receive) aligns with the coding guidelines for identifier/name fields and matches the bus_protocol.py interface. This provides validation at the API boundary.
289-294: LGTM! The updated docstring accurately documents all three conditions under which receive() returns None, matching the implementation in bus_memory.py. This improves API clarity for consumers.
src/ai_company/communication/bus_memory.py (4)
102-102: LGTM!The
_waitersdictionary is correctly typed and initialized to track the count of concurrentreceive()calls per (channel, subscriber) pair.
402-412: LGTM!The sentinel wake-up logic correctly:
- Pops the waiter count and uses
max(1, pending)to ensure at least one sentinel is sent (handles edge cases)- Relies on unbounded queues (
maxsize=0) soput_nowaitcannot raiseQueueFull- Cleans up the
_waitersentry along with the queueThis properly implements the "wake all concurrent receivers" requirement.
**477-509: LGTM!** The async `_log_receive_null` method correctly:

- Acquires the lock to get a consistent snapshot of bus state
- Checks conditions in the right order (shutdown → unsubscribed → timeout)
- Uses the `timeout_seconds` parameter name consistently

This eliminates the race condition where the logged reason could be incorrect.
**438-443: LGTM!** The updated docstring accurately documents all three conditions for returning `None` and is consistent with the corresponding documentation in `messenger.py`.

src/ai_company/core/task.py (1)
**73-76: Schema addition is consistent.** These fields are documented, typed, and defaulted in a way that keeps existing `Task` construction paths backward compatible.

Also applies to: 147-154
src/ai_company/engine/decomposition/service.py (1)
**121-133: Dependency propagation now preserves the subtask DAG on created tasks.** This keeps `created_tasks` consistent with both the decomposition plan and the emitted `dependency_edges`.

tests/unit/engine/test_decomposition_service.py (1)
**273-288: Good regression coverage for dependency propagation.** This would catch the exact gap the service fix is addressing by asserting the created `Task.dependencies` match the plan.

tests/unit/engine/test_decomposition_models.py (1)
**387-401: Good semantic lock-in for mixed completed/cancelled rollups.** This makes the new terminal-state behavior explicit and helps prevent the aggregate status from drifting back to `COMPLETED`.

src/ai_company/engine/decomposition/models.py (1)
**157-165: Add validator to ensure `dependency_edges` exactly match plan-declared dependencies.** Currently, the `DecompositionResult` validator checks that edge endpoint IDs are known, but `dependency_edges` can still diverge from the dependencies the plan declares, since the edge set is derived directly from `plan.subtasks[*].dependencies` in the service (see `service.py` lines 144-153). Add a validator to enforce that the edge set exactly mirrors what the plan declares, and add a regression test to prevent future inconsistencies.

tests/unit/communication/meeting/test_prompts.py (1)
**8-9: LGTM!** Module-level timeout marker correctly enforces the 30-second timeout policy across all tests in this file. Based on learnings: "Enforce 30-second timeout per test".
src/ai_company/communication/meeting/config.py (1)
**65-66: LGTM!** Docstring line wrap maintains readability within the 88-character line limit.
tests/unit/communication/test_enums.py (1)
**31-31: LGTM!** The assertion correctly validates the new `MEETING_CONTRIBUTION` enum member. The member count assertion on line 20 (`len(MessageType) == 9`) is consistent with the enum definition showing 9 members in the relevant code snippet.

tests/unit/communication/meeting/test_config.py (1)
**14-15: LGTM!** Module-level timeout marker correctly enforces the 30-second timeout policy. Based on learnings: "Enforce 30-second timeout per test".
tests/unit/communication/meeting/test_position_papers.py (1)
**22-23: LGTM!** Module-level timeout marker correctly enforces the 30-second timeout policy for all async protocol tests. Based on learnings: "Enforce 30-second timeout per test".
tests/unit/communication/meeting/test_protocol.py (1)
**17-18: LGTM!** Module-level timeout marker correctly enforces the 30-second timeout policy. Based on learnings: "Enforce 30-second timeout per test".
tests/unit/communication/meeting/test_token_tracker.py (1)
**7-8: LGTM!** Module-level timeout marker correctly enforces the 30-second timeout policy. Based on learnings: "Enforce 30-second timeout per test".
tests/unit/communication/meeting/test_errors.py (1)
**14-15: LGTM!** Module-level timeout marker correctly enforces the 30-second timeout policy. Based on learnings: "Enforce 30-second timeout per test".
src/ai_company/communication/meeting/_parsing.py (3)
**70-93: LGTM!** The `parse_decisions` function correctly extracts decisions from LLM summary text. The logic properly handles missing sections, filters empty items, and returns an immutable tuple.
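The section-based extraction described here can be sketched with a simple line walk. This is illustrative only, not the shipped `parse_decisions`:

```python
import re


def parse_decisions_sketch(summary: str) -> tuple[str, ...]:
    # Collect "N. item" entries that appear after a "Decisions:" header,
    # stopping at the next header-like (colon-terminated) line.
    items: list[str] = []
    in_section = False
    for line in summary.splitlines():
        stripped = line.strip()
        if stripped.lower() == "decisions:":
            in_section = True
            continue
        if in_section and stripped.endswith(":"):
            break  # next section header ends the Decisions block
        if in_section:
            m = re.match(r"\d+\.\s*(.+)", stripped)
            if m and m.group(1).strip():
                items.append(m.group(1).strip())  # filter empty items
    return tuple(items)  # immutable, empty when the section is missing


summary = "Decisions:\n1. Ship the fix\n2. Defer cleanup\n\nAction Items:\n- Fix tests"
print(parse_decisions_sketch(summary))  # ('Ship the fix', 'Defer cleanup')
```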
**121-157: LGTM!** The `parse_action_items` function properly handles edge cases:

- Empty sections return empty tuples
- Empty raw text is skipped
- Empty descriptions after assignee extraction are handled by falling back to the original text
- Returns immutable tuples with `ActionItem` instances

The defensive checks at lines 143-144 and 147-148 ensure `NotBlankStr` validation won't fail on empty descriptions.
**11-13:** 🧹 Nitpick | 🔵 Trivial

**Unused logger import.** The logger is imported and instantiated but never used in this module. The parsing functions are pure transformations that return empty tuples on missing sections rather than logging warnings. Consider either:

- Removing the unused logger if no logging is intended, or
- Adding DEBUG-level logging for when sections are not found (to aid debugging LLM response parsing issues).

⛔ Skipped due to learnings

- Learnt from: CR Repo: Aureliolo/ai-company PR: 0 File: CLAUDE.md:0-0 Timestamp: 2026-03-08T09:48:46.483Z Learning: Applies to src/**/*.py : Every module with business logic must import logger via 'from ai_company.observability import get_logger' and instantiate as 'logger = get_logger(__name__)'
- Learnt from: CR Repo: Aureliolo/ai-company PR: 0 File: CLAUDE.md:0-0 Timestamp: 2026-03-08T09:48:46.483Z Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events for logging events

src/ai_company/communication/meeting/structured_phases.py (5)
**12-15: LGTM!** The import of parsing utilities from the internal `_parsing` module is correct and follows the established naming convention for internal modules.
**279-281: LGTM!** The integration of `parse_decisions` and `parse_action_items` correctly extracts structured data from the synthesized summary. Empty tuples are gracefully returned if sections are not found, which is appropriate behavior for `MeetingMinutes` construction.
**401-408: LGTM!** Excellent improvement: replacing assertions with explicit `RuntimeError` exceptions, with ERROR-level logging before raising. This follows coding guidelines for error paths and provides better diagnostics than bare assertions.
**548-599: LGTM!** The budget handling improvements correctly implement the synthesis reserve pattern:

- Lines 548-551: Reserves 20% of remaining budget for synthesis before allocating to discussion
- Line 563: Introduces a `discussion_used` counter for precise tracking within the discussion phase
- Line 566: Adds a compound check to prevent exceeding the discussion budget
- Lines 589-593: Dynamically calculates remaining budget per agent, ensuring no agent gets zero tokens while respecting the overall cap
- Line 599: Correctly tracks both input and output tokens

This ensures synthesis always has budget available, addressing the critical fix mentioned in the PR objectives.
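The reserve arithmetic can be sketched as follows (helper names and the 20% figure are illustrative of the pattern, not the shipped code):

```python
def split_discussion_budget(
    remaining: int, reserve_fraction: float = 0.20
) -> tuple[int, int]:
    # Hold back a fraction of the remaining budget for synthesis up front,
    # then give the rest to the discussion phase.
    synthesis_reserve = int(remaining * reserve_fraction)
    discussion_budget = remaining - synthesis_reserve
    return discussion_budget, synthesis_reserve


def per_agent_cap(discussion_budget: int, discussion_used: int, agents_left: int) -> int:
    # Re-derive the per-agent cap as the phase progresses, guaranteeing
    # each remaining agent a non-zero allocation under the overall cap.
    remaining = discussion_budget - discussion_used
    return max(1, remaining // max(1, agents_left))


print(split_discussion_budget(1000))  # (800, 200)
print(per_agent_cap(800, 600, 2))     # 100
```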
**53-54: LGTM!** Clear documentation of the synthesis reserve fraction constant.
DESIGN_SPEC.md (4)
**653-653: LGTM!** Documentation correctly describes the new shared parsing utilities in `_parsing.py`, aligning with the implementation.
**2447-2447: LGTM!** The file structure documentation correctly reflects the addition of `_parsing.py` with an accurate description of its purpose.
**2649-2649: LGTM!** The conflict resolution convention documentation is updated to reflect the implementation details, including the `build_dissent_records()` method (plural), hierarchy proximity tiebreaker, and shared `find_losers` utility.
**605-606: LGTM!** Minor documentation wording improvements for the Structured Debate + Judge strategy description.
```python
_ANY_HEADER_RE = re.compile(
    r"^#+\s+\S|^\S.*:\s*$",
    re.MULTILINE,
)
```
🧹 Nitpick | 🔵 Trivial

**`_ANY_HEADER_RE` pattern may prematurely truncate sections.** The pattern `^\S.*:\s*$` will match any line starting with a non-whitespace character and ending with a colon. This could match lines like "Note:" or "Example:" within a section body, causing premature section truncation. Consider whether this is intentional or if you need a more restrictive pattern (e.g., requiring the colon-terminated line to be followed by content on subsequent lines).

Given LLM output is semi-structured, this behavior may be acceptable as a best-effort heuristic.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/meeting/_parsing.py` around lines 24 - 27,
_ANY_HEADER_RE currently treats any line that ends with a colon as a header,
which can prematurely split sections (matches things like "Note:" inside
bodies); update the _ANY_HEADER_RE usage so the colon-terminated alternative
only counts as a header when it is followed by a non-blank next line (use a
positive lookahead to assert the next line starts with a non-whitespace
character), keeping the existing markdown-hash header branch intact; modify the
regex assigned to _ANY_HEADER_RE accordingly and keep re.MULTILINE.
```python
if not all(r is not None for r in results):
    msg = f"Expected {n} position papers but some slots are None"
    logger.error(msg, meeting_id=meeting_id)
    raise RuntimeError(msg)
if not all(c is not None for c in contrib_results):
    msg = f"Expected {n} contributions but some slots are None"
    logger.error(msg, meeting_id=meeting_id)
    raise RuntimeError(msg)
```
**Use a stable event constant for these error logs.** `logger.error(msg, ...)` breaks the event-based logging contract in `src/` code and makes these invariant failures harder to query consistently. Emit a fixed event name and put the human-readable detail in structured fields instead.

As per coding guidelines, "Use event name constants from domain-specific modules under ai_company.observability.events for logging events" and "Always use structured logging with kwargs (`logger.info(EVENT, key=value)`), never formatted strings (`logger.info('msg %s', val)`)".
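A minimal sketch of the intended contract; the event constant value and the logger class here are stand-ins for the project's structlog setup, not its actual API:

```python
# Stand-in event constant (the real one lives under
# ai_company.observability.events).
MEETING_INTERNAL_ERROR = "meeting.internal_error"


class _Logger:
    """Toy structured logger: records (event, fields) pairs."""

    def __init__(self) -> None:
        self.records: list[tuple[str, dict[str, object]]] = []

    def error(self, event: str, **fields: object) -> None:
        # Structured logging: stable event name plus keyword fields,
        # never a pre-formatted message string as the event.
        self.records.append((event, fields))


logger = _Logger()
msg = "Expected 3 position papers but some slots are None"
logger.error(MEETING_INTERNAL_ERROR, detail=msg, meeting_id="mtg-1")
```

Queries can then filter on the fixed event name while the human-readable detail stays in a structured field.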
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/meeting/position_papers.py` around lines 282 -
289, Replace the free-form logger.error calls in the two invariant checks so
they use stable event constants from ai_company.observability.events (e.g.,
POSITION_PAPERS_MISSING and CONTRIBUTIONS_MISSING) and structured kwargs rather
than formatted strings: for the results check (variable names results, n,
meeting_id) call logger.error(POSITION_PAPERS_MISSING, detail=msg,
meeting_id=meeting_id) (after importing the constant) and similarly for
contrib_results use logger.error(CONTRIBUTIONS_MISSING, detail=msg,
meeting_id=meeting_id); keep raising RuntimeError(msg) but ensure logging uses
the event constant and structured fields instead of logger.error(msg, ...).
```python
decisions = parse_decisions(summary)
action_items = parse_action_items(summary)
```
**Require explicit section headers before parsing the summary.** `parse_decisions()` and `parse_action_items()` only extract from Decisions / Action Items sections, but `_build_summary_prompt()` still asks for plain numbered/bulleted lists. A leader response that follows the current prompt exactly can leave both parsed fields empty.

Suggested prompt contract update:

```diff
 parts.append(
-    "Please summarize this meeting. List the key decisions made "
-    "and any action items with assignees. Format decisions as a "
-    "numbered list and action items as a bulleted list."
+    "Please summarize this meeting using exactly these sections:\n"
+    "Decisions:\n"
+    "1. <decision>\n"
+    "2. <decision>\n\n"
+    "Action Items:\n"
+    "- <action item> (assigned to <agent_id>)\n"
+    "- <action item>"
 )
```

🤖 Prompt for AI Agents
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/meeting/round_robin.py` around lines 165 - 166,
The code calls parse_decisions(summary) and parse_action_items(summary) but the
prompt produced by _build_summary_prompt() does not require "Decisions:" or
"Action Items:" headers, so leaders can return lists that the parsers miss;
update _build_summary_prompt() to explicitly require distinct "Decisions:" and
"Action Items:" section headers (with examples) and then add a small guard where
decisions = parse_decisions(summary) / action_items =
parse_action_items(summary) are invoked to validate the summary contains those
headers (e.g., check for the literal "Decisions:" and "Action Items:") and if
missing, either request the model to reformat or log/raise a clear parsing error
so empty results are not silently accepted.
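A minimal guard along the lines the prompt suggests, checking for the literal headers the parsers key on (the case-insensitive handling here is an illustrative choice):

```python
def summary_has_required_sections(summary: str) -> bool:
    # Reject summaries that lack the section headers the parsers require,
    # so empty parse results are never silently accepted.
    lowered = summary.lower()
    return "decisions:" in lowered and "action items:" in lowered


good = "Decisions:\n1. Ship\n\nAction Items:\n- Fix (assigned to dev-1)"
bad = "1. Ship\n- Fix"
print(summary_has_required_sections(good), summary_has_required_sections(bad))
# True False
```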
**Greptile Summary**

This PR addresses 46 post-merge review findings from PRs #164–#167, delivering a broad set of correctness, safety, and observability improvements across the meeting protocol, conflict resolution, message bus, decomposition, and routing subsystems. The changes are generally well-structured and thoroughly tested (3,634 passing unit tests, 18 new parsing tests, updated integration tests). Key highlights:
Confidence Score: 4/5
Last reviewed commit: 1317bea
- Fix regex patterns in meeting _parsing.py to prevent false header matches on list items and capture multi-line list items
- Change MEETING_BUDGET_EXHAUSTED to MEETING_VALIDATION_FAILED for negative token validation in _token_tracker.py
- Add assignee validation against meeting participants in all 3 protocols (prompt injection defense)
- Fix _waiters decrement in bus_memory.py to avoid orphan entries after unsubscribe
- Add empty-contributions-must-have-zero-totals validation in MeetingMinutes
- Use MEETING_INTERNAL_ERROR event constant instead of dynamic f-string event names
- Upgrade except* log level from debug to warning in parallel.py
- Add _known_agents existence check in hierarchy get_lowest_common_manager
- Import _MIN_POSITIONS from models instead of redefining in conflict resolution service
- Update DESIGN_SPEC.md and README.md for accuracy
- Add/update tests for all changes
```python
# Pattern for "assignee: <name>" or "(assigned to <name>)" at end of line
_ASSIGNEE_RE = re.compile(
    r"(?:"
    r"\(?assigned?\s+to:?\s*(.+?)\)?"
```
**`assigned?` regex matches "assign to X" unintentionally**

The `d` in `assigned?` is optional, so the pattern also matches `assign to <name>` (without the past-tense `d`). This means any action item whose description contains the phrase "assign to" (e.g., `- We need to assign to the platform team`) will have its assignee silently extracted as `the platform team` and its description truncated, even though the item was not explicitly assigned.

The canonical form from the prompt template is `(assigned to <agent_id>)`, so making `d` optional is unnecessarily permissive. The fix is to use `assigned\s+to` instead:
```diff
-    r"\(?assigned?\s+to:?\s*(.+?)\)?"
+    r"\(?assigned\s+to:?\s*(.+?)\)?"
```
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/communication/meeting/_parsing.py
Line: 39
Comment:
**`assigned?` regex matches "assign to X" unintentionally**
The `d` in `assigned?` is optional, so the pattern also matches `assign to <name>` (without the past-tense `d`). This means any action item whose description contains the phrase "assign to" (e.g., `- We need to assign to the platform team`) will have its assignee silently extracted as `the platform team` and its description truncated — even though the item was not explicitly assigned.
The canonical form from the prompt template is `(assigned to <agent_id>)`, so making `d` optional is unnecessarily permissive. The fix is to use `assigned\s+to` instead:
```suggestion
r"\(?assigned\s+to:?\s*(.+?)\)?"
```
How can I resolve this? If you propose a fix, please make it concise.

PR #174 — git_worktree.py:
- _validate_git_ref now accepts error_cls/event params so merge context raises WorkspaceMergeError and teardown raises WorkspaceCleanupError
- _run_git catches asyncio.CancelledError to kill subprocess before re-raising, preventing orphaned git processes

PR #172 — task assignment:
- TaskAssignmentConfig.strategy validated against known strategy names
- max_concurrent_tasks_per_agent now enforced in _score_and_filter_candidates via new AssignmentRequest.max_concurrent_tasks field
- TaskAssignmentStrategy protocol docstring documents error signaling contract

PR #171 — worktree skill:
- rebase uses --left-right --count with triple-dot to detect behind-main
- setup reuse path uses correct git worktree add (without -b)
- setup handles dirty working tree with stash/abort prompt
- status table shows both ahead and behind counts
- tree command provides circular dependency recovery guidance

PR #170 — meeting parsing:
- Fix assigned? regex to assigned (prevents false-positive assignee extraction from "assign to X" in action item descriptions)
…176)

## Summary

- Fix CI failures on main: 2 test assertion mismatches in cost-optimized assignment tests + mypy `attr-defined` error in strategy registry test
- Address all Greptile post-merge review findings across PRs #170–#175 (14 fixes total)

### PR #175 — Test assertion fixes (CI blockers)
- `"no cost data"` → `"insufficient cost data"` to match implementation wording
- `unknown-dev` → `known-dev` winner assertion (all-or-nothing fallback, sort stability)
- `getattr()` for `_scorer` access on protocol type (Windows/Linux mypy difference)

### PR #174 — Workspace isolation
- `_validate_git_ref` raises context-appropriate exception types (`WorkspaceMergeError` in merge, `WorkspaceCleanupError` in teardown)
- `_run_git` catches `asyncio.CancelledError` to kill subprocess before re-raising (prevents orphaned git processes)

### PR #172 — Task assignment
- `TaskAssignmentConfig.strategy` validated against 6 known strategy names
- `max_concurrent_tasks_per_agent` enforced via new `AssignmentRequest.max_concurrent_tasks` field in `_score_and_filter_candidates`
- `TaskAssignmentStrategy` protocol docstring documents error signaling contract (raises vs `selected=None`)

### PR #171 — Worktree skill
- `rebase` uses `--left-right --count` with triple-dot to detect behind-main worktrees
- `setup` reuse path uses `git worktree add` without `-b` for existing branches
- `setup` handles dirty working tree with stash/abort prompt
- `status` table shows both ahead and behind counts
- `tree` provides circular dependency recovery guidance

### PR #170 — Meeting parsing
- `assigned?` → `assigned` regex fix (prevents false-positive assignee extraction from "assign to X")

## Test plan
- [x] All 3988 tests pass (10 new tests added)
- [x] mypy strict: 0 errors (463 source files)
- [x] ruff lint + format: all clean
- [x] Coverage: 96.53% (threshold: 80%)
- [x] Pre-commit hooks pass

## Review coverage
Quick mode — automated checks only (lint, type-check, tests, coverage).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
🤖 I have created a release *beep* *boop* --- ## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider 
resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * 
implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs 
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, 
contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, 
loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15)) * bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e)) * bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238)) * bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e)) * bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7)) * bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree 
management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Summary
* validate `winning_agent_id` in `find_losers()`
* `test_parsing.py` (18 tests), expanded tests across 7 modules, timeout markers, spec name corrections

Test plan
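The `winning_agent_id` validation in `find_losers()` can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the signature and the shape of `positions` (agent ID to stated position) are assumptions.

```python
# Hypothetical sketch of the find_losers() guard; names and data shapes
# are illustrative assumptions, not the repository's real API.
def find_losers(positions: dict[str, str], winning_agent_id: str) -> list[str]:
    """Return the non-winning agent IDs, validating the winner first."""
    if winning_agent_id not in positions:
        # Fail loudly instead of silently treating every agent as a loser.
        raise ValueError(f"unknown winning_agent_id: {winning_agent_id!r}")
    return [agent_id for agent_id in positions if agent_id != winning_agent_id]
```

The point of the guard is that an unrecognized winner ID now raises immediately rather than producing a losers list that quietly includes every participant.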
Closes #169