fix: incorporate post-merge feedback + pre-PR review fixes #164
Address 13 findings from post-merge bot reviews on PRs #157-#162:

Communication layer (PR #157):
- Wake blocked receive() callers on unsubscribe via None sentinel
- Distinguish shutdown vs timeout in receive() logging
- Add AgentMessenger.receive() facade method
- Validate MessageHandler.handle() is async at registration
- Await cancelled tasks in _await_with_shutdown to prevent warnings

Observability (PR #158):
- Add log-before-raise to all validators missing it (company.py, schema.py) — 14 raise sites across 11 validators

Parallel execution (PR #161):
- Log suppressed ExceptionGroup instead of silent pass
- Add PARALLEL_AGENT_CANCELLED structured event for cancellations
- Fix progress.in_progress semantics (increment after semaphore)
- Use PARALLEL_LOCK_RELEASE_ERROR for lock release failures
- Remove duplicate plan_parsing.py from DESIGN_SPEC file tree

Template inheritance (PR #162):
- Update DESIGN_SPEC merge key docs to include merge_id
- Preserve merge_id in _expand_single_agent (confirmed bug fix)
- Defer deepcopy in _apply_child_agent past _remove early-return

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pre-reviewed by 9 agents, 21 findings addressed:

Communication layer:
- Extract _log_receive_null helper (receive() was 54 lines, limit 50)
- Distinguish unsubscribe wakeup from timeout in receive() logging
- Log QueueFull instead of silently suppressing in unsubscribe()
- Modernize ensure_future → create_task in _await_with_shutdown
- Widen queue type to DeliveryEnvelope | None (remove type: ignore)
- Add COMM_RECEIVE_UNSUBSCRIBED event constant

Parallel execution:
- Move in_progress decrement into _execute_assignment (pair with increment)
- Use dedicated PARALLEL_GROUP_SUPPRESSED event at DEBUG for ExceptionGroup
- Add group_id context to progress callback error log

Template merge:
- Fix stale comment: merge key is (role, department, merge_id)
- Add module-level _STRIP_KEYS frozenset constant
- Warn on missing role in _agent_key (silent degradation)

Company validators:
- Normalize Team, Department team names, HRRegistry to casefold()

Tests:
- Add @pytest.mark.unit to test_register_sync_handler_raises
- Import PARALLEL_AGENT_CANCELLED constant in test assertion
- Improve test class docstrings for clarity
- Add test for merge_id + _remove directive combination

Docs:
- Update DESIGN_SPEC milestone tags (M3→M4/M7 for unimplemented files)
- Add missing __init__.py for tools/examples/ in project structure
- Fix section 3.1 YAML comment: (M2) → (M3)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
⚠️ Review failed: the pull request is closed.

ℹ️ Recent review info: Configuration used: Organization UI · Review profile: ASSERTIVE · Plan: Pro · Files selected for processing: 3
📝 Walkthrough (Summary by CodeRabbit)

This PR adds unsubscribe/shutdown sentinel semantics to the in-memory message bus, adds AgentMessenger.receive, enforces async MessageHandler.handle at registration, extends template inheritance to use (role, department, merge_id) with _remove support and preserves merge_id during rendering, adds case-insensitive duplicate detection/log warnings in core validators, augments parallel executor error/progress handling, and updates design docs and tooling exports.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Bus as InMemoryMessageBus
    participant Queue as SubscriberQueue
    participant Receiver as BlockedReceiver
    Client->>Bus: subscribe(channel, subscriber_id)
    Bus->>Queue: _ensure_queue (create/get queue)
    Client->>Receiver: call receive(channel, timeout)
    Receiver->>Bus: bus.receive(channel, agent_id, timeout)
    Bus->>Queue: await queue.get() (blocks)
    Client->>Bus: unsubscribe(channel, subscriber_id)
    Bus->>Queue: queue.put_nowait(None) // unsubscribe sentinel
    Queue-->>Receiver: queue.get() returns None
    Receiver->>Bus: _log_receive_null(reason="UNSUBSCRIBED")
    Receiver-->>Client: return None
```
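The unsubscribe wake-up flow in the diagram above can be sketched as a minimal, self-contained example. `MiniBus` and its methods are hypothetical stand-ins for the project's bus, not its actual API:

```python
import asyncio


class MiniBus:
    """Toy message bus illustrating the None-sentinel unsubscribe wakeup."""

    def __init__(self) -> None:
        self._queues: dict[str, asyncio.Queue] = {}

    def subscribe(self, key: str) -> None:
        self._queues[key] = asyncio.Queue()

    async def receive(self, key: str):
        queue = self._queues.get(key)
        if queue is None:
            return None
        item = await queue.get()  # blocks until a message or the sentinel
        if item is None:
            # Woken by unsubscribe(): the key is already gone from _queues.
            return None
        return item

    def unsubscribe(self, key: str) -> None:
        queue = self._queues.pop(key, None)
        if queue is not None:
            # Wake any blocked receive() caller with a sentinel.
            queue.put_nowait(None)


async def demo():
    bus = MiniBus()
    bus.subscribe("ch:a")
    task = asyncio.create_task(bus.receive("ch:a"))
    await asyncio.sleep(0)   # let receive() start blocking on queue.get()
    bus.unsubscribe("ch:a")  # wakes the blocked receiver
    return await task


result = asyncio.run(demo())
print(result)  # None
```

Without the sentinel, the blocked `receive()` would wait forever after its queue was removed; the single `put_nowait(None)` is also why one review comment below notes that only one pending receiver gets woken.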
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the stability, observability, and configurability of the system by addressing a range of post-merge feedback and pre-PR review findings. The changes focus on improving the communication layer's resilience to shutdowns and cancellations, refining the parallel execution engine's progress tracking and error handling, and making the template inheritance system more robust and flexible. Additionally, Pydantic validators now provide clearer logging, and documentation has been updated for accuracy, collectively leading to a more reliable and maintainable codebase.
Code Review
This pull request introduces a substantial set of fixes and improvements across the codebase, addressing feedback from a previous merge and a pre-PR review. The changes enhance robustness, observability, and correctness in several key areas.
Key improvements include:

- Communication Layer: The in-memory message bus is now more robust, using sentinels to unblock waiting receivers on unsubscribe and providing more detailed logging for receive timeouts. The use of `asyncio.create_task` and proper cancellation handling modernizes the asyncio patterns.
- Parallel Execution: A critical bug causing a double-decrement of the `in_progress` counter has been fixed, ensuring accurate progress tracking for parallel tasks. Exception handling for task groups has also been improved.
- Validation: Pydantic validators throughout the codebase now perform case-insensitive checks for uniqueness and consistently log warnings before raising errors, improving both robustness and debuggability.
- Template Merging: The template inheritance logic has been updated to support a `merge_id` for disambiguating agents, and the implementation has been cleaned up.
- API and Tests: A new convenience `receive` method has been added to `AgentMessenger`. The pull request is well-supported by new unit tests that cover the new functionality and bug fixes.
Overall, these changes significantly improve the quality and reliability of the framework. The fixes are well-implemented and the new tests provide good coverage.
Note: Security Review did not run due to the size of the PR.
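The case-insensitive uniqueness checks mentioned in the review above typically reduce to comparing `str.casefold()`-normalized names before raising. A minimal sketch (the function name and inputs are illustrative, not the project's actual validators):

```python
from collections import Counter


def find_duplicate_names(names: list[str]) -> list[str]:
    """Return names that collide case-insensitively, via str.casefold()."""
    counts = Counter(name.casefold() for name in names)
    # Report the folded form of every name that appears more than once.
    return sorted(folded for folded, n in counts.items() if n > 1)


dupes = find_duplicate_names(["Engineering", "engineering", "Sales"])
print(dupes)  # ['engineering']
```

`casefold()` is preferred over `lower()` for this because it handles aggressive Unicode case mappings (e.g. German ß), which matters when names come from user-edited config files.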
Pull request overview
This PR consolidates a broad set of post-merge and pre-review fixes across templates, communication, and parallel execution, with a focus on correctness (inheritance targeting, unsubscribe wakeups), observability (new structured events, log-before-raise), and small API ergonomics (AgentMessenger.receive).
Changes:
- Template inheritance hardening: include `merge_id` in merge keys, preserve it during render expansion, and centralize strip-keys handling.
- Communication layer fixes: unblock `receive()` on `unsubscribe()`, add clearer receive-cause logging/events, add `AgentMessenger.receive()`, and validate async handler registration.
- Parallel execution fixes: correct progress semantics under concurrency limits, improve cancellation logging, and add structured events for suppressed ExceptionGroups / lock-release failures.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.
Summary per file:
| File | Description |
|---|---|
| src/ai_company/templates/schema.py | Adds structured warning logs for schema validation failures (log-before-raise). |
| src/ai_company/templates/renderer.py | Preserves merge_id in expanded agent dicts to support inheritance disambiguation. |
| src/ai_company/templates/merge.py | Merges agents by (role, department, merge_id), centralizes strip keys, and warns on missing role to avoid silent collisions. |
| src/ai_company/observability/events/template.py | Adds TEMPLATE_SCHEMA_VALIDATION_ERROR event constant for schema validator logs. |
| src/ai_company/observability/events/parallel.py | Adds new parallel execution event constants (cancelled, lock release error, group suppressed). |
| src/ai_company/observability/events/communication.py | Adds COMM_RECEIVE_SHUTDOWN and COMM_RECEIVE_UNSUBSCRIBED event constants. |
| src/ai_company/engine/parallel.py | Fixes in-progress tracking placement, adds cancellation logging, and improves observability around TaskGroup exception suppression and lock release errors. |
| src/ai_company/core/company.py | Normalizes duplicate detection (case-insensitive) and adds validation warning logs before raising. |
| src/ai_company/communication/messenger.py | Adds AgentMessenger.receive() convenience wrapper returning `DeliveryEnvelope \| None`. |
| src/ai_company/communication/dispatcher.py | Rejects sync MessageHandler.handle() implementations at registration time. |
| src/ai_company/communication/bus_memory.py | Wakes blocked receive() calls on unsubscribe() via sentinel and improves null-receive cause logging; modernizes task creation and cancellation cleanup. |
| tests/unit/templates/test_renderer.py | Adds coverage ensuring _expand_single_agent() preserves/omits merge_id as intended. |
| tests/unit/templates/test_inheritance.py | Adds coverage for targeting/removing specific agents by merge_id in inheritance merges. |
| tests/unit/engine/test_parallel.py | Adds coverage for cancellation logging and in_progress semantics under concurrency limits. |
| tests/unit/communication/test_messenger.py | Adds coverage for AgentMessenger.receive() delegation. |
| tests/unit/communication/test_dispatcher.py | Adds coverage ensuring sync handlers are rejected at registration. |
| tests/unit/communication/test_bus_memory.py | Adds coverage ensuring unsubscribe() wakes a blocked receive(). |
| DESIGN_SPEC.md | Aligns design documentation with updated implementation details (milestones, file tree, template inheritance key). |
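The dispatcher change above (rejecting synchronous `MessageHandler.handle()` implementations at registration time) can be sketched with `inspect.iscoroutinefunction`; the `Dispatcher` and handler classes below are hypothetical stand-ins for the project's actual types:

```python
import inspect


class Dispatcher:
    """Toy dispatcher that only accepts handlers with an async handle()."""

    def __init__(self) -> None:
        self._handlers: dict[str, object] = {}

    def register(self, agent_id: str, handler: object) -> None:
        handle = getattr(handler, "handle", None)
        if handle is None or not inspect.iscoroutinefunction(handle):
            # Fail fast at registration instead of at first dispatch.
            msg = f"handler for {agent_id!r} must define an async handle()"
            raise TypeError(msg)
        self._handlers[agent_id] = handler


class AsyncHandler:
    async def handle(self, envelope: object) -> None: ...


class SyncHandler:
    def handle(self, envelope: object) -> None: ...


dispatcher = Dispatcher()
dispatcher.register("agent-1", AsyncHandler())  # accepted

rejected = ""
try:
    dispatcher.register("agent-2", SyncHandler())
except TypeError as exc:
    rejected = str(exc)
print(rejected)
```

Validating at registration moves the failure to a clear system boundary, so a sync handler surfaces as an immediate `TypeError` rather than a never-awaited coroutine bug at dispatch time.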
```python
logger.warning(
    PARALLEL_AGENT_CANCELLED,
    agent_id=agent_id,
    task_id=task_id,
```
PARALLEL_AGENT_CANCELLED log is missing group_id, which makes it harder to correlate with the rest of the parallel execution logs (start/complete/error all include group_id). Consider including group_id=group.group_id in this event as well.
Suggested change:

```suggestion
    task_id=task_id,
    group_id=group.group_id,
```
```python
queue = self._ensure_queue(channel_name, subscriber_id)
result = await self._await_with_shutdown(queue, timeout)
if result is None:
    self._log_receive_null(channel_name, subscriber_id, timeout)
return result
```
receive() can return None due to the unsubscribe wake-up sentinel (see COMM_RECEIVE_UNSUBSCRIBED path), not just timeout/shutdown. The function docstring still only mentions timeout/shutdown for None; please update it to include the unsubscribe case so callers don’t misinterpret the cause.
```
timeout: Max seconds to wait, or ``None`` for indefinite.

Returns:
    The next delivery envelope, or ``None`` on timeout/shutdown.
```
The docstring says None is returned only on timeout/shutdown, but the underlying bus may also return None when an in-flight receive() is woken by unsubscribe() (sentinel). Consider documenting the unsubscribe wake-up case here as well, since this method is the primary public-facing API.
Suggested change:

```suggestion
    The next delivery envelope, or ``None`` on timeout, shutdown,
    or when an in-flight receive is woken by :meth:`unsubscribe`.
```
Greptile Summary

This PR addresses post-merge feedback from #163 and 21 additional findings from a 9-agent pre-PR review pipeline. The changes span the communication layer, parallel execution engine, template merge system, company validators, and documentation. The overall quality is high — each fix is well-scoped, correctly implemented, and covered by targeted tests.

Key changes reviewed:
Confidence Score: 4/5
Sequence Diagram

```mermaid
sequenceDiagram
    participant Agent
    participant AgentMessenger
    participant InMemoryMessageBus
    participant Queue
    Note over Agent,Queue: Normal receive flow
    Agent->>AgentMessenger: receive(channel, timeout)
    AgentMessenger->>InMemoryMessageBus: receive(channel, agent_id, timeout)
    InMemoryMessageBus->>InMemoryMessageBus: _ensure_queue(channel, agent_id)
    InMemoryMessageBus->>InMemoryMessageBus: _await_with_shutdown(queue, timeout)
    InMemoryMessageBus->>Queue: create_task(queue.get())
    Queue-->>InMemoryMessageBus: DeliveryEnvelope
    InMemoryMessageBus-->>AgentMessenger: DeliveryEnvelope
    AgentMessenger-->>Agent: DeliveryEnvelope
    Note over Agent,Queue: Unsubscribe wakeup flow (new)
    Agent->>AgentMessenger: receive(channel, timeout=None)
    AgentMessenger->>InMemoryMessageBus: receive(channel, agent_id)
    InMemoryMessageBus->>Queue: create_task(queue.get()) [blocking]
    Note over InMemoryMessageBus: Concurrent unsubscribe call
    Agent->>InMemoryMessageBus: unsubscribe(channel, agent_id)
    InMemoryMessageBus->>InMemoryMessageBus: _queues.pop(key) → removes queue
    InMemoryMessageBus->>Queue: put_nowait(None) [sentinel]
    Queue-->>InMemoryMessageBus: get() returns None
    InMemoryMessageBus->>InMemoryMessageBus: _log_receive_null()
    Note over InMemoryMessageBus: key not in _queues → COMM_RECEIVE_UNSUBSCRIBED
    InMemoryMessageBus-->>AgentMessenger: None
    AgentMessenger-->>Agent: None
    Note over Agent,Queue: Shutdown wakeup flow
    InMemoryMessageBus->>InMemoryMessageBus: stop() → _shutdown_event.set()
    InMemoryMessageBus->>InMemoryMessageBus: shutdown_task completes
    InMemoryMessageBus->>InMemoryMessageBus: _log_receive_null()
    Note over InMemoryMessageBus: shutdown_event.is_set() → COMM_RECEIVE_SHUTDOWN
    InMemoryMessageBus-->>AgentMessenger: None
    AgentMessenger-->>Agent: None
```
Last reviewed commit: 23bb067
Actionable comments posted: 7
⚠️ Caution: Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/ai_company/templates/schema.py (1)
349-356: ⚠️ Potential issue | 🟡 Minor: Make department-name uniqueness case-insensitive at the schema boundary.
This check still compares raw names, so `Engineering` and `engineering` both pass here even though downstream department handling is case-insensitive. Fold names before the `set`/`Counter` check so invalid templates fail during schema validation instead of later.

As per coding guidelines, "Validate at system boundaries (user input, external APIs, config files)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/templates/schema.py` around lines 349 - 356, The _validate_unique_department_names method currently compares raw department.name values so different-cased names slip through; update it to perform a case-insensitive uniqueness check by folding names (use str.casefold()) before building the list/set/Counter (e.g., map department.name.casefold() into names) while keeping the same error flow that logs TEMPLATE_SCHEMA_VALIDATION_ERROR via logger and raises ValueError when duplicates are found; ensure you still compute dupes from the folded names so the schema validation fails for names like "Engineering" vs "engineering".

src/ai_company/engine/parallel.py (1)
157-165: ⚠️ Potential issue | 🟠 Major: Do not suppress resource-lock release failures.
If `release_all()` raises here, the caller still sees a normal group result while the shared `ResourceLock` may remain poisoned. That can show up later as false conflicts or stuck groups. Log it, but also surface it as a `ParallelExecutionError` (or aggregate it with other fatal errors) instead of continuing silently.

As per coding guidelines, "Handle errors explicitly; never silently swallow errors".

🛠️ Suggested direction
```diff
-        try:
+        release_error: Exception | None = None
+        try:
             if lock is not None:
                 await self._acquire_all_locks(group, lock)
             await self._run_task_group(
                 group,
                 outcomes,
                 fatal_errors,
                 progress,
             )
         finally:
             if lock is not None:
                 try:
                     await self._release_all_locks(group, lock)
-                except Exception:
+                except Exception as exc:
                     logger.exception(
                         PARALLEL_LOCK_RELEASE_ERROR,
                         error="Failed to release resource locks",
                         group_id=group.group_id,
                     )
+                    release_error = exc
+
+        if release_error is not None:
+            msg = (
+                f"Parallel group {group.group_id!r} completed but "
+                "resource locks could not be released"
+            )
+            raise ParallelExecutionError(msg) from release_error
```
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/parallel.py` around lines 157 - 165, The try/except around await self._release_all_locks(group, lock) currently swallows exceptions; change it to catch the exception into a variable (e), log it with logger.exception as now, and then surface it by raising a ParallelExecutionError (or appending it to whatever aggregate/fatal errors collection is used for the group) so the caller does not treat the group as successful; update the handling in the surrounding code to ensure the raised ParallelExecutionError (or aggregated error) is propagated instead of allowing normal group results to continue.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/communication/bus_memory.py`:
- Around line 401-405: The unsubscribe path currently pops the queue from
_queues and puts a single None sentinel (queue.put_nowait(None)), which only
wakes one pending receive(); modify the unsubscribe logic in
BusMemory.unsubscribe (or wherever the pop and put_nowait occur) to wake all
pending receivers: either (a) track the number of waiters on each (channel_name,
subscriber_id) and enqueue that many None sentinels before removing the queue,
or (b) add a per-subscription asyncio.Event/flag (e.g.,
subscription_closed_event stored alongside the queue) that unsubscribe sets and
each receive() checks and returns when the event is set; ensure receive() checks
the event or drains all sentinels so no receivers remain blocked after
unsubscribe.
In `@src/ai_company/communication/dispatcher.py`:
- Around line 102-107: The invalid-handler warning emits only a free-form error
message; update the logger.warning call where the async-check fires (in the
block that builds msg for MessageHandler and checks handler.handle) to include
structured fields agent_id, handler_name (use type(handler).__name__ or
handler.name if available) and handler_type (e.g., type(handler)), e.g.
logger.warning(COMM_HANDLER_INVALID, error=msg, agent_id=agent_id,
handler_name=type(handler).__name__, handler_type=type(handler).__name__), then
keep raising the TypeError(msg) as before so the event can be filtered without
parsing the message; reference the existing symbols logger,
COMM_HANDLER_INVALID, handler, and the raised TypeError.
In `@src/ai_company/communication/messenger.py`:
- Around line 281-288: Update the docstring for the Receive the next message
wrapper (the receive_next_message method in messenger.py) to document that the
underlying bus may also return None when an unsubscribe wakes a blocked receive;
explicitly list "unsubscribe" alongside "timeout/shutdown" as a None return path
so callers know to handle unsubscribe as possible. Mention the return value
remains "The next delivery envelope, or None on timeout/shutdown/unsubscribe"
and keep the rest of the description unchanged.
In `@src/ai_company/templates/renderer.py`:
- Around line 636-639: Only copy the internal merge_id into agent_dict when the
template actually uses inheritance: check the template's inheritance flag (e.g.
agent.get("has_extends") or the local has_extends variable) before setting
agent_dict["merge_id"] = merge_id; leave merge_id out for standalone templates
so merge_template_configs() isn't relied on to strip it later. Ensure you
reference the existing agent.get("merge_id") call and gate the assignment with
the has_extends check.
In `@tests/unit/engine/test_parallel.py`:
- Around line 659-678: The test currently only asserts max_in_progress <= 2
which would pass if no progress was ever reported; update the test to also
assert that progress was actually observed by checking that at least one
progress callback recorded in_progress > 0 (e.g., inspect progress_updates or
verify max_in_progress > 0) after calling ParallelExecutor.execute_group on the
group created by _make_group; reference the on_progress callback,
progress_updates list, max_in_progress variable, ParallelExecutor.execute_group,
and _make_group when adding this assertion.
- Around line 615-639: The test is flaky because the first side_effect raises
before any await so the sibling may be cancelled before it starts; modify the
test to synchronize starts with a barrier/event so both agent runs enter an
await point before the failure is raised. Concretely, add an asyncio.Event or
barrier (e.g., start_evt) captured by the test and have the side_effect (the
AsyncMock assigned to engine.run) await start_evt.wait() early in its body for
every call; then in the test set start_evt.set() only after both runs have begun
(or after a tiny sleep) so the failing branch raises after entering the await
and the sibling has started and can be cancelled, ensuring a deterministic
emission of PARALLEL_AGENT_CANCELLED when ParallelExecutor.execute_group is
called with fail_fast=True and group from _make_group.
In `@tests/unit/templates/test_inheritance.py`:
- Around line 501-530: The tests currently don't prove merge_id-based
disambiguation because both sibling parents are identical aside from merge_id
and the child overrides the first one; update the fixtures in
test_child_targets_specific_agent_by_merge_id and the analogous test at 544-570
so sibling parent agents have distinct distinguishing fields (e.g., different
"level" or another field) and then have the child target the non-first duplicate
by specifying its merge_id (e.g., "backend") so the assertion verifies the
correct sibling was overridden/removed; locate and change the parent list
entries and the child entry in those test functions and assert the targeted
sibling's field changed while the other remained unchanged, relying on
_merge_agents to perform merge_id-based matching.
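The (role, department, merge_id) merge-key behavior these inheritance tests exercise can be sketched in miniature; `_agent_key`, `merge_agents`, and `_STRIP_KEYS` below mimic but are not the project's actual implementation:

```python
# Directive/bookkeeping keys stripped from merged output (illustrative).
_STRIP_KEYS = frozenset({"extends", "_remove"})


def _agent_key(agent: dict) -> tuple:
    """Identity key used to match child agents against parent agents."""
    return (agent.get("role"), agent.get("department"), agent.get("merge_id"))


def merge_agents(parents: list[dict], children: list[dict]) -> list[dict]:
    """Merge child agent dicts onto parents by (role, department, merge_id)."""
    merged = {_agent_key(a): dict(a) for a in parents}
    for child in children:
        key = _agent_key(child)
        if child.get("_remove"):
            merged.pop(key, None)  # _remove deletes the targeted agent
            continue
        base = merged.get(key, {})
        base.update({k: v for k, v in child.items() if k not in _STRIP_KEYS})
        merged[key] = base
    return list(merged.values())


# Two same-role siblings disambiguated only by merge_id; the child targets
# the "backend" sibling and overrides its level.
parents = [
    {"role": "engineer", "department": "eng", "merge_id": "backend", "level": 1},
    {"role": "engineer", "department": "eng", "merge_id": "frontend", "level": 1},
]
children = [
    {"role": "engineer", "department": "eng", "merge_id": "backend", "level": 3},
]
result = merge_agents(parents, children)
print(result)
```

This is why the review asks the tests to give siblings distinct fields: without one, overriding the "backend" sibling is indistinguishable from accidentally overriding "frontend".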
---
Outside diff comments:
In `@src/ai_company/engine/parallel.py`:
- Around line 157-165: The try/except around await
self._release_all_locks(group, lock) currently swallows exceptions; change it to
catch the exception into a variable (e), log it with logger.exception as now,
and then surface it by raising a ParallelExecutionError (or appending it to
whatever aggregate/fatal errors collection is used for the group) so the caller
does not treat the group as successful; update the handling in the surrounding
code to ensure the raised ParallelExecutionError (or aggregated error) is
propagated instead of allowing normal group results to continue.
In `@src/ai_company/templates/schema.py`:
- Around line 349-356: The _validate_unique_department_names method currently
compares raw department.name values so different-cased names slip through;
update it to perform a case-insensitive uniqueness check by folding names (use
str.casefold()) before building the list/set/Counter (e.g., map
department.name.casefold() into names) while keeping the same error flow that
logs TEMPLATE_SCHEMA_VALIDATION_ERROR via logger and raises ValueError when
duplicates are found; ensure you still compute dupes from the folded names so
the schema validation fails for names like "Engineering" vs "engineering".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: e1826fcf-d595-4d30-ac53-4d1eeebef0e3
📒 Files selected for processing (18)
- DESIGN_SPEC.md
- src/ai_company/communication/bus_memory.py
- src/ai_company/communication/dispatcher.py
- src/ai_company/communication/messenger.py
- src/ai_company/core/company.py
- src/ai_company/engine/parallel.py
- src/ai_company/observability/events/communication.py
- src/ai_company/observability/events/parallel.py
- src/ai_company/observability/events/template.py
- src/ai_company/templates/merge.py
- src/ai_company/templates/renderer.py
- src/ai_company/templates/schema.py
- tests/unit/communication/test_bus_memory.py
- tests/unit/communication/test_dispatcher.py
- tests/unit/communication/test_messenger.py
- tests/unit/engine/test_parallel.py
- tests/unit/templates/test_inheritance.py
- tests/unit/templates/test_renderer.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py:
- Do not use `from __future__ import annotations` in Python 3.14+ code — Python 3.14 has native PEP 649 lazy annotations
- Use `except A, B:` (no parentheses) for exception syntax in Python 3.14 — PEP 758 except syntax enforced by ruff
- All public functions and classes must have type hints; mypy strict mode is enforced
- All public classes and functions must have Google-style docstrings; ruff D rules enforce this
- Create new objects instead of mutating existing ones; use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement of non-Pydantic internal collections (registries, BaseTool)
- For non-Pydantic internal collections and dict/list fields in frozen Pydantic models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
- Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields
- Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use `@computed_field` for derived values instead of storing redundant fields
- Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
- Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
- Keep functions under 50 lines and files under 800 lines
- Handle errors explicitly; never silently swallow errors
- Validate at system boundaries (user input, external APIs, config files)
- Line length must be 88 characters (enforced by ruff)

Files:
- tests/unit/communication/test_dispatcher.py
- src/ai_company/templates/renderer.py
- tests/unit/templates/test_inheritance.py
- src/ai_company/observability/events/template.py
- tests/unit/communication/test_bus_memory.py
- src/ai_company/templates/merge.py
- tests/unit/engine/test_parallel.py
- src/ai_company/observability/events/parallel.py
- src/ai_company/core/company.py
- src/ai_company/templates/schema.py
- src/ai_company/observability/events/communication.py
- tests/unit/templates/test_renderer.py
- src/ai_company/communication/dispatcher.py
- tests/unit/communication/test_messenger.py
- src/ai_company/communication/messenger.py
- src/ai_company/engine/parallel.py
- src/ai_company/communication/bus_memory.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py:
- Mark tests with `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, or `@pytest.mark.slow` as appropriate
- Prefer `@pytest.mark.parametrize` for testing similar cases
- Use vendor-agnostic test names: test-provider, test-small-001, etc. instead of real vendor names

Files:
- tests/unit/communication/test_dispatcher.py
- tests/unit/templates/test_inheritance.py
- tests/unit/communication/test_bus_memory.py
- tests/unit/engine/test_parallel.py
- tests/unit/templates/test_renderer.py
- tests/unit/communication/test_messenger.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py:
- Every module with business logic must import `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
- Never use `import logging`/`logging.getLogger()`/`print()` in application code
- Always use variable name `logger` (not `_logger`, not `log`) for the logger instance
- Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
- Always use structured logging with kwargs: `logger.info(EVENT, key=value)` — never use formatted strings like `logger.info("msg %s", val)`
- All error paths must log at WARNING or ERROR with context before raising
- All state transitions must log at INFO level
- DEBUG logging is for object creation, internal flow, and entry/exit of key functions

Files:
- src/ai_company/templates/renderer.py
- src/ai_company/observability/events/template.py
- src/ai_company/templates/merge.py
- src/ai_company/observability/events/parallel.py
- src/ai_company/core/company.py
- src/ai_company/templates/schema.py
- src/ai_company/observability/events/communication.py
- src/ai_company/communication/dispatcher.py
- src/ai_company/communication/messenger.py
- src/ai_company/engine/parallel.py
- src/ai_company/communication/bus_memory.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases
Files:
- src/ai_company/templates/renderer.py
- src/ai_company/observability/events/template.py
- src/ai_company/templates/merge.py
- src/ai_company/observability/events/parallel.py
- src/ai_company/core/company.py
- src/ai_company/templates/schema.py
- src/ai_company/observability/events/communication.py
- src/ai_company/communication/dispatcher.py
- src/ai_company/communication/messenger.py
- src/ai_company/engine/parallel.py
- src/ai_company/communication/bus_memory.py
src/ai_company/engine/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains
Files:
src/ai_company/engine/parallel.py
🧠 Learnings (8)
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
Applied to files:
src/ai_company/observability/events/parallel.py
src/ai_company/observability/events/communication.py
src/ai_company/engine/parallel.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
src/ai_company/core/company.py
src/ai_company/templates/schema.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : DEBUG logging is for object creation, internal flow, and entry/exit of key functions
Applied to files:
src/ai_company/core/company.py
src/ai_company/templates/schema.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : All state transitions must log at INFO level
Applied to files:
src/ai_company/core/company.py
src/ai_company/templates/schema.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use structured logging with kwargs: `logger.info(EVENT, key=value)` — never use formatted strings like `logger.info("msg %s", val)`
Applied to files:
src/ai_company/core/company.py
src/ai_company/templates/schema.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic must import `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
Applied to files:
src/ai_company/core/company.py
src/ai_company/templates/schema.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use variable name `logger` (not `_logger`, not `log`) for the logger instance
Applied to files:
src/ai_company/templates/schema.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to **/*.py : Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Applied to files:
DESIGN_SPEC.md
src/ai_company/engine/parallel.py
🧬 Code graph analysis (11)
tests/unit/communication/test_dispatcher.py (2)
src/ai_company/communication/dispatcher.py (1)
register(77-122)
tests/unit/communication/test_handler.py (1)
handle(44-45)
tests/unit/templates/test_inheritance.py (1)
src/ai_company/templates/merge.py (1)
_merge_agents(103-145)
tests/unit/communication/test_bus_memory.py (1)
src/ai_company/communication/bus_memory.py (3)
subscribe(325-373), receive(418-459), unsubscribe(375-416)
tests/unit/engine/test_parallel.py (4)
tests/unit/engine/test_parallel_models.py (2)
_make_assignment(73-83), _make_run_result(86-112)
src/ai_company/engine/run_result.py (1)
AgentRunResult(19-109)
src/ai_company/engine/parallel.py (1)
ParallelExecutor(80-569)
src/ai_company/engine/parallel_models.py (2)
ParallelProgress(251-289), all_succeeded(246-248)
src/ai_company/templates/schema.py (1)
src/ai_company/observability/_logger.py (1)
get_logger(8-28)
tests/unit/templates/test_renderer.py (1)
src/ai_company/templates/renderer.py (1)
_expand_single_agent(580-641)
src/ai_company/communication/dispatcher.py (3)
tests/unit/communication/test_messenger.py (4)
handler(359-360), handler(380-381), handler(416-417), handler(472-473)
tests/unit/communication/test_dispatcher.py (3)
handle(39-40), handle(49-50), handle(97-98)
src/ai_company/communication/handler.py (2)
handle(24-30), handle(63-69)
tests/unit/communication/test_messenger.py (3)
src/ai_company/communication/bus_memory.py (1)
receive(418-459)
src/ai_company/communication/messenger.py (1)
receive(275-294)
src/ai_company/communication/bus_protocol.py (1)
receive(123-150)
src/ai_company/communication/messenger.py (3)
src/ai_company/communication/subscription.py (2)
DeliveryEnvelope(25-41), Subscription(9-22)
src/ai_company/communication/bus_memory.py (1)
receive(418-459)
src/ai_company/communication/bus_protocol.py (1)
receive(123-150)
src/ai_company/engine/parallel.py (4)
src/ai_company/engine/parallel_models.py (4)
agent_id(79-81), task_id(87-89), AgentOutcome(143-193), is_success(191-193)
src/ai_company/engine/run_result.py (1)
AgentRunResult(19-109)
src/ai_company/engine/agent_engine.py (1)
run(120-203)
src/ai_company/tools/sandbox/result.py (1)
success(26-28)
src/ai_company/communication/bus_memory.py (1)
src/ai_company/communication/subscription.py (1)
DeliveryEnvelope(25-41)
🔇 Additional comments (4)
src/ai_company/core/company.py (1)
41-42: Validation failures are now much easier to diagnose. The case-insensitive duplicate checks plus the pre-raise `COMPANY_VALIDATION_ERROR` logs make these model failures far more observable without changing the frozen-model contract. Based on learnings: "Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR with context before raising"
Also applies to: 246-252, 297-317, 383-389
tests/unit/templates/test_renderer.py (1)
516-540: Good merge_id regression coverage. These cases pin the preserve-vs-omit behavior around `_expand_single_agent()`, which should help keep the inheritance merge-key contract from drifting.
tests/unit/communication/test_messenger.py (1)
329-342: Good delegation coverage. This pins the new facade to the expected bus call shape and verifies that the `None` result is passed through unchanged.
tests/unit/communication/test_bus_memory.py (1)
268-289: Nice regression test for the unsubscribe wake-up path. This covers the new sentinel behavior directly and uses `TaskGroup` so the blocked receiver and unsubscribe flow are joined cleanly.
```python
queue = self._queues.pop((channel_name, subscriber_id), None)
if queue is not None:
    try:
        queue.put_nowait(None)
    except asyncio.QueueFull:
```
A single unsubscribe sentinel only wakes one waiter.
If two receive() calls are pending on the same (channel, subscriber) queue, one consumes the None and the rest stay blocked forever because the queue has already been removed from _queues. Use a per-subscription event, or otherwise broadcast the unsubscribe signal, so every pending receiver wakes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/bus_memory.py` around lines 401 - 405, The
unsubscribe path currently pops the queue from _queues and puts a single None
sentinel (queue.put_nowait(None)), which only wakes one pending receive();
modify the unsubscribe logic in BusMemory.unsubscribe (or wherever the pop and
put_nowait occur) to wake all pending receivers: either (a) track the number of
waiters on each (channel_name, subscriber_id) and enqueue that many None
sentinels before removing the queue, or (b) add a per-subscription
asyncio.Event/flag (e.g., subscription_closed_event stored alongside the queue)
that unsubscribe sets and each receive() checks and returns when the event is
set; ensure receive() checks the event or drains all sentinels so no receivers
remain blocked after unsubscribe.
```python
call_count = 0

async def side_effect(**kwargs: object) -> AgentRunResult:
    nonlocal call_count
    call_count += 1
    if call_count == 1:
        msg = "fail fast"
        raise RuntimeError(msg)
    await asyncio.sleep(10)
    identity = kwargs.get("identity")
    task = kwargs.get("task")
    return _make_run_result(identity, task)  # type: ignore[arg-type]

engine = _mock_engine()
engine.run = AsyncMock(side_effect=side_effect)
executor = ParallelExecutor(engine=engine)
group = _make_group(a1, a2, fail_fast=True)

with structlog.testing.capture_logs() as cap:
    await executor.execute_group(group)

cancelled_events = [
    e for e in cap if e.get("event") == PARALLEL_AGENT_CANCELLED
]
assert len(cancelled_events) >= 1
```
Make this cancellation log test deterministic.
The failing branch raises before its first await, so fail-fast can cancel the sibling before it ever starts. In that scheduling order no PARALLEL_AGENT_CANCELLED event is emitted, which makes this test flaky.
💡 Stabilize the setup with a barrier

```diff
 call_count = 0
+peer_started = asyncio.Event()
 async def side_effect(**kwargs: object) -> AgentRunResult:
     nonlocal call_count
     call_count += 1
     if call_count == 1:
+        await peer_started.wait()
         msg = "fail fast"
         raise RuntimeError(msg)
+    peer_started.set()
     await asyncio.sleep(10)
     identity = kwargs.get("identity")
     task = kwargs.get("task")
     return _make_run_result(identity, task)  # type: ignore[arg-type]
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
call_count = 0
peer_started = asyncio.Event()

async def side_effect(**kwargs: object) -> AgentRunResult:
    nonlocal call_count
    call_count += 1
    if call_count == 1:
        await peer_started.wait()
        msg = "fail fast"
        raise RuntimeError(msg)
    peer_started.set()
    await asyncio.sleep(10)
    identity = kwargs.get("identity")
    task = kwargs.get("task")
    return _make_run_result(identity, task)  # type: ignore[arg-type]

engine = _mock_engine()
engine.run = AsyncMock(side_effect=side_effect)
executor = ParallelExecutor(engine=engine)
group = _make_group(a1, a2, fail_fast=True)

with structlog.testing.capture_logs() as cap:
    await executor.execute_group(group)

cancelled_events = [
    e for e in cap if e.get("event") == PARALLEL_AGENT_CANCELLED
]
assert len(cancelled_events) >= 1
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/engine/test_parallel.py` around lines 615 - 639, The test is flaky
because the first side_effect raises before any await so the sibling may be
cancelled before it starts; modify the test to synchronize starts with a
barrier/event so both agent runs enter an await point before the failure is
raised. Concretely, add an asyncio.Event or barrier (e.g., start_evt) captured
by the test and have the side_effect (the AsyncMock assigned to engine.run)
await start_evt.wait() early in its body for every call; then in the test set
start_evt.set() only after both runs have begun (or after a tiny sleep) so the
failing branch raises after entering the await and the sibling has started and
can be cancelled, ensuring a deterministic emission of PARALLEL_AGENT_CANCELLED
when ParallelExecutor.execute_group is called with fail_fast=True and group from
_make_group.
…ilot, and Greptile
- Error messages now show original names instead of casefolded (3 validators)
- merge_id only preserved when inheritance is active (standalone templates don't leak it)
- Lock release failure surfaced as ParallelExecutionError instead of silently swallowed
- ExceptionGroup logged at WARNING instead of DEBUG to surface infrastructure bugs
- PARALLEL_AGENT_CANCELLED now includes group_id for log correlation
- receive() docstrings document unsubscribe as None return cause
- Invalid handler warning includes structured context (agent_id, handler_name, handler_type)
- _merge_agents docstring updated to mention merge_id in strip keys
- QueueFull uses dedicated COMM_UNSUBSCRIBE_SENTINEL_FAILED event constant
- Department name uniqueness in schema.py now case-insensitive
- _agent_key raises TemplateInheritanceError on missing role instead of warning-only
- parallel.py module docstring: "Follows" → "Inspired by"
- receive() docstring notes single-waiter contract per subscription
- Tests: case-insensitive normalization (3 new), merge_id disambiguation (strengthened), cancellation test stabilized with barrier, in_progress assertion tightened, standalone merge_id omission, missing-role raises

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/ai_company/templates/merge.py (1)
204-211: 🧹 Nitpick | 🔵 Trivial
Minor: redundant local alias.
The local alias `_strip_keys = _STRIP_KEYS` on line 204 provides marginal lookup optimization but reduces clarity. Consider using `_STRIP_KEYS` directly in the comprehension for consistency with `_apply_child_agent`.
♻️ Suggested simplification

```diff
 def _collect_merged_agents(
     parent_entries: dict[tuple[str, str, str], list[_ParentEntry]],
     appended: list[dict[str, Any]],
 ) -> list[dict[str, Any]]:
     """Collect surviving parent agents (in order) + appended."""
     all_entries = sorted(
         (entry for entries in parent_entries.values() for entry in entries),
         key=lambda e: e.index,
     )
-    _strip_keys = _STRIP_KEYS
     result: list[dict[str, Any]] = [
-        {k: v for k, v in entry.agent.items() if k not in _strip_keys}
+        {k: v for k, v in entry.agent.items() if k not in _STRIP_KEYS}
         for entry in all_entries
         if entry.agent is not None
     ]
     result.extend(appended)
     return result
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/templates/merge.py` around lines 204 - 211, Remove the redundant local alias _strip_keys and use the module-level _STRIP_KEYS directly in the list comprehension that builds result (the comprehension iterating over all_entries and filtering entry.agent), mirroring the style used in _apply_child_agent; leave result.extend(appended) and return result unchanged, and ensure there are no other references to _strip_keys in the function.
src/ai_company/communication/bus_memory.py (1)
519-522: ⚠️ Potential issue | 🟠 Major
Await the helper tasks in the cancellation path.
The `except BaseException` branch cancels `get_task` and `shutdown_task` and immediately re-raises without awaiting them. This leaves those tasks un-awaited when the outer `receive()` coroutine is cancelled while `asyncio.wait()` is pending, which can trigger "Task was destroyed but it is pending!" warnings at shutdown. Child tasks created with `asyncio.create_task()` must be awaited after cancellation before re-raising, following the canonical Python pattern: cancel → await → re-raise.
Suggested cleanup

```diff
 except BaseException:
     get_task.cancel()
     shutdown_task.cancel()
+    with contextlib.suppress(asyncio.CancelledError):
+        await get_task
+    with contextlib.suppress(asyncio.CancelledError):
+        await shutdown_task
     raise
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/bus_memory.py` around lines 519 - 522, In the except BaseException branch inside receive(), after calling get_task.cancel() and shutdown_task.cancel() await both tasks (e.g., await get_task and await shutdown_task) and handle their CancelledError/other exceptions before re-raising; follow the canonical cancel→await→re-raise pattern so child tasks created with asyncio.create_task() are properly awaited and cleaned up (use try/except around each await to swallow CancelledError but propagate the original exception afterwards).
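As a standalone illustration of the cancel → await → re-raise pattern this comment references, here is a generic sketch using only the standard library; the function and helper names are illustrative, not the project's actual `receive()` implementation:

```python
import asyncio
import contextlib
from collections.abc import Coroutine
from typing import Any


async def wait_first(
    coro_a: Coroutine[Any, Any, str], coro_b: Coroutine[Any, Any, str]
) -> str:
    """Return the first result of two coroutines, cleaning up the loser.

    Demonstrates cancel -> await -> re-raise: helper tasks are always
    awaited after cancellation, so none are left pending at shutdown.
    """
    task_a = asyncio.create_task(coro_a)
    task_b = asyncio.create_task(coro_b)
    try:
        done, pending = await asyncio.wait(
            {task_a, task_b}, return_when=asyncio.FIRST_COMPLETED
        )
        for task in pending:
            task.cancel()
            with contextlib.suppress(asyncio.CancelledError):
                await task  # await the cancelled task before moving on
        return next(iter(done)).result()
    except BaseException:
        # Outer cancellation (or any error): cancel, await, then re-raise.
        for task in (task_a, task_b):
            task.cancel()
            with contextlib.suppress(asyncio.CancelledError):
                await task
        raise


async def main() -> str:
    async def fast() -> str:
        return "fast"

    async def slow() -> str:
        await asyncio.sleep(10)
        return "slow"

    return await wait_first(fast(), slow())


result = asyncio.run(main())
print(result)  # -> fast
```

The key point is that `task.cancel()` alone only *requests* cancellation; the subsequent `await task` (with `CancelledError` suppressed) is what actually lets the task finish unwinding before control leaves the function.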
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/communication/bus_memory.py`:
- Around line 464-493: Change _await_with_shutdown to return a tuple (result,
reason) where reason is a literal/enum like "shutdown" | "timeout" |
"unsubscribed"; update callers (e.g., receive()) to unpack that tuple and pass
the explicit reason into _log_receive_null instead of letting _log_receive_null
re-check _shutdown_event or _queues; modify _log_receive_null signature to
accept the reason and log COMM_RECEIVE_SHUTDOWN / COMM_RECEIVE_TIMEOUT /
COMM_RECEIVE_UNSUBSCRIBED based solely on the supplied reason so the logged
cause matches the actual wake-up source.
In `@src/ai_company/communication/messenger.py`:
- Around line 275-280: The receive method's channel_name parameter is currently
typed as plain str which allows blank names; update the signature of async def
receive(self, channel_name: str, *, timeout: float | None = None) ->
DeliveryEnvelope | None to use NotBlankStr from core.types (i.e., channel_name:
NotBlankStr) to enforce non-blank identifiers at the messenger boundary, and
update any imports to include NotBlankStr and adjust any callers/tests if they
rely on a plain str type to ensure compatibility with MessageBus.receive()
typing.
In `@src/ai_company/engine/parallel.py`:
- Around line 403-404: The finally block currently masks bookkeeping bugs by
clamping progress.in_progress with max(0, ...); change it to perform a plain
decrement of progress.in_progress (i.e., progress.in_progress -= 1) and then
explicitly check for negative values—if progress.in_progress < 0, log an error
or raise an exception (or assert) so the regression is visible; update the code
in the same scope where progress.in_progress is modified (the finally block
around the parallel task handling) and ensure any negative-counter handling
provides sufficient context in the log/exception.
- Around line 157-175: The cleanup currently logs failures from
_release_all_locks() but lets an earlier exception (from _acquire_all_locks() or
_run_task_group()) mask release failures; refactor the try/except/finally so you
capture any original exception, always await self._release_all_locks(group,
lock) in the finally, store any release exception in release_error, and then: if
release_error is not None raise ParallelExecutionError(f"Parallel group
{group.group_id!r} completed but resource locks could not be released") from
release_error (so the lock-release failure wins), otherwise re-raise the
original exception; alternatively extract the cleanup into a helper that does
the same logic around _release_all_locks, using the symbols _acquire_all_locks,
_run_task_group, _release_all_locks, release_error, group.group_id, and
ParallelExecutionError to locate and implement the change.
In `@src/ai_company/templates/merge.py`:
- Around line 279-287: Change the log level from warning to error for the
missing 'role' condition: in the block that checks "if not role" (where it
builds msg and currently calls logger.warning with TEMPLATE_INHERIT_MERGE_ERROR,
action="missing_role", agent_keys=sorted(agent.keys()), error=msg), replace
logger.warning with logger.error so the code logs an error before raising
TemplateInheritanceError for the agent missing a 'role'.
In `@src/ai_company/templates/schema.py`:
- Around line 49-52: Create a small helper function (e.g.,
_raise_validation_error(msg: str) -> NoReturn) that calls
logger.warning(TEMPLATE_SCHEMA_VALIDATION_ERROR, error=msg) and then raises
ValueError(msg), and replace all inline occurrences of the pattern
logger.warning(...); raise ValueError(msg) across this module (examples found
around the blocks handling self.required/default and other validators referenced
at 49-52, 63-68, 77-82, 145-151, 220-223, 321-334, 341-345) so each validator
calls _raise_validation_error(msg) instead; keep the
TEMPLATE_SCHEMA_VALIDATION_ERROR symbol and the exact message text unchanged.
In `@tests/unit/engine/test_parallel.py`:
- Around line 636-642: Update the test to assert structured cancellation fields
on the captured PARALLEL_AGENT_CANCELLED event: after collecting
cancelled_events from cap (where event == PARALLEL_AGENT_CANCELLED), find at
least one event and assert its "group_id" equals group.group_id (and optionally
check "task_id" and "agent_id" are present/non-empty). Reference the existing
symbols: keep using executor.execute_group(group), PARALLEL_AGENT_CANCELLED, and
the captured log entries from structlog.testing.capture_logs() to locate and add
the extra assertions.
---
Outside diff comments:
In `@src/ai_company/communication/bus_memory.py`:
- Around line 519-522: In the except BaseException branch inside receive(),
after calling get_task.cancel() and shutdown_task.cancel() await both tasks
(e.g., await get_task and await shutdown_task) and handle their
CancelledError/other exceptions before re-raising; follow the canonical
cancel→await→re-raise pattern so child tasks created with asyncio.create_task()
are properly awaited and cleaned up (use try/except around each await to swallow
CancelledError but propagate the original exception afterwards).
In `@src/ai_company/templates/merge.py`:
- Around line 204-211: Remove the redundant local alias _strip_keys and use the
module-level _STRIP_KEYS directly in the list comprehension that builds result
(the comprehension iterating over all_entries and filtering entry.agent),
mirroring the style used in _apply_child_agent; leave result.extend(appended)
and return result unchanged, and ensure there are no other references to
_strip_keys in the function.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: cad05a1a-a504-4370-a468-37917ad0c87d
📒 Files selected for processing (13)
src/ai_company/communication/bus_memory.py
src/ai_company/communication/dispatcher.py
src/ai_company/communication/messenger.py
src/ai_company/core/company.py
src/ai_company/engine/parallel.py
src/ai_company/observability/events/communication.py
src/ai_company/templates/merge.py
src/ai_company/templates/renderer.py
src/ai_company/templates/schema.py
tests/unit/core/test_company.py
tests/unit/engine/test_parallel.py
tests/unit/templates/test_inheritance.py
tests/unit/templates/test_renderer.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not use `from __future__ import annotations` in Python 3.14+ code — Python 3.14 has native PEP 649 lazy annotations
Use `except A, B:` (no parentheses) for exception syntax in Python 3.14 — PEP 758 except syntax enforced by ruff
All public functions and classes must have type hints; mypy strict mode is enforced
All public classes and functions must have Google-style docstrings; ruff D rules enforce this
Create new objects instead of mutating existing ones; use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement of non-Pydantic internal collections (registries, BaseTool)
For non-Pydantic internal collections and dict/list fields in frozen Pydantic models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use `@computed_field` for derived values instead of storing redundant fields
Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly; never silently swallow errors
Validate at system boundaries (user input, external APIs, config files)
Line length must be 88 characters (enforced by ruff)
Files:
src/ai_company/observability/events/communication.py
src/ai_company/engine/parallel.py
src/ai_company/communication/messenger.py
src/ai_company/core/company.py
src/ai_company/communication/dispatcher.py
tests/unit/templates/test_renderer.py
tests/unit/core/test_company.py
tests/unit/engine/test_parallel.py
src/ai_company/templates/schema.py
src/ai_company/templates/merge.py
src/ai_company/templates/renderer.py
tests/unit/templates/test_inheritance.py
src/ai_company/communication/bus_memory.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic must import `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
Never use `import logging` / `logging.getLogger()` / `print()` in application code
Always use variable name `logger` (not `_logger`, not `log`) for the logger instance
Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
Always use structured logging with kwargs: `logger.info(EVENT, key=value)` — never use formatted strings like `logger.info("msg %s", val)`
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
DEBUG logging is for object creation, internal flow, and entry/exit of key functions
Files:
src/ai_company/observability/events/communication.py
src/ai_company/engine/parallel.py
src/ai_company/communication/messenger.py
src/ai_company/core/company.py
src/ai_company/communication/dispatcher.py
src/ai_company/templates/schema.py
src/ai_company/templates/merge.py
src/ai_company/templates/renderer.py
src/ai_company/communication/bus_memory.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases
Files:
src/ai_company/observability/events/communication.py
src/ai_company/engine/parallel.py
src/ai_company/communication/messenger.py
src/ai_company/core/company.py
src/ai_company/communication/dispatcher.py
src/ai_company/templates/schema.py
src/ai_company/templates/merge.py
src/ai_company/templates/renderer.py
src/ai_company/communication/bus_memory.py
src/ai_company/engine/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains
Files:
src/ai_company/engine/parallel.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, or `@pytest.mark.slow` as appropriate
Prefer `@pytest.mark.parametrize` for testing similar cases
Use vendor-agnostic test names: test-provider, test-small-001, etc. instead of real vendor names
Files:
tests/unit/templates/test_renderer.py
tests/unit/core/test_company.py
tests/unit/engine/test_parallel.py
tests/unit/templates/test_inheritance.py
🧠 Learnings (6)
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
Applied to files:
src/ai_company/observability/events/communication.py
src/ai_company/engine/parallel.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to **/*.py : Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Applied to files:
src/ai_company/engine/parallel.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
src/ai_company/communication/dispatcher.py
src/ai_company/templates/schema.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use structured logging with kwargs: `logger.info(EVENT, key=value)` — never use formatted strings like `logger.info("msg %s", val)`
Applied to files:
src/ai_company/communication/dispatcher.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic must import `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
Applied to files:
src/ai_company/templates/schema.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use variable name `logger` (not `_logger`, not `log`) for the logger instance
Applied to files:
src/ai_company/templates/schema.py
🧬 Code graph analysis (10)
src/ai_company/engine/parallel.py (3)
src/ai_company/engine/errors.py (1)
ParallelExecutionError(42-43)
src/ai_company/engine/parallel_models.py (4)
agent_id(79-81), task_id(87-89), AgentOutcome(143-193), is_success(191-193)
src/ai_company/engine/run_result.py (1)
AgentRunResult(19-109)
src/ai_company/communication/messenger.py (3)
src/ai_company/communication/subscription.py (2)
DeliveryEnvelope(25-41), Subscription(9-22)
src/ai_company/communication/bus_memory.py (1)
receive(419-466)
src/ai_company/communication/bus_protocol.py (1)
receive(123-150)
src/ai_company/communication/dispatcher.py (3)
tests/unit/communication/test_messenger.py (4)
handler(359-360), handler(380-381), handler(416-417), handler(472-473)
tests/unit/communication/test_dispatcher.py (3)
handle(39-40), handle(49-50), handle(97-98)
src/ai_company/communication/handler.py (2)
handle(24-30), handle(63-69)
tests/unit/templates/test_renderer.py (1)
src/ai_company/templates/renderer.py (1)
_expand_single_agent(580-642)
tests/unit/core/test_company.py (1)
src/ai_company/core/company.py (3)
Team(223-254), Department(257-322), HRRegistry(358-397)
tests/unit/engine/test_parallel.py (2)
src/ai_company/engine/parallel.py (1)
ParallelExecutor(81-580)
src/ai_company/engine/parallel_models.py (1)
ParallelProgress(251-289)
src/ai_company/templates/schema.py (2)
src/ai_company/observability/_logger.py (1)
get_logger(8-28)
src/ai_company/config/schema.py (1)
_validate_unique_department_names(457-464)
src/ai_company/templates/merge.py (1)
src/ai_company/templates/errors.py (1)
TemplateInheritanceError(23-28)
tests/unit/templates/test_inheritance.py (2)
src/ai_company/templates/merge.py (1)
_merge_agents(103-145)src/ai_company/templates/errors.py (1)
TemplateInheritanceError(23-28)
src/ai_company/communication/bus_memory.py (1)
src/ai_company/communication/subscription.py (1)
DeliveryEnvelope(25-41)
🔇 Additional comments (24)
src/ai_company/communication/dispatcher.py (3)

7-7: LGTM! The `inspect` import and `COMM_HANDLER_INVALID` event constant are correctly added to support the new handler validation logic. Also applies to: 32-32

101-113: LGTM! Properly validates the async handler requirement with structured logging. The implementation correctly:
- Uses `inspect.iscoroutinefunction()` to detect synchronous handlers at registration time (fail-fast)
- Logs with full structured context (`agent_id`, `handler_name`, `handler_type`, `error`) before raising
- Provides a clear, actionable error message

This addresses the previous review feedback and aligns well with the `MessageHandler.handle()` protocol defined in `handler.py`.

99-100: Note: FunctionHandler bypasses the async check by design. The `FunctionHandler` wrapper (created for bare async functions) is instantiated before the `elif` check at line 101, so it correctly bypasses the `iscoroutinefunction` validation. This is safe because `FunctionHandler.handle()` is defined as `async def` in `handler.py` (line 62), making it inherently compliant.

src/ai_company/core/company.py (5)
33-43: LGTM! The validator correctly performs case-insensitive self-report detection using `casefold()` and logs the validation error before raising, in compliance with coding guidelines.

244-254: LGTM! Well-implemented case-insensitive duplicate detection. The pattern of preserving original member names in the error message (rather than the normalized form) provides a better user experience for debugging.

296-308: LGTM! Consistent with the Team validator pattern: case-insensitive detection via `casefold()` while surfacing original team names in the error message.

310-322: LGTM! The validator correctly logs before raising, satisfying the "log at WARNING before raising" guideline.

385-397: LGTM! Case-insensitive duplicate detection for active agents follows the same established pattern as the Team and Department validators, maintaining consistency across the codebase.
tests/unit/core/test_company.py (3)
93-100: LGTM! Good test coverage for the case-insensitive duplicate member detection. The test properly validates that members differing only by case (`"Alice"` vs `"alice"`) are rejected.

189-199: LGTM! Appropriate test for case-insensitive team name validation. The test structure mirrors the existing `test_duplicate_team_names_rejected` test while specifically targeting the case-sensitivity behavior.

306-309: LGTM! Concise test for case-insensitive active agent duplicate detection, consistent with the other case-insensitive tests added in this PR.
src/ai_company/templates/merge.py (4)
26-27: LGTM! The module-level `_STRIP_KEYS` constant is well-defined as an immutable `frozenset`, providing a single source of truth for keys that should be stripped during merge operations.

80-80: LGTM! The comment accurately reflects the updated merge semantics using the `(role, department, merge_id)` key.

118-119: LGTM! Documentation updated to reflect that both `_remove` and `merge_id` keys are stripped from the output.

175-185: LGTM! Good refactoring: the early return for the `_remove` path is clear, and the key-stripping logic using `_STRIP_KEYS` is applied consistently to both replacements and appended agents.

src/ai_company/templates/renderer.py (1)
636-641: LGTM! Past review concern addressed. The implementation now correctly preserves `merge_id` only when inheritance is active (`has_extends`) and the value is non-empty. The comment clearly documents the rationale: standalone templates have no merge step to strip it later.

tests/unit/templates/test_renderer.py (1)
516-553: LGTM! Comprehensive test coverage for `merge_id` handling in `_expand_single_agent`:
- Preserves `merge_id` when `has_extends=True` and the value is non-empty
- Omits `merge_id` when empty (even with `has_extends=True`)
- Omits `merge_id` when `has_extends=False` (standalone templates)

This aligns well with the implementation in `renderer.py` lines 636-641.

tests/unit/templates/test_inheritance.py (4)
499-533: LGTM! Past review concern addressed. The test now properly proves `merge_id`-based disambiguation:
- Parent agents have distinguishable `specialty` fields (`"frontend"` vs `"backend"`)
- Child targets the second parent (`merge_id="backend"`) rather than the first
- Assertions verify the correct agent was updated by checking `specialty` rather than relying on positional order

This ensures the test would fail if `_merge_agents` fell back to positional matching.

535-545: LGTM! Good coverage for verifying that `merge_id` is stripped from the final output, consistent with `_STRIP_KEYS` handling in the merge logic.

547-575: LGTM! The `_remove` test now properly proves `merge_id`-targeted removal by:
- Using distinguishable `specialty` fields on parent agents
- Removing the `frontend` agent specifically
- Asserting the remaining agent has `specialty="backend"`

577-581: LGTM! Good coverage for the error case where an agent dict is missing the required `role` field, validating the new `_agent_key` behavior in `merge.py`.

src/ai_company/templates/schema.py (2)
10-13: Logger setup here looks correct. This follows the module-level observability pattern and keeps the logger name consistent. Based on learnings: Applies to `src/ai_company/**/*.py`: every module with business logic must import `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`; always use the variable name `logger` (not `_logger`, not `log`) for the logger instance.

349-361: Case-insensitive department dedupe is a solid fix. Normalizing with `strip().casefold()` closes the `"Sales"` vs `" sales "` gap while still reporting the original inputs in the error message.

tests/unit/engine/test_parallel.py (1)

680-683: Nice tightening of this concurrency assertion. Checking that at least one callback saw `in_progress > 0` closes the false-positive hole in the old `max_in_progress <= 2` assertion.
```python
        if result is None:
            self._log_receive_null(channel_name, subscriber_id, timeout)
        return result

    def _log_receive_null(
        self,
        channel_name: str,
        subscriber_id: str,
        timeout: float | None,
    ) -> None:
        """Log the cause when ``receive()`` returns ``None``."""
        if self._shutdown_event.is_set():
            logger.debug(
                COMM_RECEIVE_SHUTDOWN,
                channel=channel_name,
                subscriber=subscriber_id,
            )
        elif (channel_name, subscriber_id) not in self._queues:
            logger.debug(
                COMM_RECEIVE_UNSUBSCRIBED,
                channel=channel_name,
                subscriber=subscriber_id,
            )
        else:
            logger.debug(
                COMM_RECEIVE_TIMEOUT,
                channel=channel_name,
                subscriber=subscriber_id,
                timeout=timeout,
            )
```
Return the wake-up reason explicitly instead of inferring it from current state.
_log_receive_null() decides between timeout/shutdown/unsubscribe by checking _shutdown_event and _queues after the await has already finished. That makes the log racey: a timeout can be logged as shutdown if stop() runs just after the wait returns, and an unsubscribe wake can be logged as timeout if the same subscriber is re-added before the helper runs. Have _await_with_shutdown() return both the envelope and a reason enum/literal so the log reflects the actual wake-up cause.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/bus_memory.py` around lines 464 - 493, Change
_await_with_shutdown to return a tuple (result, reason) where reason is a
literal/enum like "shutdown" | "timeout" | "unsubscribed"; update callers (e.g.,
receive()) to unpack that tuple and pass the explicit reason into
_log_receive_null instead of letting _log_receive_null re-check _shutdown_event
or _queues; modify _log_receive_null signature to accept the reason and log
COMM_RECEIVE_SHUTDOWN / COMM_RECEIVE_TIMEOUT / COMM_RECEIVE_UNSUBSCRIBED based
solely on the supplied reason so the logged cause matches the actual wake-up
source.
```python
    async def receive(
        self,
        channel_name: str,
        *,
        timeout: float | None = None,  # noqa: ASYNC109
    ) -> DeliveryEnvelope | None:
```
Use NotBlankStr for the new channel_name parameter.
This adds a new public identifier field as plain str, so blank channel names still depend on downstream validation instead of being rejected at the messenger boundary. Align this facade with MessageBus.receive() and the repo’s identifier-typing rule.
As per coding guidelines, "Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/messenger.py` around lines 275 - 280, The
receive method's channel_name parameter is currently typed as plain str which
allows blank names; update the signature of async def receive(self,
channel_name: str, *, timeout: float | None = None) -> DeliveryEnvelope | None
to use NotBlankStr from core.types (i.e., channel_name: NotBlankStr) to enforce
non-blank identifiers at the messenger boundary, and update any imports to
include NotBlankStr and adjust any callers/tests if they rely on a plain str
type to ensure compatibility with MessageBus.receive() typing.
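For illustration only, here is what `NotBlankStr`-style typing enforces at the boundary. The repo's real `NotBlankStr` in `core.types` is presumably a validated `Annotated` alias checked by Pydantic; `require_not_blank` below is a hypothetical stand-in that mimics the same rejection manually.

```python
def require_not_blank(value: str, field: str = "channel_name") -> str:
    """Reject non-string or whitespace-only identifiers before they propagate."""
    if not isinstance(value, str) or not value.strip():
        raise ValueError(f"{field} must be a non-blank string, got {value!r}")
    return value
```

With the `NotBlankStr` annotation in place, a blank `channel_name` fails at the messenger facade instead of depending on downstream validation.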
src/ai_company/engine/parallel.py
Outdated
```python
        finally:
            progress.in_progress = max(0, progress.in_progress - 1)
```
🧹 Nitpick | 🔵 Trivial
Don’t clamp in_progress back to zero here.
max(0, ...) turns a missing increment or double-decrement into silent, seemingly valid progress data. A plain decrement plus an explicit negative-counter check would keep bookkeeping regressions visible instead of masking them.
As per coding guidelines: "Handle errors explicitly; never silently swallow errors".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/engine/parallel.py` around lines 403 - 404, The finally block
currently masks bookkeeping bugs by clamping progress.in_progress with max(0,
...); change it to perform a plain decrement of progress.in_progress (i.e.,
progress.in_progress -= 1) and then explicitly check for negative values—if
progress.in_progress < 0, log an error or raise an exception (or assert) so the
regression is visible; update the code in the same scope where
progress.in_progress is modified (the finally block around the parallel task
handling) and ensure any negative-counter handling provides sufficient context
in the log/exception.
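The suggested bookkeeping could look like this sketch. The `Progress` dataclass and `finish_assignment` function are illustrative stand-ins for `ParallelProgress` and the decrement site, not repo code: decrement plainly, then make a negative counter loudly visible instead of normalizing it away.

```python
from dataclasses import dataclass

@dataclass
class Progress:
    in_progress: int = 0

def finish_assignment(progress: Progress) -> None:
    """Decrement without clamping; surface broken increment/decrement pairing."""
    progress.in_progress -= 1
    if progress.in_progress < 0:
        # A negative count means a missing increment or a double decrement —
        # raise so the regression is visible rather than masked by max(0, ...).
        raise RuntimeError(
            f"in_progress went negative ({progress.in_progress}); "
            "increment/decrement pairing is broken"
        )
```

A `max(0, ...)` clamp would have silently converted the same bug into seemingly valid progress data.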
```python
        if self.required and self.default is not None:
            msg = f"Variable {self.name!r} is required but defines a default"
            logger.warning(TEMPLATE_SCHEMA_VALIDATION_ERROR, error=msg)
            raise ValueError(msg)
```
🧹 Nitpick | 🔵 Trivial
Centralize the repeated warning-and-raise path.
The same logger.warning(...); raise ValueError(msg) sequence is now repeated across most validators in this module. A tiny helper would keep the event payload consistent and reduce copy/paste drift.
♻️ Proposed refactor

```python
from typing import Never

def _raise_validation_error(msg: str) -> Never:
    """Log a schema validation error and raise."""
    logger.warning(TEMPLATE_SCHEMA_VALIDATION_ERROR, error=msg)
    raise ValueError(msg)
```

```diff
-    logger.warning(TEMPLATE_SCHEMA_VALIDATION_ERROR, error=msg)
-    raise ValueError(msg)
+    _raise_validation_error(msg)
```

Apply the same substitution to the other validators in this file.
Also applies to: 63-68, 77-82, 145-151, 220-223, 321-334, 341-345
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/templates/schema.py` around lines 49 - 52, Create a small
helper function (e.g., _raise_validation_error(msg: str) -> NoReturn) that calls
logger.warning(TEMPLATE_SCHEMA_VALIDATION_ERROR, error=msg) and then raises
ValueError(msg), and replace all inline occurrences of the pattern
logger.warning(...); raise ValueError(msg) across this module (examples found
around the blocks handling self.required/default and other validators referenced
at 49-52, 63-68, 77-82, 145-151, 220-223, 321-334, 341-345) so each validator
calls _raise_validation_error(msg) instead; keep the
TEMPLATE_SCHEMA_VALIDATION_ERROR symbol and the exact message text unchanged.
- Revert PARALLEL_GROUP_SUPPRESSED to DEBUG (double-reports with individual logs)
- Surface lock release failures on error paths via add_note
- Remove defensive max(0, ...) clamp on in_progress decrement
- Upgrade _agent_key missing role log from warning to error
- Assert group_id on captured PARALLEL_AGENT_CANCELLED events in test
- Remove redundant _strip_keys local alias in merge.py
Pull request overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
src/ai_company/templates/merge.py:291
`_agent_key()` uses `str(...).lower()` for `role` and `merge_id` without stripping/validating. This means values like `None` become `'none'` (non-empty) and whitespace-only roles/merge_ids become valid keys, which can lead to silent key collisions or mismatched targeting. Consider treating non-strings as invalid, stripping whitespace, and only accepting a non-empty role after normalization; similarly normalize `merge_id` with `strip().casefold()` and treat `None`/blank as `""`.
```python
    role = str(agent.get("role", "")).lower()
    if not role:
        msg = f"Agent dict is missing 'role' field (keys: {sorted(agent.keys())})"
        logger.error(
            TEMPLATE_INHERIT_MERGE_ERROR,
            action="missing_role",
            agent_keys=sorted(agent.keys()),
            error=msg,
        )
        raise TemplateInheritanceError(msg)
    dept = agent.get("department")
    if not dept:
        dept = DEFAULT_MERGE_DEPARTMENT
    merge_id = str(agent.get("merge_id", "")).lower()
    return (role, str(dept).lower(), merge_id)
```
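The suggested normalization could be factored into a tiny helper, sketched here under the review's assumptions (the name `normalize_merge_field` is hypothetical, not from the repo):

```python
def normalize_merge_field(value: object) -> str:
    """Return a normalized key component, or "" for None, blank, or non-string input.

    Avoids str(...).lower() pitfalls where None becomes the non-empty key 'none'
    and whitespace-only values become valid-looking keys.
    """
    if not isinstance(value, str):
        return ""
    return value.strip().casefold()
```

`_agent_key` could then treat an empty normalized `role` as invalid and use the empty string as the "no merge_id" sentinel.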
```diff
@@ -150,16 +156,32 @@ async def execute_group(
                 fatal_errors,
                 progress,
             )
+        except Exception as exc:
+            task_error = exc
         finally:
             if lock is not None:
                 try:
                     await self._release_all_locks(group, lock)
-                except Exception:
+                except Exception as exc:
                     logger.exception(
-                        PARALLEL_VALIDATION_ERROR,
+                        PARALLEL_LOCK_RELEASE_ERROR,
                         error="Failed to release resource locks",
                         group_id=group.group_id,
                     )
+                    release_error = exc
+
+        if release_error is not None:
+            lock_msg = (
+                f"Parallel group {group.group_id!r}: "
+                "resource locks could not be released"
+            )
+            if task_error is not None:
+                task_error.add_note(lock_msg)
+            else:
+                raise ParallelExecutionError(lock_msg) from release_error
+
+        if task_error is not None:
+            raise task_error
```
execute_group() captures an exception into task_error and later re-raises it with raise task_error. Re-raising an exception instance outside the except block can lose the original traceback context, making debugging harder. Consider preserving the original traceback (e.g., store exc.__traceback__ and re-raise with with_traceback(...)) or restructuring the control flow so the original exception is re-raised directly after lock release handling.
```python
        merge_id = str(agent.get("merge_id", "")).strip()
        if has_extends and merge_id:
            agent_dict["merge_id"] = merge_id
```
merge_id is coerced with str(...), so a YAML merge_id: null (or any non-string) would silently become a non-empty string like 'None' and then be preserved into the expanded agent dict when has_extends=True. That both bypasses schema validation (where merge_id is declared as str) and can break merge targeting. Consider validating that merge_id is either missing/empty or a real string (e.g., reject None / non-str types) before adding it to agent_dict.
```suggestion
        raw_merge_id = agent.get("merge_id", None)
        if has_extends and raw_merge_id not in (None, ""):
            if not isinstance(raw_merge_id, str):
                msg = (
                    f"Agent {name!r} has invalid 'merge_id' type "
                    f"{type(raw_merge_id).__name__}; expected a string"
                )
                logger.warning(
                    TEMPLATE_RENDER_VARIABLE_ERROR,
                    agent=name,
                    field="merge_id",
                )
                raise TemplateRenderError(msg)
            merge_id = raw_merge_id.strip()
            if merge_id:
                agent_dict["merge_id"] = merge_id
```
```python
        elif (channel_name, subscriber_id) not in self._queues:
            logger.debug(
                COMM_RECEIVE_UNSUBSCRIBED,
                channel=channel_name,
                subscriber=subscriber_id,
            )
```
_log_receive_null() infers “unsubscribed” by checking (channel_name, subscriber_id) not in self._queues after the await. Because this check is outside _lock, a concurrent unsubscribe() (or even a later subscribe()) can race with a timeout and cause the wrong event to be logged. To make the event reliable, consider having _await_with_shutdown() return an explicit reason (timeout vs shutdown vs unsubscribe-sentinel) rather than re-deriving it from shared state.
```python
        queue = self._queues.pop((channel_name, subscriber_id), None)
        if queue is not None:
            try:
                queue.put_nowait(None)
            except asyncio.QueueFull:
                logger.exception(
                    COMM_UNSUBSCRIBE_SENTINEL_FAILED,
                    channel=channel_name,
                    subscriber=subscriber_id,
                    detail="Queue full — unsubscribe sentinel not delivered",
                )
```
asyncio.QueueFull can never be raised here — dead code branch
_ensure_queue creates queues with asyncio.Queue() (no maxsize argument), which defaults to maxsize=0 — an unbounded queue. put_nowait on an unbounded queue never raises QueueFull, so this except branch is unreachable dead code.
If an explicit capacity limit is ever added to the queue in the future, the handler would kick in correctly. For now, consider either:
1. Removing the `try`/`except` and just calling `queue.put_nowait(None)` unconditionally (it can't raise), or
2. Adding a comment explaining this is a forward-compatible guard.
```suggestion
        queue = self._queues.pop((channel_name, subscriber_id), None)
        if queue is not None:
            # Queue is unbounded (asyncio.Queue() default maxsize=0),
            # so put_nowait cannot raise QueueFull here.
            queue.put_nowait(None)
```
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/communication/bus_memory.py
Line: 402-412
Comment:
`asyncio.QueueFull` can never be raised here — dead code branch
`_ensure_queue` creates queues with `asyncio.Queue()` (no `maxsize` argument), which defaults to `maxsize=0` — an unbounded queue. `put_nowait` on an unbounded queue never raises `QueueFull`, so this `except` branch is unreachable dead code.
If an explicit capacity limit is ever added to the queue in the future, the handler would kick in correctly. For now, consider either:
1. Removing the `try/except` and just calling `queue.put_nowait(None)` unconditionally (it can't raise), or
2. Adding a comment explaining this is a forward-compatible guard.
```suggestion
queue = self._queues.pop((channel_name, subscriber_id), None)
if queue is not None:
# Queue is unbounded (asyncio.Queue() default maxsize=0),
# so put_nowait cannot raise QueueFull here.
queue.put_nowait(None)
```
How can I resolve this? If you propose a fix, please make it concise.

Critical (C1-C4):
- Parse decisions/action_items from LLM synthesis in all 3 meeting protocols
- Validate winning_agent_id exists in find_losers() before computing losers

Major (M1-M17):
- Guard summary budget reserve when leader_summarizes=False
- Add synthesis sub-reserve in structured phases discussion
- Reject duplicate participant_ids in meeting orchestrator
- Freeze protocol registry with MappingProxyType
- Warn when token tracker exceeds budget
- Add hierarchy tiebreaker to pick_highest_seniority()
- Wire hierarchy into debate/hybrid authority fallbacks
- Fast-path get_lowest_common_manager(a, a) → a
- Validate _SENIORITY_ORDER matches enum members at import
- Remove dead max_tokens_per_argument config field
- Verify task IDs match plan subtask IDs in DecompositionResult
- Return CANCELLED for mixed completed+cancelled terminal states
- Fix double-logging in rollup compute() for empty case
- Copy subtask dependencies from plan to created Tasks
- Reject duplicate subtask IDs in RoutingResult
- Wake all pending waiters on unsubscribe (not just one)

Minor (m1-m15):
- Remove duplicate MEETING_CONFLICT_DETECTED log events
- Replace assert with explicit raises in meeting protocols
- Include presenter_id in formatted agenda prompt
- Validate token aggregates in MeetingMinutes
- Require non-empty error_message for FAILED/BUDGET_EXHAUSTED
- Move _MIN_POSITIONS to local constant in service.py
- Precompute seniority rank dict for O(1) lookups
- Remove dead asyncio.QueueFull catch on unbounded queue
- Fix racy state check in _log_receive_null (acquire lock)
- Type channel_name as NotBlankStr in messenger
- Document unsubscribe as None return path in receive()
- Preserve traceback context in parallel.py re-raise
- Validate parent_task.id matches plan.parent_task_id
- Add logging before raises in routing model validators

Trivial (t1-t4):
- Use centralized event constant in routing scorer
- Add task_structure/coordination_topology to Task docstring
- Fix DESIGN_SPEC.md model/function names to match code
- Fix StructuredPhasesConfig docstring

Tests (T1-T5):
- Assert MEETING_CONTRIBUTION enum value
- Add timeout markers to all meeting test modules
- Add 3+ participant test for authority/debate strategies
- Remove dead max_tokens_per_argument test references
- Update HybridResolver tests for new hierarchy parameter

Closes #169
## Summary
- **46 findings** from external reviewers (Copilot, Greptile, CodeRabbit, Gemini) on PRs #164-#167, all addressed
- **4 critical**: Parse decisions/action_items from LLM synthesis in all 3 meeting protocols; validate `winning_agent_id` in `find_losers()`
- **17 major**: Token budget guards, duplicate participant rejection, frozen registry, hierarchy tiebreakers, dependency copying, wake-all-waiters on unsubscribe, and more
- **15 minor**: Duplicate log removal, assert→raise, traceback preservation, O(1) seniority lookups, NotBlankStr typing, routing validation ordering
- **4 trivial**: Centralized event constants, docstring/spec fixes
- **6 test/doc gaps**: New `test_parsing.py` (18 tests), expanded tests across 7 modules, timeout markers, spec name corrections
- **Pre-PR review fixes**: List-item regex crossing line boundaries, parent_task_id validation ordering, circular exception cause, dead code removal

## Test plan
- [x] All 3634 unit tests pass
- [x] Ruff lint clean
- [x] Mypy strict clean
- [x] Pre-commit hooks pass
- [ ] CI pipeline (lint + type-check + test + coverage)

Closes #169
🤖 I have created a release *beep* *boop* --- ## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider 
resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * 
implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs 
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
🤖 I have created a release *beep* *boop*

---

## [0.1.0](v0.0.0...v0.1.0) (2026-03-11)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15))
* bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e))
* bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238))
* bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e))
* bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7))
* bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Summary
Addresses post-merge bot feedback from #163, plus 21 additional findings from a 9-agent pre-PR review pipeline.
Original fixes (commit 1 — #163 feedback)
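One of the commit-1 fixes adds `logger.warning()` before every `raise` in Pydantic validators (per the CLAUDE.md rule). A minimal sketch of the pattern; `Team` is a model name from this repo, but the field, message, and logger name shown here are illustrative, not the actual implementation:

```python
import logging

from pydantic import BaseModel, field_validator

logger = logging.getLogger("company")


class Team(BaseModel):
    name: str

    @field_validator("name")
    @classmethod
    def name_not_blank(cls, value: str) -> str:
        if not value.strip():
            # Log before raising so the failure shows up in structured logs,
            # not only in the ValidationError that bubbles up to the caller.
            logger.warning("Team.name validation failed: blank name %r", value)
            raise ValueError("name must not be blank")
        return value
```

The point of the rule is that a raised validator error alone can get swallowed by a caller; the log line guarantees an observable record at the raise site.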
- Add `merge_id` to template inheritance merge key for agent disambiguation
- Wake blocked `receive()` calls via a sentinel in `MessageDispatcher`
- Add `AgentMessenger.receive()` convenience method
- Fix `in_progress` tracking inside semaphore context in `ParallelExecutor`
- Add `logger.warning()` before `raise` in Pydantic validators (CLAUDE.md rule)

Pre-PR review fixes (commit 2 — 21 findings from 9 agents)
Communication layer:
- Extract `_log_receive_null` helper (`receive()` exceeded 50-line limit)
- Distinguish unsubscribe wakeup from timeout in `receive()` logging (new event constant `COMM_RECEIVE_UNSUBSCRIBED`)
- Log `QueueFull` instead of silently suppressing in `unsubscribe()`
- Modernize `ensure_future` → `create_task` in `_await_with_shutdown`
- Widen queue type to `DeliveryEnvelope | None` (remove `type: ignore`)

Parallel execution:
- Pair `in_progress` increment/decrement in `_execute_assignment` (fix double-decrement)
- Use dedicated `PARALLEL_GROUP_SUPPRESSED` event at DEBUG for ExceptionGroup
- Add `group_id` context to progress callback error log

Template merge:
- Fix stale comment: merge key is `(role, department, merge_id)`
- Add module-level `_STRIP_KEYS` frozenset constant (single source of truth)
- Warn on missing role in `_agent_key` (prevent silent key collision)

Company validators:
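The `casefold()` normalization gives consistent case-insensitive duplicate detection (unlike `lower()`, it also handles characters such as `ß`). A standalone sketch; the helper name is hypothetical:

```python
def find_duplicate_names(names: list[str]) -> list[str]:
    """Return entries that duplicate an earlier name, compared case-insensitively."""
    seen: set[str] = set()
    dupes: list[str] = []
    for name in names:
        # casefold() is the aggressive Unicode-aware lowercasing meant for
        # caseless matching, e.g. "Straße".casefold() == "strasse".
        folded = name.casefold()
        if folded in seen:
            dupes.append(name)
        else:
            seen.add(folded)
    return dupes
```

Using one normalization everywhere (`Team`, `Department` team names, `HRRegistry`) means the same pair of names can never be a duplicate in one validator and distinct in another.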
- Normalize `Team`, `Department` team names, and `HRRegistry` to `casefold()` (consistent case-insensitive duplicate detection)

Docs:
- Add missing `__init__.py` for `tools/examples/` in project structure listing
- Fix section 3.1 YAML comment: `(M2)` → `(M3)`

Test plan
- `uv run ruff check src/ tests/` — clean
- `uv run ruff format --check src/ tests/` — clean
- `uv run mypy src/ tests/` — no issues
- `uv run pytest tests/ -n auto --cov=ai_company --cov-fail-under=80` — 3242 passed, 96.33% coverage
- New test `test_remove_targets_specific_agent_by_merge_id`
- Added `@pytest.mark.unit` to `test_register_sync_handler_raises`

Review coverage
Closes #163