feat: implement conflict resolution protocol (#122)#166
Conversation
Add pluggable conflict resolution system (DESIGN_SPEC §5.6) with 4 strategies: authority+dissent, structured debate+judge, human escalation stub, and hybrid review. Includes frozen Pydantic models, ConflictResolutionService orchestrator with audit trail, and 94 tests.
Pre-reviewed by 9 agents, 37 findings addressed. Polished by code-simplifier (extracted shared helpers, eliminated duplication).
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (8)
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds a new pluggable Conflict Resolution subsystem under communication: domain models, protocols, four resolver strategies (authority, debate, human, hybrid), an orchestration service with an audit trail and query APIs, hierarchy and seniority helpers, observability event constants, configuration models, and comprehensive unit tests. Changes
Sequence Diagram(s)sequenceDiagram
participant AgentA as Agent A
participant AgentB as Agent B
participant Service as ConflictResolutionService
participant Config as ConflictResolutionConfig
participant Resolver as Strategy Resolver
participant Hierarchy as HierarchyResolver
participant Audit as Audit Trail
participant Events as Observability
AgentA->>Service: create_conflict(positions)
Service->>Events: emit CONFLICT_DETECTED
Service->>Service: validate & create Conflict
Service->>Config: read strategy
Service->>Events: emit CONFLICT_RESOLUTION_STARTED
Service->>Resolver: resolve(conflict)
alt Authority
Resolver->>Resolver: select highest seniority
Resolver->>Hierarchy: get_lowest_common_manager (if tie)
Hierarchy-->>Resolver: lcm_id
Resolver->>Events: emit CONFLICT_AUTHORITY_DECIDED
else Debate
Resolver->>Resolver: determine judge
Resolver->>Resolver: call judge_evaluator (if present)
Resolver->>Events: emit CONFLICT_DEBATE_JUDGE_DECIDED
else Hybrid
Resolver->>Resolver: call review_evaluator
alt Clear winner
Resolver->>Events: emit CONFLICT_HYBRID_AUTO_RESOLVED
else Ambiguous
Resolver->>Events: emit CONFLICT_AMBIGUOUS_RESULT
Resolver->>Resolver: escalate to human or fallback to authority
end
else Human
Resolver->>Events: emit CONFLICT_HUMAN_ESCALATION_STUB
end
Resolver-->>Service: ConflictResolution
Service->>Resolver: build_dissent_records(conflict, resolution)
Resolver-->>Service: DissentRecord(s)
Service->>Audit: append DissentRecord(s)
Service->>Events: emit CONFLICT_RESOLVED
Service->>Events: emit CONFLICT_DISSENT_RECORDED
Service-->>AgentA: return resolution + dissent records
sequenceDiagram
participant Caller as Consumer
participant Service as ConflictResolutionService
participant Audit as Audit Trail
Caller->>Service: query_dissent_records(filters)
Service->>Audit: filter by agent_id, conflict_type, strategy, since
Audit-->>Service: matching records
Service-->>Caller: return tuple[DissentRecord]
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
✨ Simplify code
Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a comprehensive conflict resolution protocol for AI agents, enabling the system to manage disagreements through various configurable strategies. It establishes a robust framework for detecting, resolving, and auditing conflicts, enhancing the autonomy and decision-making capabilities of the AI company's agents. The implementation includes pluggable resolution strategies, a dedicated orchestration service, and detailed audit trails to ensure transparency and facilitate learning from past conflicts. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This is an impressive and comprehensive implementation of the conflict resolution protocol. The changes are well-structured, following the strategy pattern effectively with clear separation of concerns between the service, strategies, models, and configuration. The use of frozen Pydantic models and protocol-based design enhances robustness and extensibility. The addition of a detailed dissent audit trail is a great feature for organizational learning. The test coverage is extensive and thorough, covering various scenarios for each strategy. My review includes a couple of suggestions for refactoring to improve maintainability and performance by removing duplicated validation logic and optimizing a query method.
| if len(positions) < _MIN_POSITIONS: | ||
| msg = "A conflict requires at least 2 positions" | ||
| logger.warning( | ||
| CONFLICT_VALIDATION_ERROR, | ||
| error=msg, | ||
| position_count=len(positions), | ||
| ) | ||
| raise ConflictResolutionError(msg) | ||
|
|
||
| agent_ids = [p.agent_id for p in positions] | ||
| if len(agent_ids) != len(set(agent_ids)): | ||
| msg = "Duplicate agent_id in conflict positions" | ||
| logger.warning( | ||
| CONFLICT_VALIDATION_ERROR, | ||
| error=msg, | ||
| agent_ids=agent_ids, | ||
| ) | ||
| raise ConflictResolutionError(msg) | ||
|
|
||
| conflict = Conflict( | ||
| id=f"conflict-{uuid4().hex[:12]}", | ||
| type=conflict_type, | ||
| task_id=task_id, | ||
| subject=subject, | ||
| positions=tuple(positions), | ||
| detected_at=datetime.now(UTC), | ||
| ) |
There was a problem hiding this comment.
The validation logic here for checking the number of positions and for duplicate agent IDs is already implemented in the Conflict model's _validate_positions validator. Duplicating this logic in the service layer violates the DRY (Don't Repeat Yourself) principle and creates two sources of truth for validation. The service should rely on the Pydantic model to perform validation by wrapping the model instantiation in a try...except block.
| if len(positions) < _MIN_POSITIONS: | |
| msg = "A conflict requires at least 2 positions" | |
| logger.warning( | |
| CONFLICT_VALIDATION_ERROR, | |
| error=msg, | |
| position_count=len(positions), | |
| ) | |
| raise ConflictResolutionError(msg) | |
| agent_ids = [p.agent_id for p in positions] | |
| if len(agent_ids) != len(set(agent_ids)): | |
| msg = "Duplicate agent_id in conflict positions" | |
| logger.warning( | |
| CONFLICT_VALIDATION_ERROR, | |
| error=msg, | |
| agent_ids=agent_ids, | |
| ) | |
| raise ConflictResolutionError(msg) | |
| conflict = Conflict( | |
| id=f"conflict-{uuid4().hex[:12]}", | |
| type=conflict_type, | |
| task_id=task_id, | |
| subject=subject, | |
| positions=tuple(positions), | |
| detected_at=datetime.now(UTC), | |
| ) | |
| try: | |
| conflict = Conflict( | |
| id=f"conflict-{uuid4().hex[:12]}", | |
| type=conflict_type, | |
| task_id=task_id, | |
| subject=subject, | |
| positions=tuple(positions), | |
| detected_at=datetime.now(UTC), | |
| ) | |
| except ValueError as e: | |
| logger.warning( | |
| CONFLICT_VALIDATION_ERROR, | |
| error=str(e), | |
| subject=subject, | |
| agent_ids=[p.agent_id for p in positions], | |
| ) | |
| raise ConflictResolutionError(str(e)) from e |
| results = self._audit_trail | ||
|
|
||
| if agent_id is not None: | ||
| results = [r for r in results if r.dissenting_agent_id == agent_id] | ||
| if conflict_type is not None: | ||
| results = [r for r in results if r.conflict.type == conflict_type] | ||
| if strategy is not None: | ||
| results = [r for r in results if r.strategy_used == strategy] | ||
| if since is not None: | ||
| results = [r for r in results if r.timestamp >= since] |
There was a problem hiding this comment.
This query implementation can be inefficient. It iterates through the _audit_trail list up to four times, once for each potential filter. For a large audit trail, this can impact performance. This can be optimized by using a single list comprehension or a generator expression that applies all filters in a single pass.
| results = self._audit_trail | |
| if agent_id is not None: | |
| results = [r for r in results if r.dissenting_agent_id == agent_id] | |
| if conflict_type is not None: | |
| results = [r for r in results if r.conflict.type == conflict_type] | |
| if strategy is not None: | |
| results = [r for r in results if r.strategy_used == strategy] | |
| if since is not None: | |
| results = [r for r in results if r.timestamp >= since] | |
| results = [ | |
| r | |
| for r in self._audit_trail | |
| if (agent_id is None or r.dissenting_agent_id == agent_id) | |
| and (conflict_type is None or r.conflict.type == conflict_type) | |
| and (strategy is None or r.strategy_used == strategy) | |
| and (since is None or r.timestamp >= since) | |
| ] |
Greptile SummaryThis PR implements the full conflict resolution protocol (DESIGN_SPEC §5.6) within the existing Key findings from the review:
Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Caller
participant CRS as ConflictResolutionService
participant Resolver as ConflictResolver<br/>(Authority/Debate/Human/Hybrid)
participant HierarchyResolver
participant JudgeEvaluator
Caller->>CRS: create_conflict(type, subject, positions)
CRS->>CRS: validate positions (≥2, unique IDs)
CRS->>CRS: generate conflict ID
CRS-->>Caller: Conflict
Caller->>CRS: resolve(conflict)
CRS->>CRS: lookup resolver by config.strategy
CRS->>Resolver: resolve(conflict)
alt AuthorityResolver
Resolver->>HierarchyResolver: get_lowest_common_manager(a, b)
HierarchyResolver-->>Resolver: lcm
Resolver->>HierarchyResolver: get_delegation_depth(lcm, agent)
HierarchyResolver-->>Resolver: depth
Resolver-->>CRS: ConflictResolution(RESOLVED_BY_AUTHORITY)
else DebateResolver (with judge)
Resolver->>HierarchyResolver: get_lowest_common_manager() iterative
HierarchyResolver-->>Resolver: judge_id
Resolver->>JudgeEvaluator: evaluate(conflict, judge_id)
JudgeEvaluator-->>Resolver: JudgeDecision(winner, reasoning)
Resolver-->>CRS: ConflictResolution(RESOLVED_BY_DEBATE)
else HumanEscalationResolver
Resolver-->>CRS: ConflictResolution(ESCALATED_TO_HUMAN)
else HybridResolver
Resolver->>JudgeEvaluator: evaluate(conflict, review_agent)
JudgeEvaluator-->>Resolver: JudgeDecision
alt winner in positions
Resolver-->>CRS: ConflictResolution(RESOLVED_BY_HYBRID)
else ambiguous
Resolver->>Resolver: escalate_on_ambiguity?
Resolver-->>CRS: ConflictResolution(ESCALATED_TO_HUMAN or RESOLVED_BY_HYBRID)
end
end
CRS->>Resolver: build_dissent_records(conflict, resolution)
Resolver-->>CRS: tuple[DissentRecord, ...] (one per loser)
CRS->>CRS: _audit_trail.extend(dissent_records)
CRS-->>Caller: (ConflictResolution, tuple[DissentRecord, ...])
Last reviewed commit: 6164d92 |
There was a problem hiding this comment.
Pull request overview
Implements the DESIGN_SPEC §5.6 conflict resolution protocol by adding a pluggable conflict-resolution subsystem (strategies + orchestrating service), structured observability events, and comprehensive unit tests, while updating core communication config/docs to expose the feature.
Changes:
- Added
communication/conflict_resolution/package with frozen Pydantic models, resolver protocols, an orchestrator service, and four strategies (authority, debate, human escalation stub, hybrid). - Extended hierarchy utilities (LCM lookup) and core seniority comparison to support strategy logic.
- Added conflict observability events and expanded unit test coverage across models, service behavior, strategies, and event constants.
Reviewed changes
Copilot reviewed 32 out of 33 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/observability/test_events.py | Asserts new conflict.* event constants and module discovery. |
| tests/unit/core/test_enums.py | Adds unit tests for new compare_seniority() helper. |
| tests/unit/communication/delegation/test_hierarchy.py | Adds tests for HierarchyResolver.get_lowest_common_manager(). |
| tests/unit/communication/conftest.py | Adds Polyfactory factories for conflict-resolution config models. |
| tests/unit/communication/conflict_resolution/test_service.py | Tests ConflictResolutionService creation, dispatch, and audit trail querying. |
| tests/unit/communication/conflict_resolution/test_models.py | Tests frozen/validation behavior of conflict domain models. |
| tests/unit/communication/conflict_resolution/test_hybrid_strategy.py | Tests hybrid strategy auto-resolve vs ambiguity fallback/escalation behavior. |
| tests/unit/communication/conflict_resolution/test_human_strategy.py | Tests human escalation stub resolution + dissent record metadata. |
| tests/unit/communication/conflict_resolution/test_debate_strategy.py | Tests debate strategy judge selection, fallback, and error cases. |
| tests/unit/communication/conflict_resolution/test_config.py | Tests defaults/validation/immutability for conflict-resolution config. |
| tests/unit/communication/conflict_resolution/test_authority_strategy.py | Tests authority strategy seniority + hierarchy tiebreak behavior. |
| tests/unit/communication/conflict_resolution/conftest.py | Provides fixtures/factories for conflict domain objects and hierarchy. |
| tests/unit/communication/conflict_resolution/init.py | Test package marker for conflict-resolution tests. |
| src/ai_company/observability/events/conflict.py | Introduces CONFLICT_* structured logging event constants. |
| src/ai_company/core/enums.py | Adds compare_seniority() helper used by strategies. |
| src/ai_company/communication/errors.py | Adds conflict-resolution error types to communication error hierarchy. |
| src/ai_company/communication/enums.py | Adds ConflictType and ConflictResolutionStrategy enums. |
| src/ai_company/communication/delegation/hierarchy.py | Adds get_lowest_common_manager() for cross-team/department resolution logic. |
| src/ai_company/communication/conflict_resolution/service.py | Adds ConflictResolutionService orchestrator + audit trail querying. |
| src/ai_company/communication/conflict_resolution/protocol.py | Defines ConflictResolver and JudgeEvaluator protocols. |
| src/ai_company/communication/conflict_resolution/models.py | Adds frozen domain models: Conflict*, DissentRecord, outcomes. |
| src/ai_company/communication/conflict_resolution/hybrid_strategy.py | Implements hybrid review → escalation/authority fallback strategy. |
| src/ai_company/communication/conflict_resolution/human_strategy.py | Implements human escalation stub strategy returning escalated outcome. |
| src/ai_company/communication/conflict_resolution/debate_strategy.py | Implements debate + judge strategy with LCM-based judge selection. |
| src/ai_company/communication/conflict_resolution/config.py | Adds config models for selecting and tuning conflict strategies. |
| src/ai_company/communication/conflict_resolution/authority_strategy.py | Implements authority (seniority + hierarchy tiebreak) strategy. |
| src/ai_company/communication/conflict_resolution/_helpers.py | Adds shared helper utilities for strategies (position lookup, seniority pick). |
| src/ai_company/communication/conflict_resolution/init.py | Exposes conflict-resolution public API types/services (not strategies). |
| src/ai_company/communication/config.py | Adds conflict_resolution to top-level communication config model. |
| src/ai_company/communication/init.py | Re-exports conflict-resolution APIs and strategy classes at package level. |
| README.md | Documents conflict resolution as a headline feature. |
| DESIGN_SPEC.md | Updates current-state + directory listing + conventions to reflect implementation. |
| CLAUDE.md | Updates repo guidance to mention conflict events as an example domain. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| reasoning=( | ||
| f"Authority decision: {winner.agent_id} " | ||
| f"({winner.agent_level}) outranks " | ||
| f"{non_winners[0].agent_id} ({non_winners[0].agent_level})" | ||
| ), |
There was a problem hiding this comment.
The reasoning string always says the winner “outranks” non_winners[0]. This is incorrect when seniority is tied and the hierarchy tiebreaker decides, and it’s also misleading for N-party conflicts (it only mentions one loser). Generate reasoning that matches the actual decision rule used (seniority vs hierarchy) and avoid hard-coding a single loser in the message.
| conflict = Conflict( | ||
| id=f"conflict-{uuid4().hex[:12]}", | ||
| type=conflict_type, | ||
| task_id=task_id, | ||
| subject=subject, |
There was a problem hiding this comment.
create_conflict() can raise a Pydantic ValidationError during Conflict(...) construction (e.g., blank subject/task_id). The docstring promises ConflictResolutionError for invalid inputs and you already log CONFLICT_VALIDATION_ERROR for manual checks; consider catching ValidationError, logging CONFLICT_VALIDATION_ERROR with details, and re-raising ConflictResolutionError for a consistent service API (similar to DelegationService).
| logger.info( | ||
| CONFLICT_RESOLVED, | ||
| conflict_id=conflict.id, | ||
| outcome=resolution.outcome, | ||
| winning_agent_id=resolution.winning_agent_id, |
There was a problem hiding this comment.
When resolution.outcome == ESCALATED_TO_HUMAN, the service logs CONFLICT_RESOLVED but never emits the dedicated CONFLICT_ESCALATED lifecycle event (which is defined/tested). Consider logging CONFLICT_ESCALATED (INFO) on escalations so they can be tracked separately from successful resolutions.
| def get_lowest_common_manager( | ||
| self, | ||
| agent_a: str, | ||
| agent_b: str, | ||
| ) -> str | None: |
There was a problem hiding this comment.
get_lowest_common_manager() returns the supervisor (or another ancestor) when agent_a == agent_b, because get_ancestors() does not include the agent itself. Add a fast-path if agent_a == agent_b: return agent_a so LCM(id, id) behaves as expected.
| winning_agent_id, reasoning = self._authority_fallback(conflict) | ||
|
|
||
| winning_pos = find_position_or_raise(conflict, winning_agent_id) | ||
|
|
||
| logger.info( |
There was a problem hiding this comment.
If find_position_or_raise() raises ConflictStrategyError (e.g., judge returns a non-participant), this error path is currently unlogged. Catch ConflictStrategyError around this call, log CONFLICT_STRATEGY_ERROR with conflict_id + returned_winner, then re-raise to comply with the “log before raising” convention.
| best = pick_highest_seniority(conflict) | ||
| return ( | ||
| best.agent_id, | ||
| ( | ||
| f"Debate fallback: authority-based judging — " | ||
| f"{best.agent_id} ({best.agent_level}) has highest " | ||
| f"seniority" |
There was a problem hiding this comment.
The debate strategy’s authority fallback uses pick_highest_seniority() only. If the top seniority is tied, this becomes “first position wins”, which diverges from the Authority strategy (which uses hierarchy proximity as a tiebreaker). Consider reusing the AuthorityResolver tie-break logic (or factoring it into a shared helper) for fallback decisions.
| best = pick_highest_seniority(conflict) | |
| return ( | |
| best.agent_id, | |
| ( | |
| f"Debate fallback: authority-based judging — " | |
| f"{best.agent_id} ({best.agent_level}) has highest " | |
| f"seniority" | |
| positions = list(conflict.positions) | |
| if not positions: | |
| raise ValueError("Conflict has no positions to resolve.") | |
| # Determine highest seniority among all positions. | |
| max_level = max(p.agent_level for p in positions) | |
| top_positions = [p for p in positions if p.agent_level == max_level] | |
| if len(top_positions) == 1: | |
| best = top_positions[0] | |
| tie_note = "" | |
| else: | |
| # Explicit, deterministic tie-breaker that does not depend on | |
| # the original ordering of ``conflict.positions``. | |
| top_positions.sort(key=lambda p: p.agent_id) | |
| best = top_positions[0] | |
| tie_note = " (tie on seniority; broken by agent_id ordering)" | |
| return ( | |
| best.agent_id, | |
| ( | |
| "Debate fallback: authority-based judging — " | |
| f"{best.agent_id} ({best.agent_level}) has highest " | |
| f"seniority{tie_note}" |
|
|
||
| if self._config.escalate_on_ambiguity: | ||
| return await self._human_resolver.resolve(conflict) | ||
|
|
There was a problem hiding this comment.
In the ambiguous-review branch (escalate_on_ambiguity=False), the resolver falls back to _authority_fallback() without emitting CONFLICT_AUTHORITY_FALLBACK, even though the helper’s docstring says callers should log that event before calling. Add a logger.warning(CONFLICT_AUTHORITY_FALLBACK, ...) (reason like "ambiguous_review") before returning the fallback resolution.
| logger.warning( | |
| CONFLICT_AUTHORITY_FALLBACK, | |
| conflict_id=conflict.id, | |
| strategy="hybrid", | |
| reason="ambiguous_review", | |
| ) |
| Returns: | ||
| Integer indicating relative seniority. | ||
| """ | ||
| return _SENIORITY_ORDER.index(a) - _SENIORITY_ORDER.index(b) |
There was a problem hiding this comment.
compare_seniority() does two linear .index() lookups on every call. Since this is used inside resolution loops, precompute a dict[SeniorityLevel, int] rank map once and do rank[a] - rank[b] for O(1) comparisons.
| for pos in conflict.positions: | ||
| if pos.agent_id != resolution.winning_agent_id: | ||
| return pos | ||
| msg = f"No losing position found for winner {resolution.winning_agent_id!r}" | ||
| raise ConflictStrategyError( |
There was a problem hiding this comment.
find_loser() will return the first position if resolution.winning_agent_id is not present in conflict.positions (because every position is “not the winner”). That masks data integrity bugs and can produce an incorrect dissent record. Consider first verifying the winner exists and raising ConflictStrategyError if it doesn’t.
| best = pick_highest_seniority(conflict) | ||
|
|
There was a problem hiding this comment.
The hybrid strategy’s _authority_fallback() uses pick_highest_seniority() only. If multiple participants share the highest seniority, this will pick the first position arbitrarily rather than applying the Authority strategy’s hierarchy tiebreaker. Consider reusing the full Authority resolution logic for a deterministic, consistent fallback.
| best = pick_highest_seniority(conflict) | |
| # Determine the highest seniority among all positions. | |
| max_level = max(pos.agent_level for pos in conflict.positions) | |
| highest_positions = [ | |
| pos for pos in conflict.positions if pos.agent_level == max_level | |
| ] | |
| if len(highest_positions) == 1: | |
| best = highest_positions[0] | |
| else: | |
| # Deterministic tiebreaker among equal-seniority participants. | |
| # Sort by agent_id so that the same set of inputs always yields | |
| # the same winner, instead of relying on list order. | |
| highest_positions.sort(key=lambda p: p.agent_id) | |
| best = highest_positions[0] |
There was a problem hiding this comment.
Actionable comments posted: 17
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/communication/conflict_resolution/_helpers.py`:
- Around line 1-11: This module lacks a logger and doesn't log error context
before raising; import get_logger from ai_company.observability, instantiate
logger = get_logger(__name__), and in both find_loser and find_position_or_raise
log a warning (logger.warning) with CONFLICT_STRATEGY_ERROR, include
conflict_id=conflict.id (or conflict_id/contextual id) and error=msg (the
constructed error message) immediately before raising ConflictStrategyError so
every error path emits a WARNING with context.
In `@src/ai_company/communication/conflict_resolution/authority_strategy.py`:
- Around line 53-94: The reasoning text in resolve (method resolve in
authority_strategy.py) only mentions non_winners[0] which is misleading for
N-party conflicts; update the reasoning construction in the ConflictResolution
returned by resolve to list all losing agent IDs and their levels (use
non_winners and their agent_level fields) or otherwise summarize them, so the
reasoning accurately reflects multiple losers; ensure you still set
winning_agent_id, winning_position, decided_by and outcome
ConflictResolutionOutcome.RESOLVED_BY_AUTHORITY unchanged.
In `@src/ai_company/communication/conflict_resolution/config.py`:
- Around line 20-24: max_tokens_per_argument in config.py is unused; either
remove it or wire it into the evaluation/strategy flow — update
JudgeEvaluator.evaluate to accept a token_budget (or budget_kwargs) and have
DebateStrategy and HybridStrategy consume that budget when generating/truncating
arguments; specifically, add a parameter to JudgeEvaluator.evaluate, propagate
it into the methods on DebateStrategy and HybridStrategy (e.g.,
DebateStrategy.generate_arguments / HybridStrategy.generate_arguments or
equivalent entrypoints), and enforce the cap by truncating or limiting tokens
per argument using max_tokens_per_argument from the config class; ensure all
call sites are updated to pass the config value so the cap is actually enforced.
In `@src/ai_company/communication/conflict_resolution/debate_strategy.py`:
- Around line 86-123: The code currently calls _determine_judge() before
checking _judge_evaluator, causing decided_by to be set even when
_authority_fallback is used; change the flow so you only call _determine_judge()
when _judge_evaluator is present (or otherwise ensure decided_by is cleared for
the fallback path): move or delay the call to _determine_judge() until after the
_judge_evaluator null check, use _judge_evaluator.evaluate(conflict, judge_id)
when present, and when you call _authority_fallback(conflict) set decided_by to
None (or another explicit “no judge” sentinel) in the returned
ConflictResolution to avoid misattributing the decision.
In `@src/ai_company/communication/conflict_resolution/human_strategy.py`:
- Around line 66-94: The build_dissent_record function currently assigns
conflict.positions[0] as the dissenter which fabricates a losing position for
escalated human reviews; update build_dissent_record (and any callers) to detect
when resolution.strategy == ConflictResolutionStrategy.HUMAN or resolution
indicates ESCALATED_TO_HUMAN and either (a) defer creating a DissentRecord until
the human decision exists, or (b) create a DissentRecord with
dissenting_agent_id and dissenting_position set to None and add clear metadata
like ("pending_human_review","true") so the record accurately represents a
pending human review instead of a loss; ensure any code expecting dissenting_*
fields handles nullable values or the deferred creation flow.
- Around line 37-64: The resolve() method in human_strategy.py currently logs
CONFLICT_HUMAN_ESCALATION_STUB and returns a ConflictResolution with outcome
ESCALATED_TO_HUMAN but does not enqueue or persist any pending human decision
nor park the related task; update resolve(conflict: Conflict) to (1) create and
persist a HumanDecision (or PendingDecision) record linked to conflict.id and
conflict.subject, (2) push a queue item into the human review queue
service/adapter (use your project’s queue client) with reference to the pending
decision id, and (3) mark or park the associated task/work item as
awaiting_human (or set a stalled state) so it won’t be dropped; ensure the
returned ConflictResolution includes the pending decision reference and that
resolve logs the created pending decision id in addition to
CONFLICT_HUMAN_ESCALATION_STUB so the human-queue integration is actionable.
In `@src/ai_company/communication/conflict_resolution/hybrid_strategy.py`:
- Around line 185-214: _authority_fallback currently calls
pick_highest_seniority directly and ignores self._hierarchy, causing different
winners than AuthorityResolver for ties or cross-department cases; change
_authority_fallback to reuse the same authority-path logic by delegating to the
existing AuthorityResolver (or the same selection helper it uses) so it consults
self._hierarchy for tie-breaking and department rules instead of calling
pick_highest_seniority directly; locate _authority_fallback, AuthorityResolver
(or the authority selection helper) and replace the direct call to
pick_highest_seniority with a call into the AuthorityResolver API or the shared
selection function that honors self._hierarchy, returning a ConflictResolution
constructed from that authoritative winner and preserving the existing metadata
fields (decided_by, reasoning, resolved_at).
- Around line 160-172: The code currently constructs a DissentRecord inline when
resolution.outcome == ConflictResolutionOutcome.ESCALATED_TO_HUMAN, which
discards escalation-specific metadata; instead call into the human escalation
handler and return its dissent record. Replace the inline DissentRecord
construction with a delegation to the HumanEscalationResolver (or the existing
human_resolver instance/function) — e.g., invoke the resolver's method that
creates/returns a DissentRecord (pass in conflict and resolution, and any
context/metadata) and return that result so approval-queue IDs and escalation
metadata are preserved.
In `@src/ai_company/communication/conflict_resolution/protocol.py`:
- Around line 18-50: The ConflictResolver protocol currently only exposes
resolve and build_dissent_record; add the missing extension points
detect_conflict and log_dissent to match the required contract so strategies can
handle detection and dissent persistence. Specifically, add a detect_conflict
method (sync or async as your strategy implementations expect) that accepts the
input/context needed to determine a Conflict and return a Conflict or
Optional[Conflict], and add a log_dissent method that accepts the Conflict and
ConflictResolution (or a DissentRecord) and persists/logs the losing reasoning;
keep or adapt build_dissent_record as a helper but ensure log_dissent is part of
the protocol surface so callers can rely on strategy implementations for
persistence. Ensure method names match detect_conflict and log_dissent alongside
the existing resolve in the ConflictResolver class.
In `@src/ai_company/communication/conflict_resolution/service.py`:
- Around line 153-165: The dispatch logic in resolve() always uses
self._config.strategy so per-conflict or per-department overrides are ignored;
change resolve() to determine the active strategy by first checking for a
conflict-specific override (e.g., a strategy field on the conflict object or an
overrides/context argument), then falling back to self._config.strategy, and
look up the resolver from self._resolvers using that active strategy; if no
resolver is found keep the existing logging (CONFLICT_NO_RESOLVER, strategy=...)
and raise ConflictResolutionError with context={"strategy": active_strategy} to
preserve current error behavior while honoring overrides.
In `@src/ai_company/communication/delegation/hierarchy.py`:
- Around line 212-243: The function get_lowest_common_manager incorrectly treats
identical inputs as if one were only an ancestor; before calling get_ancestors
or performing ancestor checks, add an explicit equality check if agent_a ==
agent_b and return agent_a immediately. Keep the rest of the logic (using
get_ancestors and ancestors_b_set) unchanged so identical-agent cases return the
agent itself rather than its supervisor.
In `@tests/unit/communication/conflict_resolution/test_authority_strategy.py`:
- Around line 24-229: Add a new async unit test that exercises N>2 participant
resolution: instantiate AuthorityResolver, create a conflict with at least three
make_position entries (mixed SeniorityLevel values and departments to exercise
LCM logic), call resolver.resolve(conflict) and assert the final
resolution.winning_agent_id and resolution.outcome (e.g.,
ConflictResolutionOutcome.RESOLVED_BY_AUTHORITY), then call
resolver.build_dissent_record(conflict, resolution) and assert the record
contains the losing agent(s) info (dissenting_agent_id, dissenting_position) and
strategy_used == ConflictResolutionStrategy.AUTHORITY; place the test alongside
the existing TestAuthorityResolverSeniority or
TestAuthorityResolverDissentRecord tests so it runs with the other authority
tests.
In `@tests/unit/communication/conflict_resolution/test_config.py`:
- Around line 17-62: Add negative tests that assert ValidationError is raised
when identifier fields accept blank/whitespace-only strings: for DebateConfig,
call DebateConfig(judge="") and DebateConfig(judge=" ") inside
pytest.raises(ValidationError); for HybridConfig, call
HybridConfig(review_agent="") and HybridConfig(review_agent=" ") inside
pytest.raises(ValidationError). Reference the DebateConfig.judge and
HybridConfig.review_agent fields (and NotBlankStr expectation) so these tests
fail if the types are relaxed to plain str.
In `@tests/unit/communication/conflict_resolution/test_hybrid_strategy.py`:
- Around line 27-39: The tests do not verify that HybridConfig.review_agent is
forwarded to the evaluator; update FakeReviewEvaluator to record the received
judge_agent_id (e.g., add an attribute like last_judge_agent_id set inside
evaluate) and add at least one test case that sets a non-default
HybridConfig.review_agent value and asserts that
FakeReviewEvaluator.last_judge_agent_id equals that exact ID after the resolver
runs; update other related scenarios (lines 42–147) to use or assert the
recorded value where appropriate so the test fails if the review agent is not
forwarded.
In `@tests/unit/communication/conflict_resolution/test_service.py`:
- Around line 357-399: The test_query_by_strategy currently creates the
HUMAN-resolution record on service_h so querying
service.query_dissent_records(strategy=ConflictResolutionStrategy.AUTHORITY) can
pass even if the strategy filter is ignored; fix by ensuring both records live
in the same service before querying (e.g., create both conflict_a and conflict_h
using the same service instance or call service.create_conflict for the HUMAN
record) and then call
service.query_dissent_records(strategy=ConflictResolutionStrategy.AUTHORITY) and
assert that only the AUTHORITY record is returned; reference
test_query_by_strategy, service, service_h, create_conflict,
query_dissent_records, and ConflictResolutionStrategy.AUTHORITY.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ef5f8cff-ac49-42e3-879f-07891b9dd186
📒 Files selected for processing (33)
CLAUDE.mdDESIGN_SPEC.mdREADME.mdsrc/ai_company/communication/__init__.pysrc/ai_company/communication/config.pysrc/ai_company/communication/conflict_resolution/__init__.pysrc/ai_company/communication/conflict_resolution/_helpers.pysrc/ai_company/communication/conflict_resolution/authority_strategy.pysrc/ai_company/communication/conflict_resolution/config.pysrc/ai_company/communication/conflict_resolution/debate_strategy.pysrc/ai_company/communication/conflict_resolution/human_strategy.pysrc/ai_company/communication/conflict_resolution/hybrid_strategy.pysrc/ai_company/communication/conflict_resolution/models.pysrc/ai_company/communication/conflict_resolution/protocol.pysrc/ai_company/communication/conflict_resolution/service.pysrc/ai_company/communication/delegation/hierarchy.pysrc/ai_company/communication/enums.pysrc/ai_company/communication/errors.pysrc/ai_company/core/enums.pysrc/ai_company/observability/events/conflict.pytests/unit/communication/conflict_resolution/__init__.pytests/unit/communication/conflict_resolution/conftest.pytests/unit/communication/conflict_resolution/test_authority_strategy.pytests/unit/communication/conflict_resolution/test_config.pytests/unit/communication/conflict_resolution/test_debate_strategy.pytests/unit/communication/conflict_resolution/test_human_strategy.pytests/unit/communication/conflict_resolution/test_hybrid_strategy.pytests/unit/communication/conflict_resolution/test_models.pytests/unit/communication/conflict_resolution/test_service.pytests/unit/communication/conftest.pytests/unit/communication/delegation/test_hierarchy.pytests/unit/core/test_enums.pytests/unit/observability/test_events.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not usefrom __future__ import annotationsin Python 3.14+ code — Python 3.14 has native PEP 649 lazy annotations
Useexcept A, B:(no parentheses) for exception syntax in Python 3.14 — PEP 758 except syntax enforced by ruff
All public functions and classes must have type hints; mypy strict mode is enforced
All public classes and functions must have Google-style docstrings; ruff D rules enforce this
Create new objects instead of mutating existing ones; use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement of non-Pydantic internal collections (registries, BaseTool)
For non-Pydantic internal collections and dict/list fields in frozen Pydantic models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use@computed_fieldfor derived values instead of storing redundant fields
Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly; never silently swallow errors
Validate at system boundaries (user input, external APIs, config files)
Line length must be 88 characters (enforced by ruff)
Files:
tests/unit/communication/conflict_resolution/test_models.pytests/unit/communication/conflict_resolution/test_human_strategy.pysrc/ai_company/communication/conflict_resolution/config.pytests/unit/communication/conflict_resolution/test_hybrid_strategy.pysrc/ai_company/communication/conflict_resolution/human_strategy.pytests/unit/observability/test_events.pysrc/ai_company/communication/conflict_resolution/__init__.pytests/unit/communication/conflict_resolution/test_config.pytests/unit/communication/conflict_resolution/test_service.pysrc/ai_company/communication/enums.pysrc/ai_company/communication/conflict_resolution/service.pysrc/ai_company/communication/config.pytests/unit/communication/conflict_resolution/conftest.pysrc/ai_company/communication/conflict_resolution/models.pysrc/ai_company/communication/__init__.pysrc/ai_company/communication/conflict_resolution/hybrid_strategy.pysrc/ai_company/core/enums.pytests/unit/communication/conflict_resolution/test_debate_strategy.pysrc/ai_company/communication/conflict_resolution/_helpers.pysrc/ai_company/communication/conflict_resolution/authority_strategy.pytests/unit/core/test_enums.pytests/unit/communication/conflict_resolution/test_authority_strategy.pysrc/ai_company/communication/delegation/hierarchy.pysrc/ai_company/communication/errors.pysrc/ai_company/communication/conflict_resolution/debate_strategy.pytests/unit/communication/conftest.pysrc/ai_company/observability/events/conflict.pysrc/ai_company/communication/conflict_resolution/protocol.pytests/unit/communication/delegation/test_hierarchy.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e, or@pytest.mark.slowas appropriate
Prefer@pytest.mark.parametrizefor testing similar cases
Use vendor-agnostic test names: test-provider, test-small-001, etc. instead of real vendor names
Files:
tests/unit/communication/conflict_resolution/test_models.pytests/unit/communication/conflict_resolution/test_human_strategy.pytests/unit/communication/conflict_resolution/test_hybrid_strategy.pytests/unit/observability/test_events.pytests/unit/communication/conflict_resolution/test_config.pytests/unit/communication/conflict_resolution/test_service.pytests/unit/communication/conflict_resolution/conftest.pytests/unit/communication/conflict_resolution/test_debate_strategy.pytests/unit/core/test_enums.pytests/unit/communication/conflict_resolution/test_authority_strategy.pytests/unit/communication/conftest.pytests/unit/communication/delegation/test_hierarchy.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic must importfrom ai_company.observability import get_loggerand createlogger = get_logger(__name__)
Never useimport logging/logging.getLogger()/print()in application code
Always use variable namelogger(not_logger, notlog) for the logger instance
Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
Always use structured logging with kwargs:logger.info(EVENT, key=value)— never use formatted strings likelogger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
DEBUG logging is for object creation, internal flow, and entry/exit of key functions
Files:
src/ai_company/communication/conflict_resolution/config.pysrc/ai_company/communication/conflict_resolution/human_strategy.pysrc/ai_company/communication/conflict_resolution/__init__.pysrc/ai_company/communication/enums.pysrc/ai_company/communication/conflict_resolution/service.pysrc/ai_company/communication/config.pysrc/ai_company/communication/conflict_resolution/models.pysrc/ai_company/communication/__init__.pysrc/ai_company/communication/conflict_resolution/hybrid_strategy.pysrc/ai_company/core/enums.pysrc/ai_company/communication/conflict_resolution/_helpers.pysrc/ai_company/communication/conflict_resolution/authority_strategy.pysrc/ai_company/communication/delegation/hierarchy.pysrc/ai_company/communication/errors.pysrc/ai_company/communication/conflict_resolution/debate_strategy.pysrc/ai_company/observability/events/conflict.pysrc/ai_company/communication/conflict_resolution/protocol.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases
Files:
src/ai_company/communication/conflict_resolution/config.pysrc/ai_company/communication/conflict_resolution/human_strategy.pysrc/ai_company/communication/conflict_resolution/__init__.pysrc/ai_company/communication/enums.pysrc/ai_company/communication/conflict_resolution/service.pysrc/ai_company/communication/config.pysrc/ai_company/communication/conflict_resolution/models.pysrc/ai_company/communication/__init__.pysrc/ai_company/communication/conflict_resolution/hybrid_strategy.pysrc/ai_company/core/enums.pysrc/ai_company/communication/conflict_resolution/_helpers.pysrc/ai_company/communication/conflict_resolution/authority_strategy.pysrc/ai_company/communication/delegation/hierarchy.pysrc/ai_company/communication/errors.pysrc/ai_company/communication/conflict_resolution/debate_strategy.pysrc/ai_company/observability/events/conflict.pysrc/ai_company/communication/conflict_resolution/protocol.py
🧠 Learnings (11)
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields
Applied to files:
src/ai_company/communication/conflict_resolution/config.pysrc/ai_company/communication/conflict_resolution/models.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
Applied to files:
tests/unit/observability/test_events.pyDESIGN_SPEC.mdCLAUDE.mdsrc/ai_company/observability/events/conflict.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to **/*.py : Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use computed_field for derived values instead of storing redundant fields
Applied to files:
src/ai_company/communication/config.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic must import `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use structured logging with kwargs: `logger.info(EVENT, key=value)` — never use formatted strings like `logger.info("msg %s", val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use variable name `logger` (not `_logger`, not `log`) for the logger instance
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : All state transitions must log at INFO level
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : DEBUG logging is for object creation, internal flow, and entry/exit of key functions
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/providers/**/*.py : RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig
Applied to files:
tests/unit/communication/conftest.py
🧬 Code graph analysis (14)
tests/unit/communication/conflict_resolution/test_human_strategy.py (3)
src/ai_company/communication/conflict_resolution/human_strategy.py (3)
HumanEscalationResolver(26-94)resolve(37-64)build_dissent_record(66-94)src/ai_company/communication/conflict_resolution/models.py (1)
ConflictResolutionOutcome(29-42)src/ai_company/communication/enums.py (1)
ConflictResolutionStrategy(95-108)
src/ai_company/communication/conflict_resolution/config.py (3)
src/ai_company/communication/enums.py (1)
ConflictResolutionStrategy(95-108)src/ai_company/tools/base.py (1)
description(115-117)src/ai_company/engine/shutdown.py (1)
strategy(352-354)
src/ai_company/communication/conflict_resolution/human_strategy.py (2)
src/ai_company/communication/conflict_resolution/models.py (4)
Conflict(67-111)ConflictResolution(114-166)ConflictResolutionOutcome(29-42)DissentRecord(169-203)src/ai_company/communication/enums.py (1)
ConflictResolutionStrategy(95-108)
tests/unit/communication/conflict_resolution/test_config.py (2)
src/ai_company/communication/conflict_resolution/config.py (3)
ConflictResolutionConfig(58-80)DebateConfig(9-28)HybridConfig(31-55)src/ai_company/communication/enums.py (1)
ConflictResolutionStrategy(95-108)
src/ai_company/communication/conflict_resolution/service.py (8)
src/ai_company/communication/conflict_resolution/config.py (1)
ConflictResolutionConfig(58-80)src/ai_company/communication/conflict_resolution/models.py (5)
Conflict(67-111)ConflictPosition(45-64)ConflictResolution(114-166)DissentRecord(169-203)is_cross_department(97-99)src/ai_company/communication/conflict_resolution/protocol.py (3)
ConflictResolver(18-50)resolve(25-34)build_dissent_record(36-50)src/ai_company/communication/enums.py (2)
ConflictResolutionStrategy(95-108)ConflictType(75-92)src/ai_company/communication/conflict_resolution/authority_strategy.py (2)
resolve(53-94)build_dissent_record(96-119)src/ai_company/communication/conflict_resolution/debate_strategy.py (2)
resolve(72-125)build_dissent_record(127-151)src/ai_company/communication/conflict_resolution/hybrid_strategy.py (2)
resolve(84-144)build_dissent_record(146-183)src/ai_company/communication/conflict_resolution/human_strategy.py (2)
resolve(37-64)build_dissent_record(66-94)
src/ai_company/communication/config.py (1)
src/ai_company/communication/conflict_resolution/config.py (1)
ConflictResolutionConfig(58-80)
src/ai_company/communication/__init__.py (5)
src/ai_company/communication/conflict_resolution/models.py (5)
Conflict(67-111)ConflictPosition(45-64)ConflictResolution(114-166)ConflictResolutionOutcome(29-42)DissentRecord(169-203)src/ai_company/communication/conflict_resolution/debate_strategy.py (1)
DebateResolver(46-239)src/ai_company/communication/conflict_resolution/human_strategy.py (1)
HumanEscalationResolver(26-94)src/ai_company/communication/enums.py (2)
ConflictResolutionStrategy(95-108)ConflictType(75-92)src/ai_company/communication/errors.py (3)
ConflictHierarchyError(111-112)ConflictResolutionError(103-104)ConflictStrategyError(107-108)
src/ai_company/communication/conflict_resolution/_helpers.py (3)
src/ai_company/communication/conflict_resolution/models.py (3)
Conflict(67-111)ConflictPosition(45-64)ConflictResolution(114-166)src/ai_company/communication/errors.py (1)
ConflictStrategyError(107-108)src/ai_company/core/enums.py (1)
compare_seniority(27-40)
src/ai_company/communication/conflict_resolution/authority_strategy.py (5)
src/ai_company/communication/conflict_resolution/_helpers.py (1)
find_loser(14-38)src/ai_company/communication/delegation/hierarchy.py (2)
get_lowest_common_manager(212-243)get_delegation_depth(245-267)src/ai_company/communication/enums.py (1)
ConflictResolutionStrategy(95-108)src/ai_company/communication/errors.py (1)
ConflictHierarchyError(111-112)src/ai_company/core/enums.py (1)
compare_seniority(27-40)
tests/unit/core/test_enums.py (1)
src/ai_company/core/enums.py (2)
compare_seniority(27-40)SeniorityLevel(6-21)
tests/unit/communication/conflict_resolution/test_authority_strategy.py (10)
src/ai_company/communication/conflict_resolution/authority_strategy.py (3)
AuthorityResolver(36-255)resolve(53-94)build_dissent_record(96-119)src/ai_company/communication/conflict_resolution/models.py (1)
ConflictResolutionOutcome(29-42)src/ai_company/communication/delegation/hierarchy.py (1)
HierarchyResolver(16-267)src/ai_company/communication/enums.py (1)
ConflictResolutionStrategy(95-108)src/ai_company/communication/errors.py (1)
ConflictHierarchyError(111-112)src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)src/ai_company/communication/conflict_resolution/debate_strategy.py (2)
resolve(72-125)build_dissent_record(127-151)src/ai_company/communication/conflict_resolution/hybrid_strategy.py (2)
resolve(84-144)build_dissent_record(146-183)src/ai_company/communication/conflict_resolution/protocol.py (2)
resolve(25-34)build_dissent_record(36-50)src/ai_company/communication/conflict_resolution/human_strategy.py (2)
resolve(37-64)build_dissent_record(66-94)
src/ai_company/communication/conflict_resolution/debate_strategy.py (4)
src/ai_company/communication/conflict_resolution/_helpers.py (3)
find_loser(14-38)find_position_or_raise(60-86)pick_highest_seniority(89-104)src/ai_company/communication/conflict_resolution/config.py (1)
DebateConfig(9-28)src/ai_company/communication/conflict_resolution/models.py (3)
Conflict(67-111)ConflictResolution(114-166)ConflictResolutionOutcome(29-42)src/ai_company/communication/conflict_resolution/protocol.py (3)
JudgeEvaluator(53-74)resolve(25-34)evaluate(60-74)
tests/unit/communication/conftest.py (2)
src/ai_company/communication/conflict_resolution/config.py (3)
ConflictResolutionConfig(58-80)DebateConfig(9-28)HybridConfig(31-55)src/ai_company/communication/config.py (1)
CommunicationConfig(273-311)
tests/unit/communication/delegation/test_hierarchy.py (1)
src/ai_company/communication/delegation/hierarchy.py (2)
HierarchyResolver(16-267)get_lowest_common_manager(212-243)
🪛 LanguageTool
README.md
[typographical] ~18-~18: To join two clauses or introduce examples, consider using an em dash.
Context: ...oop prevention - Conflict Resolution - Pluggable strategies for resolving agent...
(DASH_RULE)
🔇 Additional comments (17)
src/ai_company/communication/conflict_resolution/_helpers.py (2)
89-104: LGTM!The
pick_highest_seniorityimplementation correctly usescompare_seniorityand preserves the incumbent on ties. The assumption of non-emptypositionsis guaranteed by theConflictmodel's validator.
14-38:find_loseris working as designed — the N-party limitation is architectural.The function correctly returns the first non-winning position, and the docstring accurately describes this. However, there is a legitimate architectural concern: the conflict resolution system creates only one
DissentRecordper resolution, so for N-party conflicts with multiple non-winning agents, only the first loser's reasoning is preserved. The other agents' positions are not audited.This is not a bug in
find_loserbut rather a system-wide design constraint. If multi-loser audit trails are needed for N-party conflicts, the fix would require changes toConflictResolutionand the service layer to support multiple dissent records, not changes to this helper function.src/ai_company/communication/conflict_resolution/authority_strategy.py (3)
96-119: LGTM!The
build_dissent_recordmethod correctly creates a dissent record for the losing position, usingfind_loserand appropriate timestamps.
121-153: LGTM!The
_pick_winnermethod correctly implements iterative N-party comparison with seniority-first logic and hierarchy tiebreaker. The incumbent-favoring behavior on equal depth is documented inline.
155-255: LGTM!The hierarchy resolution methods (
_resolve_by_hierarchy,_resolve_depth) handle edge cases well:
- LCM lookup with proper logging
- Agent-is-LCM edge case returns depth 0
- Unreachable agent detection with WARNING log before raise
Error paths are properly logged before raising
ConflictHierarchyError.src/ai_company/communication/conflict_resolution/models.py (4)
1-43: LGTM!The module follows Pydantic v2 patterns with frozen models,
NotBlankStrfor identifiers, and properly usesStrEnumforConflictResolutionOutcome.
45-112: LGTM!
ConflictPositionandConflictmodels are well-structured:
@computed_fieldforis_cross_departmentavoids redundant storage- Model validator ensures minimum 2 positions with unique agent IDs
- Immutable tuple for positions collection
114-167: LGTM!
ConflictResolutionproperly enforces consistency betweenoutcomeand winner fields via the model validator. TheESCALATED_TO_HUMANspecial case is handled correctly.
169-203: LGTM!
DissentRecordcorrectly preserves the audit trail with immutablemetadataastuple[tuple[NotBlankStr, NotBlankStr], ...], ensuring the frozen model constraint is maintained.src/ai_company/observability/events/conflict.py (1)
1-39: LGTM!Event constants are properly defined with
Final[str]typing and organized by category. The naming follows the establisheddomain.subjectpattern.tests/unit/observability/test_events.py (3)
24-45: LGTM!Comprehensive import of all conflict event constants for test coverage.
139-139: LGTM!Adding "conflict" to the expected domain modules ensures the new event module is discovered by the test suite.
228-249: LGTM!The test method thoroughly validates all 20 conflict event constants with their expected string values, ensuring the observability contract is maintained.
README.md (1)
18-18: LGTM!The feature description accurately summarizes the conflict resolution subsystem capabilities and follows the existing bullet point formatting style.
src/ai_company/communication/config.py (2)
8-10: LGTM!Clean import of
ConflictResolutionConfigfrom the new conflict resolution config module.
308-311: LGTM!The
conflict_resolutionfield follows the established pattern of other nested config fields withdefault_factoryand appropriate documentation referencing DESIGN_SPEC §5.6.tests/unit/communication/delegation/test_hierarchy.py (1)
341-380: LGTM!Comprehensive test coverage for
get_lowest_common_manager:
- Same-team (LCM is team lead)
- Cross-team (LCM is department head)
- No common manager (separate hierarchy roots)
- Ancestor relationships (both directions)
- Cross-department without shared root
Tests correctly verify the expected behavior from the
HierarchyResolverimplementation.
| max_tokens_per_argument: int = Field( | ||
| default=500, | ||
| gt=0, | ||
| description="Token budget per argument", | ||
| ) |
There was a problem hiding this comment.
max_tokens_per_argument is dead configuration right now.
Neither src/ai_company/communication/conflict_resolution/debate_strategy.py nor src/ai_company/communication/conflict_resolution/hybrid_strategy.py consumes these values, and JudgeEvaluator.evaluate() has no budget parameter either. Callers can set a cap here that the strategies never enforce.
Also applies to: 43-46
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/communication/conflict_resolution/config.py` around lines 20 -
24, max_tokens_per_argument in config.py is unused; either remove it or wire it
into the evaluation/strategy flow — update JudgeEvaluator.evaluate to accept a
token_budget (or budget_kwargs) and have DebateStrategy and HybridStrategy
consume that budget when generating/truncating arguments; specifically, add a
parameter to JudgeEvaluator.evaluate, propagate it into the methods on
DebateStrategy and HybridStrategy (e.g., DebateStrategy.generate_arguments /
HybridStrategy.generate_arguments or equivalent entrypoints), and enforce the
cap by truncating or limiting tokens per argument using max_tokens_per_argument
from the config class; ensure all call sites are updated to pass the config
value so the cap is actually enforced.
| if self._config.judge == "ceo": | ||
| # Walk from first position to hierarchy root | ||
| ancestors = self._hierarchy.get_ancestors( | ||
| conflict.positions[0].agent_id, | ||
| ) | ||
| if ancestors: | ||
| return ancestors[-1] | ||
| # Agent has no ancestors — they are the root | ||
| return conflict.positions[0].agent_id |
There was a problem hiding this comment.
judge="ceo" becomes input-order dependent in multi-root hierarchies.
This picks the root of the first participant's chain, or the participant itself, so a cross-department debate can be judged by one side's department head just because that position was listed first. If there is no unique company root, this should raise instead of silently choosing the first chain.
| class TestAuthorityResolverSeniority: | ||
| async def test_higher_seniority_wins( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| resolver = AuthorityResolver(hierarchy=hierarchy) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position( | ||
| agent_id="sr_dev", | ||
| level=SeniorityLevel.SENIOR, | ||
| position="Use microservices", | ||
| ), | ||
| make_position( | ||
| agent_id="jr_dev", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Use monolith", | ||
| ), | ||
| ), | ||
| ) | ||
| resolution = await resolver.resolve(conflict) | ||
| assert resolution.winning_agent_id == "sr_dev" | ||
| assert resolution.outcome == ConflictResolutionOutcome.RESOLVED_BY_AUTHORITY | ||
|
|
||
| async def test_lower_seniority_loses( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| resolver = AuthorityResolver(hierarchy=hierarchy) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position( | ||
| agent_id="jr_dev", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Quick fix", | ||
| ), | ||
| make_position( | ||
| agent_id="sr_dev", | ||
| level=SeniorityLevel.SENIOR, | ||
| position="Proper refactor", | ||
| ), | ||
| ), | ||
| ) | ||
| resolution = await resolver.resolve(conflict) | ||
| assert resolution.winning_agent_id == "sr_dev" | ||
|
|
||
| async def test_c_suite_beats_lead( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| resolver = AuthorityResolver(hierarchy=hierarchy) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position( | ||
| agent_id="cto", | ||
| level=SeniorityLevel.C_SUITE, | ||
| position="New architecture", | ||
| ), | ||
| make_position( | ||
| agent_id="backend_lead", | ||
| level=SeniorityLevel.LEAD, | ||
| position="Keep current", | ||
| ), | ||
| ), | ||
| ) | ||
| resolution = await resolver.resolve(conflict) | ||
| assert resolution.winning_agent_id == "cto" | ||
|
|
||
|
|
||
| @pytest.mark.unit | ||
| class TestAuthorityResolverHierarchy: | ||
| async def test_equal_seniority_closer_to_lcm_wins( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| resolver = AuthorityResolver(hierarchy=hierarchy) | ||
| # backend_lead and frontend_lead are both LEADs under cto | ||
| # backend_lead is first in hierarchy, same distance to cto | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position( | ||
| agent_id="backend_lead", | ||
| level=SeniorityLevel.LEAD, | ||
| position="Use REST", | ||
| ), | ||
| make_position( | ||
| agent_id="frontend_lead", | ||
| level=SeniorityLevel.LEAD, | ||
| position="Use GraphQL", | ||
| ), | ||
| ), | ||
| ) | ||
| resolution = await resolver.resolve(conflict) | ||
| # Both are depth 1 from LCM (cto), so first position wins | ||
| assert resolution.winning_agent_id == "backend_lead" | ||
|
|
||
| async def test_cross_department_no_common_manager_raises( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| resolver = AuthorityResolver(hierarchy=hierarchy) | ||
| # cto and qa_head have no common manager | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position( | ||
| agent_id="cto", | ||
| level=SeniorityLevel.C_SUITE, | ||
| position="Ship fast", | ||
| ), | ||
| make_position( | ||
| agent_id="qa_head", | ||
| level=SeniorityLevel.C_SUITE, | ||
| position="More testing", | ||
| department="qa", | ||
| ), | ||
| ), | ||
| ) | ||
| with pytest.raises(ConflictHierarchyError, match="No common manager"): | ||
| await resolver.resolve(conflict) | ||
|
|
||
| async def test_subordinate_vs_supervisor_equal_seniority( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| """When seniority is equal but one is closer to LCM, closer wins.""" | ||
| resolver = AuthorityResolver(hierarchy=hierarchy) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position( | ||
| agent_id="backend_lead", | ||
| level=SeniorityLevel.SENIOR, | ||
| position="Approach A", | ||
| ), | ||
| make_position( | ||
| agent_id="sr_dev", | ||
| level=SeniorityLevel.SENIOR, | ||
| position="Approach B", | ||
| ), | ||
| ), | ||
| ) | ||
| resolution = await resolver.resolve(conflict) | ||
| # backend_lead is closer to LCM (cto) — depth 0 (IS the LCM) or 1 | ||
| assert resolution.winning_agent_id == "backend_lead" | ||
|
|
||
|
|
||
| @pytest.mark.unit | ||
| class TestAuthorityResolverDissentRecord: | ||
| async def test_dissent_record_has_loser_info( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| resolver = AuthorityResolver(hierarchy=hierarchy) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position( | ||
| agent_id="sr_dev", | ||
| level=SeniorityLevel.SENIOR, | ||
| position="My approach", | ||
| ), | ||
| make_position( | ||
| agent_id="jr_dev", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Other approach", | ||
| ), | ||
| ), | ||
| ) | ||
| resolution = await resolver.resolve(conflict) | ||
| record = resolver.build_dissent_record(conflict, resolution) | ||
| assert record.dissenting_agent_id == "jr_dev" | ||
| assert record.dissenting_position == "Other approach" | ||
| assert record.strategy_used == ConflictResolutionStrategy.AUTHORITY | ||
|
|
||
| async def test_dissent_record_id_format( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| resolver = AuthorityResolver(hierarchy=hierarchy) | ||
| conflict = make_conflict() | ||
| resolution = await resolver.resolve(conflict) | ||
| record = resolver.build_dissent_record(conflict, resolution) | ||
| assert record.id.startswith("dissent-") | ||
|
|
||
| async def test_cross_department_logged( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| """Cross-department flag triggers logging but resolution works.""" | ||
| resolver = AuthorityResolver(hierarchy=hierarchy) | ||
| # sr_dev (eng, SENIOR) vs qa_eng (qa, JUNIOR) — different seniority | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position( | ||
| agent_id="sr_dev", | ||
| level=SeniorityLevel.SENIOR, | ||
| position="Eng approach", | ||
| ), | ||
| make_position( | ||
| agent_id="qa_eng", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="QA approach", | ||
| department="qa", | ||
| ), | ||
| ), | ||
| ) | ||
| resolution = await resolver.resolve(conflict) | ||
| assert resolution.winning_agent_id == "sr_dev" |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add a 3+ participant authority-resolution case.
Every resolver scenario here is still two-party, but this PR explicitly adds N-party winner selection via pairwise seniority and iterative LCM handling. A regression in that branch would leave this file green, and the same authority logic underpins debate/hybrid fallback paths. Please add at least one three-position conflict that asserts both the final winner and the dissent record.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/communication/conflict_resolution/test_authority_strategy.py`
around lines 24 - 229, Add a new async unit test that exercises N>2 participant
resolution: instantiate AuthorityResolver, create a conflict with at least three
make_position entries (mixed SeniorityLevel values and departments to exercise
LCM logic), call resolver.resolve(conflict) and assert the final
resolution.winning_agent_id and resolution.outcome (e.g.,
ConflictResolutionOutcome.RESOLVED_BY_AUTHORITY), then call
resolver.build_dissent_record(conflict, resolution) and assert the record
contains the losing agent(s) info (dissenting_agent_id, dissenting_position) and
strategy_used == ConflictResolutionStrategy.AUTHORITY; place the test alongside
the existing TestAuthorityResolverSeniority or
TestAuthorityResolverDissentRecord tests so it runs with the other authority
tests.
| class TestDebateConfig: | ||
| def test_defaults(self) -> None: | ||
| cfg = DebateConfig() | ||
| assert cfg.max_tokens_per_argument == 500 | ||
| assert cfg.judge == "shared_manager" | ||
|
|
||
| def test_custom_values(self) -> None: | ||
| cfg = DebateConfig(max_tokens_per_argument=1000, judge="ceo") | ||
| assert cfg.max_tokens_per_argument == 1000 | ||
| assert cfg.judge == "ceo" | ||
|
|
||
| def test_frozen(self) -> None: | ||
| cfg = DebateConfig() | ||
| with pytest.raises(ValidationError): | ||
| cfg.judge = "changed" # type: ignore[misc] | ||
|
|
||
| def test_zero_tokens_rejected(self) -> None: | ||
| with pytest.raises(ValidationError): | ||
| DebateConfig(max_tokens_per_argument=0) | ||
|
|
||
| def test_negative_tokens_rejected(self) -> None: | ||
| with pytest.raises(ValidationError): | ||
| DebateConfig(max_tokens_per_argument=-1) | ||
|
|
||
|
|
||
| @pytest.mark.unit | ||
| class TestHybridConfig: | ||
| def test_defaults(self) -> None: | ||
| cfg = HybridConfig() | ||
| assert cfg.max_tokens_per_argument == 500 | ||
| assert cfg.review_agent == "conflict_reviewer" | ||
| assert cfg.escalate_on_ambiguity is True | ||
|
|
||
| def test_custom_values(self) -> None: | ||
| cfg = HybridConfig( | ||
| max_tokens_per_argument=300, | ||
| review_agent="senior_reviewer", | ||
| escalate_on_ambiguity=False, | ||
| ) | ||
| assert cfg.review_agent == "senior_reviewer" | ||
| assert cfg.escalate_on_ambiguity is False | ||
|
|
||
| def test_frozen(self) -> None: | ||
| cfg = HybridConfig() | ||
| with pytest.raises(ValidationError): | ||
| cfg.escalate_on_ambiguity = False # type: ignore[misc] |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Add whitespace-rejection coverage for config identifiers.
judge and review_agent are config-boundary identifiers, but this suite only exercises valid values. If either field is relaxed from NotBlankStr to plain str, these tests still stay green. Please add blank/whitespace cases for both models.
Example coverage
`@pytest.mark.unit`
class TestDebateConfig:
+ `@pytest.mark.parametrize`("judge", ["", " "])
+ def test_blank_judge_rejected(self, judge: str) -> None:
+ with pytest.raises(ValidationError):
+ DebateConfig(judge=judge)
+
def test_defaults(self) -> None:
cfg = DebateConfig()
assert cfg.max_tokens_per_argument == 500
assert cfg.judge == "shared_manager"
@@
`@pytest.mark.unit`
class TestHybridConfig:
+ `@pytest.mark.parametrize`("review_agent", ["", " "])
+ def test_blank_review_agent_rejected(self, review_agent: str) -> None:
+ with pytest.raises(ValidationError):
+ HybridConfig(review_agent=review_agent)
+
def test_defaults(self) -> None:
cfg = HybridConfig()
assert cfg.max_tokens_per_argument == 500
assert cfg.review_agent == "conflict_reviewer"As per coding guidelines, "Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators" and "Validate at system boundaries (user input, external APIs, config files)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/communication/conflict_resolution/test_config.py` around lines 17
- 62, Add negative tests that assert ValidationError is raised when identifier
fields accept blank/whitespace-only strings: for DebateConfig, call
DebateConfig(judge="") and DebateConfig(judge=" ") inside
pytest.raises(ValidationError); for HybridConfig, call
HybridConfig(review_agent="") and HybridConfig(review_agent=" ") inside
pytest.raises(ValidationError). Reference the DebateConfig.judge and
HybridConfig.review_agent fields (and NotBlankStr expectation) so these tests
fail if the types are relaxed to plain str.
| @pytest.mark.unit | ||
| class TestDebateResolverWithJudge: | ||
| async def test_judge_evaluator_picks_winner( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| judge = FakeJudgeEvaluator(winner_id="sr_dev") | ||
| resolver = DebateResolver( | ||
| hierarchy=hierarchy, | ||
| config=DebateConfig(judge="shared_manager"), | ||
| judge_evaluator=judge, | ||
| ) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position(agent_id="sr_dev", level=SeniorityLevel.SENIOR), | ||
| make_position( | ||
| agent_id="jr_dev", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Other", | ||
| ), | ||
| ), | ||
| ) | ||
| resolution = await resolver.resolve(conflict) | ||
| assert resolution.winning_agent_id == "sr_dev" | ||
| assert resolution.outcome == ConflictResolutionOutcome.RESOLVED_BY_DEBATE | ||
| assert len(judge.calls) == 1 | ||
|
|
||
| async def test_judge_receives_correct_agent_id( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| judge = FakeJudgeEvaluator(winner_id="sr_dev") | ||
| resolver = DebateResolver( | ||
| hierarchy=hierarchy, | ||
| config=DebateConfig(judge="shared_manager"), | ||
| judge_evaluator=judge, | ||
| ) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position(agent_id="sr_dev", level=SeniorityLevel.SENIOR), | ||
| make_position( | ||
| agent_id="jr_dev", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Other", | ||
| ), | ||
| ), | ||
| ) | ||
| await resolver.resolve(conflict) | ||
| # shared_manager of sr_dev and jr_dev is backend_lead | ||
| _, judge_id = judge.calls[0] | ||
| assert judge_id == "backend_lead" | ||
|
|
||
|
|
||
| @pytest.mark.unit | ||
| class TestDebateResolverFallback: | ||
| async def test_no_evaluator_falls_back_to_authority( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| resolver = DebateResolver( | ||
| hierarchy=hierarchy, | ||
| config=DebateConfig(), | ||
| judge_evaluator=None, | ||
| ) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position( | ||
| agent_id="sr_dev", | ||
| level=SeniorityLevel.SENIOR, | ||
| position="Approach A", | ||
| ), | ||
| make_position( | ||
| agent_id="jr_dev", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Approach B", | ||
| ), | ||
| ), | ||
| ) | ||
| resolution = await resolver.resolve(conflict) | ||
| assert resolution.winning_agent_id == "sr_dev" | ||
| assert "fallback" in resolution.reasoning.lower() | ||
|
|
||
|
|
||
| @pytest.mark.unit | ||
| class TestDebateJudgeSelection: | ||
| async def test_ceo_judge( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| judge = FakeJudgeEvaluator(winner_id="sr_dev") | ||
| resolver = DebateResolver( | ||
| hierarchy=hierarchy, | ||
| config=DebateConfig(judge="ceo"), | ||
| judge_evaluator=judge, | ||
| ) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position(agent_id="sr_dev", level=SeniorityLevel.SENIOR), | ||
| make_position( | ||
| agent_id="jr_dev", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Other", | ||
| ), | ||
| ), | ||
| ) | ||
| await resolver.resolve(conflict) | ||
| _, judge_id = judge.calls[0] | ||
| # Root of sr_dev's hierarchy is cto | ||
| assert judge_id == "cto" | ||
|
|
||
| async def test_named_judge( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| judge = FakeJudgeEvaluator(winner_id="sr_dev") | ||
| resolver = DebateResolver( | ||
| hierarchy=hierarchy, | ||
| config=DebateConfig(judge="external_reviewer"), | ||
| judge_evaluator=judge, | ||
| ) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position(agent_id="sr_dev", level=SeniorityLevel.SENIOR), | ||
| make_position( | ||
| agent_id="jr_dev", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Other", | ||
| ), | ||
| ), | ||
| ) | ||
| await resolver.resolve(conflict) | ||
| _, judge_id = judge.calls[0] | ||
| assert judge_id == "external_reviewer" | ||
|
|
||
| async def test_shared_manager_no_lcm_raises( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| resolver = DebateResolver( | ||
| hierarchy=hierarchy, | ||
| config=DebateConfig(judge="shared_manager"), | ||
| ) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position( | ||
| agent_id="cto", | ||
| level=SeniorityLevel.C_SUITE, | ||
| ), | ||
| make_position( | ||
| agent_id="qa_head", | ||
| level=SeniorityLevel.C_SUITE, | ||
| position="Other", | ||
| department="qa", | ||
| ), | ||
| ), | ||
| ) | ||
| with pytest.raises(ConflictHierarchyError): | ||
| await resolver.resolve(conflict) | ||
|
|
||
|
|
||
| @pytest.mark.unit | ||
| class TestDebateResolverDissentRecord: | ||
| async def test_dissent_record_includes_judge( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| judge = FakeJudgeEvaluator(winner_id="sr_dev") | ||
| resolver = DebateResolver( | ||
| hierarchy=hierarchy, | ||
| config=DebateConfig(judge="shared_manager"), | ||
| judge_evaluator=judge, | ||
| ) | ||
| conflict = make_conflict( | ||
| positions=( | ||
| make_position(agent_id="sr_dev", level=SeniorityLevel.SENIOR), | ||
| make_position( | ||
| agent_id="jr_dev", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Other approach", | ||
| ), | ||
| ), | ||
| ) | ||
| resolution = await resolver.resolve(conflict) | ||
| record = resolver.build_dissent_record(conflict, resolution) | ||
| assert record.dissenting_agent_id == "jr_dev" | ||
| assert record.strategy_used == ConflictResolutionStrategy.DEBATE | ||
| assert ("judge", resolution.decided_by) in record.metadata | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Exercise the token-cap and transcript audit contract.
These tests cover judge selection and winner picking, but they never make max_tokens_per_argument observable. The dissent-record assertion also stops at a single metadata tuple, so a resolver that ignores the token budget or drops the recorded arguments/judge reasoning would still pass. Please add an oversized-argument case that validates the capped debate flow and the persisted audit trail.
| class FakeReviewEvaluator: | ||
| """Fake review evaluator for testing.""" | ||
|
|
||
| def __init__(self, winner_id: str, reasoning: str = "Review decided") -> None: | ||
| self._winner_id = winner_id | ||
| self._reasoning = reasoning | ||
|
|
||
| async def evaluate( | ||
| self, | ||
| conflict: Conflict, | ||
| judge_agent_id: str, | ||
| ) -> tuple[str, str]: | ||
| return self._winner_id, self._reasoning |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Assert that review_agent is actually forwarded.
HybridConfig.review_agent is a new public knob, but the fake evaluator never records who invoked it and every scenario here uses the default reviewer implicitly. A resolver that hardcodes the review agent would still pass this file. Add one non-default review_agent case and assert that exact ID reaches evaluate().
Example test shape
class FakeReviewEvaluator:
"""Fake review evaluator for testing."""
def __init__(self, winner_id: str, reasoning: str = "Review decided") -> None:
self._winner_id = winner_id
self._reasoning = reasoning
+ self.calls: list[tuple[Conflict, str]] = []
async def evaluate(
self,
conflict: Conflict,
judge_agent_id: str,
) -> tuple[str, str]:
+ self.calls.append((conflict, judge_agent_id))
return self._winner_id, self._reasoning
@@
`@pytest.mark.unit`
class TestHybridResolverAutoResolve:
+ async def test_custom_review_agent_is_used(
+ self,
+ hierarchy: HierarchyResolver,
+ ) -> None:
+ evaluator = FakeReviewEvaluator(winner_id="sr_dev")
+ resolver = HybridResolver(
+ hierarchy=hierarchy,
+ config=HybridConfig(review_agent="custom_reviewer"),
+ human_resolver=HumanEscalationResolver(),
+ review_evaluator=evaluator,
+ )
+ conflict = make_conflict(
+ positions=(
+ make_position(agent_id="sr_dev", level=SeniorityLevel.SENIOR),
+ make_position(
+ agent_id="jr_dev",
+ level=SeniorityLevel.JUNIOR,
+ position="Other",
+ ),
+ ),
+ )
+ await resolver.resolve(conflict)
+ _, reviewer_id = evaluator.calls[0]
+ assert reviewer_id == "custom_reviewer"Also applies to: 42-147
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/communication/conflict_resolution/test_hybrid_strategy.py` around
lines 27 - 39, The tests do not verify that HybridConfig.review_agent is
forwarded to the evaluator; update FakeReviewEvaluator to record the received
judge_agent_id (e.g., add an attribute like last_judge_agent_id set inside
evaluate) and add at least one test case that sets a non-default
HybridConfig.review_agent value and asserts that
FakeReviewEvaluator.last_judge_agent_id equals that exact ID after the resolver
runs; update other related scenarios (lines 42–147) to use or assert the
recorded value where appropriate so the test fails if the review agent is not
forwarded.
| async def test_query_by_strategy( | ||
| self, | ||
| hierarchy: HierarchyResolver, | ||
| ) -> None: | ||
| service = _make_service(hierarchy) | ||
| # Resolve via authority | ||
| conflict_a = service.create_conflict( | ||
| conflict_type=ConflictType.ARCHITECTURE, | ||
| subject="Design dispute", | ||
| positions=[ | ||
| make_position( | ||
| agent_id="sr_dev", | ||
| level=SeniorityLevel.SENIOR, | ||
| ), | ||
| make_position( | ||
| agent_id="jr_dev", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Other", | ||
| ), | ||
| ], | ||
| ) | ||
| await service.resolve(conflict_a) | ||
| # Resolve via human escalation | ||
| service_h = _make_service(hierarchy, ConflictResolutionStrategy.HUMAN) | ||
| conflict_h = service_h.create_conflict( | ||
| conflict_type=ConflictType.PROCESS, | ||
| subject="Process dispute", | ||
| positions=[ | ||
| make_position(agent_id="a-h", level=SeniorityLevel.SENIOR), | ||
| make_position( | ||
| agent_id="b-h", | ||
| level=SeniorityLevel.JUNIOR, | ||
| position="Other", | ||
| ), | ||
| ], | ||
| ) | ||
| await service_h.resolve(conflict_h) | ||
| # Query authority service by strategy | ||
| results = service.query_dissent_records( | ||
| strategy=ConflictResolutionStrategy.AUTHORITY, | ||
| ) | ||
| assert len(results) == 1 | ||
| assert results[0].strategy_used == ConflictResolutionStrategy.AUTHORITY |
There was a problem hiding this comment.
This test doesn't prove the strategy filter works.
The HUMAN record lives on service_h, not service, so service.query_dissent_records(strategy=ConflictResolutionStrategy.AUTHORITY) would still return 1 even if the filter were ignored entirely.
Suggested tightening
results = service.query_dissent_records(
strategy=ConflictResolutionStrategy.AUTHORITY,
)
assert len(results) == 1
assert results[0].strategy_used == ConflictResolutionStrategy.AUTHORITY
+ assert service.query_dissent_records(
+ strategy=ConflictResolutionStrategy.HUMAN,
+ ) == ()
+ assert service_h.query_dissent_records(
+ strategy=ConflictResolutionStrategy.AUTHORITY,
+ ) == ()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| async def test_query_by_strategy( | |
| self, | |
| hierarchy: HierarchyResolver, | |
| ) -> None: | |
| service = _make_service(hierarchy) | |
| # Resolve via authority | |
| conflict_a = service.create_conflict( | |
| conflict_type=ConflictType.ARCHITECTURE, | |
| subject="Design dispute", | |
| positions=[ | |
| make_position( | |
| agent_id="sr_dev", | |
| level=SeniorityLevel.SENIOR, | |
| ), | |
| make_position( | |
| agent_id="jr_dev", | |
| level=SeniorityLevel.JUNIOR, | |
| position="Other", | |
| ), | |
| ], | |
| ) | |
| await service.resolve(conflict_a) | |
| # Resolve via human escalation | |
| service_h = _make_service(hierarchy, ConflictResolutionStrategy.HUMAN) | |
| conflict_h = service_h.create_conflict( | |
| conflict_type=ConflictType.PROCESS, | |
| subject="Process dispute", | |
| positions=[ | |
| make_position(agent_id="a-h", level=SeniorityLevel.SENIOR), | |
| make_position( | |
| agent_id="b-h", | |
| level=SeniorityLevel.JUNIOR, | |
| position="Other", | |
| ), | |
| ], | |
| ) | |
| await service_h.resolve(conflict_h) | |
| # Query authority service by strategy | |
| results = service.query_dissent_records( | |
| strategy=ConflictResolutionStrategy.AUTHORITY, | |
| ) | |
| assert len(results) == 1 | |
| assert results[0].strategy_used == ConflictResolutionStrategy.AUTHORITY | |
| async def test_query_by_strategy( | |
| self, | |
| hierarchy: HierarchyResolver, | |
| ) -> None: | |
| service = _make_service(hierarchy) | |
| # Resolve via authority | |
| conflict_a = service.create_conflict( | |
| conflict_type=ConflictType.ARCHITECTURE, | |
| subject="Design dispute", | |
| positions=[ | |
| make_position( | |
| agent_id="sr_dev", | |
| level=SeniorityLevel.SENIOR, | |
| ), | |
| make_position( | |
| agent_id="jr_dev", | |
| level=SeniorityLevel.JUNIOR, | |
| position="Other", | |
| ), | |
| ], | |
| ) | |
| await service.resolve(conflict_a) | |
| # Resolve via human escalation | |
| service_h = _make_service(hierarchy, ConflictResolutionStrategy.HUMAN) | |
| conflict_h = service_h.create_conflict( | |
| conflict_type=ConflictType.PROCESS, | |
| subject="Process dispute", | |
| positions=[ | |
| make_position(agent_id="a-h", level=SeniorityLevel.SENIOR), | |
| make_position( | |
| agent_id="b-h", | |
| level=SeniorityLevel.JUNIOR, | |
| position="Other", | |
| ), | |
| ], | |
| ) | |
| await service_h.resolve(conflict_h) | |
| # Query authority service by strategy | |
| results = service.query_dissent_records( | |
| strategy=ConflictResolutionStrategy.AUTHORITY, | |
| ) | |
| assert len(results) == 1 | |
| assert results[0].strategy_used == ConflictResolutionStrategy.AUTHORITY | |
| assert service.query_dissent_records( | |
| strategy=ConflictResolutionStrategy.HUMAN, | |
| ) == () | |
| assert service_h.query_dissent_records( | |
| strategy=ConflictResolutionStrategy.AUTHORITY, | |
| ) == () |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/communication/conflict_resolution/test_service.py` around lines
357 - 399, The test_query_by_strategy currently creates the HUMAN-resolution
record on service_h so querying
service.query_dissent_records(strategy=ConflictResolutionStrategy.AUTHORITY) can
pass even if the strategy filter is ignored; fix by ensuring both records live
in the same service before querying (e.g., create both conflict_a and conflict_h
using the same service instance or call service.create_conflict for the HUMAN
record) and then call
service.query_dissent_records(strategy=ConflictResolutionStrategy.AUTHORITY) and
assert that only the AUTHORITY record is returned; reference
test_query_by_strategy, service, service_h, create_conflict,
query_dissent_records, and ConflictResolutionStrategy.AUTHORITY.
- N-party dissent: build_dissent_record → build_dissent_records returning tuple - JudgeEvaluator returns JudgeDecision NamedTuple - Add logging to _helpers.py (find_losers, find_position_or_raise) - Wire up unused events: CONFLICT_ESCALATED, CONFLICT_STRATEGY_ERROR, CONFLICT_RESOLUTION_FAILED - Use logger.exception in except blocks (TRY400) - Remove unused HybridResolver._hierarchy parameter - Fix decided_by from "authority_fallback" to best.agent_id in hybrid - Move CONFLICT_AUTHORITY_FALLBACK logging into _authority_fallback method - Add DissentRecord._validate_dissent_consistency cross-field validator - Optimize query_dissent_records to single-pass comprehension - Add test_helpers.py, DissentRecord validation tests, 3-party + CEO root tests
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (10)
src/ai_company/communication/conflict_resolution/hybrid_strategy.py (2)
196-233:⚠️ Potential issue | 🟠 MajorAuthority fallback diverges from AuthorityResolver behavior.
_authority_fallbackusespick_highest_senioritywithout consultingself._hierarchy. When seniority is equal, this differs fromAuthorityResolverwhich uses hierarchy proximity as a tiebreaker. Cross-department conflicts will resolve inconsistently.🔧 Consider reusing AuthorityResolver
Either inject an
AuthorityResolverinstance and delegate:def __init__( self, *, config: HybridConfig, human_resolver: ConflictResolver, authority_resolver: AuthorityResolver, # Add this review_evaluator: JudgeEvaluator | None = None, ) -> None: ... self._authority_resolver = authority_resolverOr extract the shared winner-selection logic into a helper that both resolvers can use.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/conflict_resolution/hybrid_strategy.py` around lines 196 - 233, _authority_fallback currently calls pick_highest_seniority directly and therefore ignores self._hierarchy, causing ties to be broken differently than AuthorityResolver; update _authority_fallback to reuse the same winner-selection logic as AuthorityResolver by either delegating to an injected AuthorityResolver instance (add an authority_resolver attribute initialized in __init__ and call authority_resolver.resolve or its selection helper) or by extracting the tie-breaking/winner-selection logic into a shared helper function used by both AuthorityResolver and _authority_fallback, ensuring that hierarchy proximity (self._hierarchy) is consulted when seniority ties occur.
167-180:⚠️ Potential issue | 🟠 MajorInline dissent record creation bypasses human resolver's logic.
When
HumanEscalationResolverevolves to include queue IDs or escalation-specific metadata, this inline construction will diverge. Delegate toself._human_resolver.build_dissent_records()and augment the returned records with hybrid-specific metadata.🔧 Suggested delegation
if resolution.outcome == ConflictResolutionOutcome.ESCALATED_TO_HUMAN: - return tuple( - DissentRecord( - id=f"dissent-{uuid4().hex[:12]}", - conflict=conflict, - resolution=resolution, - dissenting_agent_id=pos.agent_id, - dissenting_position=pos.position, - strategy_used=ConflictResolutionStrategy.HYBRID, - timestamp=datetime.now(UTC), - metadata=(("escalation_reason", "ambiguous_review"),), - ) - for pos in conflict.positions - ) + human_records = self._human_resolver.build_dissent_records( + conflict, + resolution, + ) + return tuple( + r.model_copy( + update={ + "strategy_used": ConflictResolutionStrategy.HYBRID, + "metadata": r.metadata + (("escalation_reason", "ambiguous_review"),), + } + ) + for r in human_records + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/conflict_resolution/hybrid_strategy.py` around lines 167 - 180, The inline creation of DissentRecord in HybridStrategy bypasses HumanEscalationResolver logic; replace the generator that constructs DissentRecord for each pos with a call to self._human_resolver.build_dissent_records(conflict, resolution) and then iterate the returned records to augment or override hybrid-specific fields (set strategy_used to ConflictResolutionStrategy.HYBRID, add/merge metadata key ("escalation_reason","ambiguous_review"), ensure timestamp/ids preserved if resolver provides them). This delegates queue/escalation metadata responsibility to HumanEscalationResolver while ensuring hybrid-specific annotations are applied.tests/unit/communication/conflict_resolution/test_authority_strategy.py (1)
23-254: 🛠️ Refactor suggestion | 🟠 MajorMissing 3+ participant authority resolution test case.
The test coverage is thorough for 2-party conflicts, but there's no test exercising the N-party (3+) winner selection logic. The
AuthorityResolver._pick_winneriterates pairwise through positions, and a regression in that path would go undetected. Add at least one test with three positions mixing seniority levels to verify both the final winner and the dissent record count.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/communication/conflict_resolution/test_authority_strategy.py` around lines 23 - 254, Add a test exercising AuthorityResolver._pick_winner with 3+ participants: create a conflict via make_conflict and make_position with three agents mixing SeniorityLevel (e.g., C_SUITE or SENIOR, LEAD, JUNIOR) so the resolver must iterate pairwise and produce a single winning_agent_id; assert the correct winner from await AuthorityResolver(hierarchy=hierarchy).resolve(conflict) and verify resolver.build_dissent_records(conflict, resolution) returns the expected number of dissent records (>=2) and contains the losing agents' ids and positions. Ensure the test name reflects "three_participant" or "n_party" and follows existing patterns in TestAuthorityResolver* classes.tests/unit/communication/conflict_resolution/test_service.py (1)
357-399:⚠️ Potential issue | 🟡 MinorTest doesn't validate strategy filter exclusion.
The
service_hinstance holds the HUMAN record whileserviceholds the AUTHORITY record. Queryingserviceforstrategy=AUTHORITYwould return 1 even if the filter were broken. To properly validate, create both records in the same service instance.🧪 Suggested fix
async def test_query_by_strategy( self, hierarchy: HierarchyResolver, ) -> None: - service = _make_service(hierarchy) + # Use a single service with both resolvers + config = ConflictResolutionConfig(strategy=ConflictResolutionStrategy.AUTHORITY) + service = ConflictResolutionService( + config=config, + resolvers={ + ConflictResolutionStrategy.AUTHORITY: AuthorityResolver(hierarchy=hierarchy), + ConflictResolutionStrategy.HUMAN: HumanEscalationResolver(), + }, + ) # Resolve via authority conflict_a = service.create_conflict( ... ) await service.resolve(conflict_a) - # Resolve via human escalation - service_h = _make_service(hierarchy, ConflictResolutionStrategy.HUMAN) - conflict_h = service_h.create_conflict( + # Resolve via human escalation (same service, different config) + service._config = ConflictResolutionConfig(strategy=ConflictResolutionStrategy.HUMAN) + conflict_h = service.create_conflict( ... ) - await service_h.resolve(conflict_h) + await service.resolve(conflict_h) # Query authority service by strategy results = service.query_dissent_records( strategy=ConflictResolutionStrategy.AUTHORITY, ) assert len(results) == 1 assert results[0].strategy_used == ConflictResolutionStrategy.AUTHORITY + # Verify HUMAN filter also works + results_h = service.query_dissent_records( + strategy=ConflictResolutionStrategy.HUMAN, + ) + assert len(results_h) >= 1 + assert all(r.strategy_used == ConflictResolutionStrategy.HUMAN for r in results_h)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/communication/conflict_resolution/test_service.py` around lines 357 - 399, The test currently creates the AUTHORITY record on `service` and the HUMAN record on `service_h`, so querying `service.query_dissent_records(strategy=ConflictResolutionStrategy.AUTHORITY)` would pass even if filtering is broken; fix it by creating both conflicts on the same service instance returned by `_make_service` (i.e., call `create_conflict` for both the AUTHORITY and HUMAN records on `service`), call `await service.resolve(...)` for each, then call `service.query_dissent_records(strategy=ConflictResolutionStrategy.AUTHORITY)` and assert only the AUTHORITY record is returned and its `strategy_used` equals `ConflictResolutionStrategy.AUTHORITY`.src/ai_company/communication/conflict_resolution/service.py (1)
155-167:⚠️ Potential issue | 🟠 MajorStrategy dispatch doesn't support per-conflict overrides.
resolve()always usesself._config.strategy, but the linked issue requires strategy selection "overrideable per company/department/conflict type." All conflicts in a service instance are forced through the same resolver. Consider accepting an optionalstrategyparameter or looking up overrides based onconflict.typeor department.🔧 Suggested approach
async def resolve( self, conflict: Conflict, + *, + strategy_override: ConflictResolutionStrategy | None = None, ) -> tuple[ConflictResolution, tuple[DissentRecord, ...]]: - strategy = self._config.strategy + strategy = strategy_override or self._config.strategy resolver = self._resolvers.get(strategy)For department-based overrides, add a lookup table or configuration.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/conflict_resolution/service.py` around lines 155 - 167, The resolve() method currently always uses self._config.strategy so per-conflict overrides aren't possible; change the method signature to accept an optional strategy parameter (or compute a selected_strategy by checking a per-department/conflict-type override lookup using conflict.type or conflict.department) then set resolver = self._resolvers.get(selected_strategy or self._config.strategy); if resolver is None, raise ConflictResolutionError as before (preserving the existing message and logger usage). Ensure all references to self._config.strategy in the resolver selection are replaced with the new selected_strategy fallback logic so per-conflict overrides take precedence.src/ai_company/communication/conflict_resolution/human_strategy.py (2)
38-70:⚠️ Potential issue | 🟠 MajorStub implementation acknowledged — escalation path is incomplete.
The resolver returns
ESCALATED_TO_HUMANwithout creating a queue item, parking the task, or persisting a pending decision. This is documented as deferred to#37. Ensure this is tracked and the reasoning field clearly communicates the stub status to callers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/conflict_resolution/human_strategy.py` around lines 38 - 70, The resolve method in human_strategy.py currently returns ConflictResolution with outcome ConflictResolutionOutcome.ESCALATED_TO_HUMAN but does not persist or enqueue any pending human decision and the reasoning message is vague about the stub; update resolve to (1) create/persist a pending decision or enqueue a human task (use your existing queue/persistence API) linked to conflict.id and conflict.positions, (2) set ConflictResolution fields (ConflictResolution.conflict_id, outcome, decided_by="human", winning_agent_id=None, winning_position=None, resolved_at=None or omit until decision) to reflect that resolution is pending, and (3) make the reasoning field include a clear stub marker and a task/queue id or reference (use CONFLICT_HUMAN_ESCALATION_STUB and CONFLICT_ESCALATED log calls to log the created queue item id) so callers can track the pending decision.
72-101: 🧹 Nitpick | 🔵 TrivialRecording all positions as "dissenters" may confuse the audit trail.
For escalated conflicts, every position gets a
DissentRecordeven though no position was overruled. While the docstring explains this preserves all stances for human review, it conflicts withDissentRecord's stated purpose of "preserving the losing agent's reasoning." Consider a separate model (e.g.,PendingReviewRecord) or add explicit metadata distinguishing pending-review from actual dissent.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/conflict_resolution/human_strategy.py` around lines 72 - 101, The method build_dissent_records is using DissentRecord for every position in an escalated conflict which misrepresents "losing" reasoning; change the implementation to either (A) create and return a distinct PendingReviewRecord (new model) for each pos instead of DissentRecord when ConflictResolutionStrategy.HUMAN/escalation is in play, or (B) if you must reuse DissentRecord, add explicit metadata flags (e.g., ("pending_review", "true") and ("escalation_reason", "...")) and a clear field value (e.g., dissenting_agent_id or dissenting_position set to a sentinel/nullable state) so callers can distinguish pending-review records from true dissents; update build_dissent_records to detect human-escalation and emit the appropriate record type/metadata and update any consumers that expect DissentRecord accordingly.tests/unit/communication/conflict_resolution/test_debate_strategy.py (1)
46-234: 🧹 Nitpick | 🔵 TrivialMissing token budget and transcript audit tests.
DebateConfig.max_tokens_per_argumentis accepted but never exercised. Tests should verify the token cap is observed and that debate arguments/reasoning are persisted in the audit trail. Without these, a resolver ignoring the token budget or dropping arguments would still pass.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/communication/conflict_resolution/test_debate_strategy.py` around lines 46 - 234, Add unit tests exercising DebateConfig.max_tokens_per_argument and audit persistence: create a DebateResolver with DebateConfig(max_tokens_per_argument=small_value, judge="shared_manager") and a FakeJudgeEvaluator that returns long arguments so you can assert DebateResolver.resolve enforces the token cap (e.g., trimmed/limited argument length or fewer tokens) and that the resulting resolution.reasoning or resolution.arguments reflects the cap; additionally verify build_dissent_records (and any audit/transcript fields on resolution, e.g., resolution.transcript or resolution.decided_by) contains the full/stored debate text so arguments and reasoning are persisted in the audit trail and include judge attribution (use FakeJudgeEvaluator.calls and records returned by build_dissent_records for checks).src/ai_company/communication/conflict_resolution/debate_strategy.py (2)
221-236:⚠️ Potential issue | 🟠 MajorCEO judge selection is input-order dependent.
When
judge="ceo", the code walks ancestors fromconflict.positions[0]. In cross-department conflicts with different roots, the first-listed position determines the judge. This creates inconsistent behavior based on position ordering.🔧 Possible approaches
- Require a unique root: Find roots for all participants and raise if they differ.
- Use LCM: Fall back to
shared_managerlogic to find a common ancestor.- Document the behavior: If order-dependence is acceptable, make it explicit in the config docs.
if self._config.judge == "ceo": roots = set() for pos in conflict.positions: ancestors = self._hierarchy.get_ancestors(pos.agent_id) root = ancestors[-1] if ancestors else pos.agent_id roots.add(root) if len(roots) > 1: raise ConflictHierarchyError( "Multiple hierarchy roots found; cannot determine CEO judge", context={"roots": list(roots)}, ) return roots.pop()
88-137:⚠️ Potential issue | 🔴 CriticalJudge determination blocks fallback path and misattributes decisions.
_determine_judge(conflict)at line 88 runs unconditionally. Forjudge="shared_manager", this can raiseConflictHierarchyErroreven when the evaluator isNoneand fallback would otherwise work. Additionally,decided_by=judge_id(line 134) misattributes the decision when authority fallback was used.🔧 Suggested fix: defer judge determination
async def resolve(self, conflict: Conflict) -> ConflictResolution: - judge_id = self._determine_judge(conflict) - - logger.info( - CONFLICT_DEBATE_STARTED, - conflict_id=conflict.id, - judge=judge_id, - ) - if self._judge_evaluator is not None: + judge_id = self._determine_judge(conflict) + logger.info( + CONFLICT_DEBATE_STARTED, + conflict_id=conflict.id, + judge=judge_id, + ) try: winning_agent_id, reasoning = await self._judge_evaluator.evaluate( conflict, judge_id, ) except Exception: ... + decided_by = judge_id else: logger.warning( CONFLICT_AUTHORITY_FALLBACK, ... ) winning_agent_id, reasoning = self._authority_fallback(conflict) + decided_by = "authority_fallback" winning_pos = find_position_or_raise(conflict, winning_agent_id) - logger.info( - CONFLICT_DEBATE_JUDGE_DECIDED, - conflict_id=conflict.id, - judge=judge_id, - winner=winning_agent_id, - ) + if self._judge_evaluator is not None: + logger.info( + CONFLICT_DEBATE_JUDGE_DECIDED, + conflict_id=conflict.id, + judge=judge_id, + winner=winning_agent_id, + ) return ConflictResolution( ... - decided_by=judge_id, + decided_by=decided_by, ... )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/communication/conflict_resolution/debate_strategy.py` around lines 88 - 137, Do not call _determine_judge(conflict) unconditionally — defer determining judge until you know whether the evaluator will be used: if self._judge_evaluator is not None, call _determine_judge and pass that judge_id into self._judge_evaluator.evaluate; if self._judge_evaluator is None, skip _determine_judge, call self._authority_fallback(conflict) and avoid any operation that could raise ConflictHierarchyError. Also ensure the returned ConflictResolution.decided_by reflects the actual decision path: set decided_by to judge_id only when the evaluator path was used, and set it to a clear fallback marker (e.g. "authority_fallback") when using _authority_fallback.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/unit/communication/conflict_resolution/test_helpers.py`:
- Around line 50-76: Rename the misleading test function test_no_losers_raises
to a name that reflects the behavior being exercised (e.g.,
test_finds_loser_with_distinct_agents) and update its doc/comments accordingly;
the test uses make_conflict, make_position, make_resolution and asserts
find_losers returns "other-agent", so keep the assertion but change the function
name and any references to it, or alternatively if you intend to test the error
path, modify the test to construct a scenario that triggers the guard in
find_losers (see _helpers.py guard around lines 41-52) and assert it
raises—choose one approach and implement the corresponding change.
---
Duplicate comments:
In `@src/ai_company/communication/conflict_resolution/debate_strategy.py`:
- Around line 88-137: Do not call _determine_judge(conflict) unconditionally —
defer determining judge until you know whether the evaluator will be used: if
self._judge_evaluator is not None, call _determine_judge and pass that judge_id
into self._judge_evaluator.evaluate; if self._judge_evaluator is None, skip
_determine_judge, call self._authority_fallback(conflict) and avoid any
operation that could raise ConflictHierarchyError. Also ensure the returned
ConflictResolution.decided_by reflects the actual decision path: set decided_by
to judge_id only when the evaluator path was used, and set it to a clear
fallback marker (e.g. "authority_fallback") when using _authority_fallback.
In `@src/ai_company/communication/conflict_resolution/human_strategy.py`:
- Around line 38-70: The resolve method in human_strategy.py currently returns
ConflictResolution with outcome ConflictResolutionOutcome.ESCALATED_TO_HUMAN but
does not persist or enqueue any pending human decision and the reasoning message
is vague about the stub; update resolve to (1) create/persist a pending decision
or enqueue a human task (use your existing queue/persistence API) linked to
conflict.id and conflict.positions, (2) set ConflictResolution fields
(ConflictResolution.conflict_id, outcome, decided_by="human",
winning_agent_id=None, winning_position=None, resolved_at=None or omit until
decision) to reflect that resolution is pending, and (3) make the reasoning
field include a clear stub marker and a task/queue id or reference (use
CONFLICT_HUMAN_ESCALATION_STUB and CONFLICT_ESCALATED log calls to log the
created queue item id) so callers can track the pending decision.
- Around line 72-101: The method build_dissent_records is using DissentRecord
for every position in an escalated conflict which misrepresents "losing"
reasoning; change the implementation to either (A) create and return a distinct
PendingReviewRecord (new model) for each pos instead of DissentRecord when
ConflictResolutionStrategy.HUMAN/escalation is in play, or (B) if you must reuse
DissentRecord, add explicit metadata flags (e.g., ("pending_review", "true") and
("escalation_reason", "...")) and a clear field value (e.g., dissenting_agent_id
or dissenting_position set to a sentinel/nullable state) so callers can
distinguish pending-review records from true dissents; update
build_dissent_records to detect human-escalation and emit the appropriate record
type/metadata and update any consumers that expect DissentRecord accordingly.
In `@src/ai_company/communication/conflict_resolution/hybrid_strategy.py`:
- Around line 196-233: _authority_fallback currently calls
pick_highest_seniority directly and therefore ignores self._hierarchy, causing
ties to be broken differently than AuthorityResolver; update _authority_fallback
to reuse the same winner-selection logic as AuthorityResolver by either
delegating to an injected AuthorityResolver instance (add an authority_resolver
attribute initialized in __init__ and call authority_resolver.resolve or its
selection helper) or by extracting the tie-breaking/winner-selection logic into
a shared helper function used by both AuthorityResolver and _authority_fallback,
ensuring that hierarchy proximity (self._hierarchy) is consulted when seniority
ties occur.
- Around line 167-180: The inline creation of DissentRecord in HybridStrategy
bypasses HumanEscalationResolver logic; replace the generator that constructs
DissentRecord for each pos with a call to
self._human_resolver.build_dissent_records(conflict, resolution) and then
iterate the returned records to augment or override hybrid-specific fields (set
strategy_used to ConflictResolutionStrategy.HYBRID, add/merge metadata key
("escalation_reason","ambiguous_review"), ensure timestamp/ids preserved if
resolver provides them). This delegates queue/escalation metadata responsibility
to HumanEscalationResolver while ensuring hybrid-specific annotations are
applied.
In `@src/ai_company/communication/conflict_resolution/service.py`:
- Around line 155-167: The resolve() method currently always uses
self._config.strategy so per-conflict overrides aren't possible; change the
method signature to accept an optional strategy parameter (or compute a
selected_strategy by checking a per-department/conflict-type override lookup
using conflict.type or conflict.department) then set resolver =
self._resolvers.get(selected_strategy or self._config.strategy); if resolver is
None, raise ConflictResolutionError as before (preserving the existing message
and logger usage). Ensure all references to self._config.strategy in the
resolver selection are replaced with the new selected_strategy fallback logic so
per-conflict overrides take precedence.
In `@tests/unit/communication/conflict_resolution/test_authority_strategy.py`:
- Around line 23-254: Add a test exercising AuthorityResolver._pick_winner with
3+ participants: create a conflict via make_conflict and make_position with
three agents mixing SeniorityLevel (e.g., C_SUITE or SENIOR, LEAD, JUNIOR) so
the resolver must iterate pairwise and produce a single winning_agent_id; assert
the correct winner from await
AuthorityResolver(hierarchy=hierarchy).resolve(conflict) and verify
resolver.build_dissent_records(conflict, resolution) returns the expected number
of dissent records (>=2) and contains the losing agents' ids and positions.
Ensure the test name reflects "three_participant" or "n_party" and follows
existing patterns in TestAuthorityResolver* classes.
In `@tests/unit/communication/conflict_resolution/test_debate_strategy.py`:
- Around line 46-234: Add unit tests exercising
DebateConfig.max_tokens_per_argument and audit persistence: create a
DebateResolver with DebateConfig(max_tokens_per_argument=small_value,
judge="shared_manager") and a FakeJudgeEvaluator that returns long arguments so
you can assert DebateResolver.resolve enforces the token cap (e.g.,
trimmed/limited argument length or fewer tokens) and that the resulting
resolution.reasoning or resolution.arguments reflects the cap; additionally
verify build_dissent_records (and any audit/transcript fields on resolution,
e.g., resolution.transcript or resolution.decided_by) contains the full/stored
debate text so arguments and reasoning are persisted in the audit trail and
include judge attribution (use FakeJudgeEvaluator.calls and records returned by
build_dissent_records for checks).
In `@tests/unit/communication/conflict_resolution/test_service.py`:
- Around line 357-399: The test currently creates the AUTHORITY record on
`service` and the HUMAN record on `service_h`, so querying
`service.query_dissent_records(strategy=ConflictResolutionStrategy.AUTHORITY)`
would pass even if filtering is broken; fix it by creating both conflicts on the
same service instance returned by `_make_service` (i.e., call `create_conflict`
for both the AUTHORITY and HUMAN records on `service`), call `await
service.resolve(...)` for each, then call
`service.query_dissent_records(strategy=ConflictResolutionStrategy.AUTHORITY)`
and assert only the AUTHORITY record is returned and its `strategy_used` equals
`ConflictResolutionStrategy.AUTHORITY`.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: eb02a8d0-f1fc-4818-ba12-95e669d8ee74
📒 Files selected for processing (19)
src/ai_company/communication/__init__.pysrc/ai_company/communication/conflict_resolution/__init__.pysrc/ai_company/communication/conflict_resolution/_helpers.pysrc/ai_company/communication/conflict_resolution/authority_strategy.pysrc/ai_company/communication/conflict_resolution/debate_strategy.pysrc/ai_company/communication/conflict_resolution/human_strategy.pysrc/ai_company/communication/conflict_resolution/hybrid_strategy.pysrc/ai_company/communication/conflict_resolution/models.pysrc/ai_company/communication/conflict_resolution/protocol.pysrc/ai_company/communication/conflict_resolution/service.pysrc/ai_company/observability/events/conflict.pytests/unit/communication/conflict_resolution/test_authority_strategy.pytests/unit/communication/conflict_resolution/test_debate_strategy.pytests/unit/communication/conflict_resolution/test_helpers.pytests/unit/communication/conflict_resolution/test_human_strategy.pytests/unit/communication/conflict_resolution/test_hybrid_strategy.pytests/unit/communication/conflict_resolution/test_models.pytests/unit/communication/conflict_resolution/test_service.pytests/unit/observability/test_events.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not usefrom __future__ import annotationsin Python 3.14+ code — Python 3.14 has native PEP 649 lazy annotations
Useexcept A, B:(no parentheses) for exception syntax in Python 3.14 — PEP 758 except syntax enforced by ruff
All public functions and classes must have type hints; mypy strict mode is enforced
All public classes and functions must have Google-style docstrings; ruff D rules enforce this
Create new objects instead of mutating existing ones; use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement of non-Pydantic internal collections (registries, BaseTool)
For non-Pydantic internal collections and dict/list fields in frozen Pydantic models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use@computed_fieldfor derived values instead of storing redundant fields
Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly; never silently swallow errors
Validate at system boundaries (user input, external APIs, config files)
Line length must be 88 characters (enforced by ruff)
Files:
tests/unit/communication/conflict_resolution/test_hybrid_strategy.pytests/unit/observability/test_events.pytests/unit/communication/conflict_resolution/test_human_strategy.pysrc/ai_company/communication/conflict_resolution/debate_strategy.pysrc/ai_company/communication/conflict_resolution/hybrid_strategy.pysrc/ai_company/observability/events/conflict.pysrc/ai_company/communication/conflict_resolution/models.pysrc/ai_company/communication/conflict_resolution/human_strategy.pytests/unit/communication/conflict_resolution/test_models.pytests/unit/communication/conflict_resolution/test_debate_strategy.pytests/unit/communication/conflict_resolution/test_authority_strategy.pysrc/ai_company/communication/conflict_resolution/_helpers.pytests/unit/communication/conflict_resolution/test_service.pytests/unit/communication/conflict_resolution/test_helpers.pysrc/ai_company/communication/__init__.pysrc/ai_company/communication/conflict_resolution/authority_strategy.pysrc/ai_company/communication/conflict_resolution/__init__.pysrc/ai_company/communication/conflict_resolution/protocol.pysrc/ai_company/communication/conflict_resolution/service.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e, or@pytest.mark.slowas appropriate
Prefer@pytest.mark.parametrizefor testing similar cases
Use vendor-agnostic test names: test-provider, test-small-001, etc. instead of real vendor names
Files:
tests/unit/communication/conflict_resolution/test_hybrid_strategy.pytests/unit/observability/test_events.pytests/unit/communication/conflict_resolution/test_human_strategy.pytests/unit/communication/conflict_resolution/test_models.pytests/unit/communication/conflict_resolution/test_debate_strategy.pytests/unit/communication/conflict_resolution/test_authority_strategy.pytests/unit/communication/conflict_resolution/test_service.pytests/unit/communication/conflict_resolution/test_helpers.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic must importfrom ai_company.observability import get_loggerand createlogger = get_logger(__name__)
Never useimport logging/logging.getLogger()/print()in application code
Always use variable namelogger(not_logger, notlog) for the logger instance
Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
Always use structured logging with kwargs:logger.info(EVENT, key=value)— never use formatted strings likelogger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
DEBUG logging is for object creation, internal flow, and entry/exit of key functions
Files:
src/ai_company/communication/conflict_resolution/debate_strategy.pysrc/ai_company/communication/conflict_resolution/hybrid_strategy.pysrc/ai_company/observability/events/conflict.pysrc/ai_company/communication/conflict_resolution/models.pysrc/ai_company/communication/conflict_resolution/human_strategy.pysrc/ai_company/communication/conflict_resolution/_helpers.pysrc/ai_company/communication/__init__.pysrc/ai_company/communication/conflict_resolution/authority_strategy.pysrc/ai_company/communication/conflict_resolution/__init__.pysrc/ai_company/communication/conflict_resolution/protocol.pysrc/ai_company/communication/conflict_resolution/service.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases
Files:
src/ai_company/communication/conflict_resolution/debate_strategy.pysrc/ai_company/communication/conflict_resolution/hybrid_strategy.pysrc/ai_company/observability/events/conflict.pysrc/ai_company/communication/conflict_resolution/models.pysrc/ai_company/communication/conflict_resolution/human_strategy.pysrc/ai_company/communication/conflict_resolution/_helpers.pysrc/ai_company/communication/__init__.pysrc/ai_company/communication/conflict_resolution/authority_strategy.pysrc/ai_company/communication/conflict_resolution/__init__.pysrc/ai_company/communication/conflict_resolution/protocol.pysrc/ai_company/communication/conflict_resolution/service.py
🧠 Learnings (6)
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Fix all valid issues found by review agents — never skip or defer, including pre-existing issues in surrounding code
Applied to files:
tests/unit/communication/conflict_resolution/test_hybrid_strategy.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget); import directly
Applied to files:
tests/unit/observability/test_events.pysrc/ai_company/observability/events/conflict.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields
Applied to files:
src/ai_company/communication/conflict_resolution/models.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Every module with business logic must import `from ai_company.observability import get_logger` and create `logger = get_logger(__name__)`
Applied to files:
src/ai_company/communication/conflict_resolution/_helpers.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
src/ai_company/communication/conflict_resolution/_helpers.py
📚 Learning: 2026-03-07T19:34:16.191Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T19:34:16.191Z
Learning: Applies to src/ai_company/**/*.py : Always use variable name `logger` (not `_logger`, not `log`) for the logger instance
Applied to files:
src/ai_company/communication/conflict_resolution/_helpers.py
🧬 Code graph analysis (10)
tests/unit/communication/conflict_resolution/test_human_strategy.py (4)
src/ai_company/communication/conflict_resolution/human_strategy.py (3)
HumanEscalationResolver(27-102)resolve(38-70)build_dissent_records(72-102)src/ai_company/communication/conflict_resolution/models.py (1)
ConflictResolutionOutcome(29-42)src/ai_company/communication/enums.py (1)
ConflictResolutionStrategy(95-108)tests/unit/communication/conflict_resolution/conftest.py (1)
make_conflict(48-75)
src/ai_company/communication/conflict_resolution/debate_strategy.py (5)
src/ai_company/communication/conflict_resolution/_helpers.py (3)
find_losers(18-53)find_position_or_raise(75-107)pick_highest_seniority(110-125)src/ai_company/communication/conflict_resolution/config.py (1)
DebateConfig(9-28)src/ai_company/communication/conflict_resolution/models.py (4)
Conflict(67-111)ConflictResolution(114-166)ConflictResolutionOutcome(29-42)DissentRecord(169-235)src/ai_company/communication/delegation/hierarchy.py (1)
HierarchyResolver(16-267)src/ai_company/communication/enums.py (1)
ConflictResolutionStrategy(95-108)
src/ai_company/communication/conflict_resolution/hybrid_strategy.py (4)
src/ai_company/communication/conflict_resolution/_helpers.py (3)
find_losers(18-53)find_position(56-72)pick_highest_seniority(110-125)src/ai_company/communication/conflict_resolution/config.py (1)
HybridConfig(31-55)src/ai_company/communication/conflict_resolution/models.py (4)
Conflict(67-111)ConflictResolution(114-166)ConflictResolutionOutcome(29-42)DissentRecord(169-235)src/ai_company/communication/conflict_resolution/protocol.py (5)
ConflictResolver(30-67)JudgeEvaluator(70-91)resolve(38-47)evaluate(77-91)build_dissent_records(49-67)
src/ai_company/communication/conflict_resolution/models.py (2)
src/ai_company/communication/enums.py (2)
ConflictResolutionStrategy(95-108)ConflictType(75-92)src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)
src/ai_company/communication/conflict_resolution/human_strategy.py (2)
src/ai_company/communication/conflict_resolution/models.py (4)
Conflict(67-111)ConflictResolution(114-166)ConflictResolutionOutcome(29-42)DissentRecord(169-235)src/ai_company/communication/enums.py (1)
ConflictResolutionStrategy(95-108)
tests/unit/communication/conflict_resolution/test_models.py (4)
src/ai_company/communication/conflict_resolution/models.py (3)
ConflictResolutionOutcome(29-42)DissentRecord(169-235)is_cross_department(97-99)src/ai_company/communication/enums.py (2)
ConflictResolutionStrategy(95-108)ConflictType(75-92)src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)tests/unit/communication/conflict_resolution/conftest.py (3)
make_conflict(48-75)make_position(28-45)make_resolution(81-100)
tests/unit/communication/conflict_resolution/test_debate_strategy.py (1)
src/ai_company/communication/conflict_resolution/debate_strategy.py (3)
DebateResolver(48-262)resolve(74-137)build_dissent_records(139-166)
tests/unit/communication/conflict_resolution/test_authority_strategy.py (5)
src/ai_company/communication/conflict_resolution/authority_strategy.py (3)
AuthorityResolver(39-261)resolve(56-97)build_dissent_records(99-125)src/ai_company/communication/conflict_resolution/models.py (1)
ConflictResolutionOutcome(29-42)src/ai_company/communication/delegation/hierarchy.py (1)
HierarchyResolver(16-267)src/ai_company/communication/errors.py (1)
ConflictHierarchyError(111-112)src/ai_company/core/enums.py (1)
SeniorityLevel(6-21)
src/ai_company/communication/conflict_resolution/authority_strategy.py (6)
src/ai_company/communication/conflict_resolution/_helpers.py (1)
find_losers(18-53)src/ai_company/communication/conflict_resolution/models.py (5)
Conflict(67-111)ConflictPosition(45-64)ConflictResolution(114-166)ConflictResolutionOutcome(29-42)DissentRecord(169-235)src/ai_company/communication/enums.py (1)
ConflictResolutionStrategy(95-108)src/ai_company/communication/errors.py (1)
ConflictHierarchyError(111-112)src/ai_company/core/enums.py (1)
compare_seniority(27-40)src/ai_company/observability/_logger.py (1)
get_logger(8-28)
src/ai_company/communication/conflict_resolution/protocol.py (2)
src/ai_company/communication/conflict_resolution/models.py (3)
Conflict(67-111)ConflictResolution(114-166)DissentRecord(169-235)src/ai_company/communication/conflict_resolution/service.py (1)
resolve(139-202)
🔇 Additional comments (33)
src/ai_company/communication/conflict_resolution/_helpers.py (1)
1-125: LGTM!The helper module correctly implements the shared utilities with proper logging setup, structured logging with kwargs, and error paths that log at WARNING level before raising. The past review comment about missing logger has been addressed.
The
pick_highest_seniorityfunction assumes non-emptypositions(line 121), which is safe given theConflictmodel enforces a minimum of 2 positions via validation.tests/unit/communication/conflict_resolution/test_helpers.py (1)
78-147: LGTM!The remaining test classes provide good coverage:
TestFindPosition: Tests both found and not-found casesTestFindPositionOrRaise: Tests successful lookup and error raisingTestPickHighestSeniority: Tests seniority comparisons and first-wins tie-breaker behaviorsrc/ai_company/communication/conflict_resolution/__init__.py (1)
1-43: LGTM!Clean package initialization with a well-documented design decision to keep strategy implementations in their dedicated modules. The
__all__list is alphabetically sorted and exposes a focused public API surface.src/ai_company/observability/events/conflict.py (1)
1-40: LGTM!Comprehensive set of 21 event constants covering all conflict resolution scenarios. The constants are properly typed as
Final[str], follow the consistentconflict.category.actionnaming pattern, and are well-organized by category.tests/unit/observability/test_events.py (1)
24-46: LGTM!The test comprehensively covers all 21 conflict event constants, verifying each maps to its expected string identifier. The "conflict" domain is correctly added to the expected modules set for discovery validation.
Also applies to: 140-140, 229-251
tests/unit/communication/conflict_resolution/test_human_strategy.py (1)
18-55: LGTM!The test class provides good coverage of the
HumanEscalationResolver:
- Outcome verification (ESCALATED_TO_HUMAN)
- No winner fields validation
- Decided-by attribution
- Reasoning content check (stub reference)
- Dissent records for all positions with correct strategy and metadata
The async tests assume
pytest-asynciois configured withmode=autoto auto-mark async test functions.tests/unit/communication/conflict_resolution/test_models.py (1)
1-288: LGTM!Comprehensive model tests covering:
ConflictPosition: Immutability and blank value rejectionConflict: Position constraints, duplicate detection, cross-department computationConflictResolution: Outcome-dependent validation rules for winner fieldsDissentRecord: Cross-field consistency validationConflictResolutionOutcome: Enum completeness and string valuesGood use of
@pytest.mark.parametrizefor the outcome tests.tests/unit/communication/conflict_resolution/test_hybrid_strategy.py (2)
25-37: Add verification thatreview_agentis forwarded to the evaluator.The
FakeReviewEvaluatorreceivesjudge_agent_id(line 35) but doesn't capture it for assertion. This means tests cannot verify thatHybridConfig.review_agentis actually forwarded to the evaluator. A resolver that hardcodes the review agent would still pass all current tests.🧪 Suggested enhancement
class FakeReviewEvaluator: """Fake review evaluator for testing.""" def __init__(self, winner_id: str, reasoning: str = "Review decided") -> None: self._winner_id = winner_id self._reasoning = reasoning + self.last_judge_agent_id: str | None = None async def evaluate( self, conflict: Conflict, judge_agent_id: str, ) -> JudgeDecision: + self.last_judge_agent_id = judge_agent_id return JudgeDecision(self._winner_id, self._reasoning)Then add a test that sets a non-default
review_agentand asserts it reaches the evaluator.
40-178: LGTM!Good coverage of
HybridResolverscenarios:
- Auto-resolve with clear winner yielding
RESOLVED_BY_HYBRID- Ambiguity handling with configurable escalation vs authority fallback
- No-evaluator fallback to authority
- Dissent record generation for both resolved and escalated outcomes
The tests properly verify outcomes, winning agents, and dissent record metadata.
tests/unit/communication/conflict_resolution/test_debate_strategy.py (2)
29-44: LGTM - FakeJudgeEvaluator is well-designed.Clean test double that records calls for assertion while returning configurable results.
262-327: Good coverage of N-party scenarios.The three-party tests properly validate iterative LCM computation and multi-record dissent generation.
src/ai_company/communication/conflict_resolution/protocol.py (3)
1-6: Design rationale documented — detection logic intentionally excluded from protocol.The module docstring explicitly states detection logic lives on the service because it's strategy-agnostic. This is a valid architectural choice that keeps strategies focused on resolution.
18-28: LGTM - JudgeDecision is appropriately simple.NamedTuple is a good choice for this immutable return type.
30-67: Protocol surface is narrower than original requirements but documented.The linked issue requested
detect_conflict,resolve, andlog_dissenton the protocol. This implementation only includesresolveandbuild_dissent_records, with detection/logging handled by the service. The deviation is documented in the module docstring. If this is intentional, no change needed; otherwise, consider addinglog_dissentfor strategies that need custom persistence.src/ai_company/communication/conflict_resolution/hybrid_strategy.py (1)
78-130: LGTM - Resolution flow is well-structured.Clean separation between evaluator path and fallback, with proper error handling and structured logging.
src/ai_company/communication/conflict_resolution/service.py (2)
62-72: LGTM - Proper immutability for resolvers mapping.Using
MappingProxyTypeprevents external mutation of the registry.
74-137: LGTM - Solid conflict creation with validation.Good early validation of position count and unique agent IDs before constructing the model, with appropriate structured logging.
src/ai_company/communication/conflict_resolution/debate_strategy.py (2)
139-166: LGTM - Dissent record construction is correct.Properly creates one record per loser with appropriate metadata including the judge.
168-219: Good N-party LCM computation for shared_manager judge.Iterative LCM folding correctly handles 3+ participants.
src/ai_company/communication/conflict_resolution/authority_strategy.py (6)
1-36: LGTM!Module structure is well-organized with proper imports, observability setup, and clear docstring. The use of
__slots__in the class definition is a good practice for memory efficiency.
56-97: LGTM!The
resolvemethod correctly handles N-party conflicts by including all losers in the reasoning (line 85). The logging follows the guidelines with INFO level for state transitions and structured kwargs.
99-125: LGTM!The method correctly creates one
DissentRecordper overruled agent, properly supporting N-party conflicts. Good use of thefind_losershelper to avoid duplication.
127-159: LGTM!The pairwise iteration approach for N-party conflicts is correct. The seniority comparison followed by hierarchy tiebreaker is well-implemented. The comment on line 152 clarifies the incumbent-favoring behavior on equal depth.
161-216: LGTM!The hierarchy tiebreaker logic is correct. Proper logging levels are used: DEBUG for LCM lookup (internal flow), WARNING before raising
ConflictHierarchyError. The depth comparison correctly favors the incumbent (pos_a) on equal depth.
218-261: LGTM!The edge case handling is correct: when
get_delegation_depthreturnsNone, the method checks if the agent IS the LCM (depth 0) before raising an error for truly unreachable agents. The docstring clearly documents this behavior.src/ai_company/communication/__init__.py (2)
17-42: LGTM!The conflict resolution public API is properly exported. The import structure follows the existing patterns in the file, with the main module exports from
conflict_resolutionand individual strategy modules imported separately.
103-184: LGTM!The
__all__list is correctly updated with all new conflict resolution exports while maintaining alphabetical ordering. All new types are properly exposed for external consumption.src/ai_company/communication/conflict_resolution/models.py (6)
1-27: LGTM!Module structure follows established patterns. Proper imports from Pydantic v2 with all necessary components. The
_MIN_POSITIONSconstant is appropriately defined at module level.
29-42: LGTM!The
ConflictResolutionOutcomeenum correctly defines the four possible resolution outcomes, with clear docstrings following the established pattern.
45-64: LGTM!The
ConflictPositionmodel correctly usesNotBlankStrfor all identifier/text fields,SeniorityLevelenum foragent_level, andAwareDatetimefor the timestamp. The model is properly frozen.
67-111: LGTM!The
Conflictmodel correctly uses@computed_fieldfor the derivedis_cross_departmentproperty. The validator properly enforces both the minimum position count and unique agent IDs. Usingtuple[ConflictPosition, ...]for positions ensures immutability.
114-166: LGTM!The
ConflictResolutionmodel has a robust validator that enforces consistency between theoutcomeand winner fields. The logic correctly handles the ESCALATED_TO_HUMAN case where no winner is determined yet.
169-235: LGTM!The
DissentRecordmodel provides comprehensive validation:
- Verifies the dissenting agent exists in the conflict
- Ensures resolution and conflict IDs match
- Prevents winner from being marked as dissenter (except for human escalation)
The
metadatafield correctly uses nested tuples for immutability in a frozen model.
- Fix misleading DissentRecord docstring (all positions, not just first) - CEO judge selection iterates all positions for N-party conflicts - Rename misleading test_no_losers_raises to test_finds_loser_with_distinct_agents - Resolve merge conflict in README.md (keep both feature bullets)
|
|
||
| def compare_seniority(a: SeniorityLevel, b: SeniorityLevel) -> int: | ||
| """Compare two seniority levels. | ||
|
|
There was a problem hiding this comment.
compare_seniority silently depends on enum declaration order
_SENIORITY_ORDER = tuple(SeniorityLevel) derives the ranking from the enum's declaration order, not from any explicit value. This is an undocumented, implicit invariant: the current order (JUNIOR → MID → SENIOR → LEAD → C_SUITE) happens to be ascending, but nothing enforces it.
If a future contributor inserts a new level (e.g. PRINCIPAL) alphabetically at the bottom of the enum, compare_seniority will silently produce wrong results — C_SUITE would compare as junior to PRINCIPAL. This is particularly risky because StrEnum members with alphabetically ordered string values look "naturally" sorted but can be declared in any order.
Consider either using explicit integer values for ordering or adding a comment + assertion to document the invariant:
_SENIORITY_ORDER: tuple[SeniorityLevel, ...] = (
# IMPORTANT: declaration order must be ascending seniority.
# Adding a new level? Insert it at the correct rank here.
SeniorityLevel.JUNIOR,
SeniorityLevel.MID,
SeniorityLevel.SENIOR,
SeniorityLevel.LEAD,
SeniorityLevel.C_SUITE,
)
assert _SENIORITY_ORDER == tuple(SeniorityLevel), (
"SeniorityLevel enum order does not match _SENIORITY_ORDER; "
"update one or the other"
)Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/core/enums.py
Line: 29
Comment:
**`compare_seniority` silently depends on enum declaration order**
`_SENIORITY_ORDER = tuple(SeniorityLevel)` derives the ranking from the enum's *declaration order*, not from any explicit value. This is an undocumented, implicit invariant: the current order (`JUNIOR → MID → SENIOR → LEAD → C_SUITE`) happens to be ascending, but nothing enforces it.
If a future contributor inserts a new level (e.g. `PRINCIPAL`) alphabetically at the bottom of the enum, `compare_seniority` will silently produce wrong results — `C_SUITE` would compare as *junior* to `PRINCIPAL`. This is particularly risky because `StrEnum` members with alphabetically ordered string values look "naturally" sorted but can be declared in any order.
Consider either using explicit integer values for ordering or adding a comment + assertion to document the invariant:
```python
_SENIORITY_ORDER: tuple[SeniorityLevel, ...] = (
# IMPORTANT: declaration order must be ascending seniority.
# Adding a new level? Insert it at the correct rank here.
SeniorityLevel.JUNIOR,
SeniorityLevel.MID,
SeniorityLevel.SENIOR,
SeniorityLevel.LEAD,
SeniorityLevel.C_SUITE,
)
assert _SENIORITY_ORDER == tuple(SeniorityLevel), (
"SeniorityLevel enum order does not match _SENIORITY_ORDER; "
"update one or the other"
)
```
How can I resolve this? If you propose a fix, please make it concise.| ConflictPosition, | ||
| ConflictResolution, | ||
| DissentRecord, | ||
| ) | ||
| from ai_company.communication.conflict_resolution.protocol import ( # noqa: TC001 | ||
| ConflictResolver, | ||
| ) |
There was a problem hiding this comment.
Private module symbol _MIN_POSITIONS imported across module boundary
_MIN_POSITIONS carries a leading underscore, signalling it is an internal implementation detail of models.py. Importing it in service.py creates fragile coupling between two modules via a private constant.
The Conflict model's _validate_positions validator already enforces the same constraint; the service's early check is purely for producing a richer ConflictResolutionError before Pydantic raises. An inline literal with a clarifying comment achieves the same goal without leaking the private symbol:
_CONFLICT_MIN_POSITIONS = 2 # mirrors Conflict._validate_positions
if len(positions) < _CONFLICT_MIN_POSITIONS:
...Or, expose MIN_POSITIONS (no leading underscore) from models.py as part of the public API.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/communication/conflict_resolution/service.py
Line: 20-26
Comment:
**Private module symbol `_MIN_POSITIONS` imported across module boundary**
`_MIN_POSITIONS` carries a leading underscore, signalling it is an *internal* implementation detail of `models.py`. Importing it in `service.py` creates fragile coupling between two modules via a private constant.
The `Conflict` model's `_validate_positions` validator already enforces the same constraint; the service's early check is purely for producing a richer `ConflictResolutionError` before Pydantic raises. An inline literal with a clarifying comment achieves the same goal without leaking the private symbol:
```python
_CONFLICT_MIN_POSITIONS = 2 # mirrors Conflict._validate_positions
if len(positions) < _CONFLICT_MIN_POSITIONS:
...
```
Or, expose `MIN_POSITIONS` (no leading underscore) from `models.py` as part of the public API.
How can I resolve this? If you propose a fix, please make it concise.| Raises: | ||
| ConflictStrategyError: If agent is not found in positions. | ||
| """ | ||
| pos = find_position(conflict, agent_id) | ||
| if pos is not None: | ||
| return pos | ||
| msg = f"Agent {agent_id!r} not found in conflict positions" | ||
| logger.warning( | ||
| CONFLICT_STRATEGY_ERROR, |
There was a problem hiding this comment.
pick_highest_seniority has no hierarchy tiebreaker, inconsistent with AuthorityResolver
pick_highest_seniority returns the first position when seniority levels are equal — a positional tiebreak. AuthorityResolver._pick_winner() uses hierarchy proximity (LCM depth) for tied seniority, while this helper (used by DebateResolver._authority_fallback and HybridResolver._authority_fallback) does not.
This means for a conflict between two SENIOR agents, the authority fallback path in Debate and Hybrid strategies can produce a different winner than calling AuthorityResolver.resolve() on the exact same conflict. For users auditing why an agent lost, the discrepancy between "won via debate fallback" vs. "would have won via authority" may be surprising.
If the intent is that all authority-style resolutions are consistent, the fallback should use the same tiebreaker. If the simplified positional tiebreak is intentional for these fallback paths, a docstring note explaining the deliberate divergence from AuthorityResolver would help future maintainers.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/communication/conflict_resolution/_helpers.py
Line: 88-96
Comment:
**`pick_highest_seniority` has no hierarchy tiebreaker, inconsistent with `AuthorityResolver`**
`pick_highest_seniority` returns the *first* position when seniority levels are equal — a positional tiebreak. `AuthorityResolver._pick_winner()` uses hierarchy proximity (LCM depth) for tied seniority, while this helper (used by `DebateResolver._authority_fallback` and `HybridResolver._authority_fallback`) does not.
This means for a conflict between two `SENIOR` agents, the authority fallback path in Debate and Hybrid strategies can produce a *different winner* than calling `AuthorityResolver.resolve()` on the exact same conflict. For users auditing why an agent lost, the discrepancy between "won via debate fallback" vs. "would have won via authority" may be surprising.
If the intent is that all authority-style resolutions are consistent, the fallback should use the same tiebreaker. If the simplified positional tiebreak is intentional for these fallback paths, a docstring note explaining the deliberate divergence from `AuthorityResolver` would help future maintainers.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Pull request overview
Copilot reviewed 33 out of 34 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def get_lowest_common_manager( | ||
| self, | ||
| agent_a: str, | ||
| agent_b: str, | ||
| ) -> str | None: | ||
| """Find the lowest common manager of two agents. | ||
|
|
||
| If one agent is an ancestor of the other, that agent is | ||
| returned as the LCM. | ||
|
|
||
| Args: | ||
| agent_a: First agent name. | ||
| agent_b: Second agent name. | ||
|
|
||
| Returns: | ||
| Name of the lowest common manager, or None if no | ||
| common manager exists. | ||
| """ | ||
| ancestors_a = self.get_ancestors(agent_a) | ||
| ancestors_b_set = set(self.get_ancestors(agent_b)) | ||
| # Check if agent_a is an ancestor of agent_b | ||
| if agent_a in ancestors_b_set: | ||
| return agent_a | ||
| # Check if agent_b is an ancestor of agent_a | ||
| if agent_b in set(ancestors_a): | ||
| return agent_b | ||
| # Walk agent_a's ancestors bottom-up; first hit in agent_b's | ||
| # ancestor set is the LCM | ||
| for ancestor in ancestors_a: | ||
| if ancestor in ancestors_b_set: | ||
| return ancestor | ||
| return None |
There was a problem hiding this comment.
get_lowest_common_manager() returns an incorrect result when agent_a == agent_b: it currently returns the agent’s first ancestor (or another shared ancestor) instead of returning the agent itself. This breaks the definition of “lowest common manager” and can cause the iterative LCM computation in DebateResolver to climb higher than necessary. Add an early if agent_a == agent_b: return agent_a guard (and consider avoiding recomputing sets).
| losers = tuple( | ||
| pos for pos in conflict.positions if pos.agent_id != resolution.winning_agent_id | ||
| ) | ||
| if not losers: | ||
| msg = f"No losing position found for winner {resolution.winning_agent_id!r}" | ||
| logger.warning( | ||
| CONFLICT_STRATEGY_ERROR, | ||
| conflict_id=conflict.id, | ||
| winning_agent_id=resolution.winning_agent_id, | ||
| error=msg, | ||
| ) | ||
| raise ConflictStrategyError( | ||
| msg, | ||
| context={"conflict_id": conflict.id}, | ||
| ) | ||
| return losers |
There was a problem hiding this comment.
find_losers() only errors when no losers exist, but it does not catch the data-integrity case where resolution.winning_agent_id is not present in conflict.positions (it would incorrectly treat all positions as losers). Consider validating that winning_agent_id is one of the participants (and raising ConflictStrategyError if not) before computing losers.
| │ │ ├── config.py # Communication config | ||
| │ │ ├── conflict_resolution/ # Conflict resolution subsystem (§5.6) | ||
| │ │ │ ├── __init__.py # Package exports | ||
| │ │ │ ├── _helpers.py # Shared utility (find_loser) |
There was a problem hiding this comment.
The directory listing comment says _helpers.py is a shared utility for find_loser, but the implementation exports find_losers (plural). Updating this name in the spec will prevent contributors from searching for a non-existent helper.
| │ │ │ ├── _helpers.py # Shared utility (find_loser) | |
| │ │ │ ├── _helpers.py # Shared utility (find_losers) |
| | **Pydantic alias for YAML directives** | Adopted (M2.5) | `Field(alias="_remove")` in `TemplateAgentConfig` — YAML uses `_remove: true`, Python accesses `agent.remove`. Keeps the YAML-facing name (underscore prefix signals internal directive) separate from the Python attribute name. | Underscore-prefixed YAML keys signal merge directives vs regular fields. Pydantic alias bridges the naming convention gap cleanly. | | ||
| | **Communication foundation** | Adopted (M4) | `MessageBus` protocol with `InMemoryMessageBus` backend (asyncio queues, pull-model `receive()` with shutdown signaling via `asyncio.Event`). `MessageDispatcher` routes to concurrent handlers via `asyncio.TaskGroup` with pre-allocated error collection. `AgentMessenger` per-agent facade auto-fills sender/timestamp/ID; deterministic direct-channel naming `@{sorted_a}:{sorted_b}`. `DeliveryEnvelope` for delivery tracking. `NotBlankStr` validation on all protocol boundary identifiers. | Pull-model avoids callback complexity and enables agents to consume at their own pace. Protocol + backend split enables future persistent/distributed bus implementations. Deterministic DM channel names prevent duplicates. See §5. | | ||
| | **Delegation & loop prevention** | Adopted (M4) | `HierarchyResolver` resolves org hierarchy from `Company` at construction (cycle-detected, `MappingProxyType`-frozen). `AuthorityValidator` checks chain-of-command + role permissions. `DelegationGuard` orchestrates five mechanisms (ancestry, depth, dedup, rate limit, circuit breaker) in sequence, short-circuiting on first rejection. `DelegationService` is synchronous (CPU-only); messaging integration deferred. Stateful mechanisms use injectable clock for deterministic testing. Task model extended with `parent_task_id` and `delegation_chain` fields. | Synchronous delegation avoids async complexity for CPU-only validation. Five-mechanism guard provides defence-in-depth against all loop patterns. Injectable clocks enable deterministic testing. See §5.4, §5.5. | | ||
| | **Conflict resolution** | Adopted (M4) | `ConflictResolver` protocol with async `resolve()` + sync `build_dissent_record()` split (resolve may call LLM, dissent record is pure construction). Four strategies: `AuthorityResolver` (seniority comparison iterating all N positions, hierarchy proximity tiebreaker via `get_lowest_common_manager`), `DebateResolver` (LLM judge via `JudgeEvaluator` protocol, authority fallback when absent), `HumanEscalationResolver` (stub, returns `ESCALATED_TO_HUMAN`), `HybridResolver` (LLM review + ambiguity escalation/authority fallback). `ConflictResolutionService` follows `DelegationService` pattern (`__slots__`, keyword-only constructor, `MappingProxyType`-wrapped resolver mapping, audit trail). `DissentRecord` preserves losing agent's reasoning. `Conflict.is_cross_department` is a `@computed_field` derived from positions. `HierarchyResolver` extended with `get_lowest_common_manager()` and `get_delegation_depth()`. | Protocol + strategy pattern enables adding new resolution approaches without modifying existing code. Async resolve accommodates LLM calls; sync dissent record avoids unnecessary async overhead. Shared `find_loser` utility prevents code duplication across strategies. See §5.6. | |
There was a problem hiding this comment.
In the conventions table, the conflict-resolution entry references a build_dissent_record() method and a shared find_loser helper, but the actual protocol defines build_dissent_records() and the helper is find_losers. Please update these identifiers to match the code to avoid mismatches between DESIGN_SPEC and the public API.
| | **Conflict resolution** | Adopted (M4) | `ConflictResolver` protocol with async `resolve()` + sync `build_dissent_record()` split (resolve may call LLM, dissent record is pure construction). Four strategies: `AuthorityResolver` (seniority comparison iterating all N positions, hierarchy proximity tiebreaker via `get_lowest_common_manager`), `DebateResolver` (LLM judge via `JudgeEvaluator` protocol, authority fallback when absent), `HumanEscalationResolver` (stub, returns `ESCALATED_TO_HUMAN`), `HybridResolver` (LLM review + ambiguity escalation/authority fallback). `ConflictResolutionService` follows `DelegationService` pattern (`__slots__`, keyword-only constructor, `MappingProxyType`-wrapped resolver mapping, audit trail). `DissentRecord` preserves losing agent's reasoning. `Conflict.is_cross_department` is a `@computed_field` derived from positions. `HierarchyResolver` extended with `get_lowest_common_manager()` and `get_delegation_depth()`. | Protocol + strategy pattern enables adding new resolution approaches without modifying existing code. Async resolve accommodates LLM calls; sync dissent record avoids unnecessary async overhead. Shared `find_loser` utility prevents code duplication across strategies. See §5.6. | | |
| | **Conflict resolution** | Adopted (M4) | `ConflictResolver` protocol with async `resolve()` + sync `build_dissent_records()` split (resolve may call LLM, dissent record is pure construction). Four strategies: `AuthorityResolver` (seniority comparison iterating all N positions, hierarchy proximity tiebreaker via `get_lowest_common_manager`), `DebateResolver` (LLM judge via `JudgeEvaluator` protocol, authority fallback when absent), `HumanEscalationResolver` (stub, returns `ESCALATED_TO_HUMAN`), `HybridResolver` (LLM review + ambiguity escalation/authority fallback). `ConflictResolutionService` follows `DelegationService` pattern (`__slots__`, keyword-only constructor, `MappingProxyType`-wrapped resolver mapping, audit trail). `DissentRecord` preserves losing agent's reasoning. `Conflict.is_cross_department` is a `@computed_field` derived from positions. `HierarchyResolver` extended with `get_lowest_common_manager()` and `get_delegation_depth()`. | Protocol + strategy pattern enables adding new resolution approaches without modifying existing code. Async resolve accommodates LLM calls; sync dissent record avoids unnecessary async overhead. Shared `find_losers` utility prevents code duplication across strategies. See §5.6. | |
🤖 I have created a release *beep* *boop* --- ## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
🤖 I have created a release *beep* *boop* --- ## [0.1.0](v0.0.0...v0.1.0) (2026-03-11) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15)) * bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e)) * bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238)) * bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e)) * bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7)) * bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Summary
agent_id,conflict_type,strategy,since)DissentRecordpreserving the overruled position, strategy used, and metadata_helpers.py):find_loser,find_position,find_position_or_raise,pick_highest_seniority— eliminates duplication across strategiesis_cross_departmentas@computed_fieldderived from positions rather than stored redundantlyMappingProxyTypefor resolver registry, frozen Pydantic models throughoutobservability/events/conflict.py, WARNING before all raises, INFO for state transitionsCloses #122
Test plan
Review coverage