feat: engine intelligence v2 -- trace enrichment, compaction, versioning eval #1139
No actionable comments were generated in the recent review. 🎉
Walkthrough
This PR introduces per-agent attribution (new …
🚥 Pre-merge checks: ✅ 5 passed
Dependency Review
✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.
Snapshot Warnings: Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice.
Scanned Files: None
Code Review
This pull request introduces structural credit assignment for multi-agent coordination and enhances context compaction with agent-controlled triggers and reasoning preservation. Key additions include a per-agent contribution scoring system, failure attribution models, and a compact_context tool. The compaction logic now detects "epistemic markers" (e.g., hedging, reconsideration) to maintain reasoning integrity during summarization. Review feedback identifies a logic error in sentence extraction that could lead to empty results for long sentences, redundant length checks in the epistemic detection module, and opportunities to reduce code duplication by refactoring the compaction functions.
| if total_len + len(stripped) + 2 > max_chars: | ||
| break |
| ws_event_type = ( | ||
| WsEventType.COORDINATION_COMPLETED | ||
| if result.is_success | ||
| if attributed.is_success | ||
| else WsEventType.COORDINATION_FAILED | ||
| ) | ||
| _publish_ws_event( | ||
| request, | ||
| ws_event_type, | ||
| { | ||
| "task_id": task_id, | ||
| "topology": result.topology.value, | ||
| "is_success": result.is_success, | ||
| "total_duration_seconds": result.total_duration_seconds, | ||
| "topology": attributed.result.topology.value, | ||
| "is_success": attributed.is_success, | ||
| "total_duration_seconds": attributed.result.total_duration_seconds, | ||
| }, | ||
| ) | ||
| log_event = ( | ||
| API_COORDINATION_COMPLETED if result.is_success else API_COORDINATION_FAILED | ||
| API_COORDINATION_COMPLETED | ||
| if attributed.is_success | ||
| else API_COORDINATION_FAILED | ||
| ) | ||
| log_fn = logger.info if result.is_success else logger.warning | ||
| log_fn = logger.info if attributed.is_success else logger.warning | ||
| log_fn( | ||
| log_event, | ||
| task_id=task_id, | ||
| topology=result.topology.value, | ||
| is_success=result.is_success, | ||
| total_duration_seconds=result.total_duration_seconds, | ||
| topology=attributed.result.topology.value, | ||
| is_success=attributed.is_success, | ||
| total_duration_seconds=attributed.result.total_duration_seconds, | ||
| ) |
To improve readability and reduce repetition, you could extract attributed.result and attributed.is_success into local variables at the beginning of this block. This would make the subsequent logic for WebSocket events and logging cleaner and easier to follow.
result = attributed.result
is_success = attributed.is_success
ws_event_type = (
    WsEventType.COORDINATION_COMPLETED
    if is_success
    else WsEventType.COORDINATION_FAILED
)
_publish_ws_event(
    request,
    ws_event_type,
    {
        "task_id": task_id,
        "topology": result.topology.value,
        "is_success": is_success,
        "total_duration_seconds": result.total_duration_seconds,
    },
)
log_event = (
    API_COORDINATION_COMPLETED
    if is_success
    else API_COORDINATION_FAILED
)
log_fn = logger.info if is_success else logger.warning
log_fn(
    log_event,
    task_id=task_id,
    topology=result.topology.value,
    is_success=is_success,
    total_duration_seconds=result.total_duration_seconds,
)
| if len(joined) > max_chars: | ||
| return joined[:max_chars] + "..." |
| def force_compaction( | ||
| ctx: AgentContext, | ||
| config: CompactionConfig, | ||
| estimator: PromptTokenEstimator, | ||
| ) -> AgentContext | None: | ||
| """Compact context without checking the fill threshold. | ||
|
|
||
| Used when an agent explicitly requests compaction via the | ||
| ``compact_context`` tool. Skips the threshold check but | ||
| still enforces minimum message count and recent turn | ||
| preservation. | ||
|
|
||
| Args: | ||
| ctx: Current agent context. | ||
| config: Compaction configuration. | ||
| estimator: Token estimator. | ||
|
|
||
| Returns: | ||
| Compacted context, or ``None`` if too few messages. | ||
| """ | ||
| conversation = ctx.conversation | ||
| if len(conversation) < config.min_messages_to_compact: | ||
| logger.debug( | ||
| CONTEXT_BUDGET_COMPACTION_SKIPPED, | ||
| execution_id=ctx.execution_id, | ||
| reason="too_few_messages_for_forced_compaction", | ||
| message_count=len(conversation), | ||
| ) | ||
| return None | ||
|
|
||
| logger.info( | ||
| CONTEXT_BUDGET_COMPACTION_STARTED, | ||
| execution_id=ctx.execution_id, | ||
| fill_percent=ctx.context_fill_percent, | ||
| message_count=len(conversation), | ||
| forced=True, | ||
| ) | ||
|
|
||
| split = _split_conversation(ctx, config) | ||
| if split is None: | ||
| return None | ||
| head, archivable, recent = split | ||
|
|
||
| task_complexity = _extract_task_complexity(ctx) | ||
| compressed, metadata, summary_tokens = _compress( | ||
| ctx, | ||
| head, | ||
| archivable, | ||
| recent, | ||
| estimator, | ||
| preserve_markers=config.preserve_epistemic_markers, | ||
| task_complexity=task_complexity, | ||
| ) | ||
|
|
||
| new_fill = estimator.estimate_conversation_tokens(compressed) | ||
| logger.info( | ||
| CONTEXT_BUDGET_COMPACTION_COMPLETED, | ||
| execution_id=ctx.execution_id, | ||
| original_messages=len(conversation), | ||
| compacted_messages=len(compressed), | ||
| archived_turns=metadata.archived_turns, | ||
| summary_tokens=summary_tokens, | ||
| compactions_total=metadata.compactions_performed, | ||
| forced=True, | ||
| ) | ||
| return ctx.with_compression(metadata, compressed, new_fill) |
The force_compaction function duplicates a significant amount of logic from _do_compaction. The only substantial difference is that force_compaction bypasses the initial fill threshold check.
To improve maintainability and reduce code duplication, you could refactor this by adding a force: bool = False parameter to _do_compaction and conditionally skipping the threshold check. force_compaction would then become a simple wrapper around _do_compaction(..., force=True).
Pull request overview
This PR upgrades the engine “intelligence v2” pipeline by enriching execution traces (node typing + coordination attribution) and improving context compaction (agent-initiated compaction directives + epistemic marker preservation), with supporting observability events and tests.
Changes:
- Add per-turn `NodeType` tagging on `TurnRecord`, with auto-derivation in `make_turn_record()`.
- Introduce coordination attribution models + `CoordinationResultWithAttribution`, wire attribution building into the coordinator, and record contributions in `PerformanceTracker`.
- Add epistemic marker detection + summarizer changes, plus a new `compact_context` tool for agent-initiated compaction; update tool factory and tests.
Reviewed changes
Copilot reviewed 28 out of 29 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/unit/tools/test_factory.py | Updates expected default tool list to include compact_context. |
| tests/unit/tools/test_factory_sandbox_wiring.py | Adjusts expected tool count (+1) for sandbox wiring tests. |
| tests/unit/tools/test_factory_new_categories.py | Updates default tool count expectation (15 tools). |
| tests/unit/tools/context/test_compact_context.py | Adds unit tests for CompactContextTool schema + behavior. |
| tests/unit/tools/context/__init__.py | Marks the context tools test package. |
| tests/unit/engine/test_coordination_service.py | Updates tests to handle attributed coordination results. |
| tests/unit/engine/coordination/test_attribution.py | Adds unit tests for attribution wrapper/models. |
| tests/unit/engine/coordination/test_attribution_factory.py | Adds unit tests for build_agent_contributions(). |
| tests/unit/engine/compaction/test_summarizer_markers.py | Adds tests for epistemic marker preservation + force_compaction(). |
| tests/unit/engine/compaction/test_models.py | Adds tests for new CompactionConfig fields/validation. |
| tests/unit/engine/compaction/test_epistemic.py | Adds tests for epistemic marker detection helpers. |
| tests/unit/api/controllers/test_coordination.py | Updates controller tests to use attributed coordination results. |
| src/synthorg/tools/factory.py | Registers CompactContextTool in build_default_tools(). |
| src/synthorg/tools/context/compact_context.py | Implements compact_context tool that emits a compaction directive via metadata. |
| src/synthorg/tools/context/__init__.py | Adds context tool package marker. |
| src/synthorg/observability/events/coordination.py | Adds coordination attribution event constant. |
| src/synthorg/observability/events/context_budget.py | Adds events for agent compaction request + epistemic marker preservation. |
| src/synthorg/hr/performance/tracker.py | Adds storage/logging for per-agent coordination contributions. |
| src/synthorg/engine/loop_protocol.py | Adds NodeType enum and TurnRecord.node_types. |
| src/synthorg/engine/loop_helpers.py | Derives and records node types in make_turn_record(). |
| src/synthorg/engine/coordination/service.py | Returns attributed coordination results and records contributions in tracker. |
| src/synthorg/engine/coordination/attribution.py | Introduces attribution models + contribution factory/scoring logic. |
| src/synthorg/engine/coordination/__init__.py | Re-exports new attribution types/factory. |
| src/synthorg/engine/compaction/summarizer.py | Adds epistemic marker preservation + force_compaction() and agent-controlled threshold behavior. |
| src/synthorg/engine/compaction/models.py | Extends CompactionConfig with agent-controlled fields + validators. |
| src/synthorg/engine/compaction/epistemic.py | Adds marker detection + extraction utilities. |
| src/synthorg/engine/agent_engine.py | Updates coordination return type to CoordinationResultWithAttribution. |
| src/synthorg/engine/__init__.py | Re-exports new coordination attribution types. |
| src/synthorg/api/controllers/coordination.py | Adjusts controller flow for attributed results while preserving API response shape. |
| ): | ||
| marker_text = extract_marker_sentences(cleaned) | ||
| if marker_text: | ||
| snippets.append(marker_text) |
In the epistemic-marker preservation path, marker_text is appended to the summary without running through sanitize_message. That reintroduces URLs/paths/prompt-injection fragments into the SYSTEM compaction summary (the earlier implementation explicitly sanitized archived assistant content). Consider sanitizing the extracted marker sentences (while keeping the marker words) before adding them to snippets (and/or sanitizing inside extract_marker_sentences).
| snippets.append(marker_text) | |
| sanitized_marker_text = sanitize_message( | |
| marker_text, | |
| max_length=max(len(marker_text), 1), | |
| ) | |
| snippets.append(sanitized_marker_text) |
| CONTEXT_BUDGET_AGENT_COMPACTION_REQUESTED, | ||
| strategy=strategy, | ||
| preserve_markers=preserve_markers, | ||
| reason=reason, |
CompactContextTool.execute() logs the free-form reason string verbatim. Since reason is LLM/user-authored text, it can contain URLs/paths or other sensitive fragments that won't be redacted by the structured-log key sanitizer. Consider logging a sanitized/redacted form (e.g., sanitize_message(reason, ...)) and/or omitting it from logs while still returning it in tool metadata if needed.
| reason=reason, |
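One way to log a bounded form of `reason` is sketched below. The helper `redact_reason`, its regular expressions, and the 120-character limit are all hypothetical; the project's own `sanitize_message()` may behave differently and would normally be preferred if its signature fits.

```python
import re

# Hypothetical patterns; the project's sanitize_message() may differ.
_URL_RE = re.compile(r"https?://\S+")
# A whitespace-delimited token that starts like an absolute path.
_PATH_RE = re.compile(r"(?<!\S)(?:/|[A-Za-z]:\\)\S+")
_MAX_REASON_CHARS = 120


def redact_reason(reason: str, max_chars: int = _MAX_REASON_CHARS) -> str:
    """Redact URLs and absolute paths, collapse whitespace, then truncate."""
    text = _URL_RE.sub("[url]", reason)
    text = _PATH_RE.sub("[path]", text)
    text = " ".join(text.split())
    if len(text) > max_chars:
        text = text[:max_chars].rstrip() + "..."
    return text
```

The log call would then receive `reason=redact_reason(reason)`, while the tool metadata could still carry the original string if downstream consumers need it.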
| When ``agent_controlled`` is ``True``, automatic compaction uses | ||
| ``safety_threshold_percent`` instead of ``fill_threshold_percent``, | ||
| allowing agents to manage compaction via the ``compact_context`` | ||
| tool while retaining a safety net. | ||
|
|
||
| Attributes: | ||
| fill_threshold_percent: Context fill percentage that triggers | ||
| compaction (e.g. 80.0 means compact when 80% full). |
The CompactionConfig docs still describe fill_threshold_percent as the threshold that triggers compaction, but the implementation switches automatic compaction to use safety_threshold_percent when agent_controlled=True. Consider clarifying in the docstring (and/or field descriptions) what each threshold means in agent-controlled mode to avoid config misuse/confusion.
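The threshold switch this comment refers to can be illustrated with a small sketch. `ThresholdConfig` is a hypothetical subset of `CompactionConfig`; the default values and the `>=` comparison are assumptions made for the example, not taken from the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdConfig:
    """Hypothetical subset of CompactionConfig; defaults are illustrative."""

    fill_threshold_percent: float = 80.0
    safety_threshold_percent: float = 95.0
    agent_controlled: bool = False


def effective_threshold(config: ThresholdConfig) -> float:
    """Return the fill percentage at which automatic compaction fires."""
    if config.agent_controlled:
        # The agent is expected to call compact_context itself, so the
        # automatic path only acts as a safety net.
        return config.safety_threshold_percent
    return config.fill_threshold_percent


def should_auto_compact(fill_percent: float, config: ThresholdConfig) -> bool:
    """Assumed comparison: compact once fill reaches the threshold."""
    return fill_percent >= effective_threshold(config)
```

Under these assumed defaults, a context that is 85% full compacts automatically in the default mode but not in agent-controlled mode, which is the distinction the docstring would need to spell out.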
| # Result should be an AgentContext (not None) | ||
| assert result is None or isinstance(result, AgentContext) |
test_force_compaction_bypasses_threshold ends with assert result is None or isinstance(result, AgentContext), which will always pass and doesn't verify the intended behavior. Consider asserting a concrete outcome (e.g., result is not None, result.compression_metadata is set, or message count decreases) so this test will fail if force_compaction() stops compacting below the threshold.
| # Result should be an AgentContext (not None) | |
| assert result is None or isinstance(result, AgentContext) | |
| # Forced compaction should still produce a compacted context | |
| assert result is not None | |
| assert isinstance(result, AgentContext) |
| description="Average contribution score", | ||
| ) | ||
| @property | ||
| def total_contribution_score(self) -> float: | ||
| """Average of contribution scores, 0.0 when empty.""" | ||
| if not self.agent_contributions: | ||
| return 0.0 | ||
| total = sum(c.contribution_score for c in self.agent_contributions) | ||
| return total / len(self.agent_contributions) |
total_contribution_score returns the average of contribution scores (as described in the docstring/computed_field), not a total. This naming mismatch can confuse consumers (especially when also logging avg_score). Consider renaming the property to something like avg_contribution_score (or changing the computation to a true total) to match semantics.
| description="Average contribution score", | |
| ) | |
| @property | |
| def total_contribution_score(self) -> float: | |
| """Average of contribution scores, 0.0 when empty.""" | |
| if not self.agent_contributions: | |
| return 0.0 | |
| total = sum(c.contribution_score for c in self.agent_contributions) | |
| return total / len(self.agent_contributions) | |
| description="Total contribution score", | |
| ) | |
| @property | |
| def total_contribution_score(self) -> float: | |
| """Sum of contribution scores, 0.0 when empty.""" | |
| if not self.agent_contributions: | |
| return 0.0 | |
| return sum(c.contribution_score for c in self.agent_contributions) |
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1139      +/-   ##
==========================================
- Coverage   88.97%   88.96%   -0.02%
==========================================
  Files         845      848       +3
  Lines       49228    49388     +160
  Branches     4944     4960      +16
==========================================
+ Hits        43803    43939     +136
- Misses       4501     4520      +19
- Partials      924      929       +5
Actionable comments posted: 11
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/engine/compaction/epistemic.py`:
- Around line 117-125: The loop that collects epistemic sentences over-counts
the "; " separator for the first appended sentence (in the block using
EPISTEMIC_PATTERNS and variables sentences, marker_sentences, total_len,
max_chars), causing premature breaks; fix by accounting for the separator only
when marker_sentences is non-empty (i.e., compute sep_len = 2 if
marker_sentences else 0) and use sep_len both in the pre-check (total_len +
sep_len + len(stripped) > max_chars) and when updating total_len (total_len +=
sep_len + len(stripped)).
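The separator accounting described above can be sketched in isolation. The function name `join_within_budget` is hypothetical, and the epistemic pattern matching is left out so that only the length bookkeeping is shown.

```python
def join_within_budget(sentences: list[str], max_chars: int) -> str:
    """Join sentences with "; " without exceeding ``max_chars``."""
    kept: list[str] = []
    total_len = 0
    for sentence in sentences:
        stripped = sentence.strip()
        if not stripped:
            continue
        # Only charge for the "; " separator once something is already kept.
        sep_len = 2 if kept else 0
        if total_len + sep_len + len(stripped) > max_chars:
            break
        kept.append(stripped)
        total_len += sep_len + len(stripped)
    return "; ".join(kept)
```

With the unconditional `+ 2`, a single sentence of exactly `max_chars` characters would be rejected and the result would be empty; here it is kept.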
In `@src/synthorg/engine/compaction/summarizer.py`:
- Around line 305-312: The returned summary string in the block that checks
preserved_count uses "from {preserved_count} messages" which is incorrect when
preserved_count == 1; update the formatting in the function containing this
block (look for preserved_count and messages in summarizer.py) to choose
"message" vs "messages" based on preserved_count (e.g., conditional
pluralization or a small helper like pluralize_count) so the text reads "from 1
message" when preserved_count == 1 and keeps "messages" otherwise.
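A conditional pluralization along these lines might look like the sketch below. The function name `preserved_summary` is hypothetical; only the message wording is taken from the review text.

```python
def preserved_summary(preserved_count: int) -> str:
    """Build the status line with a count-aware noun."""
    noun = "message" if preserved_count == 1 else "messages"
    return f"Epistemic markers preserved from {preserved_count} {noun}"
```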
In `@src/synthorg/engine/coordination/attribution.py`:
- Around line 234-264: Introduce a lightweight typing Protocol under a
TYPE_CHECKING block that documents the expected shape of outcome_result (e.g.,
class AgentRunResultLike(Protocol) with attributes is_success: bool,
termination_reason: TerminationReason | None, execution_result:
ExecutionResultLike | None and ExecutionResultLike with error_message: str |
None), then change the function signature/type annotation of outcome_result from
object | None to AgentRunResultLike | None and update any local uses accordingly
(you can still use getattr guards if desired); this avoids circular imports
while giving static type safety for outcome_result, AgentRunResultLike,
ExecutionResultLike, and keeps interaction with _TERMINATION_TO_ATTRIBUTION and
AgentContribution unchanged.
- Around line 49-56: The _TERMINATION_TO_ATTRIBUTION mapping is missing
TerminationReason.PARKED, causing PARKED outcomes to fall back to "direct"
attribution; add an entry TerminationReason.PARKED: "coordination_overhead" to
the _TERMINATION_TO_ATTRIBUTION dict so PARKED (an approval-gate/system
suspension) is attributed as coordination_overhead rather than direct, keeping
its behavior consistent with SHUTDOWN, BUDGET_EXHAUSTED, and MAX_TURNS; ensure
you use the FailureAttribution string value "coordination_overhead" and update
the dict where _TERMINATION_TO_ATTRIBUTION is defined.
In `@src/synthorg/engine/coordination/service.py`:
- Around line 217-220: The call to
self._performance_tracker.record_coordination_contributions(contributions) can
raise and currently runs after COORDINATION_COMPLETED is logged, which can
incorrectly mark a finished coordination as failed; update the block in the
coordination completion path (around COORDINATION_COMPLETED logging) to wrap the
record_coordination_contributions call in a try/except that catches all
exceptions, logs them at WARNING or ERROR with context (including contributions
and that this occurred post-completion), and only re-raises MemoryError and
RecursionError; alternatively move the COORDINATION_COMPLETED log to after the
safe/guarded tracker call if tracker failure should be fatal.
- Around line 211-214: The current use of build_agent_contributions with
routing_result and dispatch_result.waves collapses routing into a single
agent_id->subtask_id mapping which allows later subtasks to overwrite earlier
ones when one agent executes multiple subtasks; update build_agent_contributions
(or the code that constructs its input) to use a composite key that includes the
executed outcome or task identifier (e.g., use (agent_id, executed_outcome_id)
or (agent_id, subtask_execution_id)) so each executed subtask is mapped
uniquely, ensure any persistence/returned tuple uses that composite mapping, and
add a unit/regression test that simulates repeated-agent routing to assert that
multiple subtasks from the same agent retain distinct attribution rather than
being overwritten.
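The overwrite risk in the second item can be reproduced with plain dictionaries. The routing shape used here, a list of `(agent_id, subtask_id)` pairs, and both helper names are hypothetical; the real `build_agent_contributions` input may be structured differently.

```python
Routing = list[tuple[str, str]]  # (agent_id, subtask_id), hypothetical shape


def index_by_agent(routing: Routing) -> dict[str, str]:
    """Lossy: a repeated agent_id keeps only its last subtask."""
    return {agent_id: subtask_id for agent_id, subtask_id in routing}


def index_by_execution(routing: Routing) -> dict[tuple[str, str], int]:
    """Lossless: one entry per (agent_id, subtask_id) pair."""
    # The value is the dispatch position, kept only to show that each
    # executed subtask retains its own slot.
    return {pair: position for position, pair in enumerate(routing)}
```

A regression test could route two subtasks to the same agent and assert that the composite-key index still has one entry per subtask.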
In `@src/synthorg/tools/context/compact_context.py`:
- Around line 99-104: The INFO log and metadata currently emit the raw free-form
reason (see the logger.info call using CONTEXT_BUDGET_AGENT_COMPACTION_REQUESTED
and the subsequent metadata forwarding around lines 99–113); replace that with a
bounded/sanitized value by introducing a sanitizer (e.g., sanitize_reason() or
reason_code) that either maps free-text to a fixed set of reason codes or
redacts/truncates the input before logging and before any metadata propagation,
and use that sanitized value in the logger.info call and wherever the original
reason is forwarded (keep existing parameters strategy and preserve_markers
unchanged).
- Around line 21-52: The _COMPACT_CONTEXT_SCHEMA dict is mutable; make it
immutable by building it via copy.deepcopy(...) and then wrapping the result
with types.MappingProxyType so the module-level symbol _COMPACT_CONTEXT_SCHEMA
becomes a read-only mapping; import copy and MappingProxyType where needed and
replace the current literal assignment with a deep-copied value wrapped by
MappingProxyType to enforce immutability.
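The immutability pattern from the second item could be sketched like this. The schema contents and the `schema_for_tool` helper are hypothetical, since the real `_COMPACT_CONTEXT_SCHEMA` is not shown in this thread.

```python
import copy
from types import MappingProxyType
from typing import Any

# Hypothetical schema fragment; not the project's actual schema.
_RAW_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {"reason": {"type": "string"}},
    "required": ["reason"],
}

# deepcopy detaches the mapping from _RAW_SCHEMA; MappingProxyType blocks
# top-level writes. Nested dicts and lists stay mutable through the proxy.
SCHEMA = MappingProxyType(copy.deepcopy(_RAW_SCHEMA))


def schema_for_tool() -> dict[str, Any]:
    """Hand callers a private deep copy they are free to mutate."""
    return copy.deepcopy(dict(SCHEMA))
```

Note that the proxy is only shallowly read-only, so handing out deep copies at the boundary is what actually protects the nested structures.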
In `@tests/unit/engine/compaction/test_epistemic.py`:
- Around line 183-189: The assertion in test_strips_whitespace_from_sentences is
unclear; update the assertions after calling extract_marker_sentences(text)
(variable result) to explicitly check that the string does not start or end with
a space and contains no double spaces: replace the current confusing assert
involving count(" ") with three clearer asserts such as ensuring not
result.startswith(" "), not result.endswith(" "), and "  " not in result to
verify whitespace is stripped.
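The three explicit checks could be grouped into a small helper, sketched here. The name `assert_clean_whitespace` is hypothetical, and the helper would be called on the value returned by `extract_marker_sentences(text)` in the real test.

```python
def assert_clean_whitespace(result: str) -> None:
    """Fail if ``result`` has leading, trailing, or doubled spaces."""
    assert not result.startswith(" "), "leading space"
    assert not result.endswith(" "), "trailing space"
    assert "  " not in result, "double space"
```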
In `@tests/unit/engine/compaction/test_summarizer_markers.py`:
- Around line 244-258: The test failure is due to a grammar mismatch ("from 1
messages"); update the summarizer code that builds the status text (the logic
invoked by _build_summary that produces "Epistemic markers preserved from {n}
messages") to choose "message" when the preserved count equals 1 and "messages"
otherwise (i.e., pluralize based on count). Locate the string generation in the
summarizer/marker-preservation code path (the function invoked by
_build_summary) and replace the hardcoded plural with a conditional
pluralization so the output reads "from 1 message" for a single preserved marker
and "from N messages" for other counts.
- Around line 315-343: The test test_force_compaction_bypasses_threshold uses a
weak assertion; instead capture the original message count before calling
force_compaction, call force_compaction(ctx, config, estimator) and assert the
result is an AgentContext (isinstance(result, AgentContext)) and that compaction
actually occurred by asserting len(result.messages) < original_count (or at
least len(result.messages) <= original_count and < when min_messages_to_compact
should trigger removal). Reference: force_compaction, CompactionConfig
(min_messages_to_compact) and the test function name to locate and update the
assertion.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6731cd2d-3ed5-4005-9647-a690f4095266
📒 Files selected for processing (29)
- src/synthorg/api/controllers/coordination.py
- src/synthorg/engine/__init__.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/engine/compaction/epistemic.py
- src/synthorg/engine/compaction/models.py
- src/synthorg/engine/compaction/summarizer.py
- src/synthorg/engine/coordination/__init__.py
- src/synthorg/engine/coordination/attribution.py
- src/synthorg/engine/coordination/service.py
- src/synthorg/engine/loop_helpers.py
- src/synthorg/engine/loop_protocol.py
- src/synthorg/hr/performance/tracker.py
- src/synthorg/observability/events/context_budget.py
- src/synthorg/observability/events/coordination.py
- src/synthorg/tools/context/__init__.py
- src/synthorg/tools/context/compact_context.py
- src/synthorg/tools/factory.py
- tests/unit/api/controllers/test_coordination.py
- tests/unit/engine/compaction/test_epistemic.py
- tests/unit/engine/compaction/test_models.py
- tests/unit/engine/compaction/test_summarizer_markers.py
- tests/unit/engine/coordination/test_attribution.py
- tests/unit/engine/coordination/test_attribution_factory.py
- tests/unit/engine/test_coordination_service.py
- tests/unit/tools/context/__init__.py
- tests/unit/tools/context/test_compact_context.py
- tests/unit/tools/test_factory.py
- tests/unit/tools/test_factory_new_categories.py
- tests/unit/tools/test_factory_sandbox_wiring.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Agent
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Build Sandbox
- GitHub Check: Test (Python 3.14)
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations`; Python 3.14 has PEP 649 native lazy annotations
Use `except A, B:` syntax without parentheses (PEP 758 except syntax); ruff enforces this on Python 3.14
All public functions require type hints; mypy strict mode is enforced
Google-style docstrings are required on public classes and functions; enforced by ruff D rules
Create new objects for immutability, never mutate existing ones; for non-Pydantic internal collections use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement
Line length must be 88 characters (enforced by ruff)
Files:
src/synthorg/tools/context/__init__.py, tests/unit/tools/test_factory_sandbox_wiring.py, tests/unit/tools/test_factory.py, tests/unit/tools/test_factory_new_categories.py, tests/unit/api/controllers/test_coordination.py, src/synthorg/tools/factory.py, src/synthorg/engine/__init__.py, src/synthorg/observability/events/context_budget.py, tests/unit/engine/compaction/test_models.py, src/synthorg/api/controllers/coordination.py, src/synthorg/engine/loop_helpers.py, src/synthorg/observability/events/coordination.py, src/synthorg/hr/performance/tracker.py, src/synthorg/engine/loop_protocol.py, src/synthorg/engine/coordination/service.py, tests/unit/engine/compaction/test_epistemic.py, src/synthorg/engine/coordination/__init__.py, src/synthorg/engine/agent_engine.py, tests/unit/engine/coordination/test_attribution.py, tests/unit/engine/compaction/test_summarizer_markers.py, src/synthorg/engine/compaction/epistemic.py, src/synthorg/engine/compaction/summarizer.py, src/synthorg/tools/context/compact_context.py, tests/unit/tools/context/test_compact_context.py, tests/unit/engine/coordination/test_attribution_factory.py, src/synthorg/engine/compaction/models.py, tests/unit/engine/test_coordination_service.py, src/synthorg/engine/coordination/attribution.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`); use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time
Use `@computed_field` for derived values instead of storing + validating redundant fields (e.g. `TokenUsage.total_tokens`)
Use `NotBlankStr` (from `core.types`) for all identifier/name fields, including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants, instead of manual whitespace validators
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`
Functions must be fewer than 50 lines; files must be fewer than 800 lines
Handle errors explicitly, never silently swallow them
Validate at system boundaries (user input, external APIs, config files)
Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
Variable name must always be `logger` (not `_logger`, not `log`)
Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`, etc.); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Always use structured logging: `logger.info(EVENT, key=value)`; never `logger.info("msg %s", val)`
All error paths must log at WARNING ...
Files:
src/synthorg/tools/context/__init__.py, src/synthorg/tools/factory.py, src/synthorg/engine/__init__.py, src/synthorg/observability/events/context_budget.py, src/synthorg/api/controllers/coordination.py, src/synthorg/engine/loop_helpers.py, src/synthorg/observability/events/coordination.py, src/synthorg/hr/performance/tracker.py, src/synthorg/engine/loop_protocol.py, src/synthorg/engine/coordination/service.py, src/synthorg/engine/coordination/__init__.py, src/synthorg/engine/agent_engine.py, src/synthorg/engine/compaction/epistemic.py, src/synthorg/engine/compaction/summarizer.py, src/synthorg/tools/context/compact_context.py, src/synthorg/engine/compaction/models.py, src/synthorg/engine/coordination/attribution.py
!(src/synthorg/observability/{setup,sinks,syslog_handler,http_handler}.py)**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use `import logging` / `logging.getLogger()` / `print()` in application code (exception: `observability/setup.py`, `observability/sinks.py`, `observability/syslog_handler.py`, and `observability/http_handler.py` may use stdlib `logging`)
Files:
src/synthorg/tools/context/__init__.py, tests/unit/tools/test_factory_sandbox_wiring.py, tests/unit/tools/test_factory.py, tests/unit/tools/test_factory_new_categories.py, tests/unit/api/controllers/test_coordination.py, src/synthorg/tools/factory.py, src/synthorg/engine/__init__.py, src/synthorg/observability/events/context_budget.py, tests/unit/engine/compaction/test_models.py, src/synthorg/api/controllers/coordination.py, src/synthorg/engine/loop_helpers.py, src/synthorg/observability/events/coordination.py, src/synthorg/hr/performance/tracker.py, src/synthorg/engine/loop_protocol.py, src/synthorg/engine/coordination/service.py, tests/unit/engine/compaction/test_epistemic.py, src/synthorg/engine/coordination/__init__.py, src/synthorg/engine/agent_engine.py, tests/unit/engine/coordination/test_attribution.py, tests/unit/engine/compaction/test_summarizer_markers.py, src/synthorg/engine/compaction/epistemic.py, src/synthorg/engine/compaction/summarizer.py, src/synthorg/tools/context/compact_context.py, tests/unit/tools/context/test_compact_context.py, tests/unit/engine/coordination/test_attribution_factory.py, src/synthorg/engine/compaction/models.py, tests/unit/engine/test_coordination_service.py, src/synthorg/engine/coordination/attribution.py
!(docs/design/operations.md|.claude/**/*|src/synthorg/providers/presets.py)**/*.{py,md}
📄 CodeRabbit inference engine (CLAUDE.md)
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases. Vendor names may only appear in: (1) Operations design page, (2) `.claude/` files, (3) third-party imports, (4) `src/synthorg/providers/presets.py` which is user-facing runtime data. Tests must use `test-provider`, `test-small-001`, etc.
Files:
src/synthorg/tools/context/__init__.py
tests/unit/tools/test_factory_sandbox_wiring.py
tests/unit/tools/test_factory.py
tests/unit/tools/test_factory_new_categories.py
tests/unit/api/controllers/test_coordination.py
src/synthorg/tools/factory.py
src/synthorg/engine/__init__.py
src/synthorg/observability/events/context_budget.py
tests/unit/engine/compaction/test_models.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/loop_helpers.py
src/synthorg/observability/events/coordination.py
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/loop_protocol.py
src/synthorg/engine/coordination/service.py
tests/unit/engine/compaction/test_epistemic.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/compaction/test_summarizer_markers.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/tools/context/compact_context.py
tests/unit/tools/context/test_compact_context.py
tests/unit/engine/coordination/test_attribution_factory.py
src/synthorg/engine/compaction/models.py
tests/unit/engine/test_coordination_service.py
src/synthorg/engine/coordination/attribution.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/tools/context/__init__.py
src/synthorg/tools/factory.py
src/synthorg/engine/__init__.py
src/synthorg/observability/events/context_budget.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/loop_helpers.py
src/synthorg/observability/events/coordination.py
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/loop_protocol.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/models.py
src/synthorg/engine/coordination/attribution.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow` for test classification
30-second timeout per test (global in `pyproject.toml`); do not add per-file `pytest.mark.timeout(30)` markers; non-default overrides like `timeout(60)` are allowed
Prefer `@pytest.mark.parametrize` for testing similar cases
Use Hypothesis for property-based testing with `@given` + `@settings` decorators; Hypothesis profiles configured in `tests/conftest.py`: `ci` (deterministic, `max_examples=10` + `derandomize=True`), `dev` (1000 examples), `fuzz` (10,000 examples, no deadline), `extreme` (500,000 examples, no deadline). Controlled via `HYPOTHESIS_PROFILE` env var.
NEVER skip, dismiss, or ignore flaky tests; always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins. For tasks that must block indefinitely until cancelled, use `asyncio.Event().wait()` instead of `asyncio.sleep(large_number)`
Files:
tests/unit/tools/test_factory_sandbox_wiring.py
tests/unit/tools/test_factory.py
tests/unit/tools/test_factory_new_categories.py
tests/unit/api/controllers/test_coordination.py
tests/unit/engine/compaction/test_models.py
tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/tools/context/test_compact_context.py
tests/unit/engine/coordination/test_attribution_factory.py
tests/unit/engine/test_coordination_service.py
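The last test guideline above (park on an unset `asyncio.Event` rather than sleeping for a large number of seconds) can be illustrated with a minimal sketch. The names `block_until_cancelled` and `run_demo` are hypothetical and are not taken from the project's test suite; only `asyncio.Event().wait()` comes from the guideline itself.

```python
import asyncio


async def block_until_cancelled() -> None:
    """Park until cancelled; nothing else ends this coroutine."""
    # The Event is never set, so wait() suspends without any timer.
    # asyncio.sleep(large_number) would eventually return on its own.
    await asyncio.Event().wait()


async def run_demo() -> bool:
    """Start the blocking task, cancel it, and report whether it was cancelled."""
    task = asyncio.create_task(block_until_cancelled())
    await asyncio.sleep(0)  # yield once so the task starts and parks on wait()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        # Expected: awaiting a cancelled task re-raises the cancellation.
        pass
    return task.cancelled()


was_cancelled = asyncio.run(run_demo())
```

Because no timer is involved, a test built this way does not depend on wall-clock margins and finishes as soon as the cancellation is delivered.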
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
tests/unit/tools/test_factory_sandbox_wiring.py
tests/unit/tools/test_factory.py
tests/unit/tools/test_factory_new_categories.py
tests/unit/api/controllers/test_coordination.py
tests/unit/engine/compaction/test_models.py
tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/tools/context/test_compact_context.py
tests/unit/engine/coordination/test_attribution_factory.py
tests/unit/engine/test_coordination_service.py
🧠 Learnings (42)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to {pyproject.toml,src/synthorg/__init__.py} : Update version in `pyproject.toml` (`[tool.commitizen].version`) and `src/synthorg/__init__.py` (`__version__`)
Applied to files:
src/synthorg/tools/context/__init__.py
src/synthorg/engine/__init__.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/engine/coordination/**/*.py : Task coordination uses multi-agent pipeline with 4 dispatchers (SAS/centralized/decentralized/context-dependent), wave execution, and workspace lifecycle integration.
Applied to files:
tests/unit/api/controllers/test_coordination.py
src/synthorg/engine/__init__.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/test_coordination_service.py
src/synthorg/engine/coordination/attribution.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...
Applied to files:
tests/unit/api/controllers/test_coordination.py
src/synthorg/engine/__init__.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Applied to files:
src/synthorg/engine/__init__.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/coordination/attribution.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...
Applied to files:
src/synthorg/engine/__init__.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/coordination/test_attribution_factory.py
src/synthorg/engine/coordination/attribution.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/communication/**/*.py : Communication package (communication/): message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution; meeting/ subpackage for meeting protocol (round-robin, position papers, structured phases), scheduler (frequency, participant resolver), orchestrator
Applied to files:
src/synthorg/engine/__init__.py
src/synthorg/engine/coordination/__init__.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`, etc.); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Applied to files:
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Applied to files:
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module
Applied to files:
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.
Applied to files:
src/synthorg/observability/events/context_budget.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals
Applied to files:
src/synthorg/observability/events/coordination.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)
Applied to files:
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion
Applied to files:
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 with adopted conventions: use computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Applied to files:
src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).
Applied to files:
tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/compaction/test_summarizer_markers.py
📚 Learning: 2026-04-02T09:04:54.709Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T09:04:54.709Z
Learning: Applies to tests/**/*.py : Use pytest markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`
Applied to files:
tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/compaction/test_summarizer_markers.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to tests/**/*.py : Use markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow` for test classification
Applied to files:
tests/unit/engine/compaction/test_epistemic.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-04-07T22:12:30.141Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`); use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-17T11:41:02.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T11:41:02.964Z
Learning: Applies to src/**/*.py : Models: Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values instead of storing + validating redundant fields. Use `NotBlankStr` for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-04-07T22:12:30.141Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-04-02T11:49:21.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T11:49:21.202Z
Learning: Applies to **/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`) with `allow_inf_nan=False` in all `ConfigDict` declarations
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to src/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time
Applied to files:
src/synthorg/engine/compaction/models.py
🔇 Additional comments (37)
src/synthorg/engine/loop_protocol.py (2)
28-41: LGTM! Clean enum design with clear semantics.
The `NodeType` enum is well-structured with descriptive values and a comprehensive docstring explaining its purpose for trace analysis and credit assignment. Using `StrEnum` ensures serialization-friendly string values.
119-122: LGTM! Backward-compatible field addition.
The `node_types` field is properly added with:
- Immutable `tuple[NodeType, ...]` type
- Default `()` for legacy deserialization compatibility (as documented in docstring lines 74-76)
- Clear description in `Field`
src/synthorg/engine/loop_helpers.py (2)
40-47: LGTM! Import addition is correctly placed.
The `NodeType` import is appropriately grouped with other imports from `.loop_protocol`.
464-533: LGTM! Well-implemented node type derivation.
The `make_turn_record` changes are clean:
- Parameter addition (line 470): The `extra_node_types` parameter with default `()` maintains backward compatibility.
- Auto-derivation logic (lines 513-517): Correctly derives base node types from response content:
  - `LLM_CALL` always present (every turn involves an LLM call)
  - `TOOL_INVOCATION` added when `response.tool_calls` is truthy
  - Caller-provided types appended via tuple concatenation
- Docstring (lines 474-477, 488-489): Clearly documents the auto-derivation behavior and the purpose of `extra_node_types`.
The implementation aligns well with the PR objective to populate `node_types` during execution.
src/synthorg/tools/context/__init__.py (1)
1-1: Looks good.
Clear module-level docstring and no concerns in this segment.
tests/unit/tools/test_factory_sandbox_wiring.py (1)
45-46: Expectation update is correct.
Tool-count baseline correctly reflects the new default `compact_context` tool.
tests/unit/tools/test_factory.py (1)
20-20: LGTM for factory expectation sync.
Expected tool names/count documentation is correctly updated for `compact_context`.
Also applies to: 46-46
tests/unit/tools/test_factory_new_categories.py (1)
137-140: Count update is correct.
Default composition and assertion now match the additional context tool.
src/synthorg/tools/factory.py (1)
194-196: Factory wiring looks correct.
`CompactContextTool` is properly included in the default tool set.
Also applies to: 209-209
src/synthorg/observability/events/context_budget.py (1)
20-26: Event constants addition is clean.
Naming and placement are consistent with the observability event-constant pattern.
tests/unit/engine/compaction/test_models.py (1)
45-85: Good coverage for new config semantics.
These tests appropriately lock in threshold-ordering behavior and default marker preservation.
src/synthorg/engine/compaction/models.py (2)
35-71: LGTM! Well-structured configuration extension.
The new `agent_controlled`, `safety_threshold_percent`, and `preserve_epistemic_markers` fields are properly defined with appropriate constraints and defaults. The `ConfigDict` correctly includes `frozen=True` and `allow_inf_nan=False` as per project conventions.
73-86: LGTM! Correct threshold validation logic.
The `@model_validator(mode="after")` correctly enforces that `safety_threshold_percent > fill_threshold_percent` only when `agent_controlled=True`. This ensures agents have headroom between their compaction decisions and the safety net threshold.
tests/unit/engine/compaction/test_epistemic.py (2)
1-58: LGTM! Comprehensive marker counting tests.
Good coverage of edge cases including no markers, single markers from each group, multiple from same group (counted once), mixed markers, and case-insensitive matching.
61-105: LGTM! Thorough threshold testing across complexity levels.
Tests properly verify the complexity-adaptive thresholds: COMPLEX/EPIC preserve at ≥1 marker, SIMPLE/MEDIUM require ≥3 markers.
tests/unit/engine/compaction/test_summarizer_markers.py (1)
1-58: LGTM! Good test setup and initial marker preservation tests.
The helper function `_msg` and initial test cases for standard format output are well-structured.
src/synthorg/engine/compaction/epistemic.py (2)
1-41: LGTM! Well-organized epistemic pattern definitions.
The module docstring provides good context including the research reference. The precompiled patterns are appropriately grouped by marker type.
55-91: LGTM! Clean utility functions.
Both `count_epistemic_markers` and `should_preserve_message` are concise and correctly implement the complexity-adaptive thresholds.
tests/unit/tools/context/test_compact_context.py (2)
1-52: LGTM! Comprehensive tool attribute tests.
Good coverage of tool name, category, description, and parameters schema including the optional `preserve_markers` field with its default value.
54-178: LGTM! Thorough execute and definition tests.
Tests properly verify metadata keys, default values, explicit parameter handling, and the `to_definition()` method. The `is_error is False` check is a good defensive assertion.
src/synthorg/engine/compaction/summarizer.py (4)
10-14: LGTM! Clean epistemic module integration.
The imports from the new epistemic module are appropriately scoped to the functions needed.
87-93: LGTM! Correct threshold selection logic.
The `effective_threshold` correctly selects `safety_threshold_percent` for agent-controlled mode and `fill_threshold_percent` otherwise, matching the docstring behavior in `make_compaction_callback`.
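To make the threshold-ordering rule and the threshold selection concrete, here is a minimal sketch. It uses a frozen standard-library dataclass as a stand-in for the Pydantic model, so `__post_init__` plays the role of the `@model_validator(mode="after")` mentioned in the `models.py` comment. The class name, the function form of `effective_threshold`, and the default values are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionConfigSketch:
    """Stand-in for the compaction config; defaults are assumed, not the PR's."""

    fill_threshold_percent: float = 80.0
    safety_threshold_percent: float = 95.0
    agent_controlled: bool = False

    def __post_init__(self) -> None:
        # The ordering is enforced only in agent-controlled mode, so the
        # agent keeps headroom before the safety net triggers.
        if (
            self.agent_controlled
            and self.safety_threshold_percent <= self.fill_threshold_percent
        ):
            msg = "safety_threshold_percent must exceed fill_threshold_percent"
            raise ValueError(msg)


def effective_threshold(config: CompactionConfigSketch) -> float:
    """Pick the safety threshold in agent-controlled mode, else the fill threshold."""
    if config.agent_controlled:
        return config.safety_threshold_percent
    return config.fill_threshold_percent
```

With these assumed defaults, automatic mode compacts at the fill threshold, while agent-controlled mode leaves the decision to the agent until the higher safety threshold is reached.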
230-239: LGTM! Defensive complexity extraction.
The chained `getattr` calls with `None` fallbacks correctly handle cases where task execution or complexity may not be present, defaulting to `Complexity.COMPLEX` for conservative marker preservation.
315-380: LGTM! Well-structured force_compaction implementation.
The function correctly bypasses fill threshold checks while still enforcing `min_messages_to_compact` and `preserve_recent_turns`. The `forced=True` log indicator provides good observability for distinguishing forced compactions.
src/synthorg/observability/events/coordination.py (1)
19-19: LGTM!
The new event constant follows the established naming convention and is properly typed with `Final[str]`. It aligns well with the other coordination events in this module.
src/synthorg/engine/coordination/attribution.py (3)
59-104: LGTM!
The `AgentContribution` model is well-designed with:
- Proper frozen config with `allow_inf_nan=False` per guidelines.
- `NotBlankStr` for identifier fields.
- Clear field constraints (`ge=0.0, le=1.0` for score).
- Strong model validator enforcing the score/attribution invariant.
107-145: LGTM!
The `CoordinationResultWithAttribution` wrapper correctly:
- Preserves immutability of the wrapped `CoordinationResult`.
- Uses `@computed_field` for derived values (`is_success`, `total_contribution_score`) per guidelines.
- Handles empty contributions gracefully in `total_contribution_score`.
148-201: LGTM!
The `build_agent_contributions()` factory is well-implemented with proper structured logging using the domain event constant.
src/synthorg/hr/performance/tracker.py (2)
182-206: LGTM!
The `record_coordination_contributions()` method is well-implemented:
- Properly guards against empty contributions before calculating `avg_score`.
- Uses structured logging with the existing `PERF_METRIC_RECORDED` event.
- Synchronous signature is appropriate for in-memory storage.
One minor observation: unlike `get_task_metrics()` and `get_collaboration_metrics()`, there's no corresponding getter for contributions. Consider adding one if consumers need to query stored contributions.
100-100: LGTM!
The new `_contributions` store follows the same pattern as `_task_metrics` and `_collab_metrics`.
src/synthorg/engine/coordination/__init__.py (1)
8-13: LGTM!
The re-exports are properly added with:
- Correct import from the new `attribution` module.
- Alphabetically sorted entries in `__all__`.
- All four new symbols (`AgentContribution`, `CoordinationResultWithAttribution`, `FailureAttribution`, `build_agent_contributions`) exposed at package level.
Also applies to: 35-56
src/synthorg/engine/__init__.py (1)
53-70: LGTM!
The re-exports appropriately expose only the primary types (`AgentContribution`, `CoordinationResultWithAttribution`) at the engine package level, keeping lower-level symbols like `FailureAttribution` and `build_agent_contributions` scoped to the coordination subpackage. This is good API layering.
Also applies to: 230-262
tests/unit/api/controllers/test_coordination.py (1)
51-70: LGTM!
The test helper correctly wraps `CoordinationResult` in `CoordinationResultWithAttribution`, matching the updated production API. The wrapper's default empty `agent_contributions` tuple is appropriate for these controller-level tests that don't need to verify attribution details.
tests/unit/engine/test_coordination_service.py (2)
135-146: LGTM!
The test correctly validates the new API shape:
- Uses `attributed.is_success` instead of `result.is_success`.
- Accesses the wrapped result via `attributed.result`.
- Verifies `agent_contributions` is a tuple (line 146).
179-184: LGTM!
All test methods are consistently updated to:
- Capture `attributed` from `coordinator.coordinate(ctx)`.
- Extract `result = attributed.result` for detailed assertions.
- Use `attributed.is_success` for success/failure checks.
The systematic update pattern is correct and thorough.
Also applies to: 291-299, 336-338, 362-368, 419-426, 510-518, 559-561, 594-598, 625-631, 666-671, 764-766, 797-805, 833-840
src/synthorg/engine/agent_engine.py (2)
454-477: LGTM!
The return type update from `CoordinationResult` to `CoordinationResultWithAttribution` is correct and consistent with:
- The `MultiAgentCoordinator.coordinate()` implementation (which now returns the wrapper).
- The API controller usage (context snippet 2) that accesses `.result` and `.is_success`.
The test in context snippet 1 (`assert result is expected_result`) remains compatible because the mock's `return_value` is returned directly without transformation.
120-125: LGTM!
The `TYPE_CHECKING` import correctly switches from `CoordinationResult` to `CoordinationResultWithAttribution`, and the import path uses the attribution submodule directly which is appropriate for type-only imports.
contributions = build_agent_contributions(
    routing_result,
    dispatch_result.waves,
)
Preserve attribution per executed subtask, not per agent.
The factory called here currently collapses routing decisions to a single agent_id -> subtask_id lookup. When one agent owns multiple subtasks in the same run, the later decision overwrites the earlier one and at least one outcome gets attached to the wrong subtask_id. Because this tuple is both returned and persisted, the bad mapping leaks straight into attribution metrics. Use the executed outcome/task as part of the lookup key, and add a regression for repeated-agent routing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/engine/coordination/service.py` around lines 211 - 214, The
current use of build_agent_contributions with routing_result and
dispatch_result.waves collapses routing into a single agent_id->subtask_id
mapping which allows later subtasks to overwrite earlier ones when one agent
executes multiple subtasks; update build_agent_contributions (or the code that
constructs its input) to use a composite key that includes the executed outcome
or task identifier (e.g., use (agent_id, executed_outcome_id) or (agent_id,
subtask_execution_id)) so each executed subtask is mapped uniquely, ensure any
persistence/returned tuple uses that composite mapping, and add a
unit/regression test that simulates repeated-agent routing to assert that
multiple subtasks from the same agent retain distinct attribution rather than
being overwritten.
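The overwrite described in this comment, and the composite-key alternative, can be shown with a small self-contained sketch. It assumes that each executed outcome carries a `task_id` equal to the routed `subtask_id`; that field, the `RoutingDecision` and `Outcome` classes, and the helper names are hypothetical and are not taken from `attribution.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingDecision:
    """Hypothetical routing decision: which agent owns which subtask."""

    agent_id: str
    subtask_id: str


@dataclass(frozen=True)
class Outcome:
    """Hypothetical executed outcome; task_id is assumed to match subtask_id."""

    agent_id: str
    task_id: str
    success: bool


def naive_lookup(decisions: tuple[RoutingDecision, ...]) -> dict[str, str]:
    """agent_id -> subtask_id: a later decision overwrites an earlier one."""
    return {d.agent_id: d.subtask_id for d in decisions}


def attribute(
    decisions: tuple[RoutingDecision, ...],
    outcomes: tuple[Outcome, ...],
) -> tuple[tuple[str, str, bool], ...]:
    """Attribute each outcome via a composite (agent_id, task_id) key."""
    lookup = {(d.agent_id, d.subtask_id): d for d in decisions}
    return tuple(
        (o.agent_id, lookup[(o.agent_id, o.task_id)].subtask_id, o.success)
        for o in outcomes
    )
```

A regression test along the lines the comment asks for would route two subtasks to the same agent, emit the outcomes in reverse order, and assert that each outcome keeps its own `subtask_id`.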
_COMPACT_CONTEXT_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "strategy": {
            "type": "string",
            "enum": ["summarize"],
            "description": (
                "Compaction strategy. Currently only 'summarize' is supported."
            ),
        },
        "preserve_markers": {
            "type": "boolean",
            "default": True,
            "description": (
                "Whether to preserve epistemic markers (wait, hmm, "
                "actually, etc.) in the compaction summary."
            ),
        },
        "reason": {
            "type": "string",
            "minLength": 10,
            "maxLength": 256,
            "description": (
                "Brief explanation for why compaction is needed "
                "now (e.g., 'context fill at 92 percent, need to "
                "preserve reasoning clarity')."
            ),
        },
    },
    "required": ["strategy", "reason"],
    "additionalProperties": False,
}
🛠️ Refactor suggestion | 🟠 Major
Make the schema constant read-only to enforce immutability.
_COMPACT_CONTEXT_SCHEMA is a mutable module-level dict. Please wrap a deep-copied structure with a read-only proxy at construction.
♻️ Proposed refactor
+from copy import deepcopy
+from types import MappingProxyType
from typing import Any
@@
-_COMPACT_CONTEXT_SCHEMA: dict[str, Any] = {
+_COMPACT_CONTEXT_SCHEMA: dict[str, Any] = MappingProxyType(
+ deepcopy(
+ {
"type": "object",
"properties": {
"strategy": {
@@
},
"required": ["strategy", "reason"],
"additionalProperties": False,
-}
+ }
+ )
+)
As per coding guidelines, "Create new objects for immutability, never mutate existing ones; for non-Pydantic internal collections use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/tools/context/compact_context.py` around lines 21 - 52, The
_COMPACT_CONTEXT_SCHEMA dict is mutable; make it immutable by building it via
copy.deepcopy(...) and then wrapping the result with types.MappingProxyType so
the module-level symbol _COMPACT_CONTEXT_SCHEMA becomes a read-only mapping;
import copy and MappingProxyType where needed and replace the current literal
assignment with a deep-copied value wrapped by MappingProxyType to enforce
immutability.
def test_empty_assistant_content_ignored(self) -> None:
    """Empty assistant messages are skipped."""
    messages = (
        _msg(MessageRole.ASSISTANT, ""),
        _msg(MessageRole.ASSISTANT, "Wait, something important."),
    )
    summary = _build_summary(
        messages,
        execution_id="test-exec",
        preserve_markers=True,
        task_complexity=Complexity.COMPLEX,
    )

    # First message is empty and skipped, second has marker
    assert "Epistemic markers preserved from 1 messages" in summary
Minor: Grammar issue in expected output.
The assertion expects "from 1 messages" which is grammatically incorrect (should be "from 1 message"). This reflects the implementation's behavior, so if intentional, consider updating the implementation to handle singular/plural correctly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/engine/compaction/test_summarizer_markers.py` around lines 244 -
258, The test failure is due to a grammar mismatch ("from 1 messages"); update
the summarizer code that builds the status text (the logic invoked by
_build_summary that produces "Epistemic markers preserved from {n} messages") to
choose "message" when the preserved count equals 1 and "messages" otherwise
(i.e., pluralize based on count). Locate the string generation in the
summarizer/marker-preservation code path (the function invoked by
_build_summary) and replace the hardcoded plural with a conditional
pluralization so the output reads "from 1 message" for a single preserved marker
and "from N messages" for other counts.
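The count-based pluralization suggested here is small enough to sketch in full. The helper name `marker_summary_line` is hypothetical, and the sketch assumes the status text is built from a single integer count; the actual string construction inside the summarizer may be structured differently.

```python
def marker_summary_line(preserved_count: int) -> str:
    """Build the status text with a noun that agrees with the count."""
    # Only a count of exactly 1 takes the singular form.
    noun = "message" if preserved_count == 1 else "messages"
    return f"Epistemic markers preserved from {preserved_count} {noun}"
```

If the implementation changes along these lines, the assertion in `test_empty_assistant_content_ignored` would need to expect "from 1 message".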
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/synthorg/engine/compaction/summarizer.py (1)
110-115: ⚠️ Potential issue | 🟠 Major
Forced compaction currently emits `CONTEXT_BUDGET_COMPACTION_STARTED` twice.
`force_compaction()` logs the STARTED event at DEBUG and then `_do_compaction()` logs the same STARTED event again at INFO. Any metric or alert keyed on the event name will double-count forced runs. Keep the INFO emission as the canonical state-transition log, add `forced=force` there, and drop the extra DEBUG event.
🧭 Proposed fix
 logger.info(
     CONTEXT_BUDGET_COMPACTION_STARTED,
     execution_id=ctx.execution_id,
     fill_percent=fill_pct,
     message_count=len(conversation),
+    forced=force,
 )
@@
-    logger.debug(
-        CONTEXT_BUDGET_COMPACTION_STARTED,
-        execution_id=ctx.execution_id,
-        fill_percent=ctx.context_fill_percent,
-        message_count=len(ctx.conversation),
-        forced=True,
-    )
     return _do_compaction(ctx, config, estimator, force=True)
As per coding guidelines, "All state transitions must log at INFO. DEBUG for object creation, internal flow, and entry/exit of key functions."
Also applies to: 344-350
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/compaction/summarizer.py` around lines 110 - 115, The STARTED compaction event is being emitted twice for forced runs; remove the DEBUG emission in force_compaction() and rely on the INFO emission inside _do_compaction(), adding forced=force to that INFO call (the CONTEXT_BUDGET_COMPACTION_STARTED log) so INFO is the canonical state-transition log; also scan the same file for the other duplicate STARTED emission patterns (the similar block around the later compaction call) and apply the same change: drop the extra DEBUG STARTED log and annotate the single INFO STARTED log with forced=force.
♻️ Duplicate comments (2)
src/synthorg/tools/context/compact_context.py (1)
23-54: 🛠️ Refactor suggestion | 🟠 Major
Freeze `_COMPACT_CONTEXT_SCHEMA` instead of leaving it mutable.
`deepcopy()` in `__init__` protects each instance, but the shared schema source can still be mutated for every future `CompactContextTool` construction. Build the module-level schema from a deep-copied object and wrap it in `MappingProxyType`, then copy from that immutable source when passing it to `BaseTool`.
As per coding guidelines, "Create new objects instead of mutating existing ones. For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/tools/context/compact_context.py` around lines 23 - 54, The module-level _COMPACT_CONTEXT_SCHEMA should be made immutable: create it from a deep-copied dict and wrap it in MappingProxyType (e.g., _COMPACT_CONTEXT_SCHEMA = MappingProxyType(deepcopy({...}))) so the source cannot be mutated; then in CompactContextTool.__init__ (the constructor that currently deepcopy()s the schema) continue to copy from this immutable source (use copy.deepcopy on the MappingProxyType value) before passing it to BaseTool so each instance gets its own mutable copy while the module-level schema remains read-only.
src/synthorg/engine/coordination/attribution.py (1)
178-209: ⚠️ Potential issue | 🟠 Major
Round-robin subtask matching still misattributes repeated-agent runs.
Lines 178-203 pair outcomes to subtasks by `agent_id` plus ordinal position. If one agent owns multiple subtasks and those outcomes are emitted in a different order than `routing_result.decisions`, the wrong `subtask_id` gets attached and that bad mapping is then returned/persisted. Match on the executed task/outcome identifier instead, and add a regression where one agent completes multiple routed subtasks out of order.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/coordination/attribution.py` around lines 178 - 209, The current round-robin uses agent_to_subtasks and agent_subtask_idx which misattributes when an agent emits outcomes out-of-order; instead build a direct lookup from the executed task/outcome identifier to the routed subtask (using routing_result.decisions entries) and use that lookup when iterating waves->outcomes to pick subtask_id (fall back to the existing logic only if no mapping exists). Update the loop that constructs agent_to_subtasks/agent_subtask_idx or add a new mapping (keyed by outcome.task_id or the decision's executed-task id) and then call _score_outcome with the mapped subtask_id for each outcome in the waves processing block.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/design/engine.md`:
- Line 1449: The heading "Per-Agent Attribution (Phase 8)" incorrectly implies
an additional pipeline phase; rename the header to "Per-Agent Attribution
(Post-Pipeline)" or simply "Per-Agent Attribution" and update any
cross-references or anchors that mention "Phase 8" so they no longer imply an
8-phase pipeline; ensure the section title text and any internal links or index
entries referencing "Per-Agent Attribution (Phase 8)" are updated to the new
wording to keep docs consistent with the 7-phase pipeline description.
In `@src/synthorg/engine/compaction/epistemic.py`:
- Around line 12-15: The module is missing the standard logger; import
get_logger from synthorg.observability and create a module logger by adding
"from synthorg.observability import get_logger" and "logger =
get_logger(__name__)" at the top of epistemic.py (near the other imports) so
functions in this compaction logic can use logger consistently with the engine
package conventions.
In `@src/synthorg/engine/coordination/attribution.py`:
- Around line 72-103: The AgentContribution Pydantic model currently only
preserves contribution_score, failure_attribution, and evidence, which drops
fields required by the design (e.g., failure_category, token_usage,
raw_per_agent_outcome); update the AgentContribution class to include these
missing attributes (add fields such as failure_category: FailureCategory | None,
token_usage: TokenUsage | None, and raw_outcome: dict | str | None with
appropriate types, defaults, descriptions, and validation similar to existing
fields) and then update the PerformanceTracker.record_coordination_contributions
signature and any callers to accept and forward the extended AgentContribution
instances so those values are retained and surfaced (ensure model_config and max
length/validation rules follow existing patterns and that
serialization/consumption points use the new fields).
In `@src/synthorg/engine/coordination/service.py`:
- Around line 211-223: The call to build_agent_contributions in Service
(coordination service) can raise and currently sits on the success path, so wrap
the build_agent_contributions(...) call in a try/except that catches Exception,
logs a WARNING or ERROR with context (include routing_result,
dispatch_result.waves or coordination id), and continues so it does not mark the
coordination run as failed; after successfully building contributions, proceed
to call self._performance_tracker.record_coordination_contributions(...) as
before, but ensure the tracker call remains guarded separately so tracker
failures also do not fail an already-completed run.
In `@src/synthorg/tools/context/compact_context.py`:
- Around line 23-53: The schema _COMPACT_CONTEXT_SCHEMA currently lacks a field
for agent-provided summaries; add a new optional property (e.g., "agent_summary"
or "summary") to the schema with type "string" and sensible minLength/maxLength
(and a description stating it can replace or supplement auto compaction), keep
"required" unchanged (do not force it), and ensure additionalProperties remains
False; then wire this field through the public tool boundary by accepting it
where the compact_context tool input is parsed/validated so the agent-provided
summary is passed into the compaction logic (update any validation/parse code
that references _COMPACT_CONTEXT_SCHEMA to include and forward "agent_summary").
In `@tests/integration/tools/test_factory_integration.py`:
- Line 15: Replace the brittle hard-coded-only assertion using
_EXPECTED_TOOL_COUNT by asserting both the expected length and that specific
required tool names exist in the produced tool registry; update the test that
uses _EXPECTED_TOOL_COUNT to (1) keep the length check if desired, and (2) check
that the returned collection (e.g., the list or dict of tools the factory
returns) contains required keys/names including "compact_context" and any other
critical tool names. Ensure you reference and update the variable
_EXPECTED_TOOL_COUNT alongside these presence assertions so the test fails only
when tools are missing or the count truly changes.
In `@tests/unit/engine/compaction/test_epistemic.py`:
- Around line 144-162: The test test_truncates_at_max_chars never triggers the
new truncation branch because multiple short marker sentences fit within
max_chars; update the test to use a single marker sentence longer than max_chars
so extract_marker_sentences(max_chars=...) takes the first-sentence truncation
path, and assert the exact truncated string (first max_chars characters plus
"…"/"..." as implemented) rather than the current len-based check; modify the
input text and the assertions in test_truncates_at_max_chars to validate the
exact truncated output from extract_marker_sentences.
---
Outside diff comments:
In `@src/synthorg/engine/compaction/summarizer.py`:
- Around line 110-115: The STARTED compaction event is being emitted twice for
forced runs; remove the DEBUG emission in force_compaction() and rely on the
INFO emission inside _do_compaction(), adding forced=force to that INFO call
(the CONTEXT_BUDGET_COMPACTION_STARTED log) so INFO is the canonical
state-transition log; also scan the same file for the other duplicate STARTED
emission patterns (the similar block around the later compaction call) and apply
the same change: drop the extra DEBUG STARTED log and annotate the single INFO
STARTED log with forced=force.
---
Duplicate comments:
In `@src/synthorg/engine/coordination/attribution.py`:
- Around line 178-209: The current round-robin uses agent_to_subtasks and
agent_subtask_idx which misattributes when an agent emits outcomes out-of-order;
instead build a direct lookup from the executed task/outcome identifier to the
routed subtask (using routing_result.decisions entries) and use that lookup when
iterating waves->outcomes to pick subtask_id (fall back to the existing logic
only if no mapping exists). Update the loop that constructs
agent_to_subtasks/agent_subtask_idx or add a new mapping (keyed by
outcome.task_id or the decision's executed-task id) and then call _score_outcome
with the mapped subtask_id for each outcome in the waves processing block.
In `@src/synthorg/tools/context/compact_context.py`:
- Around line 23-54: The module-level _COMPACT_CONTEXT_SCHEMA should be made
immutable: create it from a deep-copied dict and wrap it in MappingProxyType
(e.g., _COMPACT_CONTEXT_SCHEMA = MappingProxyType(deepcopy({...}))) so the
source cannot be mutated; then in CompactContextTool.__init__ (the constructor
that currently deepcopy()s the schema) continue to copy from this immutable
source (use copy.deepcopy on the MappingProxyType value) before passing it to
BaseTool so each instance gets its own mutable copy while the module-level
schema remains read-only.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9e159eb0-cdaa-418a-9696-241d4ffee08f
📒 Files selected for processing (17)
- CLAUDE.md
- docs/design/engine.md
- scripts/run_affected_tests.py
- src/synthorg/api/controllers/coordination.py
- src/synthorg/engine/compaction/epistemic.py
- src/synthorg/engine/compaction/models.py
- src/synthorg/engine/compaction/summarizer.py
- src/synthorg/engine/coordination/attribution.py
- src/synthorg/engine/coordination/service.py
- src/synthorg/hr/performance/tracker.py
- src/synthorg/tools/context/compact_context.py
- tests/integration/engine/test_coordination_wiring.py
- tests/integration/tools/test_factory_integration.py
- tests/unit/engine/compaction/test_epistemic.py
- tests/unit/engine/compaction/test_summarizer_markers.py
- tests/unit/engine/coordination/test_attribution.py
- web/src/api/types.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Dashboard Test
- GitHub Check: Build Web
- GitHub Check: Build Backend
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{py,ts,tsx,go}
📄 CodeRabbit inference engine (CLAUDE.md)
No `from __future__ import annotations` in Python -- Python 3.14+ has PEP 649 native lazy annotations.
Files:
- tests/integration/tools/test_factory_integration.py
- tests/integration/engine/test_coordination_wiring.py
- scripts/run_affected_tests.py
- web/src/api/types.ts
- src/synthorg/hr/performance/tracker.py
- src/synthorg/tools/context/compact_context.py
- src/synthorg/engine/compaction/summarizer.py
- src/synthorg/engine/compaction/epistemic.py
- src/synthorg/engine/compaction/models.py
- tests/unit/engine/coordination/test_attribution.py
- src/synthorg/api/controllers/coordination.py
- tests/unit/engine/compaction/test_summarizer_markers.py
- tests/unit/engine/compaction/test_epistemic.py
- src/synthorg/engine/coordination/service.py
- src/synthorg/engine/coordination/attribution.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use PEP 758 except syntax: use `except A, B:` (no parentheses) for multiple exceptions -- ruff enforces this on Python 3.14.
All public functions must have type hints and pass mypy strict mode type checking.
Docstrings must use Google style format and are required on all public classes and functions (enforced by ruff D rules).
Create new objects instead of mutating existing ones. For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement.
Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time.
Use `@computed_field` for derived values in Pydantic models instead of storing + validating redundant fields (e.g. `TokenUsage.total_tokens`).
Use `NotBlankStr` (from `core.types`) for all identifier/name fields, including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants, instead of manual whitespace validators.
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.
Line length must be 88 characters (enforced by ruff).
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly; never silently swallow exceptions.
Validate input at system boundaries (user input, external APIs, config files).
Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code.
Always use...
Files:
- tests/integration/tools/test_factory_integration.py
- tests/integration/engine/test_coordination_wiring.py
- scripts/run_affected_tests.py
- src/synthorg/hr/performance/tracker.py
- src/synthorg/tools/context/compact_context.py
- src/synthorg/engine/compaction/summarizer.py
- src/synthorg/engine/compaction/epistemic.py
- src/synthorg/engine/compaction/models.py
- tests/unit/engine/coordination/test_attribution.py
- src/synthorg/api/controllers/coordination.py
- tests/unit/engine/compaction/test_summarizer_markers.py
- tests/unit/engine/compaction/test_epistemic.py
- src/synthorg/engine/coordination/service.py
- src/synthorg/engine/coordination/attribution.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
All test files must use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, or `@pytest.mark.slow`.
Coverage minimum is 80% (enforced in CI). Do not add per-file `pytest.mark.timeout(30)` markers; non-default overrides like `timeout(60)` are allowed. Global timeout of 30 seconds per test is configured in `pyproject.toml`.
Use `asyncio_mode = "auto"` for async tests -- no manual `@pytest.mark.asyncio` needed.
Always run tests with `-n 8` for parallelism via pytest-xdist locally. Never run tests sequentially. CI uses `-n auto` for fewer cores on runners.
Prefer `@pytest.mark.parametrize` for testing similar cases.
Use Hypothesis for property-based testing with `@given` and `@settings`. Hypothesis profiles configured in `tests/conftest.py`: `ci` (deterministic, 10 examples), `dev` (1000 examples), `fuzz` (10,000 examples, no deadline), `extreme` (500,000 examples). Controlled via `HYPOTHESIS_PROFILE` env var. `.hypothesis/` is gitignored.
When Hypothesis finds a failure, fix the underlying bug and add an explicit `@example(...)` decorator to the test so the case is permanently covered in CI. Do NOT just rerun and move on.
Never skip, dismiss, or ignore flaky tests -- always fix them fully. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins.
For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider), use `asyncio.Event().wait()` instead of `asyncio.sleep(large_number)` -- it is cancellation-safe and carries no timing assumptions.
Files:
- tests/integration/tools/test_factory_integration.py
- tests/integration/engine/test_coordination_wiring.py
- tests/unit/engine/coordination/test_attribution.py
- tests/unit/engine/compaction/test_summarizer_markers.py
- tests/unit/engine/compaction/test_epistemic.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
- tests/integration/tools/test_factory_integration.py
- tests/integration/engine/test_coordination_wiring.py
- tests/unit/engine/coordination/test_attribution.py
- tests/unit/engine/compaction/test_summarizer_markers.py
- tests/unit/engine/compaction/test_epistemic.py
**/*
📄 CodeRabbit inference engine (CLAUDE.md)
Never use Bash to write or modify files -- use the Write or Edit tools. Do not use `cat >`, `cat << EOF`, `echo >`, `echo >>`, `sed -i`, `python -c "open(...).write(...)"`, or `tee` for file creation or modification.
Files:
tests/integration/tools/test_factory_integration.py
tests/integration/engine/test_coordination_wiring.py
scripts/run_affected_tests.py
web/src/api/types.ts
CLAUDE.md
docs/design/engine.md
src/synthorg/hr/performance/tracker.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/models.py
tests/unit/engine/coordination/test_attribution.py
src/synthorg/api/controllers/coordination.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/engine/compaction/test_epistemic.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py
web/src/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx,js,jsx}: Always use `createLogger` from `@/lib/logger` -- never bare `console.warn`/`console.error`/`console.debug` in application code
Logger variable name must always be `const log` (e.g. `const log = createLogger('module-name')`)
Pass dynamic/untrusted values as separate arguments to logger methods (not interpolated into the message string) so they go through `sanitizeArg`
Attacker-controlled fields inside structured objects must be wrapped in `sanitizeForLog()` before embedding in log calls
Files:
web/src/api/types.ts
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx}: Use Tailwind semantic classes (`text-foreground`, `bg-card`, `text-accent`, `text-success`, `bg-danger`, etc.) or CSS variables (`var(--so-*)`) for colors; NEVER hardcode hex values in `.tsx`/`.ts` files
Use `font-sans` or `font-mono` (Geist tokens) for typography; NEVER set `fontFamily` directly in `.tsx`/`.ts` files
Use density-aware tokens (`p-card`, `gap-section-gap`, `gap-grid-gap`) or standard Tailwind spacing; NEVER hardcode pixel values for layout spacing in components
Use token variables (`var(--so-shadow-card-hover)`, `border-border`, `border-bright`) for shadows and borders; NEVER hardcode values in `.tsx`/`.ts` files
Use `@/lib/motion` presets for Framer Motion transition durations; NEVER hardcode transition durations
CSS side-effect imports in TypeScript 6 require type declarations -- add `/// <reference types="vite/client" />` at the top of files with CSS imports
Always reuse existing components from `web/src/components/ui/` before creating new ones. See `web/CLAUDE.md` for the full component inventory.
Never hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions -- use design tokens and `@/lib/motion` presets. A PostToolUse hook enforces these rules on every Edit/Write to `web/src/`.
Files:
web/src/api/types.ts
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Always read the relevant `docs/design/` page before implementing any feature or planning any issue. The design spec is the starting point for architecture, data models, and behavior.
Files:
CLAUDE.md
docs/design/engine.md
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/hr/performance/tracker.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/models.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py
🧠 Learnings (58)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: If implementation deviates from the design spec (better approach found, scope evolved, etc.), alert the user and explain why -- user decides whether to proceed or update the spec. Do NOT silently diverge -- every deviation needs explicit user approval.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts. At every phase of planning and implementation, be critical and actively look for ways to improve the design.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Prioritize issues by dependency order, not priority labels -- unblocked dependencies come first.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Commits must follow format: `<type>: <description>` -- types: feat, fix, refactor, docs, test, chore, perf, ci. Enforced by commitizen (commit-msg hook).
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Signed commits are required on `main` via branch protection -- all commits must be GPG/SSH signed.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Branches must follow naming convention: `<type>/<slug>` from main.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Pre-push hooks run: mypy type-check (affected modules) + pytest unit tests (affected modules) + golangci-lint + go vet + go test (CLI) + eslint-web (web). Foundational module changes (core, config, observability) or conftest changes trigger full runs. These hooks are skipped in pre-commit.ci since dedicated CI jobs handle them.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Use `gh issue list` via Bash (not MCP tools) for GitHub issue queries -- MCP `list_issues` has unreliable field data.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Use squash merge for PRs -- PR body becomes the squash commit message on main. Preserve existing `Closes #NNN` references in trailers -- never remove unless explicitly asked.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: After finishing an issue implementation, always create a feature branch (`<type>/<slug>`), commit, and push -- do NOT create a PR automatically. Do NOT leave work uncommitted on main.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Never create a PR directly -- `gh pr create` is blocked by hookify. Always use `/pre-pr-review` to create PRs -- it runs automated checks + review agents + fixes before creating the PR. For trivial/docs-only changes: `/pre-pr-review quick` skips agents but still runs checks.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: After the PR exists, use `/aurelio-review-pr` to handle external reviewer feedback. The `/commit-push-pr` command is effectively blocked (it calls `gh pr create` internally).
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Fix everything valid from review agents -- never skip. When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to PR changes), fix them all. No deferring, no "out of scope" skipping.
📚 Learning: 2026-04-02T18:48:02.489Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T18:48:02.489Z
Learning: ALWAYS run pytest with `-n auto` for parallel execution with pytest-xdist; never run tests sequentially
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Parallelism: pytest-xdist via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
scripts/run_affected_tests.py
tests/unit/engine/compaction/test_epistemic.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
scripts/run_affected_tests.py
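As a rough illustration of the learning above, the sketch below replaces `time.monotonic()` with a scripted clock so the outcome does not depend on real elapsed time. `Deadline` and `check_deadline_with_fake_clock` are hypothetical names invented for this example; only `unittest.mock.patch` and `time.monotonic` are real APIs.

```python
import time
from unittest.mock import patch


class Deadline:
    """Hypothetical helper: reports expiry based on time.monotonic()."""

    def __init__(self, timeout: float) -> None:
        self._expires_at = time.monotonic() + timeout

    def expired(self) -> bool:
        return time.monotonic() >= self._expires_at


def check_deadline_with_fake_clock() -> tuple[bool, bool]:
    """Drive the clock by hand: no real waiting, no timing margins."""
    # Successive calls to time.monotonic() return 100.0, 104.9, then 105.0.
    with patch("time.monotonic", side_effect=[100.0, 104.9, 105.0]):
        deadline = Deadline(timeout=5.0)  # reads 100.0, so expiry is at 105.0
        before = deadline.expired()       # reads 104.9: not yet expired
        after = deadline.expired()        # reads 105.0: expired
    return before, after
```

Because every clock reading is scripted, the test outcome is the same on a loaded CI runner as on a fast local machine.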
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Pre-push hooks check: mypy type-check (affected modules only) + pytest unit tests (affected modules only) + golangci-lint + go vet + go test (CLI, conditional on `cli/**/*.go`) + eslint-web (web dashboard); foundational module changes (core, config, observability) or conftest changes trigger full runs
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-04-02T08:47:46.313Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Run unit tests with `uv run python -m pytest tests/ -m unit -n auto`; integration tests with `-m integration -n auto`; e2e tests with `-m e2e -n auto`
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: ALWAYS include `-n 8` when running pytest locally; never run tests sequentially; CI uses `-n auto`
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-04-06T16:35:12.934Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: All project conventions, commands, and standards are defined in CLAUDE.md - refer to it for project structure, package layout, code conventions, quick commands, git workflow, testing standards, design specifications, logging, resilience, and security patterns
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.
Applied to files:
CLAUDE.md
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.
Applied to files:
CLAUDE.md
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.
Applied to files:
CLAUDE.md
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to src/synthorg/**/*.py : Always use structured logging: `logger.info(EVENT, key=value)`—never `logger.info("msg %s", val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-02T12:07:44.443Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T12:07:44.443Z
Learning: Applies to src/synthorg/**/*.py : Always use structured logging: `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit
Applied to files:
CLAUDE.md
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-31T20:07:03.035Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T20:07:03.035Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising; all state transitions must log at INFO; DEBUG for object creation, internal flow, entry/exit of key functions
Applied to files:
CLAUDE.md
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...
Applied to files:
docs/design/engine.md
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations
Applied to files:
docs/design/engine.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Applied to files:
docs/design/engine.md
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)
Applied to files:
docs/design/engine.md
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue — DESIGN_SPEC.md is a pointer file linking to 7 design pages (Agents, Organization, Communication, Engine, Memory, Operations)
Applied to files:
docs/design/engine.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.
Applied to files:
docs/design/engine.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/engine/coordination/**/*.py : Task coordination uses multi-agent pipeline with 4 dispatchers (SAS/centralized/decentralized/context-dependent), wave execution, and workspace lifecycle integration.
Applied to files:
docs/design/engine.md
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...
Applied to files:
docs/design/engine.md
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)
Applied to files:
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion
Applied to files:
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 with adopted conventions: use computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Applied to files:
src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement
Applied to files:
src/synthorg/tools/context/compact_context.py
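The `copy.deepcopy()` plus `MappingProxyType` pattern named in this learning might look roughly like the sketch below. `ToolRegistry` here is a hypothetical class written for illustration and is not the project's registry; note that the proxy makes only the top-level mapping read-only, which is why the sketch also copies on read.

```python
import copy
from collections.abc import Mapping
from types import MappingProxyType
from typing import Any


class ToolRegistry:
    """Hypothetical read-only registry, not the project's actual class."""

    def __init__(self, tools: dict[str, dict[str, Any]]) -> None:
        # Deep-copy at construction so later mutation of the caller's dict
        # cannot leak in, then wrap in a read-only view.
        self._tools: Mapping[str, dict[str, Any]] = MappingProxyType(
            copy.deepcopy(tools)
        )

    @property
    def tools(self) -> Mapping[str, dict[str, Any]]:
        """Read-only view: item assignment raises TypeError."""
        return self._tools

    def get(self, name: str) -> dict[str, Any]:
        # The proxy protects only the top level, so hand out a copy
        # to keep nested dicts from being mutated through the registry.
        return copy.deepcopy(self._tools[name])

    def with_tool(self, name: str, spec: dict[str, Any]) -> "ToolRegistry":
        # Create a new registry instead of mutating the existing one.
        return ToolRegistry({**self._tools, name: spec})
```

Adding a tool through `with_tool` returns a new registry and leaves the original untouched, which matches the "create new objects, never mutate existing ones" convention.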
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-04-07T22:12:30.141Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to **/*.py : Create new objects for immutability, never mutate existing ones; for non-Pydantic internal collections use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-04-01T09:09:43.948Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:09:43.948Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement in non-Pydantic internal collections (registries, BaseTool)
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization) for `dict`/`list` fields
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-04-07T22:12:30.141Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-04-07T22:12:30.141Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Applied to files:
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves
Applied to files:
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-04-07T22:12:30.141Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`); use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-17T11:41:02.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T11:41:02.964Z
Learning: Applies to src/**/*.py : Models: Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values instead of storing + validating redundant fields. Use `NotBlankStr` for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-04-02T11:49:21.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T11:49:21.202Z
Learning: Applies to **/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`) with `allow_inf_nan=False` in all `ConfigDict` declarations
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state
Applied to files:
src/synthorg/engine/compaction/models.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.
Applied to files:
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`
Applied to files:
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-04-02T09:04:54.709Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T09:04:54.709Z
Learning: Applies to tests/**/*.py : Use pytest markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`
Applied to files:
tests/unit/engine/compaction/test_epistemic.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).
Applied to files:
tests/unit/engine/compaction/test_epistemic.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to tests/**/*.py : Use markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow` for test classification
Applied to files:
tests/unit/engine/compaction/test_epistemic.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.
Applied to files:
src/synthorg/engine/coordination/service.py
🔇 Additional comments (3)
CLAUDE.md (1)
93-93: Good update to event-name guidance for new observability flows. The expanded event constant examples are aligned with the new coordination attribution and compaction/epistemic instrumentation, and keep the “import constant from domain module” rule explicit.
web/src/api/types.ts (1)
1310-1319: API contract update looks correct. Adding `currency: string` to `CoordinationResultResponse` cleanly aligns the frontend type with the backend coordination payload and avoids type drift.
docs/design/engine.md (1)
1490-1490: LGTM: ACG cross-reference accurately reflects NodeType addition. The update correctly documents the stronger node typing introduced by the `NodeType` enum on `TurnRecord.node_types`, maintaining the bidirectional glossary between ACG formalism and SynthOrg implementation.
class AgentContribution(BaseModel):
    """Per-agent contribution to a coordinated task execution.

    Attributes:
        agent_id: Identifier of the contributing agent.
        subtask_id: Identifier of the subtask this agent executed.
        contribution_score: Normalized score (0.0-1.0) reflecting
            the agent's contribution quality.
        failure_attribution: Classification of why the agent failed
            (``None`` when the agent succeeded with score 1.0).
        evidence: Truncated error message or evidence pointer
            (``None`` when the agent succeeded).
    """

    model_config = ConfigDict(frozen=True, allow_inf_nan=False)

    agent_id: NotBlankStr = Field(description="Contributing agent")
    subtask_id: NotBlankStr = Field(description="Subtask executed")
    contribution_score: float = Field(
        ge=0.0,
        le=1.0,
        description="Contribution quality (0.0-1.0)",
    )
    failure_attribution: FailureAttribution | None = Field(
        default=None,
        description="Why the agent failed (None on success)",
    )
    evidence: str | None = Field(
        default=None,
        max_length=_MAX_EVIDENCE_LENGTH,
        description="Truncated error or evidence pointer",
    )
AgentContribution drops fields that #1123 says to surface.
This model only preserves score/attribution/evidence. Because PerformanceTracker.record_coordination_contributions() accepts only AgentContribution, failure_category, token usage, and the raw per-agent outcome are discarded before the tracker can surface them. If #1123 intentionally narrowed scope, please update the objective; otherwise this payload needs those fields.
Based on learnings: If implementation deviates from the design spec (better approach found, scope evolved, etc.), alert the user and explain why -- user decides whether to proceed or update the spec.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/engine/coordination/attribution.py` around lines 72 - 103, The
AgentContribution Pydantic model currently only preserves contribution_score,
failure_attribution, and evidence, which drops fields required by the design
(e.g., failure_category, token_usage, raw_per_agent_outcome); update the
AgentContribution class to include these missing attributes (add fields such as
failure_category: FailureCategory | None, token_usage: TokenUsage | None, and
raw_outcome: dict | str | None with appropriate types, defaults, descriptions,
and validation similar to existing fields) and then update the
PerformanceTracker.record_coordination_contributions signature and any callers
to accept and forward the extended AgentContribution instances so those values
are retained and surfaced (ensure model_config and max length/validation rules
follow existing patterns and that serialization/consumption points use the new
fields).
_COMPACT_CONTEXT_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "strategy": {
            "type": "string",
            "enum": ["summarize"],
            "description": (
                "Compaction strategy. Currently only 'summarize' is supported."
            ),
        },
        "preserve_markers": {
            "type": "boolean",
            "default": True,
            "description": (
                "Whether to preserve epistemic markers (wait, hmm, "
                "actually, etc.) in the compaction summary."
            ),
        },
        "reason": {
            "type": "string",
            "minLength": 10,
            "maxLength": 256,
            "description": (
                "Brief explanation for why compaction is needed "
                "now (e.g., 'context fill at 92 percent, need to "
                "preserve reasoning clarity')."
            ),
        },
    },
    "required": ["strategy", "reason"],
    "additionalProperties": False,
}
The compact_context schema cannot carry an agent-authored summary.
The linked #1125 objective says agents should be able to provide summary content that replaces or supplements automatic compaction. This schema only accepts strategy, preserve_markers, and reason, so that path is impossible through the public tool boundary. Please either add the missing summary field(s) and wiring, or explicitly update the spec before merge.
Based on learnings, "If implementation deviates from the design spec (better approach found, scope evolved, etc.), alert the user and explain why -- user decides whether to proceed or update the spec."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/tools/context/compact_context.py` around lines 23 - 53, The
schema _COMPACT_CONTEXT_SCHEMA currently lacks a field for agent-provided
summaries; add a new optional property (e.g., "agent_summary" or "summary") to
the schema with type "string" and sensible minLength/maxLength (and a
description stating it can replace or supplement auto compaction), keep
"required" unchanged (do not force it), and ensure additionalProperties remains
False; then wire this field through the public tool boundary by accepting it
where the compact_context tool input is parsed/validated so the agent-provided
summary is passed into the compaction logic (update any validation/parse code
that references _COMPACT_CONTEXT_SCHEMA to include and forward "agent_summary").
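As a rough illustration of the suggestion above, the following sketch adds an optional summary property to a trimmed copy of the schema. The property name `summary`, its length bounds, and the `check_arguments` helper are all assumptions made for this example; they are not taken from the PR, and the helper only checks required and unknown keys rather than performing full JSON Schema validation.

```python
from typing import Any

# Hypothetical extension of the schema quoted above: "summary" and its
# length bounds are assumptions for illustration, not values from the PR.
COMPACT_CONTEXT_SCHEMA_SKETCH: dict[str, Any] = {
    "type": "object",
    "properties": {
        "strategy": {"type": "string", "enum": ["summarize"]},
        "preserve_markers": {"type": "boolean", "default": True},
        "reason": {"type": "string", "minLength": 10, "maxLength": 256},
        "summary": {
            "type": "string",
            "minLength": 1,
            "maxLength": 4096,
            "description": "Agent-authored summary (optional).",
        },
    },
    # "summary" stays out of "required" so existing callers keep working.
    "required": ["strategy", "reason"],
    "additionalProperties": False,
}


def check_arguments(args: dict[str, Any]) -> list[str]:
    """Return a list of problems; a tiny stand-in for a real validator."""
    schema = COMPACT_CONTEXT_SCHEMA_SKETCH
    problems = [f"missing: {k}" for k in schema["required"] if k not in args]
    problems += [f"unknown: {k}" for k in args if k not in schema["properties"]]
    return problems
```

Because `additionalProperties` is `False`, a caller that sends a summary today would be rejected as an unknown key; declaring the property is what opens that path.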
Actionable comments posted: 4
♻️ Duplicate comments (1)
src/synthorg/tools/context/compact_context.py (1)
26-57: ⚠️ Potential issue | 🟠 Major
Schema lacks field for agent-provided summaries.
Per #1125 objectives, agents should be able to provide summaries that "replace or supplement automatic compaction." The current schema only accepts `strategy`, `preserve_markers`, and `reason`; there is no path to supply actual summary content through the tool boundary.
If this is intentional (e.g., deferred to a follow-up), please document that decision. Otherwise, consider adding an optional `summary` field.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/tools/context/compact_context.py` around lines 26 - 57, The schema defined in _RAW_SCHEMA does not allow agent-provided summaries; add an optional "summary" property to the "properties" map of _RAW_SCHEMA (type: "string", appropriate minLength/maxLength to match expected summary size, and a clear description stating it can "replace or supplement automatic compaction") and ensure it is not included in "required" so tools can optionally submit summaries; update any validation/consumer code that reads _RAW_SCHEMA to accept and forward the "summary" field where compaction logic (summarize strategy) consumes agent-provided content.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/design/engine.md`:
- Around line 1465-1468: The docs claim record_coordination_contributions() is
guarded by an asyncio.Lock but the method is synchronous and doesn't acquire
self._metrics_lock; change the implementation to actually synchronize by
converting record_coordination_contributions() into an async def and await
self._metrics_lock.acquire()/use async with self._metrics_lock around the write
logic (or alternatively use a separate threading.Lock if you must keep it sync),
and update any callers to await record_coordination_contributions() accordingly
so the lock (self._metrics_lock) is actually used as documented.
In `@src/synthorg/engine/compaction/summarizer.py`:
- Around line 325-345: The docstring for force_compaction is misleading about
the mechanism used to bypass the fill threshold; update the docstring to state
that the function calls _do_compaction with force=True which causes
_do_compaction to skip the fill-threshold comparison entirely (rather than
"temporarily using a zero-percent threshold"), and keep the existing description
of purpose and return values intact so readers understand that force_compaction
delegates to _do_compaction with a force flag to bypass threshold checks.
In `@src/synthorg/engine/coordination/service.py`:
- Around line 97-104: The factory that constructs MultiAgentCoordinator is not
passing the performance_tracker through, leaving
MultiAgentCoordinator._performance_tracker None; update the factory call that
instantiates MultiAgentCoordinator (the line that calls
MultiAgentCoordinator(...)) to forward the performance_tracker argument (either
pass performance_tracker as the corresponding positional parameter or include
performance_tracker=performance_tracker in the constructor call) so
MultiAgentCoordinator.__init__ receives and stores the tracker.
In `@tests/unit/engine/compaction/test_epistemic.py`:
- Around line 171-176: The test test_respects_max_chars_default is too
permissive (assert len(result) <= 210) and should assert the exact truncation
behavior: call extract_marker_sentences with the 306-char single-sentence input
and assert the returned string length equals 200 (since implementation uses
stripped[:max_chars]); update the assertion to assert len(result) == 200 and
adjust the comment to state exact truncation to 200 characters for the
single-sentence case.
---
Duplicate comments:
In `@src/synthorg/tools/context/compact_context.py`:
- Around line 26-57: The schema defined in _RAW_SCHEMA does not allow
agent-provided summaries; add an optional "summary" property to the "properties"
map of _RAW_SCHEMA (type: "string", appropriate minLength/maxLength to match
expected summary size, and a clear description stating it can "replace or
supplement automatic compaction") and ensure it is not included in "required" so
tools can optionally submit summaries; update any validation/consumer code that
reads _RAW_SCHEMA to accept and forward the "summary" field where compaction
logic (summarize strategy) consumes agent-provided content.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 94a08610-8ec9-40a2-92cd-7ab4979482bd
📒 Files selected for processing (9)
docs/design/engine.md
scripts/run_affected_tests.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/attribution.py
src/synthorg/engine/coordination/service.py
src/synthorg/tools/context/compact_context.py
tests/integration/tools/test_factory_integration.py
tests/unit/engine/compaction/test_epistemic.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use Python 3.14+ and do NOT use `from __future__ import annotations` (PEP 649 has native lazy annotations)
Use PEP 758 except syntax: use `except A, B:` (no parentheses) instead of `except (A, B):`
Provide type hints on all public functions and classes; enforce mypy strict mode
Use Google-style docstrings on all public classes and functions (enforced by ruff D rules)
Create new objects instead of mutating existing ones (immutability). For non-Pydantic internal collections (registries, BaseTool), use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement.
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Always use `allow_inf_nan=False` in all `ConfigDict` declarations. Use `@computed_field` for derived values instead of storing redundant fields. Use `NotBlankStr` from `core.types` for all identifier/name fields.
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.
Enforce 88-character line length (ruff)
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly; never silently swallow exceptions
Validate at system boundaries: user input, external APIs, and config files
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases. Vendor names may only appear in: (1) Operations design page (docs/design/operations.md), (2) `.claude/` files, (3) third-party import paths, (4) provider presets (`src/synt...
Files:
tests/integration/tools/test_factory_integration.py
scripts/run_affected_tests.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py
tests/unit/engine/compaction/test_epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/attribution.py
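The `copy.deepcopy()` plus `MappingProxyType` rule for non-Pydantic collections, listed in the instructions above, can be sketched as follows. `RegistrySketch` is a hypothetical class written for this example and does not correspond to any registry in the project.

```python
import copy
from collections.abc import Mapping
from types import MappingProxyType
from typing import Any


class RegistrySketch:
    """Hypothetical read-only registry; not a class from the project."""

    def __init__(self, entries: dict[str, dict[str, Any]]) -> None:
        # Deep-copy so later mutation of the caller's dict cannot leak in,
        # then wrap in a proxy so the top-level mapping rejects writes.
        self._entries: Mapping[str, dict[str, Any]] = MappingProxyType(
            copy.deepcopy(entries)
        )

    @property
    def entries(self) -> Mapping[str, dict[str, Any]]:
        return self._entries


source = {"echo": {"timeout": 5}}
registry = RegistrySketch(source)
source["echo"]["timeout"] = 99  # does not affect the registry's copy

try:
    registry.entries["new"] = {}  # type: ignore[index]
    write_blocked = False
except TypeError:
    write_blocked = True
```

Note that the proxy is shallow: it blocks assignment on the outer mapping only, so nested dicts remain mutable unless they are wrapped or frozen as well.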
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`. Minimum coverage 80% (enforced in CI). Use `asyncio_mode = "auto"` (no manual `@pytest.mark.asyncio` needed). Global 30-second timeout per test (do not add per-file markers except for non-default overrides like `timeout(60)`).
ALWAYS include `-n 8` when running pytest locally for parallelism with pytest-xdist; never run tests sequentially
Prefer `@pytest.mark.parametrize` for testing similar cases
Use Hypothesis for property-based testing with `@given` + `@settings`. Control via `HYPOTHESIS_PROFILE` env var: `ci` (10 examples, deterministic), `dev` (1000 examples), `fuzz` (10,000 examples, no deadline), `extreme` (500,000 examples, no deadline). When Hypothesis finds a failure, fix the bug and add an explicit `@example(...)` decorator to permanently cover the case in CI.
NEVER skip, dismiss, or ignore flaky tests. Fix them fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic. For tasks that must block indefinitely until cancelled, use `asyncio.Event().wait()` instead of `asyncio.sleep(large_number)`.
Files:
tests/integration/tools/test_factory_integration.py
tests/unit/engine/compaction/test_epistemic.py
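The last testing rule above (block until cancelled with `asyncio.Event().wait()` rather than a large sleep) can be shown with a small sketch. The function names here are invented for illustration and do not come from the test suite.

```python
import asyncio


async def block_until_cancelled(log: list[str]) -> None:
    """Wait forever without a magic sleep duration; exit only on cancel."""
    try:
        # A fresh Event is never set, so this wait never completes by itself.
        await asyncio.Event().wait()
    except asyncio.CancelledError:
        log.append("cancelled")
        raise


async def main() -> list[str]:
    log: list[str] = []
    task = asyncio.create_task(block_until_cancelled(log))
    await asyncio.sleep(0)  # let the task start and reach its wait
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("joined")
    return log


events = asyncio.run(main())
```

Unlike `asyncio.sleep(large_number)`, the wait has no duration that a slow run could hit, so the test outcome does not depend on timing.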
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
tests/integration/tools/test_factory_integration.py
tests/unit/engine/compaction/test_epistemic.py
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
ALWAYS read the relevant `docs/design/` page before implementing any feature or planning any issue. The design spec is the starting point for architecture, data models, and behavior. If implementation deviates from the spec, alert the user and explain why -- user decides whether to proceed or update the spec. Do NOT silently diverge.
Files:
docs/design/engine.md
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Every module with business logic MUST have logging via `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging`, `logging.getLogger()`, or `print()` in application code.
Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Use structured logging with kwargs: always `logger.info(EVENT, key=value)` -- never `logger.info("msg %s", val)`
Log all error paths at WARNING or ERROR with context before raising; log all state transitions at INFO; log object creation, internal flow, and key function entry/exit at DEBUG
All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`. Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately. `RetryExhaustedError` signals all retries failed.
For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Files:
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/attribution.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/attribution.py
🧠 Learnings (41)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: At every phase of planning and implementation, be critical -- actively look for ways to improve the design in the spirit of what we're building (robustness, correctness, simplicity, future-proofing where it's free). Surface improvements as suggestions, not silent changes -- user decides.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Prioritize issues by dependency order, not priority labels -- unblocked dependencies come first
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Never use `cd` in Bash commands -- the working directory is already set to the project root. Use absolute paths or run commands directly. Exception: `bash -c "cd <dir> && <cmd>"` is safe for tools without a `-C` flag.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Never use Bash to write or modify files -- use the Write or Edit tools. Do not use `cat >`, `cat << EOF`, `echo >`, `echo >>`, `sed -i`, `python -c`, or `tee` to create/modify files (read-only/inspection uses are fine).
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Use commit format: `<type>: <description>` with types: feat, fix, refactor, docs, test, chore, perf, ci (enforced by commitizen). All commits on main must be GPG/SSH signed. Use branch naming: `<type>/<slug>` from main.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Use squash merge strategy for PRs -- PR body becomes the squash commit message on main. Preserve existing `Closes #NNN` references in PR body; never remove unless explicitly asked.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: After finishing an issue implementation: always create a feature branch (`<type>/<slug>`), commit, and push -- do NOT create a PR automatically. Do NOT leave work uncommitted on main.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: NEVER create a PR directly with `gh pr create` (blocked by hookify). ALWAYS use `/pre-pr-review` to create PRs -- it runs automated checks + review agents + fixes before creating the PR. For trivial/docs-only changes, use `/pre-pr-review quick`. After PR exists, use `/aurelio-review-pr` for external feedback. Fix all valid review findings -- never skip.
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: See [docs/reference/claude-reference.md](docs/reference/claude-reference.md) for: Documentation layout, Docker commands, Package Structure, Releasing, CI pipelines, Dependencies
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Parallelism: pytest-xdist via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-04-02T18:48:02.489Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T18:48:02.489Z
Learning: ALWAYS run pytest with `-n auto` for parallel execution with pytest-xdist; never run tests sequentially
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: ALWAYS include `-n 8` when running pytest locally; never run tests sequentially; CI uses `-n auto`
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Pre-push hooks check: mypy type-check (affected modules only) + pytest unit tests (affected modules only) + golangci-lint + go vet + go test (CLI, conditional on `cli/**/*.go`) + eslint-web (web dashboard); foundational module changes (core, config, observability) or conftest changes trigger full runs
Applied to files:
scripts/run_affected_tests.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...
Applied to files:
docs/design/engine.md
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/engine/coordination/**/*.py : Task coordination uses multi-agent pipeline with 4 dispatchers (SAS/centralized/decentralized/context-dependent), wave execution, and workspace lifecycle integration.
Applied to files:
docs/design/engine.md
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Applied to files:
docs/design/engine.md
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...
Applied to files:
docs/design/engine.md
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-04-07T22:12:30.141Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to **/*.py : Create new objects for immutability, never mutate existing ones; for non-Pydantic internal collections use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-04-01T09:09:43.948Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:09:43.948Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement in non-Pydantic internal collections (registries, BaseTool)
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization) for `dict`/`list` fields
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-04-07T22:12:30.141Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-04-07T22:12:30.141Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-31T20:07:03.035Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T20:07:03.035Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising; all state transitions must log at INFO; DEBUG for object creation, internal flow, entry/exit of key functions
Applied to files:
src/synthorg/tools/context/compact_context.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`
Applied to files:
src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`
Applied to files:
src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)
Applied to files:
src/synthorg/engine/coordination/service.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.
Applied to files:
src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.
Applied to files:
src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
Applied to files:
src/synthorg/engine/compaction/epistemic.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).
Applied to files:
src/synthorg/engine/compaction/epistemic.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.
Applied to files:
src/synthorg/engine/compaction/epistemic.py
📚 Learning: 2026-04-02T09:04:54.709Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T09:04:54.709Z
Learning: Applies to tests/**/*.py : Use pytest markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`
Applied to files:
tests/unit/engine/compaction/test_epistemic.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).
Applied to files:
tests/unit/engine/compaction/test_epistemic.py
📚 Learning: 2026-04-07T22:12:30.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to tests/**/*.py : Use markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow` for test classification
Applied to files:
tests/unit/engine/compaction/test_epistemic.py
🔇 Additional comments (21)
scripts/run_affected_tests.py (2)
151-160: Good documentation of xdist behavior.

The expanded docstring clearly explains the rationale for both `--dist loadscope` and `--max-worker-restart=0`, which will help future maintainers understand why these options diverge from the default pytest configuration.
170-172: ⚠️ Potential issue | 🟡 Minor

Conflicting `--dist` modes between script and pyproject.toml require clarification.

The script adds `--dist loadscope` (line 171), but pyproject.toml specifies `--dist=worksteal` in addopts (line 256). When pytest processes both, the script's flag takes precedence, creating an inconsistency:
- This script runs tests with `loadscope` (module-based grouping)
- Normal pytest runs (e.g., `pytest tests/unit/`) use `worksteal` (dynamic work stealing)

If `loadscope` is required to resolve the fixture teardown/KeyError issues described in the docstring, update pyproject.toml to use `loadscope` globally for consistency. Alternatively, add an inline comment explaining why this script intentionally overrides the global setting.

Also, `-n 8` is redundant since pyproject.toml already specifies it in addopts.

⛔ Skipped due to learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Pre-push hooks check: mypy type-check (affected modules only) + pytest unit tests (affected modules only) + golangci-lint + go vet + go test (CLI, conditional on `cli/**/*.go`) + eslint-web (web dashboard); foundational module changes (core, config, observability) or conftest changes trigger full runs
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'`; no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto`; ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: ALWAYS include `-n 8` when running pytest locally; never run tests sequentially; CI uses `-n auto`
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Parallelism: pytest-xdist via `-n auto`; ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T18:48:02.489Z
Learning: ALWAYS run pytest with `-n auto` for parallel execution with pytest-xdist; never run tests sequentially
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to tests/**/*.py : 30-second timeout per test (global in `pyproject.toml`); do not add per-file `pytest.mark.timeout(30)` markers; non-default overrides like `timeout(60)` are allowed
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests; always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

src/synthorg/engine/coordination/attribution.py (5)
72-103: `AgentContribution` is missing fields specified in #1123 objectives.

Per the PR objectives for #1123: "Populate `AgentContribution` with `agent_id`, `subtask_id`, `outcome` (success/partial/failure), `failure_category` (via `infer_failure_category()`), and `token_usage`."

The current model has:
- ✅ `agent_id`, `subtask_id`
- ❌ `outcome` enum (success/partial/failure); replaced by `contribution_score`
- ❌ `failure_category` (the `FailureCategory` enum); replaced by `failure_attribution` (a different `Literal` type)
- ❌ `token_usage`

This deviation means downstream consumers (like `PerformanceTracker`) cannot access the original `FailureCategory` enum or token metrics for trend analysis. If this narrowing was intentional, please update the #1123 objectives to reflect the actual design.
226-295: LGTM!

The `_score_outcome` helper correctly:
- Handles pre-execution failures via `infer_failure_category` with evidence extraction
- Returns perfect score (1.0) for successful outcomes without failure attribution
- Maps termination reasons to failure attributions for non-success completions
- Provides a defensive fallback for unexpected missing result/error states

The `or None` at line 285 correctly converts empty strings to `None`, satisfying the model's optional evidence field.
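As a rough illustration of the scoring behavior described in this comment, here is a minimal sketch. It is not the code under review: `Score`, `score_outcome`, and the contents of the mapping are hypothetical stand-ins, and only the `"direct"` fallback, the 1.0 success score, and the `or None` conversion are taken from the comment.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder mapping: keys and values are illustrative assumptions, not the
# real _TERMINATION_TO_ATTRIBUTION contents.
TERMINATION_TO_ATTRIBUTION = {"max_turns": "stalled", "parked": "direct"}


@dataclass(frozen=True)
class Score:
    """Hypothetical, simplified stand-in for the scored contribution."""

    contribution_score: float
    failure_attribution: Optional[str] = None
    evidence: Optional[str] = None


def score_outcome(
    is_success: bool,
    termination_reason: Optional[str],
    error_text: str = "",
) -> Score:
    # Successful outcomes return early with a perfect score and no attribution.
    if is_success:
        return Score(1.0)
    # Unmapped reasons fall back to "direct", mirroring the .get(...) default.
    attribution = TERMINATION_TO_ATTRIBUTION.get(termination_reason, "direct")
    # `or None` turns an empty evidence string into None for the optional field.
    return Score(0.0, attribution, error_text or None)
```

Because the sketch only ever emits 0.0 or 1.0, it also shows why a `partial` outcome has no representation in this shape.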
161-223: LGTM!

The `build_agent_contributions` function correctly:
- Builds lookups from routing decisions with both task_id and agent_id mappings
- Prefers direct task_id matching (line 199-200) over agent lookup for multi-subtask agent scenarios
- Falls back gracefully to agent-based lookup when task_id isn't in routed subtasks
- Logs attribution metrics at INFO level with structured kwargs
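The lookup order described above (task_id first, agent-based fallback second) could look roughly like the following. The function names and the `(task_id, agent_id)` routing shape are assumptions made for illustration; the real routing decision model is not shown in this review.

```python
from collections import defaultdict


def build_lookups(routing):
    """Build both lookups from (task_id, agent_id) pairs (hypothetical shape)."""
    routed_task_ids = set()
    agent_to_subtasks = defaultdict(list)  # a list keeps every subtask per agent
    for task_id, agent_id in routing:
        routed_task_ids.add(task_id)
        agent_to_subtasks[agent_id].append(task_id)
    return routed_task_ids, agent_to_subtasks


def resolve_subtask_id(outcome_task_id, agent_id, routed_task_ids, agent_to_subtasks):
    # Prefer the direct task_id match so multi-subtask agents resolve correctly.
    if outcome_task_id in routed_task_ids:
        return outcome_task_id
    # Otherwise fall back to the first subtask routed to this agent, if any.
    candidates = agent_to_subtasks.get(agent_id, [])
    return candidates[0] if candidates else None
```

Storing a list per agent rather than a single value is what avoids the overwrite problem for agents that handle more than one subtask.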
62-69: LGTM!

The `_TERMINATION_TO_ATTRIBUTION` mapping correctly omits `TerminationReason.COMPLETED` since successful outcomes (`is_success=True`) return early at lines 261-266 and never reach the termination-to-attribution lookup. The `.get(termination_reason, "direct")` fallback at line 274 provides defensive handling.
120-158: LGTM!

The `CoordinationResultWithAttribution` wrapper correctly:
- Uses frozen Pydantic model with `allow_inf_nan=False` per coding guidelines
- Preserves the original `CoordinationResult` immutably
- Provides computed `is_success` that delegates to the wrapped result
- Calculates `avg_contribution_score` with safe empty-tuple handling

docs/design/engine.md (2)
1449-1450: LGTM!

The section header was appropriately updated to remove the "Phase 8" terminology that conflicted with the 7-phase pipeline, now correctly positioned as a post-pipeline step.

1490-1490: LGTM!

The ACG vocabulary cross-reference correctly reflects the new `NodeType` enum and `TurnRecord.node_types` field implementation, upgrading the fidelity from "Partial" to "Strong."

src/synthorg/engine/coordination/service.py (2)
212-248: LGTM!

The post-pipeline attribution building and tracker recording are properly guarded with try/except blocks that:
- Re-raise `MemoryError` and `RecursionError` per project convention
- Log at WARNING with context before continuing
- Default `contributions` to an empty tuple on error, preserving the successful coordination result

This addresses the past review concerns about attribution/tracker failures failing an already-completed coordination run.
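The guard pattern praised here can be sketched as below. This is an illustration only: `safe_build_contributions` is a hypothetical helper, and stdlib `logging` is used purely to keep the snippet self-contained (the project convention quoted in the learnings calls for `get_logger` from `synthorg.observability` instead).

```python
import logging

# Assumption: stdlib logging stands in for the project's get_logger helper.
logger = logging.getLogger(__name__)


def safe_build_contributions(build, *args):
    """Run a post-pipeline attribution step without failing a completed run."""
    try:
        return tuple(build(*args))
    except (MemoryError, RecursionError):
        # Non-recoverable errors are re-raised rather than swallowed.
        raise
    except Exception as exc:
        # Log at WARNING with context, then default to an empty tuple.
        logger.warning("attribution build failed: %s", exc)
        return ()
```

The ordering of the `except` clauses matters: the narrow re-raise has to come before the broad `Exception` handler, since both `MemoryError` and `RecursionError` are `Exception` subclasses.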
106-131: LGTM!

The method signature and docstring correctly document the return type change to `CoordinationResultWithAttribution` and the new attribution phase (Phase 8) in the pipeline description.

tests/integration/tools/test_factory_integration.py (1)

105-110: LGTM!

The test now combines count verification with specific tool name assertions, which addresses maintainability concerns. The checks for `compact_context`, `read_file`, and `write_file` ensure critical tools are registered.

src/synthorg/tools/context/compact_context.py (1)
73-121: LGTM!

The tool implementation correctly:
- Deep-copies the schema for the parameters to prevent mutation
- Sanitizes the `reason` before logging (addressing privacy concerns)
- Returns a directive via metadata for the execution loop to process
src/synthorg/engine/compaction/epistemic.py (2)
1-70: LGTM!

The epistemic marker detection implementation is clean:
- Precompiled patterns with `re.IGNORECASE` for case-insensitive matching
- `count_epistemic_markers` correctly counts distinct pattern groups (not raw matches)
- Good reference to the arXiv paper justifying the approach

73-138: LGTM!

The preservation logic and sentence extraction are well-implemented:
- Complexity-adaptive thresholds match the #1125 spec (COMPLEX/EPIC: 1+, SIMPLE/MEDIUM: 3+)
- Separator accounting correctly handles the first sentence (no preceding separator)
- Edge case for first-sentence truncation ensures non-empty output when needed
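To make the "distinct groups, not raw matches" point concrete, here is a small sketch. The three pattern groups and their phrases are invented for illustration (the module reportedly has five groups, whose exact patterns are not shown here), and complexity is passed as a plain string rather than the `Complexity` enum.

```python
import re

# Hypothetical marker groups; the real patterns are not visible in this review.
_MARKER_GROUPS = {
    "hedging": re.compile(r"\b(perhaps|might|not sure)\b", re.IGNORECASE),
    "reconsideration": re.compile(r"\b(wait|actually)\b", re.IGNORECASE),
    "verification": re.compile(r"\b(let me check|double-check)\b", re.IGNORECASE),
}

# Thresholds as quoted from the #1125 spec: COMPLEX/EPIC 1+, SIMPLE/MEDIUM 3+.
_THRESHOLDS = {"complex": 1, "epic": 1, "simple": 3, "medium": 3}


def count_epistemic_markers(text: str) -> int:
    # Each group counts at most once, however many times it matches.
    return sum(1 for pattern in _MARKER_GROUPS.values() if pattern.search(text))


def should_preserve_message(text: str, complexity: str) -> bool:
    return count_epistemic_markers(text) >= _THRESHOLDS[complexity]
```

With this shape, a message that repeats "wait" three times still scores 1, so it is preserved for a complex task but not for a simple one.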
tests/unit/engine/compaction/test_epistemic.py (2)
13-104: LGTM!

Comprehensive test coverage for marker detection:
- Tests all five marker groups individually and in combination
- Verifies deduplication (same group counted once)
- Tests complexity-adaptive thresholds for all `Complexity` variants
- Confirms "no markers → never preserve" behavior

144-154: LGTM!

This test now properly exercises the first-sentence truncation path by using a single long marker sentence that exceeds `max_chars`, and asserts the exact truncated output.

src/synthorg/engine/compaction/summarizer.py (4)
90-97: LGTM!

The threshold logic correctly implements dual-threshold behavior per #1125:
- When `agent_controlled=True`: uses `safety_threshold_percent` as the safety net
- Otherwise: uses `fill_threshold_percent` for automatic compaction
- The `force=True` path bypasses the check entirely for agent-initiated requests
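The dual-threshold decision summarized above reduces to a few lines. This is a sketch under assumptions: the function name and keyword-only signature are hypothetical, and whether the real comparison is `>=` or `>` is not stated in the review.

```python
def should_compact(
    fill_percent: float,
    *,
    fill_threshold_percent: float,
    safety_threshold_percent: float,
    agent_controlled: bool,
    force: bool = False,
) -> bool:
    # Agent-initiated (forced) requests bypass the threshold check entirely.
    if force:
        return True
    # In agent-controlled mode only the higher safety-net threshold applies.
    threshold = (
        safety_threshold_percent if agent_controlled else fill_threshold_percent
    )
    # Assumption: inclusive comparison; the real code may use a strict one.
    return fill_percent >= threshold
```

For example, at 85% fill with thresholds of 80 and 95, automatic mode compacts while agent-controlled mode waits for either the agent's request or the 95% safety net.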
235-244: LGTM!

Defensive extraction with sensible defaults:
- Chains through `getattr` to safely navigate context structure
- Defaults to `Complexity.COMPLEX` when complexity is unavailable, which is the conservative choice (preserves more markers at the lower threshold)
280-298: LGTM!

The epistemic marker preservation integrates cleanly:
- Checks preservation decision via `should_preserve_message` with task complexity
- Extracts marker sentences (already length-limited to 200 chars by default)
- Sanitizes content without re-truncating via `max_length=max(len(marker_text), 1)`
- Falls back to standard 100-char snippet when no markers are worth preserving
314-321: LGTM!

Singular/plural handling is now correct: "1 message" vs "N messages".
…ing eval

#1123: Execution trace enrichment
- Add NodeType enum (LLM_CALL, TOOL_INVOCATION, QUALITY_CHECK, BUDGET_CHECK, STAGNATION_CHECK) to loop_protocol.py
- Add node_types field to TurnRecord (auto-derived in make_turn_record)
- Add AgentContribution + CoordinationResultWithAttribution wrapper in coordination/attribution.py with build_agent_contributions() factory
- Update CoordinationService.coordinate() to return attributed result
- Add record_coordination_contributions() to PerformanceTracker
- Update all callers (agent_engine, API controller, tests)

#1125: Agent-controlled compaction + epistemic marker preservation
- Add epistemic marker detection (compaction/epistemic.py) with complexity-adaptive thresholds (COMPLEX/EPIC: 1+, SIMPLE/MEDIUM: 3+)
- Rewrite _build_summary() to preserve marker-containing sentences
- Add CompactContextTool (tools/context/compact_context.py) for agent-initiated compaction via metadata directive
- Add CompactionConfig fields: agent_controlled, safety_threshold_percent, preserve_epistemic_markers with dual-threshold behavior
- Add force_compaction() that bypasses threshold check
- Register compact_context in build_default_tools()

#1113: Versioning infrastructure evaluation
- Evaluate WorkflowDefinitionVersion migration to generic VersionSnapshot[T] (favorable: content hash dedup, concurrent write safety, schema consistency)
- Create follow-up issues: #1131 (migration), #1132 (config versioning), #1133 (role catalog versioning)

Closes #1123
Closes #1125
Closes #1113
…odeRabbit

Critical:
- Document 8 new context_budget + 15 new coordination event constants in CLAUDE.md
- Add asyncio.Lock to PerformanceTracker for shared mutable dict access

Major:
- Fix epistemic marker extraction: first sentence exceeding max_chars no longer returns empty string; fix off-by-2 separator accounting for first sentence
- Add TerminationReason.PARKED to _TERMINATION_TO_ATTRIBUTION mapping
- Guard tracker write in coordination service with try/except post-completion
- Fix agent-to-subtask mapping overwrite for multi-subtask agents (use list)
- Sanitize reason text in compact_context tool before logging/propagation
- Sanitize marker_text in summarizer before appending to summary
- Use deepcopy() at construction for CompactContextTool schema
- Replace asyncio.gather with TaskGroup for agent resolution in controller
- Add missing currency field to frontend CoordinationResultResponse type
- Rename total_contribution_score to avg_contribution_score (was computing avg)
- Document attribution subsystem in docs/design/engine.md

Medium:
- Remove unreachable length check in epistemic.py
- Fix "from 1 messages" grammar (singular/plural)
- Extract attributed.result into local vars in coordination controller
- Expand CompactionConfig docstring for dual-threshold agent-controlled mode
- Add Protocol types for outcome_result type safety under TYPE_CHECKING
- Update ACG cross-reference: NodeType now provides formal node typing
- Narrow pytest.raises(Exception) to ValidationError in frozen model tests

Minor:
- Clarify whitespace stripping assertions in test_epistemic.py
- Strengthen force_compaction test to assert result is not None
- Refactor force_compaction to delegate to _do_compaction(force=True)
…esult

- Bump expected tool count from 14 to 15 in test_factory_integration.py
- Wrap mock coordinate result in CoordinationResultWithAttribution
…rashes

With the default --dist load, xdist scatters individual tests across workers randomly, causing each worker to repeatedly create and tear down heavy Litestar TestClient apps for different modules. This leads to resource exhaustion and random worker crashes when running the full 15K+ unit suite.

--dist loadscope groups tests by module so each worker handles related tests together, dramatically reducing fixture churn.

Verified: 15,627 passed, 0 failed (previously 4-8 random failures per run).
- Rename "Per-Agent Attribution (Phase 8)" heading to "Per-Agent Attribution" in design spec (attribution is post-pipeline)
- Add logger to epistemic.py for convention consistency
- Remove duplicate STARTED log from force_compaction, add forced= kwarg to canonical INFO emission in _do_compaction
- Guard build_agent_contributions with try/except in coordinator so attribution failures don't fail completed runs
- Replace round-robin subtask mapping with task_id-based lookup (handles multi-subtask agents and out-of-order outcomes)
- Wrap _COMPACT_CONTEXT_SCHEMA in MappingProxyType for read-only enforcement; deepcopy from raw dict at construction
- Add tool name presence checks to factory integration test
- Add test for first-sentence truncation path in epistemic extraction
- Fix misleading force_compaction docstring (now correctly describes force=True delegation instead of zero-percent threshold)
- Correct design spec: record_coordination_contributions() is sync (no await points, dict ops are atomic in single-threaded event loop); record_task_metric() is the method guarded by asyncio.Lock
- Wire performance_tracker through build_coordinator factory so attribution data is actually recorded (was defaulting to None)
- Tighten test_respects_max_chars_default assertion: single sentence exceeding 200 chars is truncated to exactly 200
The test_evaluator_exception_logs_event test only checked capsys.out but structlog may route to stderr when configured by a preceding test in the same xdist worker. Check combined stdout+stderr output.
2ac69ca to d8a975c
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/run_affected_tests.py (1)
150-176: ⚠️ Potential issue | 🟡 Minor

Document the intentional deviation from project defaults in `pyproject.toml`.

The script overrides `--dist` from the project default `worksteal` (in `pyproject.toml`) to `loadscope`. While the docstring explains the rationale (preventing worker crashes from fixture teardown/setup overhead and working around a loadscope-specific xdist scheduler bug), add a comment to clarify this is an intentional deviation from the project's default distribution strategy. This helps developers understand why direct `pytest` calls may behave differently than tests run through this script.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/run_affected_tests.py` around lines 150 - 176, Add a brief comment in the _run_pytest function (near the cmd list / where "--dist", "loadscope" is set) stating that using "--dist loadscope" intentionally deviates from the project's pyproject.toml default of "worksteal" and explaining in one line that this choice prevents heavy fixture teardown/setup across workers and works around a known xdist scheduler issue; reference both the cmd variable and pyproject.toml so future readers understand this is deliberate.
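One way the requested comment could sit next to the command list is sketched below. `build_pytest_cmd` is a hypothetical helper (the script's actual function is `_run_pytest`, whose body is not shown here); the flags themselves are the ones discussed in this thread.

```python
import sys


def build_pytest_cmd(test_paths: list[str]) -> list[str]:
    """Assemble the pytest invocation for affected tests (illustrative only)."""
    return [
        sys.executable,
        "-m",
        "pytest",
        "-n",
        "8",
        # Intentional deviation from pyproject.toml, whose addopts default to
        # --dist=worksteal: loadscope groups tests by module, which avoids
        # heavy fixture setup/teardown churn across xdist workers.
        "--dist",
        "loadscope",
        "--max-worker-restart=0",
        *test_paths,
    ]
```

Keeping the explanation beside the flag, rather than only in the docstring, makes the override visible to anyone comparing this script's behavior with a direct `pytest` run.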
♻️ Duplicate comments (2)
src/synthorg/engine/coordination/attribution.py (1)
72-103: ⚠️ Potential issue | 🟠 Major
`AgentContribution` still can't represent the #1123 contract.
The model only preserves score/attribution/evidence, and `_score_outcome()` only emits `0.0` or `1.0`, so `partial` outcomes can never be represented. `outcome`, `failure_category`, and `token_usage` are therefore dropped before `PerformanceTracker.record_coordination_contributions()` ever sees them.
Based on learnings: If implementation deviates from the design spec, alert the user and explain why -- user decides whether to proceed or update the spec.
Also applies to: 226-286
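One possible shape for a contribution record that keeps the dropped fields, shown as a sketch only. It uses a stdlib dataclass rather than the project's actual model class, and the `Outcome` enum, the field types, and the completed/total rule for partial scores are all assumptions made for illustration. The #1123 contract itself is not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(str, Enum):
    """Hypothetical outcome values; the real contract may name them differently."""
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILURE = "failure"


@dataclass(frozen=True)
class AgentContributionSketch:
    agent_id: str
    outcome: Outcome
    contribution_score: float
    failure_category: Optional[str] = None
    token_usage: int = 0


def score_outcome(outcome: Outcome, completed: int = 0, total: int = 0) -> float:
    """Map an outcome to [0.0, 1.0]; partial outcomes scale with completed work."""
    if outcome is Outcome.SUCCESS:
        return 1.0
    if outcome is Outcome.FAILURE or total <= 0:
        return 0.0
    # Assumed rule: a partial outcome scores the fraction of subtasks completed.
    return max(0.0, min(1.0, completed / total))


contribution = AgentContributionSketch(
    agent_id="agent-1",
    outcome=Outcome.PARTIAL,
    contribution_score=score_outcome(Outcome.PARTIAL, completed=1, total=4),
    token_usage=1200,
)
```

Carrying `outcome`, `failure_category`, and `token_usage` on the record itself means a downstream recorder can forward them without re-deriving anything from the score.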
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/coordination/attribution.py` around lines 72 - 103, AgentContribution currently drops outcome, failure_category, and token_usage because its fields only hold contribution_score/failure_attribution/evidence and _score_outcome() returns only 0.0 or 1.0; update AgentContribution to include outcome, failure_category, and token_usage fields (types matching the contract), adjust _score_outcome() to return/emit partial scores (not just binary 0/1) so partial outcomes are preserved, and ensure PerformanceTracker.record_coordination_contributions() accepts and forwards these new fields instead of truncating them; locate and update the AgentContribution class, the _score_outcome() implementation, and record_coordination_contributions() to carry the added fields end-to-end.
src/synthorg/tools/context/compact_context.py (1)
26-57: ⚠️ Potential issue | 🟠 Major
Schema still lacks agent-provided summary field per PR objectives.
Per #1125 objectives: "Provide an agent-invokable `compact_context` tool allowing agents to supply summaries that replace or supplement automatic compaction." The schema only accepts `strategy`, `preserve_markers`, and `reason`; there's no field for agent-authored summary content. This matches a past review comment that was not marked resolved.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/tools/context/compact_context.py` around lines 26 - 57, The JSON schema _RAW_SCHEMA for the compact_context tool is missing a field for agent-provided summary content; add an optional string property (e.g., "agent_summary") with sensible constraints (minLength and maxLength) and include it in properties, update any validation logic that uses _RAW_SCHEMA to accept and prefer "agent_summary" when present (so functions consuming _RAW_SCHEMA such as the compaction handler will use agent_summary to replace or supplement automatic summaries), and ensure additionalProperties remains False so only defined keys ("strategy","preserve_markers","reason","agent_summary") are allowed.
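A hedged sketch of what the extended schema could look like. The `agent_summary` name comes from the review prompt above, but the types and length limits of every property here are assumptions, since the actual `_RAW_SCHEMA` is not shown in this thread. `pick_summary` is a hypothetical helper illustrating the "prefer the agent's summary when present" behavior.

```python
# Assumed property types and limits; only the key names come from the review.
RAW_SCHEMA_SKETCH = {
    "type": "object",
    "properties": {
        "strategy": {"type": "string"},
        "preserve_markers": {"type": "boolean"},
        "reason": {"type": "string", "maxLength": 500},
        "agent_summary": {"type": "string", "minLength": 1, "maxLength": 4000},
    },
    "additionalProperties": False,
}


def pick_summary(arguments: dict, automatic_summary: str) -> str:
    """Prefer a non-empty agent-authored summary over the automatic one."""
    agent_summary = arguments.get("agent_summary")
    if isinstance(agent_summary, str) and agent_summary.strip():
        return agent_summary.strip()
    return automatic_summary


print(pick_summary({"agent_summary": "Fixed the cache bug; tests pending."}, "auto"))
```

Keeping the field optional preserves the existing call shape, so agents that only pass `strategy` or `reason` continue to get automatic summaries.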
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/engine/loop_helpers.py`:
- Around line 513-518: Derived node_types currently includes
NodeType.TOOL_INVOCATION as soon as response.tool_calls exists, which can
produce false positives when tool calls are later cancelled (e.g.,
clear_last_turn_tool_calls()); fix by delaying or recomputing TOOL_INVOCATION
inclusion based on actual executed/retained tool calls rather than the initial
response.tool_calls: update the logic around the derived/node_types computation
(the derived list, variable node_types and its creation) so it is calculated
after invoker/verification paths, or make clear_last_turn_tool_calls remove or
reset the TOOL_INVOCATION flag from node_types; reference the response object,
derived list, node_types tuple creation, and the clear_last_turn_tool_calls call
to locate and change the code.
In `@src/synthorg/tools/context/compact_context.py`:
- Around line 113-121: The tool emits metadata {"compaction_directive": True,
"strategy": strategy, "preserve_markers": preserve_markers, "reason":
sanitized_reason} via ToolExecutionResult in compact_context.py, but
execute_tool_calls() in src/synthorg/engine/loop_helpers.py currently ignores
metadata; update execute_tool_calls() to inspect each tool_result.metadata for
the "compaction_directive" key and, when present, call the existing
force_compaction(...) helper (or the appropriate compaction entrypoint) with the
provided strategy, preserve_markers, and reason at the turn boundary before
adding the tool result to context; ensure you handle absent metadata safely (use
.get) and only trigger compaction once per directive.
---
Outside diff comments:
In `@scripts/run_affected_tests.py`:
- Around line 150-176: Add a brief comment in the _run_pytest function (near the
cmd list / where "--dist", "loadscope" is set) stating that using "--dist
loadscope" intentionally deviates from the project's pyproject.toml default of
"worksteal" and explaining in one line that this choice prevents heavy fixture
teardown/setup across workers and works around a known xdist scheduler issue;
reference both the cmd variable and pyproject.toml so future readers understand
this is deliberate.
---
Duplicate comments:
In `@src/synthorg/engine/coordination/attribution.py`:
- Around line 72-103: AgentContribution currently drops outcome,
failure_category, and token_usage because its fields only hold
contribution_score/failure_attribution/evidence and _score_outcome() returns
only 0.0 or 1.0; update AgentContribution to include outcome, failure_category,
and token_usage fields (types matching the contract), adjust _score_outcome() to
return/emit partial scores (not just binary 0/1) so partial outcomes are
preserved, and ensure PerformanceTracker.record_coordination_contributions()
accepts and forwards these new fields instead of truncating them; locate and
update the AgentContribution class, the _score_outcome() implementation, and
record_coordination_contributions() to carry the added fields end-to-end.
In `@src/synthorg/tools/context/compact_context.py`:
- Around line 26-57: The JSON schema _RAW_SCHEMA for the compact_context tool is
missing a field for agent-provided summary content; add an optional string
property (e.g., "agent_summary") with sensible constraints (minLength and
maxLength) and include it in properties, update any validation logic that uses
_RAW_SCHEMA to accept and prefer "agent_summary" when present (so functions
consuming _RAW_SCHEMA such as the compaction handler will use agent_summary to
replace or supplement automatic summaries), and ensure additionalProperties
remains False so only defined keys
("strategy","preserve_markers","reason","agent_summary") are allowed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8690bb50-f080-48d3-82b3-2747a483c437
📒 Files selected for processing (37)
- CLAUDE.md
- docs/design/engine.md
- scripts/run_affected_tests.py
- src/synthorg/api/controllers/coordination.py
- src/synthorg/engine/__init__.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/engine/compaction/epistemic.py
- src/synthorg/engine/compaction/models.py
- src/synthorg/engine/compaction/summarizer.py
- src/synthorg/engine/coordination/__init__.py
- src/synthorg/engine/coordination/attribution.py
- src/synthorg/engine/coordination/factory.py
- src/synthorg/engine/coordination/service.py
- src/synthorg/engine/loop_helpers.py
- src/synthorg/engine/loop_protocol.py
- src/synthorg/hr/performance/tracker.py
- src/synthorg/observability/events/context_budget.py
- src/synthorg/observability/events/coordination.py
- src/synthorg/tools/context/__init__.py
- src/synthorg/tools/context/compact_context.py
- src/synthorg/tools/factory.py
- tests/integration/engine/test_coordination_wiring.py
- tests/integration/tools/test_factory_integration.py
- tests/unit/api/controllers/test_coordination.py
- tests/unit/communication/conflict_resolution/test_debate_strategy.py
- tests/unit/engine/compaction/test_epistemic.py
- tests/unit/engine/compaction/test_models.py
- tests/unit/engine/compaction/test_summarizer_markers.py
- tests/unit/engine/coordination/test_attribution.py
- tests/unit/engine/coordination/test_attribution_factory.py
- tests/unit/engine/test_coordination_service.py
- tests/unit/tools/context/__init__.py
- tests/unit/tools/context/test_compact_context.py
- tests/unit/tools/test_factory.py
- tests/unit/tools/test_factory_new_categories.py
- tests/unit/tools/test_factory_sandbox_wiring.py
- web/src/api/types.ts
    # Auto-derive base node types from response content.
    derived: list[NodeType] = [NodeType.LLM_CALL]
    if response.tool_calls:
        derived.append(NodeType.TOOL_INVOCATION)
    node_types = tuple(derived) + extra_node_types
TOOL_INVOCATION is recorded before execution is guaranteed.
Line 515 through Line 517 marks NodeType.TOOL_INVOCATION from requested tool calls, but on non-execution paths (e.g., the no-invoker error path that calls clear_last_turn_tool_calls()), node_types is not corrected. This can produce false-positive execution traces.
💡 Proposed fix
 def clear_last_turn_tool_calls(turns: list[TurnRecord]) -> None:
@@
     if turns:
         last = turns[-1]
         turns[-1] = last.model_copy(
-            update={"tool_calls_made": (), "tool_call_fingerprints": ()},
+            update={
+                "tool_calls_made": (),
+                "tool_call_fingerprints": (),
+                "node_types": tuple(
+                    nt for nt in last.node_types if nt != NodeType.TOOL_INVOCATION
+                ),
+            },
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/engine/loop_helpers.py` around lines 513 - 518, Derived
node_types currently includes NodeType.TOOL_INVOCATION as soon as
response.tool_calls exists, which can produce false positives when tool calls
are later cancelled (e.g., clear_last_turn_tool_calls()); fix by delaying or
recomputing TOOL_INVOCATION inclusion based on actual executed/retained tool
calls rather than the initial response.tool_calls: update the logic around the
derived/node_types computation (the derived list, variable node_types and its
creation) so it is calculated after invoker/verification paths, or make
clear_last_turn_tool_calls remove or reset the TOOL_INVOCATION flag from
node_types; reference the response object, derived list, node_types tuple
creation, and the clear_last_turn_tool_calls call to locate and change the code.
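The second option in the prompt (having `clear_last_turn_tool_calls` strip the flag) can be tried out in isolation with the sketch below. It swaps the project's model for a frozen stdlib dataclass, so `TurnRecordSketch`, the enum values, and the use of `dataclasses.replace` are stand-ins rather than the real `TurnRecord` API.

```python
from dataclasses import dataclass, replace
from enum import Enum


class NodeType(Enum):
    # Values are placeholders; only the member names come from the PR.
    LLM_CALL = "llm_call"
    TOOL_INVOCATION = "tool_invocation"


@dataclass(frozen=True)
class TurnRecordSketch:
    tool_calls_made: tuple[str, ...] = ()
    node_types: tuple[NodeType, ...] = (NodeType.LLM_CALL,)


def clear_last_turn_tool_calls(turns: list[TurnRecordSketch]) -> None:
    """Drop tool calls from the last turn and remove the TOOL_INVOCATION flag."""
    if not turns:
        return
    last = turns[-1]
    turns[-1] = replace(
        last,
        tool_calls_made=(),
        node_types=tuple(
            nt for nt in last.node_types if nt is not NodeType.TOOL_INVOCATION
        ),
    )


turns = [
    TurnRecordSketch(
        tool_calls_made=("search",),
        node_types=(NodeType.LLM_CALL, NodeType.TOOL_INVOCATION),
    )
]
clear_last_turn_tool_calls(turns)
print(turns[-1].node_types)
```

Filtering rather than rebuilding the tuple leaves any other node types recorded for the turn (for example a budget check) untouched.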
    return ToolExecutionResult(
        content=("Compaction directive accepted. Will execute at turn boundary."),
        metadata={
            "compaction_directive": True,
            "strategy": strategy,
            "preserve_markers": preserve_markers,
            "reason": sanitized_reason,
        },
    )
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for code that reads compaction_directive from tool result metadata
rg -n "compaction_directive" --type=py -C 3
Repository: Aureliolo/synthorg
Length of output: 3322
🏁 Script executed:
#!/bin/bash
# Check if force_compaction is called anywhere
rg -n "force_compaction" --type=py
Repository: Aureliolo/synthorg
Length of output: 1121
🏁 Script executed:
#!/bin/bash
# Check the execute_tool_calls function to see if it processes metadata
rg -n "def execute_tool_calls" --type=py -A 50
Repository: Aureliolo/synthorg
Length of output: 3817
🏁 Script executed:
#!/bin/bash
# Get the full execute_tool_calls function
sed -n '254,400p' src/synthorg/engine/loop_helpers.py
Repository: Aureliolo/synthorg
Length of output: 4536
Wire compaction_directive metadata into execution loop.
The tool returns metadata={"compaction_directive": True, ...} but execute_tool_calls() in src/synthorg/engine/loop_helpers.py (lines 317–320) adds tool results to context without inspecting metadata or calling force_compaction(). The directive is emitted but never acted upon. Add logic to read the metadata key and trigger compaction at the appropriate point in the turn boundary.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/tools/context/compact_context.py` around lines 113 - 121, The
tool emits metadata {"compaction_directive": True, "strategy": strategy,
"preserve_markers": preserve_markers, "reason": sanitized_reason} via
ToolExecutionResult in compact_context.py, but execute_tool_calls() in
src/synthorg/engine/loop_helpers.py currently ignores metadata; update
execute_tool_calls() to inspect each tool_result.metadata for the
"compaction_directive" key and, when present, call the existing
force_compaction(...) helper (or the appropriate compaction entrypoint) with the
provided strategy, preserve_markers, and reason at the turn boundary before
adding the tool result to context; ensure you handle absent metadata safely (use
.get) and only trigger compaction once per directive.
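A rough sketch of the wiring this comment asks for, under several assumptions: `ToolResultSketch` stands in for the real tool result type, `compact` is a caller-supplied callable playing the role of `force_compaction(...)` (whose actual signature is not shown here), and the default of `True` for `preserve_markers` is a guess.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class ToolResultSketch:
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def find_compaction_directive(
    results: list[ToolResultSketch],
) -> Optional[dict[str, Any]]:
    """Return the first compaction directive in the results, or None."""
    for result in results:
        metadata = result.metadata or {}  # tolerate missing or empty metadata
        if metadata.get("compaction_directive"):
            return {
                "strategy": metadata.get("strategy"),
                "preserve_markers": metadata.get("preserve_markers", True),
                "reason": metadata.get("reason", ""),
            }
    return None


def apply_directive_once(
    results: list[ToolResultSketch], compact: Callable[..., None]
) -> bool:
    """Trigger compaction at most once per turn, even with several directives."""
    directive = find_compaction_directive(results)
    if directive is None:
        return False
    compact(**directive)
    return True


calls: list[dict[str, Any]] = []
results = [
    ToolResultSketch("ok"),
    ToolResultSketch("accepted", {"compaction_directive": True, "strategy": "summarize"}),
    ToolResultSketch("accepted", {"compaction_directive": True, "strategy": "truncate"}),
]
triggered = apply_directive_once(results, lambda **kwargs: calls.append(kwargs))
```

Collecting the directive first and calling the compaction hook afterwards keeps the decision at the turn boundary instead of interleaving it with per-result context updates.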
Summary
Engine intelligence improvements: execution trace enrichment, agent-controlled compaction with epistemic marker preservation, and versioning infrastructure evaluation.
#1123: Execution trace enrichment
- Add `NodeType` enum (`LLM_CALL`, `TOOL_INVOCATION`, `QUALITY_CHECK`, `BUDGET_CHECK`, `STAGNATION_CHECK`) to `loop_protocol.py`
- Add `node_types` field to `TurnRecord` (auto-derived in `make_turn_record`)
- Add `AgentContribution` + `CoordinationResultWithAttribution` wrapper in `coordination/attribution.py` with `build_agent_contributions()` factory
- Update `CoordinationService.coordinate()` to return attributed result
- Add `record_coordination_contributions()` to `PerformanceTracker` and wire into coordination service

#1125: Agent-controlled compaction + epistemic marker preservation
- Epistemic marker detection (`compaction/epistemic.py`) with complexity-adaptive thresholds (COMPLEX/EPIC: 1+, SIMPLE/MEDIUM: 3+)
- Update `_build_summary()` to preserve marker-containing sentences (ref: arXiv:2603.24472 -- removing markers degrades accuracy by up to 63%)
- `CompactContextTool` (`tools/context/compact_context.py`) for agent-initiated compaction via metadata directive
- `CompactionConfig` fields: `agent_controlled`, `safety_threshold_percent`, `preserve_epistemic_markers` with dual-threshold behavior
- `force_compaction()` that bypasses threshold check
- `compact_context` in `build_default_tools()`

#1113: Versioning infrastructure evaluation
- `WorkflowDefinitionVersion` migration to generic `VersionSnapshot[T]` (favorable: content hash dedup, concurrent write safety, schema consistency)

Test plan
Review coverage
Closes #1123
Closes #1125
Closes #1113
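For readers unfamiliar with the complexity-adaptive thresholds mentioned in the #1125 summary, a toy version might look like the following. The marker word list, the function names, and the reading of "1+" / "3+" as the minimum marker count that triggers preservation are all assumptions. The real logic lives in `compaction/epistemic.py` and may differ.

```python
import re

# Hypothetical marker patterns covering hedging and reconsideration.
_MARKER_RE = re.compile(
    r"\b(?:perhaps|maybe|might|not sure|on second thought|actually|wait)\b",
    re.IGNORECASE,
)

# Thresholds from the PR summary: COMPLEX/EPIC need 1+ markers, SIMPLE/MEDIUM 3+.
_MIN_MARKERS = {"complex": 1, "epic": 1, "simple": 3, "medium": 3}


def count_markers(text: str) -> int:
    """Count epistemic markers in the text (case-insensitive)."""
    return len(_MARKER_RE.findall(text))


def should_preserve(text: str, complexity: str) -> bool:
    """Decide whether marker-bearing text should survive compaction."""
    return count_markers(text) >= _MIN_MARKERS[complexity.lower()]


sample = "Wait, maybe the cache is stale."
print(count_markers(sample), should_preserve(sample, "COMPLEX"))
```

The lower threshold for complex work reflects the idea that a single reconsideration is more likely to matter there, while simpler tasks need stronger evidence before extra text is kept.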