feat: engine intelligence v2 -- trace enrichment, compaction, versioning eval by Aureliolo · Pull Request #1139 · Aureliolo/synthorg

Aureliolo · 2026-04-08T05:09:41Z

Summary

Engine intelligence improvements: execution trace enrichment, agent-controlled compaction with epistemic marker preservation, and versioning infrastructure evaluation.

#1123: Execution trace enrichment

Add NodeType enum (LLM_CALL, TOOL_INVOCATION, QUALITY_CHECK, BUDGET_CHECK, STAGNATION_CHECK) to loop_protocol.py
Add node_types field to TurnRecord (auto-derived in make_turn_record)
Add AgentContribution + CoordinationResultWithAttribution wrapper in coordination/attribution.py with build_agent_contributions() factory
Update CoordinationService.coordinate() to return attributed result
Add record_coordination_contributions() to PerformanceTracker and wire into coordination service
Update all callers (agent engine, API controller, tests)
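
A minimal sketch of how per-turn node types might be derived. The enum member names come from the list above; the string values, the `derive_node_types` helper, and its parameters are hypothetical and are not taken from `make_turn_record()`.

```python
from enum import Enum


class NodeType(str, Enum):
    """Kinds of work a single turn can contain (member names from the PR)."""

    LLM_CALL = "llm_call"
    TOOL_INVOCATION = "tool_invocation"
    QUALITY_CHECK = "quality_check"
    BUDGET_CHECK = "budget_check"
    STAGNATION_CHECK = "stagnation_check"


def derive_node_types(
    tool_call_count: int,
    *,
    budget_checked: bool = False,
    stagnation_checked: bool = False,
) -> tuple[NodeType, ...]:
    """Hypothetical helper: assume every turn has an LLM call, the rest is conditional."""
    nodes = [NodeType.LLM_CALL]
    if tool_call_count > 0:
        nodes.append(NodeType.TOOL_INVOCATION)
    if budget_checked:
        nodes.append(NodeType.BUDGET_CHECK)
    if stagnation_checked:
        nodes.append(NodeType.STAGNATION_CHECK)
    return tuple(nodes)
```

Returning a tuple keeps the derived field immutable, which suits a record that is written once per turn.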

#1125: Agent-controlled compaction + epistemic marker preservation

Add epistemic marker detection (compaction/epistemic.py) with complexity-adaptive thresholds (COMPLEX/EPIC: 1+, SIMPLE/MEDIUM: 3+)
Rewrite _build_summary() to preserve marker-containing sentences (ref: arXiv:2603.24472 -- removing markers degrades accuracy by up to 63%)
Add CompactContextTool (tools/context/compact_context.py) for agent-initiated compaction via metadata directive
Add CompactionConfig fields: agent_controlled, safety_threshold_percent, preserve_epistemic_markers with dual-threshold behavior
Add force_compaction() that bypasses threshold check
Register compact_context in build_default_tools()
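
A rough sketch of the complexity-adaptive threshold idea, assuming markers are counted with regular expressions. The thresholds (1+ for COMPLEX/EPIC, 3+ for SIMPLE/MEDIUM) come from the list above; the patterns, the lowercase complexity keys, and both function names are illustrative assumptions, not the contents of `compaction/epistemic.py`.

```python
import re

# Hypothetical marker patterns; the real pattern set is not shown in this PR text.
EPISTEMIC_PATTERNS = (
    re.compile(r"\bactually\b", re.IGNORECASE),
    re.compile(r"\bwait\b", re.IGNORECASE),
    re.compile(r"\bperhaps\b", re.IGNORECASE),
    re.compile(r"\bon second thought\b", re.IGNORECASE),
)

# Minimum marker count per task complexity (values from the PR description).
MARKER_THRESHOLDS = {"simple": 3, "medium": 3, "complex": 1, "epic": 1}


def count_markers(text: str) -> int:
    """Count every occurrence of every marker pattern in the text."""
    return sum(len(pattern.findall(text)) for pattern in EPISTEMIC_PATTERNS)


def should_preserve(text: str, complexity: str) -> bool:
    """Preserve marker sentences once the count reaches the complexity threshold."""
    return count_markers(text) >= MARKER_THRESHOLDS[complexity]
```

Under these assumptions, a message with two markers would be preserved for a complex task but not for a simple one.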

#1113: Versioning infrastructure evaluation

Evaluate WorkflowDefinitionVersion migration to generic VersionSnapshot[T] (favorable: content hash dedup, concurrent write safety, schema consistency)
Create follow-up issues: #1131 (migration; refactor: migrate WorkflowDefinitionVersion to generic VersionSnapshot[WorkflowDefinition]), #1132 (config versioning; feat: add versioning to EvaluationConfig, BudgetConfig, CompanyStructure), #1133 (role catalog versioning; feat: add versioning to Role Catalog definitions)

Test plan

15,619 unit tests pass (0 failures)
New tests: 80+ tests across attribution models, attribution factory, epistemic markers, summarizer markers, compact_context tool, CompactionConfig validators, force_compaction
Pre-reviewed by 5 agents, 4 findings addressed (1 integration gap, 3 test coverage gaps)

Review coverage

code-reviewer: code quality, bugs, conventions
type-design-analyzer: model encapsulation, invariant enforcement
conventions-auditor: CLAUDE.md compliance, logging patterns
test-reviewer: coverage gaps, test quality
issue-verifier: acceptance criteria verification for all 3 issues

Closes #1123
Closes #1125
Closes #1113

coderabbitai · 2026-04-08T05:09:55Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 668b3208-c9ee-4c26-8971-6b248d189fc3

📥 Commits

Reviewing files that changed from the base of the PR and between d8a975c and c81954c.

📒 Files selected for processing (1)

scripts/run_affected_tests.py

📜 Recent review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Build Backend
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Python version must be 3.14+ (PEP 649 native lazy annotations).

Do not use from __future__ import annotations - Python 3.14 has PEP 649.

Use except A, B: syntax (no parentheses) - ruff enforces this on Python 3.14.

All public functions must have type hints. Use mypy strict mode.

Use Google style docstrings, required on public classes/functions (enforced by ruff D rules).

Code line length must be 88 characters (enforced by ruff).

Files:

scripts/run_affected_tests.py

🧠 Learnings (7)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to design pages.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: If implementation deviates from the design spec, alert the user and explain why - user decides whether to proceed or update the spec. Do NOT silently diverge.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: Pure data models, enums, and re-exports do NOT need logging.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: Always run pytest with `-n 8` for parallelism when running locally - never run tests sequentially. Use `-n auto` in CI.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: Git commit format: `<type>: <description>` - types: feat, fix, refactor, docs, test, chore, perf, ci. Enforced by commitizen.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: All commits must be GPG/SSH signed on `main` branch via branch protection.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: Use branch naming format: `<type>/<slug>` from main.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: Never use `cd` in Bash commands - the working directory is already set to project root. Use absolute paths or run commands directly. Exception: `bash -c "cd <dir> && <cmd>"` is safe (runs in child process).

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: Never use Bash to write or modify files - use the Write or Edit tools. Do not use `cat >`, `cat << EOF`, `echo >`, `echo >>`, `sed -i`, `python -c "open(...).write(...)"`, or `tee` to create or modify files (read-only inspection uses are fine).

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: After finishing an issue implementation, always create a feature branch (`<type>/<slug>`), commit, and push - do NOT create a PR automatically.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: Do NOT leave work uncommitted on main - branch, commit, push immediately after finishing.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: NEVER create a PR directly with `gh pr create` (blocked by hookify) - ALWAYS use `/pre-pr-review` to create PRs. For trivial/docs-only changes: `/pre-pr-review quick` skips agents but still runs automated checks.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T12:02:13.973Z
Learning: When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and adjacent findings), fix them all. No deferring, no 'out of scope' skipping.

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Parallelism: pytest-xdist via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-04-02T18:48:02.489Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T18:48:02.489Z
Learning: ALWAYS run pytest with `-n auto` for parallel execution with pytest-xdist; never run tests sequentially

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-04-08T11:07:20.359Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-08T11:07:20.359Z
Learning: Always include `-n 8` when running pytest locally; never run tests sequentially. Use `pytest-xdist` for parallelism

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-04-02T08:47:46.313Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Run unit tests with `uv run python -m pytest tests/ -m unit -n auto`; integration tests with `-m integration -n auto`; e2e tests with `-m e2e -n auto`

Applied to files:

scripts/run_affected_tests.py

🔇 Additional comments (2)

scripts/run_affected_tests.py (2)

151-162: Good rationale added for xdist behavior changes.

The updated docstring is clear and gives useful operational context for why this hook intentionally diverges from the default scheduler behavior.

172-174: Explicit xdist overrides are well-placed.

Passing --dist loadscope and --max-worker-restart=0 in the command makes this script’s execution mode deterministic, independent of global pytest defaults.
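
For illustration, a sketch of how such a command could be assembled. `--dist loadscope` and `--max-worker-restart` are pytest-xdist options, and the `uv run python -m pytest` prefix and `-n 8` follow the learnings quoted above; the function name and signature are assumptions and do not reproduce `scripts/run_affected_tests.py`.

```python
def build_pytest_command(test_paths: list[str], workers: str = "8") -> list[str]:
    """Build a pytest invocation with explicit xdist scheduling overrides."""
    return [
        "uv", "run", "python", "-m", "pytest",
        *test_paths,
        "-n", workers,
        # Group tests by module/class so scoped fixtures stay on one worker.
        "--dist", "loadscope",
        # Do not respawn crashed workers; surface the failure instead.
        "--max-worker-restart=0",
    ]
```

Passing the flags on the command line means they take effect regardless of what `addopts` in the project configuration says.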

Walkthrough

This PR introduces per-agent attribution (new AgentContribution, CoordinationResultWithAttribution) and changes coordinators to return the attribution wrapper; coordinator builds contributions and PerformanceTracker can record them. Compaction gains epistemic-marker detection/extraction, new CompactionConfig fields for agent-controlled compaction, a force_compaction API, and marker-preserving summary logic. A new CompactContextTool was added and registered in the default tool set. Execution traces now include a NodeType enum and TurnRecord.node_types. The coordination API response shape adds a currency field and tests were updated accordingly.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: engine intelligence improvements covering trace enrichment, agent-controlled compaction with epistemic markers, and versioning evaluation.
Description check	✅ Passed	The description is directly related to the changeset, covering all three major objectives (`#1123`, `#1125`, `#1113`) with implementation details and test results.
Linked Issues check	✅ Passed	The code changes comprehensively address all three linked issues: `#1123` implements execution trace enrichment with NodeType enum and CoordinationResultWithAttribution, `#1125` adds epistemic marker preservation and agent-controlled compaction, `#1113` evaluates versioning infrastructure with favorable assessment.
Out of Scope Changes check	✅ Passed	All code changes directly support the three stated objectives. Minor changes to pytest configuration and test count updates are supporting infrastructure for the core features.
Docstring Coverage	✅ Passed	Docstring coverage is 90.32% which is sufficient. The required threshold is 40.00%.

github-actions · 2026-04-08T05:11:54Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA c81954c.

Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice.

Scanned Files

None

gemini-code-assist

Code Review

This pull request introduces structural credit assignment for multi-agent coordination and enhances context compaction with agent-controlled triggers and reasoning preservation. Key additions include a per-agent contribution scoring system, failure attribution models, and a compact_context tool. The compaction logic now detects "epistemic markers" (e.g., hedging, reconsideration) to maintain reasoning integrity during summarization. Review feedback identifies a logic error in sentence extraction that could lead to empty results for long sentences, redundant length checks in the epistemic detection module, and opportunities to reduce code duplication by refactoring the compaction functions.

gemini-code-assist · 2026-04-08T05:17:05Z

src/synthorg/engine/compaction/epistemic.py

+            if total_len + len(stripped) + 2 > max_chars:
+                break


There's a logic issue here: if the first sentence containing a marker is longer than max_chars, the break on line 123 is executed, and the function will incorrectly return an empty string. The expected behavior should be to return a truncated version of that sentence.
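
One possible shape for the fix, as a sketch only. The marker pattern, sentence splitter, and default `max_chars` are assumptions; only the variable names `stripped`, `total_len`, and `max_chars` and the `"; "` separator mirror the snippet and review comments in this thread.

```python
import re

# Hypothetical marker pattern and sentence splitter, for illustration only.
_MARKER = re.compile(r"\b(actually|wait|perhaps|however)\b", re.IGNORECASE)
_SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def extract_marker_sentences(text: str, max_chars: int = 200) -> str:
    """Collect marker-bearing sentences, joined by "; ", within a char budget."""
    kept: list[str] = []
    total_len = 0
    for sentence in _SENTENCE_SPLIT.split(text):
        stripped = sentence.strip()
        if not stripped or not _MARKER.search(stripped):
            continue
        # Count the "; " separator only when something was already kept.
        sep_len = 2 if kept else 0
        if total_len + sep_len + len(stripped) > max_chars:
            if not kept:
                # The first marker sentence alone exceeds the budget:
                # return a truncated version instead of an empty string.
                return stripped[:max_chars] + "..."
            break
        kept.append(stripped)
        total_len += sep_len + len(stripped)
    return "; ".join(kept)
```

With the truncation handled inside the loop, a trailing `len(joined) > max_chars` check is no longer needed. Note that the truncated result is three characters over budget because of the ellipsis, matching the convention in the original snippet.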

gemini-code-assist · 2026-04-08T05:17:05Z

src/synthorg/api/controllers/coordination.py

        ws_event_type = (
            WsEventType.COORDINATION_COMPLETED
-            if result.is_success
+            if attributed.is_success
            else WsEventType.COORDINATION_FAILED
        )
        _publish_ws_event(
            request,
            ws_event_type,
            {
                "task_id": task_id,
-                "topology": result.topology.value,
-                "is_success": result.is_success,
-                "total_duration_seconds": result.total_duration_seconds,
+                "topology": attributed.result.topology.value,
+                "is_success": attributed.is_success,
+                "total_duration_seconds": attributed.result.total_duration_seconds,
            },
        )
        log_event = (
-            API_COORDINATION_COMPLETED if result.is_success else API_COORDINATION_FAILED
+            API_COORDINATION_COMPLETED
+            if attributed.is_success
+            else API_COORDINATION_FAILED
        )
-        log_fn = logger.info if result.is_success else logger.warning
+        log_fn = logger.info if attributed.is_success else logger.warning
        log_fn(
            log_event,
            task_id=task_id,
-            topology=result.topology.value,
-            is_success=result.is_success,
-            total_duration_seconds=result.total_duration_seconds,
+            topology=attributed.result.topology.value,
+            is_success=attributed.is_success,
+            total_duration_seconds=attributed.result.total_duration_seconds,
        )


To improve readability and reduce repetition, you could extract attributed.result and attributed.is_success into local variables at the beginning of this block. This would make the subsequent logic for WebSocket events and logging cleaner and easier to follow.

        result = attributed.result
        is_success = attributed.is_success
        ws_event_type = (
            WsEventType.COORDINATION_COMPLETED
            if is_success
            else WsEventType.COORDINATION_FAILED
        )
        _publish_ws_event(
            request,
            ws_event_type,
            {
                "task_id": task_id,
                "topology": result.topology.value,
                "is_success": is_success,
                "total_duration_seconds": result.total_duration_seconds,
            },
        )
        log_event = (
            API_COORDINATION_COMPLETED if is_success else API_COORDINATION_FAILED
        )
        log_fn = logger.info if is_success else logger.warning
        log_fn(
            log_event,
            task_id=task_id,
            topology=result.topology.value,
            is_success=is_success,
            total_duration_seconds=result.total_duration_seconds,
        )

gemini-code-assist · 2026-04-08T05:17:06Z

src/synthorg/engine/compaction/epistemic.py

+    if len(joined) > max_chars:
+        return joined[:max_chars] + "..."


This length check is unreachable. The loop logic on lines 122-125 ensures that the total length of the collected sentences (including separators) does not exceed max_chars. Therefore, len(joined) will never be greater than max_chars, making this check redundant.

gemini-code-assist · 2026-04-08T05:17:06Z

src/synthorg/engine/compaction/summarizer.py

+def force_compaction(
+    ctx: AgentContext,
+    config: CompactionConfig,
+    estimator: PromptTokenEstimator,
+) -> AgentContext | None:
+    """Compact context without checking the fill threshold.
+
+    Used when an agent explicitly requests compaction via the
+    ``compact_context`` tool.  Skips the threshold check but
+    still enforces minimum message count and recent turn
+    preservation.
+
+    Args:
+        ctx: Current agent context.
+        config: Compaction configuration.
+        estimator: Token estimator.
+
+    Returns:
+        Compacted context, or ``None`` if too few messages.
+    """
+    conversation = ctx.conversation
+    if len(conversation) < config.min_messages_to_compact:
+        logger.debug(
+            CONTEXT_BUDGET_COMPACTION_SKIPPED,
+            execution_id=ctx.execution_id,
+            reason="too_few_messages_for_forced_compaction",
+            message_count=len(conversation),
+        )
+        return None
+
+    logger.info(
+        CONTEXT_BUDGET_COMPACTION_STARTED,
+        execution_id=ctx.execution_id,
+        fill_percent=ctx.context_fill_percent,
+        message_count=len(conversation),
+        forced=True,
+    )
+
+    split = _split_conversation(ctx, config)
+    if split is None:
+        return None
+    head, archivable, recent = split
+
+    task_complexity = _extract_task_complexity(ctx)
+    compressed, metadata, summary_tokens = _compress(
+        ctx,
+        head,
+        archivable,
+        recent,
+        estimator,
+        preserve_markers=config.preserve_epistemic_markers,
+        task_complexity=task_complexity,
+    )
+
+    new_fill = estimator.estimate_conversation_tokens(compressed)
+    logger.info(
+        CONTEXT_BUDGET_COMPACTION_COMPLETED,
+        execution_id=ctx.execution_id,
+        original_messages=len(conversation),
+        compacted_messages=len(compressed),
+        archived_turns=metadata.archived_turns,
+        summary_tokens=summary_tokens,
+        compactions_total=metadata.compactions_performed,
+        forced=True,
+    )
+    return ctx.with_compression(metadata, compressed, new_fill)


The force_compaction function duplicates a significant amount of logic from _do_compaction. The only substantial difference is that force_compaction bypasses the initial fill threshold check.

To improve maintainability and reduce code duplication, you could refactor this by adding a force: bool = False parameter to _do_compaction and conditionally skipping the threshold check. force_compaction would then become a simple wrapper around _do_compaction(..., force=True).

Copilot

Pull request overview

This PR upgrades the engine “intelligence v2” pipeline by enriching execution traces (node typing + coordination attribution) and improving context compaction (agent-initiated compaction directives + epistemic marker preservation), with supporting observability events and tests.

Changes:

Add per-turn NodeType tagging on TurnRecord, with auto-derivation in make_turn_record().
Introduce coordination attribution models + CoordinationResultWithAttribution, wire attribution building into the coordinator, and record contributions in PerformanceTracker.
Add epistemic marker detection + summarizer changes, plus a new compact_context tool for agent-initiated compaction; update tool factory and tests.

Reviewed changes

Copilot reviewed 28 out of 29 changed files in this pull request and generated 5 comments.

File	Description
tests/unit/tools/test_factory.py	Updates expected default tool list to include `compact_context`.
tests/unit/tools/test_factory_sandbox_wiring.py	Adjusts expected tool count (+1) for sandbox wiring tests.
tests/unit/tools/test_factory_new_categories.py	Updates default tool count expectation (15 tools).
tests/unit/tools/context/test_compact_context.py	Adds unit tests for `CompactContextTool` schema + behavior.
tests/unit/tools/context/__init__.py	Marks the context tools test package.
tests/unit/engine/test_coordination_service.py	Updates tests to handle attributed coordination results.
tests/unit/engine/coordination/test_attribution.py	Adds unit tests for attribution wrapper/models.
tests/unit/engine/coordination/test_attribution_factory.py	Adds unit tests for `build_agent_contributions()`.
tests/unit/engine/compaction/test_summarizer_markers.py	Adds tests for epistemic marker preservation + `force_compaction()`.
tests/unit/engine/compaction/test_models.py	Adds tests for new `CompactionConfig` fields/validation.
tests/unit/engine/compaction/test_epistemic.py	Adds tests for epistemic marker detection helpers.
tests/unit/api/controllers/test_coordination.py	Updates controller tests to use attributed coordination results.
src/synthorg/tools/factory.py	Registers `CompactContextTool` in `build_default_tools()`.
src/synthorg/tools/context/compact_context.py	Implements `compact_context` tool that emits a compaction directive via metadata.
src/synthorg/tools/context/__init__.py	Adds context tool package marker.
src/synthorg/observability/events/coordination.py	Adds coordination attribution event constant.
src/synthorg/observability/events/context_budget.py	Adds events for agent compaction request + epistemic marker preservation.
src/synthorg/hr/performance/tracker.py	Adds storage/logging for per-agent coordination contributions.
src/synthorg/engine/loop_protocol.py	Adds `NodeType` enum and `TurnRecord.node_types`.
src/synthorg/engine/loop_helpers.py	Derives and records node types in `make_turn_record()`.
src/synthorg/engine/coordination/service.py	Returns attributed coordination results and records contributions in tracker.
src/synthorg/engine/coordination/attribution.py	Introduces attribution models + contribution factory/scoring logic.
src/synthorg/engine/coordination/__init__.py	Re-exports new attribution types/factory.
src/synthorg/engine/compaction/summarizer.py	Adds epistemic marker preservation + `force_compaction()` and agent-controlled threshold behavior.
src/synthorg/engine/compaction/models.py	Extends `CompactionConfig` with agent-controlled fields + validators.
src/synthorg/engine/compaction/epistemic.py	Adds marker detection + extraction utilities.
src/synthorg/engine/agent_engine.py	Updates coordination return type to `CoordinationResultWithAttribution`.
src/synthorg/engine/__init__.py	Re-exports new coordination attribution types.
src/synthorg/api/controllers/coordination.py	Adjusts controller flow for attributed results while preserving API response shape.

Copilot · 2026-04-08T05:17:12Z

src/synthorg/engine/compaction/summarizer.py

+        ):
+            marker_text = extract_marker_sentences(cleaned)
+            if marker_text:
+                snippets.append(marker_text)


In the epistemic-marker preservation path, marker_text is appended to the summary without running through sanitize_message. That reintroduces URLs/paths/prompt-injection fragments into the SYSTEM compaction summary (the earlier implementation explicitly sanitized archived assistant content). Consider sanitizing the extracted marker sentences (while keeping the marker words) before adding them to snippets (and/or sanitizing inside extract_marker_sentences).

Suggested change

-                snippets.append(marker_text)
+                sanitized_marker_text = sanitize_message(
+                    marker_text,
+                    max_length=max(len(marker_text), 1),
+                )
+                snippets.append(sanitized_marker_text)

Copilot · 2026-04-08T05:17:12Z

src/synthorg/tools/context/compact_context.py

+            CONTEXT_BUDGET_AGENT_COMPACTION_REQUESTED,
+            strategy=strategy,
+            preserve_markers=preserve_markers,
+            reason=reason,


CompactContextTool.execute() logs the free-form reason string verbatim. Since reason is LLM/user-authored text, it can contain URLs/paths or other sensitive fragments that won't be redacted by the structured-log key sanitizer. Consider logging a sanitized/redacted form (e.g., sanitize_message(reason, ...)) and/or omitting it from logs while still returning it in tool metadata if needed.

Suggested change

reason=reason,
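
If the reason is worth keeping in logs, one option is to redact it first. The sketch below is an assumption-heavy illustration: `redact_for_log`, its patterns, and the placeholder strings are hypothetical and do not represent the project's `sanitize_message`.

```python
import re

# Hypothetical patterns: http(s) URLs and multi-segment POSIX/Windows paths.
_URL = re.compile(r"https?://\S+")
_PATH = re.compile(r"(?:[A-Za-z]:\\|/)[\w.\-]+(?:[\\/][\w.\-]+)+")


def redact_for_log(text: str, max_length: int = 120) -> str:
    """Replace URLs and paths with placeholders, then cap the length."""
    redacted = _URL.sub("[REDACTED_URL]", text)
    redacted = _PATH.sub("[REDACTED_PATH]", redacted)
    return redacted[:max_length]
```

The tool could then log `reason=redact_for_log(reason)` while still returning the original text in its metadata.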

Copilot · 2026-04-08T05:17:13Z

src/synthorg/engine/compaction/models.py

+    When ``agent_controlled`` is ``True``, automatic compaction uses
+    ``safety_threshold_percent`` instead of ``fill_threshold_percent``,
+    allowing agents to manage compaction via the ``compact_context``
+    tool while retaining a safety net.
+
    Attributes:
        fill_threshold_percent: Context fill percentage that triggers
            compaction (e.g. 80.0 means compact when 80% full).


The CompactionConfig docs still describe fill_threshold_percent as the threshold that triggers compaction, but the implementation switches automatic compaction to use safety_threshold_percent when agent_controlled=True. Consider clarifying in the docstring (and/or field descriptions) what each threshold means in agent-controlled mode to avoid config misuse/confusion.
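
To make the dual-threshold behavior concrete, here is a small sketch. The field names come from the docstring above; the default values, the dataclass stand-in, and the two helper functions are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionConfig:
    """Simplified stand-in; the defaults here are assumed, not the project's."""

    fill_threshold_percent: float = 80.0
    agent_controlled: bool = False
    safety_threshold_percent: float = 95.0


def effective_threshold(config: CompactionConfig) -> float:
    """Automatic compaction uses the safety net when agents are in control."""
    if config.agent_controlled:
        return config.safety_threshold_percent
    return config.fill_threshold_percent


def should_auto_compact(fill_percent: float, config: CompactionConfig) -> bool:
    """True when the context fill reaches the threshold that currently applies."""
    return fill_percent >= effective_threshold(config)
```

With these assumed defaults, a context that is 85% full compacts automatically in the default mode but waits for the agent (or the 95% safety net) in agent-controlled mode.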

Copilot · 2026-04-08T05:17:13Z

tests/unit/engine/compaction/test_summarizer_markers.py

+        # Result should be an AgentContext (not None)
+        assert result is None or isinstance(result, AgentContext)


test_force_compaction_bypasses_threshold ends with assert result is None or isinstance(result, AgentContext), which will always pass and doesn't verify the intended behavior. Consider asserting a concrete outcome (e.g., result is not None, result.compression_metadata is set, or message count decreases) so this test will fail if force_compaction() stops compacting below the threshold.

Suggested change

-        # Result should be an AgentContext (not None)
-        assert result is None or isinstance(result, AgentContext)
+        # Forced compaction should still produce a compacted context
+        assert result is not None
+        assert isinstance(result, AgentContext)

Copilot · 2026-04-08T05:17:13Z

src/synthorg/engine/coordination/attribution.py

+        description="Average contribution score",
+    )
+    @property
+    def total_contribution_score(self) -> float:
+        """Average of contribution scores, 0.0 when empty."""
+        if not self.agent_contributions:
+            return 0.0
+        total = sum(c.contribution_score for c in self.agent_contributions)
+        return total / len(self.agent_contributions)


total_contribution_score returns the average of contribution scores (as described in the docstring/computed_field), not a total. This naming mismatch can confuse consumers (especially when also logging avg_score). Consider renaming the property to something like avg_contribution_score (or changing the computation to a true total) to match semantics.

Suggested change

-        description="Average contribution score",
-    )
-    @property
-    def total_contribution_score(self) -> float:
-        """Average of contribution scores, 0.0 when empty."""
-        if not self.agent_contributions:
-            return 0.0
-        total = sum(c.contribution_score for c in self.agent_contributions)
-        return total / len(self.agent_contributions)
+        description="Total contribution score",
+    )
+    @property
+    def total_contribution_score(self) -> float:
+        """Sum of contribution scores, 0.0 when empty."""
+        if not self.agent_contributions:
+            return 0.0
+        return sum(c.contribution_score for c in self.agent_contributions)
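
An alternative to choosing one meaning is to expose both under names that match their semantics. The sketch below uses plain dataclasses as stand-ins for the real models; `avg_contribution_score` is the name floated in the comment above, and the `agent_id` field is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentContribution:
    """Stand-in model with only the fields needed for the example."""

    agent_id: str
    contribution_score: float


@dataclass(frozen=True)
class CoordinationResultWithAttribution:
    """Stand-in wrapper exposing a true total and a separate average."""

    agent_contributions: tuple[AgentContribution, ...] = ()

    @property
    def total_contribution_score(self) -> float:
        """Sum of contribution scores, 0.0 when empty."""
        return sum((c.contribution_score for c in self.agent_contributions), 0.0)

    @property
    def avg_contribution_score(self) -> float:
        """Mean contribution score, 0.0 when empty."""
        if not self.agent_contributions:
            return 0.0
        return self.total_contribution_score / len(self.agent_contributions)
```

Either rename avoids a field called "total" that silently reports a mean.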

codecov · 2026-04-08T05:20:46Z

Codecov Report

❌ Patch coverage is 89.06250% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.96%. Comparing base (fe82894) to head (c81954c).
⚠️ Report is 1 commit behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/synthorg/engine/coordination/service.py	35.29%	10 Missing and 1 partial ⚠️
src/synthorg/engine/compaction/summarizer.py	66.66%	5 Missing and 1 partial ⚠️
src/synthorg/engine/coordination/attribution.py	93.75%	1 Missing and 3 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1139      +/-   ##
==========================================
- Coverage   88.97%   88.96%   -0.02%     
==========================================
  Files         845      848       +3     
  Lines       49228    49388     +160     
  Branches     4944     4960      +16     
==========================================
+ Hits        43803    43939     +136     
- Misses       4501     4520      +19     
- Partials      924      929       +5

coderabbitai

Actionable comments posted: 11

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/engine/compaction/epistemic.py`:
- Around line 117-125: The loop that collects epistemic sentences over-counts
the "; " separator for the first appended sentence (in the block using
EPISTEMIC_PATTERNS and variables sentences, marker_sentences, total_len,
max_chars), causing premature breaks; fix by accounting for the separator only
when marker_sentences is non-empty (i.e., compute sep_len = 2 if
marker_sentences else 0) and use sep_len both in the pre-check (total_len +
sep_len + len(stripped) > max_chars) and when updating total_len (total_len +=
sep_len + len(stripped)).
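
A minimal sketch of the separator accounting described above. The pattern list and the function name `collect_marker_sentences` are hypothetical stand-ins, not the project's actual `EPISTEMIC_PATTERNS` or API; only the `sep_len` logic mirrors the suggested fix.

```python
import re

# Hypothetical stand-in for the project's EPISTEMIC_PATTERNS.
EPISTEMIC_PATTERNS = (re.compile(r"\b(might|perhaps|uncertain|wait)\b", re.I),)


def collect_marker_sentences(sentences: list[str], max_chars: int) -> str:
    """Join marker-bearing sentences with '; ' without exceeding max_chars."""
    marker_sentences: list[str] = []
    total_len = 0
    for sentence in sentences:
        stripped = sentence.strip()
        if not stripped:
            continue
        if not any(p.search(stripped) for p in EPISTEMIC_PATTERNS):
            continue
        # A separator only exists between sentences, so the first one costs 0.
        sep_len = 2 if marker_sentences else 0
        if total_len + sep_len + len(stripped) > max_chars:
            break
        marker_sentences.append(stripped)
        total_len += sep_len + len(stripped)
    return "; ".join(marker_sentences)
```

With this accounting, `total_len` always equals the length of the joined string, so a budget that exactly fits the joined output is no longer rejected.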

In `@src/synthorg/engine/compaction/summarizer.py`:
- Around line 305-312: The returned summary string in the block that checks
preserved_count uses "from {preserved_count} messages" which is incorrect when
preserved_count == 1; update the formatting in the function containing this
block (look for preserved_count and messages in summarizer.py) to choose
"message" vs "messages" based on preserved_count (e.g., conditional
pluralization or a small helper like pluralize_count) so the text reads "from 1
message" when preserved_count == 1 and keeps "messages" otherwise.
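
A small sketch of the conditional pluralization suggested above. The function name `preserved_summary` is hypothetical, and the exact wording of the project's summary string is assumed from the text quoted in this review.

```python
def preserved_summary(preserved_count: int) -> str:
    """Build the status text with a count-aware noun."""
    noun = "message" if preserved_count == 1 else "messages"
    return f"Epistemic markers preserved from {preserved_count} {noun}"
```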

In `@src/synthorg/engine/coordination/attribution.py`:
- Around line 234-264: Introduce lightweight typing Protocols under a
TYPE_CHECKING block that document the expected shape of outcome_result:
AgentRunResultLike(Protocol) with attributes is_success: bool,
termination_reason: TerminationReason | None, and execution_result:
ExecutionResultLike | None, plus ExecutionResultLike(Protocol) with
error_message: str | None. Then change the type annotation of outcome_result
from object | None to AgentRunResultLike | None and update any local uses
accordingly (getattr guards can stay if desired). This avoids circular imports
while giving outcome_result static type safety, and leaves the interaction with
_TERMINATION_TO_ATTRIBUTION and AgentContribution unchanged.
- Around line 49-56: The _TERMINATION_TO_ATTRIBUTION mapping is missing
TerminationReason.PARKED, causing PARKED outcomes to fall back to "direct"
attribution; add an entry TerminationReason.PARKED: "coordination_overhead" to
the _TERMINATION_TO_ATTRIBUTION dict so PARKED (an approval-gate/system
suspension) is attributed as coordination_overhead rather than direct, keeping
its behavior consistent with SHUTDOWN, BUDGET_EXHAUSTED, and MAX_TURNS; ensure
you use the FailureAttribution string value "coordination_overhead" and update
the dict where _TERMINATION_TO_ATTRIBUTION is defined.

In `@src/synthorg/engine/coordination/service.py`:
- Around line 217-220: The call to
self._performance_tracker.record_coordination_contributions(contributions) can
raise and currently runs after COORDINATION_COMPLETED is logged, which can
incorrectly mark a finished coordination as failed; update the block in the
coordination completion path (around COORDINATION_COMPLETED logging) to wrap the
record_coordination_contributions call in a try/except that catches all
exceptions, logs them at WARNING or ERROR with context (including contributions
and that this occurred post-completion), and only re-raises MemoryError and
RecursionError; alternatively move the COORDINATION_COMPLETED log to after the
safe/guarded tracker call if tracker failure should be fatal.
- Around line 211-214: The current use of build_agent_contributions with
routing_result and dispatch_result.waves collapses routing into a single
agent_id->subtask_id mapping which allows later subtasks to overwrite earlier
ones when one agent executes multiple subtasks; update build_agent_contributions
(or the code that constructs its input) to use a composite key that includes the
executed outcome or task identifier (e.g., use (agent_id, executed_outcome_id)
or (agent_id, subtask_execution_id)) so each executed subtask is mapped
uniquely, ensure any persistence/returned tuple uses that composite mapping, and
add a unit/regression test that simulates repeated-agent routing to assert that
multiple subtasks from the same agent retain distinct attribution rather than
being overwritten.
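
A sketch of both service.py findings under stated assumptions. The routing pairs, the `record_after_completion` helper, and the `warn` callback are hypothetical; the real code would log through the project's structured logger, and the project's own convention is the unparenthesized PEP 758 `except` form, which this sketch avoids only so it also runs on older interpreters.

```python
from collections.abc import Callable

# Hypothetical routing decisions as (agent_id, subtask_id) pairs.
routing = [
    ("agent-a", "subtask-1"),
    ("agent-a", "subtask-2"),
    ("agent-b", "subtask-3"),
]

# Keyed by agent_id alone: the second agent-a entry overwrites the first.
collapsed = dict(routing)

# Keyed by (agent_id, subtask_id): every executed subtask keeps its own entry.
composite = {(agent_id, subtask_id): subtask_id for agent_id, subtask_id in routing}


def record_after_completion(
    record: Callable[[list[str]], None],
    contributions: list[str],
    warn: Callable[[str], None],
) -> bool:
    """Run a tracker call that must not fail an already finished coordination."""
    try:
        record(contributions)
    except (MemoryError, RecursionError):
        # System errors always propagate.
        raise
    except Exception as exc:
        warn(f"post-completion tracker failure: {exc!r}")
        return False
    return True


def _failing_tracker(_: list[str]) -> None:
    raise RuntimeError("tracker unavailable")


warnings: list[str] = []
recorded = record_after_completion(_failing_tracker, ["c1"], warnings.append)
```

Here `collapsed` keeps only two entries while `composite` keeps all three, which is the overwrite the regression test should guard against.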

In `@src/synthorg/tools/context/compact_context.py`:
- Around line 99-104: The INFO log and metadata currently emit the raw free-form
reason (see the logger.info call using CONTEXT_BUDGET_AGENT_COMPACTION_REQUESTED
and the subsequent metadata forwarding around lines 99–113); replace that with a
bounded/sanitized value by introducing a sanitizer (e.g., sanitize_reason() or
reason_code) that either maps free-text to a fixed set of reason codes or
redacts/truncates the input before logging and before any metadata propagation,
and use that sanitized value in the logger.info call and wherever the original
reason is forwarded (keep existing parameters strategy and preserve_markers
unchanged).
- Around line 21-52: The _COMPACT_CONTEXT_SCHEMA dict is mutable; make it
immutable by building it via copy.deepcopy(...) and then wrapping the result
with types.MappingProxyType so the module-level symbol _COMPACT_CONTEXT_SCHEMA
becomes a read-only mapping; import copy and MappingProxyType where needed and
replace the current literal assignment with a deep-copied value wrapped by
MappingProxyType to enforce immutability.

In `@tests/unit/engine/compaction/test_epistemic.py`:
- Around line 183-189: The assertion in test_strips_whitespace_from_sentences is
unclear; update the assertions after calling extract_marker_sentences(text)
(variable result) to explicitly check that the string does not start or end with
a space and contains no double spaces: replace the current confusing assert
involving count(" ") with three clearer asserts such as ensuring not
result.startswith(" "), not result.endswith(" "), and "  " not in result to
verify whitespace is stripped.

In `@tests/unit/engine/compaction/test_summarizer_markers.py`:
- Around line 244-258: The test failure is due to a grammar mismatch ("from 1
messages"); update the summarizer code that builds the status text (the logic
invoked by _build_summary that produces "Epistemic markers preserved from {n}
messages") to choose "message" when the preserved count equals 1 and "messages"
otherwise (i.e., pluralize based on count). Locate the string generation in the
summarizer/marker-preservation code path (the function invoked by
_build_summary) and replace the hardcoded plural with a conditional
pluralization so the output reads "from 1 message" for a single preserved marker
and "from N messages" for other counts.
- Around line 315-343: The test test_force_compaction_bypasses_threshold uses a
weak assertion. Capture the original message count before calling
force_compaction, call force_compaction(ctx, config, estimator), assert that the
result is an AgentContext (isinstance(result, AgentContext)), and assert that
compaction actually occurred: len(result.messages) < original_count, or at
minimum len(result.messages) <= original_count, with the strict < wherever
min_messages_to_compact should trigger removal. Reference: force_compaction,
CompactionConfig (min_messages_to_compact), and the test function name to locate
and update the assertion.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6731cd2d-3ed5-4005-9647-a690f4095266

📥 Commits

Reviewing files that changed from the base of the PR and between bd0929e and 2123b09.

📒 Files selected for processing (29)

src/synthorg/api/controllers/coordination.py
src/synthorg/engine/__init__.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/models.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/coordination/attribution.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/loop_helpers.py
src/synthorg/engine/loop_protocol.py
src/synthorg/hr/performance/tracker.py
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
src/synthorg/tools/context/__init__.py
src/synthorg/tools/context/compact_context.py
src/synthorg/tools/factory.py
tests/unit/api/controllers/test_coordination.py
tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/compaction/test_models.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/coordination/test_attribution_factory.py
tests/unit/engine/test_coordination_service.py
tests/unit/tools/context/__init__.py
tests/unit/tools/context/test_compact_context.py
tests/unit/tools/test_factory.py
tests/unit/tools/test_factory_new_categories.py
tests/unit/tools/test_factory_sandbox_wiring.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)

GitHub Check: Agent
GitHub Check: Build Backend
GitHub Check: Build Web
GitHub Check: Build Sandbox
GitHub Check: Test (Python 3.14)
GitHub Check: Dependency Review
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (6)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations
Use except A, B: syntax without parentheses (PEP 758 except syntax)—ruff enforces this on Python 3.14
All public functions require type hints; mypy strict mode is enforced
Google-style docstrings are required on public classes and functions; enforced by ruff D rules
Create new objects for immutability, never mutate existing ones; for non-Pydantic internal collections use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement
Line length must be 88 characters (enforced by ruff)

Files:

src/synthorg/tools/context/__init__.py
tests/unit/tools/test_factory_sandbox_wiring.py
tests/unit/tools/test_factory.py
tests/unit/tools/test_factory_new_categories.py
tests/unit/api/controllers/test_coordination.py
src/synthorg/tools/factory.py
src/synthorg/engine/__init__.py
src/synthorg/observability/events/context_budget.py
tests/unit/engine/compaction/test_models.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/loop_helpers.py
src/synthorg/observability/events/coordination.py
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/loop_protocol.py
src/synthorg/engine/coordination/service.py
tests/unit/engine/compaction/test_epistemic.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/compaction/test_summarizer_markers.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/tools/context/compact_context.py
tests/unit/tools/context/test_compact_context.py
tests/unit/engine/coordination/test_attribution_factory.py
src/synthorg/engine/compaction/models.py
tests/unit/engine/test_coordination_service.py
src/synthorg/engine/coordination/attribution.py

src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields—including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants—instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Functions must be fewer than 50 lines; files must be fewer than 800 lines
Handle errors explicitly, never silently swallow them
Validate at system boundaries (user input, external APIs, config files)
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Variable name must always be logger (not _logger, not log)
Use event name constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool, etc.); import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Always use structured logging: logger.info(EVENT, key=value)—never logger.info("msg %s", val)
All error paths must log at WARNING ...

Files:

src/synthorg/tools/context/__init__.py
src/synthorg/tools/factory.py
src/synthorg/engine/__init__.py
src/synthorg/observability/events/context_budget.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/loop_helpers.py
src/synthorg/observability/events/coordination.py
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/loop_protocol.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/models.py
src/synthorg/engine/coordination/attribution.py

!(src/synthorg/observability/{setup,sinks,syslog_handler,http_handler}.py)**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging)

Files:

src/synthorg/tools/context/__init__.py
tests/unit/tools/test_factory_sandbox_wiring.py
tests/unit/tools/test_factory.py
tests/unit/tools/test_factory_new_categories.py
tests/unit/api/controllers/test_coordination.py
src/synthorg/tools/factory.py
src/synthorg/engine/__init__.py
src/synthorg/observability/events/context_budget.py
tests/unit/engine/compaction/test_models.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/loop_helpers.py
src/synthorg/observability/events/coordination.py
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/loop_protocol.py
src/synthorg/engine/coordination/service.py
tests/unit/engine/compaction/test_epistemic.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/compaction/test_summarizer_markers.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/tools/context/compact_context.py
tests/unit/tools/context/test_compact_context.py
tests/unit/engine/coordination/test_attribution_factory.py
src/synthorg/engine/compaction/models.py
tests/unit/engine/test_coordination_service.py
src/synthorg/engine/coordination/attribution.py

!(docs/design/operations.md|.claude/**/*|src/synthorg/providers/presets.py)**/*.{py,md}

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Vendor names may only appear in: (1) Operations design page, (2) .claude/ files, (3) third-party imports, (4) src/synthorg/providers/presets.py which is user-facing runtime data. Tests must use test-provider, test-small-001, etc.

Files:

src/synthorg/tools/context/__init__.py
tests/unit/tools/test_factory_sandbox_wiring.py
tests/unit/tools/test_factory.py
tests/unit/tools/test_factory_new_categories.py
tests/unit/api/controllers/test_coordination.py
src/synthorg/tools/factory.py
src/synthorg/engine/__init__.py
src/synthorg/observability/events/context_budget.py
tests/unit/engine/compaction/test_models.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/loop_helpers.py
src/synthorg/observability/events/coordination.py
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/loop_protocol.py
src/synthorg/engine/coordination/service.py
tests/unit/engine/compaction/test_epistemic.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/compaction/test_summarizer_markers.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/tools/context/compact_context.py
tests/unit/tools/context/test_compact_context.py
tests/unit/engine/coordination/test_attribution_factory.py
src/synthorg/engine/compaction/models.py
tests/unit/engine/test_coordination_service.py
src/synthorg/engine/coordination/attribution.py

src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

src/synthorg/tools/context/__init__.py
src/synthorg/tools/factory.py
src/synthorg/engine/__init__.py
src/synthorg/observability/events/context_budget.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/loop_helpers.py
src/synthorg/observability/events/coordination.py
src/synthorg/hr/performance/tracker.py
src/synthorg/engine/loop_protocol.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/models.py
src/synthorg/engine/coordination/attribution.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow for test classification
30-second timeout per test (global in pyproject.toml); do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed
Prefer @pytest.mark.parametrize for testing similar cases
Use Hypothesis for property-based testing with @given + @settings decorators; Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Controlled via HYPOTHESIS_PROFILE env var.
NEVER skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally; for timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins; for tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number)
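
A minimal sketch of the "block until cancelled" pattern named in the last guideline. The function names are hypothetical; the point is that `asyncio.Event().wait()` on an event nobody sets never completes on its own, so the test controls exactly when the task ends.

```python
import asyncio


async def block_until_cancelled() -> None:
    """Park forever on an event that is never set; only cancellation ends it."""
    await asyncio.Event().wait()


async def demo() -> bool:
    task = asyncio.create_task(block_until_cancelled())
    await asyncio.sleep(0)  # yield once so the task starts and parks
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        return True
    return False


was_cancelled = asyncio.run(demo())
```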

Files:

tests/unit/tools/test_factory_sandbox_wiring.py
tests/unit/tools/test_factory.py
tests/unit/tools/test_factory_new_categories.py
tests/unit/api/controllers/test_coordination.py
tests/unit/engine/compaction/test_models.py
tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/tools/context/test_compact_context.py
tests/unit/engine/coordination/test_attribution_factory.py
tests/unit/engine/test_coordination_service.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/unit/tools/test_factory_sandbox_wiring.py
tests/unit/tools/test_factory.py
tests/unit/tools/test_factory_new_categories.py
tests/unit/api/controllers/test_coordination.py
tests/unit/engine/compaction/test_models.py
tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/tools/context/test_compact_context.py
tests/unit/engine/coordination/test_attribution_factory.py
tests/unit/engine/test_coordination_service.py

🧠 Learnings (42)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

📚 Learning: 2026-04-02T07:18:02.381Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to {pyproject.toml,src/synthorg/__init__.py} : Update version in `pyproject.toml` (`[tool.commitizen].version`) and `src/synthorg/__init__.py` (`__version__`)

Applied to files:

src/synthorg/tools/context/__init__.py
src/synthorg/engine/__init__.py

📚 Learning: 2026-03-17T06:30:14.180Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/engine/coordination/**/*.py : Task coordination uses multi-agent pipeline with 4 dispatchers (SAS/centralized/decentralized/context-dependent), wave execution, and workspace lifecycle integration.

Applied to files:

tests/unit/api/controllers/test_coordination.py
src/synthorg/engine/__init__.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/test_coordination_service.py
src/synthorg/engine/coordination/attribution.py

📚 Learning: 2026-03-19T07:12:14.508Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

tests/unit/api/controllers/test_coordination.py
src/synthorg/engine/__init__.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/agent_engine.py

📚 Learning: 2026-03-19T07:12:14.508Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Applied to files:

src/synthorg/engine/__init__.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/coordination/attribution.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

src/synthorg/engine/__init__.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/coordination/test_attribution_factory.py
src/synthorg/engine/coordination/attribution.py

📚 Learning: 2026-03-19T07:12:14.508Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/communication/**/*.py : Communication package (communication/): message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution; meeting/ subpackage for meeting protocol (round-robin, position papers, structured phases), scheduler (frequency, participant resolver), orchestrator

Applied to files:

src/synthorg/engine/__init__.py
src/synthorg/engine/coordination/__init__.py

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py

📚 Learning: 2026-03-19T07:12:14.508Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)

Applied to files:

src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`, etc.); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py

📚 Learning: 2026-03-14T15:43:05.601Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly

Applied to files:

src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py

📚 Learning: 2026-03-14T16:18:57.267Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py

📚 Learning: 2026-03-20T11:18:48.128Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py

📚 Learning: 2026-04-02T07:18:02.381Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module

Applied to files:

src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py

📚 Learning: 2026-03-15T18:38:44.202Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py

📚 Learning: 2026-03-18T21:23:23.586Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

src/synthorg/observability/events/context_budget.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals

Applied to files:

src/synthorg/observability/events/coordination.py

📚 Learning: 2026-03-19T07:13:44.964Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)

Applied to files:

src/synthorg/hr/performance/tracker.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/agent_engine.py

📚 Learning: 2026-03-16T06:24:56.341Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion

Applied to files:

src/synthorg/hr/performance/tracker.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/agent_engine.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 with adopted conventions: use computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.

Applied to files:

src/synthorg/hr/performance/tracker.py

📚 Learning: 2026-03-15T18:38:44.202Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-19T11:33:01.580Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-20T11:18:48.128Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).

Applied to files:

tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/compaction/test_summarizer_markers.py

📚 Learning: 2026-04-02T09:04:54.709Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T09:04:54.709Z
Learning: Applies to tests/**/*.py : Use pytest markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`

Applied to files:

tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/compaction/test_summarizer_markers.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to tests/**/*.py : Use markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow` for test classification

Applied to files:

tests/unit/engine/compaction/test_epistemic.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`

Applied to files:

src/synthorg/engine/agent_engine.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.

Applied to files:

src/synthorg/engine/agent_engine.py

📚 Learning: 2026-03-17T06:30:14.180Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

src/synthorg/engine/agent_engine.py

📚 Learning: 2026-03-19T07:12:14.508Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

src/synthorg/engine/agent_engine.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-04-07T22:12:30.141Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`); use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-17T11:41:02.964Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T11:41:02.964Z
Learning: Applies to src/**/*.py : Models: Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values instead of storing + validating redundant fields. Use `NotBlankStr` for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-15T18:42:17.990Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-04-07T22:12:30.141Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves; never mix static config fields with mutable runtime fields in one model

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-04-02T11:49:21.202Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T11:49:21.202Z
Learning: Applies to **/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`) with `allow_inf_nan=False` in all `ConfigDict` declarations

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-04-01T15:36:39.993Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to src/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time

Applied to files:

src/synthorg/engine/compaction/models.py

🔇 Additional comments (37)

src/synthorg/engine/loop_protocol.py (2)

28-41: LGTM! Clean enum design with clear semantics.

The NodeType enum is well-structured with descriptive values and a comprehensive docstring explaining its purpose for trace analysis and credit assignment. Using StrEnum ensures serialization-friendly string values.

119-122: LGTM! Backward-compatible field addition.

The node_types field is properly added with:

Immutable tuple[NodeType, ...] type

Default () for legacy deserialization compatibility (as documented in docstring lines 74-76)

Clear description in Field

src/synthorg/engine/loop_helpers.py (2)

40-47: LGTM! Import addition is correctly placed.

The NodeType import is appropriately grouped with other imports from .loop_protocol.

464-533: LGTM! Well-implemented node type derivation.

The make_turn_record changes are clean:

Parameter addition (line 470): The extra_node_types parameter with default () maintains backward compatibility.

Auto-derivation logic (lines 513-517): Correctly derives base node types from response content:

LLM_CALL always present (every turn involves an LLM call)

TOOL_INVOCATION added when response.tool_calls is truthy

Caller-provided types appended via tuple concatenation

Docstring (lines 474-477, 488-489): Clearly documents the auto-derivation behavior and the purpose of extra_node_types.

The implementation aligns well with the PR objective to populate node_types during execution.

src/synthorg/tools/context/__init__.py (1)

1-1: Looks good.

Clear module-level docstring and no concerns in this segment.

tests/unit/tools/test_factory_sandbox_wiring.py (1)

45-46: Expectation update is correct.

Tool-count baseline correctly reflects the new default compact_context tool.

tests/unit/tools/test_factory.py (1)

20-20: LGTM for factory expectation sync.

Expected tool names/count documentation is correctly updated for compact_context.

Also applies to: 46-46

tests/unit/tools/test_factory_new_categories.py (1)

137-140: Count update is correct.

Default composition and assertion now match the additional context tool.

src/synthorg/tools/factory.py (1)

194-196: Factory wiring looks correct.

CompactContextTool is properly included in the default tool set.

Also applies to: 209-209

src/synthorg/observability/events/context_budget.py (1)

20-26: Event constants addition is clean.

Naming and placement are consistent with the observability event-constant pattern.

tests/unit/engine/compaction/test_models.py (1)

45-85: Good coverage for new config semantics.

These tests appropriately lock in threshold-ordering behavior and default marker preservation.

src/synthorg/engine/compaction/models.py (2)

35-71: LGTM! Well-structured configuration extension.

The new agent_controlled, safety_threshold_percent, and preserve_epistemic_markers fields are properly defined with appropriate constraints and defaults. The ConfigDict correctly includes frozen=True and allow_inf_nan=False as per project conventions.

73-86: LGTM! Correct threshold validation logic.

The @model_validator(mode="after") correctly enforces that safety_threshold_percent > fill_threshold_percent only when agent_controlled=True. This ensures agents have headroom between their compaction decisions and the safety net threshold.
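A rough sketch of the invariant this validator enforces, assuming illustrative default values. It uses a frozen dataclass as a standard-library stand-in for the Pydantic model, so the class name and the __post_init__ hook are not the PR's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionConfigSketch:
    """Stand-in for CompactionConfig; the defaults are illustrative only."""

    fill_threshold_percent: float = 80.0
    agent_controlled: bool = False
    safety_threshold_percent: float = 95.0
    preserve_epistemic_markers: bool = True

    def __post_init__(self) -> None:
        # The ordering constraint applies only in agent-controlled mode,
        # where the safety threshold acts as a backstop above the fill level.
        if (
            self.agent_controlled
            and self.safety_threshold_percent <= self.fill_threshold_percent
        ):
            msg = "safety_threshold_percent must exceed fill_threshold_percent"
            raise ValueError(msg)
```

With agent_controlled=False the two thresholds are not compared, so an existing configuration that never sets the safety threshold keeps validating.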

tests/unit/engine/compaction/test_epistemic.py (2)

1-58: LGTM! Comprehensive marker counting tests.

Good coverage of edge cases including no markers, single markers from each group, multiple from same group (counted once), mixed markers, and case-insensitive matching.

61-105: LGTM! Thorough threshold testing across complexity levels.

Tests properly verify the complexity-adaptive thresholds: COMPLEX/EPIC preserve at ≥1 marker, SIMPLE/MEDIUM require ≥3 markers.

tests/unit/engine/compaction/test_summarizer_markers.py (1)

1-58: LGTM! Good test setup and initial marker preservation tests.

The helper function _msg and initial test cases for standard format output are well-structured.

src/synthorg/engine/compaction/epistemic.py (2)

1-41: LGTM! Well-organized epistemic pattern definitions.

The module docstring provides good context including the research reference. The precompiled patterns are appropriately grouped by marker type.

55-91: LGTM! Clean utility functions.

Both count_epistemic_markers and should_preserve_message are concise and correctly implement the complexity-adaptive thresholds.
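For orientation, the two helpers might look roughly like this. The marker groups and regular expressions are hypothetical (the real patterns live in compaction/epistemic.py), and the Complexity enum is redefined locally so the sketch runs on its own; only the thresholds and the once-per-group counting come from the review.

```python
import re
from enum import Enum


class Complexity(Enum):
    """Local stand-in for the project's Complexity enum."""

    SIMPLE = "simple"
    MEDIUM = "medium"
    COMPLEX = "complex"
    EPIC = "epic"


# Hypothetical marker groups, precompiled and case-insensitive.
_MARKER_GROUPS: tuple[re.Pattern[str], ...] = (
    re.compile(r"\b(wait|hold on)\b", re.IGNORECASE),
    re.compile(r"\b(hmm|not sure)\b", re.IGNORECASE),
    re.compile(r"\b(actually|on second thought)\b", re.IGNORECASE),
)


def count_epistemic_markers(text: str) -> int:
    """Count marker groups with at least one match; each group counts once."""
    return sum(1 for pattern in _MARKER_GROUPS if pattern.search(text))


def should_preserve_message(text: str, complexity: Complexity) -> bool:
    """COMPLEX/EPIC preserve at >= 1 marker, SIMPLE/MEDIUM at >= 3."""
    threshold = 1 if complexity in (Complexity.COMPLEX, Complexity.EPIC) else 3
    return count_epistemic_markers(text) >= threshold
```

Counting groups rather than raw matches means a message that repeats "wait" several times still contributes a single marker.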

tests/unit/tools/context/test_compact_context.py (2)

1-52: LGTM! Comprehensive tool attribute tests.

Good coverage of tool name, category, description, and parameters schema including the optional preserve_markers field with its default value.

54-178: LGTM! Thorough execute and definition tests.

Tests properly verify metadata keys, default values, explicit parameter handling, and the to_definition() method. The is_error is False check is a good defensive assertion.

src/synthorg/engine/compaction/summarizer.py (4)

10-14: LGTM! Clean epistemic module integration.

The imports from the new epistemic module are appropriately scoped to the functions needed.

87-93: LGTM! Correct threshold selection logic.

The effective_threshold correctly selects safety_threshold_percent for agent-controlled mode and fill_threshold_percent otherwise, matching the docstring behavior in make_compaction_callback.
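The dual-threshold selection can be illustrated with a small sketch. The function names and the >= comparison are assumptions; the review only states which threshold applies in each mode.

```python
def effective_threshold(
    *,
    agent_controlled: bool,
    fill_threshold_percent: float,
    safety_threshold_percent: float,
) -> float:
    """Pick the threshold that triggers automatic compaction."""
    # In agent-controlled mode the agent decides when to compact, so the
    # automatic path only fires at the higher safety threshold.
    if agent_controlled:
        return safety_threshold_percent
    return fill_threshold_percent


def should_auto_compact(fill_percent: float, threshold: float) -> bool:
    """Assumed comparison: compact once fill reaches the threshold."""
    return fill_percent >= threshold
```

For example, at 85 percent fill with thresholds of 80 and 95, automatic compaction would run in the default mode but wait in agent-controlled mode.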

230-239: LGTM! Defensive complexity extraction.

The chained getattr calls with None fallbacks correctly handle cases where task execution or complexity may not be present, defaulting to Complexity.COMPLEX for conservative marker preservation.
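A minimal sketch of that defensive pattern follows. The attribute names task_execution, task, and estimated_complexity are hypothetical placeholders, and Complexity is redefined locally; only the chained getattr with None fallbacks and the COMPLEX default come from the review.

```python
from enum import Enum
from types import SimpleNamespace


class Complexity(Enum):
    """Local stand-in for the project's Complexity enum."""

    SIMPLE = "simple"
    COMPLEX = "complex"


def extract_complexity(ctx: object) -> Complexity:
    """Walk the attribute chain, tolerating a missing link at any step."""
    # getattr(None, name, None) returns None, so each step is safe even
    # when the previous one found nothing.
    task_execution = getattr(ctx, "task_execution", None)
    task = getattr(task_execution, "task", None)
    complexity = getattr(task, "estimated_complexity", None)
    # Default to COMPLEX so markers are preserved conservatively.
    return complexity if complexity is not None else Complexity.COMPLEX


# Usage: a context with no task execution falls back to COMPLEX.
bare_ctx = SimpleNamespace(task_execution=None)
full_ctx = SimpleNamespace(
    task_execution=SimpleNamespace(
        task=SimpleNamespace(estimated_complexity=Complexity.SIMPLE)
    )
)
```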

315-380: LGTM! Well-structured force_compaction implementation.

The function correctly bypasses fill threshold checks while still enforcing min_messages_to_compact and preserve_recent_turns. The forced=True log indicator provides good observability for distinguishing forced compactions.

src/synthorg/observability/events/coordination.py (1)

19-19: LGTM!

The new event constant follows the established naming convention and is properly typed with Final[str]. It aligns well with the other coordination events in this module.

src/synthorg/engine/coordination/attribution.py (3)

59-104: LGTM!

The AgentContribution model is well-designed with:

Proper frozen config with allow_inf_nan=False per guidelines.

NotBlankStr for identifier fields.

Clear field constraints (ge=0.0, le=1.0 for score).

Strong model validator enforcing the score/attribution invariant.

107-145: LGTM!

The CoordinationResultWithAttribution wrapper correctly:

Preserves immutability of the wrapped CoordinationResult.

Uses @computed_field for derived values (is_success, total_contribution_score) per guidelines.

Handles empty contributions gracefully in total_contribution_score.

148-201: LGTM!

The build_agent_contributions() factory is well-implemented with proper structured logging using the domain event constant.

src/synthorg/hr/performance/tracker.py (2)

182-206: LGTM!

The record_coordination_contributions() method is well-implemented:

Properly guards against empty contributions before calculating avg_score.

Uses structured logging with the existing PERF_METRIC_RECORDED event.

Synchronous signature is appropriate for in-memory storage.

One minor observation: unlike get_task_metrics() and get_collaboration_metrics(), there's no corresponding getter for contributions. Consider adding one if consumers need to query stored contributions.

100-100: LGTM!

The new _contributions store follows the same pattern as _task_metrics and _collab_metrics.

src/synthorg/engine/coordination/__init__.py (1)

8-13: LGTM!

The re-exports are properly added with:

Correct import from the new attribution module.

Alphabetically sorted entries in __all__.

All four new symbols (AgentContribution, CoordinationResultWithAttribution, FailureAttribution, build_agent_contributions) exposed at package level.

Also applies to: 35-56

src/synthorg/engine/__init__.py (1)

53-70: LGTM!

The re-exports appropriately expose only the primary types (AgentContribution, CoordinationResultWithAttribution) at the engine package level, keeping lower-level symbols like FailureAttribution and build_agent_contributions scoped to the coordination subpackage. This is good API layering.

Also applies to: 230-262

tests/unit/api/controllers/test_coordination.py (1)

51-70: LGTM!

The test helper correctly wraps CoordinationResult in CoordinationResultWithAttribution, matching the updated production API. The wrapper's default empty agent_contributions tuple is appropriate for these controller-level tests that don't need to verify attribution details.

tests/unit/engine/test_coordination_service.py (2)

135-146: LGTM!

The test correctly validates the new API shape:

Uses attributed.is_success instead of result.is_success.

Accesses the wrapped result via attributed.result.

Verifies agent_contributions is a tuple (line 146).

179-184: LGTM!

All test methods are consistently updated to:

Capture attributed from coordinator.coordinate(ctx).

Extract result = attributed.result for detailed assertions.

Use attributed.is_success for success/failure checks.

The systematic update pattern is correct and thorough.

Also applies to: 291-299, 336-338, 362-368, 419-426, 510-518, 559-561, 594-598, 625-631, 666-671, 764-766, 797-805, 833-840

src/synthorg/engine/agent_engine.py (2)

454-477: LGTM!

The return type update from CoordinationResult to CoordinationResultWithAttribution is correct and consistent with:

The MultiAgentCoordinator.coordinate() implementation (which now returns the wrapper).

The API controller usage (context snippet 2) that accesses .result and .is_success.

The test in context snippet 1 (assert result is expected_result) remains compatible because the mock's return_value is returned directly without transformation.

120-125: LGTM!

The TYPE_CHECKING import correctly switches from CoordinationResult to CoordinationResultWithAttribution, and the import path uses the attribution submodule directly which is appropriate for type-only imports.

src/synthorg/engine/compaction/epistemic.py

src/synthorg/engine/compaction/summarizer.py

src/synthorg/engine/coordination/attribution.py

coderabbitai · 2026-04-08T05:22:59Z

src/synthorg/engine/coordination/service.py

+        contributions = build_agent_contributions(
+            routing_result,
+            dispatch_result.waves,
+        )


⚠️ Potential issue | 🟠 Major

Preserve attribution per executed subtask, not per agent.

The factory called here currently collapses routing decisions to a single agent_id -> subtask_id lookup. When one agent owns multiple subtasks in the same run, the later decision overwrites the earlier one and at least one outcome gets attached to the wrong subtask_id. Because this tuple is both returned and persisted, the bad mapping leaks straight into attribution metrics. Use the executed outcome/task as part of the lookup key, and add a regression for repeated-agent routing.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/coordination/service.py` around lines 211 - 214, The current use of build_agent_contributions with routing_result and dispatch_result.waves collapses routing into a single agent_id->subtask_id mapping which allows later subtasks to overwrite earlier ones when one agent executes multiple subtasks; update build_agent_contributions (or the code that constructs its input) to use a composite key that includes the executed outcome or task identifier (e.g., use (agent_id, executed_outcome_id) or (agent_id, subtask_execution_id)) so each executed subtask is mapped uniquely, ensure any persistence/returned tuple uses that composite mapping, and add a unit/regression test that simulates repeated-agent routing to assert that multiple subtasks from the same agent retain distinct attribution rather than being overwritten.
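The overwrite described in this comment can be reproduced with a small sketch. The Decision and Outcome types and both function names are hypothetical, and the sketch assumes each outcome carries a task_id equal to the routed subtask_id, which the real models may or may not provide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    agent_id: str
    subtask_id: str


@dataclass(frozen=True)
class Outcome:
    agent_id: str
    task_id: str  # assumed to equal the routed subtask_id
    success: bool


def attribute_by_agent(
    decisions: list[Decision], outcomes: list[Outcome]
) -> list[tuple[str, str, bool]]:
    """Lossy variant: a later decision overwrites an earlier one per agent."""
    lookup = {d.agent_id: d.subtask_id for d in decisions}
    return [(o.agent_id, lookup[o.agent_id], o.success) for o in outcomes]


def attribute_by_task(
    decisions: list[Decision], outcomes: list[Outcome]
) -> list[tuple[str, str, bool]]:
    """Composite key keeps one entry per (agent, executed subtask)."""
    lookup = {(d.agent_id, d.subtask_id): d.subtask_id for d in decisions}
    return [
        (o.agent_id, lookup[(o.agent_id, o.task_id)], o.success)
        for o in outcomes
    ]


decisions = [Decision("a1", "s1"), Decision("a1", "s2")]
outcomes = [Outcome("a1", "s1", True), Outcome("a1", "s2", False)]
```

With these inputs the per-agent lookup attaches both outcomes to s2, while the composite key keeps s1 and s2 distinct. A regression test for repeated-agent routing could assert exactly that difference.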

coderabbitai · 2026-04-08T05:22:59Z

src/synthorg/tools/context/compact_context.py

+_COMPACT_CONTEXT_SCHEMA: dict[str, Any] = {
+    "type": "object",
+    "properties": {
+        "strategy": {
+            "type": "string",
+            "enum": ["summarize"],
+            "description": (
+                "Compaction strategy. Currently only 'summarize' is supported."
+            ),
+        },
+        "preserve_markers": {
+            "type": "boolean",
+            "default": True,
+            "description": (
+                "Whether to preserve epistemic markers (wait, hmm, "
+                "actually, etc.) in the compaction summary."
+            ),
+        },
+        "reason": {
+            "type": "string",
+            "minLength": 10,
+            "maxLength": 256,
+            "description": (
+                "Brief explanation for why compaction is needed "
+                "now (e.g., 'context fill at 92 percent, need to "
+                "preserve reasoning clarity')."
+            ),
+        },
+    },
+    "required": ["strategy", "reason"],
+    "additionalProperties": False,
+}


🛠️ Refactor suggestion | 🟠 Major

Make the schema constant read-only to enforce immutability.

_COMPACT_CONTEXT_SCHEMA is a mutable module-level dict. Please wrap a deep-copied structure with a read-only proxy at construction.

♻️ Proposed refactor

+from copy import deepcopy
+from types import MappingProxyType
 from typing import Any
@@
-_COMPACT_CONTEXT_SCHEMA: dict[str, Any] = {
+_COMPACT_CONTEXT_SCHEMA: dict[str, Any] = MappingProxyType(
+    deepcopy(
+        {
     "type": "object",
     "properties": {
         "strategy": {
@@
     },
     "required": ["strategy", "reason"],
     "additionalProperties": False,
-}
+        }
+    )
+)

As per coding guidelines, "Create new objects for immutability, never mutate existing ones; for non-Pydantic internal collections use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/synthorg/tools/context/compact_context.py` around lines 21 - 52, The _COMPACT_CONTEXT_SCHEMA dict is mutable; make it immutable by building it via copy.deepcopy(...) and then wrapping the result with types.MappingProxyType so the module-level symbol _COMPACT_CONTEXT_SCHEMA becomes a read-only mapping; import copy and MappingProxyType where needed and replace the current literal assignment with a deep-copied value wrapped by MappingProxyType to enforce immutability.

src/synthorg/tools/context/compact_context.py

tests/unit/engine/compaction/test_epistemic.py

coderabbitai · 2026-04-08T05:22:59Z

tests/unit/engine/compaction/test_summarizer_markers.py

+    def test_empty_assistant_content_ignored(self) -> None:
+        """Empty assistant messages are skipped."""
+        messages = (
+            _msg(MessageRole.ASSISTANT, ""),
+            _msg(MessageRole.ASSISTANT, "Wait, something important."),
+        )
+        summary = _build_summary(
+            messages,
+            execution_id="test-exec",
+            preserve_markers=True,
+            task_complexity=Complexity.COMPLEX,
+        )
+
+        # First message is empty and skipped, second has marker
+        assert "Epistemic markers preserved from 1 messages" in summary


⚠️ Potential issue | 🟡 Minor

Minor: Grammar issue in expected output.

The assertion expects "from 1 messages" which is grammatically incorrect (should be "from 1 message"). This reflects the implementation's behavior, so if intentional, consider updating the implementation to handle singular/plural correctly.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/engine/compaction/test_summarizer_markers.py` around lines 244 - 258, The test failure is due to a grammar mismatch ("from 1 messages"); update the summarizer code that builds the status text (the logic invoked by _build_summary that produces "Epistemic markers preserved from {n} messages") to choose "message" when the preserved count equals 1 and "messages" otherwise (i.e., pluralize based on count). Locate the string generation in the summarizer/marker-preservation code path (the function invoked by _build_summary) and replace the hardcoded plural with a conditional pluralization so the output reads "from 1 message" for a single preserved marker and "from N messages" for other counts.
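The conditional pluralization could be as small as the following sketch. The function name marker_summary_line is hypothetical; the message text is taken from the test assertion above.

```python
def marker_summary_line(count: int) -> str:
    """Build the status line with a singular noun when count is exactly 1."""
    noun = "message" if count == 1 else "messages"
    return f"Epistemic markers preserved from {count} {noun}"
```

If the implementation changes this way, the assertion in test_empty_assistant_content_ignored would need to expect "from 1 message".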

tests/unit/engine/compaction/test_summarizer_markers.py

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/synthorg/engine/compaction/summarizer.py (1)
110-115: ⚠️ Potential issue | 🟠 Major

Forced compaction currently emits CONTEXT_BUDGET_COMPACTION_STARTED twice.

force_compaction() logs the STARTED event at DEBUG and then _do_compaction() logs the same STARTED event again at INFO. Any metric or alert keyed on the event name will double-count forced runs. Keep the INFO emission as the canonical state-transition log, add forced=force there, and drop the extra DEBUG event.
🧭 Proposed fix
     logger.info(
         CONTEXT_BUDGET_COMPACTION_STARTED,
         execution_id=ctx.execution_id,
         fill_percent=fill_pct,
         message_count=len(conversation),
+        forced=force,
     )
@@
-    logger.debug(
-        CONTEXT_BUDGET_COMPACTION_STARTED,
-        execution_id=ctx.execution_id,
-        fill_percent=ctx.context_fill_percent,
-        message_count=len(ctx.conversation),
-        forced=True,
-    )
     return _do_compaction(ctx, config, estimator, force=True)
As per coding guidelines, "All state transitions must log at INFO. DEBUG for object creation, internal flow, and entry/exit of key functions."

Also applies to: 344-350
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/engine/compaction/summarizer.py` around lines 110 - 115, The
STARTED compaction event is being emitted twice for forced runs; remove the
DEBUG emission in force_compaction() and rely on the INFO emission inside
_do_compaction(), adding forced=force to that INFO call (the
CONTEXT_BUDGET_COMPACTION_STARTED log) so INFO is the canonical state-transition
log; also scan the same file for the other duplicate STARTED emission patterns
(the similar block around the later compaction call) and apply the same change:
drop the extra DEBUG STARTED log and annotate the single INFO STARTED log with
forced=force.
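To make the double-count concrete, here is a toy sketch of the proposed shape, with a list standing in for the structured logger. The event string value and the simplified signatures are assumptions; the real functions take ctx, config, and estimator.

```python
from typing import Any

# Stand-in for the structured logger: records (level, event, fields).
EVENTS: list[tuple[str, str, dict[str, Any]]] = []

# Assumed value; the real constant lives in events/context_budget.py.
CONTEXT_BUDGET_COMPACTION_STARTED = "context_budget.compaction.started"


def log(level: str, event: str, **fields: Any) -> None:
    EVENTS.append((level, event, fields))


def _do_compaction(message_count: int, *, force: bool = False) -> None:
    # Single canonical state-transition log at INFO, annotated with forced.
    log(
        "info",
        CONTEXT_BUDGET_COMPACTION_STARTED,
        message_count=message_count,
        forced=force,
    )


def force_compaction(message_count: int) -> None:
    # No separate DEBUG emission here; delegate straight to the shared path.
    _do_compaction(message_count, force=True)


force_compaction(12)
started = [e for e in EVENTS if e[1] == CONTEXT_BUDGET_COMPACTION_STARTED]
```

After one forced run, started holds a single INFO entry with forced=True, so a metric keyed on the event name counts the run once.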

♻️ Duplicate comments (2)

src/synthorg/tools/context/compact_context.py (1)
23-54: 🛠️ Refactor suggestion | 🟠 Major

Freeze _COMPACT_CONTEXT_SCHEMA instead of leaving it mutable.

deepcopy() in __init__ protects each instance, but the shared schema source can still be mutated for every future CompactContextTool construction. Build the module-level schema from a deep-copied object and wrap it in MappingProxyType, then copy from that immutable source when passing it to BaseTool.

As per coding guidelines, "Create new objects instead of mutating existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/context/compact_context.py` around lines 23 - 54, The
module-level _COMPACT_CONTEXT_SCHEMA should be made immutable: create it from a
deep-copied dict and wrap it in MappingProxyType (e.g., _COMPACT_CONTEXT_SCHEMA
= MappingProxyType(deepcopy({...}))) so the source cannot be mutated; then in
CompactContextTool.__init__ (the constructor that currently deepcopy()s the
schema) continue to copy from this read-only source before passing it to
BaseTool, converting the proxy to a plain dict first (e.g.,
copy.deepcopy(dict(_COMPACT_CONTEXT_SCHEMA))) because copy.deepcopy does not
accept a mappingproxy object directly, so each instance gets its own mutable
copy while the module-level schema remains read-only.
src/synthorg/engine/coordination/attribution.py (1)
178-209: ⚠️ Potential issue | 🟠 Major

Round-robin subtask matching still misattributes repeated-agent runs.

Lines 178-203 pair outcomes to subtasks by agent_id plus ordinal position. If one agent owns multiple subtasks and those outcomes are emitted in a different order than routing_result.decisions, the wrong subtask_id gets attached and that bad mapping is then returned/persisted. Match on the executed task/outcome identifier instead, and add a regression where one agent completes multiple routed subtasks out of order.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/engine/coordination/attribution.py` around lines 178 - 209, The
current round-robin uses agent_to_subtasks and agent_subtask_idx which
misattributes when an agent emits outcomes out-of-order; instead build a direct
lookup from the executed task/outcome identifier to the routed subtask (using
routing_result.decisions entries) and use that lookup when iterating
waves->outcomes to pick subtask_id (fall back to the existing logic only if no
mapping exists). Update the loop that constructs
agent_to_subtasks/agent_subtask_idx or add a new mapping (keyed by
outcome.task_id or the decision's executed-task id) and then call _score_outcome
with the mapped subtask_id for each outcome in the waves processing block.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/design/engine.md`:
- Line 1449: The heading "Per-Agent Attribution (Phase 8)" incorrectly implies
an additional pipeline phase; rename the header to "Per-Agent Attribution
(Post-Pipeline)" or simply "Per-Agent Attribution" and update any
cross-references or anchors that mention "Phase 8" so they no longer imply an
8-phase pipeline; ensure the section title text and any internal links or index
entries referencing "Per-Agent Attribution (Phase 8)" are updated to the new
wording to keep docs consistent with the 7-phase pipeline description.

In `@src/synthorg/engine/compaction/epistemic.py`:
- Around line 12-15: The module is missing the standard logger; import
get_logger from synthorg.observability and create a module logger by adding
"from synthorg.observability import get_logger" and "logger =
get_logger(__name__)" at the top of epistemic.py (near the other imports) so
functions in this compaction logic can use logger consistently with the engine
package conventions.

In `@src/synthorg/engine/coordination/attribution.py`:
- Around line 72-103: The AgentContribution Pydantic model currently only
preserves contribution_score, failure_attribution, and evidence, which drops
fields required by the design (e.g., failure_category, token_usage,
raw_per_agent_outcome); update the AgentContribution class to include these
missing attributes (add fields such as failure_category: FailureCategory | None,
token_usage: TokenUsage | None, and raw_outcome: dict | str | None with
appropriate types, defaults, descriptions, and validation similar to existing
fields) and then update the PerformanceTracker.record_coordination_contributions
signature and any callers to accept and forward the extended AgentContribution
instances so those values are retained and surfaced (ensure model_config and max
length/validation rules follow existing patterns and that
serialization/consumption points use the new fields).

In `@src/synthorg/engine/coordination/service.py`:
- Around line 211-223: The call to build_agent_contributions in Service
(coordination service) can raise and currently sits on the success path, so wrap
the build_agent_contributions(...) call in a try/except that catches Exception,
logs a WARNING or ERROR with context (include routing_result,
dispatch_result.waves or coordination id), and continues so it does not mark the
coordination run as failed; after successfully building contributions, proceed
to call self._performance_tracker.record_coordination_contributions(...) as
before, but ensure the tracker call remains guarded separately so tracker
failures also do not fail an already-completed run.
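
One way to keep attribution off the success path is sketched below, under stated assumptions: `attribute_safely`, `RecordingLogger`, and the event names are hypothetical, and the parenthesized `except` form is used only so the sketch runs on interpreters older than the project's Python 3.14 target.

```python
from collections.abc import Callable
from typing import Any

Contributions = tuple[str, ...]


class RecordingLogger:
    """Illustrative logger that stores warning events."""

    def __init__(self) -> None:
        self.warnings: list[tuple[str, dict[str, Any]]] = []

    def warning(self, event: str, **fields: Any) -> None:
        self.warnings.append((event, fields))


def attribute_safely(
    build: Callable[[], Contributions],
    record: Callable[[Contributions], None],
    logger: RecordingLogger,
    *,
    coordination_id: str,
) -> Contributions:
    """Build and record contributions without failing the completed run."""
    try:
        contributions = build()
    except (MemoryError, RecursionError):
        raise  # system errors still propagate
    except Exception as exc:
        logger.warning(
            "coordination.attribution.build_failed",  # hypothetical event name
            coordination_id=coordination_id,
            error=str(exc),
        )
        return ()
    try:
        record(contributions)  # guarded separately from the build step
    except (MemoryError, RecursionError):
        raise
    except Exception as exc:
        logger.warning(
            "coordination.attribution.record_failed",  # hypothetical event name
            coordination_id=coordination_id,
            error=str(exc),
        )
    return contributions


def broken_build() -> Contributions:
    raise ValueError("bad routing data")


demo_logger = RecordingLogger()
fallback = attribute_safely(
    broken_build, lambda c: None, demo_logger, coordination_id="coord-1"
)
```

Both failure modes are logged with context rather than swallowed, and the caller receives an empty tuple or the built contributions instead of an exception.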

In `@src/synthorg/tools/context/compact_context.py`:
- Around line 23-53: The schema _COMPACT_CONTEXT_SCHEMA currently lacks a field
for agent-provided summaries; add a new optional property (e.g., "agent_summary"
or "summary") to the schema with type "string" and sensible minLength/maxLength
(and a description stating it can replace or supplement auto compaction), keep
"required" unchanged (do not force it), and ensure additionalProperties remains
False; then wire this field through the public tool boundary by accepting it
where the compact_context tool input is parsed/validated so the agent-provided
summary is passed into the compaction logic (update any validation/parse code
that references _COMPACT_CONTEXT_SCHEMA to include and forward "agent_summary").

In `@tests/integration/tools/test_factory_integration.py`:
- Line 15: Replace the brittle hard-coded-only assertion using
_EXPECTED_TOOL_COUNT by asserting both the expected length and that specific
required tool names exist in the produced tool registry; update the test that
uses _EXPECTED_TOOL_COUNT to (1) keep the length check if desired, and (2) check
that the returned collection (e.g., the list or dict of tools the factory
returns) contains required keys/names including "compact_context" and any other
critical tool names. Ensure you reference and update the variable
_EXPECTED_TOOL_COUNT alongside these presence assertions so the test fails only
when tools are missing or the count truly changes.
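
A sketch of the combined count-and-presence check, assuming a hypothetical `fake_build_default_tools()` that returns a name-keyed mapping; the count value and every tool name other than `compact_context` are placeholders, not the factory's real contents.

```python
_EXPECTED_TOOL_COUNT = 3  # placeholder value for this sketch
_REQUIRED_TOOL_NAMES = frozenset({"compact_context"})


def fake_build_default_tools() -> dict[str, object]:
    """Hypothetical stand-in for the real tool factory."""
    return {
        "read_file": object(),  # placeholder tool name
        "write_file": object(),  # placeholder tool name
        "compact_context": object(),
    }


def check_tool_registry(tools: dict[str, object]) -> None:
    """Fail with a named tool when one is missing, then check the count."""
    missing = _REQUIRED_TOOL_NAMES - set(tools)
    assert not missing, f"missing required tools: {sorted(missing)}"
    assert len(tools) == _EXPECTED_TOOL_COUNT


check_tool_registry(fake_build_default_tools())
```

Checking names first means a failure message identifies the missing tool, while the length assertion still catches unexpected additions.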

In `@tests/unit/engine/compaction/test_epistemic.py`:
- Around line 144-162: The test test_truncates_at_max_chars never triggers the
new truncation branch because multiple short marker sentences fit within
max_chars; update the test to use a single marker sentence longer than max_chars
so extract_marker_sentences(max_chars=...) takes the first-sentence truncation
path, and assert the exact truncated string (first max_chars characters plus
"…"/"..." as implemented) rather than the current len-based check; modify the
input text and the assertions in test_truncates_at_max_chars to validate the
exact truncated output from extract_marker_sentences.
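
The test shape being requested can be illustrated with a stand-in. `extract_marker_sentences_stub`, the marker word list, and the `"..."` suffix are all assumptions for this sketch; the real `extract_marker_sentences` may use different markers and a different ellipsis.

```python
ELLIPSIS = "..."  # assumption: the real suffix may be "…" instead
MARKERS = ("probably", "might", "not sure")  # illustrative marker words


def extract_marker_sentences_stub(text: str, *, max_chars: int) -> str:
    """Hypothetical stand-in that truncates an oversized first sentence."""
    sentences = [s.strip() + "." for s in text.split(".") if s.strip()]
    marked = [s for s in sentences if any(m in s.lower() for m in MARKERS)]
    if not marked:
        return ""
    if len(marked[0]) > max_chars:
        # First-sentence truncation path: cut and append the suffix.
        return marked[0][:max_chars] + ELLIPSIS
    return " ".join(marked)[:max_chars]


def test_truncates_at_max_chars() -> None:
    # A single marker sentence longer than max_chars forces the truncation path.
    text = "The cache is probably stale because the hook never fired."
    result = extract_marker_sentences_stub(text, max_chars=20)
    assert result == "The cache is probabl" + ELLIPSIS


test_truncates_at_max_chars()
```

Asserting the exact string, rather than only `len(result)`, is what distinguishes the truncation branch from the case where several short sentences simply fit.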

---

Outside diff comments:
In `@src/synthorg/engine/compaction/summarizer.py`:
- Around line 110-115: The STARTED compaction event is being emitted twice for
forced runs; remove the DEBUG emission in force_compaction() and rely on the
INFO emission inside _do_compaction(), adding forced=force to that INFO call
(the CONTEXT_BUDGET_COMPACTION_STARTED log) so INFO is the canonical
state-transition log; also scan the same file for the other duplicate STARTED
emission patterns (the similar block around the later compaction call) and apply
the same change: drop the extra DEBUG STARTED log and annotate the single INFO
STARTED log with forced=force.

---

Duplicate comments:
In `@src/synthorg/engine/coordination/attribution.py`:
- Around line 178-209: The current round-robin uses agent_to_subtasks and
agent_subtask_idx which misattributes when an agent emits outcomes out-of-order;
instead build a direct lookup from the executed task/outcome identifier to the
routed subtask (using routing_result.decisions entries) and use that lookup when
iterating waves->outcomes to pick subtask_id (fall back to the existing logic
only if no mapping exists). Update the loop that constructs
agent_to_subtasks/agent_subtask_idx or add a new mapping (keyed by
outcome.task_id or the decision's executed-task id) and then call _score_outcome
with the mapped subtask_id for each outcome in the waves processing block.

In `@src/synthorg/tools/context/compact_context.py`:
- Around line 23-54: The module-level _COMPACT_CONTEXT_SCHEMA should be made
immutable: create it from a deep-copied dict and wrap it in MappingProxyType
(e.g., _COMPACT_CONTEXT_SCHEMA = MappingProxyType(deepcopy({...}))) so the
source cannot be mutated; then in CompactContextTool.__init__ (the constructor
that currently deepcopy()s the schema) continue to copy from this immutable
source (use copy.deepcopy on the MappingProxyType value) before passing it to
BaseTool so each instance gets its own mutable copy while the module-level
schema remains read-only.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9e159eb0-cdaa-418a-9696-241d4ffee08f

📥 Commits

Reviewing files that changed from the base of the PR and between 2123b09 and dfdf697.

📒 Files selected for processing (17)

CLAUDE.md
docs/design/engine.md
scripts/run_affected_tests.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/models.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/attribution.py
src/synthorg/engine/coordination/service.py
src/synthorg/hr/performance/tracker.py
src/synthorg/tools/context/compact_context.py
tests/integration/engine/test_coordination_wiring.py
tests/integration/tools/test_factory_integration.py
tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/engine/coordination/test_attribution.py
web/src/api/types.ts

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: Test (Python 3.14)
GitHub Check: Dashboard Test
GitHub Check: Build Web
GitHub Check: Build Backend
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (8)

**/*.{py,ts,tsx,go}

📄 CodeRabbit inference engine (CLAUDE.md)

No from __future__ import annotations in Python -- Python 3.14+ has PEP 649 native lazy annotations.

Files:

tests/integration/tools/test_factory_integration.py
tests/integration/engine/test_coordination_wiring.py
scripts/run_affected_tests.py
web/src/api/types.ts
src/synthorg/hr/performance/tracker.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/models.py
tests/unit/engine/coordination/test_attribution.py
src/synthorg/api/controllers/coordination.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/engine/compaction/test_epistemic.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use PEP 758 except syntax: use except A, B: (no parentheses) for multiple exceptions -- ruff enforces this on Python 3.14.

All public functions must have type hints and pass mypy strict mode type checking.

Docstrings must use Google style format and are required on all public classes and functions (enforced by ruff D rules).

Create new objects instead of mutating existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.

Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.

Use @computed_field for derived values in Pydantic models instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens).

Use NotBlankStr (from core.types) for all identifier/name fields, including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants, instead of manual whitespace validators.

Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.

Line length must be 88 characters (enforced by ruff).

Functions must be less than 50 lines; files must be less than 800 lines.

Handle errors explicitly; never silently swallow exceptions.

Validate input at system boundaries (user input, external APIs, config files).

Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__). Never use import logging / logging.getLogger() / print() in application code.

Always use...

Files:

tests/integration/tools/test_factory_integration.py
tests/integration/engine/test_coordination_wiring.py
scripts/run_affected_tests.py
src/synthorg/hr/performance/tracker.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/models.py
tests/unit/engine/coordination/test_attribution.py
src/synthorg/api/controllers/coordination.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/engine/compaction/test_epistemic.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

All test files must use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, or @pytest.mark.slow.

Coverage minimum is 80% (enforced in CI). Do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed. Global timeout of 30 seconds per test is configured in pyproject.toml.

Use asyncio_mode = "auto" for async tests -- no manual @pytest.mark.asyncio needed.

Always run tests with -n 8 for parallelism via pytest-xdist locally. Never run tests sequentially. CI uses -n auto for fewer cores on runners.

Prefer @pytest.mark.parametrize for testing similar cases.

Use Hypothesis for property-based testing with @given and @settings. Hypothesis profiles configured in tests/conftest.py: ci (deterministic, 10 examples), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples). Controlled via HYPOTHESIS_PROFILE env var. .hypothesis/ is gitignored.

When Hypothesis finds a failure, fix the underlying bug and add an explicit @example(...) decorator to the test so the case is permanently covered in CI. Do NOT just rerun and move on.

Never skip, dismiss, or ignore flaky tests -- always fix them fully. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins.

For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider), use asyncio.Event().wait() instead of asyncio.sleep(large_number) -- it is cancellation-safe and carries no timing assumptions.
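
The last guideline can be sketched as follows. `slow_provider` and `main` are hypothetical names for illustration; the point is only that awaiting a fresh, never-set `asyncio.Event` blocks until the task is cancelled.

```python
import asyncio


async def slow_provider() -> None:
    """Hypothetical provider that never responds on its own."""
    # A fresh Event is never set, so this waits until the task is cancelled.
    await asyncio.Event().wait()


async def main() -> str:
    task = asyncio.create_task(slow_provider())
    await asyncio.sleep(0)  # yield once so the task starts waiting
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled"
    return "finished"


result = asyncio.run(main())
```

Unlike `asyncio.sleep(large_number)`, nothing here depends on a duration, so the test cannot pass or fail based on how long the surrounding code takes.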

Files:

tests/integration/tools/test_factory_integration.py
tests/integration/engine/test_coordination_wiring.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/engine/compaction/test_epistemic.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/integration/tools/test_factory_integration.py
tests/integration/engine/test_coordination_wiring.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/engine/compaction/test_epistemic.py

**/*

📄 CodeRabbit inference engine (CLAUDE.md)

Never use Bash to write or modify files -- use the Write or Edit tools. Do not use cat >, cat << EOF, echo >, echo >>, sed -i, python -c "open(...).write(...)", or tee for file creation or modification.

Files:

tests/integration/tools/test_factory_integration.py
tests/integration/engine/test_coordination_wiring.py
scripts/run_affected_tests.py
web/src/api/types.ts
CLAUDE.md
docs/design/engine.md
src/synthorg/hr/performance/tracker.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/models.py
tests/unit/engine/coordination/test_attribution.py
src/synthorg/api/controllers/coordination.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/engine/compaction/test_epistemic.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py

web/src/**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{ts,tsx,js,jsx}: Always use createLogger from @/lib/logger -- never bare console.warn/console.error/console.debug in application code
Logger variable name must always be const log (e.g. const log = createLogger('module-name'))
Pass dynamic/untrusted values as separate arguments to logger methods (not interpolated into the message string) so they go through sanitizeArg
Attacker-controlled fields inside structured objects must be wrapped in sanitizeForLog() before embedding in log calls

Files:

web/src/api/types.ts

web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{ts,tsx}: Use Tailwind semantic classes (text-foreground, bg-card, text-accent, text-success, bg-danger, etc.) or CSS variables (var(--so-*)) for colors; NEVER hardcode hex values in .tsx/.ts files
Use font-sans or font-mono (Geist tokens) for typography; NEVER set fontFamily directly in .tsx/.ts files
Use density-aware tokens (p-card, gap-section-gap, gap-grid-gap) or standard Tailwind spacing; NEVER hardcode pixel values for layout spacing in components
Use token variables (var(--so-shadow-card-hover), border-border, border-bright) for shadows and borders; NEVER hardcode values in .tsx/.ts files
Use @/lib/motion presets for Framer Motion transition durations; NEVER hardcode transition durations
CSS side-effect imports in TypeScript 6 require type declarations -- add /// <reference types="vite/client" /> at the top of files with CSS imports

Always reuse existing components from web/src/components/ui/ before creating new ones. See web/CLAUDE.md for the full component inventory.

Never hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions -- use design tokens and @/lib/motion presets. A PostToolUse hook enforces these rules on every Edit/Write to web/src/.

Files:

web/src/api/types.ts

**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Always read the relevant docs/design/ page before implementing any feature or planning any issue. The design spec is the starting point for architecture, data models, and behavior.

Files:

CLAUDE.md
docs/design/engine.md

src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

src/synthorg/hr/performance/tracker.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/models.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py

🧠 Learnings (58)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: If implementation deviates from the design spec (better approach found, scope evolved, etc.), alert the user and explain why -- user decides whether to proceed or update the spec. Do NOT silently diverge -- every deviation needs explicit user approval.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts. At every phase of planning and implementation, be critical and actively look for ways to improve the design.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Prioritize issues by dependency order, not priority labels -- unblocked dependencies come first.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Commits must follow format: `<type>: <description>` -- types: feat, fix, refactor, docs, test, chore, perf, ci. Enforced by commitizen (commit-msg hook).

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Signed commits are required on `main` via branch protection -- all commits must be GPG/SSH signed.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Branches must follow naming convention: `<type>/<slug>` from main.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Pre-push hooks run: mypy type-check (affected modules) + pytest unit tests (affected modules) + golangci-lint + go vet + go test (CLI) + eslint-web (web). Foundational module changes (core, config, observability) or conftest changes trigger full runs. These hooks are skipped in pre-commit.ci since dedicated CI jobs handle them.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Use `gh issue list` via Bash (not MCP tools) for GitHub issue queries -- MCP `list_issues` has unreliable field data.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Use squash merge for PRs -- PR body becomes the squash commit message on main. Preserve existing `Closes #NNN` references in trailers -- never remove unless explicitly asked.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: After finishing an issue implementation, always create a feature branch (`<type>/<slug>`), commit, and push -- do NOT create a PR automatically. Do NOT leave work uncommitted on main.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Never create a PR directly -- `gh pr create` is blocked by hookify. Always use `/pre-pr-review` to create PRs -- it runs automated checks + review agents + fixes before creating the PR. For trivial/docs-only changes: `/pre-pr-review quick` skips agents but still runs checks.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: After the PR exists, use `/aurelio-review-pr` to handle external reviewer feedback. The `/commit-push-pr` command is effectively blocked (it calls `gh pr create` internally).

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T06:40:06.503Z
Learning: Fix everything valid from review agents -- never skip. When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to PR changes), fix them all. No deferring, no "out of scope" skipping.

📚 Learning: 2026-04-02T18:48:02.489Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T18:48:02.489Z
Learning: ALWAYS run pytest with `-n auto` for parallel execution with pytest-xdist; never run tests sequentially

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Parallelism: pytest-xdist via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.

Applied to files:

scripts/run_affected_tests.py
tests/unit/engine/compaction/test_epistemic.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Pre-push hooks check: mypy type-check (affected modules only) + pytest unit tests (affected modules only) + golangci-lint + go vet + go test (CLI, conditional on `cli/**/*.go`) + eslint-web (web dashboard); foundational module changes (core, config, observability) or conftest changes trigger full runs

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-04-02T08:47:46.313Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Run unit tests with `uv run python -m pytest tests/ -m unit -n auto`; integration tests with `-m integration -n auto`; e2e tests with `-m e2e -n auto`

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: ALWAYS include `-n 8` when running pytest locally; never run tests sequentially; CI uses `-n auto`

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-04-06T16:35:12.934Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: All project conventions, commands, and standards are defined in CLAUDE.md - refer to it for project structure, package layout, code conventions, quick commands, git workflow, testing standards, design specifications, logging, resilience, and security patterns

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-17T06:43:14.114Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-17T06:43:14.114Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

CLAUDE.md
src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

CLAUDE.md
src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.

Applied to files:

CLAUDE.md
src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to src/synthorg/**/*.py : Always use structured logging: `logger.info(EVENT, key=value)`—never `logger.info("msg %s", val)`

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-19T11:33:01.580Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings

Applied to files:

CLAUDE.md
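
The structured-logging learnings above share one call shape: a constant event name as the first argument and variable data as keyword fields. A minimal sketch of that shape follows. `RecordingLogger` is a stand-in written for this example (it is not what `get_logger()` returns), and the event constant `COMPACTION_COMPLETED` and its value are hypothetical.

```python
from typing import Any

# Hypothetical event constant; the real constants are said to live in
# synthorg.observability.events.<domain>.
COMPACTION_COMPLETED = "compaction.completed"


class RecordingLogger:
    """Stand-in logger for illustration only; not the SynthOrg logger."""

    def __init__(self) -> None:
        self.records: list[dict[str, Any]] = []

    def info(self, event: str, **fields: Any) -> None:
        # The event name stays constant; variable data travels as fields,
        # so nothing is interpolated into a message string.
        self.records.append({"event": event, "level": "info", **fields})


logger = RecordingLogger()
logger.info(COMPACTION_COMPLETED, agent_id="agent-1", turns_removed=12)
```

Keeping the event name constant means log consumers can filter on it exactly, which an interpolated message such as `"msg %s"` does not allow.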

📚 Learning: 2026-04-02T12:07:44.443Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T12:07:44.443Z
Learning: Applies to src/synthorg/**/*.py : Always use structured logging: `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-16T07:22:28.134Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit

Applied to files:

CLAUDE.md
src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-31T20:07:03.035Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T20:07:03.035Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising; all state transitions must log at INFO; DEBUG for object creation, internal flow, entry/exit of key functions

Applied to files:

CLAUDE.md
src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

docs/design/engine.md
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py

📚 Learning: 2026-03-19T07:12:14.508Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations

Applied to files:

docs/design/engine.md

📚 Learning: 2026-03-19T07:12:14.508Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Applied to files:

docs/design/engine.md
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py

📚 Learning: 2026-03-16T06:24:56.341Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

docs/design/engine.md

📚 Learning: 2026-03-19T07:13:44.964Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue — DESIGN_SPEC.md is a pointer file linking to 7 design pages (Agents, Organization, Communication, Engine, Memory, Operations)

Applied to files:

docs/design/engine.md

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.

Applied to files:

docs/design/engine.md

📚 Learning: 2026-03-17T06:30:14.180Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/engine/coordination/**/*.py : Task coordination uses multi-agent pipeline with 4 dispatchers (SAS/centralized/decentralized/context-dependent), wave execution, and workspace lifecycle integration.

Applied to files:

docs/design/engine.md
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py

📚 Learning: 2026-03-19T07:12:14.508Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

docs/design/engine.md
src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-19T07:13:44.964Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)

Applied to files:

src/synthorg/hr/performance/tracker.py
src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-16T06:24:56.341Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion

Applied to files:

src/synthorg/hr/performance/tracker.py
src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 with adopted conventions: use computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.

Applied to files:

src/synthorg/hr/performance/tracker.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-04-07T22:12:30.141Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to **/*.py : Create new objects for immutability, never mutate existing ones; for non-Pydantic internal collections use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-04-01T09:09:43.948Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:09:43.948Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement in non-Pydantic internal collections (registries, BaseTool)

Applied to files:

src/synthorg/tools/context/compact_context.py
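
The `copy.deepcopy()` plus `MappingProxyType` learnings above can be sketched as follows. `ToolRegistry` and its contents are hypothetical and exist only to show the pattern; this is not SynthOrg code.

```python
import copy
from types import MappingProxyType
from typing import Any, Mapping


class ToolRegistry:
    """Hypothetical registry used only to illustrate the pattern."""

    def __init__(self, tools: Mapping[str, dict[str, Any]]) -> None:
        # Deep-copy so later mutation of the caller's dict cannot leak in,
        # then wrap the copy in a read-only view.
        self._tools: Mapping[str, dict[str, Any]] = MappingProxyType(
            copy.deepcopy(dict(tools))
        )

    @property
    def tools(self) -> Mapping[str, dict[str, Any]]:
        # The proxy rejects item assignment and deletion at the top level.
        return self._tools


source = {"compact_context": {"enabled": True}}
registry = ToolRegistry(source)

# Mutating the caller's data after construction does not affect the registry.
source["compact_context"]["enabled"] = False
source["new_tool"] = {}

try:
    registry.tools["other"] = {}  # type: ignore[index]
    top_level_write_rejected = False
except TypeError:
    top_level_write_rejected = True
```

Note that `MappingProxyType` only guards the top-level mapping. Nested dicts inside it stay mutable, which is why the learnings pair it with a deep copy at construction.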

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization) for `dict`/`list` fields

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-04-07T22:12:30.141Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/models.py

📚 Learning: 2026-04-07T22:12:30.141Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves; never mix static config fields with mutable runtime fields in one model

Applied to files:

src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

src/synthorg/tools/context/compact_context.py
src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-14T16:18:57.267Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-15T16:55:07.730Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-04-07T22:12:30.141Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`); use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-17T11:41:02.964Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T11:41:02.964Z
Learning: Applies to src/**/*.py : Models: Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values instead of storing + validating redundant fields. Use `NotBlankStr` for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-15T18:42:17.990Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-04-02T11:49:21.202Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T11:49:21.202Z
Learning: Applies to **/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`) with `allow_inf_nan=False` in all `ConfigDict` declarations

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-15T18:38:44.202Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state

Applied to files:

src/synthorg/engine/compaction/models.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.

Applied to files:

src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`

Applied to files:

src/synthorg/api/controllers/coordination.py
src/synthorg/engine/coordination/service.py

📚 Learning: 2026-04-02T09:04:54.709Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T09:04:54.709Z
Learning: Applies to tests/**/*.py : Use pytest markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`

Applied to files:

tests/unit/engine/compaction/test_epistemic.py

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).

Applied to files:

tests/unit/engine/compaction/test_epistemic.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to tests/**/*.py : Use markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow` for test classification

Applied to files:

tests/unit/engine/compaction/test_epistemic.py

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-15T18:38:44.202Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-19T11:33:01.580Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-20T11:18:48.128Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

src/synthorg/engine/coordination/service.py

🔇 Additional comments (3)

CLAUDE.md (1)

93-93: Good update to event-name guidance for new observability flows.

The expanded event constant examples are aligned with the new coordination attribution and compaction/epistemic instrumentation, and keep the “import constant from domain module” rule explicit.

web/src/api/types.ts (1)

1310-1319: API contract update looks correct.

Adding currency: string to CoordinationResultResponse cleanly aligns the frontend type with the backend coordination payload and avoids type drift.

docs/design/engine.md (1)

1490-1490: LGTM: ACG cross-reference accurately reflects NodeType addition.

The update correctly documents the stronger node typing introduced by the NodeType enum on TurnRecord.node_types, maintaining the bidirectional glossary between ACG formalism and SynthOrg implementation.

docs/design/engine.md

src/synthorg/engine/compaction/epistemic.py

coderabbitai · 2026-04-08T06:55:31Z

src/synthorg/engine/coordination/attribution.py

+class AgentContribution(BaseModel):
+    """Per-agent contribution to a coordinated task execution.
+
+    Attributes:
+        agent_id: Identifier of the contributing agent.
+        subtask_id: Identifier of the subtask this agent executed.
+        contribution_score: Normalized score (0.0-1.0) reflecting
+            the agent's contribution quality.
+        failure_attribution: Classification of why the agent failed
+            (``None`` when the agent succeeded with score 1.0).
+        evidence: Truncated error message or evidence pointer
+            (``None`` when the agent succeeded).
+    """
+
+    model_config = ConfigDict(frozen=True, allow_inf_nan=False)
+
+    agent_id: NotBlankStr = Field(description="Contributing agent")
+    subtask_id: NotBlankStr = Field(description="Subtask executed")
+    contribution_score: float = Field(
+        ge=0.0,
+        le=1.0,
+        description="Contribution quality (0.0-1.0)",
+    )
+    failure_attribution: FailureAttribution | None = Field(
+        default=None,
+        description="Why the agent failed (None on success)",
+    )
+    evidence: str | None = Field(
+        default=None,
+        max_length=_MAX_EVIDENCE_LENGTH,
+        description="Truncated error or evidence pointer",
+    )


⚠️ Potential issue | 🟠 Major

AgentContribution drops fields that #1123 says to surface.

This model only preserves score/attribution/evidence. Because PerformanceTracker.record_coordination_contributions() accepts only AgentContribution, failure_category, token usage, and the raw per-agent outcome are discarded before the tracker can surface them. If #1123 intentionally narrowed scope, please update the objective; otherwise this payload needs those fields.

Based on learnings: If implementation deviates from the design spec (better approach found, scope evolved, etc.), alert the user and explain why -- user decides whether to proceed or update the spec.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/coordination/attribution.py` around lines 72 - 103, The AgentContribution Pydantic model currently only preserves contribution_score, failure_attribution, and evidence, which drops fields required by the design (e.g., failure_category, token_usage, raw_per_agent_outcome); update the AgentContribution class to include these missing attributes (add fields such as failure_category: FailureCategory | None, token_usage: TokenUsage | None, and raw_outcome: dict | str | None with appropriate types, defaults, descriptions, and validation similar to existing fields) and then update the PerformanceTracker.record_coordination_contributions signature and any callers to accept and forward the extended AgentContribution instances so those values are retained and surfaced (ensure model_config and max length/validation rules follow existing patterns and that serialization/consumption points use the new fields).
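
To make the suggestion above concrete, here is a minimal sketch of a contribution record that also carries the fields the comment says are dropped. It uses a stdlib frozen dataclass so it runs without dependencies, whereas the PR's model is a frozen Pydantic model. The names and types of `failure_category`, `token_usage`, and `raw_outcome` are assumptions taken from the review text, not from the codebase.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AgentContributionSketch:
    """Stdlib stand-in for the Pydantic model; added fields are hypothetical."""

    agent_id: str
    subtask_id: str
    contribution_score: float
    failure_attribution: str | None = None
    evidence: str | None = None
    # Fields the review says are currently discarded (names/types assumed).
    failure_category: str | None = None
    token_usage: int | None = None
    raw_outcome: str | None = None

    def __post_init__(self) -> None:
        # Mirror the ge=0.0 / le=1.0 bounds of the original field. NaN and
        # infinity fail this comparison too, similar to allow_inf_nan=False.
        if not (0.0 <= self.contribution_score <= 1.0):
            raise ValueError("contribution_score must be within [0.0, 1.0]")
        if not self.agent_id.strip() or not self.subtask_id.strip():
            raise ValueError("agent_id and subtask_id must not be blank")


# Placeholder values, chosen only for the example.
failed = AgentContributionSketch(
    agent_id="agent-1",
    subtask_id="subtask-7",
    contribution_score=0.0,
    failure_attribution="example-attribution",
    evidence="timeout after 30s",
    failure_category="timeout",
    token_usage=1842,
)
```

Because every added field defaults to `None`, existing call sites that build a contribution from score, attribution, and evidence alone would keep working under this shape.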

src/synthorg/engine/coordination/service.py

coderabbitai · 2026-04-08T06:55:31Z

src/synthorg/tools/context/compact_context.py

+_COMPACT_CONTEXT_SCHEMA: dict[str, Any] = {
+    "type": "object",
+    "properties": {
+        "strategy": {
+            "type": "string",
+            "enum": ["summarize"],
+            "description": (
+                "Compaction strategy. Currently only 'summarize' is supported."
+            ),
+        },
+        "preserve_markers": {
+            "type": "boolean",
+            "default": True,
+            "description": (
+                "Whether to preserve epistemic markers (wait, hmm, "
+                "actually, etc.) in the compaction summary."
+            ),
+        },
+        "reason": {
+            "type": "string",
+            "minLength": 10,
+            "maxLength": 256,
+            "description": (
+                "Brief explanation for why compaction is needed "
+                "now (e.g., 'context fill at 92 percent, need to "
+                "preserve reasoning clarity')."
+            ),
+        },
+    },
+    "required": ["strategy", "reason"],
+    "additionalProperties": False,


⚠️ Potential issue | 🟠 Major

The compact_context schema cannot carry an agent-authored summary.

The linked #1125 objective says agents should be able to provide summary content that replaces or supplements automatic compaction. This schema only accepts strategy, preserve_markers, and reason, so that path is impossible through the public tool boundary. Please either add the missing summary field(s) and wiring, or explicitly update the spec before merge.

Based on learnings, "If implementation deviates from the design spec (better approach found, scope evolved, etc.), alert the user and explain why -- user decides whether to proceed or update the spec."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/synthorg/tools/context/compact_context.py` around lines 23 - 53, The schema _COMPACT_CONTEXT_SCHEMA currently lacks a field for agent-provided summaries; add a new optional property (e.g., "agent_summary" or "summary") to the schema with type "string" and sensible minLength/maxLength (and a description stating it can replace or supplement auto compaction), keep "required" unchanged (do not force it), and ensure additionalProperties remains False; then wire this field through the public tool boundary by accepting it where the compact_context tool input is parsed/validated so the agent-provided summary is passed into the compaction logic (update any validation/parse code that references _COMPACT_CONTEXT_SCHEMA to include and forward "agent_summary").
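
One possible shape for the change requested above is sketched below: the same schema with an optional `summary` property, plus a small hand-rolled check so the example runs without a JSON Schema library. The property name `summary`, its length bounds, and the `check_arguments` helper are all assumptions made for illustration, not part of the PR.

```python
from typing import Any

# Sketch only: "summary" and its bounds are assumptions, not the PR's schema.
COMPACT_CONTEXT_SCHEMA_SKETCH: dict[str, Any] = {
    "type": "object",
    "properties": {
        "strategy": {"type": "string", "enum": ["summarize"]},
        "preserve_markers": {"type": "boolean", "default": True},
        "reason": {"type": "string", "minLength": 10, "maxLength": 256},
        "summary": {
            "type": "string",
            "minLength": 1,
            "maxLength": 4000,
            "description": (
                "Optional agent-authored summary that replaces or "
                "supplements automatic compaction."
            ),
        },
    },
    # "summary" stays optional, so existing calls remain valid.
    "required": ["strategy", "reason"],
    "additionalProperties": False,
}


def check_arguments(args: dict[str, Any]) -> list[str]:
    """Check only the schema keywords used above; not a full validator."""
    schema = COMPACT_CONTEXT_SCHEMA_SKETCH
    props = schema["properties"]
    errors = [f"missing: {k}" for k in schema["required"] if k not in args]
    for key, value in args.items():
        spec = props.get(key)
        if spec is None:
            errors.append(f"unexpected: {key}")
            continue
        expected = str if spec["type"] == "string" else bool
        if not isinstance(value, expected):
            errors.append(f"wrong type: {key}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"not allowed: {key}")
        if isinstance(value, str):
            if len(value) < spec.get("minLength", 0):
                errors.append(f"too short: {key}")
            if len(value) > spec.get("maxLength", len(value)):
                errors.append(f"too long: {key}")
    return errors


ok = check_arguments(
    {
        "strategy": "summarize",
        "reason": "context fill at 92 percent",
        "summary": "Chose approach B after ruling out A.",
    }
)
bad = check_arguments(
    {"strategy": "summarize", "reason": "too short", "notes": "x"}
)
```

Here `ok` is an empty list, while `bad` reports the nine-character `reason` and the unknown `notes` key, which `additionalProperties: False` forbids.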

tests/integration/tools/test_factory_integration.py

tests/unit/engine/compaction/test_epistemic.py

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (1)

src/synthorg/tools/context/compact_context.py (1)
26-57: ⚠️ Potential issue | 🟠 Major

Schema lacks field for agent-provided summaries.

Per #1125 objectives, agents should be able to provide summaries that "replace or supplement automatic compaction." The current schema only accepts strategy, preserve_markers, and reason—there's no path to supply actual summary content through the tool boundary.

If this is intentional (e.g., deferred to a follow-up), please document that decision. Otherwise, consider adding an optional summary field.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/context/compact_context.py` around lines 26 - 57, The
schema defined in _RAW_SCHEMA does not allow agent-provided summaries; add an
optional "summary" property to the "properties" map of _RAW_SCHEMA (type:
"string", appropriate minLength/maxLength to match expected summary size, and a
clear description stating it can "replace or supplement automatic compaction")
and ensure it is not included in "required" so tools can optionally submit
summaries; update any validation/consumer code that reads _RAW_SCHEMA to accept
and forward the "summary" field where compaction logic (summarize strategy)
consumes agent-provided content.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/design/engine.md`:
- Around line 1465-1468: The docs claim record_coordination_contributions() is
guarded by an asyncio.Lock, but the method is synchronous and never acquires
self._metrics_lock. Make the implementation match the docs: convert
record_coordination_contributions() to an async def and wrap the write logic in
async with self._metrics_lock (or, if the method must stay synchronous, use a
separate threading.Lock instead), then update all callers to await
record_coordination_contributions() so the documented lock (self._metrics_lock)
is actually used.

In `@src/synthorg/engine/compaction/summarizer.py`:
- Around line 325-345: The docstring for force_compaction is misleading about
the mechanism used to bypass the fill threshold; update the docstring to state
that the function calls _do_compaction with force=True which causes
_do_compaction to skip the fill-threshold comparison entirely (rather than
"temporarily using a zero-percent threshold"), and keep the existing description
of purpose and return values intact so readers understand that force_compaction
delegates to _do_compaction with a force flag to bypass threshold checks.

In `@src/synthorg/engine/coordination/service.py`:
- Around line 97-104: The factory that constructs MultiAgentCoordinator is not
passing the performance_tracker through, leaving
MultiAgentCoordinator._performance_tracker None; update the factory call that
instantiates MultiAgentCoordinator (the line that calls
MultiAgentCoordinator(...)) to forward the performance_tracker argument (either
pass performance_tracker as the corresponding positional parameter or include
performance_tracker=performance_tracker in the constructor call) so
MultiAgentCoordinator.__init__ receives and stores the tracker.

In `@tests/unit/engine/compaction/test_epistemic.py`:
- Around line 171-176: The test test_respects_max_chars_default is too
permissive (assert len(result) <= 210) and should assert the exact truncation
behavior: call extract_marker_sentences with the 306-char single-sentence input
and assert the returned string length equals 200 (since implementation uses
stripped[:max_chars]); update the assertion to assert len(result) == 200 and
adjust the comment to state exact truncation to 200 characters for the
single-sentence case.
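
The tightened assertion might look like the sketch below. The `extract_marker_sentences` body is a stand-in that keeps only the `stripped[:max_chars]` truncation mentioned in the comment; the input sentence is invented to reach 306 characters.

```python
def extract_marker_sentences(text: str, max_chars: int = 200) -> str:
    # Stand-in: the real function detects epistemic markers first;
    # only the truncation step is modeled here.
    stripped = text.strip()
    return stripped[:max_chars]


def test_respects_max_chars_default() -> None:
    # A single 306-character sentence containing a marker phrase.
    text = "I am not sure " + "x" * 291 + "."
    assert len(text) == 306

    result = extract_marker_sentences(text)

    # Exact truncation to 200 characters for the single-sentence case.
    assert len(result) == 200
```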

---

Duplicate comments:
In `@src/synthorg/tools/context/compact_context.py`:
- Around line 26-57: The schema defined in _RAW_SCHEMA does not allow
agent-provided summaries; add an optional "summary" property to the "properties"
map of _RAW_SCHEMA (type: "string", appropriate minLength/maxLength to match
expected summary size, and a clear description stating it can "replace or
supplement automatic compaction") and ensure it is not included in "required" so
tools can optionally submit summaries; update any validation/consumer code that
reads _RAW_SCHEMA to accept and forward the "summary" field where compaction
logic (summarize strategy) consumes agent-provided content.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 94a08610-8ec9-40a2-92cd-7ab4979482bd

📥 Commits

Reviewing files that changed from the base of the PR and between dfdf697 and 2ac69ca.

📒 Files selected for processing (9)

docs/design/engine.md
scripts/run_affected_tests.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/attribution.py
src/synthorg/engine/coordination/service.py
src/synthorg/tools/context/compact_context.py
tests/integration/tools/test_factory_integration.py
tests/unit/engine/compaction/test_epistemic.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Build Backend
GitHub Check: Build Web
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (5)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use Python 3.14+ and do NOT use from __future__ import annotations (PEP 649 has native lazy annotations)

Use PEP 758 except syntax: use except A, B: (no parentheses) instead of except (A, B):

Provide type hints on all public functions and classes; enforce mypy strict mode

Use Google-style docstrings on all public classes and functions (enforced by ruff D rules)

Create new objects instead of mutating existing ones (immutability). For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
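
A small sketch of the deepcopy-plus-proxy pattern this guideline describes. `ToolRegistry` and its fields are hypothetical and not taken from the codebase.

```python
import copy
from collections.abc import Mapping
from types import MappingProxyType
from typing import Any


class ToolRegistry:
    """Hypothetical registry illustrating read-only enforcement."""

    def __init__(self, tools: Mapping[str, dict[str, Any]]) -> None:
        # deepcopy detaches the registry from the caller's objects;
        # MappingProxyType blocks item assignment on the top-level mapping.
        self._tools = MappingProxyType(copy.deepcopy(dict(tools)))

    def get(self, name: str) -> dict[str, Any]:
        return self._tools[name]
```

The proxy is only a shallow guard: nested dicts remain mutable, which is why the copy at construction still matters.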

Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Always use allow_inf_nan=False in all ConfigDict declarations. Use @computed_field for derived values instead of storing redundant fields. Use NotBlankStr from core.types for all identifier/name fields.

Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.

Enforce 88-character line length (ruff)

Keep functions under 50 lines and files under 800 lines

Handle errors explicitly; never silently swallow exceptions

Validate at system boundaries: user input, external APIs, and config files

Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Vendor names may only appear in: (1) Operations design page (docs/design/operations.md), (2) .claude/ files, (3) third-party import paths, (4) provider presets (`src/synt...

Files:

tests/integration/tools/test_factory_integration.py
scripts/run_affected_tests.py
src/synthorg/tools/context/compact_context.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py
tests/unit/engine/compaction/test_epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/attribution.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow. Minimum coverage 80% (enforced in CI). Use asyncio_mode = "auto" (no manual @pytest.mark.asyncio needed). Global 30-second timeout per test (do not add per-file markers except for non-default overrides like timeout(60)).

ALWAYS include -n 8 when running pytest locally for parallelism with pytest-xdist; never run tests sequentially

Prefer @pytest.mark.parametrize for testing similar cases

Use Hypothesis for property-based testing with @given + @settings. Control via HYPOTHESIS_PROFILE env var: ci (10 examples, deterministic), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). When Hypothesis finds a failure, fix the bug and add an explicit @example(...) decorator to permanently cover the case in CI.

NEVER skip, dismiss, or ignore flaky tests. Fix them fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).
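
The `asyncio.Event().wait()` idiom for tasks that must block until cancelled can be sketched as follows. The function names are illustrative only.

```python
import asyncio


async def block_until_cancelled() -> None:
    # The event is never set, so this waits until the task is cancelled,
    # with no arbitrary large sleep duration involved.
    await asyncio.Event().wait()


async def main() -> bool:
    task = asyncio.create_task(block_until_cancelled())
    await asyncio.sleep(0)  # let the task start waiting
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        return True
    return False
```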

Files:

tests/integration/tools/test_factory_integration.py
tests/unit/engine/compaction/test_epistemic.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

tests/integration/tools/test_factory_integration.py
tests/unit/engine/compaction/test_epistemic.py

**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

ALWAYS read the relevant docs/design/ page before implementing any feature or planning any issue. The design spec is the starting point for architecture, data models, and behavior. If implementation deviates from the spec, alert the user and explain why -- user decides whether to proceed or update the spec. Do NOT silently diverge.

Files:

docs/design/engine.md

src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every module with business logic MUST have logging via from synthorg.observability import get_logger then logger = get_logger(__name__). Never use import logging, logging.getLogger(), or print() in application code.

Use event name constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT

Use structured logging with kwargs: always logger.info(EVENT, key=value) -- never logger.info("msg %s", val)

Log all error paths at WARNING or ERROR with context before raising; log all state transitions at INFO; log object creation, internal flow, and key function entry/exit at DEBUG

All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.

Set RetryConfig and RateLimiterConfig per-provider in ProviderConfig. Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately. RetryExhaustedError signals all retries failed.

For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).

Files:

src/synthorg/tools/context/compact_context.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/attribution.py

src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

src/synthorg/tools/context/compact_context.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/attribution.py

🧠 Learnings (41)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Every implementation plan must be presented to the user for accept/deny before coding starts

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: At every phase of planning and implementation, be critical -- actively look for ways to improve the design in the spirit of what we're building (robustness, correctness, simplicity, future-proofing where it's free). Surface improvements as suggestions, not silent changes -- user decides.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Prioritize issues by dependency order, not priority labels -- unblocked dependencies come first

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Never use `cd` in Bash commands -- the working directory is already set to the project root. Use absolute paths or run commands directly. Exception: `bash -c "cd <dir> && <cmd>"` is safe for tools without a `-C` flag.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Never use Bash to write or modify files -- use the Write or Edit tools. Do not use `cat >`, `cat << EOF`, `echo >`, `echo >>`, `sed -i`, `python -c`, or `tee` to create/modify files (read-only/inspection uses are fine).

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Use commit format: `<type>: <description>` with types: feat, fix, refactor, docs, test, chore, perf, ci (enforced by commitizen). All commits on main must be GPG/SSH signed. Use branch naming: `<type>/<slug>` from main.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: Use squash merge strategy for PRs -- PR body becomes the squash commit message on main. Preserve existing `Closes #NNN` references in PR body; never remove unless explicitly asked.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: After finishing an issue implementation: always create a feature branch (`<type>/<slug>`), commit, and push -- do NOT create a PR automatically. Do NOT leave work uncommitted on main.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: NEVER create a PR directly with `gh pr create` (blocked by hookify). ALWAYS use `/pre-pr-review` to create PRs -- it runs automated checks + review agents + fixes before creating the PR. For trivial/docs-only changes, use `/pre-pr-review quick`. After PR exists, use `/aurelio-review-pr` for external feedback. Fix all valid review findings -- never skip.

Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-08T08:15:26.656Z
Learning: See [docs/reference/claude-reference.md](docs/reference/claude-reference.md) for: Documentation layout, Docker commands, Package Structure, Releasing, CI pipelines, Dependencies

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Parallelism: pytest-xdist via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-04-02T18:48:02.489Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T18:48:02.489Z
Learning: ALWAYS run pytest with `-n auto` for parallel execution with pytest-xdist; never run tests sequentially

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: ALWAYS include `-n 8` when running pytest locally; never run tests sequentially; CI uses `-n auto`

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Pre-push hooks check: mypy type-check (affected modules only) + pytest unit tests (affected modules only) + golangci-lint + go vet + go test (CLI, conditional on `cli/**/*.go`) + eslint-web (web dashboard); foundational module changes (core, config, observability) or conftest changes trigger full runs

Applied to files:

scripts/run_affected_tests.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

docs/design/engine.md
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py

📚 Learning: 2026-03-17T06:30:14.180Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/engine/coordination/**/*.py : Task coordination uses multi-agent pipeline with 4 dispatchers (SAS/centralized/decentralized/context-dependent), wave execution, and workspace lifecycle integration.

Applied to files:

docs/design/engine.md
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py

📚 Learning: 2026-03-19T07:12:14.508Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Applied to files:

docs/design/engine.md
src/synthorg/engine/coordination/service.py
src/synthorg/engine/coordination/attribution.py

📚 Learning: 2026-03-19T07:12:14.508Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

docs/design/engine.md
src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-04-07T22:12:30.141Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to **/*.py : Create new objects for immutability, never mutate existing ones; for non-Pydantic internal collections use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-04-01T09:09:43.948Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:09:43.948Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement in non-Pydantic internal collections (registries, BaseTool)

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization) for `dict`/`list` fields

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-04-07T22:12:30.141Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-04-07T22:12:30.141Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.141Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves; never mix static config fields with mutable runtime fields in one model

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-20T21:44:04.528Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-16T07:22:28.134Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-17T06:43:14.114Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-31T20:07:03.035Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T20:07:03.035Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising; all state transitions must log at INFO; DEBUG for object creation, internal flow, entry/exit of key functions

Applied to files:

src/synthorg/tools/context/compact_context.py

📚 Learning: 2026-03-17T22:08:13.456Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-14T16:18:57.267Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-16T06:24:56.341Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-15T18:38:44.202Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py

📚 Learning: 2026-03-19T11:33:01.580Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py

📚 Learning: 2026-03-19T07:13:44.964Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)

Applied to files:

src/synthorg/engine/coordination/service.py

📚 Learning: 2026-03-15T19:14:27.144Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py

📚 Learning: 2026-03-20T11:18:48.128Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

src/synthorg/engine/coordination/service.py
src/synthorg/engine/compaction/epistemic.py

📚 Learning: 2026-03-16T06:24:56.341Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

src/synthorg/engine/compaction/epistemic.py

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

src/synthorg/engine/compaction/epistemic.py

📚 Learning: 2026-03-17T06:43:14.114Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

src/synthorg/engine/compaction/epistemic.py

📚 Learning: 2026-04-02T09:04:54.709Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T09:04:54.709Z
Learning: Applies to tests/**/*.py : Use pytest markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`

Applied to files:

tests/unit/engine/compaction/test_epistemic.py

📚 Learning: 2026-03-15T18:28:13.207Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).

Applied to files:

tests/unit/engine/compaction/test_epistemic.py

📚 Learning: 2026-04-07T22:12:30.142Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to tests/**/*.py : Use markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow` for test classification

Applied to files:

tests/unit/engine/compaction/test_epistemic.py

🔇 Additional comments (21)

scripts/run_affected_tests.py (2)
151-160: Good documentation of xdist behavior.

The expanded docstring clearly explains the rationale for both --dist loadscope and --max-worker-restart=0, which will help future maintainers understand why these options diverge from the default pytest configuration.

170-172: ⚠️ Potential issue | 🟡 Minor

Conflicting --dist modes between script and pyproject.toml requires clarification.

The script adds --dist loadscope (line 171), but pyproject.toml specifies --dist=worksteal in addopts (line 256). When pytest processes both, the script's flag takes precedence, creating an inconsistency:

This script runs tests with loadscope (module-based grouping)

Normal pytest runs (e.g., pytest tests/unit/) use worksteal (dynamic work stealing)

If loadscope is required to resolve the fixture teardown/KeyError issues described in the docstring, update pyproject.toml to use loadscope globally for consistency. Alternatively, add an inline comment explaining why this script intentionally overrides the global setting.

Also, -n 8 is redundant since pyproject.toml already specifies it in addopts.
⛔ Skipped due to learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Pre-push hooks check: mypy type-check (affected modules only) + pytest unit tests (affected modules only) + golangci-lint + go vet + go test (CLI, conditional on `cli/**/*.go`) + eslint-web (web dashboard); foundational module changes (core, config, observability) or conftest changes trigger full runs
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: ALWAYS include `-n 8` when running pytest locally; never run tests sequentially; CI uses `-n auto`
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Parallelism: pytest-xdist via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T18:48:02.489Z
Learning: ALWAYS run pytest with `-n auto` for parallel execution with pytest-xdist; never run tests sequentially
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T22:12:30.142Z
Learning: Applies to tests/**/*.py : 30-second timeout per test (global in `pyproject.toml`); do not add per-file `pytest.mark.timeout(30)` markers; non-default overrides like `timeout(60)` are allowed
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
src/synthorg/engine/coordination/attribution.py (5)

72-103: AgentContribution is missing fields specified in #1123 objectives.

Per the PR objectives for #1123: "Populate AgentContribution with agent_id, subtask_id, outcome (success/partial/failure), failure_category (via infer_failure_category()), and token_usage."

The current model has:

✅ agent_id, subtask_id

❌ outcome enum (success/partial/failure) — replaced by contribution_score

❌ failure_category (the FailureCategory enum) — replaced by failure_attribution (a different Literal type)

❌ token_usage

This deviation means downstream consumers (like PerformanceTracker) cannot access the original FailureCategory enum or token metrics for trend analysis. If this narrowing was intentional, please update the #1123 objectives to reflect the actual design.
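
For reference, a minimal sketch of a model shape that would carry the fields listed in the #1123 objectives. The field names come from the objective text quoted above; everything else is an assumption: the `Outcome` enum, the `FailureCategory` members, the `TokenUsage` shape, and the use of stdlib dataclasses as a stand-in for the project's frozen Pydantic model.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Outcome(str, Enum):
    """Hypothetical outcome enum (success/partial/failure per #1123)."""

    SUCCESS = "success"
    PARTIAL = "partial"
    FAILURE = "failure"


class FailureCategory(str, Enum):
    """Placeholder members; the real enum lives in the repository."""

    TIMEOUT = "timeout"
    TOOL_ERROR = "tool_error"


@dataclass(frozen=True)
class TokenUsage:
    """Hypothetical token counters."""

    input_tokens: int = 0
    output_tokens: int = 0


@dataclass(frozen=True)
class AgentContributionSketch:
    """Stand-in for AgentContribution with the #1123 fields."""

    agent_id: str
    subtask_id: str
    outcome: Outcome
    failure_category: FailureCategory | None = None
    token_usage: TokenUsage = field(default_factory=TokenUsage)

    def __post_init__(self) -> None:
        # A successful contribution should not carry a failure category.
        if self.outcome is Outcome.SUCCESS and self.failure_category is not None:
            raise ValueError("failure_category must be None on success")
```

With this shape a partial outcome stays distinguishable from a failure, which a 0.0/1.0 score alone cannot express.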

226-295: LGTM!

The _score_outcome helper correctly:

Handles pre-execution failures via infer_failure_category with evidence extraction

Returns perfect score (1.0) for successful outcomes without failure attribution

Maps termination reasons to failure attributions for non-success completions

Provides a defensive fallback for unexpected missing result/error states

The or None at line 285 correctly converts empty strings to None, satisfying the model's optional evidence field.

161-223: LGTM!

The build_agent_contributions function correctly:

Builds lookups from routing decisions with both task_id and agent_id mappings

Prefers direct task_id matching (line 199-200) over agent lookup for multi-subtask agent scenarios

Falls back gracefully to agent-based lookup when task_id isn't in routed subtasks

Logs attribution metrics at INFO level with structured kwargs
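
The lookup order described above (direct task_id match first, agent-based fallback second) can be sketched roughly as follows. `resolve_subtask_id` and its parameters are hypothetical names for illustration; the actual code in build_agent_contributions is not shown in this review and may differ.

```python
from __future__ import annotations


def resolve_subtask_id(
    outcome_task_id: str,
    outcome_agent_id: str,
    routed_task_ids: set[str],
    agent_to_subtasks: dict[str, list[str]],
) -> str:
    """Pick the subtask an outcome belongs to (hypothetical helper)."""
    # Preferred path: the outcome's task_id is one of the routed subtasks.
    # This stays correct when one agent handles several subtasks.
    if outcome_task_id in routed_task_ids:
        return outcome_task_id
    # Fallback: use the first subtask routed to this agent, if any.
    subtasks = agent_to_subtasks.get(outcome_agent_id, [])
    return subtasks[0] if subtasks else outcome_task_id
```

Mapping each agent to a list of subtasks, rather than a single value, avoids the overwrite problem for multi-subtask agents.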

62-69: LGTM!

The _TERMINATION_TO_ATTRIBUTION mapping correctly omits TerminationReason.COMPLETED since successful outcomes (is_success=True) return early at line 261-266 and never reach the termination-to-attribution lookup. The .get(termination_reason, "direct") fallback at line 274 provides defensive handling.
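
A rough sketch of the early-return plus fallback pattern described here. Only COMPLETED, PARKED, and the "direct" fallback are taken from this review; the other enum members and attribution labels are invented for illustration.

```python
from __future__ import annotations

from enum import Enum


class TerminationReason(Enum):
    COMPLETED = "completed"
    PARKED = "parked"
    MAX_TURNS = "max_turns"  # hypothetical member
    ERROR = "error"  # hypothetical member


# COMPLETED is deliberately absent: successes return before the lookup.
_TERMINATION_TO_ATTRIBUTION: dict[TerminationReason, str] = {
    TerminationReason.PARKED: "coordination_overhead",  # hypothetical label
    TerminationReason.MAX_TURNS: "quality_gate",  # hypothetical label
}


def score_outcome(
    is_success: bool,
    termination_reason: TerminationReason,
) -> tuple[float, str | None]:
    """Return (contribution_score, failure_attribution)."""
    if is_success:
        # Perfect score and no failure attribution for successes.
        return 1.0, None
    # Unmapped reasons fall back to "direct" attribution.
    return 0.0, _TERMINATION_TO_ATTRIBUTION.get(termination_reason, "direct")
```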

120-158: LGTM!

The CoordinationResultWithAttribution wrapper correctly:

Uses frozen Pydantic model with allow_inf_nan=False per coding guidelines

Preserves the original CoordinationResult immutably

Provides computed is_success that delegates to the wrapped result

Calculates avg_contribution_score with safe empty-tuple handling
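
The empty-tuple handling can be illustrated with a small standalone function. Returning 0.0 for an empty tuple is an assumption; the review only says the empty case is handled safely.

```python
def avg_contribution_score(scores: tuple[float, ...]) -> float:
    """Average of contribution scores, guarding against division by zero."""
    if not scores:
        # Assumed default when no contributions were recorded.
        return 0.0
    return sum(scores) / len(scores)
```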

docs/design/engine.md (2)

1449-1450: LGTM!

The section header was appropriately updated to remove the "Phase 8" terminology that conflicted with the 7-phase pipeline, now correctly positioned as a post-pipeline step.

1490-1490: LGTM!

The ACG vocabulary cross-reference correctly reflects the new NodeType enum and TurnRecord.node_types field implementation, upgrading the fidelity from "Partial" to "Strong."

src/synthorg/engine/coordination/service.py (2)

212-248: LGTM!

The post-pipeline attribution building and tracker recording are properly guarded with try/except blocks that:

Re-raise MemoryError and RecursionError per project convention

Log at WARNING with context before continuing

Default contributions to an empty tuple on error, preserving the successful coordination result

This addresses the past review concerns about attribution/tracker failures failing an already-completed coordination run.
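
The guard pattern described above looks roughly like this. `build_contributions_safely` is a hypothetical name, and stdlib `logging` is used only to keep the snippet self-contained; the project convention is `get_logger` from `synthorg.observability`.

```python
import logging
from collections.abc import Callable, Iterable
from typing import Any

logger = logging.getLogger(__name__)  # stand-in for the project's get_logger


def build_contributions_safely(
    build: Callable[[Any], Iterable[Any]],
    result: Any,
) -> tuple[Any, ...]:
    """Run a post-pipeline step without failing a completed run."""
    try:
        return tuple(build(result))
    except (MemoryError, RecursionError):
        # Never swallow these; re-raise per project convention.
        raise
    except Exception:
        # Log with context and continue with an empty tuple.
        logger.warning("attribution build failed", exc_info=True)
        return ()
```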

106-131: LGTM!

The method signature and docstring correctly document the return type change to CoordinationResultWithAttribution and the new attribution phase (Phase 8) in the pipeline description.

tests/integration/tools/test_factory_integration.py (1)

105-110: LGTM!

The test now combines count verification with specific tool name assertions, which addresses maintainability concerns. The checks for compact_context, read_file, and write_file ensure critical tools are registered.

src/synthorg/tools/context/compact_context.py (1)

73-121: LGTM!

The tool implementation correctly:

Deep-copies the schema for the parameters to prevent mutation

Sanitizes the reason before logging (addressing privacy concerns)

Returns a directive via metadata for the execution loop to process

src/synthorg/engine/compaction/epistemic.py (2)

1-70: LGTM!

The epistemic marker detection implementation is clean:

Precompiled patterns with re.IGNORECASE for case-insensitive matching

count_epistemic_markers correctly counts distinct pattern groups (not raw matches)

Good reference to the arXiv paper justifying the approach
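
Counting distinct groups rather than raw matches can be sketched as below. The group names and regexes are invented for illustration; the real patterns in epistemic.py are not shown in this review.

```python
import re

# Hypothetical marker groups, each precompiled once at import time.
_MARKER_GROUPS: dict[str, re.Pattern[str]] = {
    "hedging": re.compile(r"\b(?:might|perhaps|possibly|probably)\b", re.IGNORECASE),
    "uncertainty": re.compile(r"\b(?:not sure|unclear|uncertain)\b", re.IGNORECASE),
    "reconsideration": re.compile(r"\b(?:wait|actually)\b", re.IGNORECASE),
    "verification": re.compile(r"\b(?:let me check|double-check)\b", re.IGNORECASE),
    "contrast": re.compile(r"\b(?:however|but then)\b", re.IGNORECASE),
}


def count_epistemic_markers(text: str) -> int:
    """Count how many distinct marker groups appear in the text."""
    # A group counts once no matter how many times it matches.
    return sum(1 for pattern in _MARKER_GROUPS.values() if pattern.search(text))
```

Two hedging words in one message therefore count as one marker group, not two.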

73-138: LGTM!

The preservation logic and sentence extraction are well-implemented:

Complexity-adaptive thresholds match the #1125 spec (COMPLEX/EPIC: 1+, SIMPLE/MEDIUM: 3+)

Separator accounting correctly handles the first sentence (no preceding separator)

Edge case for first-sentence truncation ensures non-empty output when needed
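
The threshold rule and the separator accounting can be sketched together. The thresholds (COMPLEX/EPIC: 1+, SIMPLE/MEDIUM: 3+) and the 200-character default come from this review; the function names, the single-space separator, and the assumption that sentences arrive pre-filtered are illustrative only.

```python
from __future__ import annotations

from enum import Enum


class Complexity(Enum):
    SIMPLE = "simple"
    MEDIUM = "medium"
    COMPLEX = "complex"
    EPIC = "epic"


def should_preserve(marker_count: int, complexity: Complexity) -> bool:
    """Apply the complexity-adaptive threshold."""
    threshold = 1 if complexity in (Complexity.COMPLEX, Complexity.EPIC) else 3
    return marker_count >= threshold


def join_marker_sentences(
    sentences: list[str],
    max_chars: int = 200,
    sep: str = " ",
) -> str:
    """Join marker sentences without exceeding max_chars."""
    kept: list[str] = []
    used = 0
    for sentence in sentences:
        # The first sentence has no preceding separator to pay for.
        cost = len(sentence) + (len(sep) if kept else 0)
        if used + cost > max_chars:
            if not kept:
                # Truncate an oversized first sentence instead of returning "".
                kept.append(sentence[:max_chars])
            break
        kept.append(sentence)
        used += cost
    return sep.join(kept)
```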

tests/unit/engine/compaction/test_epistemic.py (2)

13-104: LGTM!

Comprehensive test coverage for marker detection:

Tests all five marker groups individually and in combination

Verifies deduplication (same group counted once)

Tests complexity-adaptive thresholds for all Complexity variants

Confirms "no markers → never preserve" behavior

144-154: LGTM!

This test now properly exercises the first-sentence truncation path by using a single long marker sentence that exceeds max_chars, and asserts the exact truncated output.

src/synthorg/engine/compaction/summarizer.py (4)

90-97: LGTM!

The threshold logic correctly implements dual-threshold behavior per #1125:

When agent_controlled=True: uses safety_threshold_percent as the safety net

Otherwise: uses fill_threshold_percent for automatic compaction

The force=True path bypasses the check entirely for agent-initiated requests
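
A compact sketch of the dual-threshold decision. The field names follow the CompactionConfig fields named in this PR; the default percentages and the `should_compact` helper are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionConfigSketch:
    """Stand-in for CompactionConfig; default values are assumed."""

    fill_threshold_percent: float = 80.0
    safety_threshold_percent: float = 95.0
    agent_controlled: bool = False


def should_compact(
    fill_percent: float,
    config: CompactionConfigSketch,
    *,
    force: bool = False,
) -> bool:
    """Decide whether compaction should run for the current fill level."""
    if force:
        # Agent-initiated requests bypass the threshold check entirely.
        return True
    threshold = (
        config.safety_threshold_percent
        if config.agent_controlled
        else config.fill_threshold_percent
    )
    return fill_percent >= threshold
```

In agent-controlled mode the automatic path only fires at the higher safety threshold, leaving the agent room to compact on its own terms first.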

235-244: LGTM!

Defensive extraction with sensible defaults:

Chains through getattr to safely navigate context structure

Defaults to Complexity.COMPLEX when complexity is unavailable, which is the conservative choice (preserves more markers at the lower threshold)
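
The chained getattr extraction might look like the following. The attribute names `task_execution`, `task`, and `estimated_complexity` are guesses at the context structure, not confirmed by this review.

```python
from enum import Enum
from types import SimpleNamespace


class Complexity(Enum):
    SIMPLE = "simple"
    MEDIUM = "medium"
    COMPLEX = "complex"
    EPIC = "epic"


def resolve_complexity(context: object) -> Complexity:
    """Walk the context defensively; fall back to COMPLEX."""
    # getattr on None simply returns the default, so the chain never raises.
    task_execution = getattr(context, "task_execution", None)
    task = getattr(task_execution, "task", None)
    complexity = getattr(task, "estimated_complexity", None)
    # COMPLEX uses the lower marker threshold, so more markers are kept.
    return complexity if isinstance(complexity, Complexity) else Complexity.COMPLEX


# Example context built from plain namespaces (illustrative only).
example_context = SimpleNamespace(
    task_execution=SimpleNamespace(
        task=SimpleNamespace(estimated_complexity=Complexity.SIMPLE),
    ),
)
```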

280-298: LGTM!

The epistemic marker preservation integrates cleanly:

Checks preservation decision via should_preserve_message with task complexity

Extracts marker sentences (already length-limited to 200 chars by default)

Sanitizes content without re-truncating via max_length=max(len(marker_text), 1)

Falls back to standard 100-char snippet when no markers are worth preserving

314-321: LGTM!

Singular/plural handling is now correct: "1 message" vs "N messages".

docs/design/engine.md

src/synthorg/engine/compaction/summarizer.py

src/synthorg/engine/coordination/service.py

tests/unit/engine/compaction/test_epistemic.py

…ing eval

#1123: Execution trace enrichment
- Add NodeType enum (LLM_CALL, TOOL_INVOCATION, QUALITY_CHECK, BUDGET_CHECK, STAGNATION_CHECK) to loop_protocol.py
- Add node_types field to TurnRecord (auto-derived in make_turn_record)
- Add AgentContribution + CoordinationResultWithAttribution wrapper in coordination/attribution.py with build_agent_contributions() factory
- Update CoordinationService.coordinate() to return attributed result
- Add record_coordination_contributions() to PerformanceTracker
- Update all callers (agent_engine, API controller, tests)

#1125: Agent-controlled compaction + epistemic marker preservation
- Add epistemic marker detection (compaction/epistemic.py) with complexity-adaptive thresholds (COMPLEX/EPIC: 1+, SIMPLE/MEDIUM: 3+)
- Rewrite _build_summary() to preserve marker-containing sentences
- Add CompactContextTool (tools/context/compact_context.py) for agent-initiated compaction via metadata directive
- Add CompactionConfig fields: agent_controlled, safety_threshold_percent, preserve_epistemic_markers with dual-threshold behavior
- Add force_compaction() that bypasses threshold check
- Register compact_context in build_default_tools()

#1113: Versioning infrastructure evaluation
- Evaluate WorkflowDefinitionVersion migration to generic VersionSnapshot[T] (favorable: content hash dedup, concurrent write safety, schema consistency)
- Create follow-up issues: #1131 (migration), #1132 (config versioning), #1133 (role catalog versioning)

Closes #1123
Closes #1125
Closes #1113

… agents

…odeRabbit

Critical:
- Document 8 new context_budget + 15 new coordination event constants in CLAUDE.md
- Add asyncio.Lock to PerformanceTracker for shared mutable dict access

Major:
- Fix epistemic marker extraction: first sentence exceeding max_chars no longer returns empty string; fix off-by-2 separator accounting for first sentence
- Add TerminationReason.PARKED to _TERMINATION_TO_ATTRIBUTION mapping
- Guard tracker write in coordination service with try/except post-completion
- Fix agent-to-subtask mapping overwrite for multi-subtask agents (use list)
- Sanitize reason text in compact_context tool before logging/propagation
- Sanitize marker_text in summarizer before appending to summary
- Use deepcopy() at construction for CompactContextTool schema
- Replace asyncio.gather with TaskGroup for agent resolution in controller
- Add missing currency field to frontend CoordinationResultResponse type
- Rename total_contribution_score to avg_contribution_score (was computing avg)
- Document attribution subsystem in docs/design/engine.md

Medium:
- Remove unreachable length check in epistemic.py
- Fix "from 1 messages" grammar (singular/plural)
- Extract attributed.result into local vars in coordination controller
- Expand CompactionConfig docstring for dual-threshold agent-controlled mode
- Add Protocol types for outcome_result type safety under TYPE_CHECKING
- Update ACG cross-reference: NodeType now provides formal node typing
- Narrow pytest.raises(Exception) to ValidationError in frozen model tests

Minor:
- Clarify whitespace stripping assertions in test_epistemic.py
- Strengthen force_compaction test to assert result is not None
- Refactor force_compaction to delegate to _do_compaction(force=True)

…esult - Bump expected tool count from 14 to 15 in test_factory_integration.py - Wrap mock coordinate result in CoordinationResultWithAttribution

…rashes With the default --dist load, xdist scatters individual tests across workers randomly, causing each worker to repeatedly create and tear down heavy Litestar TestClient apps for different modules. This leads to resource exhaustion and random worker crashes when running the full 15K+ unit suite. --dist loadscope groups tests by module so each worker handles related tests together, dramatically reducing fixture churn. Verified: 15,627 passed, 0 failed (previously 4-8 random failures per run).

- Rename "Per-Agent Attribution (Phase 8)" heading to "Per-Agent Attribution" in design spec (attribution is post-pipeline)
- Add logger to epistemic.py for convention consistency
- Remove duplicate STARTED log from force_compaction, add forced= kwarg to canonical INFO emission in _do_compaction
- Guard build_agent_contributions with try/except in coordinator so attribution failures don't fail completed runs
- Replace round-robin subtask mapping with task_id-based lookup (handles multi-subtask agents and out-of-order outcomes)
- Wrap _COMPACT_CONTEXT_SCHEMA in MappingProxyType for read-only enforcement; deepcopy from raw dict at construction
- Add tool name presence checks to factory integration test
- Add test for first-sentence truncation path in epistemic extraction

- Fix misleading force_compaction docstring (now correctly describes force=True delegation instead of zero-percent threshold)
- Correct design spec: record_coordination_contributions() is sync (no await points, dict ops are atomic in single-threaded event loop); record_task_metric() is the method guarded by asyncio.Lock
- Wire performance_tracker through build_coordinator factory so attribution data is actually recorded (was defaulting to None)
- Tighten test_respects_max_chars_default assertion: single sentence exceeding 200 chars is truncated to exactly 200

The test_evaluator_exception_logs_event test only checked capsys.out but structlog may route to stderr when configured by a preceding test in the same xdist worker. Check combined stdout+stderr output.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

scripts/run_affected_tests.py (1)
150-176: ⚠️ Potential issue | 🟡 Minor

Document the intentional deviation from project defaults in pyproject.toml.

The script overrides --dist from the project default worksteal (in pyproject.toml) to loadscope. While the docstring explains the rationale—preventing worker crashes from fixture teardown/setup overhead and working around a loadscope-specific xdist scheduler bug—add a comment to clarify this is an intentional deviation from the project's default distribution strategy. This helps developers understand why direct pytest calls may behave differently than tests run through this script.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/run_affected_tests.py` around lines 150 - 176, Add a brief comment in
the _run_pytest function (near the cmd list / where "--dist", "loadscope" is
set) stating that using "--dist loadscope" intentionally deviates from the
project's pyproject.toml default of "worksteal" and explaining in one line that
this choice prevents heavy fixture teardown/setup across workers and works
around a known xdist scheduler issue; reference both the cmd variable and
pyproject.toml so future readers understand this is deliberate.
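
A sketch of what the requested comment could look like. `build_pytest_cmd` is a hypothetical stand-in for the command construction inside _run_pytest, whose actual body is not shown here; the flags are the ones named in this review.

```python
import sys


def build_pytest_cmd(test_paths: list[str]) -> list[str]:
    """Hypothetical stand-in for the cmd list built in _run_pytest."""
    return [
        sys.executable,
        "-m",
        "pytest",
        *test_paths,
        "-n",
        "8",
        # Intentional deviation from pyproject.toml addopts (--dist=worksteal):
        # loadscope keeps a module's tests on one worker, which avoids heavy
        # fixture setup/teardown churn across workers. pytest places addopts
        # before the command-line arguments, so this explicit flag wins.
        "--dist",
        "loadscope",
        "--max-worker-restart=0",
    ]
```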

♻️ Duplicate comments (2)

src/synthorg/engine/coordination/attribution.py (1)
72-103: ⚠️ Potential issue | 🟠 Major

AgentContribution still can't represent the #1123 contract.

The model only preserves score/attribution/evidence, and _score_outcome() only emits 0.0 or 1.0, so partial outcomes can never be represented. outcome, failure_category, and token_usage are therefore dropped before PerformanceTracker.record_coordination_contributions() ever sees them.

Based on learnings: If implementation deviates from the design spec, alert the user and explain why -- user decides whether to proceed or update the spec.

Also applies to: 226-286
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/engine/coordination/attribution.py` around lines 72 - 103,
AgentContribution currently drops outcome, failure_category, and token_usage
because its fields only hold contribution_score/failure_attribution/evidence and
_score_outcome() returns only 0.0 or 1.0; update AgentContribution to include
outcome, failure_category, and token_usage fields (types matching the contract),
adjust _score_outcome() to return/emit partial scores (not just binary 0/1) so
partial outcomes are preserved, and ensure
PerformanceTracker.record_coordination_contributions() accepts and forwards
these new fields instead of truncating them; locate and update the
AgentContribution class, the _score_outcome() implementation, and
record_coordination_contributions() to carry the added fields end-to-end.
src/synthorg/tools/context/compact_context.py (1)
26-57: ⚠️ Potential issue | 🟠 Major

Schema still lacks agent-provided summary field per PR objectives.

Per #1125 objectives: "Provide an agent-invokable compact_context tool allowing agents to supply summaries that replace or supplement automatic compaction." The schema only accepts strategy, preserve_markers, and reason—there's no field for agent-authored summary content. This matches a past review comment that was not marked resolved.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/context/compact_context.py` around lines 26 - 57, The JSON
schema _RAW_SCHEMA for the compact_context tool is missing a field for
agent-provided summary content; add an optional string property (e.g.,
"agent_summary") with sensible constraints (minLength and maxLength) and include
it in properties, update any validation logic that uses _RAW_SCHEMA to accept
and prefer "agent_summary" when present (so functions consuming _RAW_SCHEMA such
as the compaction handler will use agent_summary to replace or supplement
automatic summaries), and ensure additionalProperties remains False so only
defined keys ("strategy","preserve_markers","reason","agent_summary") are
allowed.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8690bb50-f080-48d3-82b3-2747a483c437

📥 Commits

Reviewing files that changed from the base of the PR and between 2ac69ca and d8a975c.

📒 Files selected for processing (37)

CLAUDE.md
docs/design/engine.md
scripts/run_affected_tests.py
src/synthorg/api/controllers/coordination.py
src/synthorg/engine/__init__.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/compaction/epistemic.py
src/synthorg/engine/compaction/models.py
src/synthorg/engine/compaction/summarizer.py
src/synthorg/engine/coordination/__init__.py
src/synthorg/engine/coordination/attribution.py
src/synthorg/engine/coordination/factory.py
src/synthorg/engine/coordination/service.py
src/synthorg/engine/loop_helpers.py
src/synthorg/engine/loop_protocol.py
src/synthorg/hr/performance/tracker.py
src/synthorg/observability/events/context_budget.py
src/synthorg/observability/events/coordination.py
src/synthorg/tools/context/__init__.py
src/synthorg/tools/context/compact_context.py
src/synthorg/tools/factory.py
tests/integration/engine/test_coordination_wiring.py
tests/integration/tools/test_factory_integration.py
tests/unit/api/controllers/test_coordination.py
tests/unit/communication/conflict_resolution/test_debate_strategy.py
tests/unit/engine/compaction/test_epistemic.py
tests/unit/engine/compaction/test_models.py
tests/unit/engine/compaction/test_summarizer_markers.py
tests/unit/engine/coordination/test_attribution.py
tests/unit/engine/coordination/test_attribution_factory.py
tests/unit/engine/test_coordination_service.py
tests/unit/tools/context/__init__.py
tests/unit/tools/context/test_compact_context.py
tests/unit/tools/test_factory.py
tests/unit/tools/test_factory_new_categories.py
tests/unit/tools/test_factory_sandbox_wiring.py
web/src/api/types.ts

coderabbitai · 2026-04-08T11:53:51Z

src/synthorg/engine/loop_helpers.py

+    # Auto-derive base node types from response content.
+    derived: list[NodeType] = [NodeType.LLM_CALL]
+    if response.tool_calls:
+        derived.append(NodeType.TOOL_INVOCATION)
+    node_types = tuple(derived) + extra_node_types
+


⚠️ Potential issue | 🟠 Major

TOOL_INVOCATION is recorded before execution is guaranteed.

Line 515 through Line 517 marks NodeType.TOOL_INVOCATION from requested tool calls, but on non-execution paths (e.g., the no-invoker error path that calls clear_last_turn_tool_calls()), node_types is not corrected. This can produce false-positive execution traces.

💡 Proposed fix

 def clear_last_turn_tool_calls(turns: list[TurnRecord]) -> None:
@@
     if turns:
         last = turns[-1]
         turns[-1] = last.model_copy(
-            update={"tool_calls_made": (), "tool_call_fingerprints": ()},
+            update={
+                "tool_calls_made": (),
+                "tool_call_fingerprints": (),
+                "node_types": tuple(
+                    nt for nt in last.node_types if nt != NodeType.TOOL_INVOCATION
+                ),
+            },
         )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/synthorg/engine/loop_helpers.py` around lines 513 - 518, Derived node_types currently includes NodeType.TOOL_INVOCATION as soon as response.tool_calls exists, which can produce false positives when tool calls are later cancelled (e.g., clear_last_turn_tool_calls()); fix by delaying or recomputing TOOL_INVOCATION inclusion based on actual executed/retained tool calls rather than the initial response.tool_calls: update the logic around the derived/node_types computation (the derived list, variable node_types and its creation) so it is calculated after invoker/verification paths, or make clear_last_turn_tool_calls remove or reset the TOOL_INVOCATION flag from node_types; reference the response object, derived list, node_types tuple creation, and the clear_last_turn_tool_calls call to locate and change the code.

coderabbitai · 2026-04-08T11:53:51Z

src/synthorg/tools/context/compact_context.py

+        return ToolExecutionResult(
+            content=("Compaction directive accepted. Will execute at turn boundary."),
+            metadata={
+                "compaction_directive": True,
+                "strategy": strategy,
+                "preserve_markers": preserve_markers,
+                "reason": sanitized_reason,
+            },
+        )


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for code that reads compaction_directive from tool result metadata
rg -n "compaction_directive" --type=py -C 3

Repository: Aureliolo/synthorg

Length of output: 3322

🏁 Script executed:

#!/bin/bash
# Check if force_compaction is called anywhere
rg -n "force_compaction" --type=py

Repository: Aureliolo/synthorg

Length of output: 1121

🏁 Script executed:

#!/bin/bash
# Check the execute_tool_calls function to see if it processes metadata
rg -n "def execute_tool_calls" --type=py -A 50

Repository: Aureliolo/synthorg

Length of output: 3817

🏁 Script executed:

#!/bin/bash
# Get the full execute_tool_calls function
sed -n '254,400p' src/synthorg/engine/loop_helpers.py

Repository: Aureliolo/synthorg

Length of output: 4536

Wire compaction_directive metadata into execution loop.

The tool returns metadata={"compaction_directive": True, ...} but execute_tool_calls() in src/synthorg/engine/loop_helpers.py (lines 317–320) adds tool results to context without inspecting metadata or calling force_compaction(). The directive is emitted but never acted upon. Add logic to read the metadata key and trigger compaction at the appropriate point in the turn boundary.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/synthorg/tools/context/compact_context.py` around lines 113 - 121, The tool emits metadata {"compaction_directive": True, "strategy": strategy, "preserve_markers": preserve_markers, "reason": sanitized_reason} via ToolExecutionResult in compact_context.py, but execute_tool_calls() in src/synthorg/engine/loop_helpers.py currently ignores metadata; update execute_tool_calls() to inspect each tool_result.metadata for the "compaction_directive" key and, when present, call the existing force_compaction(...) helper (or the appropriate compaction entrypoint) with the provided strategy, preserve_markers, and reason at the turn boundary before adding the tool result to context; ensure you handle absent metadata safely (use .get) and only trigger compaction once per directive.
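
One way the metadata check could look, as a sketch only. `ToolResultSketch` stands in for ToolExecutionResult, and `find_compaction_directive` is a hypothetical helper; how the loop then invokes force_compaction() at the turn boundary is left out because its signature is not shown in this review.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ToolResultSketch:
    """Stand-in for ToolExecutionResult (content plus metadata)."""

    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def find_compaction_directive(
    results: list[ToolResultSketch],
) -> dict[str, Any] | None:
    """Return the first compaction directive found, or None."""
    for result in results:
        # Tolerate results that carry no metadata at all.
        metadata = result.metadata or {}
        if metadata.get("compaction_directive"):
            # Stop at the first directive so compaction runs once per turn.
            return {
                "strategy": metadata.get("strategy"),
                "preserve_markers": metadata.get("preserve_markers"),
                "reason": metadata.get("reason"),
            }
    return None
```

The caller would check the return value after all tool results for the turn are collected and trigger compaction only when it is not None.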

Copilot AI review requested due to automatic review settings April 8, 2026 05:09

Aureliolo had a problem deploying to ci April 8, 2026 05:09 — with GitHub Actions Failure

Copilot started reviewing on behalf of Aureliolo April 8, 2026 05:10 View session

Aureliolo temporarily deployed to cloudflare-preview April 8, 2026 05:10 — with GitHub Actions Inactive

gemini-code-assist bot reviewed Apr 8, 2026

Copilot AI reviewed Apr 8, 2026

coderabbitai bot requested changes Apr 8, 2026

Aureliolo temporarily deployed to ci April 8, 2026 06:39 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview April 8, 2026 06:40 — with GitHub Actions Inactive

coderabbitai bot requested changes Apr 8, 2026

Aureliolo temporarily deployed to ci April 8, 2026 08:14 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview April 8, 2026 08:15 — with GitHub Actions Inactive

coderabbitai bot requested changes Apr 8, 2026

docs/design/engine.md (outdated, resolved)

src/synthorg/engine/compaction/summarizer.py (resolved)

src/synthorg/engine/coordination/service.py (resolved)

tests/unit/engine/compaction/test_epistemic.py (outdated, resolved)

Aureliolo added 9 commits April 8, 2026 13:06

fix: add None guard for parameters_schema in compact_context test

518abfc

fix: wire tracker integration + add missing test coverage from review agents

63b3057

fix: update integration tests for compact_context tool + attributed result

b42bbb4

- Bump expected tool count from 14 to 15 in test_factory_integration.py
- Wrap mock coordinate result in CoordinationResultWithAttribution

fix: check both stdout and stderr in debate evaluator log test

d8a975c

The test_evaluator_exception_logs_event test only checked capsys.out but structlog may route to stderr when configured by a preceding test in the same xdist worker. Check combined stdout+stderr output.
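The stream-agnostic check described in this commit message can be sketched as follows. This is a minimal illustration, not the project's test: `emit_log` is a hypothetical stand-in for the structlog-backed logger, and the event name is invented for the example. The only pytest behavior relied on is that `capsys.readouterr()` returns an object with `out` and `err` attributes.

```python
import sys


def combined_output(captured) -> str:
    """Join stdout and stderr from a capsys.readouterr() result."""
    return captured.out + captured.err


def emit_log(event: str, *, to_stderr: bool) -> None:
    """Hypothetical logger stand-in that writes to either stream."""
    stream = sys.stderr if to_stderr else sys.stdout
    print(event, file=stream)


def test_event_is_logged_on_either_stream(capsys):
    # The logger may have been reconfigured by an earlier test in the
    # same worker, so the assertion must not depend on the stream.
    emit_log("evaluator.exception", to_stderr=True)
    output = combined_output(capsys.readouterr())
    assert "evaluator.exception" in output
```

Asserting on the concatenation keeps the test independent of logging configuration left behind by other tests, at the cost of no longer verifying which stream was used.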

Aureliolo force-pushed the feat/engine-intelligence-v2 branch from 2ac69ca to d8a975c Compare April 8, 2026 11:40

Aureliolo temporarily deployed to ci April 8, 2026 11:40 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview April 8, 2026 11:41 — with GitHub Actions Inactive

coderabbitai bot requested changes Apr 8, 2026

fix: clarify loadscope deviation from pyproject.toml worksteal default

c81954c

Aureliolo temporarily deployed to ci April 8, 2026 12:01 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview April 8, 2026 12:02 — with GitHub Actions Inactive

Aureliolo merged commit ed57dfa into main Apr 8, 2026
33 of 34 checks passed

Aureliolo deleted the feat/engine-intelligence-v2 branch April 8, 2026 12:30

Aureliolo temporarily deployed to cloudflare-preview April 8, 2026 12:30 — with GitHub Actions Inactive

Aureliolo mentioned this pull request Apr 8, 2026

chore(main): release 0.6.5 #1149

Open

		if len(joined) > max_chars:
			return joined[:max_chars] + "..."

-                snippets.append(marker_text)
+                sanitized_marker_text = sanitize_message(
+                    marker_text,
+                    max_length=max(len(marker_text), 1),
+                )
+                snippets.append(sanitized_marker_text)

		# Result should be an AgentContext (not None)
		assert result is None or isinstance(result, AgentContext)

-        # Result should be an AgentContext (not None)
-        assert result is None or isinstance(result, AgentContext)
+        # Forced compaction should still produce a compacted context
+        assert result is not None
+        assert isinstance(result, AgentContext)
