Skip to content

feat: implement quality scoring Layers 2+3 -- LLM judge and human override#1057

Merged
Aureliolo merged 12 commits intomainfrom
feat/quality-scoring-l2-l3
Apr 4, 2026
Merged

feat: implement quality scoring Layers 2+3 -- LLM judge and human override#1057
Aureliolo merged 12 commits intomainfrom
feat/quality-scoring-l2-l3

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

Implements quality scoring Layers 2 and 3 as specified in the Agents design page (SS8.3, D2). Layer 1 (CI signals) was already implemented.

Layer 2: LLM Judge (LlmJudgeQualityStrategy)

  • Small-model LLM judge evaluating task output against acceptance criteria
  • Structured JSON scoring with rationale
  • Prompt injection defense via delimiters
  • Cost tracking via CostTracker
  • Graceful degradation on failure (zero-confidence fallback)

Layer 3: Human Override (QualityOverrideStore + API)

  • REST API endpoints: GET/POST/DELETE /agents/{agent_id}/quality/override
  • RBAC guards (CEO/Manager for write, read access for GET)
  • Expiration support (1-365 days or indefinite)
  • Highest priority in composite -- short-circuits other layers

Composite Strategy (CompositeQualityStrategy)

  • Configurable CI/LLM weights (default 0.4/0.6, must sum to 1.0)
  • Parallel CI + LLM scoring via asyncio.TaskGroup
  • Human override short-circuit at highest priority
  • RetryExhaustedError re-raised to preserve engine fallback chains

Dashboard UI (QualityScoreOverride)

  • Override form with score slider, reason, expiration slider
  • Active override display with applied-by, dates, reason
  • Clear confirmation dialog
  • Error state handling (distinguishes 404 from 503)

Configuration (PerformanceConfig)

  • quality_judge_model, quality_judge_provider, quality_ci_weight, quality_llm_weight
  • Weight-sum validation, provider-requires-model validator, windows min_length

Other Changes

  • Extracted _BaseOverride from QualityOverride/CollaborationOverride (DRY)
  • Added PERF_JUDGE_COST_RECORDING_FAILED event constant
  • Added aclose() to PerformanceTracker for background task cleanup
  • Override store capacity bound (max_overrides, default 10,000)
  • Defensive pop(key, None) for override eviction
  • Fixed _build_prompt brace escaping bug (garbled criteria text)
  • Fixed wrong log event on composite failure path
  • Updated CLAUDE.md, docs/design/agents.md, README.md

Test Plan

  • 290 unit tests covering all new modules (13,534 total, all passing)
  • 2,342 web dashboard tests (all passing)
  • mypy strict: clean
  • ruff lint + format: clean
  • ESLint (zero warnings): clean

Review Coverage

Pre-reviewed by 17 specialized agents, 45 findings addressed:

  • code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter
  • comment-analyzer, type-design-analyzer, logging-audit, resilience-audit
  • conventions-enforcer, security-reviewer, frontend-reviewer, design-token-audit
  • api-contract-drift, async-concurrency-reviewer, test-quality-reviewer
  • docs-consistency, issue-resolution-verifier

Closes #230

…rride

Layer 2: LlmJudgeQualityStrategy using a small model from a different
provider family than the agent being scored. Evaluates task output
against acceptance criteria descriptions. Cost tracked via CostTracker
with LLMCallCategory.SYSTEM. Graceful degradation on LLM failure.

Layer 3: QualityController at /agents/{agent_id}/quality with
POST/GET/DELETE /override endpoints. QualityOverrideStore (in-memory,
at most one active override per agent with expiration support).
Guarded by require_ceo_or_manager for writes.

CompositeQualityStrategy combines CI signal (40%), LLM judge (60%),
and human override (highest priority, confidence=1.0). Configurable
weights via PerformanceConfig. Falls back to CI-only when LLM is
unavailable.

Helper _build_performance_tracker() wires the composite strategy
for callers that want the full layer stack.

Closes #230
- Fix wrong exception type in _build_performance_tracker (KeyError ->
  DriverNotRegisteredError) -- was dead catch block
- Extract _resolve_llm_judge_strategy to bring functions under 50 lines
- Extract _build_result and _try_record_cost from LlmJudgeQualityStrategy.score
- Wrap _record_cost in try/except so cost failure does not discard valid score
- Add weight validation to CompositeQualityStrategy.__init__ (reject negatives)
- Add Field(description=) to QualityOverrideResponse for OpenAPI docs
- Update CLAUDE.md Package Structure (hr/, api/ descriptions)
- Update docs/design/agents.md D2 to reflect all 3 layers implemented
- Add config weight validation tests (parametrized valid/invalid combos)
- Tighten confidence assertions (exact 0.8/0.5 instead of > 0.0)
- Add API request body validation tests (invalid payloads -> 400)
- Add brace escaping test for prompt injection defense
- Add cost recording resilience test (failure does not discard score)
- Extract named confidence constants

Pre-reviewed by 6 agents, 14 findings addressed
QualityScoreOverride component on agent detail page:
- Shows active override with StatPill metadata display
- Form with SliderField (0-10) + InputField (reason) for applying overrides
- ConfirmDialog for clearing overrides
- Storybook stories with MSW mock handlers

API client: quality.ts endpoint module following collaboration.ts pattern.

Pre-reviewed by 6 agents, 14 findings addressed
Pre-reviewed by 17 agents, 45 findings addressed:

Source fixes:
- Fix _build_prompt garbling criteria with curly braces (llm_judge)
- Fix wrong log event PERF_COMPOSITE_SCORED on failure path
- Add RetryExhaustedError re-raise in llm_judge, composite, tracker
- Add weight sum + finiteness validation in CompositeQualityStrategy
- Extract _BaseOverride from QualityOverride/CollaborationOverride
- Add ClassVar annotation on _WEIGHT_TOLERANCE
- Add quality_judge_provider requires model validator
- Add windows min_length=1 constraint
- Add max_overrides capacity bound to QualityOverrideStore
- Replace del with pop(key, None) for defensive eviction
- Add aclose() for background task cleanup in tracker
- Add PERF_JUDGE_COST_RECORDING_FAILED event constant
- Add ValueError/OverflowError to _parse_llm_response except
- Use TaskGroup for parallel CI + LLM scoring
- Extract confidence_discount as named parameter
- Keep _model as NotBlankStr (no str downcast)

Docs/docstrings:
- Add events.performance to CLAUDE.md logging examples
- Add UnauthorizedError to set_override Raises docstring
- Add 4 new quality fields to PerformanceConfig Attributes
- Update set_override and list_overrides docstrings
- Fix weight validation docstring to match constraints

Tests:
- Fix test_braces_in_criteria_escaped (was validating the bug)
- Fix test_expired_override_falls_through weak assertion
- Fix test_no_override_store bare assertion
- Rename test_breakdown_contains_rationale to match actual assertion

Frontend:
- Fix fetchOverride error handling (distinguish 404 from 503)
- Use Zustand selector pattern for toast store
- Add aria-label on Clear button
- Add expiration slider field to override form
- Fix clearQualityOverride type annotation (204 no body)
Copilot AI review requested due to automatic review settings April 4, 2026 00:56
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e6f01cd5-0768-4d5d-9517-764db94a0a00

📥 Commits

Reviewing files that changed from the base of the PR and between 3353343 and 7face85.

📒 Files selected for processing (2)
  • src/synthorg/hr/performance/composite_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
PEP 758 except syntax: use except A, B: (no parentheses) — ruff enforces this on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)

Files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff)

Files:

  • src/synthorg/hr/performance/composite_quality_strategy.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Models: use @computed_field for derived values instead of storing + validating redundant fields (e.g., TokenUsage.total_tokens)
Models: use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr) for handler construction, bootstrap, and error reporting code)
Variable name: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG for object crea...

Files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Async: asyncio_mode = "auto" — no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml — do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. (vendor-agnostic naming)
Property-based testing: Python uses Hypothesis (@given + @settings). Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline).
Flaky tests: NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins.
For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use asyncio.Event().wait() instead of asyncio.sleep(large_number) — it is cancellation-safe and carries no timing assumptions.

Files:

  • tests/unit/hr/performance/test_composite_quality_strategy.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/hr/performance/test_composite_quality_strategy.py
🧠 Learnings (23)
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to **/*.py : PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-01T09:39:21.584Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:39:21.584Z
Learning: Applies to **/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T18:52:05.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T18:52:05.142Z
Learning: Applies to **/*.py : Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/**/*.py : Async concurrency: prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-01T17:49:14.133Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T17:49:14.133Z
Learning: Applies to src/synthorg/{providers,engine}/**/*.py : Retryable errors are `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`; non-retryable errors raise immediately; `RetryExhaustedError` signals all retries failed

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
📚 Learning: 2026-03-16T19:13:34.746Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:34.746Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately without retry. RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/**/*.py : Errors: handle explicitly, never silently swallow

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Non-retryable errors raise immediately without retry

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Never implement retry logic in provider subclasses or calling code — it is handled automatically by `BaseCompletionProvider` with `RetryConfig` and `RateLimiterConfig` per-provider

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Handle errors explicitly, never silently swallow. Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`. Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry. `RetryExhaustedError` signals that all retries failed — the engine layer catches this to trigger fallback chains. Rate limiter respects `RateLimitError.retry_after` from providers — automatically pauses future requests.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
🔇 Additional comments (12)
src/synthorg/hr/performance/composite_quality_strategy.py (6)

1-32: LGTM!

Module docstring, imports, logger setup, and TYPE_CHECKING guards are well-structured. The observability events are imported from the domain-specific module as required.


34-109: LGTM!

Constructor validation is thorough: finite checks, non-negative checks, sum-to-1.0 with tolerance, and discount range validation. The docstring is comprehensive and follows Google style.


116-204: LGTM!

The score() method correctly:

  • Short-circuits on human override (Layer 3 priority)
  • Skips zero-weight layers to avoid unnecessary calls
  • Uses asyncio.TaskGroup for parallel CI+LLM execution
  • Unwraps MemoryError, RecursionError, and RetryExhaustedError from ExceptionGroup
  • Returns consistent strategy_name="composite" in LLM-only success path

206-233: LGTM!

Override check correctly short-circuits with confidence=1.0 and logs the state transition at INFO level.


235-278: LGTM!

The _try_llm() method correctly:

  • Re-raises MemoryError, RecursionError (using PEP 758 comma syntax)
  • Re-raises RetryExhaustedError for engine fallback chain
  • Logs non-critical failures at WARNING with exc_info=True
  • Treats zero confidence as graceful failure

280-331: LGTM!

The _combine() method correctly implements weighted score combination with proper clamping, confidence discounting, breakdown construction, and debug logging.

tests/unit/hr/performance/test_composite_quality_strategy.py (6)

1-51: LGTM!

Imports and helper functions are well-structured. _make_strategy() and _make_failing_strategy() provide clean test fixtures for mocking QualityScoringStrategy.


53-138: LGTM!

Good coverage of override behavior: active override short-circuit, expired override fallthrough, and no override store configured.


140-265: LGTM!

Comprehensive weighted combination tests including edge cases (1.0, 0.0) and (0.0, 1.0). Strategy name assertions are properly included.


267-343: LGTM!

Breakdown and confidence tests properly verify that layer scores appear in breakdown and that both-layer confidence exceeds CI-only confidence.


345-401: LGTM!

Critical regression tests for exception propagation: RetryExhaustedError, MemoryError, and RecursionError all propagate unwrapped from the TaskGroup. This locks down the system-error propagation convention.


403-462: LGTM!

Constructor validation tests thoroughly cover edge cases: negative weights, NaN values, sum validation, and discount range validation.


Walkthrough

Adds layered quality scoring: CompositeQualityStrategy (CI + optional LLM + human override), LlmJudgeQualityStrategy, QualityOverride model, and an in-memory QualityOverrideStore. Exposes human override CRUD at GET/POST/DELETE /agents/{agent_id}/quality/override via QualityController. PerformanceConfig gains judge/provider and CI/LLM weight fields. create_app auto-wires a PerformanceTracker built with the composite strategy and override store; tracker gained quality_override_store and aclose(). Shutdown now closes the tracker. New observability events, frontend endpoints, UI component, Storybook stories, and unit tests were added; docs updated.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title accurately summarizes the main change: implementing quality scoring Layers 2 and 3 (LLM judge and human override), which is the core objective of this changeset.
Description check ✅ Passed The pull request description is comprehensive and directly related to the changeset, providing detailed context for Layer 2, Layer 3, composite strategy, dashboard UI, configuration, and test coverage.
Linked Issues check ✅ Passed The changeset fully implements all objectives from issue #230: LLM judge integration, human override API and UI, composite strategy with configurable weights, cost tracking, RBAC guards, and graceful degradation. All coding requirements are satisfied.
Out of Scope Changes check ✅ Passed All changes are in-scope. The PR implements Layers 2+3 as specified, includes necessary infrastructure updates (PerformanceConfig, PerformanceTracker.aclose), documentation updates (CLAUDE.md, design/agents.md), and comprehensive test coverage. No unrelated or tangential changes detected.
Docstring Coverage ✅ Passed Docstring coverage is 89.29% which is sufficient. The required threshold is 40.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 7face85.
Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice.

Scanned Files

None

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a three-layered composite quality scoring system for agents, combining CI signals, an LLM-based judge, and human overrides. The implementation includes a new API controller, an in-memory override store, and corresponding frontend components. Feedback highlights several critical syntax errors where multiple exceptions are caught using invalid Python 3 syntax, which must be corrected to avoid runtime failures. Other recommendations include replacing magic numbers with constants, logging LLM judge rationales for auditability, and improving the override store's capacity management by pruning expired entries.

Comment on lines +199 to +200
except MemoryError, RecursionError:
raise
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The syntax except E1, E2: is invalid in Python 3 and will raise a SyntaxError. Multiple exceptions must be grouped in a tuple.

Suggested change
except MemoryError, RecursionError:
raise
except (MemoryError, RecursionError):
raise

Comment on lines +135 to +136
except MemoryError, RecursionError:
raise
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The syntax except E1, E2: is invalid in Python 3 and will raise a SyntaxError. Multiple exceptions must be grouped in a tuple.

        except (MemoryError, RecursionError):
            raise

Comment on lines +202 to +203
except MemoryError, RecursionError:
raise
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The syntax except E1, E2: is invalid in Python 3 and will raise a SyntaxError. Multiple exceptions must be grouped in a tuple.

        except (MemoryError, RecursionError):
            raise

Comment on lines 505 to 506
except MemoryError, RecursionError:
raise
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The syntax except E1, E2: is invalid in Python 3 and will raise a SyntaxError. Multiple exceptions must be grouped in a tuple.

        except (MemoryError, RecursionError):
            raise

Comment on lines 524 to 525
except MemoryError, RecursionError:
raise
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The syntax except E1, E2: is invalid in Python 3 and will raise a SyntaxError. Multiple exceptions must be grouped in a tuple.

        except (MemoryError, RecursionError):
            raise

else:
# CI-only fallback.
combined_score = round(ci_result.score, 4)
confidence = round(ci_result.confidence * 0.7, 4)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The confidence multiplier 0.7 is a magic number. Consider defining it as a named constant or making it configurable via PerformanceConfig to improve maintainability and clarity, similar to how confidence_discount is handled for the weighted case.

Comment on lines +159 to +183
def _build_result(
self,
agent_id: NotBlankStr,
task_id: NotBlankStr,
clamped_score: float,
cost_usd: float,
acceptance_criteria: tuple[AcceptanceCriterion, ...],
) -> QualityScoreResult:
"""Build and log the quality score result."""
result = QualityScoreResult(
score=round(clamped_score, 4),
strategy_name=NotBlankStr(self.name),
breakdown=(("llm_score", round(clamped_score, 4)),),
confidence=_CONFIDENCE_WITH_CRITERIA
if acceptance_criteria
else _CONFIDENCE_WITHOUT_CRITERIA,
)
logger.info(
PERF_LLM_JUDGE_COMPLETED,
agent_id=agent_id,
task_id=task_id,
score=result.score,
cost_usd=cost_usd,
)
return result
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The rationale provided by the LLM judge is currently parsed but discarded. Since the PR description highlights rationale as a key feature of Layer 2, it should be included in the completion log for auditability and transparency.

    def _build_result(
        self,
        agent_id: NotBlankStr,
        task_id: NotBlankStr,
        clamped_score: float,
        cost_usd: float,
        rationale: str,
        acceptance_criteria: tuple[AcceptanceCriterion, ...],
    ) -> QualityScoreResult:
        """Build and log the quality score result."""
        result = QualityScoreResult(
            score=round(clamped_score, 4),
            strategy_name=NotBlankStr(self.name),
            breakdown=(("llm_score", round(clamped_score, 4)),),
            confidence=_CONFIDENCE_WITH_CRITERIA
            if acceptance_criteria
            else _CONFIDENCE_WITHOUT_CRITERIA,
        )
        logger.info(
            PERF_LLM_JUDGE_COMPLETED,
            agent_id=agent_id,
            task_id=task_id,
            score=result.score,
            cost_usd=cost_usd,
            rationale=rationale,
        )
        return result

Comment on lines +58 to +65
agent_key = str(override.agent_id)
if (
agent_key not in self._overrides
and len(self._overrides) >= self._max_overrides
):
msg = f"Override store capacity reached ({self._max_overrides})"
raise ValueError(msg)
self._overrides[agent_key] = override
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The capacity check prevents new overrides from being set when the limit is reached, but it doesn't account for expired entries that are still in the internal dictionary. This could lead to a situation where the store is 'full' of expired data. Consider triggering a pruning of expired entries when the capacity is reached before raising an error.

Suggested change
agent_key = str(override.agent_id)
if (
agent_key not in self._overrides
and len(self._overrides) >= self._max_overrides
):
msg = f"Override store capacity reached ({self._max_overrides})"
raise ValueError(msg)
self._overrides[agent_key] = override
agent_key = str(override.agent_id)
if (
agent_key not in self._overrides
and len(self._overrides) >= self._max_overrides
):
# Attempt to prune expired entries before failing
now = datetime.now(UTC)
self._overrides = {
k: v for k, v in self._overrides.items()
if v.expires_at is None or v.expires_at > now
}
if len(self._overrides) >= self._max_overrides:
msg = f"Override store capacity reached ({self._max_overrides})"
raise ValueError(msg)
self._overrides[agent_key] = override

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 4, 2026

Codecov Report

❌ Patch coverage is 91.82390% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.72%. Comparing base (67650c5) to head (7face85).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/synthorg/api/app.py 51.51% 15 Missing and 1 partial ⚠️
src/synthorg/api/controllers/quality.py 91.25% 6 Missing and 1 partial ⚠️
src/synthorg/hr/performance/tracker.py 80.00% 1 Missing and 1 partial ⚠️
...thorg/hr/performance/llm_judge_quality_strategy.py 98.21% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1057      +/-   ##
==========================================
+ Coverage   90.70%   90.72%   +0.02%     
==========================================
  Files         703      707       +4     
  Lines       39939    40254     +315     
  Branches     3982     4014      +32     
==========================================
+ Hits        36225    36522     +297     
- Misses       3018     3033      +15     
- Partials      696      699       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements “quality scoring” Layers 2+3 in the performance subsystem: an LLM-judge strategy and a human override layer, combined via a composite strategy; adds REST + dashboard UI to view/apply/clear per-agent quality overrides.

Changes:

  • Added backend quality override store + REST controller (/agents/{agent_id}/quality/override) and related observability events/config.
  • Added LLM judge strategy and composite quality strategy (override short-circuit, CI+LLM weighted combination).
  • Added dashboard UI component + API client endpoints (and Storybook stories) for managing overrides.

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
web/src/pages/agents/QualityScoreOverride.tsx New UI for viewing/setting/clearing a quality override with form validation and error handling.
web/src/pages/agents/QualityScoreOverride.stories.tsx Storybook stories for override/no-override/expiration states using MSW.
web/src/pages/AgentDetailPage.tsx Wires the new override section into the agent detail page.
web/src/api/endpoints/quality.ts Adds API client helpers for GET/POST/DELETE quality override endpoints.
tests/unit/hr/performance/test_quality_override_store.py Unit tests for override store behavior + QualityOverride model validation.
tests/unit/hr/performance/test_performance_config.py Tests for CI/LLM weight-sum validation in PerformanceConfig.
tests/unit/hr/performance/test_llm_judge_quality_strategy.py Unit tests for LLM judge scoring, prompt construction, resilience, and cost tracking behavior.
tests/unit/hr/performance/test_composite_quality_strategy.py Unit tests for override short-circuit, weighted combination, breakdown, and confidence behavior.
tests/unit/hr/performance/conftest.py Adds make_quality_override() factory for tests.
tests/unit/api/controllers/test_quality.py Controller tests for override CRUD, RBAC, validation, and 503 when store missing.
src/synthorg/observability/events/performance.py Adds new performance event constants for quality overrides + LLM judge + composite scoring.
src/synthorg/hr/performance/tracker.py Adds quality_override_store wiring + aclose() for background task cleanup + preserves RetryExhaustedError.
src/synthorg/hr/performance/quality_override_store.py New in-memory quality override store with expiration + capacity bound.
src/synthorg/hr/performance/models.py Extracts shared _BaseOverride and adds QualityOverride.
src/synthorg/hr/performance/llm_judge_quality_strategy.py New Layer 2 LLM judge strategy with JSON parsing, injection delimiters, and optional cost recording.
src/synthorg/hr/performance/config.py Adds judge model/provider + CI/LLM weights and validators; adds windows min_length.
src/synthorg/hr/performance/composite_quality_strategy.py New composite strategy: override short-circuit + parallel CI/LLM scoring + weighted combine.
src/synthorg/api/controllers/quality.py New REST controller for quality overrides with RBAC and expiration support.
src/synthorg/api/controllers/init.py Exposes QualityController via controller registry.
src/synthorg/api/app.py Adds helper functions to resolve/wire LLM judge + composite performance tracker (currently not used).
docs/design/agents.md Updates design doc to reflect implementation details and config fields/endpoints.
CLAUDE.md Updates repo inventory + logging guidance mention for new performance events.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +124 to +142
async with asyncio.TaskGroup() as tg:
ci_task = tg.create_task(
self._ci_strategy.score(
agent_id=agent_id,
task_id=task_id,
task_result=task_result,
acceptance_criteria=acceptance_criteria,
),
)
llm_task = tg.create_task(
self._try_llm(
agent_id=agent_id,
task_id=task_id,
task_result=task_result,
acceptance_criteria=acceptance_criteria,
),
)
ci_result = ci_task.result()
llm_result = llm_task.result()
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

asyncio.TaskGroup() wraps exceptions into an ExceptionGroup, which will prevent a RetryExhaustedError raised by ci_strategy.score() (or any other task) from propagating as a bare RetryExhaustedError. This breaks the stated requirement that RetryExhaustedError be re-raised to preserve upstream fallback chains. Consider adding an except* RetryExhaustedError (and re-raising) around the TaskGroup block, or avoiding TaskGroup for the part that must preserve exact exception types.

Suggested change
async with asyncio.TaskGroup() as tg:
ci_task = tg.create_task(
self._ci_strategy.score(
agent_id=agent_id,
task_id=task_id,
task_result=task_result,
acceptance_criteria=acceptance_criteria,
),
)
llm_task = tg.create_task(
self._try_llm(
agent_id=agent_id,
task_id=task_id,
task_result=task_result,
acceptance_criteria=acceptance_criteria,
),
)
ci_result = ci_task.result()
llm_result = llm_task.result()
ci_result, llm_result = await asyncio.gather(
self._ci_strategy.score(
agent_id=agent_id,
task_id=task_id,
task_result=task_result,
acceptance_criteria=acceptance_criteria,
),
self._try_llm(
agent_id=agent_id,
task_id=task_id,
task_result=task_result,
acceptance_criteria=acceptance_criteria,
),
)

Copilot uses AI. Check for mistakes.
Comment on lines +558 to +612
def _build_performance_tracker(
*,
cost_tracker: CostTracker | None = None,
provider_registry: ProviderRegistry | None = None,
perf_config: PerformanceConfig | None = None,
) -> PerformanceTracker:
"""Build a PerformanceTracker with composite quality strategy.

Always wires a ``QualityOverrideStore`` (human overrides are free).
Delegates LLM judge resolution to :func:`_resolve_llm_judge_strategy`.

Args:
cost_tracker: Optional cost tracker for judge cost recording.
provider_registry: Provider registry for LLM judge calls.
perf_config: Performance configuration (default config if None).

Returns:
Configured performance tracker.
"""
from synthorg.hr.performance.ci_quality_strategy import ( # noqa: PLC0415
CISignalQualityStrategy,
)
from synthorg.hr.performance.composite_quality_strategy import ( # noqa: PLC0415
CompositeQualityStrategy,
)
from synthorg.hr.performance.quality_override_store import ( # noqa: PLC0415
QualityOverrideStore,
)

cfg = perf_config or PerformanceConfig()
quality_override_store = QualityOverrideStore()

llm_strategy = (
_resolve_llm_judge_strategy(
cfg,
provider_registry=provider_registry,
cost_tracker=cost_tracker,
)
if provider_registry is not None
else None
)

composite = CompositeQualityStrategy(
ci_strategy=CISignalQualityStrategy(),
llm_strategy=llm_strategy,
override_store=quality_override_store,
ci_weight=cfg.quality_ci_weight,
llm_weight=cfg.quality_llm_weight,
)

return PerformanceTracker(
quality_strategy=composite,
config=cfg,
quality_override_store=quality_override_store,
)
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_build_performance_tracker() / _resolve_llm_judge_strategy() are introduced but never used by create_app(). As a result, running the server via run_server() (which calls create_app(config=config) with no injected tracker) will still leave AppState.performance_tracker unset and the new quality override endpoints will return 503. Consider auto-wiring performance_tracker inside create_app() when it is None (using cost_tracker + provider_registry + a default PerformanceConfig).

Copilot uses AI. Check for mistakes.
Comment on lines +143 to +154
async def aclose(self) -> None:
"""Cancel and await all pending background tasks.

Should be called during application shutdown to prevent
``RuntimeError: Task was destroyed but it is pending!``
warnings.
"""
tasks = list(self._background_tasks)
for t in tasks:
t.cancel()
await asyncio.gather(*tasks, return_exceptions=True)
self._background_tasks.clear()
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PerformanceTracker.aclose() is added to cancel/await background tasks, but there is no shutdown hook calling it (e.g., from the Litestar on_shutdown lifecycle). Without wiring this into app shutdown, the background sampling tasks tracked in _background_tasks can still trigger "Task was destroyed but it is pending!" warnings. Consider integrating await performance_tracker.aclose() into the API shutdown lifecycle (preferably using the same _try_stop/best-effort pattern used for other services).

Copilot uses AI. Check for mistakes.
Comment on lines +59 to +65
if (
agent_key not in self._overrides
and len(self._overrides) >= self._max_overrides
):
msg = f"Override store capacity reached ({self._max_overrides})"
raise ValueError(msg)
self._overrides[agent_key] = override
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the store reaches max_overrides, set_override() raises a plain ValueError. In the API controller this will currently map to the generic 500 handler, which is a rough failure mode for an expected operational condition (capacity reached). Consider raising a domain-specific exception and mapping it to a 503/429/409 (or catching it in QualityController.set_override() and returning an appropriate ApiError) so callers get a stable, intentional status code and message.

Copilot uses AI. Check for mistakes.
Comment on lines +361 to +371
record = CostRecord(
agent_id=agent_id,
task_id=task_id,
provider=NotBlankStr("quality-judge"),
model=NotBlankStr(self._model),
input_tokens=usage[0],
output_tokens=usage[1],
cost_usd=cost_usd,
timestamp=datetime.now(UTC),
call_category=LLMCallCategory.SYSTEM,
)
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cost records use provider to mean the actual LLM provider name (see CostRecord.provider docs), but _record_cost() hard-codes provider="quality-judge". This will misattribute spend and make provider-level budget reporting inaccurate. Consider passing the selected provider name into LlmJudgeQualityStrategy (from _resolve_llm_judge_strategy() where the provider is chosen) and using that value for CostRecord.provider.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/api/controllers/quality.py`:
- Around line 216-225: Add structured INFO logs for successful override state
changes: after calling store.set_override(override) add a processLogger.info (or
logger.info) entry that includes key fields from QualityOverride (agent_id,
applied_by, applied_at, expires_at, score, reason) and an explicit message like
"override set"; similarly after store.clear_override(...) (the clear path around
where clear_override is invoked) add an INFO log including agent_id, applied_by,
applied_at, and any expires/previous state context and message "override
cleared". Ensure you use the same logger used elsewhere in this module and emit
structured key/value pairs (not freeform text) so audit tooling can parse them.

In `@src/synthorg/hr/performance/composite_quality_strategy.py`:
- Around line 123-145: The current implementation always schedules both
self._ci_strategy.score and self._try_llm even when their weights are zero;
change it to skip scheduling any layer whose weight is 0.0 (e.g., check
self.ci_weight and self.llm_weight) so you only create tasks for contributing
layers, await only those tasks, and then call self._combine with the actual
results (use None or a sensible default for any skipped layer so _combine
continues to work). Ensure you still run tasks in parallel when both weights > 0
and preserve exception propagation/handling semantics from the original
TaskGroup approach.

In `@src/synthorg/hr/performance/llm_judge_quality_strategy.py`:
- Around line 272-278: The logger.warning call using PERF_LLM_JUDGE_FAILED must
not include the raw model output; remove the raw_content argument from the
logger.warning invocation in llm_judge_quality_strategy (the
PERF_LLM_JUDGE_FAILED log) so the warning only contains structured fields like
agent_id, task_id and reason="parse_error". Also search the surrounding function
for any other logging that references raw_content and remove or redact those
usages to ensure no model output is written to logs.
- Around line 223-239: The criterion descriptions are interpolated verbatim into
the raw-data block (built in the acceptance_criteria loop that populates
criteria_lines and then criteria_list before calling _SYSTEM_PROMPT.format), so
a malicious c.description containing the delimiter sequences (e.g., "---END
CRITERIA---" or "---BEGIN CRITERIA---") can break out; fix by
sanitizing/escaping those delimiter substrings in c.description before appending
(e.g., replace or escape any occurrences of the BEGIN/END delimiter strings and
other control delimiters) so the f"- {status} {c.description}" line uses the
escaped description, then continue to build criteria_list and pass it to
_SYSTEM_PROMPT.format.

In `@src/synthorg/hr/performance/quality_override_store.py`:
- Around line 49-65: set_override currently enforces capacity using
len(self._overrides) which counts expired entries; before raising, sweep expired
entries out of self._overrides (or compute active_count by checking each stored
override's active status) so capacity is enforced only against active overrides.
Specifically, in set_override (and before the len check) iterate self._overrides
items and remove entries that are expired (or call
get_active_override/QualityOverride.is_active for each agent to determine
activity), then re-evaluate len(self._overrides) (or active_count) against
self._max_overrides and only then raise ValueError or insert the new override.

In `@src/synthorg/hr/performance/tracker.py`:
- Around line 505-508: The code in _maybe_sample re-raises RetryExhaustedError
(and similarly at the other catch site), causing unobserved task failures for
the fire-and-forget sampler; instead, catch RetryExhaustedError in the sampler
task, log the exception via the sampler/engine logger with context, and return
(or otherwise swallow) so the task fails gracefully without propagating a
RetryExhaustedError up the event loop; keep the special-case re-raise for
MemoryError and RecursionError in the same except clause but replace the `except
RetryExhaustedError: raise` branches with logging+return behavior in the
_maybe_sample sampler and the corresponding block at the second location (lines
around 524-527).

In `@tests/unit/hr/performance/test_composite_quality_strategy.py`:
- Around line 180-223: Add a regression test that ensures
CompositeQualityStrategy re-raises RetryExhaustedError from the llm_strategy
rather than falling back to CI: create a test (e.g.,
test_llm_retry_exhausted_passthrough) that constructs ci = _make_strategy(...),
llm = AsyncMock() with llm.name set and llm.score.side_effect =
RetryExhaustedError(...), then call composite =
CompositeQualityStrategy(ci_strategy=ci, llm_strategy=llm, ci_weight=0.4,
llm_weight=0.6) and await composite.score(...) asserting that the call raises
RetryExhaustedError; reference CompositeQualityStrategy, llm.score, and
RetryExhaustedError to locate the relevant code paths.

In `@tests/unit/hr/performance/test_llm_judge_quality_strategy.py`:
- Around line 322-323: The test currently uses
cost_tracker.record.assert_called_once() which doesn't verify an AsyncMock was
awaited; change this to cost_tracker.record.assert_awaited_once() to ensure the
coroutine was awaited, and replace extracting call_args with
cost_tracker.record.await_args (or await_args[0]) to obtain the recorded cost
into cost_record; update any assertions that reference
cost_tracker.record.call_args accordingly so they use the await_args result.

In `@web/src/pages/agents/QualityScoreOverride.tsx`:
- Around line 110-121: The Apply/Clear write controls in
QualityScoreOverride.tsx are rendered unconditionally (buttons using Button,
Trash2, onClick handlers like setClearDialogOpen and the Apply handler) even for
users who lack manager/CEO permissions; update the component to check the
current user's permission (e.g., a canManageOverrides or isManagerOrCEO boolean
from auth/context or props) and only render the interactive write controls (the
Clear Button that calls setClearDialogOpen, the Apply button and any inputs that
call setOverride or submit handlers) when that permission is true, otherwise
render the active-override display in read-only mode; apply the same gating to
the other write/control blocks currently at the Apply/override UI (the blocks
around the Apply button and clear-confirmation handlers).
- Around line 42-63: The fetchOverride async logic can overwrite state when
agentId changes; modify fetchOverride (and the useEffect) to guard against stale
responses by resetting state before each fetch (call setLoading(true);
setOverride(null); setFetchError(null)) and ignore or cancel responses from
previous requests—e.g., create a per-call cancel token or an AbortController and
pass it to getQualityOverride (or check a locally captured requestId/agentId
before calling setOverride/setLoading), and ensure the finally block only
updates loading when the response is not stale; update useEffect to clean
up/cancel the in-flight request when agentId changes or the component unmounts.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8285e1d7-bc0b-401d-adff-dcd65a66598c

📥 Commits

Reviewing files that changed from the base of the PR and between 67650c5 and 02b6c13.

📒 Files selected for processing (22)
  • CLAUDE.md
  • docs/design/agents.md
  • src/synthorg/api/app.py
  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/api/controllers/quality.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
  • src/synthorg/hr/performance/config.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/models.py
  • src/synthorg/hr/performance/quality_override_store.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/observability/events/performance.py
  • tests/unit/api/controllers/test_quality.py
  • tests/unit/hr/performance/conftest.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
  • tests/unit/hr/performance/test_llm_judge_quality_strategy.py
  • tests/unit/hr/performance/test_performance_config.py
  • tests/unit/hr/performance/test_quality_override_store.py
  • web/src/api/endpoints/quality.ts
  • web/src/pages/AgentDetailPage.tsx
  • web/src/pages/agents/QualityScoreOverride.stories.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Agent
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
PEP 758 except syntax: use except A, B: (no parentheses) — ruff enforces this on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)

Files:

  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/hr/performance/config.py
  • tests/unit/hr/performance/test_performance_config.py
  • tests/unit/hr/performance/conftest.py
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/models.py
  • tests/unit/hr/performance/test_llm_judge_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
  • tests/unit/api/controllers/test_quality.py
  • src/synthorg/observability/events/performance.py
  • src/synthorg/hr/performance/quality_override_store.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
  • tests/unit/hr/performance/test_quality_override_store.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff)

Files:

  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/hr/performance/config.py
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/models.py
  • src/synthorg/observability/events/performance.py
  • src/synthorg/hr/performance/quality_override_store.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
  • src/synthorg/hr/performance/composite_quality_strategy.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/hr/performance/config.py
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/models.py
  • src/synthorg/observability/events/performance.py
  • src/synthorg/hr/performance/quality_override_store.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Models: use @computed_field for derived values instead of storing + validating redundant fields (e.g., TokenUsage.total_tokens)
Models: use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr) for handler construction, bootstrap, and error reporting code)
Variable name: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG for object crea...

Files:

  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/hr/performance/config.py
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/models.py
  • src/synthorg/observability/events/performance.py
  • src/synthorg/hr/performance/quality_override_store.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Async: asyncio_mode = "auto" — no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml — do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. (vendor-agnostic naming)
Property-based testing: Python uses Hypothesis (@given + @settings). Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline).
Flaky tests: NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins.
For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use asyncio.Event().wait() instead of asyncio.sleep(large_number) — it is cancellation-safe and carries no timing assumptions.

Files:

  • tests/unit/hr/performance/test_performance_config.py
  • tests/unit/hr/performance/conftest.py
  • tests/unit/hr/performance/test_llm_judge_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
  • tests/unit/api/controllers/test_quality.py
  • tests/unit/hr/performance/test_quality_override_store.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/hr/performance/test_performance_config.py
  • tests/unit/hr/performance/conftest.py
  • tests/unit/hr/performance/test_llm_judge_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
  • tests/unit/api/controllers/test_quality.py
  • tests/unit/hr/performance/test_quality_override_store.py
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{ts,tsx}: Use Tailwind semantic classes (text-foreground, bg-card, text-accent, etc.) or CSS variables (var(--so-*)) for colors. NEVER hardcode hex values or rgba() in .tsx/.ts files
Use font-sans or font-mono for typography (maps to Geist tokens). NEVER set fontFamily directly in styles
Use density-aware spacing tokens (p-card, gap-section-gap, gap-grid-gap) or standard Tailwind spacing. NEVER hardcode pixel values for layout spacing
Use token variables (var(--so-shadow-card-hover), border-border, border-bright) for shadows and borders
Do NOT use rgba() with hardcoded values -- use design token variables
Do NOT hardcode Framer Motion transition durations -- use @/lib/motion presets
Use /// <reference types="vite/client" /> to enable side-effect imports for CSS in Storybook 10 with TypeScript 6
Ensure all TypeScript code passes type checking via npm --prefix web run type-check
Bundle size must be auditable via npm --prefix web run analyze (generates stats.html treemap)
Achieve Lighthouse performance audit target of 90+ via npm --prefix web run lighthouse

web/src/**/*.{ts,tsx}: ALWAYS reuse existing components from web/src/components/ui/ before creating new ones
Web Dashboard Design System (MANDATORY): See web/CLAUDE.md for the full component inventory, design token rules, and post-training references (TS6, Storybook 10).

Files:

  • web/src/pages/AgentDetailPage.tsx
  • web/src/pages/agents/QualityScoreOverride.stories.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
  • web/src/api/endpoints/quality.ts
web/src/**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

Run ESLint with zero warnings enforced via npm --prefix web run lint

Line length: 88 characters (ruff) for JavaScript/TypeScript files

Files:

  • web/src/pages/AgentDetailPage.tsx
  • web/src/pages/agents/QualityScoreOverride.stories.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
  • web/src/api/endpoints/quality.ts
web/src/pages/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/pages/**/*.{ts,tsx}: Organize page-scoped sub-components in pages/<page-name>/ subdirectories rather than inline in page files
Use lazy-loaded page components in web/src/pages/ with React Router and route guards for auth/setup
Use @AnimatedPresence wrapper for page transitions in React Router (keyed by route)

Files:

  • web/src/pages/AgentDetailPage.tsx
  • web/src/pages/agents/QualityScoreOverride.stories.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
web/src/**/*.{ts,tsx,css}

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions — use design tokens and @/lib/motion presets

Files:

  • web/src/pages/AgentDetailPage.tsx
  • web/src/pages/agents/QualityScoreOverride.stories.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
  • web/src/api/endpoints/quality.ts
web/src/**/*.stories.tsx

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.stories.tsx: Use parameters.a11y.test: 'error' | 'todo' | 'off' for a11y testing in Storybook 10 to enforce WCAG compliance (replaces old .element and .manual)
Import from storybook/test (not @storybook/test) and storybook/actions (not @storybook/addon-actions) in Storybook 10

Files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
🧠 Learnings (68)
📚 Learning: 2026-03-31T14:28:28.895Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:28:28.895Z
Learning: Web dashboard: see `web/CLAUDE.md` for commands, design system, and component inventory

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to web/src/**/*.{ts,tsx} : Web Dashboard Design System (MANDATORY): See `web/CLAUDE.md` for the full component inventory, design token rules, and post-training references (TS6, Storybook 10).

Applied to files:

  • CLAUDE.md
  • web/src/pages/agents/QualityScoreOverride.tsx
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • CLAUDE.md
  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/api/app.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/api/**/*.py : API package (api/): Litestar REST + WebSocket with controllers, guards, channels, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint, provider management endpoint (CRUD + test + presets), backup endpoint, RFC 9457 structured errors, AppState hot-reload slots, service auto-wiring (Phase 1 at construction, Phase 2 on startup), lifecycle helpers

Applied to files:

  • CLAUDE.md
  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Applied to files:

  • CLAUDE.md
  • src/synthorg/api/app.py
📚 Learning: 2026-03-26T15:18:16.848Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:18:16.848Z
Learning: Applies to src/synthorg/api/**/*.py : Litestar API must include setup wizard, auth/, auto-wiring, and lifecycle management

Applied to files:

  • CLAUDE.md
  • src/synthorg/api/controllers/__init__.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • CLAUDE.md
  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/api/**/*.py : REST API: Litestar framework, controllers with guards, channels for WebSocket, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint. RFC 9457 structured errors (ErrorCategory, ErrorCode, ErrorDetail, ProblemDetail, CATEGORY_TITLES, category_title, category_type_uri, content negotiation).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion

Applied to files:

  • CLAUDE.md
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/models.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)

Applied to files:

  • CLAUDE.md
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/models.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/api/**/*.py : Use Litestar for REST + WebSocket API. Controllers, guards, channels, JWT + API key + WS ticket auth, RFC 9457 structured errors.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • CLAUDE.md
  • src/synthorg/api/app.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.

Applied to files:

  • src/synthorg/api/controllers/__init__.py
  • src/synthorg/api/app.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

  • src/synthorg/hr/performance/config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

  • src/synthorg/hr/performance/config.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`

Applied to files:

  • src/synthorg/hr/performance/config.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • src/synthorg/hr/performance/config.py
  • src/synthorg/hr/performance/models.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/hr/performance/config.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`.

Applied to files:

  • src/synthorg/hr/performance/config.py
  • src/synthorg/api/app.py
📚 Learning: 2026-03-27T12:44:29.466Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-27T12:44:29.466Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from `web/src/components/ui/` (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup/StaggerItem) before creating new ones

Applied to files:

  • web/src/pages/AgentDetailPage.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from web/src/components/ui/ (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast/ToastContainer, Skeleton variants, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup/StaggerItem, Drawer, form fields, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor) before creating new components

Applied to files:

  • web/src/pages/AgentDetailPage.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.{tsx,ts} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup, Drawer, InputField, SelectField, SliderField, ToggleField, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor, TagInput, MetadataGrid, ProjectStatusBadge, ContentTypeBadge)

Applied to files:

  • web/src/pages/AgentDetailPage.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.{ts,tsx} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones; refer to design system inventory (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, etc.)

Applied to files:

  • web/src/pages/AgentDetailPage.tsx
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • docs/design/agents.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/engine/coordination/**/*.py : Task coordination uses multi-agent pipeline with 4 dispatchers (SAS/centralized/decentralized/context-dependent), wave execution, and workspace lifecycle integration.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability includes structured logging via `get_logger(__name__)`, correlation tracking, and log sinks.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Persistence backend: pluggable PersistenceBackend protocol in `src/synthorg/persistence/`, SQLite initial, SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`

Applied to files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/persistence/**/*.py : Persistence uses pluggable PersistenceBackend protocol. SQLite is the initial backend. Settings use SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-04-01T17:49:14.133Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T17:49:14.133Z
Learning: Applies to src/synthorg/{providers,engine}/**/*.py : Retryable errors are `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`; non-retryable errors raise immediately; `RetryExhaustedError` signals all retries failed

Applied to files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`. Retryable errors: `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately.

Applied to files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code — it's handled by the base class.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.stories.tsx : Storybook 10: import from `storybook/test` (not `storybook/test`), `storybook/actions` (not `storybook/addon-actions`)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-03T18:36:51.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-03T18:36:51.730Z
Learning: Applies to web/src/**/*.stories.tsx : Import from `storybook/test` (not `storybook/test`) and `storybook/actions` (not `storybook/addon-actions`) in Storybook 10

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-03T18:36:51.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-03T18:36:51.730Z
Learning: Applies to web/src/components/ui/**/*.stories.tsx : Create a `.stories.tsx` file alongside each new component with all states (default, hover, loading, error, empty)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/**/*.stories.{ts,tsx} : Storybook 10: Use storybook/test (not storybook/test) and storybook/actions (not storybook/addon-actions) import paths

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/components/ui/**/*.{ts,tsx} : Create new shared components in `web/src/components/ui/` with `.stories.tsx` Storybook file covering all states (default, hover, loading, error, empty)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.stories.tsx : Storybook 10: Import from `storybook/test` instead of `storybook/test`

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.stories.tsx : Storybook 10: Use `parameters.a11y.test: 'error' | 'todo' | 'off'` for a11y testing configuration (replaces old `.element` and `.manual`)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-03T18:36:51.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-03T18:36:51.730Z
Learning: Applies to web/src/**/*.stories.tsx : Use `parameters.a11y.test: 'error' | 'todo' | 'off'` for a11y testing in Storybook 10 to enforce WCAG compliance (replaces old `.element` and `.manual`)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/**/*.stories.{ts,tsx} : Storybook 10: Use parameters.a11y.test: 'error' | 'todo' | 'off' for a11y testing (replaces old .element and .manual); set globally in preview.tsx to enforce WCAG compliance on all stories

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/**/*.stories.{ts,tsx} : Storybook 10: Use parameters.backgrounds.options (object keyed by name) + initialGlobals.backgrounds.value for background options (replaces old default + values array)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-03T18:36:51.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-03T18:36:51.730Z
Learning: Applies to web/**/storybook/**/*.{ts,tsx} : Use type-safe Storybook config with `defineMain` from `storybook/react-vite/node` and `definePreview` from `storybook/react-vite`, including explicit `framework` field

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-03-16T19:13:34.746Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:34.746Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately without retry. RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains.

Applied to files:

  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry.

Applied to files:

  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`. Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry. `RetryExhaustedError` signals that all retries failed — the engine layer catches this to trigger fallback chains. Rate limiter respects `RateLimitError.retry_after` from providers — automatically pauses future requests.

Applied to files:

  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Non-retryable errors raise immediately without retry

Applied to files:

  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/providers/**/*.py : Rate limiter respects RateLimitError.retry_after from providers — automatically pauses future requests.

Applied to files:

  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/providers/**/*.py : Rate limiter respects `RateLimitError.retry_after` from providers — automatically pauses future requests

Applied to files:

  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/**/*.py : Models: use `computed_field` for derived values instead of storing + validating redundant fields (e.g., `TokenUsage.total_tokens`)

Applied to files:

  • src/synthorg/hr/performance/models.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module

Applied to files:

  • src/synthorg/observability/events/performance.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under `synthorg.observability.events` (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/synthorg/observability/events/performance.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/performance.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

  • src/synthorg/observability/events/performance.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/performance.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

  • src/synthorg/observability/events/performance.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals

Applied to files:

  • src/synthorg/observability/events/performance.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/performance.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/synthorg/observability/events/performance.py
📚 Learning: 2026-03-31T16:09:24.320Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:09:24.320Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • src/synthorg/observability/events/performance.py
🔇 Additional comments (11)
src/synthorg/hr/performance/config.py (1)

79-132: LGTM! Well-structured quality configuration fields.

The new quality judge and weight configuration follows project conventions correctly:

  • NotBlankStr | None for optional identifier fields
  • Proper ge/le constraints on weights
  • Tolerance-based float comparison in _validate_quality_weights_sum avoids floating-point precision issues
  • Dependency validator ensures quality_judge_provider requires quality_judge_model
CLAUDE.md (1)

92-100: Documentation accurately reflects the new quality scoring implementation.

The package structure updates correctly document:

  • API controller for quality overrides
  • HR package quality scoring layers (CI + LLM + human override)
  • New observability event constant
src/synthorg/api/controllers/__init__.py (1)

30-30: LGTM! Controller registration follows established patterns.

QualityController is correctly:

  • Imported from the new module
  • Added to ALL_CONTROLLERS tuple (passed as class reference to Litestar)
  • Exported in __all__

Also applies to: 74-74, 102-102

web/src/pages/AgentDetailPage.tsx (1)

84-86: LGTM! Clean integration of the quality override component.

The new section correctly:

  • Uses ErrorBoundary level="section" consistent with other page sections
  • Conditionally renders only when agent.id is available
  • Follows the existing page layout pattern
tests/unit/hr/performance/test_performance_config.py (1)

1-61: LGTM! Comprehensive test coverage for weight validation.

The tests properly cover:

  • Valid weight combinations that sum to 1.0 (including edge cases 0.0/1.0 and 1.0/0.0)
  • Invalid combinations that don't sum to 1.0
  • Default values verification

Good use of @pytest.mark.parametrize for similar test cases.

tests/unit/hr/performance/conftest.py (1)

117-135: LGTM! Helper follows established file patterns.

The make_quality_override helper mirrors the existing make_collaboration_override structure exactly, maintaining consistency within the test fixtures. The explicit NotBlankStr wrapping is technically redundant (Pydantic coerces strings), but matches the file's established convention.

docs/design/agents.md (1)

336-342: LGTM! Design documentation accurately reflects the implementation.

The updated D2 section correctly documents:

  • CompositeQualityStrategy orchestrating all three layers
  • Human override short-circuiting with highest priority
  • API endpoint path matching QualityController
  • Config fields matching PerformanceConfig
src/synthorg/api/app.py (2)

499-555: LGTM! Well-structured LLM judge resolution with graceful degradation.

The function correctly:

  • Returns None when quality_judge_model is not configured
  • Handles DriverNotRegisteredError gracefully with a warning
  • Falls back to the first available provider when no explicit provider is configured
  • Logs at appropriate levels (warning for failures, info for success)

558-612: LGTM! Clean composition of quality scoring strategies.

The _build_performance_tracker function correctly:

  • Always wires QualityOverrideStore (human overrides have no cost)
  • Conditionally includes LLM judge only when provider_registry is available
  • Passes through config weights to CompositeQualityStrategy
  • Exposes the override store via PerformanceTracker for API access
web/src/pages/agents/QualityScoreOverride.stories.tsx (1)

16-21: Add the required a11y.test setting to this story meta.

This Storybook 10 story file must declare an accessibility policy per WCAG compliance requirements.

♿ Suggested change
 const meta = {
   title: 'Agents/QualityScoreOverride',
   component: QualityScoreOverride,
+  parameters: {
+    a11y: { test: 'error' },
+  },
   decorators: [(Story) => <div className="max-w-lg p-6"><Story /></div>],
   args: { agentId: 'agent-001' },
 } satisfies Meta<typeof QualityScoreOverride>
src/synthorg/hr/performance/llm_judge_quality_strategy.py (1)

149-150: Reject NaN/Infinity scores before clamping.

json.loads() accepts non-finite numbers, and the current max(0.0, min(10.0, llm_score)) clamp can turn malformed output into 10.0 instead of triggering the zero-confidence fallback. Guard this with math.isfinite(llm_score) and fail parsing when the value is not finite.

Also applies to lines 263–280.

Comment on lines +216 to +225
override = QualityOverride(
agent_id=NotBlankStr(agent_id),
score=data.score,
reason=data.reason,
applied_by=NotBlankStr(str(auth_user.user_id)),
applied_at=now,
expires_at=expires_at,
)
store.set_override(override)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Log successful override mutations at INFO.

store.set_override() and store.clear_override() both change override state, but only failure paths emit events today. Please add structured INFO logs for successful set/clear operations so these audit-relevant changes are observable. As per coding guidelines, "All state transitions must log at INFO."

Also applies to: 257-260

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/api/controllers/quality.py` around lines 216 - 225, Add
structured INFO logs for successful override state changes: after calling
store.set_override(override) add a processLogger.info (or logger.info) entry
that includes key fields from QualityOverride (agent_id, applied_by, applied_at,
expires_at, score, reason) and an explicit message like "override set";
similarly after store.clear_override(...) (the clear path around where
clear_override is invoked) add an INFO log including agent_id, applied_by,
applied_at, and any expires/previous state context and message "override
cleared". Ensure you use the same logger used elsewhere in this module and emit
structured key/value pairs (not freeform text) so audit tooling can parse them.

…nd CodeRabbit

Backend:
- Remove RetryExhaustedError re-raise in _maybe_sample background task (silent failure)
- Wire _build_performance_tracker into create_app (was dead code)
- Gracefully degrade RetryExhaustedError in composite _try_llm (was ExceptionGroup crash)
- Add confidence_discount and ci_only_confidence_discount validation
- Extract 0.7 magic number to ci_only_confidence_discount parameter
- Skip zero-weight layers in composite strategy
- Sanitize delimiter strings in criteria descriptions (prompt injection)
- Add math.isfinite check for NaN/Inf LLM scores
- Rename _SYSTEM_PROMPT to _JUDGE_PROMPT (naming accuracy)
- Use actual provider name in CostRecord (was hardcoded quality-judge)
- Remove raw_content from parse error warning log
- Sweep expired entries before capacity check in override store
- Add max_overrides lower-bound validation
- Catch ValueError from store capacity in API controller (was raw 500)
- Wire aclose() into app shutdown lifecycle
- Move _WEIGHT_TOLERANCE ClassVar before validators
- Add PAIR_PROGRAMMER exclusion comment on quality override guards
- Remove unused RetryExhaustedError imports

Docs:
- Restore prompt profiles to engine/ description in CLAUDE.md
- Restore PROMPT_PROFILE_SELECTED event example in logging section
- Fix docstrings: _try_llm, score(), module, set_override Raises section
- Fix misleading 'silently evicted' wording
- Add constraint notes to PerformanceConfig docstring

Frontend:
- Add stale-response guard on agentId change (fetchOverride)
- Gate write controls to CEO/Manager roles only
- Reset score state after successful override submission
- Add <ApiResponse<null>> generic on apiClient.delete
- Add Error story and a11y.test parameter to Storybook

Tests (14 new):
- RetryExhaustedError graceful degradation in composite
- CompositeQualityStrategy weight and discount validation (6 cases)
- max_overrides capacity limit (4 cases: new agent, replace, sweep, zero)
- PerformanceConfig provider-requires-model validation (3 cases)
- Use assert_awaited_once for AsyncMock cost tracking
- Tighter expires_at assertion (7-day window check)
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/api/lifecycle.py (1)

353-387: ⚠️ Potential issue | 🟠 Major

Pass performance_tracker to _safe_shutdown() calls.

The function accepts performance_tracker as a parameter, but both call sites in app.py (lines 387–396 and 428–437) omit it. Since performance_tracker is available in scope and initialized at line 765–769, it should be passed to both _safe_shutdown() calls so that PerformanceTracker.aclose() executes during shutdown.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/api/lifecycle.py` around lines 353 - 387, The two calls to
_safe_shutdown(...) in app.py currently omit the performance_tracker argument so
PerformanceTracker.aclose() never runs; update both call sites (the
_safe_shutdown invocation around lines 387–396 and the one around 428–437) to
pass the performance_tracker variable as the corresponding parameter so that
performance_tracker is forwarded into lifecycle._safe_shutdown (which will call
performance_tracker.aclose()); ensure the argument order matches the
_safe_shutdown signature used in src/synthorg/api/lifecycle.py.
♻️ Duplicate comments (2)
src/synthorg/hr/performance/composite_quality_strategy.py (1)

169-184: ⚠️ Potential issue | 🟠 Major

A zero-weight CI layer still runs in the LLM-only branch.

Line 176 still calls self._ci_strategy.score(...) when self._ci_weight == 0.0, so the disabled CI layer still adds latency and can affect the final breakdown/confidence. If _try_llm() returns None, Line 184 also falls back to a CI-only score even though CI was explicitly configured off. Please skip CI entirely in this branch and let _combine() represent the real LLM-only / no-result outcome.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/hr/performance/composite_quality_strategy.py` around lines 169 -
184, The LLM-only branch still calls self._ci_strategy.score(...) even when
self._ci_weight == 0.0; update the branch around _try_llm(...) so that if
self._ci_weight is zero you do not call self._ci_strategy.score and you pass a
CI-missing/disabled sentinel into self._combine(...) (or None) so _combine(_)
reflects an LLM-only or no-result outcome; also ensure that when _try_llm()
returns None you do NOT fall back to a CI-only score when _ci_weight == 0.0 —
skip invoking _ci_strategy.score entirely in that configuration.
src/synthorg/api/controllers/quality.py (1)

229-249: ⚠️ Potential issue | 🟠 Major

Successful override mutations are still missing INFO audit logs.

Line 230 and Line 273 both change override state, but the success paths return without any corresponding INFO event. That leaves the audit trail with failures only. Please emit structured INFO logs on both set and clear; the DELETE path will likely need the caller and current override details to make that record useful. As per coding guidelines, "All state transitions must log at INFO."

Also applies to: 273-282

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/api/controllers/quality.py` around lines 229 - 249, The success
paths for override state changes lack INFO audit logs — after calling
store.set_override(override) (used to return the ApiResponse with
QualityOverrideResponse) and in the DELETE/clear flow (where override is removed
around the clear_override/clear path), emit structured logger.info entries
recording the state transition; include fields like path="quality/override" or
path="quality/override/clear", action="override_set" or "override_cleared",
agent_id=override.agent_id (or agent_id variable), score=override.score,
reason=override.reason, applied_by=override.applied_by,
applied_at=override.applied_at and expires_at=override.expires_at (for set) and
include the caller or requester id for the delete path so the audit has both who
performed the action and the prior override details; place the log immediately
after the successful store.set_override(override) call and immediately before
returning in the clear/delete flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/hr/performance/composite_quality_strategy.py`:
- Around line 233-250: The generic except block around calling
self._llm_strategy.score is currently swallowing RetryExhaustedError; add an
explicit except RetryExhaustedError: raise before the broad except Exception so
that the RetryExhaustedError sentinel from the judge client propagates instead
of being logged as PERF_LLM_JUDGE_FAILED and returning None; keep the existing
except MemoryError, RecursionError: raise intact and leave the logger.warning
PERF_LLM_JUDGE_FAILED path for other exceptions.

---

Outside diff comments:
In `@src/synthorg/api/lifecycle.py`:
- Around line 353-387: The two calls to _safe_shutdown(...) in app.py currently
omit the performance_tracker argument so PerformanceTracker.aclose() never runs;
update both call sites (the _safe_shutdown invocation around lines 387–396 and
the one around 428–437) to pass the performance_tracker variable as the
corresponding parameter so that performance_tracker is forwarded into
lifecycle._safe_shutdown (which will call performance_tracker.aclose()); ensure
the argument order matches the _safe_shutdown signature used in
src/synthorg/api/lifecycle.py.

---

Duplicate comments:
In `@src/synthorg/api/controllers/quality.py`:
- Around line 229-249: The success paths for override state changes lack INFO
audit logs — after calling store.set_override(override) (used to return the
ApiResponse with QualityOverrideResponse) and in the DELETE/clear flow (where
override is removed around the clear_override/clear path), emit structured
logger.info entries recording the state transition; include fields like
path="quality/override" or path="quality/override/clear", action="override_set"
or "override_cleared", agent_id=override.agent_id (or agent_id variable),
score=override.score, reason=override.reason, applied_by=override.applied_by,
applied_at=override.applied_at and expires_at=override.expires_at (for set) and
include the caller or requester id for the delete path so the audit has both who
performed the action and the prior override details; place the log immediately
after the successful store.set_override(override) call and immediately before
returning in the clear/delete flow.

In `@src/synthorg/hr/performance/composite_quality_strategy.py`:
- Around line 169-184: The LLM-only branch still calls
self._ci_strategy.score(...) even when self._ci_weight == 0.0; update the branch
around _try_llm(...) so that if self._ci_weight is zero you do not call
self._ci_strategy.score and you pass a CI-missing/disabled sentinel into
self._combine(...) (or None) so _combine(_) reflects an LLM-only or no-result
outcome; also ensure that when _try_llm() returns None you do NOT fall back to a
CI-only score when _ci_weight == 0.0 — skip invoking _ci_strategy.score entirely
in that configuration.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: edcf08e4-1aab-4128-8eec-3b0ac564a8e9

📥 Commits

Reviewing files that changed from the base of the PR and between 02b6c13 and 24946d7.

📒 Files selected for processing (18)
  • CLAUDE.md
  • src/synthorg/api/app.py
  • src/synthorg/api/controllers/quality.py
  • src/synthorg/api/lifecycle.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
  • src/synthorg/hr/performance/config.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/quality_override_store.py
  • src/synthorg/hr/performance/tracker.py
  • tests/unit/api/controllers/test_analytics.py
  • tests/unit/api/controllers/test_quality.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
  • tests/unit/hr/performance/test_llm_judge_quality_strategy.py
  • tests/unit/hr/performance/test_performance_config.py
  • tests/unit/hr/performance/test_quality_override_store.py
  • web/src/api/endpoints/quality.ts
  • web/src/pages/agents/QualityScoreOverride.stories.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
PEP 758 except syntax: use except A, B: (no parentheses) — ruff enforces this on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)

Files:

  • src/synthorg/api/lifecycle.py
  • tests/unit/api/controllers/test_analytics.py
  • tests/unit/hr/performance/test_performance_config.py
  • src/synthorg/hr/performance/tracker.py
  • tests/unit/hr/performance/test_llm_judge_quality_strategy.py
  • src/synthorg/api/app.py
  • tests/unit/api/controllers/test_quality.py
  • tests/unit/hr/performance/test_quality_override_store.py
  • src/synthorg/hr/performance/config.py
  • src/synthorg/hr/performance/quality_override_store.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff)

Files:

  • src/synthorg/api/lifecycle.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/config.py
  • src/synthorg/hr/performance/quality_override_store.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
  • src/synthorg/hr/performance/composite_quality_strategy.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/lifecycle.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/config.py
  • src/synthorg/hr/performance/quality_override_store.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Models: use @computed_field for derived values instead of storing + validating redundant fields (e.g., TokenUsage.total_tokens)
Models: use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr) for handler construction, bootstrap, and error reporting code)
Variable name: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG for object crea...

Files:

  • src/synthorg/api/lifecycle.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/config.py
  • src/synthorg/hr/performance/quality_override_store.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Async: asyncio_mode = "auto" — no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml — do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. (vendor-agnostic naming)
Property-based testing: Python uses Hypothesis (@given + @settings). Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline).
Flaky tests: NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins.
For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use asyncio.Event().wait() instead of asyncio.sleep(large_number) — it is cancellation-safe and carries no timing assumptions.

Files:

  • tests/unit/api/controllers/test_analytics.py
  • tests/unit/hr/performance/test_performance_config.py
  • tests/unit/hr/performance/test_llm_judge_quality_strategy.py
  • tests/unit/api/controllers/test_quality.py
  • tests/unit/hr/performance/test_quality_override_store.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/api/controllers/test_analytics.py
  • tests/unit/hr/performance/test_performance_config.py
  • tests/unit/hr/performance/test_llm_judge_quality_strategy.py
  • tests/unit/api/controllers/test_quality.py
  • tests/unit/hr/performance/test_quality_override_store.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{ts,tsx}: Use Tailwind semantic classes (text-foreground, bg-card, text-accent, etc.) or CSS variables (var(--so-*)) for colors. NEVER hardcode hex values or rgba() in .tsx/.ts files
Use font-sans or font-mono for typography (maps to Geist tokens). NEVER set fontFamily directly in styles
Use density-aware spacing tokens (p-card, gap-section-gap, gap-grid-gap) or standard Tailwind spacing. NEVER hardcode pixel values for layout spacing
Use token variables (var(--so-shadow-card-hover), border-border, border-bright) for shadows and borders
Do NOT use rgba() with hardcoded values -- use design token variables
Do NOT hardcode Framer Motion transition durations -- use @/lib/motion presets
Use /// <reference types="vite/client" /> to enable side-effect imports for CSS in Storybook 10 with TypeScript 6
Ensure all TypeScript code passes type checking via npm --prefix web run type-check
Bundle size must be auditable via npm --prefix web run analyze (generates stats.html treemap)
Achieve Lighthouse performance audit target of 90+ via npm --prefix web run lighthouse

web/src/**/*.{ts,tsx}: ALWAYS reuse existing components from web/src/components/ui/ before creating new ones
Web Dashboard Design System (MANDATORY): See web/CLAUDE.md for the full component inventory, design token rules, and post-training references (TS6, Storybook 10).

Files:

  • web/src/api/endpoints/quality.ts
  • web/src/pages/agents/QualityScoreOverride.stories.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
web/src/**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

Run ESLint with zero warnings enforced via npm --prefix web run lint

Line length: 88 characters (ruff) for JavaScript/TypeScript files

Files:

  • web/src/api/endpoints/quality.ts
  • web/src/pages/agents/QualityScoreOverride.stories.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
web/src/**/*.{ts,tsx,css}

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions — use design tokens and @/lib/motion presets

Files:

  • web/src/api/endpoints/quality.ts
  • web/src/pages/agents/QualityScoreOverride.stories.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
web/src/**/*.stories.tsx

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.stories.tsx: Use parameters.a11y.test: 'error' | 'todo' | 'off' for a11y testing in Storybook 10 to enforce WCAG compliance (replaces old .element and .manual)
Import from storybook/test (not @storybook/test) and storybook/actions (not @storybook/addon-actions) in Storybook 10

Files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
web/src/pages/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/pages/**/*.{ts,tsx}: Organize page-scoped sub-components in pages/<page-name>/ subdirectories rather than inline in page files
Use lazy-loaded page components in web/src/pages/ with React Router and route guards for auth/setup
Use @AnimatedPresence wrapper for page transitions in React Router (keyed by route)

Files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
  • web/src/pages/agents/QualityScoreOverride.tsx
🧠 Learnings (79)
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/communication/**/*.py : Communication package (communication/): message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution; meeting/ subpackage for meeting protocol (round-robin, position papers, structured phases), scheduler (frequency, participant resolver), orchestrator

Applied to files:

  • src/synthorg/api/lifecycle.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • src/synthorg/api/lifecycle.py
  • src/synthorg/api/app.py
  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Applied to files:

  • src/synthorg/api/lifecycle.py
  • src/synthorg/api/app.py
  • CLAUDE.md
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion

Applied to files:

  • src/synthorg/api/lifecycle.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/api/app.py
  • CLAUDE.md
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)

Applied to files:

  • src/synthorg/api/lifecycle.py
  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/api/app.py
  • CLAUDE.md
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to **/*.py : PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-01T17:49:14.133Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T17:49:14.133Z
Learning: Applies to src/synthorg/{providers,engine}/**/*.py : Retryable errors are `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`; non-retryable errors raise immediately; `RetryExhaustedError` signals all retries failed

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/api/app.py
📚 Learning: 2026-03-16T19:13:34.746Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:34.746Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately without retry. RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains.

Applied to files:

  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry.

Applied to files:

  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`

Applied to files:

  • src/synthorg/hr/performance/tracker.py
  • src/synthorg/api/app.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Non-retryable errors raise immediately without retry

Applied to files:

  • src/synthorg/hr/performance/tracker.py
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.stories.tsx : Storybook 10: import from `storybook/test` (not `storybook/test`), `storybook/actions` (not `storybook/addon-actions`)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-03T18:36:51.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-03T18:36:51.730Z
Learning: Applies to web/src/**/*.stories.tsx : Import from `storybook/test` (not `storybook/test`) and `storybook/actions` (not `storybook/addon-actions`) in Storybook 10

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-03T18:36:51.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-03T18:36:51.730Z
Learning: Applies to web/src/components/ui/**/*.stories.tsx : Create a `.stories.tsx` file alongside each new component with all states (default, hover, loading, error, empty)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/**/*.stories.{ts,tsx} : Storybook 10: Use storybook/test (not storybook/test) and storybook/actions (not storybook/addon-actions) import paths

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/components/ui/**/*.{ts,tsx} : Create new shared components in `web/src/components/ui/` with `.stories.tsx` Storybook file covering all states (default, hover, loading, error, empty)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.stories.tsx : Storybook 10: Use `parameters.a11y.test: 'error' | 'todo' | 'off'` for a11y testing configuration (replaces old `.element` and `.manual`)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.stories.tsx : Storybook 10: Import from `storybook/test` instead of `storybook/test`

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-03T18:36:51.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-03T18:36:51.730Z
Learning: Applies to web/src/**/*.stories.tsx : Use `parameters.a11y.test: 'error' | 'todo' | 'off'` for a11y testing in Storybook 10 to enforce WCAG compliance (replaces old `.element` and `.manual`)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/**/*.stories.{ts,tsx} : Storybook 10: Use parameters.a11y.test: 'error' | 'todo' | 'off' for a11y testing (replaces old .element and .manual); set globally in preview.tsx to enforce WCAG compliance on all stories

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/**/*.stories.{ts,tsx} : Storybook 10: Use parameters.backgrounds.options (object keyed by name) + initialGlobals.backgrounds.value for background options (replaces old default + values array)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-04-03T18:36:51.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-03T18:36:51.730Z
Learning: Applies to web/**/storybook/**/*.{ts,tsx} : Use type-safe Storybook config with `defineMain` from `storybook/react-vite/node` and `definePreview` from `storybook/react-vite`, including explicit `framework` field

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.stories.tsx
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • src/synthorg/api/app.py
  • CLAUDE.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/engine/coordination/**/*.py : Task coordination uses multi-agent pipeline with 4 dispatchers (SAS/centralized/decentralized/context-dependent), wave execution, and workspace lifecycle integration.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • src/synthorg/api/app.py
  • CLAUDE.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability includes structured logging via `get_logger(__name__)`, correlation tracking, and log sinks.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Persistence backend: pluggable PersistenceBackend protocol in `src/synthorg/persistence/`, SQLite initial, SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/persistence/**/*.py : Persistence uses pluggable PersistenceBackend protocol. SQLite is the initial backend. Settings use SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`. Retryable errors: `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code — it's handled by the base class.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from web/src/components/ui/ (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast/ToastContainer, Skeleton variants, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup/StaggerItem, Drawer, form fields, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor) before creating new components

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.tsx
📚 Learning: 2026-03-27T12:44:29.466Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-27T12:44:29.466Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from `web/src/components/ui/` (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup/StaggerItem) before creating new ones

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.{tsx,ts} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup, Drawer, InputField, SelectField, SliderField, ToggleField, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor, TagInput, MetadataGrid, ProjectStatusBadge, ContentTypeBadge)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.tsx
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to web/src/**/*.{ts,tsx} : Web Dashboard Design System (MANDATORY): See `web/CLAUDE.md` for the full component inventory, design token rules, and post-training references (TS6, Storybook 10).

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.tsx
  • CLAUDE.md
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.{ts,tsx} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones; refer to design system inventory (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, etc.)

Applied to files:

  • web/src/pages/agents/QualityScoreOverride.tsx
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

  • src/synthorg/hr/performance/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

  • src/synthorg/hr/performance/config.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.

Applied to files:

  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit

Applied to files:

  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-03-31T20:07:03.035Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T20:07:03.035Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising; all state transitions must log at INFO; DEBUG for object creation, internal flow, entry/exit of key functions

Applied to files:

  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising

Applied to files:

  • src/synthorg/hr/performance/llm_judge_quality_strategy.py
📚 Learning: 2026-03-31T14:28:28.895Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:28:28.895Z
Learning: Web dashboard: see `web/CLAUDE.md` for commands, design system, and component inventory

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-31T20:07:03.035Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T20:07:03.035Z
Learning: Applies to {docs/design/operations.md,src/synthorg/providers/presets.py,.claude/**/*.{md,yml,yaml}} : Vendor names may appear only in: (1) Operations design page, (2) `.claude/` skill/agent files, (3) third-party import paths, (4) provider presets (`src/synthorg/providers/presets.py`), (5) tests using `test-provider`, `test-small-001`, etc.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/api/**/*.py : API package (api/): Litestar REST + WebSocket with controllers, guards, channels, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint, provider management endpoint (CRUD + test + presets), backup endpoint, RFC 9457 structured errors, AppState hot-reload slots, service auto-wiring (Phase 1 at construction, Phase 2 on startup), lifecycle helpers

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-26T15:18:16.848Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:18:16.848Z
Learning: Applies to src/synthorg/api/**/*.py : Litestar API must include setup wizard, auth/, auto-wiring, and lifecycle management

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/api/**/*.py : REST API: Litestar framework, controllers with guards, channels for WebSocket, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint. RFC 9457 structured errors (ErrorCategory, ErrorCode, ErrorDetail, ProblemDetail, CATEGORY_TITLES, category_title, category_type_uri, content negotiation).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/api/**/*.py : Use Litestar for REST + WebSocket API. Controllers, guards, channels, JWT + API key + WS ticket auth, RFC 9457 structured errors.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO

Applied to files:

  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO level

Applied to files:

  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level

Applied to files:

  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Use structured logging: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`

Applied to files:

  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-04-02T12:07:44.443Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T12:07:44.443Z
Learning: Applies to src/synthorg/**/*.py : Always use structured logging: `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`

Applied to files:

  • src/synthorg/api/controllers/quality.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-01T09:39:21.584Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:39:21.584Z
Learning: Applies to **/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T18:52:05.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T18:52:05.142Z
Learning: Applies to **/*.py : Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/**/*.py : Async concurrency: prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
src/synthorg/hr/performance/composite_quality_strategy.py (2)

245-255: ⚠️ Potential issue | 🟠 Major

Don't swallow RetryExhaustedError here.

Line 247's broad fallback still converts retry exhaustion into None, so the upstream provider fallback chain never sees the sentinel. Add an explicit except RetryExhaustedError: raise before except Exception:.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/hr/performance/composite_quality_strategy.py` around lines 245 -
255, The broad except in the try/except block in composite_quality_strategy.py
currently catches RetryExhaustedError and returns None, hiding the sentinel; add
an explicit "except RetryExhaustedError: raise" (referencing the
RetryExhaustedError symbol) immediately before the generic "except Exception:"
that logs PERF_LLM_JUDGE_FAILED via logger.warning (keeping the existing
agent_id/task_id/exc_info fields) so retry exhaustion bubbles up to the upstream
fallback chain.

169-185: ⚠️ Potential issue | 🟠 Major

Keep CI disabled in the LLM-only branch.

Lines 179-185 still call self._ci_strategy.score() when ci_weight == 0.0. That lets a zero-weight layer change the final result and contradicts the "LLM-only mode" contract. Return a zero-confidence result or surface the LLM failure instead of falling back to CI here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/hr/performance/composite_quality_strategy.py` around lines 169 -
185, The LLM-only branch currently falls back to self._ci_strategy.score() even
when ci_weight == 0.0; change the _try_llm path so that if llm_result is None
you DO NOT call self._ci_strategy.score(), and instead return a deterministic
LLM-only fallback (e.g. a zero-confidence/safe failure result or re-surface the
LLM failure) from the method; update the logic around llm_result in the function
containing the call to _try_llm to early-return the
zero-confidence/result-on-failure rather than invoking _ci_strategy.score().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/api/app.py`:
- Around line 767-771: The call that builds the performance tracker currently
omits the app's PerformanceConfig so _build_performance_tracker falls back to a
default; update the call to pass the app's PerformanceConfig (e.g., perf_config
or root_config.performance_config) into _build_performance_tracker by adding the
perf_config argument alongside cost_tracker and provider_registry so configured
judge model/provider and custom CI/LLM weights are respected when constructing
performance_tracker.

---

Duplicate comments:
In `@src/synthorg/hr/performance/composite_quality_strategy.py`:
- Around line 245-255: The broad except in the try/except block in
composite_quality_strategy.py currently catches RetryExhaustedError and returns
None, hiding the sentinel; add an explicit "except RetryExhaustedError: raise"
(referencing the RetryExhaustedError symbol) immediately before the generic
"except Exception:" that logs PERF_LLM_JUDGE_FAILED via logger.warning (keeping
the existing agent_id/task_id/exc_info fields) so retry exhaustion bubbles up to
the upstream fallback chain.
- Around line 169-185: The LLM-only branch currently falls back to
self._ci_strategy.score() even when ci_weight == 0.0; change the _try_llm path
so that if llm_result is None you DO NOT call self._ci_strategy.score(), and
instead return a deterministic LLM-only fallback (e.g. a zero-confidence/safe
failure result or re-surface the LLM failure) from the method; update the logic
around llm_result in the function containing the call to _try_llm to
early-return the zero-confidence/result-on-failure rather than invoking
_ci_strategy.score().
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 415ed905-cd85-4a54-81a6-fbf70d106513

📥 Commits

Reviewing files that changed from the base of the PR and between 24946d7 and e348fb2.

📒 Files selected for processing (2)
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
PEP 758 except syntax: use except A, B: (no parentheses) — ruff enforces this on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)

Files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff)

Files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Models: use @computed_field for derived values instead of storing + validating redundant fields (e.g., TokenUsage.total_tokens)
Models: use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr) for handler construction, bootstrap, and error reporting code)
Variable name: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG for object crea...

Files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
🧠 Learnings (36)
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)

Applied to files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/engine/coordination/**/*.py : Task coordination uses multi-agent pipeline with 4 dispatchers (SAS/centralized/decentralized/context-dependent), wave execution, and workspace lifecycle integration.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability includes structured logging via `get_logger(__name__)`, correlation tracking, and log sinks.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Persistence backend: pluggable PersistenceBackend protocol in `src/synthorg/persistence/`, SQLite initial, SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`

Applied to files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/persistence/**/*.py : Persistence uses pluggable PersistenceBackend protocol. SQLite is the initial backend. Settings use SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-04-01T17:49:14.133Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T17:49:14.133Z
Learning: Applies to src/synthorg/{providers,engine}/**/*.py : Retryable errors are `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`; non-retryable errors raise immediately; `RetryExhaustedError` signals all retries failed

Applied to files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`. Retryable errors: `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code — it's handled by the base class.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to **/*.py : PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-01T09:39:21.584Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:39:21.584Z
Learning: Applies to **/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T18:52:05.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T18:52:05.142Z
Learning: Applies to **/*.py : Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/**/*.py : Async concurrency: prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T19:13:34.746Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:34.746Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately without retry. RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/**/*.py : Errors: handle explicitly, never silently swallow

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Non-retryable errors raise immediately without retry

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Handle errors explicitly, never silently swallow. Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Never implement retry logic in provider subclasses or calling code — it is handled automatically by `BaseCompletionProvider` with `RetryConfig` and `RateLimiterConfig` per-provider

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
🔇 Additional comments (2)
src/synthorg/hr/performance/composite_quality_strategy.py (1)

141-158: Good use of TaskGroup for the dual-layer branch.

This keeps CI and judge scoring in structured concurrency and matches the repo’s async pattern for new fan-out work. As per coding guidelines: "Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task."

src/synthorg/api/app.py (1)

387-397: Good shutdown coverage for PerformanceTracker.

Passing the tracker into both _safe_shutdown() paths ensures the new resource is closed during startup rollback and normal shutdown.

Also applies to: 429-439

Comment on lines +767 to +771
if performance_tracker is None:
performance_tracker = _build_performance_tracker(
cost_tracker=cost_tracker,
provider_registry=provider_registry,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Pass the app’s PerformanceConfig into _build_performance_tracker().

Because _build_performance_tracker() falls back to PerformanceConfig() at Lines 589-590, omitting perf_config here drops any non-default quality-scoring config from RootConfig on the main app path. Configured judge model/provider and custom CI/LLM weights are ignored unless a tracker is injected manually.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/api/app.py` around lines 767 - 771, The call that builds the
performance tracker currently omits the app's PerformanceConfig so
_build_performance_tracker falls back to a default; update the call to pass the
app's PerformanceConfig (e.g., perf_config or root_config.performance_config)
into _build_performance_tracker by adding the perf_config argument alongside
cost_tracker and provider_registry so configured judge model/provider and custom
CI/LLM weights are respected when constructing performance_tracker.

…nly fallback, TODO(#1061)

- Re-add RetryExhaustedError re-raise in _try_llm so provider
  outages surface to the engine fallback chain
- Wrap TaskGroup with except* to unwrap ExceptionGroup and
  re-raise bare RetryExhaustedError (PEP 654)
- LLM-only mode (ci_weight=0.0) returns zero-confidence fallback
  instead of silently using disabled CI layer
- Update test: RetryExhaustedError now raises instead of degrading
- Add TODO(#1061) for wiring PerformanceConfig from RootConfig
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/hr/performance/composite_quality_strategy.py`:
- Around line 175-192: The composite strategy returns inconsistent strategy_name
values in LLM-only mode: when _try_llm succeeds it returns the raw LLM result
with strategy_name "llm_judge", but on LLM failure it returns a fallback with
strategy_name self.name ("composite"). Update the LLM-success path in the async
method (where _try_llm is called) to normalize the returned QualityScoreResult’s
strategy_name to NotBlankStr(self.name) (or construct a new QualityScoreResult
using llm_result fields but strategy_name=NotBlankStr(self.name)) so both
success and fallback use the composite strategy name consistently.

In `@tests/unit/hr/performance/test_composite_quality_strategy.py`:
- Around line 227-263: Update the test_weight_combinations test to also assert
the returned strategy name so the behavior is documented and regressions are
caught: after calling composite.score in test_weight_combinations, add
assertions on result.strategy_name (referencing the CompositeQualityStrategy
instance named composite and the test function test_weight_combinations) — for
the (1.0, 0.0, ...) case assert the strategy_name is "ci_signal" (or "composite"
if that is the intended behavior) and for the (0.0, 1.0, ...) case assert it is
"llm_judge" (or adjust to "composite" if production code should return
composite), ensuring the expected strings match the intended semantics so the
parametrized matrix documents and verifies strategy_name as well as score.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7c9134ff-d70d-4ee7-b412-048429ce0eb3

📥 Commits

Reviewing files that changed from the base of the PR and between e348fb2 and 3ce80c2.

📒 Files selected for processing (3)
  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Web
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
PEP 758 except syntax: use except A, B: (no parentheses) — ruff enforces this on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)

Files:

  • src/synthorg/api/app.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff)

Files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Models: use @computed_field for derived values instead of storing + validating redundant fields (e.g., TokenUsage.total_tokens)
Models: use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr) for handler construction, bootstrap, and error reporting code)
Variable name: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG for object crea...

Files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Async: asyncio_mode = "auto" — no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml — do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. (vendor-agnostic naming)
Property-based testing: Python uses Hypothesis (@given + @settings). Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline).
Flaky tests: NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins.
For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use asyncio.Event().wait() instead of asyncio.sleep(large_number) — it is cancellation-safe and carries no timing assumptions.

Files:

  • tests/unit/hr/performance/test_composite_quality_strategy.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/hr/performance/test_composite_quality_strategy.py
🧠 Learnings (37)
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/engine/coordination/**/*.py : Task coordination uses multi-agent pipeline with 4 dispatchers (SAS/centralized/decentralized/context-dependent), wave execution, and workspace lifecycle integration.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability includes structured logging via `get_logger(__name__)`, correlation tracking, and log sinks.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Persistence backend: pluggable PersistenceBackend protocol in `src/synthorg/persistence/`, SQLite initial, SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`

Applied to files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/persistence/**/*.py : Persistence uses pluggable PersistenceBackend protocol. SQLite is the initial backend. Settings use SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-04-01T17:49:14.133Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T17:49:14.133Z
Learning: Applies to src/synthorg/{providers,engine}/**/*.py : Retryable errors are `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`; non-retryable errors raise immediately; `RetryExhaustedError` signals all retries failed

Applied to files:

  • src/synthorg/api/app.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`. Retryable errors: `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code — it's handled by the base class.

Applied to files:

  • src/synthorg/api/app.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to **/*.py : PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-01T09:39:21.584Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:39:21.584Z
Learning: Applies to **/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T18:52:05.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T18:52:05.142Z
Learning: Applies to **/*.py : Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/**/*.py : Async concurrency: prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T19:13:34.746Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:34.746Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately without retry. RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/**/*.py : Errors: handle explicitly, never silently swallow

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Non-retryable errors raise immediately without retry

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Never implement retry logic in provider subclasses or calling code — it is handled automatically by `BaseCompletionProvider` with `RetryConfig` and `RateLimiterConfig` per-provider

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Handle errors explicitly, never silently swallow. Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`. Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry. `RetryExhaustedError` signals that all retries failed — the engine layer catches this to trigger fallback chains. Rate limiter respects `RateLimitError.retry_after` from providers — automatically pauses future requests.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
🔇 Additional comments (14)
src/synthorg/api/app.py (4)

765-773: TODO acknowledges missing perf_config passthrough.

The _build_performance_tracker() call omits perf_config, causing it to fall back to PerformanceConfig() defaults (line 589). The TODO at lines 767-768 correctly tracks this as issue #1061. Once RootConfig gains a performance field, this should be wired through.


501-557: Robust LLM judge resolution with graceful degradation.

_resolve_llm_judge_strategy properly handles three failure modes: no judge model configured, named provider not registered, and no providers available. Each case logs at WARNING and returns None to disable the LLM layer gracefully. The fallback to available[0] when no explicit provider is specified is reasonable for single-provider deployments.


560-614: Well-structured performance tracker factory.

The factory always wires QualityOverrideStore (human overrides are inexpensive) and conditionally resolves the LLM judge based on config and provider availability. The composite strategy receives weights from PerformanceConfig, and the returned tracker is properly configured with all three components.


396-396: Shutdown correctly closes performance tracker.

Both error-path shutdown (line 396) and normal shutdown (line 438) pass app_state._performance_tracker to _safe_shutdown(), which calls aclose() to cancel pending background tasks per the lifecycle contract in src/synthorg/api/lifecycle.py:382-387.

Also applies to: 438-438

src/synthorg/hr/performance/composite_quality_strategy.py (6)

1-31: Module setup looks good.

Imports, logger setup, and module docstring are well-structured. The use of TYPE_CHECKING for type-only imports and event constants from the observability module follows project conventions.


34-109: Thorough constructor validation.

The __init__ validates weights (finite, non-negative, sum to 1.0) and discount parameters (finite, in [0.0, 1.0]) with clear error messages. The use of math.isfinite() correctly rejects NaN and Inf values.


140-195: Good implementation of parallel layer execution with skip logic.

The branching correctly skips zero-weight layers: both weights > 0 runs in parallel via TaskGroup, ci_weight > 0 only runs CI, and the else branch handles LLM-only mode. The except* RetryExhaustedError properly unwraps from ExceptionGroup per PEP 654 for engine fallback chain compatibility.


226-269: Correct error handling with RetryExhaustedError passthrough.

The separate except RetryExhaustedError: raise before the broad except Exception: ensures retry-exhaustion signals propagate to the engine's provider fallback chain while other failures degrade gracefully to CI-only scoring.


251-254: PEP 758 exception syntax is correct.

The except MemoryError, RecursionError: and except RetryExhaustedError: clauses use valid PEP 758 comma-separated syntax for Python 3.14 (no parentheses required). The AI summary note claiming this is invalid is incorrect.


271-322: Weighted combination logic is sound.

The _combine method correctly applies weighted averaging with clamping to [0.0, 10.0], uses min(ci, llm) confidence with discount for two-layer results, and applies a separate discount for CI-only fallback. Rounding to 4 decimal places is reasonable for floating-point score representation.

tests/unit/hr/performance/test_composite_quality_strategy.py (4)

1-50: Clean test setup with well-designed mock helpers.

The _make_strategy and _make_failing_strategy helpers create reusable mock QualityScoringStrategy instances with configurable scores and confidence levels. Using AsyncMock is appropriate for async protocol methods.


65-138: Comprehensive override behavior tests.

The tests cover the three key override scenarios: active override short-circuits scoring, expired override falls through to CI, and no override store skips the check entirely. Assertions verify both the returned result properties and that CI strategy was/wasn't called as expected.


344-370: RetryExhaustedError propagation test addresses prior feedback.

This test verifies that RetryExhaustedError from the LLM strategy bubbles up unwrapped rather than being caught and triggering CI-only fallback. This ensures the engine's provider fallback chain receives the signal correctly.


373-432: Thorough constructor validation tests.

The tests cover all validation paths: negative weights, NaN weights, weights not summing to 1.0, and invalid confidence discount parameters. Using pytest.raises with match patterns ensures error messages are verified.

@Aureliolo Aureliolo temporarily deployed to cloudflare-preview April 4, 2026 07:29 — with GitHub Actions Inactive
Repository owner deleted a comment from coderabbitai bot Apr 4, 2026
@Aureliolo
Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 4, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/hr/performance/composite_quality_strategy.py`:
- Around line 143-164: The TaskGroup branch currently only unwraps
RetryExhaustedError from the ExceptionGroup, leaving system errors wrapped;
update the exception handling around asyncio.TaskGroup so that ExceptionGroup
cases for MemoryError and RecursionError are also caught (similar to the
existing except* RetryExhaustedError block) and re-raise the bare exception
(eg.exceptions[0]) from the group—i.e., add except* MemoryError and except*
RecursionError handlers that unwrap and raise the contained exception so
_try_llm, TaskGroup, and the system-error propagation behavior match the
sequential code paths.

In `@tests/unit/hr/performance/test_composite_quality_strategy.py`:
- Around line 345-372: Update the test suite to also assert that built-in system
errors propagate unchanged: add assertions in TestRetryExhaustedPropagation
(e.g., in test_llm_retry_exhausted_raises or as new parameterized tests) that
when llm.score raises MemoryError or RecursionError the
CompositeQualityStrategy.score call (on CompositeQualityStrategy with
ci_strategy and llm_strategy setup) re-raises that same exception (use
pytest.raises(MemoryError) and pytest.raises(RecursionError)); reference the
AsyncMock llm.score side_effect and the CompositeQualityStrategy.score
invocation to implement these passthrough checks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2dea6a9d-aeea-45e6-bb2a-45543ab24246

📥 Commits

Reviewing files that changed from the base of the PR and between 3ce80c2 and 3353343.

📒 Files selected for processing (2)
  • src/synthorg/hr/performance/composite_quality_strategy.py
  • tests/unit/hr/performance/test_composite_quality_strategy.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
PEP 758 except syntax: use except A, B: (no parentheses) — ruff enforces this on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)

Files:

  • tests/unit/hr/performance/test_composite_quality_strategy.py
  • src/synthorg/hr/performance/composite_quality_strategy.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Async: asyncio_mode = "auto" — no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml — do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. (vendor-agnostic naming)
Property-based testing: Python uses Hypothesis (@given + @settings). Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline).
Flaky tests: NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins.
For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use asyncio.Event().wait() instead of asyncio.sleep(large_number) — it is cancellation-safe and carries no timing assumptions.

Files:

  • tests/unit/hr/performance/test_composite_quality_strategy.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/hr/performance/test_composite_quality_strategy.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff)

Files:

  • src/synthorg/hr/performance/composite_quality_strategy.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Models: use @computed_field for derived values instead of storing + validating redundant fields (e.g., TokenUsage.total_tokens)
Models: use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr) for handler construction, bootstrap, and error reporting code)
Variable name: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG for object crea...

Files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
🧠 Learnings (24)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to the PR's changes), fix them all. No deferring, no "out of scope" skipping.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:49:53.264Z
Learning: Fix everything valid — never skip when review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to the PR's changes). No deferring, no 'out of scope' skipping.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T05:46:17.064Z
Learning: Fix all valid issues found by review agents (including pre-existing issues in surrounding code, suggestions, and adjacent findings) — never skip or defer
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to **/*.py : PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-01T09:39:21.584Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:39:21.584Z
Learning: Applies to **/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T18:52:05.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T18:52:05.142Z
Learning: Applies to **/*.py : Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/**/*.py : Async concurrency: prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-01T17:49:14.133Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T17:49:14.133Z
Learning: Applies to src/synthorg/{providers,engine}/**/*.py : Retryable errors are `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`; non-retryable errors raise immediately; `RetryExhaustedError` signals all retries failed

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-16T19:13:34.746Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:34.746Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately without retry. RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/**/*.py : Errors: handle explicitly, never silently swallow

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Non-retryable errors raise immediately without retry

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Never implement retry logic in provider subclasses or calling code — it is handled automatically by `BaseCompletionProvider` with `RetryConfig` and `RateLimiterConfig` per-provider

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-04-03T21:18:15.006Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T21:18:15.006Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Handle errors explicitly, never silently swallow. Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`. Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry. `RetryExhaustedError` signals that all retries failed — the engine layer catches this to trigger fallback chains. Rate limiter respects `RateLimitError.retry_after` from providers — automatically pauses future requests.

Applied to files:

  • src/synthorg/hr/performance/composite_quality_strategy.py

@Aureliolo
Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 4, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Aureliolo Aureliolo merged commit 4a8adfe into main Apr 4, 2026
34 checks passed
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview April 4, 2026 08:15 — with GitHub Actions Inactive
@Aureliolo Aureliolo deleted the feat/quality-scoring-l2-l3 branch April 4, 2026 08:15
Aureliolo added a commit that referenced this pull request Apr 4, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.6.1](v0.6.0...v0.6.1)
(2026-04-04)


### Features

* capability-aware prompt profiles for model tier adaptation
([#1047](#1047))
([67650c5](67650c5)),
closes [#805](#805)
* implement procedural memory auto-generation from agent failures
([#1048](#1048))
([55f5206](55f5206)),
closes [#420](#420)
* implement quality scoring Layers 2+3 -- LLM judge and human override
([#1057](#1057))
([4a8adfe](4a8adfe)),
closes [#230](#230)
* token-based personality trimming via
PromptProfile.max_personality_tokens
([#1059](#1059))
([75afd52](75afd52)),
closes [#1045](#1045)
* workflow execution lifecycle + editor improvements
([#1058](#1058))
([7b54262](7b54262)),
closes [#1029](#1029)
[#1042](#1042)


### Refactoring

* **web:** address complexity and logging issues in dashboard
([#1056](#1056))
([ada997b](ada997b)),
closes [#1055](#1055)


### Documentation

* comprehensive documentation refresh
([#1050](#1050))
([c7a4259](c7a4259))


### Tests

* fix Hypothesis fuzzing infra and speed up slow unit tests
([#1044](#1044))
([1111602](1111602))


### Maintenance

* add text=auto catch-all to .gitattributes
([#1051](#1051))
([fc65d72](fc65d72))
* bump defu from 6.1.4 to 6.1.6 in /site
([#1062](#1062))
([f0cc439](f0cc439))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: implement quality scoring Layers 2+3 -- LLM judge and human override

2 participants