chore: backend hardening -- eviction/size-caps and model validation#911
chore: backend hardening -- eviction/size-caps and model validation#911
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
🧰 Additional context used📓 Path-based instructions (2)**/*.py📄 CodeRabbit inference engine (CLAUDE.md)
Files:
tests/**/*.py📄 CodeRabbit inference engine (CLAUDE.md)
Files:
🔇 Additional comments (6)
WalkthroughThis PR adds bounded-memory behavior and telemetry across in-memory stores. CostTracker and ProviderHealthTracker gain TTL-based pruning with a new 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Snapshot WarningsEnsure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice. Scanned FilesNone |
There was a problem hiding this comment.
Code Review
This pull request introduces memory management and record eviction strategies across several tracking services, including CostTracker, DelegationRecordStore, ProviderHealthTracker, and ToolInvocationTracker. It implements time-based pruning for budget and health records and FIFO eviction using deques for delegation and tool invocation records. Additionally, Pydantic models for tasks were updated to disallow infinity and NaN values. Review feedback suggests reducing code duplication by extracting common pruning logic into helper methods and improving testability by allowing reference times to be passed into snapshot methods.
| if len(self._records) > self._auto_prune_threshold: | ||
| cutoff = datetime.now(UTC) - timedelta( | ||
| hours=_COST_WINDOW_HOURS, | ||
| ) | ||
| before = len(self._records) | ||
| self._records = [r for r in self._records if r.timestamp >= cutoff] | ||
| pruned = before - len(self._records) | ||
| if pruned: | ||
| logger.info( | ||
| BUDGET_RECORDS_AUTO_PRUNED, | ||
| pruned=pruned, | ||
| remaining=len(self._records), | ||
| ) |
There was a problem hiding this comment.
The pruning logic here is nearly identical to the logic in the prune_expired method. This duplication can make future maintenance more difficult and error-prone.
To improve this, you could extract the common logic into a private, non-locking helper method. Both prune_expired and _snapshot could then call this helper from within their async with self._lock: blocks, keeping the code DRY (Don't Repeat Yourself).
For example:
def _prune_records_unlocked(self, cutoff: datetime) -> int:
"""Prunes records older than the cutoff. Assumes lock is held."""
before = len(self._records)
self._records = [r for r in self._records if r.timestamp >= cutoff]
return before - len(self._records)
src/synthorg/providers/health.py
Outdated
| async def _snapshot(self) -> tuple[ProviderHealthRecord, ...]: | ||
| """Return an immutable snapshot of all current records.""" | ||
| """Return an immutable snapshot of all current records. | ||
|
|
||
| When the record count exceeds the auto-prune threshold, | ||
| expired records are removed before the snapshot is taken. | ||
| """ | ||
| async with self._lock: | ||
| if len(self._records) > self._auto_prune_threshold: | ||
| cutoff = datetime.now(UTC) - timedelta( | ||
| hours=_HEALTH_WINDOW_HOURS, | ||
| ) | ||
| before = len(self._records) | ||
| self._records = [r for r in self._records if r.timestamp >= cutoff] | ||
| pruned = before - len(self._records) | ||
| if pruned: | ||
| logger.info( | ||
| PROVIDER_HEALTH_AUTO_PRUNED, | ||
| pruned=pruned, | ||
| remaining=len(self._records), | ||
| ) | ||
| return tuple(self._records) |
There was a problem hiding this comment.
This method can be improved for maintainability and testability:
- Code Duplication: The pruning logic is nearly identical to that in
prune_expired. This could be extracted into a private helper to avoid repetition. - Testability: The method uses
datetime.now(UTC)directly, making it hard to test deterministically. Since callers likeget_summaryalready accept anowparameter, it should be passed down to_snapshot.
Here is a suggestion that addresses the testability issue. After applying it, you could further refactor to address the code duplication.
async def _snapshot(self, *, now: datetime | None = None) -> tuple[ProviderHealthRecord, ...]:
"""Return an immutable snapshot of all current records.
When the record count exceeds the auto-prune threshold,
expired records are removed before the snapshot is taken.
"""
async with self._lock:
if len(self._records) > self._auto_prune_threshold:
ref = now or datetime.now(UTC)
cutoff = ref - timedelta(
hours=_HEALTH_WINDOW_HOURS,
)
before = len(self._records)
self._records = [r for r in self._records if r.timestamp >= cutoff]
pruned = before - len(self._records)
if pruned:
logger.info(
PROVIDER_HEALTH_AUTO_PRUNED,
pruned=pruned,
remaining=len(self._records),
)
return tuple(self._records)
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #911 +/- ##
==========================================
+ Coverage 92.15% 92.17% +0.02%
==========================================
Files 596 596
Lines 31438 31508 +70
Branches 3043 3055 +12
==========================================
+ Hits 28971 29043 +72
+ Misses 1945 1943 -2
Partials 522 522 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/budget/tracker.py`:
- Around line 93-99: Validate the auto_prune_threshold passed into the
constructor (auto_prune_threshold) so it cannot be zero or negative: in the
initializer where self._auto_prune_threshold is set, check that
auto_prune_threshold is an integer >= 1 and raise a ValueError (or TypeError if
not int) with a clear message; this prevents _snapshot() from performing an
unconditional or overly frequent prune scan and ensures the
_auto_prune_threshold invariant is enforced at the system boundary.
In `@src/synthorg/providers/health.py`:
- Around line 255-274: The auto-prune in _snapshot currently uses
datetime.now(UTC) which can evict records that callers asking for a different
reference time expect to keep; change _snapshot(self) -> tuple[...] to accept an
optional now: datetime parameter (defaulting to datetime.now(UTC)) and compute
cutoff = now - timedelta(hours=_HEALTH_WINDOW_HOURS) so pruning honors the
caller-supplied reference; update callers (e.g., get_summary and
get_all_summaries) to pass their `now` through to _snapshot, and ensure
_records, _auto_prune_threshold and the logging (PROVIDER_HEALTH_AUTO_PRUNED)
remain unchanged otherwise.
In `@src/synthorg/tools/invocation_tracker.py`:
- Around line 63-67: The current record() logic emits
logger.warning(TOOL_INVOCATION_EVICTED, ...) every time the deque self._records
is full, producing continuous warnings; modify record() to avoid repeated
warnings by either lowering severity or emitting only once: add a boolean flag
(e.g. self._eviction_warned) initialized False, set it True and call
logger.warning(TOOL_INVOCATION_EVICTED, max_records=...) only the first time
self._records reaches maxlen, and for subsequent evictions either log at
info/debug or skip logging entirely; ensure the flag is referenced and updated
inside the same scope where self._records and TOOL_INVOCATION_EVICTED are used
so behavior is consistent.
In `@tests/unit/communication/delegation/test_record_store.py`:
- Around line 222-228: Replace the two separate tests
test_max_records_zero_rejected and test_max_records_negative_rejected with a
single parametrized test that uses `@pytest.mark.parametrize` over the invalid
values (e.g., 0 and -1) and asserts that constructing
DelegationRecordStore(max_records=value) raises ValueError with the same match
"max_records must be >= 1"; keep the test name descriptive (e.g.,
test_max_records_invalid_rejected) and reuse the existing exception assertion
and message match.
In `@tests/unit/providers/test_health.py`:
- Around line 462-475: The test test_snapshot_no_prune_below_threshold currently
checks only calls_last_24h but not that expired records remain in memory; modify
the test to also assert the tracker's internal storage still contains both
entries after recording when auto_prune_threshold=10 — e.g. after two await
tracker.record(...) calls and before/after await tracker.get_summary(...), check
ProviderHealthTracker's internal list (the attribute holding records used by
record/get_summary, e.g., _records or whatever internal container is used) has
length 2 and that one of those entries has a timestamp older than 24h and the
other within 24h, while preserving the existing assertion that
summary.calls_last_24h == 1. Ensure you reference the same symbols:
test_snapshot_no_prune_below_threshold, ProviderHealthTracker, record,
get_summary, and calls_last_24h.
In `@tests/unit/tools/test_invocation_tracker.py`:
- Around line 140-146: Merge the two nearly identical tests
test_max_records_zero_rejected and test_max_records_negative_rejected into a
single parametrized test using pytest.mark.parametrize; call
ToolInvocationTracker(max_records=...) for each invalid value (e.g., 0 and -1)
and assert ValueError with match "max_records must be >= 1". Replace both
functions with one test function (e.g., test_max_records_invalid_rejected)
decorated with `@pytest.mark.parametrize`("invalid_max", [0, -1]) and keep the
with pytest.raises(...) block inside to exercise the ToolInvocationTracker
constructor.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d51c6aab-8733-4fd9-b8e3-adf8f61c2a3c
📒 Files selected for processing (14)
src/synthorg/budget/tracker.pysrc/synthorg/communication/delegation/record_store.pysrc/synthorg/core/task.pysrc/synthorg/observability/events/budget.pysrc/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/tool.pysrc/synthorg/providers/health.pysrc/synthorg/tools/invocation_tracker.pytests/unit/budget/test_tracker.pytests/unit/communication/delegation/test_record_store.pytests/unit/core/test_task.pytests/unit/providers/test_health.pytests/unit/tools/test_invocation_tracker.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
- GitHub Check: Build Sandbox
- GitHub Check: Build Web
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (2)
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Use PEP 758 except syntax:except A, B:(no parentheses) on Python 3.14. Do NOT usefrom __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
All public functions and classes must have type hints. Use mypy strict mode. Google-style docstrings are required on all public classes and functions (enforced by ruff D rules).
Every module with business logic must include:from synthorg.observability import get_loggerthenlogger = get_logger(__name__). Always use constants from domain-specific modules undersynthorg.observability.events(e.g.,API_REQUEST_STARTEDfromevents.api). Use structured kwargs:logger.info(EVENT, key=value)— neverlogger.info("msg %s", val). Never useimport loggingorprint()in application code.
Use Pydantic v2 (BaseModel,model_validator,computed_field,ConfigDict). Use@computed_fieldfor derived values instead of storing redundant fields. UseNotBlankStrfromcore.typesfor all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries,BaseTool), usecopy.deepcopy()at construction and wrap withMappingProxyTypefor read-only enforcement. Fordict/listfields in frozen Pydantic models, usecopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (viamodel_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over barecreate_task.
Keep functions under 50 lines and files under 800 lines. Handle errors exp...
Files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.pysrc/synthorg/core/task.pysrc/synthorg/providers/health.pysrc/synthorg/tools/invocation_tracker.pysrc/synthorg/communication/delegation/record_store.pysrc/synthorg/budget/tracker.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use markers@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e,@pytest.mark.slow. Maintain 80% minimum code coverage (enforced in CI). Always run pytest with-n autofor parallel execution. Use@pytest.mark.parametrizefor testing similar cases. Useasyncio_mode = "auto"(no manual@pytest.mark.asyncioneeded).
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in tests or project code—use generic names liketest-provider,test-small-001,example-provider,example-large-001,large/medium/small. Vendor names may only appear in: (1) Operations design page, (2).claude/files, (3) third-party import paths, (4) provider presets.
Never skip, dismiss, or ignore flaky tests—always fix them fundamentally. For timing-sensitive tests, mocktime.monotonic()andasyncio.sleep()instead of widening margins. For tasks that must block indefinitely until cancelled, useasyncio.Event().wait()instead ofasyncio.sleep(large_number).
Files:
tests/unit/core/test_task.pytests/unit/communication/delegation/test_record_store.pytests/unit/tools/test_invocation_tracker.pytests/unit/budget/test_tracker.pytests/unit/providers/test_health.py
🧠 Learnings (24)
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.
Applied to files:
src/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Applied to files:
src/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/provider.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.
Applied to files:
src/synthorg/observability/events/provider.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Applied to files:
src/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
Applied to files:
src/synthorg/observability/events/budget.pysrc/synthorg/budget/tracker.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget tracking includes pre-flight/in-flight checks, auto-downgrade, billing periods, cost tiers, quota/subscription. CFO includes anomaly detection, efficiency analysis, downgrade recommendations.
Applied to files:
src/synthorg/observability/events/budget.pysrc/synthorg/budget/tracker.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls
Applied to files:
src/synthorg/observability/events/budget.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Budget: Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError).
Applied to files:
src/synthorg/observability/events/budget.pysrc/synthorg/budget/tracker.py
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values instead of storing redundant fields. Use `NotBlankStr` from `core.types` for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 with adopted conventions: use computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Applied to files:
src/synthorg/core/task.py
🔇 Additional comments (6)
tests/unit/core/test_task.py (1)
371-378: Nice targeted regression coverage.This hits
inf,-inf, andnanwithout duplicating setup.tests/unit/communication/delegation/test_record_store.py (1)
194-221: Good eviction coverage for FIFO semantics.These tests clearly validate both overflow eviction order and non-eviction under capacity.
tests/unit/providers/test_health.py (1)
441-460: Threshold-triggered auto-prune test is solid.This case validates the intended prune path and expected 24h-call count after pruning.
tests/unit/tools/test_invocation_tracker.py (1)
116-139: Eviction behavior tests are well targeted.The assertions verify FIFO retention order and below-threshold non-eviction correctly.
tests/unit/budget/test_tracker.py (1)
464-527: Prune and auto-prune test coverage is comprehensive.These additions cover removal count semantics, empty/no-expiry behavior, and threshold-triggered snapshot pruning well.
src/synthorg/communication/delegation/record_store.py (1)
48-77: Bounded deque migration and eviction logging look correct.The constructor guard,
deque(maxlen=...)storage, and pre-append eviction warning are consistent with FIFO cap behavior.
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CLAUDE.md`:
- Line 321: Several ConfigDict declarations across the codebase are missing the
new convention allow_inf_nan=False; update every ConfigDict in the listed
modules (the remaining declarations in providers/models.py, tools/base.py,
settings/models.py, templates/schema.py) to include allow_inf_nan=False (Task
and AcceptanceCriterion are already updated), and if you don't want to change
them now create an issue to track updating these remaining ConfigDict
occurrences so they’re handled in a follow-up PR.
In `@src/synthorg/budget/tracker.py`:
- Around line 455-461: The fast-path in _prune_before assumes self._records[0]
is the oldest but record() simply appends, so the early-return can be incorrect
when inserts are out-of-order; fix by either (A) removing the fast-path and
always filtering self._records in _prune_before, or (B) ensure chronological
order by updating record() to insert/merge by timestamp (or sort self._records)
so the leading element is guaranteed oldest; reference functions/attributes:
_prune_before, record, and self._records and choose the approach that fits
expected insertion patterns.
In `@src/synthorg/observability/events/tool.py`:
- Line 64: The constant TOOL_INVOCATION_TIME_RANGE_INVALID uses an inconsistent
event name; update its value from "tool.invocation.time_range_invalid" to the
dotted convention "tool.invocation.time_range.invalid" so it matches other
events (domain.subject.qualifier) and avoids splitting observability
queries/alerts; ensure any downstream uses or tests referring to
TOOL_INVOCATION_TIME_RANGE_INVALID still work after the string change.
In `@src/synthorg/providers/health.py`:
- Around line 303-308: The fast-path in _prune_before assumes self._records is
in chronological order by checking self._records[0].timestamp; change this to a
defensive check that does not rely on ordering (e.g. compute the minimum
timestamp or use all(...) to determine if every record.timestamp >= cutoff)
before returning 0, then proceed with the existing list comprehension prune;
reference: _prune_before and self._records (compare behavior to
CostTracker._prune_before).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 833278b4-c906-4c43-ace9-c2034ca40a10
📒 Files selected for processing (14)
CLAUDE.mdsrc/synthorg/budget/tracker.pysrc/synthorg/communication/delegation/record_store.pysrc/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/tool.pysrc/synthorg/providers/health.pysrc/synthorg/tools/invocation_tracker.pytests/unit/api/controllers/test_setup.pytests/unit/budget/test_tracker.pytests/unit/communication/delegation/test_record_store.pytests/unit/core/test_task.pytests/unit/providers/test_health.pytests/unit/tools/test_invocation_tracker.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Build Sandbox
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Documentation must follow Markdown conventions: use consistent heading levels, link to relevant sections, include code examples where appropriate, and keep content up-to-date with implementation changes.
Files:
CLAUDE.md
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Use PEP 758 except syntax:except A, B:(no parentheses) on Python 3.14. Do NOT usefrom __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
All public functions and classes must have type hints. Use mypy strict mode. Google-style docstrings are required on all public classes and functions (enforced by ruff D rules).
Every module with business logic must include:from synthorg.observability import get_loggerthenlogger = get_logger(__name__). Always use constants from domain-specific modules undersynthorg.observability.events(e.g.,API_REQUEST_STARTEDfromevents.api). Use structured kwargs:logger.info(EVENT, key=value)— neverlogger.info("msg %s", val). Never useimport loggingorprint()in application code.
Use Pydantic v2 (BaseModel,model_validator,computed_field,ConfigDict). Use@computed_fieldfor derived values instead of storing redundant fields. UseNotBlankStrfromcore.typesfor all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries,BaseTool), usecopy.deepcopy()at construction and wrap withMappingProxyTypefor read-only enforcement. Fordict/listfields in frozen Pydantic models, usecopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (viamodel_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over barecreate_task.
Keep functions under 50 lines and files under 800 lines. Handle errors exp...
Files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/tool.pysrc/synthorg/communication/delegation/record_store.pysrc/synthorg/providers/health.pysrc/synthorg/tools/invocation_tracker.pysrc/synthorg/budget/tracker.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use markers@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e,@pytest.mark.slow. Maintain 80% minimum code coverage (enforced in CI). Always run pytest with-n autofor parallel execution. Use@pytest.mark.parametrizefor testing similar cases. Useasyncio_mode = "auto"(no manual@pytest.mark.asyncioneeded).
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in tests or project code—use generic names liketest-provider,test-small-001,example-provider,example-large-001,large/medium/small. Vendor names may only appear in: (1) Operations design page, (2).claude/files, (3) third-party import paths, (4) provider presets.
Never skip, dismiss, or ignore flaky tests—always fix them fundamentally. For timing-sensitive tests, mocktime.monotonic()andasyncio.sleep()instead of widening margins. For tasks that must block indefinitely until cancelled, useasyncio.Event().wait()instead ofasyncio.sleep(large_number).
Files:
tests/unit/tools/test_invocation_tracker.pytests/unit/api/controllers/test_setup.pytests/unit/core/test_task.pytests/unit/communication/delegation/test_record_store.pytests/unit/providers/test_health.pytests/unit/budget/test_tracker.py
🧠 Learnings (37)
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-17T11:41:02.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T11:41:02.964Z
Learning: Applies to src/**/*.py : Models: Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values instead of storing + validating redundant fields. Use `NotBlankStr` for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-26T15:18:16.848Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:18:16.848Z
Learning: Applies to src/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `computed_field` for derived values instead of storing redundant fields. Use `NotBlankStr` for all identifier/name fields.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 with adopted conventions: use computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values instead of storing redundant fields. Use `NotBlankStr` from `core.types` for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to **/*.py : Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use computed_field for derived values instead of storing + validating redundant fields. Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-16T10:40:25.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T10:40:25.144Z
Learning: Applies to **/*.py : Docstrings: Google style, required on public classes and functions (enforced by ruff D rules).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to src/synthorg/**/*.py : Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement. For `dict`/`list` fields in frozen Pydantic models, use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to src/synthorg/**/*.py : All public functions and classes must have type hints. Use mypy strict mode. Google-style docstrings are required on all public classes and functions (enforced by ruff D rules).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to **/*.py : Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-16T23:05:29.577Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T23:05:29.577Z
Learning: Applies to **/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to **/*.py : Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Applied to files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/tool.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/tool.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/tool.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/provider.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/provider.pysrc/synthorg/observability/events/tool.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.
Applied to files:
src/synthorg/observability/events/provider.pysrc/synthorg/providers/health.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Applied to files:
src/synthorg/observability/events/provider.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/provider.pysrc/synthorg/observability/events/tool.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.
Applied to files:
src/synthorg/observability/events/provider.pysrc/synthorg/observability/events/tool.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Prefer `pytest.mark.parametrize` for testing similar cases.
Applied to files:
tests/unit/tools/test_invocation_tracker.pytests/unit/communication/delegation/test_record_store.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Parametrize: Prefer pytest.mark.parametrize for testing similar cases.
Applied to files:
tests/unit/tools/test_invocation_tracker.pytests/unit/communication/delegation/test_record_store.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
tests/unit/api/controllers/test_setup.pytests/unit/budget/test_tracker.py
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to tests/**/*.py : Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in tests or project code—use generic names like `test-provider`, `test-small-001`, `example-provider`, `example-large-001`, `large`/`medium`/`small`. Vendor names may only appear in: (1) Operations design page, (2) `.claude/` files, (3) third-party import paths, (4) provider presets.
Applied to files:
tests/unit/api/controllers/test_setup.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.
Applied to files:
src/synthorg/tools/invocation_tracker.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit
Applied to files:
src/synthorg/tools/invocation_tracker.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.
Applied to files:
src/synthorg/tools/invocation_tracker.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget tracking includes pre-flight/in-flight checks, auto-downgrade, billing periods, cost tiers, quota/subscription. CFO includes anomaly detection, efficiency analysis, downgrade recommendations.
Applied to files:
src/synthorg/budget/tracker.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Budget: Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError).
Applied to files:
src/synthorg/budget/tracker.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
Applied to files:
src/synthorg/budget/tracker.py
🔇 Additional comments (22)
tests/unit/api/controllers/test_setup.py (1)
1076-1107: Deterministic randomize-name test is a solid hardening change.Using
monkeypatchto forcegenerate_auto_nameand asserting exact values in both POST and follow-up GET makes this test stable and meaningfully verifies persistence behavior.Based on learnings: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock
time.monotonic()andasyncio.sleep()to make them deterministic instead of widening timing margins.src/synthorg/observability/events/delegation.py (1)
31-32: Good event-surface extension for delegation tracking.These constants cleanly support the new eviction and invalid-range logging paths without introducing string literals downstream.
src/synthorg/communication/delegation/record_store.py (2)
52-82: Bounded FIFO storage and one-time eviction warning are implemented well.
max_recordsvalidation plusdeque(maxlen=...)keeps memory bounded while preserving insertion-order semantics.
189-195: Nice error-path observability on invalid time ranges.Logging
DELEGATION_TIME_RANGE_INVALIDimmediately before raising makes failures diagnosable without changing API behavior.tests/unit/communication/delegation/test_record_store.py (2)
198-244: Eviction boundary coverage is strong.The suite checks below/equal/over-capacity and
max_records=1, which directly validates FIFO retention behavior.
222-225: Good consolidation of invalid constructor cases.Using one parametrized test for
0and-1keeps the failure contract focused and reduces duplication.tests/unit/tools/test_invocation_tracker.py (2)
116-162: Eviction behavior tests are comprehensive and targeted.The cases validate FIFO ordering and edge behavior (
below,exact,over, andmax_records=1) for the new bounded storage.
140-143: Parametrized invalid-input coverage looks good.This keeps constructor rejection checks concise while preserving explicit value coverage.
src/synthorg/tools/invocation_tracker.py (2)
45-72: Tracker hardening implementation is solid.
max_recordsvalidation, bounded deque storage, and warning-on-first-eviction provide a clean cap without log spam.
124-136: Time-range validation helper is clean and observable.Centralizing validation with a structured warning before raising improves consistency and debuggability.
src/synthorg/budget/tracker.py (3)
97-116: LGTM!Constructor validation at lines 104-106 properly enforces
auto_prune_threshold >= 1at the system boundary, addressing the previous review feedback. The error message is clear and actionable.
142-164: LGTM!The
prune_expired()method is well-designed:
- Accepts optional
nowfor testability- Uses shared
_prune_before()helper- Logs with structured event constant
- Returns count for caller visibility
428-453: LGTM!The
_snapshot()method correctly:
- Accepts
nowparameter for consistent reference time- Only auto-prunes when threshold is exceeded (not at threshold)
- Logs auto-prune events separately from manual prune
- Returns immutable tuple snapshot
tests/unit/budget/test_tracker.py (2)
464-502: LGTM!Comprehensive test coverage for
prune_expired():
- Standard pruning case
- Empty tracker edge case
- No-expiry case
- Exact cutoff boundary verification (record at cutoff retained)
The tests use explicit timestamps avoiding timing-based flakiness.
508-572: LGTM!The
TestCostTrackerAutoEvictionclass thoroughly validates auto-pruning behavior:
- Threshold exceeded (12 records, threshold 10)
- Below threshold (no prune)
- Exactly at threshold (no prune - correct
>semantics)- All-expired edge case
- Invalid constructor values via parametrize
tests/unit/core/test_task.py (1)
371-378: LGTM!Good parametrized test covering all three non-finite float cases (
inf,-inf,nan) with clear test IDs. This validates theallow_inf_nan=FalseConfigDict setting on theTaskmodel.src/synthorg/observability/events/provider.py (1)
82-85: LGTM!New event constants follow the established naming pattern and provide proper observability for both manual and automatic pruning paths.
tests/unit/providers/test_health.py (2)
433-441: LGTM!The exact-cutoff boundary test correctly verifies that a record timestamped exactly at the 24-hour cutoff is retained (not pruned), confirming the
>=semantics.
447-527: LGTM!The
TestAutoEvictionclass provides comprehensive coverage that mirrors theCostTrackertests:
- Auto-prune when threshold exceeded
- No prune below threshold (with verification via
prune_expired)- No prune at exact threshold
- All-expired edge case
- Invalid constructor values
The assertion at lines 489-491 addresses the previous review feedback about strengthening the "no prune below threshold" test.
src/synthorg/providers/health.py (3)
57-57: LGTM!
ProviderHealthRecordcorrectly includesallow_inf_nan=Falsein itsConfigDict, aligning with the updated convention and ensuringresponse_time_msrejects non-finite values.
169-181: LGTM!Good additions:
__slots__improves memory efficiency for tracker instances- Constructor validation enforces
auto_prune_threshold >= 1- Pattern matches
CostTrackerfor consistency
276-301: LGTM!The
_snapshot()method correctly:
- Accepts
nowparameter (addresses previous review feedback)- Uses consistent reference time for cutoff calculation
- Auto-prunes only when
> threshold(not>=)- Logs with distinct event constant for auto-prune
| - **Immutability**: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement. For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence). | ||
| - **Config vs runtime state**: frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves (e.g. agent execution state, task progress). Never mix static config fields with mutable runtime fields in one model. | ||
| - **Models**: Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Adopted conventions: use `@computed_field` for derived values instead of storing + validating redundant fields (e.g. `TokenUsage.total_tokens`); use `NotBlankStr` (from `core.types`) for all identifier/name fields -- including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants -- instead of manual whitespace validators. | ||
| - **Models**: Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Adopted conventions: use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time; use `@computed_field` for derived values instead of storing + validating redundant fields (e.g. `TokenUsage.total_tokens`); use `NotBlankStr` (from `core.types`) for all identifier/name fields -- including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants -- instead of manual whitespace validators. |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
New allow_inf_nan=False convention is not applied consistently across codebase.
The convention update is well-documented, but the relevant code snippets indicate 18+ existing ConfigDict declarations that don't include allow_inf_nan=False:
src/synthorg/providers/models.py(9 declarations)src/synthorg/tools/base.py(1 declaration)src/synthorg/settings/models.py(3 declarations)src/synthorg/templates/schema.py(6 declarations)
This PR only updated Task and AcceptanceCriterion. Consider tracking the remaining models for a follow-up PR to ensure the codebase fully aligns with this convention.
Do you want me to open an issue to track updating the remaining ConfigDict declarations to include allow_inf_nan=False?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CLAUDE.md` at line 321, Several ConfigDict declarations across the codebase
are missing the new convention allow_inf_nan=False; update every ConfigDict in
the listed modules (the remaining declarations in providers/models.py,
tools/base.py, settings/models.py, templates/schema.py) to include
allow_inf_nan=False (Task and AcceptanceCriterion are already updated), and if
you don't want to change them now create an issue to track updating these
remaining ConfigDict occurrences so they’re handled in a follow-up PR.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/synthorg/providers/health.py (1)
237-242:⚠️ Potential issue | 🟠 MajorClamp historical summaries to the caller’s reference time.
Lines 241 and 268 only apply the lower bound (
timestamp >= cutoff). If a caller asks fornow=<historical datetime>, records newer than that reference time are still included, so the returned summary is not actually “as of”now. Add the missing upper bound in both filters.Proposed fix
recent = [ r for r in snapshot - if r.provider_name == provider_name and r.timestamp >= cutoff + if r.provider_name == provider_name + and cutoff <= r.timestamp <= ref ] @@ by_provider: dict[str, list[ProviderHealthRecord]] = defaultdict(list) for r in snapshot: - if r.timestamp >= cutoff: + if cutoff <= r.timestamp <= ref: by_provider[r.provider_name].append(r)Also applies to: 265-269
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/providers/health.py` around lines 237 - 242, The filtering for historical "as of" snapshots only applies the lower bound (r.timestamp >= cutoff) and must also clamp to the reference time; update both filters that build recent and recent_by_type after calling self._snapshot(now=ref) to include an upper bound using the reference variable (e.g., require cutoff <= r.timestamp <= ref or r.timestamp <= ref), ensuring you use the same ref/now passed to _snapshot and apply the <= ref check in the list comprehensions that reference provider_name and when computing max timestamps per type.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/budget/tracker.py`:
- Around line 61-62: The 7-day retention constant _COST_WINDOW_HOURS is
hard-coded and pruning logic tied to _AUTO_PRUNE_THRESHOLD discards older
records while build_summary() accepts arbitrary start/end ranges, causing silent
undercounting for month-to-date or longer queries; either make the retention
window configurable (expose _COST_WINDOW_HOURS as a configurable parameter and
use it wherever pruning and retention checks occur) or ensure retention aligns
with the longest supported reporting period, and if you intentionally keep 7-day
retention then modify build_summary() to validate the requested start/end
against the retention window and raise/reject queries that exceed it (update all
uses of _COST_WINDOW_HOURS and the pruning code that depends on it and ensure
budget_total_monthly / budget_used_percent calculations only run when data for
the full requested range is present).
---
Outside diff comments:
In `@src/synthorg/providers/health.py`:
- Around line 237-242: The filtering for historical "as of" snapshots only
applies the lower bound (r.timestamp >= cutoff) and must also clamp to the
reference time; update both filters that build recent and recent_by_type after
calling self._snapshot(now=ref) to include an upper bound using the reference
variable (e.g., require cutoff <= r.timestamp <= ref or r.timestamp <= ref),
ensuring you use the same ref/now passed to _snapshot and apply the <= ref check
in the list comprehensions that reference provider_name and when computing max
timestamps per type.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f17b93ef-0b1a-4cce-a440-475174e1affe
📒 Files selected for processing (3)
src/synthorg/budget/tracker.pysrc/synthorg/observability/events/tool.pysrc/synthorg/providers/health.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Test (Python 3.14)
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (1)
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Use PEP 758 except syntax:except A, B:(no parentheses) on Python 3.14. Do NOT usefrom __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
All public functions and classes must have type hints. Use mypy strict mode. Google-style docstrings are required on all public classes and functions (enforced by ruff D rules).
Every module with business logic must include:from synthorg.observability import get_loggerthenlogger = get_logger(__name__). Always use constants from domain-specific modules undersynthorg.observability.events(e.g.,API_REQUEST_STARTEDfromevents.api). Use structured kwargs:logger.info(EVENT, key=value)— neverlogger.info("msg %s", val). Never useimport loggingorprint()in application code.
Use Pydantic v2 (BaseModel,model_validator,computed_field,ConfigDict). Use@computed_fieldfor derived values instead of storing redundant fields. UseNotBlankStrfromcore.typesfor all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries,BaseTool), usecopy.deepcopy()at construction and wrap withMappingProxyTypefor read-only enforcement. Fordict/listfields in frozen Pydantic models, usecopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (viamodel_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over barecreate_task.
Keep functions under 50 lines and files under 800 lines. Handle errors exp...
Files:
src/synthorg/observability/events/tool.pysrc/synthorg/budget/tracker.pysrc/synthorg/providers/health.py
🧠 Learnings (14)
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget tracking includes pre-flight/in-flight checks, auto-downgrade, billing periods, cost tiers, quota/subscription. CFO includes anomaly detection, efficiency analysis, downgrade recommendations.
Applied to files:
src/synthorg/budget/tracker.pysrc/synthorg/providers/health.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
Applied to files:
src/synthorg/budget/tracker.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Budget: Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError).
Applied to files:
src/synthorg/budget/tracker.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.
Applied to files:
src/synthorg/providers/health.py
🔇 Additional comments (1)
src/synthorg/observability/events/tool.py (1)
63-64: Looks good — event constants are consistent and correctly named.Both new constants follow the existing
tool.*naming scheme, andTOOL_INVOCATION_TIME_RANGE_INVALIDnow matches the dotted convention (time_range.invalid).As per coding guidelines, event names should be defined as constants in domain-specific modules under
synthorg.observability.events.
| _COST_WINDOW_HOURS = 168 # 7 days | ||
| _AUTO_PRUNE_THRESHOLD = 100_000 |
There was a problem hiding this comment.
Seven-day eviction will undercount longer-range budget summaries.
Lines 155 and 445 permanently discard records older than 168 hours, but build_summary() still accepts arbitrary start/end ranges and computes budget_total_monthly / budget_used_percent from whatever survives. Once a deployment crosses the prune threshold, month-to-date and any longer-range summaries will silently undercount. Make the retention window configurable (or align it to the longest supported reporting period); if 7 days is intentional, reject older query ranges instead of returning partial data.
Also applies to: 75-81, 154-157, 443-446
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/budget/tracker.py` around lines 61 - 62, The 7-day retention
constant _COST_WINDOW_HOURS is hard-coded and pruning logic tied to
_AUTO_PRUNE_THRESHOLD discards older records while build_summary() accepts
arbitrary start/end ranges, causing silent undercounting for month-to-date or
longer queries; either make the retention window configurable (expose
_COST_WINDOW_HOURS as a configurable parameter and use it wherever pruning and
retention checks occur) or ensure retention aligns with the longest supported
reporting period, and if you intentionally keep 7-day retention then modify
build_summary() to validate the requested start/end against the retention window
and raise/reject queries that exceed it (update all uses of _COST_WINDOW_HOURS
and the pruning code that depends on it and ensure budget_total_monthly /
budget_used_percent calculations only run when data for the full requested range
is present).
…#844) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cordStore (#837) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pre-reviewed by 11 agents, 7 findings addressed: - Add logging to ProviderHealthTracker.prune_expired (parity with CostTracker) - Validate auto_prune_threshold >= 1 in both trackers - Log WARNING before raising ValueError in time-range validation - Add boundary tests (threshold exact, max_records=1) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…r crash The lambda mock was missing the `seed` kwarg that setup_agents.py passes, causing a TypeError that crashed the xdist worker silently. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… Gemini - Fix stale "append-only" docstrings across 4 source files (now documents TTL-based eviction and size-cap eviction respectively) - Add missing Args/Raises to CostTracker and ProviderHealthTracker docstrings - Extract _prune_before helper to DRY duplicated pruning logic - Thread `now` parameter to _snapshot() for deterministic auto-prune cutoffs - Add O(1) fast-path in _prune_before (skip scan when oldest record is fresh) - Fix eviction warning log spam: warn once per tracker lifetime, not per append - Add __slots__ to CostTracker, DelegationRecordStore, ToolInvocationTracker - Extract _validate_time_range helper in ToolInvocationTracker (consistent with CostTracker and DelegationRecordStore) - Update concurrency note in DelegationRecordStore to reference deque semantics - Document prune window durations in prune_expired docstrings (168h, 24h) - Document allow_inf_nan=False convention in CLAUDE.md Models section - Parametrize duplicate zero/negative validation tests across 4 test files - Add match= to inf/nan rejection test for precision - Add boundary tests: exact cutoff retention, all-records-expired, exact-max - Strengthen auto-prune test to verify identity of surviving records Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ProviderHealthTracker retains __slots__ (pre-existing, its tests use real instances). CostTracker, DelegationRecordStore, and ToolInvocationTracker cannot use __slots__ because downstream tests mock them with MagicMock which requires dynamic attribute assignment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove unsafe fast-path in _prune_before (CostTracker + ProviderHealthTracker) that assumed chronological insertion order; records appended via record() may have out-of-order timestamps - Fix event name "tool.invocation.time_range_invalid" to use dotted convention "tool.invocation.time_range.invalid" matching budget and delegation event patterns - Created #913 to track adding allow_inf_nan=False to remaining ~150 ConfigDict declarations across the codebase Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CostTracker.build_summary() now logs a warning when the requested date range extends beyond the 168-hour retention window, alerting operators that auto-pruned data may cause undercounting - ProviderHealthTracker.get_summary() and get_all_summaries() now clamp records to the reference time (cutoff <= timestamp <= ref), preventing records newer than the reference from leaking into historical queries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
b33f4fb to
e4649ce
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/api/controllers/test_setup.py (1)
479-485: 🧹 Nitpick | 🔵 TrivialRun the registry assertion on the
TestClientapp loop.Creating a fresh event loop here disconnects the assertion from the loop that owns
app_state.agent_registryand any loop-bound primitives it uses. That makes these tests brittle even though they pass today. Prefer an async test or the LitestarTestClientportal/helper that executes the coroutine on the application's loop.Also applies to: 633-639
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/api/controllers/test_setup.py` around lines 479 - 485, The test currently creates a fresh asyncio loop and runs app_state.agent_registry.agent_count() on it, which disconnects that call from the application's event loop and its loop-bound primitives; instead, execute the coroutine on the TestClient/app loop by using the Litestar TestClient helper (e.g., TestClient.run_sync or the TestClient portal) to run app_state.agent_registry.agent_count() so the call runs in the application's loop and you can remove asyncio.new_event_loop() / loop.run_until_complete() / loop.close() surrounding the agent_count call.
♻️ Duplicate comments (1)
src/synthorg/budget/tracker.py (1)
62-63:⚠️ Potential issue | 🟠 MajorThe 7-day retention window still returns silently partial summaries.
This only logs when
startpredates the retention cutoff, butbuild_summary()still returns normal totals andbudget_used_percentafterward. Once_snapshot()has pruned, any month-to-date or other long-range report becomes an undercount that callers cannot detect. Either reject ranges earlier than the retained window, or make the retention period configurable/aligned with the longest supported reporting period.Also applies to: 299-309
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/budget/tracker.py` around lines 62 - 63, The retention window constant _COST_WINDOW_HOURS (168) causes silent undercounts because _snapshot() prunes old data but build_summary() and budget_used_percent still return totals for ranges that start before the retention cutoff; change behavior so callers can detect this: either make the retention period configurable and used as the authoritative max-range, or validate range inputs (e.g., in build_summary() and any month-to-date/long-range report methods) and raise/return an explicit error when the requested start is earlier than now() - timedelta(hours=_COST_WINDOW_HOURS). Ensure _snapshot() and the validation use the same _COST_WINDOW_HOURS (or the new config) so they stay aligned and callers no longer receive silent partial summaries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/budget/tracker.py`:
- Around line 143-165: The prune_expired() routine is only invoked on read paths
so expired entries accumulate on write-heavy nodes; fix by adding opportunistic
or periodic pruning: update the record() method in the same budget tracker to
opportunistically call prune_expired() (or inline _prune_before(cutoff)) under
the existing self._lock whenever appending a record and the record count or age
exceeds a small threshold, and/or add a background prune loop that calls
prune_expired() on an interval (implement a _start_prune_loop coroutine and
start it from your service lifecycle bootstrap where services are started).
Ensure you use the existing methods (_prune_before, prune_expired) and locks to
avoid races and avoid blocking the write path by scheduling async background
calls where appropriate.
In `@src/synthorg/communication/delegation/record_store.py`:
- Around line 38-43: record_sync lacks protection against concurrent calls from
other OS threads and _snapshot iterates the deque without that protection; add a
dedicated threading lock (e.g. self._thread_lock = threading.Lock()) and use it
to guard deque mutations and snapshot iteration: acquire self._thread_lock in
record_sync before appending the DelegationRecord and in _snapshot while
copying/iterating the deque, keep the existing asyncio._lock for async-reader
coordination, and import threading; this provides thread-safety without changing
the async lock semantics.
In `@src/synthorg/tools/invocation_tracker.py`:
- Around line 50-52: Before raising the ValueError for invalid max_records in
invocation_tracker.py, emit a warning that includes the rejected value and
context; e.g., call the module/class logger (use the same logger used elsewhere
in this file) to logger.warning("Rejected max_records", extra={"max_records":
max_records, "reason": "must be >= 1"}) or similar structured message
immediately before raise ValueError(msg) so the invalid value and rationale are
recorded.
In `@tests/unit/providers/test_health.py`:
- Around line 453-472: The test currently only verifies get_summary's 24h
filtering but not that expired entries were actually evicted from the tracker's
storage; update the test for ProviderHealthTracker by adding an explicit call to
prune_expired(now=self._NOW) after summary = await tracker.get_summary(...) and
assert the prune_expired return (or the change in tracker._records length)
equals the expected number evicted (0 for >threshold auto-prune case and the
expected count at the exact-threshold boundary); apply the same additional
assertion pattern to the other test block around the 493-520 region to prove
pruning is a side effect, referencing ProviderHealthTracker, get_summary,
prune_expired, and the internal _records for locating the code.
---
Outside diff comments:
In `@tests/unit/api/controllers/test_setup.py`:
- Around line 479-485: The test currently creates a fresh asyncio loop and runs
app_state.agent_registry.agent_count() on it, which disconnects that call from
the application's event loop and its loop-bound primitives; instead, execute the
coroutine on the TestClient/app loop by using the Litestar TestClient helper
(e.g., TestClient.run_sync or the TestClient portal) to run
app_state.agent_registry.agent_count() so the call runs in the application's
loop and you can remove asyncio.new_event_loop() / loop.run_until_complete() /
loop.close() surrounding the agent_count call.
---
Duplicate comments:
In `@src/synthorg/budget/tracker.py`:
- Around line 62-63: The retention window constant _COST_WINDOW_HOURS (168)
causes silent undercounts because _snapshot() prunes old data but
build_summary() and budget_used_percent still return totals for ranges that
start before the retention cutoff; change behavior so callers can detect this:
either make the retention period configurable and used as the authoritative
max-range, or validate range inputs (e.g., in build_summary() and any
month-to-date/long-range report methods) and raise/return an explicit error when
the requested start is earlier than now() - timedelta(hours=_COST_WINDOW_HOURS).
Ensure _snapshot() and the validation use the same _COST_WINDOW_HOURS (or the
new config) so they stay aligned and callers no longer receive silent partial
summaries.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c340253f-1d44-4b9c-961c-818005f0aefa
📒 Files selected for processing (16)
CLAUDE.mdsrc/synthorg/budget/tracker.pysrc/synthorg/communication/delegation/record_store.pysrc/synthorg/core/task.pysrc/synthorg/observability/events/budget.pysrc/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/tool.pysrc/synthorg/providers/health.pysrc/synthorg/tools/invocation_tracker.pytests/unit/api/controllers/test_setup.pytests/unit/budget/test_tracker.pytests/unit/communication/delegation/test_record_store.pytests/unit/core/test_task.pytests/unit/providers/test_health.pytests/unit/tools/test_invocation_tracker.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Build Sandbox
- GitHub Check: Test (Python 3.14)
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Documentation must follow Markdown conventions: use consistent heading levels, link to relevant sections, include code examples where appropriate, and keep content up-to-date with implementation changes.
Files:
CLAUDE.md
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Use PEP 758 except syntax:except A, B:(no parentheses) on Python 3.14. Do NOT usefrom __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
All public functions and classes must have type hints. Use mypy strict mode. Google-style docstrings are required on all public classes and functions (enforced by ruff D rules).
Every module with business logic must include:from synthorg.observability import get_loggerthenlogger = get_logger(__name__). Always use constants from domain-specific modules undersynthorg.observability.events(e.g.,API_REQUEST_STARTEDfromevents.api). Use structured kwargs:logger.info(EVENT, key=value)— neverlogger.info("msg %s", val). Never useimport loggingorprint()in application code.
Use Pydantic v2 (BaseModel,model_validator,computed_field,ConfigDict). Use@computed_fieldfor derived values instead of storing redundant fields. UseNotBlankStrfromcore.typesfor all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries,BaseTool), usecopy.deepcopy()at construction and wrap withMappingProxyTypefor read-only enforcement. Fordict/listfields in frozen Pydantic models, usecopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (viamodel_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls). Prefer structured concurrency over barecreate_task.
Keep functions under 50 lines and files under 800 lines. Handle errors exp...
Files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/core/task.pysrc/synthorg/observability/events/budget.pysrc/synthorg/tools/invocation_tracker.pysrc/synthorg/communication/delegation/record_store.pysrc/synthorg/providers/health.pysrc/synthorg/budget/tracker.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use markers@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e,@pytest.mark.slow. Maintain 80% minimum code coverage (enforced in CI). Always run pytest with-n autofor parallel execution. Use@pytest.mark.parametrizefor testing similar cases. Useasyncio_mode = "auto"(no manual@pytest.mark.asyncioneeded).
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in tests or project code—use generic names liketest-provider,test-small-001,example-provider,example-large-001,large/medium/small. Vendor names may only appear in: (1) Operations design page, (2).claude/files, (3) third-party import paths, (4) provider presets.
Never skip, dismiss, or ignore flaky tests—always fix them fundamentally. For timing-sensitive tests, mocktime.monotonic()andasyncio.sleep()instead of widening margins. For tasks that must block indefinitely until cancelled, useasyncio.Event().wait()instead ofasyncio.sleep(large_number).
Files:
tests/unit/core/test_task.pytests/unit/tools/test_invocation_tracker.pytests/unit/communication/delegation/test_record_store.pytests/unit/providers/test_health.pytests/unit/budget/test_tracker.pytests/unit/api/controllers/test_setup.py
🧠 Learnings (44)
📚 Learning: 2026-03-17T11:41:02.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T11:41:02.964Z
Learning: Applies to src/**/*.py : Models: Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values instead of storing + validating redundant fields. Use `NotBlankStr` for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
CLAUDE.mdsrc/synthorg/core/task.py
📚 Learning: 2026-03-26T15:18:16.848Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:18:16.848Z
Learning: Applies to src/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `computed_field` for derived values instead of storing redundant fields. Use `NotBlankStr` for all identifier/name fields.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to **/*.py : Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use computed_field for derived values instead of storing + validating redundant fields. Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values instead of storing redundant fields. Use `NotBlankStr` from `core.types` for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Applied to files:
CLAUDE.mdsrc/synthorg/core/task.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 with adopted conventions: use computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`
Applied to files:
CLAUDE.mdsrc/synthorg/core/task.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.
Applied to files:
CLAUDE.mdsrc/synthorg/core/task.py
📚 Learning: 2026-03-16T10:40:25.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T10:40:25.144Z
Learning: Applies to **/*.py : Docstrings: Google style, required on public classes and functions (enforced by ruff D rules).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to src/synthorg/**/*.py : Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement. For `dict`/`list` fields in frozen Pydantic models, use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to src/synthorg/**/*.py : All public functions and classes must have type hints. Use mypy strict mode. Google-style docstrings are required on all public classes and functions (enforced by ruff D rules).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to **/*.py : Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
CLAUDE.mdsrc/synthorg/core/task.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-16T23:05:29.577Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T23:05:29.577Z
Learning: Applies to **/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to **/*.py : Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Applied to files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/delegation.pysrc/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.
Applied to files:
src/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals
Applied to files:
src/synthorg/observability/events/tool.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/tool.pysrc/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings
Applied to files:
src/synthorg/observability/events/tool.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls
Applied to files:
src/synthorg/observability/events/tool.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.
Applied to files:
src/synthorg/observability/events/provider.pysrc/synthorg/providers/health.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Applied to files:
src/synthorg/observability/events/provider.pysrc/synthorg/observability/events/budget.py
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Applied to files:
src/synthorg/core/task.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Prefer `pytest.mark.parametrize` for testing similar cases.
Applied to files:
tests/unit/tools/test_invocation_tracker.pytests/unit/communication/delegation/test_record_store.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Parametrize: Prefer pytest.mark.parametrize for testing similar cases.
Applied to files:
tests/unit/tools/test_invocation_tracker.pytests/unit/communication/delegation/test_record_store.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
Applied to files:
src/synthorg/observability/events/budget.pysrc/synthorg/budget/tracker.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget tracking includes pre-flight/in-flight checks, auto-downgrade, billing periods, cost tiers, quota/subscription. CFO includes anomaly detection, efficiency analysis, downgrade recommendations.
Applied to files:
src/synthorg/observability/events/budget.pysrc/synthorg/providers/health.pysrc/synthorg/budget/tracker.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Budget: Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError).
Applied to files:
src/synthorg/observability/events/budget.pysrc/synthorg/budget/tracker.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.
Applied to files:
src/synthorg/tools/invocation_tracker.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit
Applied to files:
src/synthorg/tools/invocation_tracker.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
tests/unit/api/controllers/test_setup.py
📚 Learning: 2026-03-28T21:29:12.755Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-28T21:29:12.755Z
Learning: Applies to tests/**/*.py : Use markers `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Maintain 80% minimum code coverage (enforced in CI). Always run pytest with `-n auto` for parallel execution. Use `pytest.mark.parametrize` for testing similar cases. Use `asyncio_mode = "auto"` (no manual `pytest.mark.asyncio` needed).
Applied to files:
tests/unit/api/controllers/test_setup.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
tests/unit/api/controllers/test_setup.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).
Applied to files:
tests/unit/api/controllers/test_setup.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/api/**/*.py : API package (api/): Litestar REST + WebSocket with controllers, guards, channels, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint, provider management endpoint (CRUD + test + presets), backup endpoint, RFC 9457 structured errors, AppState hot-reload slots, service auto-wiring (Phase 1 at construction, Phase 2 on startup), lifecycle helpers
Applied to files:
tests/unit/api/controllers/test_setup.py
🔇 Additional comments (9)
CLAUDE.md (1)
321-321: LGTM!The updated convention clearly documents the requirement for
allow_inf_nan=Falsein allConfigDictdeclarations. Per the commit messages, issue#913has been opened to track applying this convention to the remainingConfigDictdeclarations across the codebase.tests/unit/core/test_task.py (1)
371-378: LGTM!Well-structured parametrized test for the new
allow_inf_nan=Falsevalidation. The test correctly covers all three non-finite float edge cases (+inf,-inf,NaN) and the error message match aligns with Pydantic's validation output.src/synthorg/observability/events/provider.py (1)
82-85: LGTM!Event constants are correctly defined with proper dotted naming convention (
provider.health.pruned,provider.health.auto_pruned) matching the existing patterns in this module.Minor observation: The section comment uses
# -- ... --style while other sections in this file use the box-drawing characters (# ── ... ──). Consider aligning for visual consistency, though this is purely cosmetic.src/synthorg/observability/events/delegation.py (1)
31-32: LGTM!Event constants are correctly defined with proper dotted naming convention, consistent with the existing patterns in this module. The context snippet confirms these constants are properly imported and used by
record_store.py.src/synthorg/observability/events/tool.py (1)
63-64: LGTM!Event constants are correctly defined. The
TOOL_INVOCATION_TIME_RANGE_INVALIDconstant now uses the proper dotted naming convention (tool.invocation.time_range.invalid), consistent with other similar events likeDELEGATION_TIME_RANGE_INVALIDandBUDGET_TIME_RANGE_INVALID.src/synthorg/core/task.py (2)
34-34: LGTM!Adding
allow_inf_nan=FalsetoAcceptanceCriterionis defensive and aligns with the new codebase-wide convention, even though this model currently has no float fields.
79-79: LGTM!The
allow_inf_nan=Falseaddition correctly prevents non-finite values in thebudget_limitfield, which is the primary motivation for this change. This is properly covered by the new parametrized test intest_task.py.tests/unit/communication/delegation/test_record_store.py (1)
189-244: LGTM!Comprehensive test coverage for the new FIFO eviction behavior:
- Eviction when exceeding
max_records- No eviction below and at exact capacity
- Constructor validation for invalid
max_recordsvalues (properly parametrized per guidelines)- Edge case for
max_records=1The tests correctly verify the
deque(maxlen=N)behavior where the oldest records are automatically evicted on append when at capacity.src/synthorg/observability/events/budget.py (1)
37-41: LGTM!Event constants are correctly defined with proper dotted naming convention. The context snippet confirms these constants are properly imported and used by
tracker.pyfor the new TTL-based pruning and retention warning behaviors.
| async def prune_expired(self, *, now: datetime | None = None) -> int: | ||
| """Remove records older than the 168-hour (7-day) cost window. | ||
|
|
||
| Call periodically from long-running services to bound | ||
| memory growth. | ||
|
|
||
| Args: | ||
| now: Reference time. Defaults to current UTC time. | ||
|
|
||
| Returns: | ||
| Number of records removed. | ||
| """ | ||
| ref = now or datetime.now(UTC) | ||
| cutoff = ref - timedelta(hours=_COST_WINDOW_HOURS) | ||
| async with self._lock: | ||
| pruned = self._prune_before(cutoff) | ||
| if pruned: | ||
| logger.info( | ||
| BUDGET_RECORDS_PRUNED, | ||
| pruned=pruned, | ||
| remaining=len(self._records), | ||
| ) | ||
| return pruned |
There was a problem hiding this comment.
Expired records are only pruned on read paths.
record() never checks the TTL, _snapshot() only runs when somebody queries, and the current wiring in src/synthorg/api/auto_wire.py:156-167, src/synthorg/api/lifecycle.py:157-200, and src/synthorg/api/app.py:192-210 does not schedule prune_expired(). A write-heavy/read-light node will therefore keep accumulating expired entries indefinitely, which misses the memory-bounding goal of this change. Add a periodic prune loop or opportunistic pruning from record() once the threshold is crossed.
Also applies to: 439-464
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/budget/tracker.py` around lines 143 - 165, The prune_expired()
routine is only invoked on read paths so expired entries accumulate on
write-heavy nodes; fix by adding opportunistic or periodic pruning: update the
record() method in the same budget tracker to opportunistically call
prune_expired() (or inline _prune_before(cutoff)) under the existing self._lock
whenever appending a record and the record count or age exceeds a small
threshold, and/or add a background prune loop that calls prune_expired() on an
interval (implement a _start_prune_loop coroutine and start it from your service
lifecycle bootstrap where services are started). Ensure you use the existing
methods (_prune_before, prune_expired) and locks to avoid races and avoid
blocking the write path by scheduling async background calls where appropriate.
| Concurrency note: ``record_sync`` does not acquire ``_lock``. | ||
| The lock serialises concurrent async readers only. Append-only | ||
| semantics and cooperative asyncio scheduling make single-call | ||
| sync writes safe (``list.append`` cannot be interrupted). | ||
| The lock serialises concurrent async readers only. Cooperative | ||
| asyncio scheduling and deque's internal maxlen enforcement make | ||
| single-call sync writes safe (``deque.append`` cannot be | ||
| interrupted). The eviction warning is best-effort under this | ||
| model. |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n src/synthorg/communication/delegation/record_store.py | head -100Repository: Aureliolo/synthorg
Length of output: 4014
🏁 Script executed:
cat -n src/synthorg/communication/delegation/record_store.py | tail -80Repository: Aureliolo/synthorg
Length of output: 3192
🏁 Script executed:
rg "record_sync" --type py -B 2 -A 2Repository: Aureliolo/synthorg
Length of output: 12305
🏁 Script executed:
sed -n '1,50p' src/synthorg/communication/delegation/service.py | head -60Repository: Aureliolo/synthorg
Length of output: 1643
🏁 Script executed:
sed -n '40,150p' src/synthorg/communication/delegation/service.pyRepository: Aureliolo/synthorg
Length of output: 3669
🏁 Script executed:
sed -n '130,200p' src/synthorg/communication/delegation/service.pyRepository: Aureliolo/synthorg
Length of output: 2485
🏁 Script executed:
rg "\.delegate\(" --type py -B 3 -A 1 | head -60Repository: Aureliolo/synthorg
Length of output: 4975
🏁 Script executed:
rg "DelegationService\(" --type py -B 2 -A 5 | head -80Repository: Aureliolo/synthorg
Length of output: 2717
🏁 Script executed:
rg "service\.delegate|\.delegate\(" --type py src/ | head -30Repository: Aureliolo/synthorg
Length of output: 44
🏁 Script executed:
rg "delegate\(" --type py src/ -A 2 -B 2Repository: Aureliolo/synthorg
Length of output: 430
🏁 Script executed:
rg "\.delegate\(" src/ --type py -B 5 -A 3Repository: Aureliolo/synthorg
Length of output: 44
🏁 Script executed:
fd -e py src/ | xargs grep -l "DelegationService" | head -5Repository: Aureliolo/synthorg
Length of output: 383
🏁 Script executed:
fd -e py . src/ | xargs grep -l "DelegationService" 2>/dev/null | head -5Repository: Aureliolo/synthorg
Length of output: 257
🏁 Script executed:
cat -n src/synthorg/communication/conflict_resolution/service.py | head -150Repository: Aureliolo/synthorg
Length of output: 5684
🏁 Script executed:
rg "to_thread|ThreadPoolExecutor|threading\." src/synthorg/communication/delegation/ --type pyRepository: Aureliolo/synthorg
Length of output: 44
🏁 Script executed:
rg "async def.*delegate" src/ --type py -A 10 | head -50Repository: Aureliolo/synthorg
Length of output: 1837
Clarify the design constraint on record_sync() or add enforcement mechanism.
record_sync() indeed lacks synchronization with _snapshot(): an asyncio.Lock only coordinates coroutines on the event loop, not separate OS threads. If record_sync() is called from a different thread while _snapshot() iterates the deque, concurrent mutation can cause RuntimeError or a torn snapshot.
The current docstring states the method is "safe to call from sync code under asyncio cooperative scheduling," which implicitly constrains usage to the event loop thread. However, this constraint is undocumented in the signature and unenforced at runtime. Either:
- Add a thread-safe lock shared by both
record_sync()and_snapshot(), or - Explicitly document the event-loop-thread-only requirement and (optionally) add a runtime check to prevent off-thread calls.
Context: lines 38–43 and 66–88
Concurrency note: ``record_sync`` does not acquire ``_lock``.
The lock serialises concurrent async readers only. Cooperative
asyncio scheduling and deque's internal maxlen enforcement make
single-call sync writes safe (``deque.append`` cannot be
interrupted). The eviction warning is best-effort under this
model.
def record_sync(self, delegation: DelegationRecord) -> None:
"""Append a delegation record (sync, for cooperative scheduling).
Safe to call from sync code under asyncio cooperative scheduling
since a plain deque append cannot be interrupted by another
coroutine.
...
"""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/communication/delegation/record_store.py` around lines 38 - 43,
record_sync lacks protection against concurrent calls from other OS threads and
_snapshot iterates the deque without that protection; add a dedicated threading
lock (e.g. self._thread_lock = threading.Lock()) and use it to guard deque
mutations and snapshot iteration: acquire self._thread_lock in record_sync
before appending the DelegationRecord and in _snapshot while copying/iterating
the deque, keep the existing asyncio._lock for async-reader coordination, and
import threading; this provides thread-safety without changing the async lock
semantics.
| if max_records < 1: | ||
| msg = f"max_records must be >= 1, got {max_records}" | ||
| raise ValueError(msg) |
There was a problem hiding this comment.
Log rejected max_records values before raising.
This new constructor error path is silent today, so bad retention config can fail without any structured warning in logs. Please emit a warning with the rejected value before raising ValueError.
Based on learnings, "All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/tools/invocation_tracker.py` around lines 50 - 52, Before
raising the ValueError for invalid max_records in invocation_tracker.py, emit a
warning that includes the rejected value and context; e.g., call the
module/class logger (use the same logger used elsewhere in this file) to
logger.warning("Rejected max_records", extra={"max_records": max_records,
"reason": "must be >= 1"}) or similar structured message immediately before
raise ValueError(msg) so the invalid value and rationale are recorded.
Add prune_expired() assertions after get_summary() calls to prove auto-prune actually evicted expired records (removed==0) and that no-prune paths preserved expired records (removed==5). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
tests/unit/providers/test_health.py (1)
517-527:⚠️ Potential issue | 🟡 MinorAssert auto-prune side effect in the all-expired case.
At Line 525-526,
calls_last_24h == 0can still pass via filtering alone. Add a post-summaryprune_expiredassertion to prove_snapshot()actually evicted records whenlen(_records) > auto_prune_threshold.♻️ Proposed test hardening
summary = await tracker.get_summary("test-provider", now=self._NOW) assert summary.calls_last_24h == 0 + removed = await tracker.prune_expired(now=self._NOW) + assert removed == 0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/providers/test_health.py` around lines 517 - 527, The test test_snapshot_all_records_expired must assert the auto-prune side-effect: after creating 6 expired records on a ProviderHealthTracker(auto_prune_threshold=5) and calling get_summary, verify that pruning actually occurred by asserting the tracker’s internal storage was reduced—either call the tracker.prune_expired() (or the public pruning API) and assert tracker._records is empty (or len(tracker._records) <= auto_prune_threshold) to prove _snapshot() evicted records when len(_records) > auto_prune_threshold.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@tests/unit/providers/test_health.py`:
- Around line 517-527: The test test_snapshot_all_records_expired must assert
the auto-prune side-effect: after creating 6 expired records on a
ProviderHealthTracker(auto_prune_threshold=5) and calling get_summary, verify
that pruning actually occurred by asserting the tracker’s internal storage was
reduced—either call the tracker.prune_expired() (or the public pruning API) and
assert tracker._records is empty (or len(tracker._records) <=
auto_prune_threshold) to prove _snapshot() evicted records when len(_records) >
auto_prune_threshold.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: e1588bba-7f03-4f98-9683-796ed0bb5814
📒 Files selected for processing (1)
tests/unit/providers/test_health.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web
- GitHub Check: Build Sandbox
- GitHub Check: Build Backend
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (1)
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use markers@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e,@pytest.mark.slow. Maintain 80% minimum code coverage (enforced in CI). Always run pytest with-n autofor parallel execution. Use@pytest.mark.parametrizefor testing similar cases. Useasyncio_mode = "auto"(no manual@pytest.mark.asyncioneeded).
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in tests or project code—use generic names liketest-provider,test-small-001,example-provider,example-large-001,large/medium/small. Vendor names may only appear in: (1) Operations design page, (2).claude/files, (3) third-party import paths, (4) provider presets.
Never skip, dismiss, or ignore flaky tests—always fix them fundamentally. For timing-sensitive tests, mocktime.monotonic()andasyncio.sleep()instead of widening margins. For tasks that must block indefinitely until cancelled, useasyncio.Event().wait()instead ofasyncio.sleep(large_number).
Files:
tests/unit/providers/test_health.py
🔇 Additional comments (1)
tests/unit/providers/test_health.py (1)
453-515: Good coverage improvements for threshold/boundary behavior.The new tests around threshold exceeded vs. below/exact threshold and invalid
auto_prune_thresholdvalues are solid and materially improve regression protection.Also applies to: 528-534
Add prune_expired() assertion to test_snapshot_all_records_expired to prove auto-prune evicted all expired records (removed==0), consistent with the other two auto-eviction tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🤖 I have created a release *beep* *boop* --- #MAJOR CHANGES; We got a somewhat working webui :) ## [0.5.0](v0.4.9...v0.5.0) (2026-03-30) ### Features * add analytics trends and budget forecast API endpoints ([#798](#798)) ([16b61f5](16b61f5)) * add department policies to default templates ([#852](#852)) ([7a41548](7a41548)) * add remaining activity event types (task_started, tool_used, delegation, cost_incurred) ([#832](#832)) ([4252fac](4252fac)) * agent performance, activity, and history API endpoints ([#811](#811)) ([9b75c1d](9b75c1d)) * Agent Profiles and Detail pages (biography, career, performance) ([#874](#874)) ([62d7880](62d7880)) * app shell, Storybook, and CI/CD pipeline ([#819](#819)) ([d4dde90](d4dde90)) * Approvals page with risk grouping, urgency indicators, batch actions ([#889](#889)) ([4e9673d](4e9673d)) * Budget Panel page (P&L dashboard, breakdown charts, forecast) ([#890](#890)) ([b63b0f1](b63b0f1)) * build infrastructure layer (API client, auth, WebSocket) ([#815](#815)) ([9f01d3e](9f01d3e)) * CLI global options infrastructure, UI modes, exit codes, env vars ([#891](#891)) ([fef4fc5](fef4fc5)) * CodeMirror editor and theme preferences toggle ([#905](#905), [#807](#807)) ([#909](#909)) ([41fbedc](41fbedc)) * Company page (department/agent management) ([#888](#888)) ([cfb88b0](cfb88b0)) * comprehensive hint coverage across all CLI commands ([#900](#900)) ([937974e](937974e)) * config system extensions, per-command flags for init/start/stop/status/logs ([#895](#895)) ([32f83fe](32f83fe)) * configurable currency system replacing hardcoded USD ([#854](#854)) ([b372551](b372551)) * Dashboard page (metric cards, activity feed, budget burn) ([#861](#861)) ([7d519d5](7d519d5)) * department health, provider status, and activity feed endpoints ([#818](#818)) ([6d5f196](6d5f196)) * design tokens and core UI components ([#833](#833)) ([ed887f2](ed887f2)) * extend approval, meeting, and budget API responses ([#834](#834)) ([31472bf](31472bf)) * frontend polish -- real-time UX, accessibility, responsive, performance ([#790](#790), [#792](#792), [#791](#791), [#793](#793)) ([#917](#917)) ([f04a537](f04a537)) * implement human roles and access control levels ([#856](#856)) ([d6d8a06](d6d8a06)) * implement semantic conflict detection in workspace merge ([#860](#860)) ([d97283b](d97283b)) * interaction components and animation patterns ([#853](#853)) ([82d4b01](82d4b01)) * Login page + first-run bootstrap + Company page ([#789](#789), [#888](#888)) ([#896](#896)) ([8758e8d](8758e8d)) * Meetings page with timeline viz, token bars, contribution formatting ([#788](#788)) ([#904](#904)) ([b207f46](b207f46)) * Messages page with threading, channel badges, sender indicators ([#787](#787)) ([#903](#903)) ([28293ad](28293ad)) * Org Chart force-directed view and drag-drop reassignment ([#872](#872), [#873](#873)) ([#912](#912)) ([a68a938](a68a938)) * Org Chart page (living nodes, status, CRUD, department health) ([#870](#870)) ([0acbdae](0acbdae)) * per-command flags for remaining commands, auto-behavior wiring, help/discoverability ([#897](#897)) ([3f7afa2](3f7afa2)) * Providers page with backend rework -- health, CRUD, subscription auth ([#893](#893)) ([9f8dd98](9f8dd98)) * scaffold React + Vite + TypeScript + Tailwind project ([#799](#799)) ([bd151aa](bd151aa)) * Settings page with search, dependency indicators, grouped rendering ([#784](#784)) ([#902](#902)) ([a7b9870](a7b9870)) * Setup Wizard rebuild with template comparison, cost estimator, theme customization ([#879](#879)) ([ae8b50b](ae8b50b)) * setup wizard UX -- template filters, card metadata, provider form reuse ([#910](#910)) ([7f04676](7f04676)) * setup wizard UX overhaul -- mode choice, step reorder, provider fixes ([#907](#907)) ([ee964c4](ee964c4)) * structured ModelRequirement in template agent configs ([#795](#795)) ([7433548](7433548)) * Task Board page (rich Kanban, filtering, dependency viz) ([#871](#871)) ([04a19b0](04a19b0)) ### Bug Fixes * align frontend types with backend and debounce WS refetches ([#916](#916)) ([134c11b](134c11b)) * auto-cleanup targets newly pulled images instead of old ones ([#884](#884)) ([50e6591](50e6591)) * correct wipe backup-skip flow and harden error handling ([#808](#808)) ([c05860f](c05860f)) * improve provider setup in wizard, subscription auth, dashboard bugs ([#914](#914)) ([87bf8e6](87bf8e6)) * improve update channel detection and add config get command ([#814](#814)) ([6b137f0](6b137f0)) * resolve all ESLint warnings, add zero-warnings enforcement ([#899](#899)) ([079b46a](079b46a)) * subscription auth uses api_key, base URL optional for cloud providers ([#915](#915)) ([f0098dd](f0098dd)) ### Refactoring * semantic analyzer cleanup -- shared filtering, concurrency, extraction ([#908](#908)) ([81372bf](81372bf)) ### Documentation * brand identity and UX design system from [#765](#765) exploration ([#804](#804)) ([389a9f4](389a9f4)) * page structure and information architecture for v0.5.0 dashboard ([#809](#809)) ([f8d6d4a](f8d6d4a)) * write UX design guidelines with WCAG-verified color system ([#816](#816)) ([4a4594e](4a4594e)) ### Tests * add unit tests for agent hooks and page components ([#875](#875)) ([#901](#901)) ([1d81546](1d81546)) ### CI/CD * bump actions/deploy-pages from 4.0.5 to 5.0.0 in the major group ([#831](#831)) ([01c19de](01c19de)) * bump astral-sh/setup-uv from 7.6.0 to 8.0.0 in /.github/actions/setup-python-uv in the all group ([#920](#920)) ([5f6ba54](5f6ba54)) * bump codecov/codecov-action from 5.5.3 to 6.0.0 in the major group ([#868](#868)) ([f22a181](f22a181)) * bump github/codeql-action from 4.34.1 to 4.35.0 in the all group ([#883](#883)) ([87a4890](87a4890)) * bump sigstore/cosign-installer from 4.1.0 to 4.1.1 in the minor-and-patch group ([#830](#830)) ([7a69050](7a69050)) * bump the all group with 3 updates ([#923](#923)) ([ff27c8e](ff27c8e)) * bump wrangler from 4.76.0 to 4.77.0 in /.github in the minor-and-patch group ([#822](#822)) ([07d43eb](07d43eb)) * bump wrangler from 4.77.0 to 4.78.0 in /.github in the all group ([#882](#882)) ([f84118d](f84118d)) ### Maintenance * add design system enforcement hook and component inventory ([#846](#846)) ([15abc43](15abc43)) * add dev-only auth bypass for frontend testing ([#885](#885)) ([6cdcd8a](6cdcd8a)) * add pre-push rebase check hook ([#855](#855)) ([b637a04](b637a04)) * backend hardening -- eviction/size-caps and model validation ([#911](#911)) ([81253d9](81253d9)) * bump axios from 1.13.6 to 1.14.0 in /web in the all group across 1 directory ([#922](#922)) ([b1b0232](b1b0232)) * bump brace-expansion from 5.0.4 to 5.0.5 in /web ([#862](#862)) ([ba4a565](ba4a565)) * bump eslint-plugin-react-refresh from 0.4.26 to 0.5.2 in /web ([#801](#801)) ([7574bb5](7574bb5)) * bump faker from 40.11.0 to 40.11.1 in the minor-and-patch group ([#803](#803)) ([14d322e](14d322e)) * bump https://github.com/astral-sh/ruff-pre-commit from v0.15.7 to 0.15.8 ([#864](#864)) ([f52901e](f52901e)) * bump nginxinc/nginx-unprivileged from `6582a34` to `f99cc61` in /docker/web in the all group ([#919](#919)) ([df85e4f](df85e4f)) * bump nginxinc/nginx-unprivileged from `ccbac1a` to `6582a34` in /docker/web ([#800](#800)) ([f4e9450](f4e9450)) * bump node from `44bcbf4` to `71be405` in /docker/sandbox ([#827](#827)) ([91bec67](91bec67)) * bump node from `5209bca` to `cf38e1f` in /docker/web ([#863](#863)) ([66d6043](66d6043)) * bump picomatch in /site ([#842](#842)) ([5f20bcc](5f20bcc)) * bump recharts 2->3 and @types/node 22->25 in /web ([#802](#802)) ([a908800](a908800)) * Bump requests from 2.32.5 to 2.33.0 ([#843](#843)) ([41daf69](41daf69)) * bump smol-toml from 1.6.0 to 1.6.1 in /site ([#826](#826)) ([3e5dbe4](3e5dbe4)) * bump the all group with 3 updates ([#921](#921)) ([7bace0b](7bace0b)) * bump the minor-and-patch group across 1 directory with 2 updates ([#829](#829)) ([93e611f](93e611f)) * bump the minor-and-patch group across 1 directory with 3 updates ([#841](#841)) ([7010c8e](7010c8e)) * bump the minor-and-patch group across 1 directory with 3 updates ([#869](#869)) ([548cee5](548cee5)) * bump the minor-and-patch group in /site with 2 updates ([#865](#865)) ([9558101](9558101)) * bump the minor-and-patch group with 2 updates ([#867](#867)) ([4830706](4830706)) * consolidate Dependabot groups to 1 PR per ecosystem ([06d2556](06d2556)) * consolidate Dependabot groups to 1 PR per ecosystem ([#881](#881)) ([06d2556](06d2556)) * improve worktree skill with full dep sync and status enhancements ([#906](#906)) ([772c625](772c625)) * remove Vue remnants and document framework decision ([#851](#851)) ([bf2adf6](bf2adf6)) * update web dependencies and fix brace-expansion CVE ([#880](#880)) ([a7a0ed6](a7a0ed6)) * upgrade to Storybook 10 and TypeScript 6 ([#845](#845)) ([52d95f2](52d95f2)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…ns (#943) ## Summary - Add `allow_inf_nan=False` to all 260 remaining `ConfigDict` declarations across 92 files in `src/synthorg/` - All 421 `ConfigDict` declarations in the codebase now consistently reject `NaN`/`Inf` in numeric fields at validation time - Defensive application to models without numeric fields guards against future float field additions ## Details PR #911 established the `allow_inf_nan=False` convention and applied it to `Task` and `AcceptanceCriterion`. This PR completes the rollout to all remaining models across: api/, budget/, communication/, config/, core/, engine/, hr/, memory/ (already done), observability/, persistence/, providers/, security/, settings/, templates/, and tools/. Four patterns were updated: - `ConfigDict(frozen=True)` (240 declarations) - `ConfigDict(frozen=True, extra="forbid")` (18 declarations) - `ConfigDict(frozen=True, arbitrary_types_allowed=True)` (1 declaration) - `ConfigDict(frozen=True, populate_by_name=True)` (1 declaration) ## Test plan - [x] `uv run ruff check src/ tests/` -- all checks passed - [x] `uv run ruff format src/ tests/` -- all files already formatted - [x] `uv run mypy src/ tests/` -- no issues in 1,250 files - [x] `uv run python -m pytest tests/ -n auto` -- 11,259 passed, 9 skipped - [x] Pre-commit and pre-push hooks passed Closes #913 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
allow_inf_nan=FalsetoTaskandAcceptanceCriterionConfigDicts, with parametrized inf/NaN rejection testsProviderHealthTracker._snapshot()andCostTracker(newprune_expired()+ auto-prune in_snapshot()when records exceed configurable threshold, default 100k)ToolInvocationTrackerandDelegationRecordStorefrom unboundedlisttodeque(maxlen=max_records)with FIFO eviction (default 10k), matching theAuditLogpatternCloses #844
Closes #817
Closes #837
Test plan
uv run python -m pytest tests/ -n auto-- 10978 passeduv run ruff check src/ tests/-- all cleanuv run mypy src/ tests/-- no issues in 1233 files🤖 Generated with Claude Code