refactor: memory pipeline improvements (#1075, #997) #1091
D16: SynthesisOutcome StrEnum replaces bare bool in _synthesize return. D17: _select_entries returns tuple instead of list for immutability. D18: MemoryToolName StrEnum narrows DistillationRequest field type. D20: Defensive asyncio.CancelledError re-raise before except Exception. D21: Final-round sufficiency check in _reformulation_loop. D22: Remove misleading MEMORY_RETRIEVAL_START log from constructor. Closes items D16-D22 of #1075.
Replace loose __init__ kwargs and module-level constants in LLMConsolidationStrategy with a frozen LLMConsolidationConfig model. Aligns with the convention used by DualModeConfig, RetentionConfig. All 9 tuning knobs (group_threshold, temperature, max_summary_tokens, include_distillation_context, max_trajectory_context_entries, max_trajectory_chars_per_entry, max_entry_input_chars, max_total_user_content_chars, fallback_truncate_length) are now validated by Pydantic with range constraints. Reworks _make_strategy test helper to accept **config_overrides, satisfying C12 (fixture consolidation). Closes items A2 and C12 of #1075.
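The shape of the change can be sketched with a stdlib frozen dataclass standing in for the Pydantic model. This is an illustration under stated assumptions only: the class name, the default values, and the numeric ranges below are hypothetical, and only four of the nine knobs are shown. The cross-field rule (`max_entry_input_chars` must not exceed `max_total_user_content_chars`) follows the validator described later in this PR.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Any


@dataclass(frozen=True)
class ConsolidationConfigSketch:
    """Stdlib stand-in for a frozen config model; defaults and ranges are illustrative."""

    group_threshold: int = 3
    temperature: float = 0.3
    max_entry_input_chars: int = 2_000
    max_total_user_content_chars: int = 20_000

    def __post_init__(self) -> None:
        # Range constraints are checked once, at construction time.
        if self.group_threshold < 2:
            raise ValueError("group_threshold must be >= 2")
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be within [0.0, 2.0]")
        # Cross-field rule: one entry may not exceed the total prompt budget.
        if self.max_entry_input_chars > self.max_total_user_content_chars:
            raise ValueError(
                "max_entry_input_chars must be <= max_total_user_content_chars"
            )


def make_config(**config_overrides: Any) -> ConsolidationConfigSketch:
    """Test-helper shape: defaults plus keyword overrides."""
    return ConsolidationConfigSketch(**config_overrides)
```

A test that needs a single non-default knob can call `make_config(temperature=0.7)` without restating the other defaults, which is the fixture-consolidation benefit the description refers to.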
Add candidate_pool_multiplier to MemoryRetrievalConfig (default 3, range 1-10). When diversity_penalty_enabled is True, the pipeline over-fetches max_memories * candidate_pool_multiplier entries from the backend, runs filter + MMR, then truncates to max_memories. This allows MMR to promote diverse candidates that would otherwise fall below the top-K cutoff. Extract _compute_pool_limit() helper to keep _execute_pipeline under the 50-line limit (B8). Closes items A1 and B8 of #1075.
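To illustrate why over-fetching matters, here is a small sketch of the over-fetch, MMR, truncate sequence. The function names, the `lam` weight, and the word-level Jaccard similarity are assumptions made for this example (the pipeline itself uses a bigram Jaccard helper and its own scoring), so treat it as a toy model rather than the actual implementation.

```python
from collections.abc import Callable, Sequence


def compute_pool_limit(max_memories: int, multiplier: int, diversity_enabled: bool) -> int:
    """Over-fetch only when the diversity re-ranking step will run."""
    return max_memories * multiplier if diversity_enabled else max_memories


def word_jaccard(a: str, b: str) -> float:
    """Simplified similarity: Jaccard overlap of whitespace-separated words."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def mmr_select(
    candidates: Sequence[tuple[str, float]],
    similarity: Callable[[str, str], float],
    k: int,
    lam: float = 0.7,
) -> list[str]:
    """Greedy MMR: trade relevance against similarity to already-picked items."""
    remaining = list(candidates)
    picked: list[str] = []
    while remaining and len(picked) < k:

        def mmr_score(item: tuple[str, float]) -> float:
            text, relevance = item
            redundancy = max((similarity(text, p) for p in picked), default=0.0)
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        picked.append(best[0])
        remaining.remove(best)
    return picked


# Pool of 3 fetched for max_memories=2: the near-duplicate second entry loses
# its slot to a less relevant but more diverse candidate.
pool = [
    ("redis cache ttl", 0.90),
    ("redis cache ttl config", 0.85),
    ("postgres index tuning", 0.60),
]
selected = mmr_select(pool, word_jaccard, k=2)
```

With a pool limited to the top two entries, MMR would have nothing to promote; the third candidate only becomes selectable because the backend was asked for more than `max_memories`.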
Add FusionStrategy | None discriminator field to ScoredMemory so consumers can distinguish LINEAR vs RRF producers without checking for dead-field sentinels (recency_score == 0.0). Set to LINEAR in _score_entry and RRF in _build_rrf_scored_memories. Default None preserves backward compatibility. Closes item A4 of #1075.
Add EMBEDDING category to LLMCallCategory for distinguishing embedding API calls from LLM completion calls. Add EmbeddingCostConfig to Mem0BackendConfig with per-model pricing and a chars-per-token heuristic for token estimation. Instrument Mem0MemoryBackend.store() and retrieve() to record CostRecord entries with call_category=EMBEDDING after each successful operation. Cost tracking is opt-in (enabled=False by default) and best-effort (failures log but never propagate). Extend CategoryBreakdown with embedding_cost/tokens/count fields. Include embedding tokens in orchestration ratio denominator. Add BUDGET_EMBEDDING_COST_RECORDED observability event. Closes #997 (backend portion).
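The opt-in, best-effort behaviour described above might look roughly like the following. This is a synchronous sketch under several assumptions: the function and parameter names are hypothetical, pricing is assumed to be USD per 1,000 tokens, and unknown models are assumed to be skipped rather than priced at zero. The real adapter method is awaited and emits structured events instead of plain log messages.

```python
import logging
from typing import Any

logger = logging.getLogger("embedding_cost_sketch")


def record_embedding_cost(
    tracker: Any,
    *,
    enabled: bool,
    model: str,
    model_pricing: dict[str, float],  # assumed unit: USD per 1,000 tokens
    content_length: int,
    chars_per_token: int = 4,
) -> bool:
    """Record an estimated embedding cost; return True only if a record was written."""
    if not enabled or tracker is None:
        return False  # opt-in: nothing happens unless explicitly enabled
    if model not in model_pricing:
        logger.debug("embedding model %r not in pricing map", model)
        return False  # assumption: unknown models are skipped
    # Chars-per-token heuristic, with a floor of one token.
    input_tokens = max(1, content_length // chars_per_token)
    cost_usd = input_tokens / 1000 * model_pricing[model]
    try:
        tracker.record(
            {
                "call_category": "embedding",
                "model": model,
                "input_tokens": input_tokens,
                "cost_usd": cost_usd,
            }
        )
    except (MemoryError, RecursionError):
        raise  # system-level errors must still propagate
    except Exception:
        logger.warning("embedding cost recording failed", exc_info=True)
        return False  # best-effort: the surrounding store/retrieve still succeeds
    return True
```

Returning a boolean instead of raising keeps the memory operation's result independent of the cost tracker's health, which is the point of the best-effort contract.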
B6: Extract _try_capture_distillation and _try_procedural_memory from agent_engine.py (1953 lines) into engine/post_execution/memory_hooks.py. Engine delegates to the extracted functions. B7: Move TestHybridSearchPipeline from test_retriever.py (809 lines) to test_retriever_hybrid.py, reducing the main file under 800 lines. Closes items B6 and B7 of #1075.
C10: Add spec= to AsyncMock() calls across memory test files for interface-change detection. C11: Replace call_args[0][N] indexing with .args/.kwargs access. C13: Promote _merge_results to merge_results (public) in tool_retriever.py and _bigram_jaccard to bigram_jaccard in ranking.py. C15: Add structlog.testing.capture_logs() assertions on distillation error paths verifying DISTILLATION_CAPTURE_FAILED event is logged. Closes items C10, C11, C13, and C15 of #1075.
Add 'embedding' to CostRecord.call_category union type. Add embedding bucket to computeCategoryBreakdown in budget utils. Add Embedding row to CategoryBreakdown component with bg-accent-dim color from the design system. Closes #997 (dashboard portion).
- Fix MemoryError/RecursionError/CancelledError propagation in _record_embedding_cost and try_procedural_memory
- Add BUDGET_EMBEDDING_COST_FAILED event constant for failure path
- Remove dead _estimate_embedding_tokens method
- Add spec=MemoryBackend to test_retriever_hybrid.py mocks
- Add DEBUG skip logs to try_procedural_memory
- Wrap try_capture_distillation validation in try/except
- Add cross-field validator to LLMConsolidationConfig
- Reject negative model_pricing in EmbeddingCostConfig
- Log when embedding model not in pricing map
- Fix ScoredMemory description (no backward compat language)
- Use three-valued sufficiency in reformulation final check
- Fix stale _bigram_jaccard comment reference

Pre-reviewed by 5 agents, 13 findings addressed.
No actionable comments were generated in the recent review. 🎉
Walkthrough
This PR adds an EMBEDDING call category and end-to-end embedding cost tracking: new enum member, Mem0 backend wiring to estimate and persist embedding costs, and a frozen EmbeddingCostConfig. Budget analytics were extended with embedding fields and totals. It extracts post-execution memory hooks into synthorg.engine.post_execution, introduces LLMConsolidationConfig and SynthesisOutcome, makes consolidation use immutable tuples, adds candidate-pool overfetching (candidate_pool_multiplier) in retrieval, promotes/renames several helpers to public, adds observability events, and updates tests and frontend budget UI to surface embedding.
Dependency Review
✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
Snapshot Warnings
Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue.
Scanned Files
None
Code Review
This pull request introduces embedding cost tracking, refactors post-execution hooks into a dedicated module, and enhances memory consolidation and retrieval configurations. Specifically, it adds an EMBEDDING call category to the budget system, implements cost estimation for Mem0 embeddings, and encapsulates LLM consolidation parameters into a structured configuration model. However, several critical syntax errors were identified in exception handling blocks across the new memory_hooks.py and the Mem0 adapter, where multiple exceptions are being caught without the required tuple syntax for Python 3, which will cause runtime failures.
        task_id=_nb.validate_python(task_id),
        backend=memory_backend,
    )
except MemoryError, RecursionError:

        memory_backend=memory_backend,
        config=procedural_memory_config,
    )
except MemoryError, RecursionError:

        cost_usd=record.cost_usd,
        model=model,
    )
except builtins.MemoryError, RecursionError:
Pull request overview
Refactors and hardens the memory consolidation/retrieval pipeline (follow-ups from #1071 / #1075) and adds end-to-end embedding cost visibility in the budget system (#997), including backend instrumentation and dashboard/category breakdown support.
Changes:
- Introduces new/stronger configuration and typing for consolidation + retrieval (e.g., `LLMConsolidationConfig`, `SynthesisOutcome`, `MemoryToolName`, immutable tuples, scoring provenance).
- Improves retrieval behavior/observability (MMR candidate pool over-fetch + truncation, final-round sufficiency check + new event, additional `CancelledError` re-raises).
- Adds embedding cost tracking to budget backend + analytics and exposes the new “Embedding” category in the web dashboard and tests.
Reviewed changes
Copilot reviewed 38 out of 39 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| web/src/utils/budget.ts | Adds embedding bucket to category ratio + breakdown computation. |
| web/src/pages/budget/CategoryBreakdown.tsx | Renders embedding category in stacked bar/legend and updates empty-state logic. |
| web/src/pages/budget/CategoryBreakdown.stories.tsx | Updates Storybook fixtures to include embedding category. |
| web/src/pages/budget/BudgetPage.stories.tsx | Adds embedding CostRecord example for dashboard stories. |
| web/src/api/types.ts | Extends CostRecord.call_category union to include 'embedding'. |
| web/src/__tests__/utils/budget.test.ts | Updates unit assertions for embedding bucket. |
| web/src/__tests__/utils/budget.property.test.ts | Extends property-based tests to include embedding category. |
| web/src/__tests__/pages/budget/CategoryBreakdown.test.tsx | Updates UI tests for embedding bar + legend expectations. |
| tests/unit/memory/test_tool_retriever.py | Uses MemoryBackend-spec’d mocks; switches to public merge_results; updates sufficiency-check count. |
| tests/unit/memory/test_retriever.py | Tightens mocks with specs; removes hybrid tests after file split. |
| tests/unit/memory/test_retriever_hybrid.py | New file containing hybrid (RRF) pipeline tests moved out of test_retriever.py. |
| tests/unit/memory/test_ranking.py | Updates pointer comment to bigram_jaccard public API. |
| tests/unit/memory/test_ranking_diversity.py | Switches from private _bigram_jaccard to public bigram_jaccard. |
| tests/unit/memory/test_memory_tools.py | Tightens backend mock with spec=MemoryBackend; avoids positional call-arg indexing. |
| tests/unit/memory/consolidation/test_llm_strategy.py | Updates strategy construction to use LLMConsolidationConfig + spec’d mocks; updates assertions. |
| tests/unit/memory/consolidation/test_distillation.py | Uses MemoryToolName, spec’d MemoryBackend, and adds log assertions on error paths. |
| tests/unit/memory/consolidation/test_config.py | Adds coverage for LLMConsolidationConfig defaults/validation/frozen behavior. |
| tests/unit/engine/test_agent_engine_distillation.py | Tightens mocks with spec=CompletionProvider/MemoryBackend; reduces brittle call-arg assertions. |
| tests/unit/engine/post_execution/test_memory_hooks.py | New tests for extracted post-execution hooks. |
| tests/unit/engine/post_execution/__init__.py | New test package marker for post-execution tests. |
| tests/unit/budget/test_call_category.py | Updates enum tests for new EMBEDDING category. |
| src/synthorg/observability/events/memory.py | Adds MEMORY_REFORMULATION_FINAL_CHECK event constant. |
| src/synthorg/observability/events/budget.py | Adds embedding cost tracking event constants. |
| src/synthorg/memory/tool_retriever.py | Promotes merge_results; adds final sufficiency check event; re-raises CancelledError in exception blocks. |
| src/synthorg/memory/retriever.py | Adds candidate pool over-fetch via _compute_pool_limit() and truncates after MMR. |
| src/synthorg/memory/retrieval_config.py | Adds candidate_pool_multiplier + consistency validator. |
| src/synthorg/memory/ranking.py | Adds ScoredMemory.scoring_strategy discriminator; promotes bigram_jaccard public API. |
| src/synthorg/memory/consolidation/simple_strategy.py | Makes _select_entries/summary inputs immutable tuples. |
| src/synthorg/memory/consolidation/llm_strategy.py | Introduces LLMConsolidationConfig usage + SynthesisOutcome; immutability; CancelledError handling. |
| src/synthorg/memory/consolidation/distillation.py | Adds MemoryToolName enum and narrows memory_tool_invocations; CancelledError handling. |
| src/synthorg/memory/consolidation/config.py | Adds frozen LLMConsolidationConfig model + cross-field validation. |
| src/synthorg/memory/consolidation/__init__.py | Re-exports new config/types (LLMConsolidationConfig, MemoryToolName, SynthesisOutcome). |
| src/synthorg/memory/backends/mem0/config.py | Adds EmbeddingCostConfig and wires it into Mem0BackendConfig. |
| src/synthorg/memory/backends/mem0/adapter.py | Records embedding cost estimates on store/retrieve when enabled; emits budget events. |
| src/synthorg/engine/post_execution/memory_hooks.py | New extracted hook helpers for distillation capture + procedural memory pipeline. |
| src/synthorg/engine/post_execution/__init__.py | Exposes post-execution hook helpers via package exports. |
| src/synthorg/engine/agent_engine.py | Delegates distillation/procedural hooks to extracted module. |
| src/synthorg/budget/category_analytics.py | Extends CategoryBreakdown and orchestration ratio to include embedding fields. |
| src/synthorg/budget/call_category.py | Adds LLMCallCategory.EMBEDDING. |
if not cost_cfg.enabled or self._cost_tracker is None:
    return

input_tokens = max(1, content_length // cost_cfg.default_chars_per_token)
Embedding token estimation uses integer floor division (`content_length // default_chars_per_token`), which systematically undercounts tokens whenever content_length is not a multiple of default_chars_per_token. For cost tracking/observability it’s typically better to use ceiling division so the estimate isn’t biased low (still keeping the `max(1, ...)` guard).
- input_tokens = max(1, content_length // cost_cfg.default_chars_per_token)
+ input_tokens = max(
+     1,
+     (content_length + cost_cfg.default_chars_per_token - 1)
+     // cost_cfg.default_chars_per_token,
+ )
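The difference between the two estimators is easy to see in isolation. The helper names below are hypothetical; the arithmetic mirrors the current line and the suggested replacement.

```python
def estimate_tokens_floor(content_length: int, chars_per_token: int) -> int:
    """Current behaviour: floor division, guarded to at least one token."""
    return max(1, content_length // chars_per_token)


def estimate_tokens_ceil(content_length: int, chars_per_token: int) -> int:
    """Suggested behaviour: ceiling division, guarded to at least one token."""
    return max(1, (content_length + chars_per_token - 1) // chars_per_token)


# 10 characters at 4 chars/token: floor drops the 2-character remainder,
# ceiling counts it as a third token.
floor_estimate = estimate_tokens_floor(10, 4)
ceil_estimate = estimate_tokens_ceil(10, 4)
```

The two agree whenever the length is an exact multiple of `chars_per_token`, so the change only affects inputs with a remainder.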
await self._record_embedding_cost(
    agent_id=str(agent_id),
    task_id="memory-store",
Embedding CostRecord uses a constant task_id ("memory-store") for every store call. This will skew any downstream per-task analytics (e.g., task counts/cost-per-task in the dashboard) because many unrelated tasks collapse into a single task_id. Consider propagating the real task_id (if available) or adjusting the cost-record schema/analytics to support non-task-scoped cost records so these don’t distort task-based metrics.
- await self._record_embedding_cost(
-     agent_id=str(agent_id),
-     task_id="memory-store",
+ request_task_id = getattr(request, "task_id", None)
+ cost_task_id = (
+     str(request_task_id)
+     if request_task_id
+     else f"memory-store:{memory_id}"
+ )
+ await self._record_embedding_cost(
+     agent_id=str(agent_id),
+     task_id=cost_task_id,
await self._record_embedding_cost(
    agent_id=str(agent_id),
    task_id="memory-retrieve",
Embedding CostRecord uses a constant task_id ("memory-retrieve") for every retrieve call. Like the store path, this collapses costs across many tasks into a single synthetic task_id and will distort task-based metrics. Prefer propagating an actual task_id (or supporting task_id=None / separate grouping) so dashboards remain accurate.
- await self._record_embedding_cost(
-     agent_id=str(agent_id),
-     task_id="memory-retrieve",
+ query_task_id = getattr(query, "task_id", None)
+ await self._record_embedding_cost(
+     agent_id=str(agent_id),
+     task_id=str(query_task_id) if query_task_id is not None else None,
except Exception as exc:
    logger.warning(
        DISTILLATION_CAPTURE_SKIPPED,
        agent_id=agent_id,
        task_id=task_id,
        reason=f"validation_or_capture_failed: {type(exc).__name__}: {exc}",
        exc_info=True,
    )
This wrapper logs DISTILLATION_CAPTURE_SKIPPED for unexpected exceptions during validation/capture. Since this path represents a failure (not an intentional skip), it should log a failure event (e.g., DISTILLATION_CAPTURE_FAILED) or a distinct error event; otherwise operators will misinterpret failed captures as “skipped”.
)

backend.store.assert_awaited_once()
store_request = backend.store.call_args[0][1]
Avoid positional call_args indexing in assertions (`call_args[0][1]`); it’s brittle against signature changes. Prefer using `call_args.args` / `call_args.kwargs` (as done in other updated tests) for more robust assertions.
- store_request = backend.store.call_args[0][1]
+ call_args = backend.store.call_args
+ store_request = call_args.kwargs.get("request")
+ if store_request is None:
+     store_request = call_args.args[1]
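A self-contained sketch of the suggested access pattern, using only `unittest.mock`. The helper name `extract_store_request` and the example payloads are hypothetical; the real `store()` signature may differ.

```python
import asyncio
from typing import Any
from unittest.mock import AsyncMock


def extract_store_request(store_mock: AsyncMock) -> Any:
    """Return the request passed to the most recent store() call."""
    call = store_mock.call_args
    # Prefer the keyword form; fall back to the second positional argument.
    request = call.kwargs.get("request")
    if request is None:
        request = call.args[1]
    return request


backend = AsyncMock()
backend.store = AsyncMock(return_value="mem-1")

# Positional call: the request is found through call_args.args.
asyncio.run(backend.store("agent-1", {"content": "positional"}))
positional_request = extract_store_request(backend.store)

# Keyword call: the same helper finds it through call_args.kwargs.
asyncio.run(backend.store("agent-1", request={"content": "keyword"}))
keyword_request = extract_store_request(backend.store)
```

Because the helper tolerates both call styles, a production change from positional to keyword arguments does not break the assertion for the wrong reason.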
async def test_capture_called_when_enabled(self) -> None:
    """Delegates to capture_distillation when flag and backend set."""
    backend = AsyncMock()
    backend.store = AsyncMock(return_value="dist-1")
    result = _make_completed_result()
These tests create AsyncMock() backends without a spec. The updated memory tests in this PR consistently use spec’d mocks (e.g., tests/unit/memory/test_tool_retriever.py:37 uses AsyncMock(spec=MemoryBackend)) to catch interface drift. Align these mocks with that convention (and spec the proposer/backend where applicable) so production signature changes don’t silently pass tests.
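To show what a spec buys in practice, here is a small sketch with a hypothetical `MemoryBackendSketch` class standing in for the real `MemoryBackend`. Note that `spec=` restricts which attributes exist on the mock; it does not validate call signatures (that would need autospeccing).

```python
import asyncio
from unittest.mock import AsyncMock


class MemoryBackendSketch:
    """Hypothetical stand-in for the real MemoryBackend interface."""

    async def store(self, agent_id: str, request: dict) -> str:
        raise NotImplementedError

    async def retrieve(self, agent_id: str, query: str) -> list:
        raise NotImplementedError


def typo_is_caught(mock: AsyncMock) -> bool:
    """True if accessing a misspelled method name raises AttributeError."""
    try:
        mock.stroe  # deliberate typo of "store"
    except AttributeError:
        return True
    return False


loose = AsyncMock()
strict = AsyncMock(spec=MemoryBackendSketch)

# Real methods still behave like awaitable mocks on the spec'd object.
strict.store.return_value = "dist-1"
stored_id = asyncio.run(strict.store("agent-1", {"content": "x"}))
```

An unspec'd mock happily invents any attribute, so a renamed or removed backend method would keep passing; the spec'd mock fails at the point of access.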
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1091      +/-   ##
==========================================
- Coverage   89.54%   89.50%   -0.05%
==========================================
  Files         750      752       +2
  Lines       44045    44132      +87
  Branches     4426     4427       +1
==========================================
+ Hits        39441    39501      +60
- Misses       3813     3841      +28
+ Partials      791      790       -1
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
src/synthorg/memory/retriever.py (1)
470-474: ⚠️ Potential issue | 🟠 Major
RRF (and LINEAR) truncate to `max_memories` before the diversity penalty, preventing the `candidate_pool_multiplier` from taking effect.
The backend over-fetches by `candidate_pool_multiplier` (default 3×) specifically to provide MMR a larger pool of candidates. However, `fuse_ranked_lists(..., max_results=max_memories)` truncates to `max_memories` immediately, and `rank_memories()` does the same. The diversity penalty is then applied to an already-truncated pool, defeating its purpose of promoting sub-threshold candidates.
The `candidate_pool_multiplier` is currently wasted. Either:
- Pass `pool_limit` to `fuse_ranked_lists` and skip truncation when diversity is enabled, or
- Defer truncation in both paths until after the diversity penalty (line 276)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/memory/retriever.py` around lines 470 - 474, The RRF/LINEAR fusion currently truncates results using max_results (max_memories) inside fuse_ranked_lists and in rank_memories(), which prevents candidate_pool_multiplier from expanding the pool for the diversity/MMR step; update the logic so that fuse_ranked_lists and rank_memories do not truncate to max_memories when diversity/MMR is enabled, either by passing a pool_limit (e.g., candidate_pool_multiplier * max_memories) into fuse_ranked_lists (instead of max_results) or by skipping truncation there and deferring the final cut to max_memories until after the diversity penalty is applied (the place where diversity/MMR is invoked), ensuring candidate_pool_multiplier actually increases the candidate pool used by the diversity step.
src/synthorg/memory/consolidation/llm_strategy.py (1)
545-562: ⚠️ Potential issue | 🟠 Major
Short-circuit when prompt truncation drops every entry.
`_build_user_prompt()` can legitimately return `summarized=()` once wrapper/escape expansion pushes the first `<entry>` block over the total-char cap. The current path still calls the provider, stores a summary of an empty prompt, and deletes nothing, so the same originals can be reprocessed forever.
Suggested fix
  user_content, summarized = self._build_user_prompt(entries, agent_id, category)
+ if not summarized:
+     logger.warning(
+         LLM_STRATEGY_FALLBACK,
+         agent_id=agent_id,
+         category=category.value,
+         reason="user_prompt_empty_after_truncation",
+         dropped_entries=len(entries),
+     )
+     fallback = self._fallback_summary(entries)
+     return fallback, SynthesisOutcome.CONCAT_FALLBACK, entries
  system_prompt = self._build_system_prompt(trajectory_context)
  response_content = await self._call_llm(
      system_prompt,
      user_content,
      agent_id=agent_id,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/memory/consolidation/llm_strategy.py` around lines 545 - 562, _build_user_prompt may return summarized as an empty tuple when wrapper/escape expansion causes every <entry> to be dropped; currently the code still calls _call_llm and ends up storing an empty summary and never deleting originals. Fix by short-circuiting immediately after _build_user_prompt: if summarized is empty (or len(summarized) == 0) skip the LLM call and return the concatenation fallback from _fallback_summary(entries) with SynthesisOutcome.CONCAT_FALLBACK and the original entries, ensuring no attempt to call _call_llm or persist an empty synthesized result.
tests/unit/memory/consolidation/test_llm_strategy.py (1)
744-804: 🧹 Nitpick | 🔵 Trivial
Bound this concurrency guard so regressions fail fast.
If group processing ever becomes sequential again, the first provider call waits on `barrier` forever and the whole unit shard hangs instead of failing the test. Add a short timeout around the `strategy.consolidate(...)` await so the regression stays local and deterministic.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/memory/consolidation/test_llm_strategy.py` around lines 744 - 804, The test can hang if group processing becomes sequential because the first provider call awaits `barrier` forever; wrap the `strategy.consolidate(...)` call with a short timeout to fail fast. Replace the await of `strategy.consolidate((*episodic, *semantic), agent_id="agent-1")` with an `asyncio.wait_for(...)` call (e.g. `result = await asyncio.wait_for(strategy.consolidate(...), timeout=1)`) so that if `_barrier_complete`/`barrier` never progresses the test errors quickly; keep references to `_barrier_complete`, `barrier`, `expected_concurrent`, and `strategy.consolidate` intact.
web/src/__tests__/utils/budget.property.test.ts (1)
92-109: ⚠️ Potential issue | 🟡 Minor
Add a direct embedding bucket invariant to avoid false positives.
The current property (Line 102) only checks percentage totals; it won’t fail if `'embedding'` is accidentally bucketed as `uncategorized`.
✅ Suggested property-strengthening patch
  (records) => {
    const ratio = computeCategoryBreakdown(records)
+   const expectedEmbeddingCost = records
+     .filter((r) => r.call_category === 'embedding')
+     .reduce((acc, r) => acc + r.cost_usd, 0)
+   const expectedEmbeddingCount = records.filter(
+     (r) => r.call_category === 'embedding',
+   ).length
+
+   expect(ratio.embedding.cost).toBeCloseTo(expectedEmbeddingCost, 8)
+   expect(ratio.embedding.count).toBe(expectedEmbeddingCount)
+
    const sum =
      ratio.productive.percent +
      ratio.coordination.percent +
      ratio.system.percent +
      ratio.embedding.percent +
      ratio.uncategorized.percent
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/src/__tests__/utils/budget.property.test.ts` around lines 92 - 109, The test only asserts the total percentages sum to 100 but misses verifying the embedding bucket specifically; update the property in the test around computeCategoryBreakdown(records) to also compute the embedding cost share (e.g., sum of r.cost_usd for records with r.call_category === 'embedding' divided by totalCost * 100) and assert that ratio.embedding.percent isCloseTo that computed embedding share when totalCost > 0 (and is 0 when totalCost === 0), referencing computeCategoryBreakdown, ratio.embedding.percent, and the records array so mis-bucketing of embedding into uncategorized will fail.
web/src/pages/budget/BudgetPage.stories.tsx (1)
17-19: ⚠️ Potential issue | 🟡 Minor
Keep the overview fixture consistent with the record total.
These additions raise `mockCostRecords` to `45.67`, but `mockOverview.total_cost_usd`, `budget_remaining_usd`, and `budget_used_percent` still describe `42.17`. The story will now render a header total that disagrees with the breakdown data.
Also applies to: 57-63
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/src/pages/budget/BudgetPage.stories.tsx` around lines 17 - 19, The overview fixture (mockOverview) is inconsistent with the cost records: update mockOverview.total_cost_usd to match the sum of mockCostRecords (45.67) and recompute the dependent fields (mockOverview.budget_remaining_usd and mockOverview.budget_used_percent) accordingly (or derive them from the shared budget total if one exists) so the header total and the breakdown agree; adjust the same values in the duplicate fixture at the other block (lines around the second occurrence mentioned).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/engine/post_execution/memory_hooks.py`:
- Around line 131-139: The call to propose_procedural_memory in
propose_procedural_memory(...) passes raw str agent_id and task_id but the
callee expects NotBlankStr; add the same runtime validation used in
try_capture_distillation by wrapping agent_id and task_id with
TypeAdapter(NotBlankStr).validate_python(...) (import TypeAdapter and
NotBlankStr from their modules) before passing them into
propose_procedural_memory so validated NotBlankStr instances are supplied and
mypy/runtime errors are avoided.
In `@src/synthorg/memory/backends/mem0/adapter.py`:
- Around line 165-167: The pricing code is using
self._mem0_config.embedder.model instead of the resolved embedder model that
build_mem0_config_dict() may replace (e.g., fine-tune checkpoint or
fine_tune.base_model); change the code to cache the effective embedder model at
connect/config-build time (e.g., set self._effective_embedder_model when
build_mem0_config_dict() or connect runs) and then use that cached value in the
pricing logic (replace references to self._mem0_config.embedder.model in the
pricing block around the current check and the similar logic at lines 179-188
with self._effective_embedder_model), ensuring the cache is populated before any
pricing calls and falls back safely if unset.
- Around line 190-203: When _cost_tracker.record(record) inside
_record_embedding_cost raises system exceptions (MemoryError or RecursionError)
they must be logged before re-raising: wrap the await
self._cost_tracker.record(record) catch for builtins.MemoryError and
RecursionError and call logger.warning or logger.error with
BUDGET_EMBEDDING_COST_FAILED and the same context fields (agent_id, operation,
input_tokens, model, and record.cost_usd if present) to emit the budget failure
event, then re-raise the exception; apply the same change to the other
occurrences of this pattern referenced (the blocks around lines 534-539 and
608-614) so store() and retrieve() callers always get a
BUDGET_EMBEDDING_COST_FAILED log before the system exception propagates.
In `@src/synthorg/memory/backends/mem0/config.py`:
- Around line 252-255: Mem0BackendConfig's embedding_cost field is never
populated from company config; update build_config_from_company_config() to read
the embedding_cost section from CompanyMemoryConfig (and the corresponding
YAML/env entrypoint used by CompanyMemoryConfig) and pass a constructed
EmbeddingCostConfig into Mem0BackendConfig.embedding_cost; ensure the same fix
is applied for the other identical blocks noted (lines near 372-377 and 423-427)
so that the embedding_cost field can be toggled via CompanyMemoryConfig/YAML/env
and not remain at its default.
- Around line 195-219: Change model_pricing's typing to dict[NotBlankStr, float]
(import NotBlankStr from core.types) so blank model IDs are rejected at type
level, and update _validate_pricing_non_negative to log the offending entry with
the structured warning key MEMORY_BACKEND_CONFIG_INVALID (include model_name and
cost) before raising the ValueError; reference the model_pricing field and
_validate_pricing_non_negative method when making these edits so the new import,
type annotation, and the warning log call are placed in this same config module.
In `@src/synthorg/memory/consolidation/config.py`:
- Around line 291-300: The validator method _validate_entry_vs_total_chars
currently raises ValueError without logging; update it to log the failure (use
the module logger and include the CONFIG_VALIDATION_FAILED constant and the same
message/context) at WARNING or ERROR level immediately before raising so
operators receive the structured signal; keep the existing ValueError raised
with the same message after logging.
In `@tests/unit/budget/test_call_category.py`:
- Around line 20-30: Refactor the two tests test_string_conversion and
test_from_string to use pytest.mark.parametrize over a list of (enum_member,
string_value) pairs instead of multiple assert lines; iterate the same cases for
LLMCallCategory.PRODUCTIVE/COORDINATION/SYSTEM/EMBEDDING so each case asserts
str(enum_member) == string_value in one parametrized test and enum_member ==
LLMCallCategory(string_value) in another parametrized test, keeping the test
names and assertions equivalent but eliminating repeated assertions.
In `@tests/unit/engine/post_execution/test_memory_hooks.py`:
- Around line 199-226: Update the
test_pipeline_called_when_proposer_and_recovery_exist test to pass a non-None
procedural_memory_config into try_procedural_memory and assert that
propose_procedural_memory receives it: create a small config object (e.g.,
config = {"foo": "bar"}), call try_procedural_memory with
procedural_memory_config=config, and after mock_propose.assert_awaited_once()
check that mock_propose.call_args includes the forwarded config (assert
call_kwargs[1]["config"] is config). This ensures the wiring from
AgentEngine._try_procedural_memory -> try_procedural_memory ->
propose_procedural_memory correctly threads procedural_memory_config.
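A minimal sketch of the forwarding assertion, using only `unittest.mock`. The function `try_procedural_memory_sketch` is a hypothetical stand-in for the real `try_procedural_memory`, whose signature is not shown in this review; only the `config` keyword and the identity check come from the comment above.

```python
import asyncio
from unittest.mock import AsyncMock


async def try_procedural_memory_sketch(propose, *, procedural_memory_config=None):
    """Hypothetical stand-in: forwards the config to the proposer as `config`."""
    await propose(config=procedural_memory_config)


def check_config_is_forwarded() -> bool:
    mock_propose = AsyncMock()
    config = {"foo": "bar"}
    asyncio.run(
        try_procedural_memory_sketch(mock_propose, procedural_memory_config=config)
    )
    mock_propose.assert_awaited_once()
    # call_args[1] is the kwargs dict; `is` proves the very same object
    # was threaded through rather than a copy or a default.
    return mock_propose.call_args[1]["config"] is config
```

The identity check (`is`) is stricter than equality, so a wrapper that rebuilt or defaulted the config would fail it.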
In `@tests/unit/memory/consolidation/test_config.py`:
- Around line 196-253: Add a unit test in TestLLMConsolidationConfig that
asserts the cross-field constraint on LLMConsolidationConfig: construct with
max_entry_input_chars greater than max_total_user_content_chars and assert it
raises ValidationError, and also construct with max_entry_input_chars equal to
max_total_user_content_chars and assert it is accepted (no exception) to lock in
the validator behavior for the <= relationship between max_entry_input_chars and
max_total_user_content_chars.
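The boundary cases this test should lock in can be sketched with a stand-in model. `ConsolidationLimitsSketch` is a hypothetical frozen dataclass used only to keep the example self-contained; the real `LLMConsolidationConfig` is a Pydantic model and would surface the failure as `ValidationError`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsolidationLimitsSketch:
    """Stand-in carrying only the two fields under the cross-field rule."""

    max_entry_input_chars: int
    max_total_user_content_chars: int

    def __post_init__(self) -> None:
        # The rule is <=, so equality must be accepted.
        if self.max_entry_input_chars > self.max_total_user_content_chars:
            raise ValueError(
                "max_entry_input_chars must be <= max_total_user_content_chars"
            )


def is_accepted(entry_chars: int, total_chars: int) -> bool:
    """Return True when the pair passes validation, False when rejected."""
    try:
        ConsolidationLimitsSketch(entry_chars, total_chars)
    except ValueError:
        return False
    return True
```

Testing both the equal and the greater-than case pins the operator: a regression from `>` to `>=` in the validator would break the equality case.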
---
Outside diff comments:
In `@src/synthorg/memory/consolidation/llm_strategy.py`:
- Around line 545-562: _build_user_prompt may return summarized as an empty
tuple when wrapper/escape expansion causes every <entry> to be dropped;
currently the code still calls _call_llm and ends up storing an empty summary
and never deleting originals. Fix by short-circuiting immediately after
_build_user_prompt: if summarized is empty (or len(summarized) == 0) skip the
LLM call and return the concatenation fallback from _fallback_summary(entries)
with SynthesisOutcome.CONCAT_FALLBACK and the original entries, ensuring no
attempt to call _call_llm or persist an empty synthesized result.
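The short-circuit can be sketched as below. This is a synchronous simplification under stated assumptions: `synthesize_sketch`, `fallback_summary`, the `SYNTHESIZED` member, and both enum values are hypothetical; only `SynthesisOutcome.CONCAT_FALLBACK` and the "skip the LLM call, return the original entries" behavior come from the comment above. The PR uses `StrEnum`; `str, Enum` is used here so the sketch also runs on older interpreters.

```python
from enum import Enum


class SynthesisOutcome(str, Enum):
    # Member values are assumptions; SYNTHESIZED is a hypothetical name.
    SYNTHESIZED = "synthesized"
    CONCAT_FALLBACK = "concat_fallback"


def fallback_summary(entries: tuple[str, ...]) -> str:
    """Plain concatenation used when no LLM summary is available."""
    return "\n".join(entries)


def synthesize_sketch(entries, summarized, call_llm):
    """Return (summary, outcome, entries the summary covers)."""
    if not summarized:
        # Every entry was dropped while building the prompt: do not call
        # the LLM and do not persist an empty summary.
        return fallback_summary(entries), SynthesisOutcome.CONCAT_FALLBACK, entries
    return call_llm(summarized), SynthesisOutcome.SYNTHESIZED, summarized
```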
In `@src/synthorg/memory/retriever.py`:
- Around line 470-474: The RRF/LINEAR fusion currently truncates results using
max_results (max_memories) inside fuse_ranked_lists and in rank_memories(),
which prevents candidate_pool_multiplier from expanding the pool for the
diversity/MMR step; update the logic so that fuse_ranked_lists and rank_memories
do not truncate to max_memories when diversity/MMR is enabled—either by passing
a pool_limit (e.g., candidate_pool_multiplier * max_memories) into
fuse_ranked_lists (instead of max_results) or by skipping truncation there and
deferring the final cut to max_memories until after the diversity penalty is
applied (the place where diversity/MMR is invoked), ensuring
candidate_pool_multiplier actually increases the candidate pool used by the
diversity step.
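One way to read the requested ordering: cut to the pool limit at the fusion stage, rerank, then cut to `max_memories` last. The sketch below assumes a `compute_pool_limit` signature (the PR only names a `_compute_pool_limit()` helper) and uses a caller-supplied `rerank` callable in place of the real MMR step.

```python
from collections.abc import Callable, Sequence


def compute_pool_limit(
    max_memories: int,
    candidate_pool_multiplier: int,
    *,
    diversity_enabled: bool,
) -> int:
    """Over-fetch only when the diversity step will actually use the pool."""
    if not diversity_enabled:
        return max_memories
    return max_memories * candidate_pool_multiplier


def retrieve_sketch(
    ranked: Sequence[str],
    max_memories: int,
    candidate_pool_multiplier: int,
    *,
    diversity_enabled: bool,
    rerank: Callable[[list[str]], list[str]],
) -> list[str]:
    pool_limit = compute_pool_limit(
        max_memories, candidate_pool_multiplier, diversity_enabled=diversity_enabled
    )
    # Fusion-stage cut uses pool_limit, not max_memories.
    pool = list(ranked[:pool_limit])
    if diversity_enabled:
        pool = rerank(pool)
    # The final cut to max_memories happens only after the diversity step.
    return pool[:max_memories]
```

If the fusion stage truncated to `max_memories` instead, `rerank` could only reorder the top-K and never promote a candidate from below the cutoff.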
In `@tests/unit/memory/consolidation/test_llm_strategy.py`:
- Around line 744-804: The test can hang if group processing becomes sequential
because the first provider call awaits `barrier` forever; wrap the
`strategy.consolidate(...)` call with a short timeout to fail fast. Replace the
await of `strategy.consolidate((*episodic, *semantic), agent_id="agent-1")` with
an `asyncio.wait_for(...)` call (e.g. `result = await
asyncio.wait_for(strategy.consolidate(...), timeout=1)`) so that if
`_barrier_complete`/`barrier` never progresses the test errors quickly; keep
references to `_barrier_complete`, `barrier`, `expected_concurrent`, and
`strategy.consolidate` intact.
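The fail-fast behavior of `asyncio.wait_for` can be seen in isolation with a coroutine that never completes. `consolidate_that_hangs` is a hypothetical stand-in for a `strategy.consolidate(...)` call stuck on the barrier. In the actual test the timeout error would simply propagate and fail the test; it is caught here only to make the outcome observable.

```python
import asyncio


async def consolidate_that_hangs() -> str:
    """Stand-in for a consolidate() call whose barrier never releases."""
    await asyncio.Event().wait()  # blocks indefinitely
    return "unreachable"


async def run_with_timeout(timeout: float) -> str:
    try:
        return await asyncio.wait_for(consolidate_that_hangs(), timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the inner coroutine before raising.
        return "timed out"
```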
In `@web/src/__tests__/utils/budget.property.test.ts`:
- Around line 92-109: The test only asserts the total percentages sum to 100 but
misses verifying the embedding bucket specifically; update the property in the
test around computeCategoryBreakdown(records) to also compute the embedding cost
share (e.g., sum of r.cost_usd for records with r.category === 'embedding'
divided by totalCost * 100) and assert that ratio.embedding.percent isCloseTo
that computed embedding share when totalCost > 0 (and is 0 when totalCost ===
0), referencing computeCategoryBreakdown, ratio.embedding.percent, and the
records array so mis-bucketing of embedding into uncategorized will fail.
In `@web/src/pages/budget/BudgetPage.stories.tsx`:
- Around line 17-19: The overview fixture (mockOverview) is inconsistent with
the cost records: update mockOverview.total_cost_usd to match the sum of
mockCostRecords (45.67) and recompute the dependent fields
(mockOverview.budget_remaining_usd and mockOverview.budget_used_percent)
accordingly (or derive them from the shared budget total if one exists) so the
header total and the breakdown agree; adjust the same values in the duplicate
fixture at the other block (lines around the second occurrence mentioned).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 95104ff7-2b05-4dd3-8c29-839c92d8fdcc
📒 Files selected for processing (39)
src/synthorg/budget/call_category.py, src/synthorg/budget/category_analytics.py, src/synthorg/engine/agent_engine.py, src/synthorg/engine/post_execution/__init__.py, src/synthorg/engine/post_execution/memory_hooks.py, src/synthorg/memory/backends/mem0/adapter.py, src/synthorg/memory/backends/mem0/config.py, src/synthorg/memory/consolidation/__init__.py, src/synthorg/memory/consolidation/config.py, src/synthorg/memory/consolidation/distillation.py, src/synthorg/memory/consolidation/llm_strategy.py, src/synthorg/memory/consolidation/simple_strategy.py, src/synthorg/memory/ranking.py, src/synthorg/memory/retrieval_config.py, src/synthorg/memory/retriever.py, src/synthorg/memory/tool_retriever.py, src/synthorg/observability/events/budget.py, src/synthorg/observability/events/memory.py, tests/unit/budget/test_call_category.py, tests/unit/engine/post_execution/__init__.py, tests/unit/engine/post_execution/test_memory_hooks.py, tests/unit/engine/test_agent_engine_distillation.py, tests/unit/memory/consolidation/test_config.py, tests/unit/memory/consolidation/test_distillation.py, tests/unit/memory/consolidation/test_llm_strategy.py, tests/unit/memory/test_memory_tools.py, tests/unit/memory/test_ranking.py, tests/unit/memory/test_ranking_diversity.py, tests/unit/memory/test_retriever.py, tests/unit/memory/test_retriever_hybrid.py, tests/unit/memory/test_tool_retriever.py, web/src/__tests__/pages/budget/CategoryBreakdown.test.tsx, web/src/__tests__/utils/budget.property.test.ts, web/src/__tests__/utils/budget.test.ts, web/src/api/types.ts, web/src/pages/budget/BudgetPage.stories.tsx, web/src/pages/budget/CategoryBreakdown.stories.tsx, web/src/pages/budget/CategoryBreakdown.tsx, web/src/utils/budget.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Agent
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (10)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do NOT use `from __future__ import annotations` in Python code; Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: `except A, B:` (no parentheses) in Python 3.14 code; ruff enforces this.
All public functions in Python must have type hints. Use mypy strict mode for type-checking.
Use Google-style docstrings on all public classes and functions in Python. This is enforced by ruff D rules.
Use NotBlankStr (from core.types) for all identifier/name fields in Python—including optional (NotBlankStr | None) and tuple variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in Python—prefer structured concurrency over bare create_task.
Python line length must be 88 characters (enforced by ruff).
Python functions must be under 50 lines, files under 800 lines.
Handle errors explicitly in Python, never silently swallow exceptions.
Always use the variable name `logger` (not `_logger` or `log`) for the logging instance in Python.
Lint Python with `uv run ruff check src/ tests/`. Auto-fix with `uv run ruff check src/ tests/ --fix`. Format with `uv run ruff format src/ tests/`.
Type-check Python with `uv run mypy src/ tests/` (strict mode).
Files:
src/synthorg/budget/call_category.py, tests/unit/memory/test_ranking.py, src/synthorg/engine/post_execution/__init__.py, tests/unit/memory/test_ranking_diversity.py, src/synthorg/observability/events/memory.py, src/synthorg/memory/consolidation/__init__.py, tests/unit/memory/test_memory_tools.py, src/synthorg/observability/events/budget.py, tests/unit/memory/test_tool_retriever.py, src/synthorg/memory/consolidation/simple_strategy.py, tests/unit/engine/test_agent_engine_distillation.py, src/synthorg/memory/retrieval_config.py, tests/unit/memory/consolidation/test_config.py, tests/unit/memory/test_retriever_hybrid.py, src/synthorg/budget/category_analytics.py, src/synthorg/memory/retriever.py, src/synthorg/memory/ranking.py, tests/unit/memory/test_retriever.py, tests/unit/memory/consolidation/test_distillation.py, src/synthorg/memory/backends/mem0/config.py, tests/unit/budget/test_call_category.py, src/synthorg/memory/tool_retriever.py, tests/unit/memory/consolidation/test_llm_strategy.py, src/synthorg/memory/backends/mem0/adapter.py, tests/unit/engine/post_execution/test_memory_hooks.py, src/synthorg/engine/agent_engine.py, src/synthorg/memory/consolidation/config.py, src/synthorg/engine/post_execution/memory_hooks.py, src/synthorg/memory/consolidation/distillation.py, src/synthorg/memory/consolidation/llm_strategy.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Every Python module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
Never use `import logging`, `logging.getLogger()`, or `print()` in Python application code. Exceptions: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, observability/http_handler.py may use stdlib logging and print.
Use event name constants from synthorg.observability.events domain modules (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Use structured logging with `logger.info(EVENT, key=value)` syntax in Python; never `logger.info('msg %s', val)`
All error paths in Python must log at WARNING or ERROR with context before raising.
All state transitions in Python must log at INFO.
DEBUG logging is for object creation, internal flow, and entry/exit of key functions in Python.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned Python code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small aliases. Exceptions: Operations design page, .claude/ skill files, third-party imports, provider presets (user-facing runtime data).
Library reference in docs/api/ is auto-generated via mkdocstrings + Griffe (AST-based).
Files:
src/synthorg/budget/call_category.py, src/synthorg/engine/post_execution/__init__.py, src/synthorg/observability/events/memory.py, src/synthorg/memory/consolidation/__init__.py, src/synthorg/observability/events/budget.py, src/synthorg/memory/consolidation/simple_strategy.py, src/synthorg/memory/retrieval_config.py, src/synthorg/budget/category_analytics.py, src/synthorg/memory/retriever.py, src/synthorg/memory/ranking.py, src/synthorg/memory/backends/mem0/config.py, src/synthorg/memory/tool_retriever.py, src/synthorg/memory/backends/mem0/adapter.py, src/synthorg/engine/agent_engine.py, src/synthorg/memory/consolidation/config.py, src/synthorg/engine/post_execution/memory_hooks.py, src/synthorg/memory/consolidation/distillation.py, src/synthorg/memory/consolidation/llm_strategy.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/budget/call_category.py, src/synthorg/engine/post_execution/__init__.py, src/synthorg/observability/events/memory.py, src/synthorg/memory/consolidation/__init__.py, src/synthorg/observability/events/budget.py, src/synthorg/memory/consolidation/simple_strategy.py, src/synthorg/memory/retrieval_config.py, src/synthorg/budget/category_analytics.py, src/synthorg/memory/retriever.py, src/synthorg/memory/ranking.py, src/synthorg/memory/backends/mem0/config.py, src/synthorg/memory/tool_retriever.py, src/synthorg/memory/backends/mem0/adapter.py, src/synthorg/engine/agent_engine.py, src/synthorg/memory/consolidation/config.py, src/synthorg/engine/post_execution/memory_hooks.py, src/synthorg/memory/consolidation/distillation.py, src/synthorg/memory/consolidation/llm_strategy.py
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
web/src/**/*.{ts,tsx}: TypeScript files in web dashboard must reuse existing components from web/src/components/ui/ before creating new ones.
React dashboard must use TypeScript 6.0+, React 19, react-router, shadcn/ui, Base UI, Tailwind CSS 4, Zustand, `@tanstack/react-query`. No hardcoded styles; use design tokens.
Linting and pre-commit checks must not be bypassed—ESLint web dashboard (zero warnings) is non-negotiable.
web/src/**/*.{ts,tsx}: Use Tailwind semantic classes (`text-foreground`, `bg-card`, `text-accent`, `text-success`, `bg-danger`, etc.) or CSS variables (`var(--so-*)`) for colors; NEVER hardcode hex values in `.tsx`/`.ts` files
Use `font-sans` or `font-mono` (Geist tokens) for typography; NEVER set `fontFamily` directly in `.tsx`/`.ts` files
Use density-aware tokens (`p-card`, `gap-section-gap`, `gap-grid-gap`) or standard Tailwind spacing; NEVER hardcode pixel values for layout spacing in components
Use token variables (`var(--so-shadow-card-hover)`, `border-border`, `border-bright`) for shadows and borders; NEVER hardcode values in `.tsx`/`.ts` files
Use `@/lib/motion` presets for Framer Motion transition durations; NEVER hardcode transition durations
CSS side-effect imports in TypeScript 6 require type declarations -- add `/// <reference types="vite/client" />` at the top of files with CSS imports
Files:
web/src/pages/budget/CategoryBreakdown.tsx, web/src/__tests__/utils/budget.test.ts, web/src/__tests__/utils/budget.property.test.ts, web/src/api/types.ts, web/src/utils/budget.ts, web/src/pages/budget/CategoryBreakdown.stories.tsx, web/src/__tests__/pages/budget/CategoryBreakdown.test.tsx, web/src/pages/budget/BudgetPage.stories.tsx
web/src/**/*.{ts,tsx,css}
📄 CodeRabbit inference engine (CLAUDE.md)
web/src/**/*.{ts,tsx,css}: Never hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions in web dashboard code; use design tokens and `@/lib/motion` presets.
Web dashboard scripts/check_web_design_system.py enforces component reuse and design token usage on every Edit/Write to web/src/.
Files:
web/src/pages/budget/CategoryBreakdown.tsx, web/src/__tests__/utils/budget.test.ts, web/src/__tests__/utils/budget.property.test.ts, web/src/api/types.ts, web/src/utils/budget.ts, web/src/pages/budget/CategoryBreakdown.stories.tsx, web/src/__tests__/pages/budget/CategoryBreakdown.test.tsx, web/src/pages/budget/BudgetPage.stories.tsx
web/src/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx,js,jsx}: Always use `createLogger` from `@/lib/logger` -- never bare `console.warn`/`console.error`/`console.debug` in application code
Logger variable name must always be `const log` (e.g. `const log = createLogger('module-name')`)
Pass dynamic/untrusted values as separate arguments to logger methods (not interpolated into the message string) so they go through `sanitizeArg`
Attacker-controlled fields inside structured objects must be wrapped in `sanitizeForLog()` before embedding in log calls
Files:
web/src/pages/budget/CategoryBreakdown.tsx, web/src/__tests__/utils/budget.test.ts, web/src/__tests__/utils/budget.property.test.ts, web/src/api/types.ts, web/src/utils/budget.ts, web/src/pages/budget/CategoryBreakdown.stories.tsx, web/src/__tests__/pages/budget/CategoryBreakdown.test.tsx, web/src/pages/budget/BudgetPage.stories.tsx
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Test markers in Python: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Python test coverage must be 80% minimum (enforced in CI).
Use `@pytest.mark.parametrize` for testing similar cases in Python.
Use test-provider, test-small-001, etc. vendor-agnostic names in Python tests.
Use Hypothesis property-based testing in Python with `@given` + `@settings` decorators. Configure profiles: ci (deterministic, max_examples=10, derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Control via HYPOTHESIS_PROFILE env var.
When Hypothesis finds a failure in Python tests, fix the underlying bug and add an `@example(...)` decorator to permanently cover the case in CI.
Never skip, dismiss, or ignore flaky Python tests; fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()`. For tasks that must block indefinitely, use `asyncio.Event().wait()` instead of `asyncio.sleep(large_number)`.
Run Python unit tests with `uv run python -m pytest tests/ -m unit -n 8`.
Run Python integration tests with `uv run python -m pytest tests/ -m integration -n 8`.
Run Python e2e tests with `uv run python -m pytest tests/ -m e2e -n 8`.
Files:
tests/unit/memory/test_ranking.py, tests/unit/memory/test_ranking_diversity.py, tests/unit/memory/test_memory_tools.py, tests/unit/memory/test_tool_retriever.py, tests/unit/engine/test_agent_engine_distillation.py, tests/unit/memory/consolidation/test_config.py, tests/unit/memory/test_retriever_hybrid.py, tests/unit/memory/test_retriever.py, tests/unit/memory/consolidation/test_distillation.py, tests/unit/budget/test_call_category.py, tests/unit/memory/consolidation/test_llm_strategy.py, tests/unit/engine/post_execution/test_memory_hooks.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
tests/unit/memory/test_ranking.py, tests/unit/memory/test_ranking_diversity.py, tests/unit/memory/test_memory_tools.py, tests/unit/memory/test_tool_retriever.py, tests/unit/engine/test_agent_engine_distillation.py, tests/unit/memory/consolidation/test_config.py, tests/unit/memory/test_retriever_hybrid.py, tests/unit/memory/test_retriever.py, tests/unit/memory/consolidation/test_distillation.py, tests/unit/budget/test_call_category.py, tests/unit/memory/consolidation/test_llm_strategy.py, tests/unit/engine/post_execution/test_memory_hooks.py
web/src/**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Web dashboard components must be tested with Vitest, `@testing-library/react`, and fast-check for property-based testing.
Files:
web/src/__tests__/utils/budget.test.ts, web/src/__tests__/utils/budget.property.test.ts, web/src/__tests__/pages/budget/CategoryBreakdown.test.tsx
web/src/**/*.{test,stories}.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Web dashboard must use MSW (Mock Service Worker) for API mocking in tests and Storybook.
Files:
web/src/__tests__/utils/budget.test.ts, web/src/__tests__/utils/budget.property.test.ts, web/src/pages/budget/CategoryBreakdown.stories.tsx, web/src/__tests__/pages/budget/CategoryBreakdown.test.tsx, web/src/pages/budget/BudgetPage.stories.tsx
web/src/**/*.stories.tsx
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.stories.tsx: For Storybook stories with `tags: ['autodocs']`, ensure `@storybook/addon-docs` is installed and added to addons
Use `storybook/test` and `storybook/actions` import paths in Storybook stories (not `@storybook/test` or `@storybook/addon-actions`)
Files:
web/src/pages/budget/CategoryBreakdown.stories.tsx, web/src/pages/budget/BudgetPage.stories.tsx
🧠 Learnings (39)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget tracking includes pre-flight/in-flight checks, auto-downgrade, billing periods, cost tiers, quota/subscription. CFO includes anomaly detection, efficiency analysis, downgrade recommendations.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Applied to files:
src/synthorg/engine/post_execution/__init__.py, tests/unit/engine/test_agent_engine_distillation.py, src/synthorg/engine/agent_engine.py, src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...
Applied to files:
src/synthorg/engine/post_execution/__init__.py, tests/unit/memory/test_memory_tools.py, src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)
Applied to files:
src/synthorg/engine/post_execution/__init__.py, tests/unit/memory/test_ranking_diversity.py, src/synthorg/memory/consolidation/__init__.py, tests/unit/memory/test_memory_tools.py, tests/unit/memory/test_tool_retriever.py, tests/unit/engine/test_agent_engine_distillation.py, tests/unit/memory/test_retriever_hybrid.py, src/synthorg/memory/retriever.py, src/synthorg/memory/ranking.py, tests/unit/memory/test_retriever.py, tests/unit/memory/consolidation/test_distillation.py, src/synthorg/memory/tool_retriever.py, tests/unit/memory/consolidation/test_llm_strategy.py, src/synthorg/memory/backends/mem0/adapter.py, tests/unit/engine/post_execution/test_memory_hooks.py, src/synthorg/engine/agent_engine.py, src/synthorg/memory/consolidation/config.py, src/synthorg/engine/post_execution/memory_hooks.py, src/synthorg/memory/consolidation/distillation.py, src/synthorg/memory/consolidation/llm_strategy.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory
Applied to files:
tests/unit/memory/test_memory_tools.py, tests/unit/memory/test_tool_retriever.py, tests/unit/engine/test_agent_engine_distillation.py, tests/unit/memory/test_retriever.py, tests/unit/memory/consolidation/test_distillation.py, tests/unit/memory/consolidation/test_llm_strategy.py, src/synthorg/memory/backends/mem0/adapter.py, src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
tests/unit/memory/test_memory_tools.py, tests/unit/memory/test_tool_retriever.py, tests/unit/engine/test_agent_engine_distillation.py, tests/unit/memory/test_retriever.py, tests/unit/engine/post_execution/test_memory_hooks.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
Applied to files:
src/synthorg/observability/events/budget.py, src/synthorg/budget/category_analytics.py, src/synthorg/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/budget.py, src/synthorg/memory/backends/mem0/adapter.py, src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget tracking includes pre-flight/in-flight checks, auto-downgrade, billing periods, cost tiers, quota/subscription. CFO includes anomaly detection, efficiency analysis, downgrade recommendations.
Applied to files:
src/synthorg/observability/events/budget.py, src/synthorg/budget/category_analytics.py, src/synthorg/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/budget.py, src/synthorg/memory/backends/mem0/adapter.py, src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Applied to files:
src/synthorg/observability/events/budget.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain modules (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/budget.py, src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Applied to files:
src/synthorg/observability/events/budget.py, src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/budget.py, src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/budget.py, src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-31T16:09:24.320Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:09:24.320Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging
Applied to files:
src/synthorg/observability/events/budget.py
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Budget: Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError).
Applied to files:
src/synthorg/observability/events/budget.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...
Applied to files:
tests/unit/engine/test_agent_engine_distillation.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.
Applied to files:
tests/unit/engine/test_agent_engine_distillation.py
tests/unit/memory/test_retriever.py
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/**/*.stories.{ts,tsx} : Storybook 10: Use parameters.backgrounds.options (object keyed by name) + initialGlobals.backgrounds.value for background options (replaces old default + values array)
Applied to files:
web/src/pages/budget/CategoryBreakdown.stories.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.stories.tsx : Storybook 10: import from `storybook/test` (not `@storybook/test`), `storybook/actions` (not `@storybook/addon-actions`)
Applied to files:
web/src/pages/budget/CategoryBreakdown.stories.tsx
web/src/pages/budget/BudgetPage.stories.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/**/*.stories.{ts,tsx} : Storybook 10: Use storybook/test (not @storybook/test) and storybook/actions (not @storybook/addon-actions) import paths
Applied to files:
web/src/pages/budget/CategoryBreakdown.stories.tsx
web/src/pages/budget/BudgetPage.stories.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.stories.tsx : Storybook 10: Use `parameters.a11y.test: 'error' | 'todo' | 'off'` for a11y testing configuration (replaces old `.element` and `.manual`)
Applied to files:
web/src/pages/budget/CategoryBreakdown.stories.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Applies to web/src/components/ui/**/*.stories.tsx : Create a `.stories.tsx` file alongside each new shared component with all states (default, hover, loading, error, empty)
Applied to files:
web/src/pages/budget/CategoryBreakdown.stories.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.stories.tsx : Storybook 10: Import from `storybook/test` instead of `@storybook/test`
Applied to files:
web/src/pages/budget/CategoryBreakdown.stories.tsx
web/src/__tests__/pages/budget/CategoryBreakdown.test.tsx
web/src/pages/budget/BudgetPage.stories.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/components/ui/**/*.{ts,tsx} : Create new shared components in `web/src/components/ui/` with `.stories.tsx` Storybook file covering all states (default, hover, loading, error, empty)
Applied to files:
web/src/pages/budget/CategoryBreakdown.stories.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Applies to web/.storybook/**/*.{ts,tsx} : In Storybook 10, use `parameters.backgrounds.options` (object keyed by name) + `initialGlobals.backgrounds.value` for backgrounds API (replaces old `default` + `values` array)
Applied to files:
web/src/pages/budget/CategoryBreakdown.stories.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Applies to web/src/**/*.stories.tsx : Use `storybook/test` and `storybook/actions` import paths in Storybook stories (not `@storybook/test` or `@storybook/addon-actions`)
Applied to files:
web/src/pages/budget/CategoryBreakdown.stories.tsx
web/src/pages/budget/BudgetPage.stories.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/__tests__/**/*.{test,spec}.{ts,tsx} : Vitest unit tests must use coverage scoped to files changed vs origin/main branch
Applied to files:
web/src/__tests__/pages/budget/CategoryBreakdown.test.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/**/*.test.{ts,tsx} : Web dashboard: Use React Hypothesis (fast-check) for property-based testing with fc.assert + fc.property
Applied to files:
web/src/__tests__/pages/budget/CategoryBreakdown.test.tsx
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to web/src/**/*.{test,stories}.{ts,tsx} : Web dashboard must use MSW (Mock Service Worker) for API mocking in tests and Storybook.
Applied to files:
web/src/pages/budget/BudgetPage.stories.tsx
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves
Applied to files:
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.
Applied to files:
src/synthorg/engine/post_execution/memory_hooks.py
model = str(self._mem0_config.embedder.model)
cost_per_1k = cost_cfg.model_pricing.get(model, 0.0)
if cost_per_1k == 0.0 and model in cost_cfg.model_pricing:
Price against the resolved embedder model, not the base config value.
This helper always tags and prices records with self._mem0_config.embedder.model, but build_mem0_config_dict() can replace that with a fine-tune checkpoint path or fine_tune.base_model. When fine-tuning is enabled, embedding costs will be attributed to the wrong model and may miss the intended pricing entry altogether. Cache the effective model used at connect time and reuse it here.
Also applies to: 179-188
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/memory/backends/mem0/adapter.py` around lines 165 - 167, The
pricing code is using self._mem0_config.embedder.model instead of the resolved
embedder model that build_mem0_config_dict() may replace (e.g., fine-tune
checkpoint or fine_tune.base_model); change the code to cache the effective
embedder model at connect/config-build time (e.g., set
self._effective_embedder_model when build_mem0_config_dict() or connect runs)
and then use that cached value in the pricing logic (replace references to
self._mem0_config.embedder.model in the pricing block around the current check
and the similar logic at lines 179-188 with self._effective_embedder_model),
ensuring the cache is populated before any pricing calls and falls back safely
if unset.
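One way to picture the suggestion above is the following minimal sketch. `EmbedderSettings`, `resolve_embedder_model`, and `BackendSketch` are hypothetical stand-ins, not the adapter's real classes, and the "checkpoint wins over configured model" rule is an assumption about what `build_mem0_config_dict()` might resolve to.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EmbedderSettings:
    """Hypothetical stand-in for the configured embedder section."""

    model: str
    fine_tune_checkpoint: Optional[str] = None


def resolve_embedder_model(settings: EmbedderSettings) -> str:
    """Return the model name the backend would load (assumed rule)."""
    return settings.fine_tune_checkpoint or settings.model


@dataclass
class BackendSketch:
    """Hypothetical backend that prices against the resolved model."""

    settings: EmbedderSettings
    model_pricing: dict[str, float] = field(default_factory=dict)
    _effective_model: Optional[str] = None

    def connect(self) -> None:
        # Cache the resolved name once, at connect time.
        self._effective_model = resolve_embedder_model(self.settings)

    def cost_per_1k(self) -> float:
        # Fall back to the configured name if connect() has not run yet.
        model = self._effective_model or self.settings.model
        return self.model_pricing.get(model, 0.0)
```

Before `connect()` the sketch prices against the configured name; afterwards it prices against whatever name was resolved, which is the behavior the comment asks for.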
embedding_cost: EmbeddingCostConfig = Field(
    default_factory=EmbeddingCostConfig,
    description="Embedding cost tracking configuration",
)
embedding_cost is not reachable from the normal config path.
Mem0BackendConfig now has the field, but build_config_from_company_config() still constructs the backend config without any way to populate it, and the provided CompanyMemoryConfig entrypoint snippet has no corresponding section either. In practice, YAML/env-driven Mem0 setups stay on the default disabled config, so this feature cannot actually be opted into.
Also applies to: 372-377, 423-427
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/memory/backends/mem0/config.py` around lines 252 - 255,
Mem0BackendConfig's embedding_cost field is never populated from company config;
update build_config_from_company_config() to read the embedding_cost section
from CompanyMemoryConfig (and the corresponding YAML/env entrypoint used by
CompanyMemoryConfig) and pass a constructed EmbeddingCostConfig into
Mem0BackendConfig.embedding_cost; ensure the same fix is applied for the other
identical blocks noted (lines near 372-377 and 423-427) so that the
embedding_cost field can be toggled via CompanyMemoryConfig/YAML/env and not
remain at its default.
async def test_pipeline_called_when_proposer_and_recovery_exist(
    self,
) -> None:
    """Delegates to propose_procedural_memory when both are present."""
    result = _make_error_result()
    recovery = _make_recovery_result()
    proposer = AsyncMock()
    backend = AsyncMock()
    backend.store = AsyncMock(return_value="mem-001")

    with patch(
        "synthorg.memory.procedural.pipeline.propose_procedural_memory",
        new_callable=AsyncMock,
    ) as mock_propose:
        await try_procedural_memory(
            result,
            recovery,
            str(_AGENT_UUID),
            "task-hook-001",
            procedural_proposer=proposer,
            memory_backend=backend,
        )

    mock_propose.assert_awaited_once()
    call_kwargs = mock_propose.call_args
    assert call_kwargs[1]["proposer"] is proposer
    assert call_kwargs[1]["memory_backend"] is backend
🧹 Nitpick | 🔵 Trivial
Cover the forwarded procedural-memory config in this delegation test.
AgentEngine._try_procedural_memory() now threads procedural_memory_config into this helper path, but this happy-path test never passes a non-None config or asserts the downstream config kwarg. That leaves the new wiring unprotected.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/engine/post_execution/test_memory_hooks.py` around lines 199 -
226, Update the test_pipeline_called_when_proposer_and_recovery_exist test to
pass a non-None procedural_memory_config into try_procedural_memory and assert
that propose_procedural_memory receives it: create a small config object (e.g.,
config = {"foo": "bar"}), call try_procedural_memory with
procedural_memory_config=config, and after mock_propose.assert_awaited_once()
check that mock_propose.call_args includes the forwarded config (assert
call_kwargs[1]["config"] is config). This ensures the wiring from
AgentEngine._try_procedural_memory -> try_procedural_memory ->
propose_procedural_memory correctly threads procedural_memory_config.
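The assertion pattern requested above can be sketched in isolation as follows. `try_hook` is a hypothetical helper standing in for `try_procedural_memory`, and the `config` keyword name is taken from the prompt text rather than from the real pipeline signature.

```python
import asyncio
from unittest.mock import AsyncMock


async def try_hook(proposer, backend, *, pipeline, config=None):
    """Hypothetical helper: forwards its arguments to the pipeline."""
    await pipeline(proposer=proposer, memory_backend=backend, config=config)


async def check_forwarding() -> dict:
    pipeline = AsyncMock()
    proposer, backend = AsyncMock(), AsyncMock()
    config = {"foo": "bar"}

    await try_hook(proposer, backend, pipeline=pipeline, config=config)

    pipeline.assert_awaited_once()
    # call_args.kwargs is the keyword half of the recorded call.
    kwargs = pipeline.call_args.kwargs
    assert kwargs["proposer"] is proposer
    assert kwargs["memory_backend"] is backend
    # Identity check: the same object must arrive, not a copy.
    assert kwargs["config"] is config
    return kwargs


forwarded = asyncio.run(check_forwarding())
```

Using `is` rather than `==` means the check fails if the helper rebuilds or copies the config on the way through, which is usually what a wiring test wants to catch.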
…abbit
Source fixes:
- Fix DISTILLATION_CAPTURE_SKIPPED -> DISTILLATION_CAPTURE_FAILED for exception path
- Add NotBlankStr validation in try_procedural_memory (consistency with distillation)
- New BUDGET_EMBEDDING_MODEL_UNPRICED event constant for unpriced model debug log
- Ceiling division for embedding token estimation (avoids systematic undercount)
- Split _record_embedding_cost into _record_embedding_cost + _record_cost (<50 lines)
- Log system failures before re-raise in _record_cost
- model_pricing keys narrowed from str to NotBlankStr
- Add CONFIG_VALIDATION_FAILED log to _validate_entry_vs_total_chars and _validate_pricing_non_negative
- Add CancelledError re-raise to _safe_search, _handle_recall, sparse search init
- Add logging to silent sparse search init failure
- Add @computed_field for total_cost/total_tokens/total_count on CategoryBreakdown
- Use computed total_tokens in compute_orchestration_ratio
- Document task_id as synthetic sentinel in _record_embedding_cost docstring
- Document EmbeddingCostConfig YAML config path limitation
- Design tokens: space-y-section-gap, gap-grid-gap in CategoryBreakdown.tsx
- Fix JSDoc 'four categories' -> 'five categories' in budget.ts
- Fix CategoryBreakdown, OrchestrationRatio, CostRecord docstrings for embedding
Documentation:
- CLAUDE.md: add engine/post_execution/, LLMConsolidationConfig, EmbeddingCostConfig
- docs/design/memory.md: add LLMConsolidationConfig to config table, candidate_pool_multiplier to diversity section, MemoryToolName enum type
- docs/design/operations.md: add embedding to call_category list
Tests:
- New test_embedding_cost_config.py (8 tests for EmbeddingCostConfig model)
- Add cross-field validator tests for LLMConsolidationConfig
- Add TestCandidatePoolMultiplier (pool limit, default, bounds)
- Add TestScoringStrategyField (LINEAR, RRF, None default)
- Fix AsyncMock spec= in test_llm_strategy.py, test_memory_hooks.py
- Fix positional call_args in test_memory_hooks.py
- Parametrize enum tests in test_call_category.py
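The "ceiling division" item in the commit message can be illustrated with a small sketch. The function name `estimate_tokens` and the default of 4 characters per token are assumptions for illustration; the real heuristic lives in the adapter and reads its ratio from `EmbeddingCostConfig`.

```python
def estimate_tokens(content_length: int, chars_per_token: int = 4) -> int:
    """Estimate tokens from a character count, rounding up.

    Hypothetical helper; the 4 chars/token default is an assumption.
    """
    if chars_per_token <= 0:
        msg = "chars_per_token must be positive"
        raise ValueError(msg)
    if content_length <= 0:
        return 0
    # -(-a // b) is integer ceiling division: 9 chars at 4 per token
    # gives 3 tokens, where plain floor division would report 2.
    return -(-content_length // chars_per_token)
```

Floor division drops the partial trailing token on every call, so costs drift low across many short inputs; rounding up errs on the side of overestimating by at most one token per call.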
Actionable comments posted: 9
♻️ Duplicate comments (1)
src/synthorg/memory/backends/mem0/adapter.py (1)
167-181: ⚠️ Potential issue | 🟠 Major
Use the resolved embedder model for pricing and attribution.
build_mem0_config_dict() can replace the configured model with the actual checkpoint/base model used by Mem0. Pricing and recording against self._mem0_config.embedder.model here will still misattribute spend and miss the intended pricing entry whenever that resolution happens.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/memory/backends/mem0/adapter.py` around lines 167 - 181, The code is using self._mem0_config.embedder.model for pricing/attribution which can differ from the resolved checkpoint/base; call or use the resolved config returned by build_mem0_config_dict() (or the resolved embedder model it produces) and use that resolved_model variable for cost lookup (cost_cfg.model_pricing), the debug/unpriced check, cost_usd calculation, and when populating the CostRecord provider/model fields (instead of self._mem0_config.embedder.model); ensure you replace all uses in this block with the resolved model value so pricing and attribution match the actual embedder used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/design/operations.md`:
- Line 441: The docs show `embedding` in the call_category usage but the
call-category table above still lists only productive, coordination, and system;
update the call_category table to include the new `embedding` category (add a
short description and an example/intent similar to the other rows) so the table
and the inline usage are consistent—edit the call_category table entry
definition and the category list to include `embedding` and a concise purpose
statement.
In `@src/synthorg/memory/backends/mem0/adapter.py`:
- Around line 559-564: The embedding cost must be recorded immediately after the
Mem0 SDK call returns to avoid losing the CostRecord on downstream failures:
move the _record_embedding_cost(...) call to directly follow the client.add(...)
(for store flows) and client.search(...) (for search flows) success returns
inside the same function (e.g., in the Mem0 adapter methods where client.add and
client.search are invoked), rather than after
validation/mapping/filtering/sparse-upsert; ensure you pass the same parameters
(agent_id, task_id, content_length, operation) and leave downstream processing
unchanged so the cost is persisted even if later steps fail.
- Around line 163-164: Remove the silent no-op by wiring the tracker through and
only gating on the config: update the Mem0MemoryBackend construction in the
factory that currently calls Mem0MemoryBackend(...) to pass the cost_tracker
instance through, ensure Mem0MemoryBackend.__init__ accepts a cost_tracker
parameter and assigns it to self._cost_tracker, and in adapter.py change the
check "if not cost_cfg.enabled or self._cost_tracker is None: return" to only
check cost_cfg.enabled (remove the "or self._cost_tracker is None" clause) so
the backend will attempt to emit CostRecord events when enabled and the tracker
is supplied.
- Around line 296-297: When async_init_sparse_field() is cancelled inside
connect(), the CancelledError handler must undo partial connection state so the
backend isn't left marked connected; update the except asyncio.CancelledError
block in connect() to (1) if self._client exists, attempt to close/disconnect it
(call its close()/disconnect() or await close if async) or otherwise drop
references, then (2) set self._client = None and self._connected = False, and
finally re-raise the CancelledError so callers still see the cancellation;
ensure you reference async_init_sparse_field and connect and clean up both
self._client and self._connected.
In `@tests/unit/engine/post_execution/test_memory_hooks.py`:
- Around line 257-272: The test is patching the wrong import path for
propose_procedural_memory; update the patch target to the module where
try_procedural_memory imports or calls propose_procedural_memory (i.e., patch
the symbol as used by try_procedural_memory rather than
"synthorg.memory.procedural.pipeline.propose_procedural_memory"), keeping
new_callable=AsyncMock and side_effect=MemoryError("out of memory"), then run
the same pytest.raises assertion around await try_procedural_memory so the
AsyncMock is applied correctly.
- Around line 235-248: The test currently patches the implementation location
"synthorg.memory.procedural.pipeline.propose_procedural_memory" but
try_procedural_memory uses the imported name, so update the patch target to the
module where try_procedural_memory imports propose_procedural_memory (e.g.,
patch "synthorg.memory.procedural.hooks.propose_procedural_memory" instead) so
the AsyncMock actually replaces the function used by try_procedural_memory; keep
the AsyncMock(side_effect=RuntimeError("boom")) and the same await call to
try_procedural_memory with proposer and memory_backend.
- Around line 210-226: The test is patching the original function's module but
the code under test calls the imported name from memory_hooks, so the patch
doesn't intercept the call; change the patch target to the symbol as looked up
in the module that calls it (patch
"synthorg.memory.post_execution.memory_hooks.propose_procedural_memory" instead
of "synthorg.memory.procedural.pipeline.propose_procedural_memory") so the
AsyncMock replaces the imported reference used by try_procedural_memory; keep
the same assertions against mock_propose (proposer and memory_backend) after
adjusting the patch target.
In `@tests/unit/memory/backends/mem0/test_embedding_cost_config.py`:
- Around line 9-53: Add a unit test to ensure whitespace-only model names are
rejected: update tests/unit/memory/backends/mem0/test_embedding_cost_config.py
by adding a test that calls EmbeddingCostConfig(model_pricing={" ": 0.5}) and
expects a ValidationError; reference the existing TestEmbeddingCostConfig class
and the EmbeddingCostConfig constructor (which uses NotBlankStr) to locate where
to add the new test alongside test_blank_model_name_rejected.
In `@tests/unit/memory/test_retriever.py`:
- Around line 622-649: The tests for candidate pool multiplier are duplicated
and miss the lower validation boundary; refactor the three similar tests
(test_pool_limit_with_diversity_enabled, test_pool_limit_without_diversity,
test_default_multiplier_is_three) into a single parametrized test using
`@pytest.mark.parametrize` that calls ContextInjectionStrategy(...,
config=MemoryRetrievalConfig(...)) and asserts strategy._compute_pool_limit()
equals the expected value for each case, and add a second validation test
alongside the existing one (which raises ValidationError for 11) to also assert
MemoryRetrievalConfig(candidate_pool_multiplier=0) raises ValidationError;
reference ContextInjectionStrategy, MemoryRetrievalConfig, _compute_pool_limit,
and ValidationError when locating code to change.
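The table-driven shape suggested for the pool-limit tests might look like the sketch below. `compute_pool_limit` is a hypothetical free function standing in for `ContextInjectionStrategy._compute_pool_limit()`, and it raises `ValueError` where the real config would raise a Pydantic `ValidationError`; the 1-10 range and default of 3 come from the PR description. In the actual suite the case table would feed `@pytest.mark.parametrize`.

```python
def compute_pool_limit(
    max_memories: int,
    candidate_pool_multiplier: int = 3,
    *,
    diversity_penalty_enabled: bool = False,
) -> int:
    """Hypothetical stand-in for the pipeline's pool-limit helper."""
    if not 1 <= candidate_pool_multiplier <= 10:
        msg = "candidate_pool_multiplier must be between 1 and 10"
        raise ValueError(msg)
    if diversity_penalty_enabled:
        # Over-fetch so MMR has extra candidates to choose from.
        return max_memories * candidate_pool_multiplier
    return max_memories


# (max_memories, multiplier, diversity enabled, expected limit)
CASES = [
    (10, 3, True, 30),
    (10, 3, False, 10),
    (5, 1, True, 5),
]

for max_memories, multiplier, enabled, expected in CASES:
    actual = compute_pool_limit(
        max_memories, multiplier, diversity_penalty_enabled=enabled
    )
    assert actual == expected


def rejects(multiplier: int) -> bool:
    """Return True when the multiplier is refused as out of range."""
    try:
        compute_pool_limit(10, multiplier, diversity_penalty_enabled=True)
    except ValueError:
        return True
    return False
```

Checking both `rejects(0)` and `rejects(11)` covers the lower and upper boundaries, which is the gap the comment points out.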
---
Duplicate comments:
In `@src/synthorg/memory/backends/mem0/adapter.py`:
- Around line 167-181: The code is using self._mem0_config.embedder.model for
pricing/attribution which can differ from the resolved checkpoint/base; call or
use the resolved config returned by build_mem0_config_dict() (or the resolved
embedder model it produces) and use that resolved_model variable for cost lookup
(cost_cfg.model_pricing), the debug/unpriced check, cost_usd calculation, and
when populating the CostRecord provider/model fields (instead of
self._mem0_config.embedder.model); ensure you replace all uses in this block
with the resolved model value so pricing and attribution match the actual
embedder used.
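Several of the test comments above turn on which dotted path to hand to `patch`. The sketch below shows the general lookup rule with two throwaway modules; `pipeline_mod`, `hooks_mod`, `propose`, `run`, and `run_lazy` are all hypothetical names. Which target is right for `try_procedural_memory` depends on whether it imports `propose_procedural_memory` at module level or lazily inside the function, which this page does not show.

```python
import sys
import types
from unittest.mock import patch

# "pipeline_mod" defines the function (hypothetical module).
pipeline_mod = types.ModuleType("pipeline_mod")
exec("def propose():\n    return 'real'\n", pipeline_mod.__dict__)
sys.modules["pipeline_mod"] = pipeline_mod

# "hooks_mod" uses it two ways: via a module-level import (run) and
# via an import performed at call time (run_lazy).
hooks_mod = types.ModuleType("hooks_mod")
sys.modules["hooks_mod"] = hooks_mod
exec(
    "from pipeline_mod import propose\n"
    "def run():\n"
    "    return propose()\n"
    "def run_lazy():\n"
    "    from pipeline_mod import propose as lazy\n"
    "    return lazy()\n",
    hooks_mod.__dict__,
)

# Patching the defining module only affects lookups made after the patch.
with patch("pipeline_mod.propose", return_value="mock"):
    patched_at_source = (hooks_mod.run(), hooks_mod.run_lazy())

# Patching the caller's namespace replaces the already-imported name.
with patch("hooks_mod.propose", return_value="mock"):
    patched_at_caller = (hooks_mod.run(), hooks_mod.run_lazy())
```

With a module-level import, only the caller-side target intercepts the call; with a lazy import inside the function, only the source-side target does. Checking the import style in `memory_hooks.py` first settles which of the suggested targets applies.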
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1abd60d3-0a21-455a-9dd7-05259eeb2fe2
📒 Files selected for processing (20)
CLAUDE.md
docs/design/memory.md
docs/design/operations.md
src/synthorg/budget/category_analytics.py
src/synthorg/budget/cost_record.py
src/synthorg/engine/post_execution/memory_hooks.py
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/backends/mem0/config.py
src/synthorg/memory/consolidation/config.py
src/synthorg/memory/tool_retriever.py
src/synthorg/observability/events/budget.py
tests/unit/budget/test_call_category.py
tests/unit/engine/post_execution/test_memory_hooks.py
tests/unit/memory/backends/mem0/test_embedding_cost_config.py
tests/unit/memory/consolidation/test_config.py
tests/unit/memory/consolidation/test_llm_strategy.py
tests/unit/memory/test_ranking.py
tests/unit/memory/test_retriever.py
web/src/pages/budget/CategoryBreakdown.tsx
web/src/utils/budget.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (8)
docs/**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Documentation files in docs/ are built with Zensical from Markdown. Config: mkdocs.yml. Output: _site/docs/.
Files:
docs/design/operations.md
docs/design/memory.md
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do NOT use `from __future__ import annotations` in Python code; Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: `except A, B:` (no parentheses) in Python 3.14 code; ruff enforces this.
All public functions in Python must have type hints. Use mypy strict mode for type-checking.
Use Google-style docstrings on all public classes and functions in Python. This is enforced by ruff D rules.
Use NotBlankStr (from core.types) for all identifier/name fields in Python, including optional (NotBlankStr | None) and tuple variants, instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in Python; prefer structured concurrency over bare create_task.
Python line length must be 88 characters (enforced by ruff).
Python functions must be under 50 lines, files under 800 lines.
Handle errors explicitly in Python, never silently swallow exceptions.
Always use variable name `logger` (not `_logger` or `log`) for the logging instance in Python.
Lint Python with `uv run ruff check src/ tests/`. Auto-fix with `uv run ruff check src/ tests/ --fix`. Format with `uv run ruff format src/ tests/`.
Type-check Python with `uv run mypy src/ tests/` (strict mode).
Files:
tests/unit/budget/test_call_category.py
tests/unit/memory/test_ranking.py
src/synthorg/budget/cost_record.py
tests/unit/memory/backends/mem0/test_embedding_cost_config.py
src/synthorg/observability/events/budget.py
tests/unit/memory/consolidation/test_config.py
tests/unit/memory/test_retriever.py
src/synthorg/memory/tool_retriever.py
src/synthorg/budget/category_analytics.py
tests/unit/engine/post_execution/test_memory_hooks.py
src/synthorg/memory/backends/mem0/config.py
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
tests/unit/memory/consolidation/test_llm_strategy.py
src/synthorg/engine/post_execution/memory_hooks.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Test markers in Python: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Python test coverage must be 80% minimum (enforced in CI).
Use `@pytest.mark.parametrize` for testing similar cases in Python.
Use test-provider, test-small-001, etc. vendor-agnostic names in Python tests.
Use Hypothesis property-based testing in Python with `@given` + `@settings` decorators. Configure profiles: ci (deterministic, max_examples=10, derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Control via HYPOTHESIS_PROFILE env var.
When Hypothesis finds a failure in Python tests, fix the underlying bug and add an `@example(...)` decorator to permanently cover the case in CI.
Never skip, dismiss, or ignore flaky Python tests; fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep(). For tasks that must block indefinitely, use asyncio.Event().wait() instead of asyncio.sleep(large_number).
Run Python unit tests with `uv run python -m pytest tests/ -m unit -n 8`.
Run Python integration tests with `uv run python -m pytest tests/ -m integration -n 8`.
Run Python e2e tests with `uv run python -m pytest tests/ -m e2e -n 8`.
Files:
tests/unit/budget/test_call_category.py
tests/unit/memory/test_ranking.py
tests/unit/memory/backends/mem0/test_embedding_cost_config.py
tests/unit/memory/consolidation/test_config.py
tests/unit/memory/test_retriever.py
tests/unit/engine/post_execution/test_memory_hooks.py
tests/unit/memory/consolidation/test_llm_strategy.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
tests/unit/budget/test_call_category.py
tests/unit/memory/test_ranking.py
tests/unit/memory/backends/mem0/test_embedding_cost_config.py
tests/unit/memory/consolidation/test_config.py
tests/unit/memory/test_retriever.py
tests/unit/engine/post_execution/test_memory_hooks.py
tests/unit/memory/consolidation/test_llm_strategy.py
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
web/src/**/*.{ts,tsx}: TypeScript files in web dashboard must reuse existing components from web/src/components/ui/ before creating new ones.
React dashboard must use TypeScript 6.0+, React 19, react-router, shadcn/ui, Base UI, Tailwind CSS 4, Zustand,@tanstack/react-query. No hardcoded styles—use design tokens.
Linting and pre-commit checks must not be bypassed—ESLint web dashboard (zero warnings) is non-negotiable.
web/src/**/*.{ts,tsx}: Use Tailwind semantic classes (`text-foreground`, `bg-card`, `text-accent`, `text-success`, `bg-danger`, etc.) or CSS variables (`var(--so-*)`) for colors; NEVER hardcode hex values in `.tsx`/`.ts` files
Use `font-sans` or `font-mono` (Geist tokens) for typography; NEVER set `fontFamily` directly in `.tsx`/`.ts` files
Use density-aware tokens (`p-card`, `gap-section-gap`, `gap-grid-gap`) or standard Tailwind spacing; NEVER hardcode pixel values for layout spacing in components
Use token variables (`var(--so-shadow-card-hover)`, `border-border`, `border-bright`) for shadows and borders; NEVER hardcode values in `.tsx`/`.ts` files
Use `@/lib/motion` presets for Framer Motion transition durations; NEVER hardcode transition durations
CSS side-effect imports in TypeScript 6 require type declarations -- add `/// <reference types="vite/client" />` at the top of files with CSS imports
Files:
web/src/pages/budget/CategoryBreakdown.tsx
web/src/utils/budget.ts
web/src/**/*.{ts,tsx,css}
📄 CodeRabbit inference engine (CLAUDE.md)
web/src/**/*.{ts,tsx,css}: Never hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions in web dashboard code -- use design tokens and `@/lib/motion` presets.
Web dashboard scripts/check_web_design_system.py enforces component reuse and design token usage on every Edit/Write to web/src/.
Files:
web/src/pages/budget/CategoryBreakdown.tsx
web/src/utils/budget.ts
web/src/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx,js,jsx}: Always use `createLogger` from `@/lib/logger` -- never bare `console.warn`/`console.error`/`console.debug` in application code
Logger variable name must always be `const log` (e.g. `const log = createLogger('module-name')`)
Pass dynamic/untrusted values as separate arguments to logger methods (not interpolated into the message string) so they go through `sanitizeArg`
Attacker-controlled fields inside structured objects must be wrapped in `sanitizeForLog()` before embedding in log calls
Files:
web/src/pages/budget/CategoryBreakdown.tsx
web/src/utils/budget.ts
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Every Python module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
Never use `import logging`, `logging.getLogger()`, or `print()` in Python application code. Exceptions: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, observability/http_handler.py may use stdlib logging and print.
Use event name constants from synthorg.observability.events domain modules (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Use structured logging with `logger.info(EVENT, key=value)` syntax in Python -- never `logger.info('msg %s', val)`
All error paths in Python must log at WARNING or ERROR with context before raising.
All state transitions in Python must log at INFO.
DEBUG logging is for object creation, internal flow, and entry/exit of key functions in Python.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned Python code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small aliases. Exceptions: Operations design page, .claude/ skill files, third-party imports, provider presets (user-facing runtime data).
Library reference in docs/api/ is auto-generated via mkdocstrings + Griffe (AST-based).
Files:
src/synthorg/budget/cost_record.py
src/synthorg/observability/events/budget.py
src/synthorg/memory/tool_retriever.py
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/config.py
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
src/synthorg/engine/post_execution/memory_hooks.py
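The logging rules above (event constants, `key=value` context, WARNING before raising) might look roughly like this in practice. The sketch is self-contained, so it uses a hypothetical `RecordingLogger` as a stand-in for whatever `get_logger(__name__)` returns. The string values of both constants are assumptions, and `BUDGET_RECORD_REJECTED` is an invented name used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any

# In the project these would be imported from
# synthorg.observability.events.budget; the values here are assumed.
BUDGET_RECORD_ADDED = "budget.record.added"
BUDGET_RECORD_REJECTED = "budget.record.rejected"  # hypothetical constant


@dataclass
class RecordingLogger:
    """Stand-in for a structured logger; stores (level, event, context)."""

    records: list[tuple[str, str, dict[str, Any]]] = field(default_factory=list)

    def info(self, event: str, **context: Any) -> None:
        self.records.append(("INFO", event, context))

    def warning(self, event: str, **context: Any) -> None:
        self.records.append(("WARNING", event, context))


logger = RecordingLogger()


def record_cost(amount: float) -> float:
    """Hypothetical business function following the logging conventions."""
    if amount < 0:
        # Error path: log at WARNING with context *before* raising.
        logger.warning(BUDGET_RECORD_REJECTED, amount=amount, reason="negative")
        raise ValueError("amount must be non-negative")
    # Event constant first, then structured key=value context;
    # no printf-style message formatting.
    logger.info(BUDGET_RECORD_ADDED, amount=amount)
    return amount
```

Keeping the event name as a constant and the data as keyword arguments lets log consumers filter on the event without parsing message strings.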
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/budget/cost_record.py
src/synthorg/observability/events/budget.py
src/synthorg/memory/tool_retriever.py
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/config.py
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
src/synthorg/engine/post_execution/memory_hooks.py
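One possible reading of the 50-line counting rule above (body only, excluding signature lines, decorators, and docstring) can be sketched with the standard `ast` module. This is an interpretation, not the project's actual checker: `body_line_count` and `SAMPLE` are hypothetical, and the count here is the line span from the first to the last body statement, so blank lines and comments between statements are included.

```python
import ast
import textwrap


def body_line_count(source: str, func_name: str) -> int:
    """Count lines spanned by a function body, skipping a leading docstring."""
    tree = ast.parse(textwrap.dedent(source))
    for node in ast.walk(tree):
        if (
            isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
            and node.name == func_name
        ):
            body = node.body
            # Drop a leading docstring expression, if present.
            if (
                body
                and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)
            ):
                body = body[1:]
            if not body:
                return 0
            # Decorators and signature lines precede body[0], so they
            # never enter the span.
            return body[-1].end_lineno - body[0].lineno + 1
    raise ValueError(f"function {func_name!r} not found")


SAMPLE = '''
@decorator
def compute(
    a: int,
    b: int,
) -> int:
    """Add two numbers.

    Multi-line docstring.
    """
    total = a + b
    if total < 0:
        total = 0
    return total
'''
```

For `SAMPLE`, the decorator, the four signature lines, and the four docstring lines are ignored, leaving a body span of four lines.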
🧠 Learnings (74)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Parametrize: Prefer pytest.mark.parametrize for testing similar cases.
Applied to files:
tests/unit/budget/test_call_category.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Prefer `pytest.mark.parametrize` for testing similar cases.
Applied to files:
tests/unit/budget/test_call_category.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to tests/**/*.py : Prefer `pytest.mark.parametrize` for testing similar cases
Applied to files:
tests/unit/budget/test_call_category.py
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.{tsx,ts} : Use density-aware tokens (`p-card`, `gap-section-gap`, `gap-grid-gap`) or standard Tailwind spacing. NEVER hardcode pixel values for layout spacing
Applied to files:
web/src/pages/budget/CategoryBreakdown.tsx
📚 Learning: 2026-03-27T22:32:26.927Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-27T22:32:26.927Z
Learning: Applies to web/src/**/*.{tsx,ts} : Use density-aware tokens (p-card, gap-section-gap, gap-grid-gap) or standard Tailwind spacing; never hardcode pixel values for layout spacing
Applied to files:
web/src/pages/budget/CategoryBreakdown.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Applies to web/src/**/*.{ts,tsx} : Use density-aware tokens (`p-card`, `gap-section-gap`, `gap-grid-gap`) or standard Tailwind spacing; NEVER hardcode pixel values for layout spacing in components
Applied to files:
web/src/pages/budget/CategoryBreakdown.tsx
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)
Applied to files:
src/synthorg/budget/cost_record.py
src/synthorg/observability/events/budget.py
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)
Applied to files:
CLAUDE.md
docs/design/memory.md
tests/unit/memory/test_retriever.py
src/synthorg/memory/tool_retriever.py
tests/unit/engine/post_execution/test_memory_hooks.py
src/synthorg/memory/backends/mem0/config.py
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
tests/unit/memory/consolidation/test_llm_strategy.py
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...
Applied to files:
CLAUDE.md
docs/design/memory.md
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Applied to files:
CLAUDE.md
docs/design/memory.md
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/communication/**/*.py : Communication package (communication/): message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution; meeting/ subpackage for meeting protocol (round-robin, position papers, structured phases), scheduler (frequency, participant resolver), orchestrator
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/budget.py
src/synthorg/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget tracking includes pre-flight/in-flight checks, auto-downgrade, billing periods, cost tiers, quota/subscription. CFO includes anomaly detection, efficiency analysis, downgrade recommendations.
Applied to files:
src/synthorg/observability/events/budget.py
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/budget.py
src/synthorg/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly
Applied to files:
src/synthorg/observability/events/budget.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain modules (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/budget.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Applied to files:
src/synthorg/observability/events/budget.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/budget.py
📚 Learning: 2026-03-31T16:09:24.320Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:09:24.320Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging
Applied to files:
src/synthorg/observability/events/budget.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module
Applied to files:
src/synthorg/observability/events/budget.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Budget: Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError).
Applied to files:
src/synthorg/observability/events/budget.py
📚 Learning: 2026-03-16T20:14:00.937Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T20:14:00.937Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).
Applied to files:
tests/unit/memory/consolidation/test_config.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files)
Applied to files:
tests/unit/memory/consolidation/test_config.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory
Applied to files:
tests/unit/memory/test_retriever.py
tests/unit/engine/post_execution/test_memory_hooks.py
src/synthorg/memory/backends/mem0/config.py
src/synthorg/memory/backends/mem0/adapter.py
tests/unit/memory/consolidation/test_llm_strategy.py
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
tests/unit/memory/test_retriever.py
tests/unit/engine/post_execution/test_memory_hooks.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.
Applied to files:
tests/unit/memory/test_retriever.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.
Applied to files:
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict) with allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields.
Applied to files:
src/synthorg/budget/category_analytics.py
📚 Learning: 2026-04-02T11:49:21.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T11:49:21.202Z
Learning: Applies to **/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`) with `allow_inf_nan=False` in all `ConfigDict` declarations
Applied to files:
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`
Applied to files:
src/synthorg/budget/category_analytics.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to src/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time
Applied to files:
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-17T11:41:02.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T11:41:02.964Z
Learning: Applies to src/**/*.py : Models: Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `computed_field` for derived values instead of storing + validating redundant fields. Use `NotBlankStr` for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to **/*.py : Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use computed_field for derived values instead of storing + validating redundant fields. Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-26T15:18:16.848Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:18:16.848Z
Learning: Applies to src/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `computed_field` for derived values instead of storing redundant fields. Use `NotBlankStr` for all identifier/name fields.
Applied to files:
src/synthorg/budget/category_analytics.py
src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Pre-push hooks enforce: mypy type-check (affected Python modules) + pytest unit tests (affected modules) + golangci-lint + go vet + go test (CLI) + eslint-web (web dashboard). Foundational module changes or conftest changes trigger full runs.
Applied to files:
tests/unit/engine/post_execution/test_memory_hooks.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests; fix them fully and fundamentally by mocking `time.monotonic()` and `asyncio.sleep()` for determinism or using `asyncio.Event().wait()` for indefinite blocking
Applied to files:
tests/unit/engine/post_execution/test_memory_hooks.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`
Applied to files:
tests/unit/engine/post_execution/test_memory_hooks.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves
Applied to files:
src/synthorg/memory/backends/mem0/config.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/memory/backends/mem0/config.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state
Applied to files:
src/synthorg/memory/backends/mem0/config.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to **/*.py : Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 with adopted conventions: use computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.
Applied to files:
src/synthorg/memory/backends/mem0/config.py
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Applied to files:
src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-04-01T21:17:42.724Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T21:17:42.724Z
Learning: Applies to pyproject.toml : Required runtime dependencies: `mem0ai` (default and currently only memory backend), `cryptography` (Fernet encryption), `faker` (multi-locale name generation)
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to src/synthorg/**/*.py : All error paths in Python must log at WARNING or ERROR with context before raising.
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-31T20:07:03.035Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T20:07:03.035Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising; all state transitions must log at INFO; DEBUG for object creation, internal flow, entry/exit of key functions
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings
Applied to files:
src/synthorg/memory/backends/mem0/adapter.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/memory/consolidation/config.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14
Applied to files:
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) in Python 3.14 code—ruff enforces this.
Applied to files:
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.
Applied to files:
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to **/*.py : Handle errors explicitly in Python, never silently swallow exceptions.
Applied to files:
src/synthorg/engine/post_execution/memory_hooks.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use `NotBlankStr` from `core.types` for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators
Applied to files:
src/synthorg/engine/post_execution/memory_hooks.py
| if not cost_cfg.enabled or self._cost_tracker is None: | ||
| return |
Wire the tracker into the real Mem0 backend construction path.
This short-circuit makes embedding tracking a silent no-op whenever cost_tracker is absent, and the production factory snippet (src/synthorg/memory/factory.py:100-104) still constructs Mem0MemoryBackend(...) without passing one. As shipped, no embedding CostRecords will be emitted in production.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/memory/backends/mem0/adapter.py` around lines 163 - 164, Remove
the silent no-op by wiring the tracker through and only gating on the config:
update the Mem0MemoryBackend construction in the factory that currently calls
Mem0MemoryBackend(...) to pass the cost_tracker instance through, ensure
Mem0MemoryBackend.__init__ accepts a cost_tracker parameter and assigns it to
self._cost_tracker, and in adapter.py change the check "if not cost_cfg.enabled
or self._cost_tracker is None: return" to only check cost_cfg.enabled (remove
the "or self._cost_tracker is None" clause) so the backend will attempt to emit
CostRecord events when enabled and the tracker is supplied.
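A minimal sketch of the wiring described above, assuming simplified stand-ins for the real classes. `InMemoryTracker`, `build_backend`, and the constructor-time check are hypothetical and not taken from the repository; only the `enabled` flag and the idea of forwarding `cost_tracker` through the factory come from the review. Dropping the `is None` clause alone would raise `AttributeError` when tracking is enabled without a tracker, so this sketch rejects that combination at construction instead.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EmbeddingCostConfig:
    """Stand-in config; only the `enabled` flag comes from the review."""

    enabled: bool = False


@dataclass
class InMemoryTracker:
    """Hypothetical tracker that collects records in a list."""

    records: list[dict] = field(default_factory=list)

    async def record(self, entry: dict) -> None:
        self.records.append(entry)


class Backend:
    """Simplified stand-in for Mem0MemoryBackend."""

    def __init__(self, cost_cfg: EmbeddingCostConfig, cost_tracker=None) -> None:
        # Assumption: fail at construction time so an enabled-but-unwired
        # backend cannot turn into a silent no-op later.
        if cost_cfg.enabled and cost_tracker is None:
            raise ValueError("embedding cost tracking enabled without a tracker")
        self._cost_cfg = cost_cfg
        self._cost_tracker = cost_tracker

    async def record_embedding_cost(self, *, agent_id: str, operation: str) -> None:
        # Gate on the config flag only; the constructor guarantees a tracker.
        if not self._cost_cfg.enabled:
            return
        await self._cost_tracker.record(
            {"agent_id": agent_id, "operation": operation, "call_category": "embedding"}
        )


def build_backend(cost_cfg: EmbeddingCostConfig, cost_tracker=None) -> Backend:
    """Hypothetical factory: the point is that it forwards the tracker."""
    return Backend(cost_cfg, cost_tracker=cost_tracker)
```

Whether a missing tracker should raise or only log a warning is a design choice for the maintainers; the sketch shows just one option.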
| except asyncio.CancelledError: | ||
| raise |
Don't leave the backend marked connected after a cancelled sparse init.
self._client and self._connected are already set before this branch runs. If async_init_sparse_field() is cancelled, connect() raises but the backend stays marked connected with a partially initialized client, so later calls can observe a connection that never actually completed.
🔧 Suggested fix
- self._client = client # pyright: ignore[reportAttributeAccessIssue]
- self._connected = True
- # Expose the Qdrant client for sparse vector operations.
+ # Expose the Qdrant client for sparse vector operations.
+ qdrant: Any = None
if self._mem0_config.sparse_search_enabled:
- qdrant: Any = None
try:
qdrant = client.vector_store.client # pyright: ignore[reportAttributeAccessIssue]
await async_init_sparse_field(
qdrant,
self._mem0_config.collection_name,
@@
except Exception as exc:
logger.warning(
MEMORY_BACKEND_CONNECTION_FAILED,
backend="mem0",
operation="sparse_init",
@@
reason="sparse_init_failed_falling_back_to_dense",
)
qdrant = None
- self._qdrant_client = qdrant
+ self._client = client # pyright: ignore[reportAttributeAccessIssue]
+ self._qdrant_client = qdrant
+ self._connected = True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/memory/backends/mem0/adapter.py` around lines 296 - 297, When
async_init_sparse_field() is cancelled inside connect(), the CancelledError
handler must undo partial connection state so the backend isn't left marked
connected; update the except asyncio.CancelledError block in connect() to (1) if
self._client exists, attempt to close/disconnect it (call its
close()/disconnect() or await close if async) or otherwise drop references, then
(2) set self._client = None and self._connected = False, and finally re-raise
the CancelledError so callers still see the cancellation; ensure you reference
async_init_sparse_field and connect and clean up both self._client and
self._connected.
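The rollback can be illustrated in isolation. This is a sketch under stated assumptions: `Backend` is a stripped-down stand-in, `sparse_init` is a hypothetical injected coroutine function standing in for `async_init_sparse_field`, and client shutdown is omitted.

```python
import asyncio


class Backend:
    """Stand-in backend; `sparse_init` is a hypothetical injected coroutine."""

    def __init__(self, sparse_init) -> None:
        self._sparse_init = sparse_init
        self._client = None
        self._connected = False

    async def connect(self) -> None:
        self._client = object()  # placeholder for the real SDK client
        self._connected = True
        try:
            await self._sparse_init()
        except asyncio.CancelledError:
            # Undo the partial connection, then let the cancellation propagate.
            self._client = None
            self._connected = False
            raise
        except Exception:
            # Real code would log a warning and fall back to dense-only search;
            # the backend stays connected in that case.
            pass


async def demo() -> tuple[bool, bool]:
    async def hang() -> None:
        await asyncio.sleep(3600)

    backend = Backend(hang)
    task = asyncio.create_task(backend.connect())
    await asyncio.sleep(0)  # let connect() run up to the sparse-init await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        cancelled = True
    else:
        cancelled = False
    return cancelled, backend._connected
```

Running `asyncio.run(demo())` shows that the caller still observes the cancellation while the backend no longer reports itself as connected.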
| await self._record_embedding_cost( | ||
| agent_id=str(agent_id), | ||
| task_id="memory-store", | ||
| content_length=len(request.content), | ||
| operation="store", | ||
| ) |
Record embedding spend immediately after the Mem0 SDK call returns.
client.add() and client.search() have already triggered the embedding work by the time these hooks run. Because the record call sits after result validation, mapping/filtering, and sparse-upsert work, any downstream failure drops the CostRecord even though the embedding spend already happened.
🔧 Suggested fix
result = await asyncio.to_thread(client.add, **kwargs)
+ await self._record_embedding_cost(
+ agent_id=str(agent_id),
+ task_id="memory-store",
+ content_length=len(request.content),
+ operation="store",
+ )
memory_id = validate_add_result(result, context="store")
@@
else:
await self._try_sparse_upsert(agent_id, memory_id, request.content)
- await self._record_embedding_cost(
- agent_id=str(agent_id),
- task_id="memory-store",
- content_length=len(request.content),
- operation="store",
- )
logger.info(
MEMORY_ENTRY_STORED,
agent_id=agent_id,
memory_id=memory_id,
category=request.category.value,

if query.text is not None:
kwargs = query_to_mem0_search_args(str(agent_id), query)
raw_result = await asyncio.to_thread(client.search, **kwargs)
+ await self._record_embedding_cost(
+ agent_id=str(agent_id),
+ task_id="memory-retrieve",
+ content_length=len(query.text),
+ operation="retrieve",
+ )
else:
kwargs = query_to_mem0_getall_args(str(agent_id), query)
raw_result = await asyncio.to_thread(client.get_all, **kwargs)
@@
else:
- if query.text is not None:
- await self._record_embedding_cost(
- agent_id=str(agent_id),
- task_id="memory-retrieve",
- content_length=len(query.text),
- operation="retrieve",
- )
logger.info(
MEMORY_ENTRY_RETRIEVED,
agent_id=agent_id,
count=len(entries),
)
Also applies to: 633-639
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/memory/backends/mem0/adapter.py` around lines 559 - 564, The
embedding cost must be recorded immediately after the Mem0 SDK call returns to
avoid losing the CostRecord on downstream failures: move the
_record_embedding_cost(...) call to directly follow the client.add(...) (for
store flows) and client.search(...) (for search flows) success returns inside
the same function (e.g., in the Mem0 adapter methods where client.add and
client.search are invoked), rather than after
validation/mapping/filtering/sparse-upsert; ensure you pass the same parameters
(agent_id, task_id, content_length, operation) and leave downstream processing
unchanged so the cost is persisted even if later steps fail.
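The ordering argument can be sketched with a fake client. Everything here is illustrative: `BrokenClient`, the simplified `validate_add_result`, and the default of 4 characters per token are assumptions, not the project's actual values.

```python
import asyncio
import math


def estimate_tokens(content_length: int, chars_per_token: float = 4.0) -> int:
    """Chars-per-token heuristic; the default ratio of 4 is an assumption."""
    return max(1, math.ceil(content_length / chars_per_token))


def validate_add_result(result: dict) -> str:
    """Hypothetical validator: raises when the SDK result has no id."""
    if "id" not in result:
        raise ValueError("malformed add result")
    return result["id"]


class Backend:
    def __init__(self, client) -> None:
        self._client = client
        self.cost_records: list[dict] = []

    async def _record_embedding_cost(self, *, operation: str, content_length: int) -> None:
        self.cost_records.append(
            {"operation": operation, "tokens": estimate_tokens(content_length)}
        )

    async def store(self, content: str) -> str:
        result = await asyncio.to_thread(self._client.add, content)
        # The embedding call has been paid for at this point, so record first.
        await self._record_embedding_cost(operation="store", content_length=len(content))
        # Validation may still raise; the cost record above survives that.
        return validate_add_result(result)


class BrokenClient:
    """Fake SDK client whose add() returns a result without an id."""

    def add(self, content: str) -> dict:
        return {}
```

With `BrokenClient`, `store()` raises `ValueError`, yet `cost_records` already holds one entry, which is the behaviour the comment asks for.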
| with patch( | ||
| "synthorg.memory.procedural.pipeline.propose_procedural_memory", | ||
| new_callable=AsyncMock, | ||
| ) as mock_propose: | ||
| await try_procedural_memory( | ||
| result, | ||
| recovery, | ||
| str(_AGENT_UUID), | ||
| "task-hook-001", | ||
| procedural_proposer=proposer, | ||
| memory_backend=backend, | ||
| ) | ||
|
|
||
| mock_propose.assert_awaited_once() | ||
| call_kwargs = mock_propose.call_args | ||
| assert call_kwargs[1]["proposer"] is proposer | ||
| assert call_kwargs[1]["memory_backend"] is backend |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the import style in memory_hooks.py to confirm patch target
rg -n "from synthorg.memory.procedural.pipeline import" src/synthorg/engine/post_execution/memory_hooks.py
Repository: Aureliolo/synthorg
Length of output: 140
🏁 Script executed:
# Get the complete import statement and surrounding context
sed -n '126,135p' src/synthorg/engine/post_execution/memory_hooks.py
Repository: Aureliolo/synthorg
Length of output: 410
🏁 Script executed:
# Find where propose_procedural_memory is called in memory_hooks.py
rg -n "propose_procedural_memory" src/synthorg/engine/post_execution/memory_hooks.py
Repository: Aureliolo/synthorg
Length of output: 149
🏁 Script executed:
# Check the test file to see the try_procedural_memory function being tested
sed -n '200,230p' tests/unit/engine/post_execution/test_memory_hooks.py
Repository: Aureliolo/synthorg
Length of output: 1254
Patch target will not intercept the imported function.
When memory_hooks.py imports propose_procedural_memory at line 127, Python creates a reference in the module's namespace. The function is then called at line 131 by looking it up in that namespace, not in the source module. Per Python mocking best practices, patch where the name is looked up.
🐛 Proposed fix
with patch(
- "synthorg.memory.procedural.pipeline.propose_procedural_memory",
+ "synthorg.engine.post_execution.memory_hooks.propose_procedural_memory",
new_callable=AsyncMock,
) as mock_propose:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/engine/post_execution/test_memory_hooks.py` around lines 210 -
226, The test is patching the original function's module but the code under test
calls the imported name from memory_hooks, so the patch doesn't intercept the
call; change the patch target to the symbol as looked up in the module that
calls it (patch
"synthorg.engine.post_execution.memory_hooks.propose_procedural_memory" instead
of "synthorg.memory.procedural.pipeline.propose_procedural_memory") so the
AsyncMock replaces the imported reference used by try_procedural_memory; keep
the same assertions against mock_propose (proposer and memory_backend) after
adjusting the patch target.
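The "patch where the name is looked up" rule can be checked with two throwaway modules. This is a self-contained sketch: `demo_source` and `demo_caller` are hypothetical module names built at runtime and have nothing to do with the project's layout.

```python
import sys
import types
from unittest.mock import patch

# Source module: defines the real function.
source = types.ModuleType("demo_source")
exec("def propose():\n    return 'real'\n", source.__dict__)
sys.modules["demo_source"] = source

# Caller module: binds the function into its own namespace at import time.
caller = types.ModuleType("demo_caller")
sys.modules["demo_caller"] = caller
exec(
    "from demo_source import propose\n\n"
    "def run():\n"
    "    return propose()\n",
    caller.__dict__,
)

# Patching the source module leaves the caller's own reference untouched.
with patch("demo_source.propose", return_value="mocked"):
    patched_at_source = caller.run()

# Patching the caller's namespace replaces the name that run() looks up.
with patch("demo_caller.propose", return_value="mocked"):
    patched_at_caller = caller.run()
```

Here `patched_at_source` is `'real'` and `patched_at_caller` is `'mocked'`. One caveat worth verifying against the actual file: if the `from ... import` statement sits inside the function body rather than at module level, the name is re-resolved from the source module on every call, and patching the source module does intercept it.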
| with patch( | ||
| "synthorg.memory.procedural.pipeline.propose_procedural_memory", | ||
| new_callable=AsyncMock, | ||
| side_effect=RuntimeError("boom"), | ||
| ): | ||
| # Should not raise. | ||
| await try_procedural_memory( | ||
| result, | ||
| recovery, | ||
| str(_AGENT_UUID), | ||
| "task-hook-001", | ||
| procedural_proposer=proposer, | ||
| memory_backend=backend, | ||
| ) |
Same patch target issue applies here.
This test also patches the source module instead of where the function is imported.
🐛 Proposed fix
with patch(
- "synthorg.memory.procedural.pipeline.propose_procedural_memory",
+ "synthorg.engine.post_execution.memory_hooks.propose_procedural_memory",
new_callable=AsyncMock,
side_effect=RuntimeError("boom"),
):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/engine/post_execution/test_memory_hooks.py` around lines 235 -
248, The test currently patches the implementation location
"synthorg.memory.procedural.pipeline.propose_procedural_memory" but
try_procedural_memory uses the imported name, so update the patch target to the
module where try_procedural_memory imports propose_procedural_memory (i.e.,
patch "synthorg.engine.post_execution.memory_hooks.propose_procedural_memory"
instead) so
the AsyncMock actually replaces the function used by try_procedural_memory; keep
the AsyncMock(side_effect=RuntimeError("boom")) and the same await call to
try_procedural_memory with proposer and memory_backend.
| with ( | ||
| patch( | ||
| "synthorg.memory.procedural.pipeline.propose_procedural_memory", | ||
| new_callable=AsyncMock, | ||
| side_effect=MemoryError("out of memory"), | ||
| ), | ||
| pytest.raises(MemoryError, match="out of memory"), | ||
| ): | ||
| await try_procedural_memory( | ||
| result, | ||
| recovery, | ||
| str(_AGENT_UUID), | ||
| "task-hook-001", | ||
| procedural_proposer=proposer, | ||
| memory_backend=backend, | ||
| ) |
Same patch target issue applies here.
🐛 Proposed fix
with (
patch(
- "synthorg.memory.procedural.pipeline.propose_procedural_memory",
+ "synthorg.engine.post_execution.memory_hooks.propose_procedural_memory",
new_callable=AsyncMock,
side_effect=MemoryError("out of memory"),
),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/engine/post_execution/test_memory_hooks.py` around lines 257 -
272, The test is patching the wrong import path for propose_procedural_memory;
update the patch target to the module where try_procedural_memory imports or
calls propose_procedural_memory (i.e., patch the symbol as used by
try_procedural_memory rather than
"synthorg.memory.procedural.pipeline.propose_procedural_memory"), keeping
new_callable=AsyncMock and side_effect=MemoryError("out of memory"), then run
the same pytest.raises assertion around await try_procedural_memory so the
AsyncMock is applied correctly.
| def test_pool_limit_with_diversity_enabled(self) -> None: | ||
| strategy = ContextInjectionStrategy( | ||
| backend=backend, | ||
| config=config, | ||
| ) | ||
| result = await strategy.prepare_messages( | ||
| agent_id="agent-1", | ||
| query_text="query", | ||
| token_budget=5000, | ||
| backend=_make_backend(), | ||
| config=MemoryRetrievalConfig( | ||
| diversity_penalty_enabled=True, | ||
| candidate_pool_multiplier=3, | ||
| max_memories=10, | ||
| ), | ||
| ) | ||
| assert len(result) == 1 | ||
| content = result[0].content | ||
| assert content is not None | ||
| assert "high score" in content | ||
| assert "low score" not in content | ||
| assert strategy._compute_pool_limit() == 30 | ||
|
|
||
| async def test_rrf_sparse_error_degrades_to_dense(self) -> None: | ||
| """Sparse search failure degrades gracefully to dense-only.""" | ||
| entry = _make_entry(content="dense survives", relevance_score=0.9) | ||
| backend = _make_sparse_backend(dense_entries=(entry,)) | ||
| backend.retrieve_sparse = AsyncMock( | ||
| side_effect=MemoryRetrievalError("sparse broken"), | ||
| ) | ||
| config = MemoryRetrievalConfig( | ||
| fusion_strategy=FusionStrategy.RRF, | ||
| min_relevance=0.0, | ||
| ) | ||
| def test_pool_limit_without_diversity(self) -> None: | ||
| strategy = ContextInjectionStrategy( | ||
| backend=backend, | ||
| config=config, | ||
| ) | ||
| result = await strategy.prepare_messages( | ||
| agent_id="agent-1", | ||
| query_text="query", | ||
| token_budget=5000, | ||
| backend=_make_backend(), | ||
| config=MemoryRetrievalConfig( | ||
| diversity_penalty_enabled=False, | ||
| max_memories=10, | ||
| ), | ||
| ) | ||
| assert len(result) == 1 | ||
| content = result[0].content | ||
| assert content is not None | ||
| assert "dense survives" in content | ||
| assert strategy._compute_pool_limit() == 10 | ||
|
|
||
| async def test_linear_path_unchanged(self) -> None: | ||
| """LINEAR fusion uses existing rank_memories path.""" | ||
| entry = _make_entry(content="linear path", relevance_score=0.9) | ||
| backend = _make_backend((entry,)) | ||
| config = MemoryRetrievalConfig( | ||
| fusion_strategy=FusionStrategy.LINEAR, | ||
| min_relevance=0.0, | ||
| ) | ||
| strategy = ContextInjectionStrategy( | ||
| backend=backend, | ||
| config=config, | ||
| ) | ||
| result = await strategy.prepare_messages( | ||
| agent_id="agent-1", | ||
| query_text="query", | ||
| token_budget=5000, | ||
| ) | ||
| assert len(result) == 1 | ||
| content = result[0].content | ||
| assert content is not None | ||
| assert "linear path" in content | ||
| def test_default_multiplier_is_three(self) -> None: | ||
| config = MemoryRetrievalConfig() | ||
| assert config.candidate_pool_multiplier == 3 | ||
|
|
||
| async def test_rrf_deduplicates_entries(self) -> None: | ||
| """Same entry in both dense and sparse is deduplicated by RRF.""" | ||
| entry = _make_entry( | ||
| entry_id="shared-id", | ||
| content="appears twice", | ||
| relevance_score=0.8, | ||
| ) | ||
| backend = _make_sparse_backend( | ||
| dense_entries=(entry,), | ||
| sparse_entries=(entry,), | ||
| ) | ||
| config = MemoryRetrievalConfig( | ||
| fusion_strategy=FusionStrategy.RRF, | ||
| min_relevance=0.0, | ||
| ) | ||
| strategy = ContextInjectionStrategy( | ||
| backend=backend, | ||
| config=config, | ||
| ) | ||
| result = await strategy.prepare_messages( | ||
| agent_id="agent-1", | ||
| query_text="query", | ||
| token_budget=5000, | ||
| ) | ||
| assert len(result) == 1 | ||
| content = result[0].content | ||
| assert content is not None | ||
| # Should appear only once (deduplicated) | ||
| assert content.count("appears twice") == 1 | ||
| def test_multiplier_out_of_range(self) -> None: | ||
| with pytest.raises(ValidationError): | ||
| MemoryRetrievalConfig(candidate_pool_multiplier=11) |
Parameterize similar candidate-pool cases and cover both validation boundaries.
Lines 622–649 define closely related cases that should be consolidated with @pytest.mark.parametrize, and the out-of-range validation should include the lower bound (0) in addition to 11.
♻️ Proposed refactor
@pytest.mark.unit
class TestCandidatePoolMultiplier:
- def test_pool_limit_with_diversity_enabled(self) -> None:
+ @pytest.mark.parametrize(
+ ("diversity_penalty_enabled", "candidate_pool_multiplier", "max_memories", "expected"),
+ [
+ (True, 3, 10, 30),
+ (False, 3, 10, 10),
+ ],
+ )
+ def test_pool_limit(
+ self,
+ diversity_penalty_enabled: bool,
+ candidate_pool_multiplier: int,
+ max_memories: int,
+ expected: int,
+ ) -> None:
strategy = ContextInjectionStrategy(
backend=_make_backend(),
config=MemoryRetrievalConfig(
- diversity_penalty_enabled=True,
- candidate_pool_multiplier=3,
- max_memories=10,
+ diversity_penalty_enabled=diversity_penalty_enabled,
+ candidate_pool_multiplier=candidate_pool_multiplier,
+ max_memories=max_memories,
),
)
- assert strategy._compute_pool_limit() == 30
-
- def test_pool_limit_without_diversity(self) -> None:
- strategy = ContextInjectionStrategy(
- backend=_make_backend(),
- config=MemoryRetrievalConfig(
- diversity_penalty_enabled=False,
- max_memories=10,
- ),
- )
- assert strategy._compute_pool_limit() == 10
+ assert strategy._compute_pool_limit() == expected
def test_default_multiplier_is_three(self) -> None:
config = MemoryRetrievalConfig()
assert config.candidate_pool_multiplier == 3
- def test_multiplier_out_of_range(self) -> None:
+ @pytest.mark.parametrize("candidate_pool_multiplier", [0, 11])
+ def test_multiplier_out_of_range(self, candidate_pool_multiplier: int) -> None:
with pytest.raises(ValidationError):
- MemoryRetrievalConfig(candidate_pool_multiplier=11)
+ MemoryRetrievalConfig(
+ candidate_pool_multiplier=candidate_pool_multiplier,
+ )
As per coding guidelines, “Use @pytest.mark.parametrize for testing similar cases in Python.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/memory/test_retriever.py` around lines 622 - 649, The tests for
candidate pool multiplier are duplicated and miss the lower validation boundary;
refactor the two pool-limit tests (test_pool_limit_with_diversity_enabled and
test_pool_limit_without_diversity) into a single `@pytest.mark.parametrize` test that
calls ContextInjectionStrategy(..., config=MemoryRetrievalConfig(...)) and asserts
strategy._compute_pool_limit() equals the expected value for each case; keep
test_default_multiplier_is_three as a separate test, and add
a second validation test alongside the existing one (which raises
ValidationError for 11) to also assert
MemoryRetrievalConfig(candidate_pool_multiplier=0) raises ValidationError;
reference ContextInjectionStrategy, MemoryRetrievalConfig, _compute_pool_limit,
and ValidationError when locating code to change.
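For orientation, the behaviour these tests exercise can be sketched as follows. This is not the repository's implementation: `RetrievalConfig` is a hypothetical dataclass standing in for the Pydantic `MemoryRetrievalConfig`, and it raises `ValueError` where the real model would raise `ValidationError`. The default of 3 and the 1-10 range come from the PR description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalConfig:
    """Hypothetical stand-in for MemoryRetrievalConfig."""

    max_memories: int = 10
    diversity_penalty_enabled: bool = False
    candidate_pool_multiplier: int = 3

    def __post_init__(self) -> None:
        # Both boundaries matter: 0 and 11 must be rejected, 1 and 10 accepted.
        if not 1 <= self.candidate_pool_multiplier <= 10:
            raise ValueError("candidate_pool_multiplier must be in [1, 10]")


def compute_pool_limit(config: RetrievalConfig) -> int:
    """Over-fetch only when diversity re-ranking will trim the pool afterwards."""
    if config.diversity_penalty_enabled:
        return config.max_memories * config.candidate_pool_multiplier
    return config.max_memories
```

With `max_memories=10` and the default multiplier, the limit is 30 when the diversity penalty is enabled and 10 otherwise, matching the two parametrized cases in the proposed refactor.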
- Add EMBEDDING row to call-category table in operations.md
- Reset client/connected state on CancelledError during sparse init
- Add whitespace-only model name rejection test for EmbeddingCostConfig
- Add lower-bound validation test (multiplier=0) for candidate_pool_multiplier
🤖 I have created a release *beep* *boop*
---
## [0.6.3](v0.6.2...v0.6.3) (2026-04-06)

### Features
* backend CRUD + multi-user permissions ([#1081](#1081), [#1082](#1082)) ([#1094](#1094)) ([93e469b](93e469b))
* in-dashboard team editing + budget rebalance on pack apply ([#1093](#1093)) ([35977c0](35977c0)), closes [#1079](#1079) [#1080](#1080)
* tiered rate limiting, NotificationSink protocol, in-dashboard notifications ([#1092](#1092)) ([df2142c](df2142c)), closes [#1077](#1077) [#1078](#1078) [#849](#849)
* two-stage safety classifier and cross-provider uncertainty check for approval gates ([#1090](#1090)) ([0b2edee](0b2edee)), closes [#847](#847) [#701](#701)

### Refactoring
* memory pipeline improvements ([#1075](#1075), [#997](#997)) ([#1091](#1091)) ([a048a4c](a048a4c))

### Documentation
* add OpenCode parity setup and hookify rule documentation ([#1095](#1095)) ([52e877a](52e877a))

### Maintenance
* bump vite from 8.0.3 to 8.0.4 in /web in the all group across 1 directory ([#1088](#1088)) ([1e86ca6](1e86ca6))
* tune ZAP DAST scan -- auth, timeouts, rules, report artifacts ([#1097](#1097)) ([82bf0e1](82bf0e1)), closes [#1096](#1096)

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Summary
Memory consolidation follow-ups from PR #1071 review and embedding cost tracking.
#1075: Memory Consolidation Follow-ups (18 of 22 items)
Code polish (D16-D22):
- SynthesisOutcome StrEnum replaces bare bool in _synthesize return
- _select_entries returns immutable tuple instead of list
- MemoryToolName StrEnum narrows DistillationRequest.memory_tool_invocations
- Defensive asyncio.CancelledError re-raise in all except Exception blocks
- Final-round sufficiency check in _reformulation_loop with MEMORY_REFORMULATION_FINAL_CHECK event
- Remove misleading MEMORY_RETRIEVAL_START log from ContextInjectionStrategy.__init__

Architectural (A1, A2, A4):
- LLMConsolidationConfig frozen Pydantic model replaces loose __init__ kwargs and module constants (9 fields with validation)
- candidate_pool_multiplier on MemoryRetrievalConfig for MMR diversity re-ranking (over-fetch, filter+MMR, truncate)
- scoring_strategy discriminator field on ScoredMemory (LINEAR vs RRF provenance)

File splits (B6, B7, B8):
- try_capture_distillation and try_procedural_memory into engine/post_execution/memory_hooks.py
- TestHybridSearchPipeline to test_retriever_hybrid.py
- Extract _compute_pool_limit() helper

Test hygiene (C10-C13, C15):
- AsyncMock(spec=...) discipline across all memory test files
- Replace call_args[0][N] positional indexing with .args[N]
- _make_strategy fixture with **config_overrides
- _merge_results and _bigram_jaccard to public APIs
- structlog.testing.capture_logs()

Verified as already done: A3 (summary_ids), C14 (parallelism proof)
Deferred: A5 (DistillationRequest downstream), C9 (integration test), D19 (NotBlankStr no-ops)
#997: Track Embedding Model Costs in Budget System
- Add EMBEDDING to LLMCallCategory enum
- Add EmbeddingCostConfig with per-model pricing and chars-per-token heuristic
- Instrument Mem0MemoryBackend.store() and retrieve() to record CostRecord with call_category=EMBEDDING
- Extend CategoryBreakdown with embedding cost/tokens/count fields
- CategoryBreakdown component with bg-accent-dim design token

Test Plan
Review Coverage
Pre-reviewed by 5 agents (code-reviewer, type-design-analyzer, silent-failure-hunter, issue-resolution-verifier, conventions-enforcer). 13 findings addressed in final fix commit.
Closes #1075
Closes #997