feat: add RRF rank fusion to memory ranking#478
Conversation
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (9)
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds Reciprocal Rank Fusion (RRF) as an optional fusion strategy in memory ranking, exposes Changes
sequenceDiagram
actor Client
participant RetrieverA as "Retriever A\n(pre-ranked list)"
participant RetrieverB as "Retriever B\n(pre-ranked list)"
participant Ranking as "memory.ranking\n(fuse_ranked_lists)"
participant Observability as "observability/events"
Client->>RetrieverA: request retrieval
Client->>RetrieverB: request retrieval
RetrieverA-->>Client: ranked_list_A
RetrieverB-->>Client: ranked_list_B
Client->>Ranking: fuse_ranked_lists([ranked_list_A, ranked_list_B], k=60)
Ranking->>Ranking: compute RRF scores, normalize -> ScoredMemory[]
Ranking->>Observability: emit MEMORY_RRF_FUSION_COMPLETE(info)
alt validation failed
Ranking->>Observability: emit MEMORY_RRF_VALIDATION_FAILED(info)
end
Ranking-->>Client: fused ScoredMemory[]
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
✨ Simplify code
📝 Coding Plan
Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the memory ranking system by introducing Reciprocal Rank Fusion (RRF) as a new strategy. This allows for more sophisticated merging of ranked lists from multiple retrieval sources, moving beyond the existing linear combination approach. The change provides greater flexibility and improved performance for hybrid search scenarios in memory retrieval, while maintaining the linear strategy as the default for single-source ranking. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces Reciprocal Rank Fusion (RRF) as an alternative strategy for memory ranking. The changes include adding a FusionStrategy enum, new configuration fields in MemoryRetrievalConfig with associated validation, and the core fuse_ranked_lists implementation. The documentation and design specifications have been updated accordingly, and the feature is supported by a comprehensive set of new unit tests that cover various edge cases and scenarios. The overall quality of the implementation is high. My feedback includes one suggestion to improve the maintainability of a configuration validator by dynamically accessing a field's default value instead of hardcoding it.
| @model_validator(mode="after") | ||
| def _validate_rrf_k_strategy_consistency(self) -> Self: | ||
| """Warn when rrf_k is customized but fusion strategy is LINEAR.""" | ||
| _default_rrf_k = 60 |
There was a problem hiding this comment.
Hardcoding the default value _default_rrf_k = 60 makes this validator brittle. If the default value for the rrf_k field is ever changed in its Field definition, this check will become incorrect, leading to false warnings or missed ones.
To make this more robust and maintainable, you can retrieve the default value directly from the model's field definition. This ensures the validator always uses the current, correct default value.
| _default_rrf_k = 60 | |
| _default_rrf_k = self.model_fields["rrf_k"].default |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #478 +/- ##
=======================================
Coverage 93.81% 93.82%
=======================================
Files 476 476
Lines 22632 22667 +35
Branches 2182 2186 +4
=======================================
+ Hits 21233 21268 +35
Misses 1091 1091
Partials 308 308 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/memory/ranking.py`:
- Around line 273-278: Add logging calls before raising the validation
ValueError for k and max_results in the function containing these checks: use
the module logger (or the existing logger variable) to log a WARNING or ERROR
with context (e.g., include the invalid value and parameter name) immediately
before each raise; if there is a dedicated validation error event constant
available, use that constant in the log call instead of
MEMORY_RRF_FUSION_COMPLETE to follow guidelines.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 18957832-25a0-42a0-89aa-39eb0440d88e
📒 Files selected for processing (9)
CLAUDE.mddocs/design/memory.mdsrc/synthorg/memory/__init__.pysrc/synthorg/memory/ranking.pysrc/synthorg/memory/retrieval_config.pysrc/synthorg/observability/events/memory.pytests/unit/memory/test_init.pytests/unit/memory/test_ranking.pytests/unit/memory/test_retrieval_config.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Sandbox
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Nofrom __future__ import annotations— Python 3.14 has PEP 649 native lazy annotations
Use PEP 758 except syntax: useexcept A, B:(no parentheses) — ruff enforces this on Python 3.14
Type hints required on all public functions; enforce with mypy strict mode
Google style docstrings required on public classes and functions; enforced by ruff D rules
Create new objects instead of mutating existing ones; usecopy.deepcopy()at construction andMappingProxyTypewrapping for read-only enforcement in non-Pydantic internal collections
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls); prefer structured concurrency over barecreate_task
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Line length: 88 characters (ruff); enforced by linter
Files:
src/synthorg/memory/ranking.pysrc/synthorg/memory/retrieval_config.pysrc/synthorg/observability/events/memory.pytests/unit/memory/test_retrieval_config.pytests/unit/memory/test_init.pysrc/synthorg/memory/__init__.pytests/unit/memory/test_ranking.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models withmodel_copy(update=...)for runtime state that evolves
Use Pydantic v2 (BaseModel,model_validator,computed_field,ConfigDict)
Use@computed_fieldfor derived values instead of storing and validating redundant fields (e.g.,TokenUsage.total_tokens)
UseNotBlankStr(fromcore.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Every module with business logic must have:from synthorg.observability import get_loggerthenlogger = get_logger(__name__)
Never useimport logging/logging.getLogger()/print()in application code; use synthorg'sget_logger()instead
Always useloggeras the variable name, not_loggerorlog
Always use event name constants from the domain-specific module undersynthorg.observability.eventsin logging calls
Log with structured kwargs: alwayslogger.info(EVENT, key=value)— neverlogger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Log at DEBUG level for object creation, internal flow, entry/exit of key functions
Use Google-style docstrings in all public Python code for mkdocstrings auto-generation
Files:
src/synthorg/memory/ranking.pysrc/synthorg/memory/retrieval_config.pysrc/synthorg/observability/events/memory.pysrc/synthorg/memory/__init__.py
{src/synthorg/**/*.py,tests/**/*.py,web/src/**/*.{ts,tsx,vue},cli/**/*.go,**/*.md,**/*.yml,**/*.yaml,**/*.json,pyproject.toml}
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names:
example-provider,example-large-001,example-medium-001,example-small-001, or size aliases
Files:
src/synthorg/memory/ranking.pysrc/synthorg/memory/retrieval_config.pysrc/synthorg/observability/events/memory.pytests/unit/memory/test_retrieval_config.pytests/unit/memory/test_init.pysrc/synthorg/memory/__init__.pyCLAUDE.mddocs/design/memory.mdtests/unit/memory/test_ranking.py
src/synthorg/memory/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory
Files:
src/synthorg/memory/ranking.pysrc/synthorg/memory/retrieval_config.pysrc/synthorg/memory/__init__.py
src/synthorg/observability/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Observability must use structured logging with correlation tracking and log sinks
Files:
src/synthorg/observability/events/memory.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e,@pytest.mark.slowmarkers for test categorization
Maintain 80% minimum test coverage; enforced in CI
Useasyncio_mode = 'auto'in pytest config — no manual@pytest.mark.asyncioneeded on individual tests
Set 30-second timeout per test
Always include-n autowhen running pytest with pytest-xdist; never run tests sequentially
Prefer@pytest.mark.parametrizefor testing similar cases
Usetest-provider,test-small-001, etc. in test files
Use Hypothesis property-based testing with@givenand@settings; control profile viaHYPOTHESIS_PROFILEenv var (ci: 200 examples, dev: 1000 examples)
Never skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mocktime.monotonic()andasyncio.sleep()for timing-sensitive tests
Files:
tests/unit/memory/test_retrieval_config.pytests/unit/memory/test_init.pytests/unit/memory/test_ranking.py
docs/**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Markdown documentation files should follow the documented structure:
docs/design/,docs/architecture/,docs/roadmap/,docs/security.md,docs/licensing.md,docs/reference/,docs/rest-api.md,docs/api/
Files:
docs/design/memory.md
docs/design/**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Design specification pages in
docs/design/must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)
Files:
docs/design/memory.md
🧠 Learnings (17)
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`)
Applied to files:
src/synthorg/memory/retrieval_config.py
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves
Applied to files:
src/synthorg/memory/retrieval_config.py
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory
Applied to files:
src/synthorg/memory/__init__.pytests/unit/memory/test_ranking.py
📚 Learning: 2026-03-15T22:42:34.900Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.900Z
Learning: Applies to src/synthorg/core/**/*.py : Core module must contain shared domain models, base classes, resilience config (RetryConfig, RateLimiterConfig)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Structured kwargs in logging: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Log with structured kwargs: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.900Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.900Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability must use structured logging with correlation tracking and log sinks
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Always use `logger` as the variable name, not `_logger` or `log`
Applied to files:
CLAUDE.md
🧬 Code graph analysis (3)
src/synthorg/memory/retrieval_config.py (1)
src/synthorg/memory/ranking.py (1)
FusionStrategy(32-43)
tests/unit/memory/test_retrieval_config.py (2)
src/synthorg/memory/ranking.py (1)
FusionStrategy(32-43)src/synthorg/memory/retrieval_config.py (1)
MemoryRetrievalConfig(21-167)
src/synthorg/memory/__init__.py (1)
src/synthorg/memory/ranking.py (3)
FusionStrategy(32-43)ScoredMemory(46-80)fuse_ranked_lists(237-327)
🔇 Additional comments (24)
src/synthorg/observability/events/memory.py (1)
59-59: LGTM!The new event constant follows the established
memory.<entity>.<action>naming convention and is correctly placed under the retrieval pipeline section alongsideMEMORY_RANKING_COMPLETE.CLAUDE.md (2)
125-125: LGTM!Package description accurately updated to reflect the new RRF fusion capability in the retrieval pipeline.
194-194: LGTM!Event constant
MEMORY_RRF_FUSION_COMPLETEcorrectly added to the logging events reference list, maintaining consistency with other memory domain events.tests/unit/memory/test_init.py (2)
19-19: LGTM!New
FusionStrategyexport correctly added to the expected public API surface test.
64-64: LGTM!New
fuse_ranked_listsfunction correctly added to expected exports, maintaining alphabetical order.docs/design/memory.md (1)
522-541: LGTM!Excellent documentation update that clearly differentiates the two fusion strategies:
- Labels existing linear approach as the default
- Explains RRF formula, normalization, and the
kparameter- Provides appropriate external references
- Correctly positions RRF for multi-source scenarios while preserving linear as the single-source default
src/synthorg/memory/__init__.py (3)
67-67: LGTM!Import statement correctly updated to include
FusionStrategyandfuse_ranked_listsfrom the ranking module.
80-80: LGTM!
FusionStrategycorrectly added to__all__in alphabetical order.
119-119: LGTM!
fuse_ranked_listscorrectly added to__all__in alphabetical order.src/synthorg/memory/retrieval_config.py (3)
101-113: LGTM!New
fusion_strategyandrrf_kfields are well-defined with appropriate defaults (LINEAR, 60) and constraints (rrf_k: 1-1000). The descriptions clearly indicate the fields' purposes.
115-134: LGTM!Weight sum validation correctly updated to only apply when
fusion_strategy == LINEAR. This is the right approach since relevance/recency weights are not meaningful for RRF fusion where scoring comes from rank positions.
136-150: Good addition with minor observation.The consistency validator appropriately warns (but doesn't raise) when
rrf_kis customized while using LINEAR fusion, since the value would be ignored.Note:
_default_rrf_k = 60duplicates the field default. This is acceptable for readability, but if the field default changes, this value would need to be updated too. Consider usingself.model_fields['rrf_k'].defaultfor a single source of truth, though this adds complexity.tests/unit/memory/test_retrieval_config.py (3)
27-28: LGTM!Default value assertions correctly added for
fusion_strategy(LINEAR) andrrf_k(60).
157-166: LGTM!JSON roundtrip test for RRF configuration ensures serialization/deserialization preserves both
fusion_strategyandrrf_kvalues correctly.
169-217: LGTM!Comprehensive test class covering:
- Default fusion strategy verification
- RRF strategy acceptance
- Weight sum validation bypass for RRF
- Weight sum enforcement for LINEAR
rrf_kdefault, boundaries (1, 1000), and rejection of out-of-range values (0, 1001)Good coverage of both happy paths and error conditions.
tests/unit/memory/test_ranking.py (6)
420-430: LGTM!Good tests verifying
FusionStrategyenum values and confirming it's aStrEnum(string-based enumeration).
436-464: LGTM!Good coverage of basic fusion scenarios: empty inputs, single list ordering preservation, and disjoint list merging.
466-510: LGTM!Excellent tests for core RRF behavior:
- Overlapping entries correctly accumulate higher scores
- Rank-1 entries in all lists achieve maximum normalized score (1.0)
max_resultstruncation works correctly- Different
kvalues preserve ranking order with expected min-max normalization
512-551: LGTM!Good edge case coverage:
- Single result normalizes to 1.0
- Equal raw scores all normalize to 1.0
- Relevance score preservation (raw value passed through, None defaults to 0.0)
- Recency score is always 0.0 for RRF output
is_sharedflag defaults to False
553-597: LGTM!Rigorous verification tests:
- Duplicate ID handling confirms first
MemoryEntryis retained- Exact RRF score calculation verified with known k=60 values
- Three-list accumulation confirms correct score ordering (a > b > c)
These tests provide strong confidence in the mathematical correctness of the RRF implementation.
599-605: LGTM!Input validation tests confirm
ValueErroris raised for invalid parameters (k=0,max_results=0), matching the implementation guards.src/synthorg/memory/ranking.py (3)
4-8: LGTM!Clear module docstring that accurately documents both ranking pathways and their distinct use cases.
32-44: LGTM!Clean enum definition with clear docstring. The lowercase string values will serialize well in configs.
280-327: RRF implementation is correct and well-structured.The algorithm correctly:
- Accumulates
1/(k+rank)scores across all lists- Preserves first-encountered entry per ID while summing contributions
- Handles empty input gracefully
- Normalizes to [0,1] with proper edge case handling when all scores are equal
- Uses structured logging with appropriate event constant
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/memory/ranking.py`:
- Around line 333-338: The MEMORY_RRF_FUSION_COMPLETE event is being emitted
twice for duplicate-heavy inputs; remove the early debug call and emit
MEMORY_RRF_FUSION_COMPLETE exactly once after the fusion completes (i.e., after
duplicate merging logic) and only when duplicate_count > 0, using the same
fields (duplicate_ids_merged=duplicate_count, unique_entries=len(entries));
apply the same change to the other duplicate block referenced (the block around
the code currently at lines 355-361) so the event is not logged before and after
fusion.
- Around line 325-331: In the RRF accumulation loop over ranked_lists, each
occurrence of the same entry.id within a single ranked_list is currently counted
multiple times; change the logic so you deduplicate IDs per source list before
awarding the 1.0/(k + rank) score: for each ranked_list (the loop that currently
iterates "for ranked_list in ranked_lists"), track a per-list seen set and skip
entries whose id is already seen so that each source contributes at most one
rank for a given id; update scores (the scores dict), entries (the entries dict)
and duplicate_count accordingly (increment duplicate_count when you ignore a
duplicate within the same ranked_list) so RRF semantics are preserved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ecbb9dbe-ebe0-4d4f-a59b-3825c65d8690
📒 Files selected for processing (7)
CLAUDE.mddocs/design/memory.mdsrc/synthorg/memory/ranking.pysrc/synthorg/memory/retrieval_config.pysrc/synthorg/observability/events/memory.pytests/unit/memory/test_ranking.pytests/unit/memory/test_retrieval_config.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
🧰 Additional context used
📓 Path-based instructions (8)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Nofrom __future__ import annotations— Python 3.14 has PEP 649 native lazy annotations
Use PEP 758 except syntax: useexcept A, B:(no parentheses) — ruff enforces this on Python 3.14
Type hints required on all public functions; enforce with mypy strict mode
Google style docstrings required on public classes and functions; enforced by ruff D rules
Create new objects instead of mutating existing ones; usecopy.deepcopy()at construction andMappingProxyTypewrapping for read-only enforcement in non-Pydantic internal collections
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls); prefer structured concurrency over barecreate_task
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Line length: 88 characters (ruff); enforced by linter
Files:
tests/unit/memory/test_ranking.pysrc/synthorg/observability/events/memory.pysrc/synthorg/memory/ranking.pysrc/synthorg/memory/retrieval_config.pytests/unit/memory/test_retrieval_config.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e,@pytest.mark.slowmarkers for test categorization
Maintain 80% minimum test coverage; enforced in CI
Useasyncio_mode = 'auto'in pytest config — no manual@pytest.mark.asyncioneeded on individual tests
Set 30-second timeout per test
Always include-n autowhen running pytest with pytest-xdist; never run tests sequentially
Prefer@pytest.mark.parametrizefor testing similar cases
Usetest-provider,test-small-001, etc. in test files
Use Hypothesis property-based testing with@givenand@settings; control profile viaHYPOTHESIS_PROFILEenv var (ci: 200 examples, dev: 1000 examples)
Never skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mocktime.monotonic()andasyncio.sleep()for timing-sensitive tests
Files:
tests/unit/memory/test_ranking.pytests/unit/memory/test_retrieval_config.py
{src/synthorg/**/*.py,tests/**/*.py,web/src/**/*.{ts,tsx,vue},cli/**/*.go,**/*.md,**/*.yml,**/*.yaml,**/*.json,pyproject.toml}
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names:
example-provider,example-large-001,example-medium-001,example-small-001, or size aliases
Files:
tests/unit/memory/test_ranking.pydocs/design/memory.mdsrc/synthorg/observability/events/memory.pysrc/synthorg/memory/ranking.pyCLAUDE.mdsrc/synthorg/memory/retrieval_config.pytests/unit/memory/test_retrieval_config.py
docs/**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Markdown documentation files should follow the documented structure:
docs/design/,docs/architecture/,docs/roadmap/,docs/security.md,docs/licensing.md,docs/reference/,docs/rest-api.md,docs/api/
Files:
docs/design/memory.md
docs/design/**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Design specification pages in
docs/design/must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)
Files:
docs/design/memory.md
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models withmodel_copy(update=...)for runtime state that evolves
Use Pydantic v2 (BaseModel,model_validator,computed_field,ConfigDict)
Use@computed_fieldfor derived values instead of storing and validating redundant fields (e.g.,TokenUsage.total_tokens)
UseNotBlankStr(fromcore.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Every module with business logic must have:from synthorg.observability import get_loggerthenlogger = get_logger(__name__)
Never useimport logging/logging.getLogger()/print()in application code; use synthorg'sget_logger()instead
Always useloggeras the variable name, not_loggerorlog
Always use event name constants from the domain-specific module undersynthorg.observability.eventsin logging calls
Log with structured kwargs: alwayslogger.info(EVENT, key=value)— neverlogger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Log at DEBUG level for object creation, internal flow, entry/exit of key functions
Use Google-style docstrings in all public Python code for mkdocstrings auto-generation
Files:
src/synthorg/observability/events/memory.pysrc/synthorg/memory/ranking.pysrc/synthorg/memory/retrieval_config.py
src/synthorg/observability/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Observability must use structured logging with correlation tracking and log sinks
Files:
src/synthorg/observability/events/memory.py
src/synthorg/memory/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory
Files:
src/synthorg/memory/ranking.pysrc/synthorg/memory/retrieval_config.py
🧠 Learnings (16)
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory
Applied to files:
tests/unit/memory/test_ranking.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/synthorg/memory/ranking.pyCLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
src/synthorg/memory/ranking.py
📚 Learning: 2026-03-15T22:42:34.900Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.900Z
Learning: Applies to src/synthorg/core/**/*.py : Core module must contain shared domain models, base classes, resilience config (RetryConfig, RateLimiterConfig)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Structured kwargs in logging: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Log with structured kwargs: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.900Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.900Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability must use structured logging with correlation tracking and log sinks
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Always use `logger` as the variable name, not `_logger` or `log`
Applied to files:
CLAUDE.md
🧬 Code graph analysis (2)
src/synthorg/memory/retrieval_config.py (1)
src/synthorg/memory/ranking.py (1)
FusionStrategy(33-44)
tests/unit/memory/test_retrieval_config.py (2)
src/synthorg/memory/ranking.py (1)
FusionStrategy(33-44)src/synthorg/memory/retrieval_config.py (1)
MemoryRetrievalConfig(22-185)
src/synthorg/memory/ranking.py
Outdated
| if duplicate_count: | ||
| logger.debug( | ||
| MEMORY_RRF_FUSION_COMPLETE, | ||
| duplicate_ids_merged=duplicate_count, | ||
| unique_entries=len(entries), | ||
| ) |
There was a problem hiding this comment.
Emit MEMORY_RRF_FUSION_COMPLETE only once per fusion call.
When duplicate_count > 0, this function logs the same ...complete event before and after the actual fusion. Any metric or sink keyed by event name will overcount successful fusions for duplicate-heavy inputs.
🛠️ Suggested fix
- if duplicate_count:
- logger.debug(
- MEMORY_RRF_FUSION_COMPLETE,
- duplicate_ids_merged=duplicate_count,
- unique_entries=len(entries),
- )
-
if not entries:
logger.debug(
MEMORY_RRF_FUSION_COMPLETE,
num_lists=len(ranked_lists),
unique_entries=0,
@@
logger.debug(
MEMORY_RRF_FUSION_COMPLETE,
num_lists=len(ranked_lists),
unique_entries=len(entries),
after_truncation=len(result),
k=k,
+ duplicate_ids_merged=duplicate_count,
)Also applies to: 355-361
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/memory/ranking.py` around lines 333 - 338, The
MEMORY_RRF_FUSION_COMPLETE event is being emitted twice for duplicate-heavy
inputs; remove the early debug call and emit MEMORY_RRF_FUSION_COMPLETE exactly
once after the fusion completes (i.e., after duplicate merging logic) and only
when duplicate_count > 0, using the same fields
(duplicate_ids_merged=duplicate_count, unique_entries=len(entries)); apply the
same change to the other duplicate block referenced (the block around the code
currently at lines 355-361) so the event is not logged before and after fusion.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/memory/ranking.py`:
- Around line 337-344: Change the two fusion-completion/state-transition log
calls from logger.debug to logger.info so they are visible as INFO-level state
transitions: update the log invocation that emits MEMORY_RRF_FUSION_COMPLETE
(the call that currently passes num_lists, unique_entries, after_truncation,
duplicate_ids_skipped, k) and the subsequent similar log block later in the
function (the second fusion completion block that logs
lists_count/unique_entries/duplicate_ids_skipped/after_truncation/k) to use
logger.info(...) instead of logger.debug(...).
- Around line 272-361: The function fuse_ranked_lists exceeds the 50-line limit;
split it into three helpers to preserve identical behavior: create
validate_rrf_inputs(k, max_results) to perform the k/max_results checks and log
MEMORY_RRF_VALIDATION_FAILED (raise ValueError on bad inputs), create
accumulate_rrf_scores(ranked_lists, k) to iterate lists, track seen_in_list,
duplicate_count, populate scores and entries (use MemoryEntry.id, update scores
with 1.0/(k+rank)), and return (scores, entries, duplicate_count), and create
finalize_rrf_result(scores, entries, duplicate_count, ranked_lists_len, k,
max_results) to handle empty entries logging MEMORY_RRF_FUSION_COMPLETE, call
_normalize_rrf_scores and _build_rrf_scored_memories, sort by combined_score,
truncate to max_results, log final MEMORY_RRF_FUSION_COMPLETE with
unique_entries/after_truncation/duplicate_ids_skipped and return the result
tuple; then refactor fuse_ranked_lists to call these helpers and return their
result, keeping names _normalize_rrf_scores and _build_rrf_scored_memories
intact so behavior and logs remain unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 557534c2-984c-47a5-97f2-c19b341365b1
📒 Files selected for processing (2)
src/synthorg/memory/ranking.pytests/unit/memory/test_ranking.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do NOT usefrom __future__ import annotations— Python 3.14 has PEP 649 native lazy annotations.
Useexcept A, B:syntax (no parentheses) for exception handling — ruff enforces this on Python 3.14 (PEP 758 except syntax).
Include type hints on all public functions and classes. Strict mypy type-checking is enforced.
Use Google-style docstrings on all public classes and functions. Docstrings are enforced by ruff D rules.
Line length is 88 characters (enforced by ruff).
Files:
tests/unit/memory/test_ranking.pysrc/synthorg/memory/ranking.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e, or@pytest.mark.slow. Maintain 80% coverage minimum. Useasyncio_mode = "auto"(no manual@pytest.mark.asyncioneeded). Set 30-second timeout per test. Always include-n autowhen running pytest (never sequential). Use@pytest.mark.parametrizefor testing similar cases.
Use Python Hypothesis for property-based testing with@given+@settings. Profiles:ci(200 examples, default) anddev(1000 examples), controlled viaHYPOTHESIS_PROFILEenv var. Run dev profile:HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties..hypothesis/is gitignored.
NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mocktime.monotonic()andasyncio.sleep()to make them deterministic instead of widening timing margins.
Files:
tests/unit/memory/test_ranking.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries,BaseTool), usecopy.deepcopy()at construction andMappingProxyTypewrapping for read-only enforcement.
Fordict/listfields in frozen Pydantic models, usecopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (withmodel_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with adopted conventions: use@computed_fieldfor derived values instead of storing + validating redundant fields; useNotBlankStrfromcore.typesfor all identifier/name fields including optional and tuple variants instead of manual whitespace validators.
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over barecreate_task.
Keep functions under 50 lines and files under 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries (user input, external APIs, config files).
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names:example-provider,example-large-001,example-medium-001,example-small-001,large/medium/smallas aliases. Vendor names may only appear in: (1) Operations design page (docs/design/operations.md), (2).claude/skill/agent files, (3) third-party import paths. Tests must usetest-provider,test-small-001, etc.
Pure data models, enums, and re-exports do NOT need logging.
Files:
src/synthorg/memory/ranking.py
src/synthorg/!(observability)/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/!(observability)/**/*.py: Every module with business logic MUST have:from synthorg.observability import get_loggerfollowed bylogger = get_logger(__name__). Never useimport logging/logging.getLogger()/print()in application code. Always use variable namelogger(not_logger, notlog).
Always use event name constants from domain-specific modules undersynthorg.observability.events(e.g.,PROVIDER_CALL_STARTfromevents.provider). Import directly:from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Use structured logging with keyword arguments: alwayslogger.info(EVENT, key=value)— neverlogger.info("msg %s", val).
All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG is for object creation, internal flow, entry/exit of key functions.
Files:
src/synthorg/memory/ranking.py
🧠 Learnings (2)
📚 Learning: 2026-03-16T06:18:07.373Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:18:07.373Z
Learning: Applies to src/synthorg/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG is for object creation, internal flow, entry/exit of key functions.
Applied to files:
src/synthorg/memory/ranking.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/synthorg/memory/ranking.py
🧬 Code graph analysis (1)
tests/unit/memory/test_ranking.py (1)
src/synthorg/memory/ranking.py (5)
FusionStrategy(33-44)ScoredMemory(47-83)compute_combined_score(112-132)compute_recency_score(86-109)fuse_ranked_lists(272-361)
🔇 Additional comments (2)
tests/unit/memory/test_ranking.py (2)
54-105: Good parametric coverage for scoring math.This set cleanly validates formula behavior across representative weight/relevance/recency combinations.
410-623: Strong RRF test suite coverage.The new unit cases cover empty inputs, overlap behavior, dedup semantics, exact-score scenarios, and validation guards with good precision.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/synthorg/memory/ranking.py (1)
287-293:⚠️ Potential issue | 🟠 MajorFix per-list rank compression when skipping duplicates in RRF accumulation.
On Line 287-Line 293, duplicates are skipped, but rank still comes from
enumerateover the unfiltered list. This penalizes later unique entries (e.g.,A, A, BmakesBrank 3 instead of rank 2 in that source), which distorts RRF scoring.Suggested patch
def _accumulate_rrf_scores( ranked_lists: tuple[tuple[MemoryEntry, ...], ...], k: int, ) -> tuple[dict[str, float], dict[str, MemoryEntry], int]: @@ for ranked_list in ranked_lists: seen_in_list: set[str] = set() - for rank, entry in enumerate(ranked_list, start=1): + unique_rank = 0 + for entry in ranked_list: if entry.id in seen_in_list: duplicate_count += 1 continue seen_in_list.add(entry.id) - scores[entry.id] = scores.get(entry.id, 0.0) + 1.0 / (k + rank) + unique_rank += 1 + scores[entry.id] = ( + scores.get(entry.id, 0.0) + 1.0 / (k + unique_rank) + ) if entry.id not in entries: entries[entry.id] = entry🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/memory/ranking.py` around lines 287 - 293, The loop uses enumerate(ranked_list, start=1) so rank counts duplicates and later unique entries get an inflated rank; change to track per-list rank that only increments for unique entries: replace enumerate with a simple for entry in ranked_list and introduce a local unique_rank (start at 0), on encountering a duplicate increment duplicate_count and continue, otherwise increment unique_rank and use unique_rank in the score calculation (scores[entry.id] += 1.0 / (k + unique_rank)); update seen_in_list and entries the same way. Reference symbols: ranked_list, seen_in_list, duplicate_count, scores, entries, k, entry.id.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/synthorg/memory/ranking.py`:
- Around line 287-293: The loop uses enumerate(ranked_list, start=1) so rank
counts duplicates and later unique entries get an inflated rank; change to track
per-list rank that only increments for unique entries: replace enumerate with a
simple for entry in ranked_list and introduce a local unique_rank (start at 0),
on encountering a duplicate increment duplicate_count and continue, otherwise
increment unique_rank and use unique_rank in the score calculation
(scores[entry.id] += 1.0 / (k + unique_rank)); update seen_in_list and entries
the same way. Reference symbols: ranked_list, seen_in_list, duplicate_count,
scores, entries, k, entry.id.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b6d41ba4-24fc-4918-bb1b-26766005183c
📒 Files selected for processing (1)
src/synthorg/memory/ranking.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Sandbox
- GitHub Check: Build Backend
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do NOT usefrom __future__ import annotations— Python 3.14 has PEP 649 native lazy annotations.
Useexcept A, B:syntax (no parentheses) for exception handling — ruff enforces this on Python 3.14 (PEP 758 except syntax).
Include type hints on all public functions and classes. Strict mypy type-checking is enforced.
Use Google-style docstrings on all public classes and functions. Docstrings are enforced by ruff D rules.
Line length is 88 characters (enforced by ruff).
Files:
src/synthorg/memory/ranking.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries,BaseTool), usecopy.deepcopy()at construction andMappingProxyTypewrapping for read-only enforcement.
Fordict/listfields in frozen Pydantic models, usecopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (withmodel_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with adopted conventions: use@computed_fieldfor derived values instead of storing + validating redundant fields; useNotBlankStrfromcore.typesfor all identifier/name fields including optional and tuple variants instead of manual whitespace validators.
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over barecreate_task.
Keep functions under 50 lines and files under 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries (user input, external APIs, config files).
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names:example-provider,example-large-001,example-medium-001,example-small-001,large/medium/smallas aliases. Vendor names may only appear in: (1) Operations design page (docs/design/operations.md), (2).claude/skill/agent files, (3) third-party import paths. Tests must usetest-provider,test-small-001, etc.
Pure data models, enums, and re-exports do NOT need logging.
Files:
src/synthorg/memory/ranking.py
src/synthorg/!(observability)/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/!(observability)/**/*.py: Every module with business logic MUST have:from synthorg.observability import get_loggerfollowed bylogger = get_logger(__name__). Never useimport logging/logging.getLogger()/print()in application code. Always use variable namelogger(not_logger, notlog).
Always use event name constants from domain-specific modules undersynthorg.observability.events(e.g.,PROVIDER_CALL_STARTfromevents.provider). Import directly:from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Use structured logging with keyword arguments: alwayslogger.info(EVENT, key=value)— neverlogger.info("msg %s", val).
All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG is for object creation, internal flow, entry/exit of key functions.
Files:
src/synthorg/memory/ranking.py
🧠 Learnings (5)
📚 Learning: 2026-03-16T06:18:07.373Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:18:07.373Z
Learning: Applies to src/synthorg/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG is for object creation, internal flow, entry/exit of key functions.
Applied to files:
src/synthorg/memory/ranking.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/synthorg/memory/ranking.py
📚 Learning: 2026-03-16T06:18:07.373Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:18:07.373Z
Learning: Applies to src/synthorg/**/*.py : Keep functions under 50 lines and files under 800 lines.
Applied to files:
src/synthorg/memory/ranking.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.
Applied to files:
src/synthorg/memory/ranking.py
📚 Learning: 2026-03-16T06:18:07.373Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:18:07.373Z
Learning: Applies to src/synthorg/!(observability)/**/*.py : Use structured logging with keyword arguments: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`.
Applied to files:
src/synthorg/memory/ranking.py
🔇 Additional comments (1)
src/synthorg/memory/ranking.py (1)
335-346: Validation and fusion lifecycle logging are correctly implemented.Error paths log context before raising (Line 335-Line 346), and fusion completion/state transition is emitted at INFO with structured fields (Line 350-Line 373). Nice compliance with observability guidelines.
Also applies to: 350-373
Add Reciprocal Rank Fusion as an alternative fusion strategy for merging multiple pre-ranked memory lists. Keep linear combination as default for single-source relevance+recency scoring. - Add FusionStrategy enum (LINEAR, RRF) to ranking module - Add fuse_ranked_lists() with min-max normalized RRF scoring - Add fusion_strategy and rrf_k config fields to MemoryRetrievalConfig - Conditionalize weight sum validator (skip for RRF strategy) - Add MEMORY_RRF_FUSION_COMPLETE event constant - Add 22 new unit tests (15 ranking + 8 config)
Mypy strict mode flags direct StrEnum-to-string equality as non-overlapping comparison.
Pre-reviewed by 8 agents, 12 findings addressed: - Add k/max_results input validation guards in fuse_ranked_lists - Add rrf_k/fusion_strategy consistency validator (warn on LINEAR) - Update MemoryRetrievalConfig docstring with new fields - Update test_defaults with fusion_strategy and rrf_k assertions - Add serialization roundtrip test with RRF fields - Add duplicate-ID, exact-scores, 3-list accumulation tests - Add k<1 and max_results<1 validation tests - Update CLAUDE.md event constants and package structure - Update docs/design/memory.md with RRF ranking documentation
…mini - Add _validate_supported_fusion_strategy guard rejecting RRF at config level (not yet wired into retrieval pipeline) — prevents silent misconfiguration where fusion_strategy=RRF produces LINEAR results - Add logger.warning before ValueError raises in fuse_ranked_lists - Extract _DEFAULT_RRF_K module constant (eliminates duplicated magic 60) - Extract _normalize_rrf_scores and _build_rrf_scored_memories helpers (reduces fuse_ranked_lists body from 62 to 56 lines) - Add debug logging on empty-return and duplicate-ID-merge paths - Fix ScoredMemory docstring (describe model contract, not _score_entry) - Clarify design spec pipeline steps 2-3 for RRF path - Add max_results truncation note to design spec RRF section - Update rrf_k Attributes docstring with valid range (1-1000) - Add MEMORY_RRF_VALIDATION_FAILED event constant + CLAUDE.md update - Add structlog capture_logs test for rrf_k consistency warning - Add test for entries < max_results, parametrize negative k/max_results - Parametrize recency decay and combined score test groups - Add score/ordering assertions to test_two_disjoint_lists - Update tests for RRF config rejection (was accepted, now rejected)
…tly once - Track seen IDs per ranked list so each source contributes at most one rank per document (correct RRF semantics) - Remove early MEMORY_RRF_FUSION_COMPLETE log for duplicates; include duplicate_ids_skipped in the single completion log instead - Add test_intra_list_duplicate_id_skipped test - Rename test_duplicate_id_first_occurrence_wins to clarify it tests cross-list behavior
…gs to INFO - Extract accumulation loop into _accumulate_rrf_scores (brings fuse_ranked_lists body from 54 to 41 lines, under 50-line limit) - Change MEMORY_RRF_FUSION_COMPLETE logs from debug to info (CLAUDE.md: "All state transitions must log at INFO")
Skipped duplicates within a ranked list no longer consume rank slots. Subsequent unique entries get the rank they would have in the deduplicated list, preventing ghost-rank score inflation.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/synthorg/memory/ranking.py (1)
301-377: 🛠️ Refactor suggestion | 🟠 Major
fuse_ranked_listsstill exceeds the 50-line function limit.The decomposition improved readability, but this function still remains over the repository’s hard cap. Please extract validation/finalization into helpers so this orchestrator stays under 50 lines.
As per coding guidelines: "Functions must be less than 50 lines; files must be less than 800 lines."♻️ Suggested refactor shape
+def _validate_fuse_inputs(*, k: int, max_results: int) -> None: + if k < 1: + logger.warning(MEMORY_RRF_VALIDATION_FAILED, param="k", value=k) + raise ValueError(f"k must be >= 1, got {k}") + if max_results < 1: + logger.warning( + MEMORY_RRF_VALIDATION_FAILED, + param="max_results", + value=max_results, + ) + raise ValueError(f"max_results must be >= 1, got {max_results}") + +def _finalize_fused_result( + *, + ranked_lists: tuple[tuple[MemoryEntry, ...], ...], + entries: dict[str, MemoryEntry], + scores: dict[str, float], + duplicate_count: int, + k: int, + max_results: int, +) -> tuple[ScoredMemory, ...]: + ... + def fuse_ranked_lists( ranked_lists: tuple[tuple[MemoryEntry, ...], ...], *, k: int = 60, max_results: int = 20, ) -> tuple[ScoredMemory, ...]: - if k < 1: - ... - if max_results < 1: - ... + _validate_fuse_inputs(k=k, max_results=max_results) scores, entries, duplicate_count = _accumulate_rrf_scores(ranked_lists, k) - if not entries: - ... - normalized = _normalize_rrf_scores(scores) - scored_list = _build_rrf_scored_memories(entries, normalized) - ... - return result + return _finalize_fused_result( + ranked_lists=ranked_lists, + entries=entries, + scores=scores, + duplicate_count=duplicate_count, + k=k, + max_results=max_results, + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/synthorg/memory/ranking.py` around lines 301 - 377, The fuse_ranked_lists function is still over 50 lines; extract the argument validation and the logging/finalization into helpers so fuse_ranked_lists becomes the orchestration only. Create a small _validate_fuse_args(k, max_results) that performs the k and max_results checks, logs MEMORY_RRF_VALIDATION_FAILED and raises ValueError, and a _finalize_fusion(entries, scores, ranked_lists, duplicate_count, k, max_results) that handles the empty-entries short-circuit, normalizes scores via _normalize_rrf_scores, builds/scores via _build_rrf_scored_memories, sorts/truncates to max_results, logs MEMORY_RRF_FUSION_COMPLETE (both empty and non-empty cases) and returns the final tuple; keep references to _accumulate_rrf_scores, _normalize_rrf_scores, _build_rrf_scored_memories and the logger/MEMORY_RRF_* symbols so callers remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/synthorg/memory/ranking.py`:
- Around line 301-377: The fuse_ranked_lists function is still over 50 lines;
extract the argument validation and the logging/finalization into helpers so
fuse_ranked_lists becomes the orchestration only. Create a small
_validate_fuse_args(k, max_results) that performs the k and max_results checks,
logs MEMORY_RRF_VALIDATION_FAILED and raises ValueError, and a
_finalize_fusion(entries, scores, ranked_lists, duplicate_count, k, max_results)
that handles the empty-entries short-circuit, normalizes scores via
_normalize_rrf_scores, builds/scores via _build_rrf_scored_memories,
sorts/truncates to max_results, logs MEMORY_RRF_FUSION_COMPLETE (both empty and
non-empty cases) and returns the final tuple; keep references to
_accumulate_rrf_scores, _normalize_rrf_scores, _build_rrf_scored_memories and
the logger/MEMORY_RRF_* symbols so callers remain unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7e07ac61-d010-4a3d-8210-c46e2b8e9c06
📒 Files selected for processing (2)
CLAUDE.mdsrc/synthorg/memory/ranking.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Sandbox
- GitHub Check: Build Web
- GitHub Check: Build Backend
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Nofrom __future__ import annotations— Python 3.14 has PEP 649 native lazy annotations
Useexcept A, B:syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14
All public functions and classes must have type hints; mypy strict mode enforced
Use Google-style docstrings required on all public classes and functions; enforced by ruff D rules
Create new objects for immutability; never mutate existing objects. For non-Pydantic internal collections, usecopy.deepcopy()at construction andMappingProxyTypefor read-only enforcement. For Pydantic frozen models, usecopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models (v2) for config and identity; separate mutable-via-copy models for runtime state usingmodel_copy(update=...). Never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 conventions:@computed_fieldfor derived values instead of storing redundant fields; useNotBlankStrfromcore.typesfor all identifier/name fields (including optionalNotBlankStr | Noneand tuple variants) instead of manual whitespace validators
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls) for structured concurrency; avoid barecreate_task
Line length: 88 characters (ruff configured)
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly; never silently swallow exceptions
Validate at system boundaries: user input, external APIs, config files
Files:
src/synthorg/memory/ranking.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Every module with business logic MUST have:from synthorg.observability import get_loggerthenlogger = get_logger(__name__). Useloggervariable name exclusively (not_logger, notlog)
Never useimport logging/logging.getLogger()/print()in application code — use theget_logger()function fromsynthorg.observability
Always use event name constants fromsynthorg.observability.events.<domain>modules (e.g.,PROVIDER_CALL_STARTfromevents.provider,BUDGET_RECORD_ADDEDfromevents.budget). Import directly:from synthorg.observability.events.<domain> import EVENT_CONSTANT. Use structured logging:logger.info(EVENT, key=value)— neverlogger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit
Pure data models, enums, and re-exports do NOT require logging
Files:
src/synthorg/memory/ranking.py
🧠 Learnings (12)
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-16T07:07:34.946Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:07:34.946Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`. Use structured logging: `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Structured kwargs in logging: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-16T07:07:34.946Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:07:34.946Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit
Applied to files:
CLAUDE.mdsrc/synthorg/memory/ranking.py
📚 Learning: 2026-03-16T07:07:34.946Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:07:34.946Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Use `logger` variable name exclusively (not `_logger`, not `log`)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
CLAUDE.mdsrc/synthorg/memory/ranking.py
📚 Learning: 2026-03-16T07:07:34.946Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:07:34.946Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code — use the `get_logger()` function from `synthorg.observability`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.
Applied to files:
CLAUDE.mdsrc/synthorg/memory/ranking.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : DEBUG logging for object creation, internal flow, entry/exit of key functions.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-16T07:07:34.946Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:07:34.946Z
Learning: Applies to **/*.py : Functions must be less than 50 lines; files must be less than 800 lines
Applied to files:
src/synthorg/memory/ranking.py
🔇 Additional comments (4)
src/synthorg/memory/ranking.py (3)
33-45: Fusion strategy enum is clean and well-scoped.
FusionStrategyis explicit and keeps the LINEAR/RRF switch easy to consume.
272-299: RRF accumulation logic looks correct after dedup fix.Per-list dedup + effective rank progression preserves one-vote-per-source semantics.
337-375: Validation/error logging and fusion completion logging are well implemented.Good use of domain event constants plus structured context at warning/info levels.
CLAUDE.md (1)
125-125: Docs update is aligned with the RRF feature rollout.The memory pipeline and logging guidance now reflect the new RRF fusion and event constants coherently.
Also applies to: 194-194
ba6411b to
421e104
Compare
🤖 I have created a release *beep* *boop* --- ## [0.2.8](v0.2.7...v0.2.8) (2026-03-16) ### Features * add RRF rank fusion to memory ranking ([#478](#478)) ([42242b5](42242b5)) * collaboration scoring enhancements — LLM sampling and human override ([#477](#477)) ([b3f3330](b3f3330)) ### Bug Fixes * add .gitattributes to enforce LF line endings for Go files ([#483](#483)) ([1b8c7b6](1b8c7b6)) * **cli:** Windows uninstall, update UX, health check, sigstore ([#476](#476)) ([470ca72](470ca72)) ### Refactoring * **web:** extract WebSocket subscription into reusable composable ([#475](#475)) ([96e6c46](96e6c46)), closes [#351](#351) ### Maintenance * bump hypothesis from 6.151.5 to 6.151.9 in the minor-and-patch group ([#482](#482)) ([a7297d5](a7297d5)) * bump nginxinc/nginx-unprivileged from `aec540f` to `ccbac1a` in /docker/web ([#479](#479)) ([176e052](176e052)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
fuse_ranked_lists())FusionStrategyenum (LINEAR,RRF) and config fields (fusion_strategy,rrf_k) toMemoryRetrievalConfigk >= 1,max_results >= 1) and consistency warningsCloses #417
Test plan
Review coverage