Skip to content

feat: add RRF rank fusion to memory ranking#478

Merged
Aureliolo merged 7 commits intomainfrom
feat/rrf-rank-fusion
Mar 16, 2026
Merged

feat: add RRF rank fusion to memory ranking#478
Aureliolo merged 7 commits intomainfrom
feat/rrf-rank-fusion

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

  • Add Reciprocal Rank Fusion (RRF) as an alternative fusion strategy for merging multiple pre-ranked memory lists (fuse_ranked_lists())
  • Add FusionStrategy enum (LINEAR, RRF) and config fields (fusion_strategy, rrf_k) to MemoryRetrievalConfig
  • Keep linear combination as default for single-source relevance+recency scoring
  • Add input validation guards (k >= 1, max_results >= 1) and consistency warnings
  • Update design spec and CLAUDE.md with RRF documentation

Closes #417

Test plan

  • 28 new unit tests covering RRF algorithm, config validation, edge cases
  • Exact RRF score verification with known values
  • Duplicate-ID "first wins" behavior tested
  • 3+ list accumulation tested
  • Input validation boundary tests (k=0, max_results=0)
  • Serialization roundtrip with RRF fields
  • All 8248 tests pass, 94.57% coverage
  • mypy strict: no issues
  • ruff: all checks passed
  • Pre-reviewed by 8 agents, 12 findings addressed

Review coverage

Agent Findings
code-reviewer 2 (input validation)
python-reviewer 6 (guards, docs, tests)
test-coverage-analyzer 7 (missing test scenarios)
type-design-analyzer 3 (consistency validator)
docs-consistency 3 (CLAUDE.md, design spec)
logging-audit 0 (all clear)
conventions-enforcer 0 (all clear)
issue-resolution-verifier 0 (fully resolved)

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 16, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 16, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 153b58c4-1052-4152-adbd-9452d9520f54

📥 Commits

Reviewing files that changed from the base of the PR and between ba6411b and 421e104.

📒 Files selected for processing (9)
  • CLAUDE.md
  • docs/design/memory.md
  • src/synthorg/memory/__init__.py
  • src/synthorg/memory/ranking.py
  • src/synthorg/memory/retrieval_config.py
  • src/synthorg/observability/events/memory.py
  • tests/unit/memory/test_init.py
  • tests/unit/memory/test_ranking.py
  • tests/unit/memory/test_retrieval_config.py

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added Reciprocal Rank Fusion (RRF) as a configurable ranking strategy for merging multiple pre-ranked memory sources; linear ranking remains the default.
    • New retrieval configuration options to select fusion strategy and set RRF smoothing (default 60); validations warn on inconsistent settings.
  • Documentation

    • Updated memory design docs to describe RRF behavior, parameters, and when linear remains preferred.
  • Observability

    • Introduced events for RRF fusion completion and validation failures.
  • Tests

    • Expanded tests covering fusion behavior, config validation, and integration with ranking.

Walkthrough

Adds Reciprocal Rank Fusion (RRF) as an optional fusion strategy in memory ranking, exposes FusionStrategy and fuse_ranked_lists, adds fusion_strategy/rrf_k to MemoryRetrievalConfig with validators, emits MEMORY_RRF_FUSION_COMPLETE and MEMORY_RRF_VALIDATION_FAILED observability events, and updates docs/tests.

Changes

Cohort / File(s) Summary
Core RRF Implementation
src/synthorg/memory/ranking.py
Adds FusionStrategy enum (LINEAR, RRF) and fuse_ranked_lists implementing Reciprocal Rank Fusion, normalization and accumulator helpers, ScoredMemory mapping for RRF outputs, input validation, and emits MEMORY_RRF_FUSION_COMPLETE / MEMORY_RRF_VALIDATION_FAILED.
Configuration & Validation
src/synthorg/memory/retrieval_config.py
Adds fusion_strategy: FusionStrategy (default LINEAR) and rrf_k: int (default 60) fields and _DEFAULT_RRF_K; conditions weight-sum validation to LINEAR, adds validators warning on inconsistent rrf_k and unsupported strategies.
Public API Exports
src/synthorg/memory/__init__.py, tests/unit/memory/test_init.py
Exports FusionStrategy and fuse_ranked_lists at package level and updates tests to expect these public symbols.
Observability
src/synthorg/observability/events/memory.py
Adds event constants MEMORY_RRF_FUSION_COMPLETE and MEMORY_RRF_VALIDATION_FAILED.
Documentation & Design
docs/design/memory.md, CLAUDE.md
Updates memory pipeline docs: documents RRF as an alternative to linear ranking, describes k smoothing, normalization/truncation, and updates event-naming guidance.
Tests
tests/unit/memory/test_ranking.py, tests/unit/memory/test_retrieval_config.py
Adds/expands unit tests covering FusionStrategy, fuse_ranked_lists behavior (empty/overlap/normalization/k/max_results) and MemoryRetrievalConfig fields/validators including validation warnings/events.
sequenceDiagram
    actor Client
    participant RetrieverA as "Retriever A\n(pre-ranked list)"
    participant RetrieverB as "Retriever B\n(pre-ranked list)"
    participant Ranking as "memory.ranking\n(fuse_ranked_lists)"
    participant Observability as "observability/events"

    Client->>RetrieverA: request retrieval
    Client->>RetrieverB: request retrieval
    RetrieverA-->>Client: ranked_list_A
    RetrieverB-->>Client: ranked_list_B
    Client->>Ranking: fuse_ranked_lists([ranked_list_A, ranked_list_B], k=60)
    Ranking->>Ranking: compute RRF scores, normalize -> ScoredMemory[]
    Ranking->>Observability: emit MEMORY_RRF_FUSION_COMPLETE(info)
    alt validation failed
        Ranking->>Observability: emit MEMORY_RRF_VALIDATION_FAILED(info)
    end
    Ranking-->>Client: fused ScoredMemory[]
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 38.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add RRF rank fusion to memory ranking' directly and specifically describes the main change: adding Reciprocal Rank Fusion as an alternative fusion strategy to the memory ranking module.
Description check ✅ Passed The PR description provides a comprehensive summary of the changes including RRF implementation, new enum and config fields, validation guards, documentation updates, and extensive test coverage.
Linked Issues check ✅ Passed The PR fully addresses issue #417 requirements: implements RRF algorithm with configurable k parameter, exposes fusion_strategy and rrf_k in MemoryRetrievalConfig, maintains linear as default, adds validation guards, updates documentation, and includes comprehensive test coverage.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the RRF rank fusion feature: new FusionStrategy enum, fuse_ranked_lists function, config fields, validation, documentation updates, and corresponding tests. No unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/rrf-rank-fusion
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch feat/rrf-rank-fusion
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the memory ranking system by introducing Reciprocal Rank Fusion (RRF) as a new strategy. This allows for more sophisticated merging of ranked lists from multiple retrieval sources, moving beyond the existing linear combination approach. The change provides greater flexibility and improved performance for hybrid search scenarios in memory retrieval, while maintaining the linear strategy as the default for single-source ranking.

Highlights

  • Reciprocal Rank Fusion (RRF) Integration: Introduced RRF as an alternative fusion strategy for merging multiple pre-ranked memory lists, enhancing multi-source retrieval capabilities.
  • Configurability: Added a FusionStrategy enum (LINEAR, RRF) and new configuration fields (fusion_strategy, rrf_k) to MemoryRetrievalConfig for flexible ranking strategy selection.
  • Input Validation and Consistency: Implemented validation guards for k and max_results parameters in RRF, and added consistency warnings for rrf_k when LINEAR strategy is used.
  • Documentation Updates: Updated the design specification (docs/design/memory.md) and internal documentation (CLAUDE.md) to reflect the new RRF functionality.
Changelog
  • CLAUDE.md
    • Updated the description of the memory/ module to include "RRF fusion".
    • Added MEMORY_RRF_FUSION_COMPLETE to the list of recognized event names.
  • docs/design/memory.md
    • Updated the "Ranking Algorithm" section to specify "Linear — default".
    • Added a new section detailing the "Reciprocal Rank Fusion (RRF)" algorithm, its purpose, scoring mechanism, and use cases.
  • src/synthorg/memory/init.py
    • Imported FusionStrategy and fuse_ranked_lists from synthorg.memory.ranking.
    • Exported FusionStrategy and fuse_ranked_lists in the module's all list.
  • src/synthorg/memory/ranking.py
    • Updated the module docstring to clarify the roles of rank_memories and fuse_ranked_lists.
    • Imported StrEnum and MEMORY_RRF_FUSION_COMPLETE.
    • Defined the FusionStrategy enum with LINEAR and RRF options.
    • Implemented the fuse_ranked_lists function, which performs RRF on multiple ranked lists, including score normalization, entry deduplication, and input validation.
  • src/synthorg/memory/retrieval_config.py
    • Imported FusionStrategy.
    • Added fusion_strategy (default LINEAR) and rrf_k (default 60, range 1-1000) fields to MemoryRetrievalConfig.
    • Modified the _validate_weight_sum validator to only apply for LINEAR fusion strategy.
    • Added a new validator _validate_rrf_k_strategy_consistency to warn if rrf_k is customized but LINEAR strategy is active.
  • src/synthorg/observability/events/memory.py
    • Added the MEMORY_RRF_FUSION_COMPLETE constant.
  • tests/unit/memory/test_init.py
    • Updated the test_all_has_expected_names method to include FusionStrategy and fuse_ranked_lists in the expected exports.
  • tests/unit/memory/test_ranking.py
    • Imported FusionStrategy and fuse_ranked_lists.
    • Added TestFusionStrategy with tests for enum values and type.
    • Added TestFuseRankedLists with comprehensive unit tests for the RRF algorithm, covering empty inputs, single/multiple lists, overlapping entries, max results truncation, custom k values, score normalization, relevance/recency score handling, duplicate ID behavior, exact score verification, 3+ list accumulation, and input validation for k and max_results.
  • tests/unit/memory/test_retrieval_config.py
    • Imported FusionStrategy.
    • Updated test_defaults to assert default fusion_strategy and rrf_k.
    • Added test_json_roundtrip_with_rrf to verify serialization with RRF fields.
    • Added TestMemoryRetrievalConfigFusion with tests for default and accepted RRF strategy, validation skipping for RRF, linear strategy weight sum enforcement, rrf_k defaults and boundaries, and rejection of invalid rrf_k values.
Activity
  • 28 new unit tests were added, covering the RRF algorithm, configuration validation, and edge cases.
  • Exact RRF score verification was performed with known values.
  • Duplicate-ID "first wins" behavior was tested.
  • Accumulation across 3 or more lists was tested.
  • Input validation boundary tests (k=0, max_results=0) were included.
  • Serialization roundtrip with RRF fields was verified.
  • All 8248 existing tests passed, maintaining 94.57% code coverage.
  • mypy strict and ruff checks passed without issues.
  • The PR was pre-reviewed by 8 agents, and 12 findings were addressed, including input validation, guards, documentation, missing test scenarios, consistency validation, and documentation consistency.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 16, 2026 05:55 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces Reciprocal Rank Fusion (RRF) as an alternative strategy for memory ranking. The changes include adding a FusionStrategy enum, new configuration fields in MemoryRetrievalConfig with associated validation, and the core fuse_ranked_lists implementation. The documentation and design specifications have been updated accordingly, and the feature is supported by a comprehensive set of new unit tests that cover various edge cases and scenarios. The overall quality of the implementation is high. My feedback includes one suggestion to improve the maintainability of a configuration validator by dynamically accessing a field's default value instead of hardcoding it.

@model_validator(mode="after")
def _validate_rrf_k_strategy_consistency(self) -> Self:
"""Warn when rrf_k is customized but fusion strategy is LINEAR."""
_default_rrf_k = 60
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Hardcoding the default value _default_rrf_k = 60 makes this validator brittle. If the default value for the rrf_k field is ever changed in its Field definition, this check will become incorrect, leading to false warnings or missed ones.

To make this more robust and maintainable, you can retrieve the default value directly from the model's field definition. This ensures the validator always uses the current, correct default value.

Suggested change
_default_rrf_k = 60
_default_rrf_k = self.model_fields["rrf_k"].default

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.82%. Comparing base (470ca72) to head (421e104).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #478   +/-   ##
=======================================
  Coverage   93.81%   93.82%           
=======================================
  Files         476      476           
  Lines       22632    22667   +35     
  Branches     2182     2186    +4     
=======================================
+ Hits        21233    21268   +35     
  Misses       1091     1091           
  Partials      308      308           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/memory/ranking.py`:
- Around line 273-278: Add logging calls before raising the validation
ValueError for k and max_results in the function containing these checks: use
the module logger (or the existing logger variable) to log a WARNING or ERROR
with context (e.g., include the invalid value and parameter name) immediately
before each raise; if there is a dedicated validation error event constant
available, use that constant in the log call instead of
MEMORY_RRF_FUSION_COMPLETE to follow guidelines.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 18957832-25a0-42a0-89aa-39eb0440d88e

📥 Commits

Reviewing files that changed from the base of the PR and between 30e96d5 and 787d3d1.

📒 Files selected for processing (9)
  • CLAUDE.md
  • docs/design/memory.md
  • src/synthorg/memory/__init__.py
  • src/synthorg/memory/ranking.py
  • src/synthorg/memory/retrieval_config.py
  • src/synthorg/observability/events/memory.py
  • tests/unit/memory/test_init.py
  • tests/unit/memory/test_ranking.py
  • tests/unit/memory/test_retrieval_config.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations
Use PEP 758 except syntax: use except A, B: (no parentheses) — ruff enforces this on Python 3.14
Type hints required on all public functions; enforce with mypy strict mode
Google style docstrings required on public classes and functions; enforced by ruff D rules
Create new objects instead of mutating existing ones; use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement in non-Pydantic internal collections
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Line length: 88 characters (ruff); enforced by linter

Files:

  • src/synthorg/memory/ranking.py
  • src/synthorg/memory/retrieval_config.py
  • src/synthorg/observability/events/memory.py
  • tests/unit/memory/test_retrieval_config.py
  • tests/unit/memory/test_init.py
  • src/synthorg/memory/__init__.py
  • tests/unit/memory/test_ranking.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with model_copy(update=...) for runtime state that evolves
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict)
Use @computed_field for derived values instead of storing and validating redundant fields (e.g., TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Every module with business logic must have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code; use synthorg's get_logger() instead
Always use logger as the variable name, not _logger or log
Always use event name constants from the domain-specific module under synthorg.observability.events in logging calls
Log with structured kwargs: always logger.info(EVENT, key=value) — never logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Log at DEBUG level for object creation, internal flow, entry/exit of key functions
Use Google-style docstrings in all public Python code for mkdocstrings auto-generation

Files:

  • src/synthorg/memory/ranking.py
  • src/synthorg/memory/retrieval_config.py
  • src/synthorg/observability/events/memory.py
  • src/synthorg/memory/__init__.py
{src/synthorg/**/*.py,tests/**/*.py,web/src/**/*.{ts,tsx,vue},cli/**/*.go,**/*.md,**/*.yml,**/*.yaml,**/*.json,pyproject.toml}

📄 CodeRabbit inference engine (CLAUDE.md)

Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or size aliases

Files:

  • src/synthorg/memory/ranking.py
  • src/synthorg/memory/retrieval_config.py
  • src/synthorg/observability/events/memory.py
  • tests/unit/memory/test_retrieval_config.py
  • tests/unit/memory/test_init.py
  • src/synthorg/memory/__init__.py
  • CLAUDE.md
  • docs/design/memory.md
  • tests/unit/memory/test_ranking.py
src/synthorg/memory/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory

Files:

  • src/synthorg/memory/ranking.py
  • src/synthorg/memory/retrieval_config.py
  • src/synthorg/memory/__init__.py
src/synthorg/observability/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Observability must use structured logging with correlation tracking and log sinks

Files:

  • src/synthorg/observability/events/memory.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow markers for test categorization
Maintain 80% minimum test coverage; enforced in CI
Use asyncio_mode = 'auto' in pytest config — no manual @pytest.mark.asyncio needed on individual tests
Set 30-second timeout per test
Always include -n auto when running pytest with pytest-xdist; never run tests sequentially
Prefer @pytest.mark.parametrize for testing similar cases
Use test-provider, test-small-001, etc. in test files
Use Hypothesis property-based testing with @given and @settings; control profile via HYPOTHESIS_PROFILE env var (ci: 200 examples, dev: 1000 examples)
Never skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock time.monotonic() and asyncio.sleep() for timing-sensitive tests

Files:

  • tests/unit/memory/test_retrieval_config.py
  • tests/unit/memory/test_init.py
  • tests/unit/memory/test_ranking.py
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Markdown documentation files should follow the documented structure: docs/design/, docs/architecture/, docs/roadmap/, docs/security.md, docs/licensing.md, docs/reference/, docs/rest-api.md, docs/api/

Files:

  • docs/design/memory.md
docs/design/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Design specification pages in docs/design/ must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)

Files:

  • docs/design/memory.md
🧠 Learnings (17)
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`)

Applied to files:

  • src/synthorg/memory/retrieval_config.py
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves

Applied to files:

  • src/synthorg/memory/retrieval_config.py
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory

Applied to files:

  • src/synthorg/memory/__init__.py
  • tests/unit/memory/test_ranking.py
📚 Learning: 2026-03-15T22:42:34.900Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.900Z
Learning: Applies to src/synthorg/core/**/*.py : Core module must contain shared domain models, base classes, resilience config (RetryConfig, RateLimiterConfig)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Structured kwargs in logging: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Log with structured kwargs: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.900Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.900Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability must use structured logging with correlation tracking and log sinks

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Always use `logger` as the variable name, not `_logger` or `log`

Applied to files:

  • CLAUDE.md
🧬 Code graph analysis (3)
src/synthorg/memory/retrieval_config.py (1)
src/synthorg/memory/ranking.py (1)
  • FusionStrategy (32-43)
tests/unit/memory/test_retrieval_config.py (2)
src/synthorg/memory/ranking.py (1)
  • FusionStrategy (32-43)
src/synthorg/memory/retrieval_config.py (1)
  • MemoryRetrievalConfig (21-167)
src/synthorg/memory/__init__.py (1)
src/synthorg/memory/ranking.py (3)
  • FusionStrategy (32-43)
  • ScoredMemory (46-80)
  • fuse_ranked_lists (237-327)
🔇 Additional comments (24)
src/synthorg/observability/events/memory.py (1)

59-59: LGTM!

The new event constant follows the established memory.<entity>.<action> naming convention and is correctly placed under the retrieval pipeline section alongside MEMORY_RANKING_COMPLETE.

CLAUDE.md (2)

125-125: LGTM!

Package description accurately updated to reflect the new RRF fusion capability in the retrieval pipeline.


194-194: LGTM!

Event constant MEMORY_RRF_FUSION_COMPLETE correctly added to the logging events reference list, maintaining consistency with other memory domain events.

tests/unit/memory/test_init.py (2)

19-19: LGTM!

New FusionStrategy export correctly added to the expected public API surface test.


64-64: LGTM!

New fuse_ranked_lists function correctly added to expected exports, maintaining alphabetical order.

docs/design/memory.md (1)

522-541: LGTM!

Excellent documentation update that clearly differentiates the two fusion strategies:

  • Labels existing linear approach as the default
  • Explains RRF formula, normalization, and the k parameter
  • Provides appropriate external references
  • Correctly positions RRF for multi-source scenarios while preserving linear as the single-source default
src/synthorg/memory/__init__.py (3)

67-67: LGTM!

Import statement correctly updated to include FusionStrategy and fuse_ranked_lists from the ranking module.


80-80: LGTM!

FusionStrategy correctly added to __all__ in alphabetical order.


119-119: LGTM!

fuse_ranked_lists correctly added to __all__ in alphabetical order.

src/synthorg/memory/retrieval_config.py (3)

101-113: LGTM!

New fusion_strategy and rrf_k fields are well-defined with appropriate defaults (LINEAR, 60) and constraints (rrf_k: 1-1000). The descriptions clearly indicate the fields' purposes.


115-134: LGTM!

Weight sum validation correctly updated to only apply when fusion_strategy == LINEAR. This is the right approach since relevance/recency weights are not meaningful for RRF fusion where scoring comes from rank positions.


136-150: Good addition with minor observation.

The consistency validator appropriately warns (but doesn't raise) when rrf_k is customized while using LINEAR fusion, since the value would be ignored.

Note: _default_rrf_k = 60 duplicates the field default. This is acceptable for readability, but if the field default changes, this value would need to be updated too. Consider using self.model_fields['rrf_k'].default for a single source of truth, though this adds complexity.

tests/unit/memory/test_retrieval_config.py (3)

27-28: LGTM!

Default value assertions correctly added for fusion_strategy (LINEAR) and rrf_k (60).


157-166: LGTM!

JSON roundtrip test for RRF configuration ensures serialization/deserialization preserves both fusion_strategy and rrf_k values correctly.


169-217: LGTM!

Comprehensive test class covering:

  • Default fusion strategy verification
  • RRF strategy acceptance
  • Weight sum validation bypass for RRF
  • Weight sum enforcement for LINEAR
  • rrf_k default, boundaries (1, 1000), and rejection of out-of-range values (0, 1001)

Good coverage of both happy paths and error conditions.

tests/unit/memory/test_ranking.py (6)

420-430: LGTM!

Good tests verifying FusionStrategy enum values and confirming it's a StrEnum (string-based enumeration).


436-464: LGTM!

Good coverage of basic fusion scenarios: empty inputs, single list ordering preservation, and disjoint list merging.


466-510: LGTM!

Excellent tests for core RRF behavior:

  • Overlapping entries correctly accumulate higher scores
  • Rank-1 entries in all lists achieve maximum normalized score (1.0)
  • max_results truncation works correctly
  • Different k values preserve ranking order with expected min-max normalization

512-551: LGTM!

Good edge case coverage:

  • Single result normalizes to 1.0
  • Equal raw scores all normalize to 1.0
  • Relevance score preservation (raw value passed through, None defaults to 0.0)
  • Recency score is always 0.0 for RRF output
  • is_shared flag defaults to False

553-597: LGTM!

Rigorous verification tests:

  • Duplicate ID handling confirms first MemoryEntry is retained
  • Exact RRF score calculation verified with known k=60 values
  • Three-list accumulation confirms correct score ordering (a > b > c)

These tests provide strong confidence in the mathematical correctness of the RRF implementation.


599-605: LGTM!

Input validation tests confirm ValueError is raised for invalid parameters (k=0, max_results=0), matching the implementation guards.

src/synthorg/memory/ranking.py (3)

4-8: LGTM!

Clear module docstring that accurately documents both ranking pathways and their distinct use cases.


32-44: LGTM!

Clean enum definition with clear docstring. The lowercase string values will serialize well in configs.


280-327: RRF implementation is correct and well-structured.

The algorithm correctly:

  • Accumulates 1/(k+rank) scores across all lists
  • Preserves first-encountered entry per ID while summing contributions
  • Handles empty input gracefully
  • Normalizes to [0,1] with proper edge case handling when all scores are equal
  • Uses structured logging with appropriate event constant

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/memory/ranking.py`:
- Around line 333-338: The MEMORY_RRF_FUSION_COMPLETE event is being emitted
twice for duplicate-heavy inputs; remove the early debug call and emit
MEMORY_RRF_FUSION_COMPLETE exactly once after the fusion completes (i.e., after
duplicate merging logic) and only when duplicate_count > 0, using the same
fields (duplicate_ids_merged=duplicate_count, unique_entries=len(entries));
apply the same change to the other duplicate block referenced (the block around
the code currently at lines 355-361) so the event is not logged before and after
fusion.
- Around line 325-331: In the RRF accumulation loop over ranked_lists, each
occurrence of the same entry.id within a single ranked_list is currently counted
multiple times; change the logic so you deduplicate IDs per source list before
awarding the 1.0/(k + rank) score: for each ranked_list (the loop that currently
iterates "for ranked_list in ranked_lists"), track a per-list seen set and skip
entries whose id is already seen so that each source contributes at most one
rank for a given id; update scores (the scores dict), entries (the entries dict)
and duplicate_count accordingly (increment duplicate_count when you ignore a
duplicate within the same ranked_list) so RRF semantics are preserved.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ecbb9dbe-ebe0-4d4f-a59b-3825c65d8690

📥 Commits

Reviewing files that changed from the base of the PR and between 787d3d1 and 9c552ed.

📒 Files selected for processing (7)
  • CLAUDE.md
  • docs/design/memory.md
  • src/synthorg/memory/ranking.py
  • src/synthorg/memory/retrieval_config.py
  • src/synthorg/observability/events/memory.py
  • tests/unit/memory/test_ranking.py
  • tests/unit/memory/test_retrieval_config.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Backend
🧰 Additional context used
📓 Path-based instructions (8)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations
Use PEP 758 except syntax: use except A, B: (no parentheses) — ruff enforces this on Python 3.14
Type hints required on all public functions; enforce with mypy strict mode
Google style docstrings required on public classes and functions; enforced by ruff D rules
Create new objects instead of mutating existing ones; use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement in non-Pydantic internal collections
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Line length: 88 characters (ruff); enforced by linter

Files:

  • tests/unit/memory/test_ranking.py
  • src/synthorg/observability/events/memory.py
  • src/synthorg/memory/ranking.py
  • src/synthorg/memory/retrieval_config.py
  • tests/unit/memory/test_retrieval_config.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow markers for test categorization
Maintain 80% minimum test coverage; enforced in CI
Use asyncio_mode = 'auto' in pytest config — no manual @pytest.mark.asyncio needed on individual tests
Set 30-second timeout per test
Always include -n auto when running pytest with pytest-xdist; never run tests sequentially
Prefer @pytest.mark.parametrize for testing similar cases
Use test-provider, test-small-001, etc. in test files
Use Hypothesis property-based testing with @given and @settings; control profile via HYPOTHESIS_PROFILE env var (ci: 200 examples, dev: 1000 examples)
Never skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock time.monotonic() and asyncio.sleep() for timing-sensitive tests

Files:

  • tests/unit/memory/test_ranking.py
  • tests/unit/memory/test_retrieval_config.py
{src/synthorg/**/*.py,tests/**/*.py,web/src/**/*.{ts,tsx,vue},cli/**/*.go,**/*.md,**/*.yml,**/*.yaml,**/*.json,pyproject.toml}

📄 CodeRabbit inference engine (CLAUDE.md)

Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or size aliases

Files:

  • tests/unit/memory/test_ranking.py
  • docs/design/memory.md
  • src/synthorg/observability/events/memory.py
  • src/synthorg/memory/ranking.py
  • CLAUDE.md
  • src/synthorg/memory/retrieval_config.py
  • tests/unit/memory/test_retrieval_config.py
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Markdown documentation files should follow the documented structure: docs/design/, docs/architecture/, docs/roadmap/, docs/security.md, docs/licensing.md, docs/reference/, docs/rest-api.md, docs/api/

Files:

  • docs/design/memory.md
docs/design/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Design specification pages in docs/design/ must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)

Files:

  • docs/design/memory.md
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with model_copy(update=...) for runtime state that evolves
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict)
Use @computed_field for derived values instead of storing and validating redundant fields (e.g., TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Every module with business logic must have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code; use synthorg's get_logger() instead
Always use logger as the variable name, not _logger or log
Always use event name constants from the domain-specific module under synthorg.observability.events in logging calls
Log with structured kwargs: always logger.info(EVENT, key=value) — never logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Log at DEBUG level for object creation, internal flow, entry/exit of key functions
Use Google-style docstrings in all public Python code for mkdocstrings auto-generation

Files:

  • src/synthorg/observability/events/memory.py
  • src/synthorg/memory/ranking.py
  • src/synthorg/memory/retrieval_config.py
src/synthorg/observability/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Observability must use structured logging with correlation tracking and log sinks

Files:

  • src/synthorg/observability/events/memory.py
src/synthorg/memory/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory

Files:

  • src/synthorg/memory/ranking.py
  • src/synthorg/memory/retrieval_config.py
🧠 Learnings (16)
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory

Applied to files:

  • tests/unit/memory/test_ranking.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • src/synthorg/memory/ranking.py
  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising

Applied to files:

  • src/synthorg/memory/ranking.py
📚 Learning: 2026-03-15T22:42:34.900Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.900Z
Learning: Applies to src/synthorg/core/**/*.py : Core module must contain shared domain models, base classes, resilience config (RetryConfig, RateLimiterConfig)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/hr/**/*.py : HR engine must provide: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, trend detection), promotion/demotion

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Structured kwargs in logging: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Log with structured kwargs: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.900Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.900Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability must use structured logging with correlation tracking and log sinks

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T22:42:34.899Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T22:42:34.899Z
Learning: Applies to src/synthorg/**/*.py : Always use `logger` as the variable name, not `_logger` or `log`

Applied to files:

  • CLAUDE.md
🧬 Code graph analysis (2)
src/synthorg/memory/retrieval_config.py (1)
src/synthorg/memory/ranking.py (1)
  • FusionStrategy (33-44)
tests/unit/memory/test_retrieval_config.py (2)
src/synthorg/memory/ranking.py (1)
  • FusionStrategy (33-44)
src/synthorg/memory/retrieval_config.py (1)
  • MemoryRetrievalConfig (22-185)

Comment on lines +333 to +338
if duplicate_count:
logger.debug(
MEMORY_RRF_FUSION_COMPLETE,
duplicate_ids_merged=duplicate_count,
unique_entries=len(entries),
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Emit MEMORY_RRF_FUSION_COMPLETE only once per fusion call.

When duplicate_count > 0, this function logs the same ...complete event before and after the actual fusion. Any metric or sink keyed by event name will overcount successful fusions for duplicate-heavy inputs.

🛠️ Suggested fix
-    if duplicate_count:
-        logger.debug(
-            MEMORY_RRF_FUSION_COMPLETE,
-            duplicate_ids_merged=duplicate_count,
-            unique_entries=len(entries),
-        )
-
     if not entries:
         logger.debug(
             MEMORY_RRF_FUSION_COMPLETE,
             num_lists=len(ranked_lists),
             unique_entries=0,
@@
     logger.debug(
         MEMORY_RRF_FUSION_COMPLETE,
         num_lists=len(ranked_lists),
         unique_entries=len(entries),
         after_truncation=len(result),
         k=k,
+        duplicate_ids_merged=duplicate_count,
     )

Also applies to: 355-361

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/memory/ranking.py` around lines 333 - 338, The
MEMORY_RRF_FUSION_COMPLETE event is being emitted twice for duplicate-heavy
inputs; remove the early debug call and emit MEMORY_RRF_FUSION_COMPLETE exactly
once after the fusion completes (i.e., after duplicate merging logic) and only
when duplicate_count > 0, using the same fields
(duplicate_ids_merged=duplicate_count, unique_entries=len(entries)); apply the
same change to the other duplicate block referenced (the block around the code
currently at lines 355-361) so the event is not logged before and after fusion.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/memory/ranking.py`:
- Around line 337-344: Change the two fusion-completion/state-transition log
calls from logger.debug to logger.info so they are visible as INFO-level state
transitions: update the log invocation that emits MEMORY_RRF_FUSION_COMPLETE
(the call that currently passes num_lists, unique_entries, after_truncation,
duplicate_ids_skipped, k) and the subsequent similar log block later in the
function (the second fusion completion block that logs
lists_count/unique_entries/duplicate_ids_skipped/after_truncation/k) to use
logger.info(...) instead of logger.debug(...).
- Around line 272-361: The function fuse_ranked_lists exceeds the 50-line limit;
split it into three helpers to preserve identical behavior: create
validate_rrf_inputs(k, max_results) to perform the k/max_results checks and log
MEMORY_RRF_VALIDATION_FAILED (raise ValueError on bad inputs), create
accumulate_rrf_scores(ranked_lists, k) to iterate lists, track seen_in_list,
duplicate_count, populate scores and entries (use MemoryEntry.id, update scores
with 1.0/(k+rank)), and return (scores, entries, duplicate_count), and create
finalize_rrf_result(scores, entries, duplicate_count, ranked_lists_len, k,
max_results) to handle empty entries logging MEMORY_RRF_FUSION_COMPLETE, call
_normalize_rrf_scores and _build_rrf_scored_memories, sort by combined_score,
truncate to max_results, log final MEMORY_RRF_FUSION_COMPLETE with
unique_entries/after_truncation/duplicate_ids_skipped and return the result
tuple; then refactor fuse_ranked_lists to call these helpers and return their
result, keeping names _normalize_rrf_scores and _build_rrf_scored_memories
intact so behavior and logs remain unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 557534c2-984c-47a5-97f2-c19b341365b1

📥 Commits

Reviewing files that changed from the base of the PR and between 9c552ed and 4c4b603.

📒 Files selected for processing (2)
  • src/synthorg/memory/ranking.py
  • tests/unit/memory/test_ranking.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Backend
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations.
Use except A, B: syntax (no parentheses) for exception handling — ruff enforces this on Python 3.14 (PEP 758 except syntax).
Include type hints on all public functions and classes. Strict mypy type-checking is enforced.
Use Google-style docstrings on all public classes and functions. Docstrings are enforced by ruff D rules.
Line length is 88 characters (enforced by ruff).

Files:

  • tests/unit/memory/test_ranking.py
  • src/synthorg/memory/ranking.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark tests with @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, or @pytest.mark.slow. Maintain 80% coverage minimum. Use asyncio_mode = "auto" (no manual @pytest.mark.asyncio needed). Set 30-second timeout per test. Always include -n auto when running pytest (never sequential). Use @pytest.mark.parametrize for testing similar cases.
Use Python Hypothesis for property-based testing with @given + @settings. Profiles: ci (200 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var. Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties. .hypothesis/ is gitignored.
NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins.

Files:

  • tests/unit/memory/test_ranking.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (with model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with adopted conventions: use @computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields including optional and tuple variants instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Keep functions under 50 lines and files under 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries (user input, external APIs, config files).
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Vendor names may only appear in: (1) Operations design page (docs/design/operations.md), (2) .claude/ skill/agent files, (3) third-party import paths. Tests must use test-provider, test-small-001, etc.
Pure data models, enums, and re-exports do NOT need logging.

Files:

  • src/synthorg/memory/ranking.py
src/synthorg/!(observability)/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/!(observability)/**/*.py: Every module with business logic MUST have: from synthorg.observability import get_logger followed by logger = get_logger(__name__). Never use import logging / logging.getLogger() / print() in application code. Always use variable name logger (not _logger, not log).
Always use event name constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Use structured logging with keyword arguments: always logger.info(EVENT, key=value) — never logger.info("msg %s", val).
All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG is for object creation, internal flow, entry/exit of key functions.

Files:

  • src/synthorg/memory/ranking.py
🧠 Learnings (2)
📚 Learning: 2026-03-16T06:18:07.373Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:18:07.373Z
Learning: Applies to src/synthorg/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG is for object creation, internal flow, entry/exit of key functions.

Applied to files:

  • src/synthorg/memory/ranking.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • src/synthorg/memory/ranking.py
🧬 Code graph analysis (1)
tests/unit/memory/test_ranking.py (1)
src/synthorg/memory/ranking.py (5)
  • FusionStrategy (33-44)
  • ScoredMemory (47-83)
  • compute_combined_score (112-132)
  • compute_recency_score (86-109)
  • fuse_ranked_lists (272-361)
🔇 Additional comments (2)
tests/unit/memory/test_ranking.py (2)

54-105: Good parametric coverage for scoring math.

This set cleanly validates formula behavior across representative weight/relevance/recency combinations.


410-623: Strong RRF test suite coverage.

The new unit cases cover empty inputs, overlap behavior, dedup semantics, exact-score scenarios, and validation guards with good precision.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/synthorg/memory/ranking.py (1)

287-293: ⚠️ Potential issue | 🟠 Major

Fix per-list rank compression when skipping duplicates in RRF accumulation.

On Line 287-Line 293, duplicates are skipped, but rank still comes from enumerate over the unfiltered list. This penalizes later unique entries (e.g., A, A, B makes B rank 3 instead of rank 2 in that source), which distorts RRF scoring.

Suggested patch
 def _accumulate_rrf_scores(
     ranked_lists: tuple[tuple[MemoryEntry, ...], ...],
     k: int,
 ) -> tuple[dict[str, float], dict[str, MemoryEntry], int]:
@@
     for ranked_list in ranked_lists:
         seen_in_list: set[str] = set()
-        for rank, entry in enumerate(ranked_list, start=1):
+        unique_rank = 0
+        for entry in ranked_list:
             if entry.id in seen_in_list:
                 duplicate_count += 1
                 continue
             seen_in_list.add(entry.id)
-            scores[entry.id] = scores.get(entry.id, 0.0) + 1.0 / (k + rank)
+            unique_rank += 1
+            scores[entry.id] = (
+                scores.get(entry.id, 0.0) + 1.0 / (k + unique_rank)
+            )
             if entry.id not in entries:
                 entries[entry.id] = entry
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/memory/ranking.py` around lines 287 - 293, The loop uses
enumerate(ranked_list, start=1) so rank counts duplicates and later unique
entries get an inflated rank; change to track per-list rank that only increments
for unique entries: replace enumerate with a simple for entry in ranked_list and
introduce a local unique_rank (start at 0), on encountering a duplicate
increment duplicate_count and continue, otherwise increment unique_rank and use
unique_rank in the score calculation (scores[entry.id] += 1.0 / (k +
unique_rank)); update seen_in_list and entries the same way. Reference symbols:
ranked_list, seen_in_list, duplicate_count, scores, entries, k, entry.id.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/synthorg/memory/ranking.py`:
- Around line 287-293: The loop uses enumerate(ranked_list, start=1) so rank
counts duplicates and later unique entries get an inflated rank; change to track
per-list rank that only increments for unique entries: replace enumerate with a
simple for entry in ranked_list and introduce a local unique_rank (start at 0),
on encountering a duplicate increment duplicate_count and continue, otherwise
increment unique_rank and use unique_rank in the score calculation
(scores[entry.id] += 1.0 / (k + unique_rank)); update seen_in_list and entries
the same way. Reference symbols: ranked_list, seen_in_list, duplicate_count,
scores, entries, k, entry.id.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b6d41ba4-24fc-4918-bb1b-26766005183c

📥 Commits

Reviewing files that changed from the base of the PR and between 4c4b603 and 691b771.

📒 Files selected for processing (1)
  • src/synthorg/memory/ranking.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Backend
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations.
Use except A, B: syntax (no parentheses) for exception handling — ruff enforces this on Python 3.14 (PEP 758 except syntax).
Include type hints on all public functions and classes. Strict mypy type-checking is enforced.
Use Google-style docstrings on all public classes and functions. Docstrings are enforced by ruff D rules.
Line length is 88 characters (enforced by ruff).

Files:

  • src/synthorg/memory/ranking.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (with model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with adopted conventions: use @computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields including optional and tuple variants instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Keep functions under 50 lines and files under 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries (user input, external APIs, config files).
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Vendor names may only appear in: (1) Operations design page (docs/design/operations.md), (2) .claude/ skill/agent files, (3) third-party import paths. Tests must use test-provider, test-small-001, etc.
Pure data models, enums, and re-exports do NOT need logging.

Files:

  • src/synthorg/memory/ranking.py
src/synthorg/!(observability)/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/!(observability)/**/*.py: Every module with business logic MUST have: from synthorg.observability import get_logger followed by logger = get_logger(__name__). Never use import logging / logging.getLogger() / print() in application code. Always use variable name logger (not _logger, not log).
Always use event name constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Use structured logging with keyword arguments: always logger.info(EVENT, key=value) — never logger.info("msg %s", val).
All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG is for object creation, internal flow, entry/exit of key functions.

Files:

  • src/synthorg/memory/ranking.py
🧠 Learnings (5)
📚 Learning: 2026-03-16T06:18:07.373Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:18:07.373Z
Learning: Applies to src/synthorg/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG is for object creation, internal flow, entry/exit of key functions.

Applied to files:

  • src/synthorg/memory/ranking.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • src/synthorg/memory/ranking.py
📚 Learning: 2026-03-16T06:18:07.373Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:18:07.373Z
Learning: Applies to src/synthorg/**/*.py : Keep functions under 50 lines and files under 800 lines.

Applied to files:

  • src/synthorg/memory/ranking.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

  • src/synthorg/memory/ranking.py
📚 Learning: 2026-03-16T06:18:07.373Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:18:07.373Z
Learning: Applies to src/synthorg/!(observability)/**/*.py : Use structured logging with keyword arguments: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`.

Applied to files:

  • src/synthorg/memory/ranking.py
🔇 Additional comments (1)
src/synthorg/memory/ranking.py (1)

335-346: Validation and fusion lifecycle logging are correctly implemented.

Error paths log context before raising (Line 335-Line 346), and fusion completion/state transition is emitted at INFO with structured fields (Line 350-Line 373). Nice compliance with observability guidelines.

Also applies to: 350-373

Add Reciprocal Rank Fusion as an alternative fusion strategy for merging
multiple pre-ranked memory lists. Keep linear combination as default for
single-source relevance+recency scoring.

- Add FusionStrategy enum (LINEAR, RRF) to ranking module
- Add fuse_ranked_lists() with min-max normalized RRF scoring
- Add fusion_strategy and rrf_k config fields to MemoryRetrievalConfig
- Conditionalize weight sum validator (skip for RRF strategy)
- Add MEMORY_RRF_FUSION_COMPLETE event constant
- Add 22 new unit tests (15 ranking + 8 config)
Mypy strict mode flags direct StrEnum-to-string equality as
non-overlapping comparison.
Pre-reviewed by 8 agents, 12 findings addressed:
- Add k/max_results input validation guards in fuse_ranked_lists
- Add rrf_k/fusion_strategy consistency validator (warn on LINEAR)
- Update MemoryRetrievalConfig docstring with new fields
- Update test_defaults with fusion_strategy and rrf_k assertions
- Add serialization roundtrip test with RRF fields
- Add duplicate-ID, exact-scores, 3-list accumulation tests
- Add k<1 and max_results<1 validation tests
- Update CLAUDE.md event constants and package structure
- Update docs/design/memory.md with RRF ranking documentation
…mini

- Add _validate_supported_fusion_strategy guard rejecting RRF at config
  level (not yet wired into retrieval pipeline) — prevents silent
  misconfiguration where fusion_strategy=RRF produces LINEAR results
- Add logger.warning before ValueError raises in fuse_ranked_lists
- Extract _DEFAULT_RRF_K module constant (eliminates duplicated magic 60)
- Extract _normalize_rrf_scores and _build_rrf_scored_memories helpers
  (reduces fuse_ranked_lists body from 62 to 56 lines)
- Add debug logging on empty-return and duplicate-ID-merge paths
- Fix ScoredMemory docstring (describe model contract, not _score_entry)
- Clarify design spec pipeline steps 2-3 for RRF path
- Add max_results truncation note to design spec RRF section
- Update rrf_k Attributes docstring with valid range (1-1000)
- Add MEMORY_RRF_VALIDATION_FAILED event constant + CLAUDE.md update
- Add structlog capture_logs test for rrf_k consistency warning
- Add test for entries < max_results, parametrize negative k/max_results
- Parametrize recency decay and combined score test groups
- Add score/ordering assertions to test_two_disjoint_lists
- Update tests for RRF config rejection (was accepted, now rejected)
…tly once

- Track seen IDs per ranked list so each source contributes at most one
  rank per document (correct RRF semantics)
- Remove early MEMORY_RRF_FUSION_COMPLETE log for duplicates; include
  duplicate_ids_skipped in the single completion log instead
- Add test_intra_list_duplicate_id_skipped test
- Rename test_duplicate_id_first_occurrence_wins to clarify it tests
  cross-list behavior
…gs to INFO

- Extract accumulation loop into _accumulate_rrf_scores (brings
  fuse_ranked_lists body from 54 to 41 lines, under 50-line limit)
- Change MEMORY_RRF_FUSION_COMPLETE logs from debug to info (CLAUDE.md:
  "All state transitions must log at INFO")
Skipped duplicates within a ranked list no longer consume rank slots.
Subsequent unique entries get the rank they would have in the
deduplicated list, preventing ghost-rank score inflation.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/synthorg/memory/ranking.py (1)

301-377: 🛠️ Refactor suggestion | 🟠 Major

fuse_ranked_lists still exceeds the 50-line function limit.

The decomposition improved readability, but this function still remains over the repository’s hard cap. Please extract validation/finalization into helpers so this orchestrator stays under 50 lines.

♻️ Suggested refactor shape
+def _validate_fuse_inputs(*, k: int, max_results: int) -> None:
+    if k < 1:
+        logger.warning(MEMORY_RRF_VALIDATION_FAILED, param="k", value=k)
+        raise ValueError(f"k must be >= 1, got {k}")
+    if max_results < 1:
+        logger.warning(
+            MEMORY_RRF_VALIDATION_FAILED,
+            param="max_results",
+            value=max_results,
+        )
+        raise ValueError(f"max_results must be >= 1, got {max_results}")
+
+def _finalize_fused_result(
+    *,
+    ranked_lists: tuple[tuple[MemoryEntry, ...], ...],
+    entries: dict[str, MemoryEntry],
+    scores: dict[str, float],
+    duplicate_count: int,
+    k: int,
+    max_results: int,
+) -> tuple[ScoredMemory, ...]:
+    ...
+
 def fuse_ranked_lists(
     ranked_lists: tuple[tuple[MemoryEntry, ...], ...],
     *,
     k: int = 60,
     max_results: int = 20,
 ) -> tuple[ScoredMemory, ...]:
-    if k < 1:
-        ...
-    if max_results < 1:
-        ...
+    _validate_fuse_inputs(k=k, max_results=max_results)
     scores, entries, duplicate_count = _accumulate_rrf_scores(ranked_lists, k)
-    if not entries:
-        ...
-    normalized = _normalize_rrf_scores(scores)
-    scored_list = _build_rrf_scored_memories(entries, normalized)
-    ...
-    return result
+    return _finalize_fused_result(
+        ranked_lists=ranked_lists,
+        entries=entries,
+        scores=scores,
+        duplicate_count=duplicate_count,
+        k=k,
+        max_results=max_results,
+    )
As per coding guidelines: "Functions must be less than 50 lines; files must be less than 800 lines."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/memory/ranking.py` around lines 301 - 377, The fuse_ranked_lists
function is still over 50 lines; extract the argument validation and the
logging/finalization into helpers so fuse_ranked_lists becomes the orchestration
only. Create a small _validate_fuse_args(k, max_results) that performs the k and
max_results checks, logs MEMORY_RRF_VALIDATION_FAILED and raises ValueError, and
a _finalize_fusion(entries, scores, ranked_lists, duplicate_count, k,
max_results) that handles the empty-entries short-circuit, normalizes scores via
_normalize_rrf_scores, builds/scores via _build_rrf_scored_memories,
sorts/truncates to max_results, logs MEMORY_RRF_FUSION_COMPLETE (both empty and
non-empty cases) and returns the final tuple; keep references to
_accumulate_rrf_scores, _normalize_rrf_scores, _build_rrf_scored_memories and
the logger/MEMORY_RRF_* symbols so callers remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/synthorg/memory/ranking.py`:
- Around line 301-377: The fuse_ranked_lists function is still over 50 lines;
extract the argument validation and the logging/finalization into helpers so
fuse_ranked_lists becomes the orchestration only. Create a small
_validate_fuse_args(k, max_results) that performs the k and max_results checks,
logs MEMORY_RRF_VALIDATION_FAILED and raises ValueError, and a
_finalize_fusion(entries, scores, ranked_lists, duplicate_count, k, max_results)
that handles the empty-entries short-circuit, normalizes scores via
_normalize_rrf_scores, builds/scores via _build_rrf_scored_memories,
sorts/truncates to max_results, logs MEMORY_RRF_FUSION_COMPLETE (both empty and
non-empty cases) and returns the final tuple; keep references to
_accumulate_rrf_scores, _normalize_rrf_scores, _build_rrf_scored_memories and
the logger/MEMORY_RRF_* symbols so callers remain unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7e07ac61-d010-4a3d-8210-c46e2b8e9c06

📥 Commits

Reviewing files that changed from the base of the PR and between 691b771 and ba6411b.

📒 Files selected for processing (2)
  • CLAUDE.md
  • src/synthorg/memory/ranking.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations
Use except A, B: syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14
All public functions and classes must have type hints; mypy strict mode enforced
Use Google-style docstrings required on all public classes and functions; enforced by ruff D rules
Create new objects for immutability; never mutate existing objects. For non-Pydantic internal collections, use copy.deepcopy() at construction and MappingProxyType for read-only enforcement. For Pydantic frozen models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models (v2) for config and identity; separate mutable-via-copy models for runtime state using model_copy(update=...). Never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 conventions: @computed_field for derived values instead of storing redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional NotBlankStr | None and tuple variants) instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls) for structured concurrency; avoid bare create_task
Line length: 88 characters (ruff configured)
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly; never silently swallow exceptions
Validate at system boundaries: user input, external APIs, config files

Files:

  • src/synthorg/memory/ranking.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__). Use logger variable name exclusively (not _logger, not log)
Never use import logging / logging.getLogger() / print() in application code — use the get_logger() function from synthorg.observability
Always use event name constants from synthorg.observability.events.<domain> modules (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT. Use structured logging: logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit
Pure data models, enums, and re-exports do NOT require logging

Files:

  • src/synthorg/memory/ranking.py
🧠 Learnings (12)
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-16T07:07:34.946Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:07:34.946Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`. Use structured logging: `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Structured kwargs in logging: always `logger.info(EVENT, key=value)` — never `logger.info('msg %s', val)`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-16T07:07:34.946Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:07:34.946Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit

Applied to files:

  • CLAUDE.md
  • src/synthorg/memory/ranking.py
📚 Learning: 2026-03-16T07:07:34.946Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:07:34.946Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Use `logger` variable name exclusively (not `_logger`, not `log`)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • CLAUDE.md
  • src/synthorg/memory/ranking.py
📚 Learning: 2026-03-16T07:07:34.946Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:07:34.946Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code — use the `get_logger()` function from `synthorg.observability`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

  • CLAUDE.md
  • src/synthorg/memory/ranking.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : DEBUG logging for object creation, internal flow, entry/exit of key functions.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-16T07:07:34.946Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:07:34.946Z
Learning: Applies to **/*.py : Functions must be less than 50 lines; files must be less than 800 lines

Applied to files:

  • src/synthorg/memory/ranking.py
🔇 Additional comments (4)
src/synthorg/memory/ranking.py (3)

33-45: Fusion strategy enum is clean and well-scoped.

FusionStrategy is explicit and keeps the LINEAR/RRF switch easy to consume.


272-299: RRF accumulation logic looks correct after dedup fix.

Per-list dedup + effective rank progression preserves one-vote-per-source semantics.


337-375: Validation/error logging and fusion completion logging are well implemented.

Good use of domain event constants plus structured context at warning/info levels.

CLAUDE.md (1)

125-125: Docs update is aligned with the RRF feature rollout.

The memory pipeline and logging guidance now reflect the new RRF fusion and event constants coherently.

Also applies to: 194-194

@Aureliolo Aureliolo force-pushed the feat/rrf-rank-fusion branch from ba6411b to 421e104 Compare March 16, 2026 07:16
@Aureliolo Aureliolo merged commit 42242b5 into main Mar 16, 2026
25 of 27 checks passed
@Aureliolo Aureliolo deleted the feat/rrf-rank-fusion branch March 16, 2026 07:16
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 16, 2026 07:17 — with GitHub Actions Inactive
Aureliolo added a commit that referenced this pull request Mar 16, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.2.8](v0.2.7...v0.2.8)
(2026-03-16)


### Features

* add RRF rank fusion to memory ranking
([#478](#478))
([42242b5](42242b5))
* collaboration scoring enhancements — LLM sampling and human override
([#477](#477))
([b3f3330](b3f3330))


### Bug Fixes

* add .gitattributes to enforce LF line endings for Go files
([#483](#483))
([1b8c7b6](1b8c7b6))
* **cli:** Windows uninstall, update UX, health check, sigstore
([#476](#476))
([470ca72](470ca72))


### Refactoring

* **web:** extract WebSocket subscription into reusable composable
([#475](#475))
([96e6c46](96e6c46)),
closes [#351](#351)


### Maintenance

* bump hypothesis from 6.151.5 to 6.151.9 in the minor-and-patch group
([#482](#482))
([a7297d5](a7297d5))
* bump nginxinc/nginx-unprivileged from `aec540f` to `ccbac1a` in
/docker/web ([#479](#479))
([176e052](176e052))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: add RRF rank fusion to memory ranking

1 participant