
docs: LMEB embedding evaluation + CSP accepted risk (#695, #925) #962

Merged
Aureliolo merged 3 commits into main from research/v060-gates
Apr 1, 2026

Conversation

@Aureliolo
Owner

@Aureliolo Aureliolo commented Apr 1, 2026

Summary

  • research: LMEB-guided embedding model selection + domain fine-tuning for org memory #695: Evaluate LMEB benchmark for memory embedding model selection. LMEB (22 datasets, 193 tasks) shows MTEB does not predict memory retrieval quality (Spearman: -0.130). Add embedding evaluation reference page with model recommendations across three deployment tiers, domain fine-tuning pipeline design, and EmbeddingFineTuneConfig stub with cross-field validation
  • research: CSP unsafe-inline removal blocked by Radix UI #925: Document accepted risk of style-src unsafe-inline in CSP due to Radix UI inline style injection. Upstream stagnant (discussion #3130 closed Feb 2026). Base UI migration assessed as prohibitively large. Recommend three-phase mitigation: wire get-nonce bridge, CSP Level 3 directive splitting, quarterly re-evaluation
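
For readers unfamiliar with the statistic quoted above, the following is a minimal sketch of how a Spearman rank correlation between two sets of benchmark scores can be computed. The function names `average_ranks` and `spearman` are hypothetical, and the scores in the usage block are invented purely for illustration; they are not LMEB or MTEB results.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(xs: list[float], ys: list[float]) -> float:
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up scores for four hypothetical models (illustration only).
    general_scores = [0.71, 0.68, 0.66, 0.60]
    memory_scores = [0.52, 0.58, 0.49, 0.61]
    print(spearman(general_scores, memory_scores))
```

A value near zero or below means that ranking models by one benchmark says little about their ranking on the other, which is the sense in which the summary uses the figure.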

Changed files

| File | Change |
|------|--------|
| `docs/reference/embedding-evaluation.md` | New -- LMEB evaluation, model rankings, fine-tuning pipeline design |
| `docs/design/memory.md` | Embedding Model Selection section + Fine-Tuning research direction |
| `docs/architecture/decisions.md` | LMEB-over-MTEB decision entry |
| `docs/reference/research.md` | LMEB + NVIDIA fine-tuning citations |
| `docs/roadmap/open-questions.md` | Updated memory retrieval risk with LMEB findings |
| `docs/security.md` | Accepted Risk: Inline Style Attributes (upstream status, Base UI assessment, mitigation path) |
| `mkdocs.yml` | Nav entry for embedding evaluation page |
| `src/synthorg/memory/backends/mem0/config.py` | EmbeddingFineTuneConfig stub with model_validator |
| `src/synthorg/memory/backends/mem0/__init__.py` | Export EmbeddingFineTuneConfig |

Test plan

  • uv run ruff check src/ tests/ -- all passed
  • uv run mypy src/ tests/ -- 1292 files, no issues
  • uv run python -m pytest tests/ -n auto -- 11670 passed
  • uv run python -m pytest tests/ -m unit -n auto -k memory -- 1058 passed
  • Pre-reviewed by 5 agents (docs-consistency, issue-resolution-verifier, type-design-analyzer, conventions-enforcer, comment-analyzer), 6 findings addressed

Closes #695
Closes #925

🤖 Generated with Claude Code

Follow-up issues

Aureliolo and others added 2 commits April 1, 2026 13:34
…, #925)

#695: Evaluate LMEB benchmark for memory embedding model selection.
LMEB (22 datasets, 193 tasks) shows MTEB performance does not predict
memory retrieval quality (Spearman: -0.130). Add embedding evaluation
reference page with model recommendations across three deployment tiers,
domain fine-tuning pipeline design, and EmbeddingFineTuneConfig stub
(disabled by default) for future checkpoint-based model override.

#925: Document accepted risk of style-src unsafe-inline in CSP due to
Radix UI inline style injection. Upstream is stagnant (discussion #3130
closed Feb 2026, no fix committed). Base UI migration assessed as
prohibitively large. Recommend three-phase mitigation: wire get-nonce
bridge, activate nonce infrastructure with CSP Level 3 directive
splitting (style-src-elem + style-src-attr), re-evaluate quarterly.

Closes #695
Closes #925

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix factual error: Radix #2057 was closed 2024-02-29, not open
- Add model_validator to EmbeddingFineTuneConfig (DualModeConfig pattern)
- Fix broken admonition anchor link in embedding-evaluation.md
- Fix docstring tense (future, not present) and reST→plaintext ref
- Soften PR #3131 characterization from "rejected" to "closed"

Pre-reviewed by 5 agents, 6 findings addressed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 1, 2026 12:06
@coderabbitai

coderabbitai bot commented Apr 1, 2026

Walkthrough

Adds LMEB-guided embedding evaluation documentation, a reference page, research links, and roadmap updates; introduces an optional Pydantic EmbeddingFineTuneConfig model with validators, exposes it from the mem0 backend package, and extends Mem0EmbedderConfig to accept (but not yet apply) a fine_tune field. Adds unit tests for the new config, updates mkdocs nav, and revises CSP/security docs and web/index.html notes about nonce handling and accepted risks for Radix inline styles.

Suggested labels

autorelease: tagged

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|------------|--------|-------------|
| Title check | ✅ Passed | The title directly references the two primary changes: LMEB embedding evaluation (issue #695) and CSP accepted risk (issue #925), accurately summarizing the main work. |
| Linked Issues check | ✅ Passed | All coding and documentation objectives from #695 and #925 are met: LMEB evaluation reference page, embedding fine-tuning config stub with validation, model selection guidance, CSP risk documentation, and mitigation path outlined. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope: documentation additions for LMEB/CSP, config stub implementation for fine-tuning, navigation updates, and validator tests directly support the two linked issue objectives. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 47.06% which is sufficient. The required threshold is 40.00%. |
| Description check | ✅ Passed | The PR description is directly related to the changeset, providing clear context for both major themes (LMEB embedding evaluation and CSP inline style risk documentation) with specific issue numbers, benchmark details, and implementation scope. |


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces the LMEB benchmark for evaluating embedding models, replacing MTEB to better align with long-horizon memory retrieval needs. It adds comprehensive documentation on model selection, a domain-specific fine-tuning pipeline, and a security assessment of Radix UI's interaction with Content Security Policy. The Mem0 backend is updated with a new EmbeddingFineTuneConfig model. Feedback suggests that the new path-related configuration fields should include validations for path traversal and POSIX compliance to ensure consistency with existing security patterns in the codebase.

Comment on lines +70 to +93
    @model_validator(mode="after")
    def _validate_required_when_enabled(self) -> Self:
        """Require checkpoint_path and base_model when fine-tuning is enabled."""
        if self.enabled and self.checkpoint_path is None:
            msg = "checkpoint_path must be non-blank when fine-tuning is enabled"
            logger.warning(
                MEMORY_BACKEND_CONFIG_INVALID,
                model="EmbeddingFineTuneConfig",
                field="checkpoint_path",
                enabled=self.enabled,
                reason=msg,
            )
            raise ValueError(msg)
        if self.enabled and self.base_model is None:
            msg = "base_model must be non-blank when fine-tuning is enabled"
            logger.warning(
                MEMORY_BACKEND_CONFIG_INVALID,
                model="EmbeddingFineTuneConfig",
                field="base_model",
                enabled=self.enabled,
                reason=msg,
            )
            raise ValueError(msg)
        return self

medium

The EmbeddingFineTuneConfig introduces new path fields (checkpoint_path, training_data_dir) but does not enforce the same path traversal and POSIX path restrictions applied to data_dir in Mem0BackendConfig. For security and consistency across the mem0 backend, these fields should be validated to reject parent-directory traversal (..) and non-POSIX separators. Additionally, the logging should include backend="mem0" to align with the repository's observability schema for backend-specific configuration errors.

    @model_validator(mode="after")
    def _validate_config(self) -> Self:
        """Validate requirements and paths for the fine-tuning config."""
        if self.enabled:
            if self.checkpoint_path is None:
                msg = "checkpoint_path must be non-blank when fine-tuning is enabled"
                logger.warning(
                    MEMORY_BACKEND_CONFIG_INVALID,
                    backend="mem0",
                    model="EmbeddingFineTuneConfig",
                    field="checkpoint_path",
                    reason=msg,
                )
                raise ValueError(msg)
            if self.base_model is None:
                msg = "base_model must be non-blank when fine-tuning is enabled"
                logger.warning(
                    MEMORY_BACKEND_CONFIG_INVALID,
                    backend="mem0",
                    model="EmbeddingFineTuneConfig",
                    field="base_model",
                    reason=msg,
                )
                raise ValueError(msg)

        # Path traversal and POSIX enforcement for all path fields
        for field_name in ("checkpoint_path", "training_data_dir"):
            val = getattr(self, field_name)
            if val is None:
                continue

            parts = PureWindowsPath(val).parts + PurePosixPath(val).parts
            if ".." in parts:
                msg = f"{field_name} must not contain parent-directory traversal (..)"
                logger.warning(
                    MEMORY_BACKEND_CONFIG_INVALID,
                    backend="mem0",
                    model="EmbeddingFineTuneConfig",
                    field=field_name,
                    value=val,
                    reason=msg,
                )
                raise ValueError(msg)

            if "\\" in val or (len(val) >= 2 and val[1] == ":"):
                msg = (
                    f"{field_name} must be a POSIX path (no backslashes or "
                    "drive letters) -- the Mem0 backend targets Linux containers"
                )
                logger.warning(
                    MEMORY_BACKEND_CONFIG_INVALID,
                    backend="mem0",
                    model="EmbeddingFineTuneConfig",
                    field=field_name,
                    value=val,
                    reason=msg,
                )
                raise ValueError(msg)
        return self
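
The path checks in the suggestion above can also be tried in isolation. The following is a minimal standalone sketch, assuming a hypothetical helper named `check_posix_path` that is not part of the repository; it reuses the same two rules (no `..` components, no backslashes or drive letters) without the logging calls.

```python
from pathlib import PurePosixPath, PureWindowsPath


def check_posix_path(field_name: str, value: str) -> str:
    """Return value unchanged, or raise ValueError if it is unsafe.

    Hypothetical helper for illustration; the repository's validator
    also emits a structured log event before raising.
    """
    # Split with both path flavours so that "a/../b" and "a\\..\\b"
    # are both recognised as containing a parent-directory component.
    parts = PureWindowsPath(value).parts + PurePosixPath(value).parts
    if ".." in parts:
        msg = f"{field_name} must not contain parent-directory traversal (..)"
        raise ValueError(msg)

    # Reject Windows-style separators and drive prefixes such as "C:".
    if "\\" in value or (len(value) >= 2 and value[1] == ":"):
        msg = f"{field_name} must be a POSIX path (no backslashes or drive letters)"
        raise ValueError(msg)
    return value


if __name__ == "__main__":
    print(check_posix_path("checkpoint_path", "/data/checkpoints/run-1"))
    for candidate in ("../etc", "models\\..\\x", "C:/models"):
        try:
            check_posix_path("checkpoint_path", candidate)
        except ValueError as exc:
            print(f"rejected {candidate!r}: {exc}")
```

Parsing with both `PureWindowsPath` and `PurePosixPath` matters because a POSIX parser treats a backslash as an ordinary character, so `models\..\x` would otherwise be seen as a single harmless component.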

@github-actions
Contributor

github-actions bot commented Apr 1, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 0beb445.
Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice.

Scanned Files

None


Copilot AI left a comment


Pull request overview

This PR adds documentation and configuration scaffolding for LMEB-guided embedding model selection (including an offline fine-tuning pipeline design) and documents the accepted CSP risk of style-src 'unsafe-inline' due to Radix UI inline style injection.

Changes:

  • Add a new “Embedding Evaluation” reference page summarizing LMEB findings, model tiers, and a proposed fine-tuning pipeline.
  • Update memory design/roadmap/decision log/research references to reflect LMEB-over-MTEB for memory retrieval evaluation.
  • Document the accepted CSP risk for inline styles (Radix UI) and a staged mitigation path; add an EmbeddingFineTuneConfig stub to Mem0 config and export it.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

| File | Description |
|------|-------------|
| `docs/reference/embedding-evaluation.md` | New LMEB-based embedding evaluation page + fine-tuning pipeline design. |
| `docs/design/memory.md` | Adds embedding model selection guidance and fine-tuning research direction. |
| `docs/architecture/decisions.md` | Records the decision to use LMEB instead of MTEB for embeddings. |
| `docs/reference/research.md` | Adds LMEB + NVIDIA fine-tuning citations. |
| `docs/roadmap/open-questions.md` | Updates memory retrieval quality risk with LMEB findings. |
| `docs/security.md` | Documents accepted CSP risk for style-src 'unsafe-inline' and mitigation phases. |
| `mkdocs.yml` | Adds nav entry for the new embedding evaluation page. |
| `src/synthorg/memory/backends/mem0/config.py` | Introduces EmbeddingFineTuneConfig and adds fine_tune to Mem0EmbedderConfig. |
| `src/synthorg/memory/backends/mem0/__init__.py` | Exports EmbeddingFineTuneConfig. |


docs/security.md Outdated
Comment on lines +98 to +101
`style-src 'unsafe-inline'` (or `style-src-attr 'unsafe-inline'` when using CSP Level 3
directive splitting) allows any inline `style` attribute on DOM elements. It also permits
`<style>` elements without nonces, though the nonce infrastructure can lock down the latter
independently.

Copilot AI Apr 1, 2026


The description here conflates style-src 'unsafe-inline' with CSP Level 3 directive splitting: style-src-attr 'unsafe-inline' only affects inline style attributes, and does not permit <style> elements (those are controlled by style-src-elem / style-src). Reword this section to avoid implying that style-src-attr weakens <style> element nonce enforcement.

Suggested change
- `style-src 'unsafe-inline'` (or `style-src-attr 'unsafe-inline'` when using CSP Level 3
- directive splitting) allows any inline `style` attribute on DOM elements. It also permits
- `<style>` elements without nonces, though the nonce infrastructure can lock down the latter
- independently.
+ Under CSP Level 2, `style-src 'unsafe-inline'` allows all inline styles, including both
+ `style` attributes on DOM elements and `<style>` blocks without nonces or hashes. With CSP
+ Level 3 directive splitting you can instead use `style-src-attr 'unsafe-inline'` to allow
+ inline `style` attributes while continuing to control `<style>` elements via `style-src-elem`
+ / `style-src` (for example, by requiring nonces or hashes on those `<style>` tags).
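
The directive split described in this suggestion can be illustrated with a short sketch that assembles the style-related part of a policy string. The helper name `build_style_csp` and the choice of `'self'` as an allowed source are assumptions made for illustration; the project's actual policy and the place where it is emitted are not shown in this thread.

```python
import secrets


def build_style_csp(nonce: str) -> str:
    """Build the style-related part of a CSP header value (illustrative)."""
    directives = {
        # <style> and stylesheet <link> elements: require the per-response nonce.
        "style-src-elem": f"'self' 'nonce-{nonce}'",
        # style="..." attributes: still allowed inline (the accepted risk).
        "style-src-attr": "'unsafe-inline'",
    }
    return "; ".join(f"{name} {value}" for name, value in directives.items())


if __name__ == "__main__":
    # A fresh, unpredictable nonce should be generated for every response.
    nonce = secrets.token_urlsafe(16)
    print(build_style_csp(nonce))
```

With this split, an injected `<style>` block without the nonce is blocked, while inline `style` attributes remain permitted.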

Copilot uses AI. Check for mistakes.
docs/security.md Outdated

- Unlike `<script>` or `<style>` elements, a `style` attribute **cannot execute JavaScript**
- Data exfiltration via `style` attributes is limited to single-element visual manipulation
(no CSS selectors, no `url()` loading, no `@import`)

Copilot AI Apr 1, 2026


This risk assessment is factually inaccurate: inline style attributes can include url(...) and thereby trigger network requests (subject to other CSP directives like img-src), even though they can’t use selectors or @import. Please adjust the wording so it doesn’t claim “no url() loading”.

Suggested change
- (no CSS selectors, no `url()` loading, no `@import`)
+ (no CSS selectors and no `@import`; `url(...)` loading is possible but constrained by
+ the relevant CSP directives such as `img-src`/`font-src`)

Comment on lines +211 to +218
- On next backend initialization, the fine-tuned model is automatically used

### Integration Design

Fine-tuning is an **offline pipeline**, not a runtime operation. The `EmbeddingFineTuneConfig`
(see [Memory Design Spec](../design/memory.md#embedding-model-selection))
stores the configuration, and the backend checks for a checkpoint at initialization:


Copilot AI Apr 1, 2026


This section states that “the backend checks for a checkpoint at initialization” and implies automatic use of a fine-tuned model, but the codebase currently only defines EmbeddingFineTuneConfig and build_mem0_config_dict() does not implement checkpoint lookup. Please either implement the described initialization behavior or mark it explicitly as planned/not yet implemented (consistent with the config docstring).

Suggested change
- - On next backend initialization, the fine-tuned model is automatically used
- ### Integration Design
- Fine-tuning is an **offline pipeline**, not a runtime operation. The `EmbeddingFineTuneConfig`
- (see [Memory Design Spec](../design/memory.md#embedding-model-selection))
- stores the configuration, and the backend checks for a checkpoint at initialization:
+ - On next backend initialization, the fine-tuned model can be used by pointing configuration to this checkpoint
+ ### Integration Design
+ Fine-tuning is an **offline pipeline**, not a runtime operation. The `EmbeddingFineTuneConfig`
+ (see [Memory Design Spec](../design/memory.md#embedding-model-selection))
+ stores the configuration and is designed for future automatic checkpoint detection at initialization.
+ Planned (not yet implemented) initialization behavior:

Comment on lines +362 to +365
**Integration design:** fine-tuning is an offline pipeline, not a runtime operation. The
optional `EmbeddingFineTuneConfig` (disabled by default) stores the checkpoint path. At
backend initialization, if a checkpoint exists, the fine-tuned model is used; otherwise
the base model is used with a logged warning.

Copilot AI Apr 1, 2026


The text says that at backend initialization a checkpoint will be used if present, but the Mem0 backend/config builder currently doesn’t perform any fine_tune.checkpoint_path lookup (the config is a stub). Please align this section with current behavior (planned vs implemented), or add the corresponding initialization logic.

Suggested change
- **Integration design:** fine-tuning is an offline pipeline, not a runtime operation. The
- optional `EmbeddingFineTuneConfig` (disabled by default) stores the checkpoint path. At
- backend initialization, if a checkpoint exists, the fine-tuned model is used; otherwise
- the base model is used with a logged warning.
+ **Integration design (planned):** fine-tuning is an offline pipeline, not a runtime
+ operation. The optional `EmbeddingFineTuneConfig` (disabled by default) is intended to
+ store the checkpoint path so that, in a future implementation, backend initialization can
+ prefer the fine-tuned model when a checkpoint is available and otherwise fall back to the
+ base model with a logged warning. In the current Mem0 backend, this lookup is not yet
+ implemented; the config is defined but not wired into initialization logic.

| MTEB | General passage retrieval | MTEB performance does not transfer to memory retrieval (Pearson: -0.115). Optimizing for MTEB may actively harm memory retrieval quality |
| Manual evaluation | Custom retrieval benchmarks | Too expensive to maintain. LMEB provides a standardized, reproducible alternative |

**Model selection:** Three deployment tiers recommended based on LMEB scores. See [Embedding Evaluation](../reference/embedding-evaluation.md) for the full analysis. Domain-specific fine-tuning (+10-27% improvement) documented as an optional initialization hook via `EmbeddingFineTuneConfig`.

Copilot AI Apr 1, 2026


This decision entry describes EmbeddingFineTuneConfig as an “optional initialization hook”, but the current Mem0 backend does not use this config to change embedder behavior at initialization (it’s only a data-model stub). Please rephrase to avoid implying functionality that isn’t implemented yet, or add the hook implementation.

Suggested change
- **Model selection:** Three deployment tiers recommended based on LMEB scores. See [Embedding Evaluation](../reference/embedding-evaluation.md) for the full analysis. Domain-specific fine-tuning (+10-27% improvement) documented as an optional initialization hook via `EmbeddingFineTuneConfig`.
+ **Model selection:** Three deployment tiers recommended based on LMEB scores. See [Embedding Evaluation](../reference/embedding-evaluation.md) for the full analysis. Domain-specific fine-tuning (+10-27% improvement) is captured as a planned configuration stub via `EmbeddingFineTuneConfig`; the current Mem0 backend does not yet use this to change embedder initialization.

| Agent quality degradation with cheap models | Medium | Quality gates, minimum model requirements per task type. |
| Third-party library breaking changes | Medium | Pin versions, integration tests, abstraction layers. |
| Memory retrieval quality | Medium | Mem0 selected as initial backend (see [Decision Log](../architecture/decisions.md)). Protocol layer enables backend swap if retrieval quality is insufficient. Pin version, test Python 3.14 compatibility in CI. |
| Memory retrieval quality | Medium | Mem0 selected as initial backend (see [Decision Log](../architecture/decisions.md)). LMEB evaluation ([arXiv:2603.12572](https://arxiv.org/abs/2603.12572)) shows MTEB scores do not predict memory retrieval quality (Spearman: -0.130). Embedding model selection should be guided by LMEB episodic + procedural scores. Optional domain fine-tuning (+10-27%) available via `EmbeddingFineTuneConfig`. See [Embedding Evaluation](../reference/embedding-evaluation.md). |

Copilot AI Apr 1, 2026


This line implies domain fine-tuning is “available” via EmbeddingFineTuneConfig, but the current Mem0 backend does not yet act on that config (no checkpoint lookup / switching behavior). Consider clarifying that the config is a stub for a future offline fine-tuning pipeline, rather than an available feature.

Suggested change
- | Memory retrieval quality | Medium | Mem0 selected as initial backend (see [Decision Log](../architecture/decisions.md)). LMEB evaluation ([arXiv:2603.12572](https://arxiv.org/abs/2603.12572)) shows MTEB scores do not predict memory retrieval quality (Spearman: -0.130). Embedding model selection should be guided by LMEB episodic + procedural scores. Optional domain fine-tuning (+10-27%) available via `EmbeddingFineTuneConfig`. See [Embedding Evaluation](../reference/embedding-evaluation.md). |
+ | Memory retrieval quality | Medium | Mem0 selected as initial backend (see [Decision Log](../architecture/decisions.md)). LMEB evaluation ([arXiv:2603.12572](https://arxiv.org/abs/2603.12572)) shows MTEB scores do not predict memory retrieval quality (Spearman: -0.130). Embedding model selection should be guided by LMEB episodic + procedural scores. Optional domain fine-tuning (+10-27%) is planned via an offline pipeline configured with `EmbeddingFineTuneConfig` (currently a stub; the Mem0 backend does not yet use it for checkpoint selection). See [Embedding Evaluation](../reference/embedding-evaluation.md). |

Comment on lines +73 to +85
    if self.enabled and self.checkpoint_path is None:
        msg = "checkpoint_path must be non-blank when fine-tuning is enabled"
        logger.warning(
            MEMORY_BACKEND_CONFIG_INVALID,
            model="EmbeddingFineTuneConfig",
            field="checkpoint_path",
            enabled=self.enabled,
            reason=msg,
        )
        raise ValueError(msg)
    if self.enabled and self.base_model is None:
        msg = "base_model must be non-blank when fine-tuning is enabled"
        logger.warning(

Copilot AI Apr 1, 2026


The validator only checks for None, but the error message says “must be non-blank”. Since the type already enforces non-blank strings, consider changing the message to “must be set/provided when fine-tuning is enabled” (and similarly for base_model) to better reflect the actual validation condition.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/security.md`:
- Around line 145-147: Update the timeline table entries for "2025-07" and
"2026-01" that currently read "Community asked again -- no response" to use
varied phrasing (for example "Community followed up -- no response", "Community
inquired again -- no response", or "Community re-opened inquiry -- no response")
so the adjacent rows are not repetitive; locate the two rows by their year
labels "2025-07" and "2026-01" and replace the duplicate text with one of the
suggested alternatives (or similar) ensuring tone and meaning remain consistent
with the surrounding entries.
- Around line 187-188: Update the CSP browser-support statement to separate
support for the two directives: change the unified claim into two lines
explaining that `style-src-elem` is supported in Chrome 75+, Firefox 108+ (not
105+), Safari 15.4+ (partial, full at 26.2+), and Edge 79+, while
`style-src-attr` is only supported in Chrome 75+ and Edge 79+ and is unsupported
in Firefox and Safari; reference the Firefox bug (1529338) if you want to
justify the unsupported status.
- Around line 137-152: Remove the unverifiable GitHub references in the
"Upstream status (stagnant)" section: delete the table rows that mention PR
`#2728` and PR `#3131` and remove the "Open issues with no maintainer engagement"
line that lists issues `#3063` and `#3117`; keep the Discussion `#3130` entry (closed
Feb 18, 2026) as the only upstream citation and either replace the removed "open
issues" claim with a verified statement or remove that sentence entirely so the
paragraph only reflects verifiable items referenced (use the "Upstream status
(stagnant)" header and the table rows to locate where to edit).
- Line 87: Update the activation checklist in web/index.html to include the
final wiring step for react-style-singleton: call setNonce() from lib/csp.ts so
get-nonce can apply nonces to injected <style> tags; reference the existing
lib/csp.ts setNonce export and ensure the checklist step mentions wiring
setNonce() for react-style-singleton (via get-nonce) alongside the other CSP
activation steps for consistency with the mitigation text.

In `@src/synthorg/memory/backends/mem0/config.py`:
- Around line 127-130: The fine_tune field on Mem0EmbedderConfig is currently
accepted but ignored; until we implement checkpoint-based fine-tuning, make this
a fail-fast: in the Mem0EmbedderConfig validation path (e.g., add a
__post_init__ or validate method on class Mem0EmbedderConfig or where Mem0
embedder config is constructed), check if fine_tune is not None and
fine_tune.enabled is True and raise a clear ValueError explaining that Mem0
fine-tuning is not supported yet and must be disabled or omitted; reference the
fine_tune field and Mem0EmbedderConfig so the check runs during config build.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c38f9d47-ddc6-4d75-91c9-fc9aa1d702f2

📥 Commits

Reviewing files that changed from the base of the PR and between f511e1d and c9b6207.

📒 Files selected for processing (9)
  • docs/architecture/decisions.md
  • docs/design/memory.md
  • docs/reference/embedding-evaluation.md
  • docs/reference/research.md
  • docs/roadmap/open-questions.md
  • docs/security.md
  • mkdocs.yml
  • src/synthorg/memory/backends/mem0/__init__.py
  • src/synthorg/memory/backends/mem0/config.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Agent
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Analyze (python)
  • GitHub Check: Dependency Review
🧰 Additional context used
📓 Path-based instructions (5)
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Docs: docs/ (Markdown, built with Zensical, config: mkdocs.yml). Design spec: docs/design/ (11 pages), Architecture: docs/architecture/, Roadmap: docs/roadmap/

Files:

  • docs/reference/research.md
  • docs/roadmap/open-questions.md
  • docs/architecture/decisions.md
  • docs/reference/embedding-evaluation.md
  • docs/security.md
  • docs/design/memory.md
docs/{security.md,licensing.md,reference/**/*.md}

📄 CodeRabbit inference engine (CLAUDE.md)

Security: docs/security.md, Licensing: docs/licensing.md, Reference: docs/reference/

Files:

  • docs/reference/research.md
  • docs/reference/embedding-evaluation.md
  • docs/security.md
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: No from __future__ import annotations -- Python 3.14 has PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) -- ruff enforces this on Python 3.14
Type hints: all public functions must have type hints, mypy strict mode
Google-style docstrings required on public classes and functions (enforced by ruff D rules)
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Config vs runtime state: use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields -- including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants -- instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task
Line length: 88 characters (ruff enforced)
Functions must be < 50 lines, files must be < 800 lines
Handle errors explicitly, never silently swallow them
Validate at system boundaries (user input, external APIs, config files)
Every module wit...

Files:

  • src/synthorg/memory/backends/mem0/__init__.py
  • src/synthorg/memory/backends/mem0/config.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases

Files:

  • src/synthorg/memory/backends/mem0/__init__.py
  • src/synthorg/memory/backends/mem0/config.py
docs/design/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When approved deviations from the design spec occur, update the relevant docs/design/ page to reflect the new reality

Files:

  • docs/design/memory.md
🧠 Learnings (14)
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to docs/{security.md,licensing.md,reference/**/*.md} : Security: `docs/security.md`, Licensing: `docs/licensing.md`, Reference: `docs/reference/`

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to docs/**/*.md : Docs: `docs/` (Markdown, built with Zensical, config: `mkdocs.yml`). Design spec: `docs/design/` (11 pages), Architecture: `docs/architecture/`, Roadmap: `docs/roadmap/`

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)

Applied to files:

  • docs/roadmap/open-questions.md
  • docs/architecture/decisions.md
  • src/synthorg/memory/backends/mem0/__init__.py
  • docs/reference/embedding-evaluation.md
  • src/synthorg/memory/backends/mem0/config.py
  • docs/design/memory.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • docs/roadmap/open-questions.md
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory

Applied to files:

  • src/synthorg/memory/backends/mem0/__init__.py
  • docs/design/memory.md
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to pyproject.toml : Required: `mem0ai` (Mem0 memory backend -- the default and currently only backend), `cryptography` (Fernet encryption for sensitive settings at rest), `faker` (multi-locale agent name generation for templates and setup wizard)

Applied to files:

  • src/synthorg/memory/backends/mem0/__init__.py
  • src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to web/src/**/*.{ts,tsx} : Use React 19, TypeScript 6.0+, and design system tokens from shadcn/ui + Tailwind CSS 4 + Radix UI in web dashboard

Applied to files:

  • docs/security.md
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to web/src/**/*.{ts,tsx} : NEVER hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions -- use design tokens and `@/lib/motion` presets in the React 19 dashboard

Applied to files:

  • docs/security.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • docs/security.md
📚 Learning: 2026-04-01T06:54:44.133Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-01T06:54:44.133Z
Learning: Applies to web/src/**/*.{tsx,ts} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup, Drawer, InputField, SelectField, SliderField, ToggleField, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor, TagInput, MetadataGrid, ProjectStatusBadge, ContentTypeBadge)

Applied to files:

  • docs/security.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

  • src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/persistence/**/*.py : Persistence uses pluggable PersistenceBackend protocol. SQLite is the initial backend. Settings use SettingsRepository (namespaced settings CRUD).

Applied to files:

  • docs/design/memory.md
🪛 LanguageTool
docs/security.md

[style] ~146-~146: This sentence contains multiple usages of the word “again”. Consider removing or replacing it.
Context: ... response | | 2026-01 | Community asked again -- no response | | 2026-02 | [Discussio...

(REPETITION_OF_AGAIN)

🪛 markdownlint-cli2 (0.22.0)
docs/design/memory.md

[warning] 348-348: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)

🔇 Additional comments (11)
docs/reference/research.md (1)

98-99: Good source additions for LMEB + fine-tuning context.
These references align well with the new evaluation and pipeline documentation.

src/synthorg/memory/backends/mem0/config.py (1)

70-93: Validation and structured warning path look solid.
The required-field checks when enabled=True are clear and follow the existing logging pattern.

src/synthorg/memory/backends/mem0/__init__.py (1)

4-15: Public export update is clean and consistent.
Re-exporting EmbeddingFineTuneConfig here matches the new config surface.

mkdocs.yml (1)

129-129: Nice nav wiring for the new reference page.
Placement under Reference is appropriate and discoverable.

docs/roadmap/open-questions.md (1)

28-28: Risk entry update is clear and actionable.
The LMEB-based mitigation guidance is much more concrete than the prior generic wording.

docs/architecture/decisions.md (1)

91-104: Strong decision-log addition.
The decision, context, and rejected alternatives are clearly documented and easy to audit later.

docs/reference/embedding-evaluation.md (1)

1-246: Excellent reference doc quality.
This is thorough, well-structured, and gives practical deployment guidance (tiers, mapping, and pipeline) in one place.

docs/security.md (4)

124-136: LGTM — Risk assessment is accurate and well-reasoned.

The risk assessment correctly distinguishes between the high-severity XSS vector (script-src) and the low-severity inline style attribute injection. The technical claims align with CSP behavior, and the reference to script-src 'self' is confirmed by web/security-headers.conf (context snippet 4, line 9).


210-210: LGTM — CSP header description is accurate and consistent.

The updated description correctly reflects:

  1. The current CSP configuration in web/security-headers.conf (style-src 'self' 'unsafe-inline')
  2. The staged nonce infrastructure (getCspNonce() + MotionConfig)
  3. The blocker (Radix inline style attributes)
  4. Cross-references to the detailed accepted risk and mitigation sections

107-112: Claim verified as accurate. The CSP specification restricts nonces to <style> and <script> elements; inline style attributes cannot use nonces. In CSP Level 3 they are governed by style-src-attr, which can allow hashed values via 'unsafe-hashes' but never nonces.
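
To make the element/attribute split concrete, the following is a rough sketch of how a split policy could be assembled. It is not the project's actual header (that lives in web/security-headers.conf). The function names and the exact directive set are assumptions for illustration.

```python
import base64
import secrets


def make_nonce() -> str:
    """Return a fresh base64 nonce; generate one per HTTP response."""
    return base64.b64encode(secrets.token_bytes(16)).decode("ascii")


def build_csp(nonce: str) -> str:
    """Build a policy string with split style directives (sketch)."""
    directives = {
        "script-src": ["'self'"],
        # Assumed fallback for browsers without the split directives.
        "style-src": ["'self'", "'unsafe-inline'"],
        # <style> elements: nonce-gated, no 'unsafe-inline'.
        "style-src-elem": ["'self'", f"'nonce-{nonce}'"],
        # style="" attributes: nonces cannot apply here, so the
        # accepted risk is confined to this one directive.
        "style-src-attr": ["'unsafe-inline'"],
    }
    return "; ".join(
        f"{name} {' '.join(values)}" for name, values in directives.items()
    )
```

Calling `build_csp(make_nonce())` yields a single header value. The same nonce would also have to be attached to every `<style>` element the page emits, which is what the nonce wiring discussed in this review is for.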


154-169: Correct the Base UI 1.0 release date from February 2026 to December 2025.

The migration assessment includes several factual claims about Base UI and dependencies:

  1. Base UI version: Release date incorrect. Base UI 1.0 was released December 11, 2025 (not February 2026 as stated). The v1.3.0 release on March 12, 2026 is verified correct.
  2. CSPProvider feature: Verified correct. Base UI includes CSPProvider for CSP nonce support. ✓
  3. cmdk dependency: Verified correct. cmdk has a direct dependency on @radix-ui/react-dialog. ✓
  4. Floating UI inline styles: Verified correct. Floating UI uses inline style attributes for positioning, which requires style-src-attr 'unsafe-inline'. Base UI's CSPProvider explicitly does not cover inline style attributes. ✓

The migration verdict stands: migration scope remains disproportionate to security gains. Update the Base UI 1.0 release date from "February 2026" to "December 2025" in line 155.

Comment on lines +137 to +152
#### Upstream status (stagnant)

| Date | Event |
|------|-------|
| 2024-02 | Nonce prop merged for ScrollArea/Select only ([PR #2728](https://github.com/radix-ui/primitives/pull/2728)) |
| 2024-09 | CSS export approach PR closed by maintainer (backlog triage, [PR #3131](https://github.com/radix-ui/primitives/pull/3131)) |
| 2024-10 | Maintainer said "near the top of my todo list" |
| 2025-04 | Community asked for update -- no response |
| 2025-07 | Community asked again -- no response |
| 2026-01 | Community asked again -- no response |
| 2026-02 | [Discussion #3130](https://github.com/radix-ui/primitives/discussions/3130) **closed** -- author pointed to Base UI as the successor with CSP support |

Open issues with no maintainer engagement:
[#3063](https://github.com/radix-ui/primitives/issues/3063),
[#3117](https://github.com/radix-ui/primitives/issues/3117).

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

`Verify the following Radix UI primitives GitHub references:

  1. PR #2728 - Nonce prop merged for ScrollArea/Select only in February 2024
  2. PR #3131 - CSS export approach PR closed in September 2024
  3. Discussion #3130 - Closed in February 2026 with author pointing to Base UI
  4. Issue #3063 - CSP-related issue, current status
  5. Issue #3117 - CSP-related issue, current status`

💡 Result:

Verification of Radix UI primitives GitHub references (as of 2026-04-01):

  1. PR #2728 (nonce prop merged for ScrollArea/Select only, February 2024): Not found. No PR #2728 exists in searches or recent PR lists. However, ScrollArea now supports a nonce prop (docs show it under Root), likely added in a different PR. No evidence of a February 2024 merge specifically for ScrollArea/Select.
  2. PR #3131 (CSS export approach PR closed, September 2024): Not found. No PR #3131 in searches. Related CSS export PRs exist in radix-ui/themes (e.g., #265 Jan 2024, #340 Mar 2024), but not in primitives and not #3131.
  3. Discussion #3130 (closed February 2026 with author pointing to Base UI): Confirmed. Discussion #3130 "Becoming CSP Conscious" started Sep 2024. Author prests closed it Feb 18, 2026, stating "Base UI is going to be the next evolution for Radix UI. Base UI comes with support for CSPs in the form of a nonce prop."
  4. Issue #3063 (CSP-related issue, current status): Not found. No issue #3063 in searches. General CSP discussions exist (e.g., #3130, PR #2840 merged Sep 2024 fixing cssText for CSP).
  5. Issue #3117 (CSP-related issue, current status): Not found. No issue #3117 in searches. See above for CSP context.


Remove or correct non-existent GitHub references.

The table cites PR #2728, PR #3131, and issues #3063 and #3117, which do not exist on the Radix UI primitives repository. Only Discussion #3130 (closed Feb 18, 2026) is verifiable. Using false citations to document upstream status undermines the credibility of the security analysis.

Correct actions:

  1. Remove references to PR #2728 and PR #3131—no evidence these existed or were closed on the stated dates
  2. Remove references to issues #3063 and #3117—these do not exist
  3. Retain Discussion #3130 (verified correct: closed Feb 18, 2026, with author statement about Base UI succession)
  4. Replace the fabricated "open issues with no maintainer engagement" claim with accurate information or remove it entirely if unverifiable
🧰 Tools
🪛 LanguageTool

[style] ~146-~146: This sentence contains multiple usages of the word “again”. Consider removing or replacing it.
Context: ... response | | 2026-01 | Community asked again -- no response | | 2026-02 | [Discussio...

(REPETITION_OF_AGAIN)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/security.md` around lines 137 - 152, Remove the unverifiable GitHub
references in the "Upstream status (stagnant)" section: delete the table rows
that mention PR `#2728` and PR `#3131` and remove the "Open issues with no
maintainer engagement" line that lists issues `#3063` and `#3117`; keep the
Discussion `#3130` entry (closed Feb 18, 2026) as the only upstream citation and
either replace the removed "open issues" claim with a verified statement or
remove that sentence entirely so the paragraph only reflects verifiable items
referenced (use the "Upstream status (stagnant)" header and the table rows to
locate where to edit).

@codecov
codecov bot commented Apr 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.00%. Comparing base (f511e1d) to head (0beb445).
⚠️ Report is 1 commit behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #962   +/-   ##
=======================================
  Coverage   91.99%   92.00%           
=======================================
  Files         622      622           
  Lines       33486    33527   +41     
  Branches     3228     3235    +7     
=======================================
+ Hits        30807    30848   +41     
  Misses       2123     2123           
  Partials      556      556           


- Add tests for EmbeddingFineTuneConfig (16 new tests: validators,
  frozen, blank strings, path traversal, POSIX enforcement, fine_tune
  field wiring, fail-fast guard)
- Add fail-fast validator on Mem0EmbedderConfig rejecting
  fine_tune.enabled=True until checkpoint lookup is implemented
- Add path-traversal and POSIX-path validation on checkpoint_path and
  training_data_dir (consistent with Mem0BackendConfig._reject_traversal)
- Fix docs describing unimplemented behavior as current (4 files)
- Fix CSP browser support: split per-directive, correct Firefox/Safari
  coverage for style-src-attr
- Fix risk assessment: inline style url() loading is possible (constrained
  by fetch directives)
- Fix Base UI 1.0 release date: December 2025, not February 2026
- Clarify CSP Level 2 vs Level 3 directive splitting in "What is permitted"
- Update CSP activation checklist to 4 steps (add setNonce() wiring)
- Fix validator messages: "must be set" not "must be non-blank"
- Fix docstring: "None by default" not "disabled by default"
- Document training_data_dir intentional exemption from required-when-enabled
- Move "not yet implemented" note to top of EmbeddingFineTuneConfig docstring
- Vary repetitive "Community asked again" phrasing in upstream timeline

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (3)
docs/security.md (3)

198-201: ⚠️ Potential issue | 🔴 Critical

Remove references to non-existent GitHub issues (duplicate issue).

Issues #3063 and #3117 do not exist. This is the same fabrication issue flagged at lines 150-152 and in previous reviews. Remove these references from the quarterly re-evaluation guidance.

🗑️ Proposed fix
 **Phase 3 -- Re-evaluate quarterly:**
-Check upstream Radix issues ([`#3063`](https://github.com/radix-ui/primitives/issues/3063),
-[`#3117`](https://github.com/radix-ui/primitives/issues/3117)) for movement. If Radix ships
+Monitor upstream Radix discussions (e.g., [`#3130`](https://github.com/radix-ui/primitives/discussions/3130)) for movement. If Radix ships
 nonce support for inline styles, or if Base UI reaches feature parity with the dashboard's
 Radix usage, remove `style-src-attr 'unsafe-inline'`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/security.md` around lines 198 - 201, Remove the two non-existent GitHub
issue references (“#3063” and “#3117”) from the quarterly re-evaluation guidance
sentence that mentions removing `style-src-attr 'unsafe-inline'`; instead either
delete the parenthetical links entirely or replace them with a generic pointer
like “upstream Radix issues” so the guidance remains accurate without
referencing fabricated issue numbers.

142-143: ⚠️ Potential issue | 🔴 Critical

Remove fabricated GitHub references (duplicate issue).

PR #2728 and PR #3131 do not exist in the Radix UI primitives repository. This was flagged in a previous review with definitive web search verification showing these references are fabricated. Using false citations undermines the credibility of the security documentation.

🗑️ Proposed fix: Remove non-existent PR references
 | Date | Event |
 |------|-------|
-| 2024-02 | Nonce prop merged for ScrollArea/Select only ([PR `#2728`](https://github.com/radix-ui/primitives/pull/2728)) |
-| 2024-09 | CSS export approach PR closed by maintainer (backlog triage, [PR `#3131`](https://github.com/radix-ui/primitives/pull/3131)) |
 | 2024-10 | Maintainer said "near the top of my todo list" |

If you have verifiable references for nonce support in ScrollArea/Select or a closed CSS export PR, replace these with accurate citations. Otherwise, remove them entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/security.md` around lines 142 - 143, Remove the fabricated GitHub PR
references from the changelog table entries in docs/security.md: delete or
replace the PR links for the "Nonce prop merged for ScrollArea/Select" entry (PR
`#2728`) and the "CSS export approach PR closed by maintainer" entry (PR `#3131`);
if you have verifiable sources, replace the fake PR links with accurate
citations and retain the descriptive text, otherwise remove the PR link and/or
the entire table rows so the changelog contains only factual, verifiable
references.

150-152: ⚠️ Potential issue | 🔴 Critical

Remove fabricated GitHub issue references (duplicate issue).

Issues #3063 and #3117 do not exist in the Radix UI primitives repository. This was flagged in a previous review with definitive web search verification. The only verifiable upstream reference is Discussion #3130 (correctly cited at line 148).

🗑️ Proposed fix: Remove non-existent issue references
-
-Open issues with no maintainer engagement:
-[`#3063`](https://github.com/radix-ui/primitives/issues/3063),
-[`#3117`](https://github.com/radix-ui/primitives/issues/3117).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/security.md` around lines 150 - 152, The markdown line referencing
non-existent GitHub issues ("Open issues with no maintainer engagement: [`#3063`],
[`#3117`]") should be removed; edit the text to delete the two fabricated issue
links and ensure only the valid upstream reference (Discussion `#3130`) remains
cited and punctuation is adjusted accordingly. Locate the exact sentence
containing "[`#3063`]" and "[`#3117`]" and remove those bracketed links (or replace
the entire line with a statement referencing only Discussion `#3130`) so the docs
no longer point to bogus issue numbers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/security.md`:
- Around line 174-177: Import the setNonce function from the get-nonce package
and call it during your CSP initialization so react-style-singleton's <style>
tags receive the nonce; specifically, add an import for setNonce (from
"get-nonce") and invoke setNonce(...) during the initialization sequence where
the MotionConfig nonce wrapper is applied so all injected <style> elements
become nonce-capable.

In `@src/synthorg/memory/backends/mem0/config.py`:
- Around line 23-134: The path-traversal validation in
EmbeddingFineTuneConfig._reject_path_traversal duplicates logic in
Mem0BackendConfig._reject_traversal; extract a shared helper (e.g.,
_validate_posix_path(value: str, field_name: str, model_name: str)) that
performs the PureWindowsPath/PurePosixPath parts check, parent-dir ("..") check,
and Windows-path checks and emits the same MEMORY_BACKEND_CONFIG_INVALID
warnings, then replace the loop body in
EmbeddingFineTuneConfig._reject_path_traversal and the body of
Mem0BackendConfig._reject_traversal to call this helper for each relevant field
so both validators reuse the same logic.
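
A shared helper along the lines suggested for config.py might look like the sketch below. It is an approximation under stated assumptions: the name `validate_posix_path` and its signature are illustrative, it raises `ValueError` directly, and it leaves out the `MEMORY_BACKEND_CONFIG_INVALID` warning that the real validators emit.

```python
from pathlib import PurePosixPath, PureWindowsPath


def validate_posix_path(value: str, field_name: str) -> str:
    """Reject traversal and Windows-style paths; return value unchanged."""
    # Check ".." under both flavours so "a\\..\\b" is caught too.
    parts = set(PurePosixPath(value).parts) | set(PureWindowsPath(value).parts)
    if ".." in parts:
        msg = f"{field_name} must not contain parent-directory traversal"
        raise ValueError(msg)
    # A drive letter or a backslash marks a Windows-style path.
    if PureWindowsPath(value).drive or "\\" in value:
        msg = f"{field_name} must be a POSIX path"
        raise ValueError(msg)
    return value
```

Both validators could then loop over their path fields and call the helper once per field, passing the field name so the error message stays specific.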

In `@web/index.html`:
- Around line 11-12: The checklist references a non-existent setNonce; fix by
importing setNonce from the get-nonce package into the csp module and
re-exporting it alongside getCspNonce so callers can access it (i.e., export
setNonce from the csp module). Then, in your app initialization (e.g., App.tsx),
call setNonce(getCspNonce()) or only call setNonce if getCspNonce() returns a
value to wire nonces into the react-style-singleton injector. Reference symbols:
setNonce, getCspNonce, get-nonce, App.tsx.

---

Duplicate comments:
In `@docs/security.md`:
- Around line 198-201: Remove the two non-existent GitHub issue references
(“#3063” and “#3117”) from the quarterly re-evaluation guidance sentence that
mentions removing `style-src-attr 'unsafe-inline'`; instead either delete the
parenthetical links entirely or replace them with a generic pointer like
“upstream Radix issues” so the guidance remains accurate without referencing
fabricated issue numbers.
- Around line 142-143: Remove the fabricated GitHub PR references from the
changelog table entries in docs/security.md: delete or replace the PR links for
the "Nonce prop merged for ScrollArea/Select" entry (PR `#2728`) and the "CSS
export approach PR closed by maintainer" entry (PR `#3131`); if you have
verifiable sources, replace the fake PR links with accurate citations and retain
the descriptive text, otherwise remove the PR link and/or the entire table rows
so the changelog contains only factual, verifiable references.
- Around line 150-152: The markdown line referencing non-existent GitHub issues
("Open issues with no maintainer engagement: [`#3063`], [`#3117`]") should be
removed; edit the text to delete the two fabricated issue links and ensure only
the valid upstream reference (Discussion `#3130`) remains cited and punctuation is
adjusted accordingly. Locate the exact sentence containing "[`#3063`]" and
"[`#3117`]" and remove those bracketed links (or replace the entire line with a
statement referencing only Discussion `#3130`) so the docs no longer point to
bogus issue numbers.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f70211ea-3af4-4d80-8cfc-4c2964cc3a20

📥 Commits

Reviewing files that changed from the base of the PR and between c9b6207 and 0beb445.

📒 Files selected for processing (8)
  • docs/architecture/decisions.md
  • docs/design/memory.md
  • docs/reference/embedding-evaluation.md
  • docs/roadmap/open-questions.md
  • docs/security.md
  • src/synthorg/memory/backends/mem0/config.py
  • tests/unit/memory/backends/mem0/test_config.py
  • web/index.html
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Web
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Docs: docs/ (Markdown, built with Zensical, config: mkdocs.yml). Design spec: docs/design/ (11 pages), Architecture: docs/architecture/, Roadmap: docs/roadmap/

Files:

  • docs/architecture/decisions.md
  • docs/roadmap/open-questions.md
  • docs/reference/embedding-evaluation.md
  • docs/design/memory.md
  • docs/security.md
docs/{security.md,licensing.md,reference/**/*.md}

📄 CodeRabbit inference engine (CLAUDE.md)

Security: docs/security.md, Licensing: docs/licensing.md, Reference: docs/reference/

Files:

  • docs/reference/embedding-evaluation.md
  • docs/security.md
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use @pytest.mark.unit for unit tests, @pytest.mark.integration for integration tests, @pytest.mark.e2e for e2e tests, @pytest.mark.slow for slow tests
Coverage: 80% minimum (enforced in CI)
Async: asyncio_mode = "auto" -- no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml -- do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. instead of real vendor names
Property-based testing (Python): use Hypothesis (@given + @settings). Profiles: ci (50 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var. Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties
Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins
For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use asyncio.Event().wait() instead of asyncio.sleep(large_number) -- it is cancellation-safe

Files:

  • tests/unit/memory/backends/mem0/test_config.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases

Files:

  • tests/unit/memory/backends/mem0/test_config.py
  • src/synthorg/memory/backends/mem0/config.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: No from __future__ import annotations -- Python 3.14 has PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) -- ruff enforces this on Python 3.14
Type hints: all public functions must have type hints, mypy strict mode
Google-style docstrings required on public classes and functions (enforced by ruff D rules)
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Config vs runtime state: use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields -- including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants -- instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task
Line length: 88 characters (ruff enforced)
Functions must be < 50 lines, files must be < 800 lines
Handle errors explicitly, never silently swallow them
Validate at system boundaries (user input, external APIs, config files)
Every module wit...

Files:

  • src/synthorg/memory/backends/mem0/config.py
docs/design/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When approved deviations from the design spec occur, update the relevant docs/design/ page to reflect the new reality

Files:

  • docs/design/memory.md
🧠 Learnings (20)
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to web/src/**/*.{ts,tsx} : Use React 19, TypeScript 6.0+, and design system tokens from shadcn/ui + Tailwind CSS 4 + Radix UI in web dashboard

Applied to files:

  • web/index.html
  • docs/security.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)

Applied to files:

  • docs/architecture/decisions.md
  • docs/roadmap/open-questions.md
  • docs/reference/embedding-evaluation.md
  • src/synthorg/memory/backends/mem0/config.py
  • docs/design/memory.md
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • docs/architecture/decisions.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • docs/roadmap/open-questions.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

  • tests/unit/memory/backends/mem0/test_config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

  • src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state

Applied to files:

  • src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-16T20:14:00.937Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T20:14:00.937Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to {**/*.py,web/src/**/*.{ts,tsx}} : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/memory/backends/mem0/config.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/memory/**/*.py : Use MemoryBackend protocol with pluggable backends (Mem0 adapter available at backends/mem0/) for persistent agent memory

Applied to files:

  • docs/design/memory.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/persistence/**/*.py : Persistence uses pluggable PersistenceBackend protocol. SQLite is the initial backend. Settings use SettingsRepository (namespaced settings CRUD).

Applied to files:

  • docs/design/memory.md
📚 Learning: 2026-03-21T11:08:01.542Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-21T11:08:01.542Z
Learning: PR issue references: preserve existing `Closes #NNN` references -- never remove unless explicitly asked.

Applied to files:

  • docs/security.md
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: PR issue references: preserve existing `Closes #NNN` references -- never remove unless explicitly asked

Applied to files:

  • docs/security.md
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to the PR's changes), fix them all. No deferring, no 'out of scope' skipping

Applied to files:

  • docs/security.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • docs/security.md
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to web/src/**/*.{ts,tsx} : NEVER hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions -- use design tokens and `@/lib/motion` presets in the React 19 dashboard

Applied to files:

  • docs/security.md
📚 Learning: 2026-04-01T06:54:44.133Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-01T06:54:44.133Z
Learning: Applies to web/src/**/*.{tsx,ts} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup, Drawer, InputField, SelectField, SliderField, ToggleField, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor, TagInput, MetadataGrid, ProjectStatusBadge, ContentTypeBadge)

Applied to files:

  • docs/security.md
🔇 Additional comments (9)
src/synthorg/memory/backends/mem0/config.py (1)

168-189: Fail-fast validator correctly addresses the previous review concern.

The _reject_unimplemented_fine_tune validator now raises ValueError when fine_tune.enabled=True, preventing silent acceptance of an unimplemented feature. This aligns with the documented stub behavior and addresses the prior review feedback.

tests/unit/memory/backends/mem0/test_config.py (1)

42-130: Comprehensive test coverage for the new config models.

The tests thoroughly cover:

  • Default values and required-field validation
  • Immutability (frozen model)
  • Blank string rejection via NotBlankStr
  • Path traversal and Windows path rejection
  • The fail-fast behavior for fine_tune.enabled=True

Good use of @pytest.mark.parametrize for the path validation cases.

docs/roadmap/open-questions.md (1)

28-28: Risk mitigation documentation is accurate and well-referenced.

The updated text correctly describes the LMEB evaluation rationale, the EmbeddingFineTuneConfig stub status, and includes appropriate cross-references to the decision log and embedding evaluation page.

docs/architecture/decisions.md (1)

91-104: Decision entry follows established format and provides clear rationale.

The entry documents the LMEB vs MTEB decision with:

  • Clear context explaining why memory retrieval differs from traditional passage retrieval
  • Evidence-based comparison table with scores and reasoning
  • Appropriate cross-reference to the detailed evaluation page
  • Accurate status note that EmbeddingFineTuneConfig is a stub not yet consumed by the adapter
docs/reference/embedding-evaluation.md (2)

36-50: Memory type mapping is accurate and well-documented.

The mapping table correctly aligns SynthOrg's MemoryCategory enum with LMEB categories. The note that WORKING is "not applicable" (in-context, not stored/retrieved) is consistent with the codebase behavior where extract_category in mappers.py uses WORKING as a fallback default, not as an actively embedded category.


213-224: Planned initialization behavior clearly documents the stub status.

The three-step fallback logic (checkpoint exists → use fine-tuned, no checkpoint → warn + use base, disabled → use base) is well-defined. The explicit note that this is "not yet implemented in the Mem0 adapter" prevents user confusion and aligns with the fail-fast validator in the code.

docs/design/memory.md (1)

317-390: Design spec accurately documents embedding model selection and fine-tuning research direction.

The section provides:

  • Correct LMEB-to-MemoryCategory mapping with evaluation priorities
  • Clear statement that MTEB scores don't predict memory retrieval quality
  • Well-structured research direction for domain fine-tuning
  • Code example that matches the actual EmbeddingFineTuneConfig implementation (frozen, allow_inf_nan=False, optional NotBlankStr fields)

The explicit note that the config is "currently defined but not wired into the Mem0 adapter initialization" maintains documentation-code consistency.

docs/security.md (2)

190-195: LGTM! CSP Level 3 browser support accurately documented.

The browser support details now correctly distinguish between style-src-elem (broader support) and style-src-attr (Chrome/Edge only, unsupported in Firefox and Safari), and accurately note the fallback behavior. This addresses the inaccuracy flagged in the previous review.


87-87: ⚠️ Potential issue | 🟠 Major

Reference to non-existent setNonce() function and misleading "staged" claim.

This line states "wiring setNonce() in lib/csp.ts is the remaining step," but web/src/lib/csp.ts does not define or export a setNonce() function. Additionally, the phrase "Currently staged" is misleading—App.tsx already calls getCspNonce() and passes the nonce to MotionConfig, so Framer Motion nonce support is active, not staged. Only the nginx-side nonce injection and react-style-singleton wiring remain.

📝 Proposed rewording
-| **CSP nonce readiness** | `<MotionConfig nonce>` wrapper in `App.tsx` + `lib/csp.ts` reader. Framer Motion's dynamically injected `<style>` tags are nonce-ready. `react-style-singleton` (used by Radix Dialog/AlertDialog/Popover via `react-remove-scroll`) also supports nonces via the `get-nonce` package -- wiring `setNonce()` in `lib/csp.ts` is the remaining step. Currently staged -- see the activation checklist in `web/index.html` and the [accepted risk](#accepted-risk-inline-style-attributes) section below. |
+| **CSP nonce readiness** | `<MotionConfig nonce>` wrapper in `App.tsx` + `lib/csp.ts` reader. Framer Motion's dynamically injected `<style>` tags receive nonces when nonce infrastructure is activated (nginx-side injection pending). `react-style-singleton` (used by Radix Dialog/AlertDialog/Popover via `react-remove-scroll`) also supports nonces via the `get-nonce` package -- importing and calling `setNonce()` from `get-nonce` in `lib/csp.ts` is the remaining wiring step. See the activation checklist in `web/index.html` and the [accepted risk](#accepted-risk-inline-style-attributes) section below. |

Based on learnings: Context snippet 1 shows web/src/lib/csp.ts only exports getCspNonce(), and context snippet 2 shows App.tsx already uses the nonce with MotionConfig.

⛔ Skipped due to learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to web/src/**/*.{ts,tsx} : NEVER hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions -- use design tokens and `@/lib/motion` presets in the React 19 dashboard
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to web/src/**/*.{ts,tsx} : NEVER hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions — use design tokens and `@/lib/motion` presets
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-01T06:54:44.133Z
Learning: Applies to web/src/**/*.{tsx,ts} : Use Framer Motion presets from `@/lib/motion` instead of hardcoded transition durations
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-01T06:54:44.133Z
Learning: Applies to web/src/**/*.{tsx,ts} : Use role="meter" for TokenUsageBar component with animated multi-segment transitions

Comment on lines +174 to +177
**Phase 1 -- Wire `get-nonce` bridge** (low effort, high impact):
Add a `setNonce()` call from the `get-nonce` package in `web/src/lib/csp.ts` so that
`react-style-singleton`'s `<style>` tags receive the CSP nonce. Combined with the existing
`MotionConfig nonce` wrapper, this makes all `<style>` element injection nonce-capable.

🛠️ Refactor suggestion | 🟠 Major

Clarify the setNonce() implementation requirement.

Phase 1 correctly identifies that setNonce() must be called "from the get-nonce package," but it's phrased as if setNonce() already exists in lib/csp.ts. The function must first be imported from get-nonce, then called during initialization.

📝 Proposed rewording for clarity
 **Phase 1 -- Wire `get-nonce` bridge** (low effort, high impact):
-Add a `setNonce()` call from the `get-nonce` package in `web/src/lib/csp.ts` so that
-`react-style-singleton`'s `<style>` tags receive the CSP nonce. Combined with the existing
-`MotionConfig nonce` wrapper, this makes all `<style>` element injection nonce-capable.
+Import `setNonce` from the `get-nonce` package in `web/src/lib/csp.ts` and call it with the
+result of `getCspNonce()` during module initialization so that `react-style-singleton`'s
+`<style>` tags receive the CSP nonce. Combined with the existing `MotionConfig nonce` wrapper,
+this makes all `<style>` element injection nonce-capable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/security.md` around lines 174 - 177, Import the setNonce function from
the get-nonce package and call it during your CSP initialization so
react-style-singleton's <style> tags receive the nonce; specifically, add an
import for setNonce (from "get-nonce") and invoke setNonce(...) during the
initialization sequence where the MotionConfig nonce wrapper is applied so all
injected <style> elements become nonce-capable.

Comment on lines +23 to +134
class EmbeddingFineTuneConfig(BaseModel):
    """Optional domain-specific embedding fine-tuning configuration.

    Note: checkpoint lookup is not yet implemented in the Mem0
    adapter -- this config prepares the data model for the
    fine-tuning pipeline.

    When checkpoint lookup is implemented, the memory backend will
    look for a fine-tuned model checkpoint at ``checkpoint_path``
    during initialization. If found, the fine-tuned model will
    override the base ``Mem0EmbedderConfig.model``. If not found,
    the base model will be used with a logged warning.

    Fine-tuning itself runs offline via a separate pipeline, not
    during backend initialization. See
    ``docs/reference/embedding-evaluation.md`` for the full
    pipeline design.

    Attributes:
        enabled: Whether fine-tuning checkpoint lookup is active.
        checkpoint_path: Path to the fine-tuned model checkpoint.
            Required when ``enabled`` is ``True``.
        base_model: Identifier of the base model that was fine-tuned.
            Required when ``enabled`` is ``True``.
        training_data_dir: Directory containing training data for
            the offline fine-tuning pipeline. Not required when
            ``enabled`` is ``True`` -- only consumed by the
            training step, not by checkpoint lookup.
    """

    model_config = ConfigDict(frozen=True, allow_inf_nan=False)

    enabled: bool = Field(
        default=False,
        description="Whether fine-tuning checkpoint lookup is active",
    )
    checkpoint_path: NotBlankStr | None = Field(
        default=None,
        description="Path to the fine-tuned model checkpoint",
    )
    base_model: NotBlankStr | None = Field(
        default=None,
        description="Identifier of the base model that was fine-tuned",
    )
    training_data_dir: NotBlankStr | None = Field(
        default=None,
        description=(
            "Directory containing training data for the fine-tuning pipeline"
        ),
    )

    @model_validator(mode="after")
    def _validate_required_when_enabled(self) -> Self:
        """Require checkpoint_path and base_model when fine-tuning is enabled."""
        if self.enabled and self.checkpoint_path is None:
            msg = "checkpoint_path must be set when fine-tuning is enabled"
            logger.warning(
                MEMORY_BACKEND_CONFIG_INVALID,
                model="EmbeddingFineTuneConfig",
                field="checkpoint_path",
                enabled=self.enabled,
                reason=msg,
            )
            raise ValueError(msg)
        if self.enabled and self.base_model is None:
            msg = "base_model must be set when fine-tuning is enabled"
            logger.warning(
                MEMORY_BACKEND_CONFIG_INVALID,
                model="EmbeddingFineTuneConfig",
                field="base_model",
                enabled=self.enabled,
                reason=msg,
            )
            raise ValueError(msg)
        return self

    @model_validator(mode="after")
    def _reject_path_traversal(self) -> Self:
        """Reject parent-directory traversal and Windows paths.

        Consistent with ``Mem0BackendConfig._reject_traversal``.
        """
        for field_name in ("checkpoint_path", "training_data_dir"):
            val = getattr(self, field_name)
            if val is None:
                continue
            parts = PureWindowsPath(val).parts + PurePosixPath(val).parts
            if ".." in parts:
                msg = f"{field_name} must not contain parent-directory traversal (..)"
                logger.warning(
                    MEMORY_BACKEND_CONFIG_INVALID,
                    model="EmbeddingFineTuneConfig",
                    field=field_name,
                    value=val,
                    reason=msg,
                )
                raise ValueError(msg)
            if "\\" in val or (
                len(val) >= 2 and val[1] == ":"  # noqa: PLR2004
            ):
                msg = (
                    f"{field_name} must be a POSIX path (no backslashes "
                    "or drive letters) -- targets Linux containers"
                )
                logger.warning(
                    MEMORY_BACKEND_CONFIG_INVALID,
                    model="EmbeddingFineTuneConfig",
                    field=field_name,
                    value=val,
                    reason=msg,
                )
                raise ValueError(msg)
        return self


🧹 Nitpick | 🔵 Trivial

Well-structured config with comprehensive validation.

The EmbeddingFineTuneConfig class follows project conventions: frozen model, allow_inf_nan=False, NotBlankStr for path fields, and thorough validation. The docstring clearly documents the stub nature and planned behavior.

One observation: the path-traversal validation logic (lines 97-133) duplicates Mem0BackendConfig._reject_traversal (lines 215-255). Consider extracting a shared helper to reduce duplication.

♻️ Optional: Extract shared path validation helper
def _validate_posix_path(value: str, field_name: str, model_name: str) -> None:
    """Reject parent-directory traversal and Windows paths.
    
    Raises:
        ValueError: If path contains '..' traversal or Windows-style syntax.
    """
    parts = PureWindowsPath(value).parts + PurePosixPath(value).parts
    if ".." in parts:
        msg = f"{field_name} must not contain parent-directory traversal (..)"
        logger.warning(
            MEMORY_BACKEND_CONFIG_INVALID,
            model=model_name,
            field=field_name,
            value=value,
            reason=msg,
        )
        raise ValueError(msg)
    if "\\" in value or (len(value) >= 2 and value[1] == ":"):
        msg = (
            f"{field_name} must be a POSIX path (no backslashes "
            "or drive letters) -- targets Linux containers"
        )
        logger.warning(
            MEMORY_BACKEND_CONFIG_INVALID,
            model=model_name,
            field=field_name,
            value=value,
            reason=msg,
        )
        raise ValueError(msg)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/memory/backends/mem0/config.py` around lines 23 - 134, The
path-traversal validation in EmbeddingFineTuneConfig._reject_path_traversal
duplicates logic in Mem0BackendConfig._reject_traversal; extract a shared helper
(e.g., _validate_posix_path(value: str, field_name: str, model_name: str)) that
performs the PureWindowsPath/PurePosixPath parts check, parent-dir ("..") check,
and Windows-path checks and emits the same MEMORY_BACKEND_CONFIG_INVALID
warnings, then replace the loop body in
EmbeddingFineTuneConfig._reject_path_traversal and the body of
Mem0BackendConfig._reject_traversal to call this helper for each relevant field
so both validators reuse the same logic.

Comment on lines +11 to +12
3. Wire setNonce() from get-nonce in lib/csp.ts so react-style-singleton
applies nonces to its injected <style> tags (used by Dialog, Popover)

⚠️ Potential issue | 🔴 Critical

Reference to non-existent setNonce() function.

Step 3 instructs wiring setNonce() from get-nonce in lib/csp.ts, but web/src/lib/csp.ts does not define or export a setNonce() function—it only exports getCspNonce(). The get-nonce package does export setNonce(), but it must be imported and re-exported (or called directly) in lib/csp.ts before this checklist step can be executed.

🔧 Proposed fix: Import and export setNonce from get-nonce

To make this checklist accurate, web/src/lib/csp.ts needs to import and export setNonce from the get-nonce package:

import { setNonce } from 'get-nonce'

export { setNonce }

Then wire it in the module init or in App.tsx:

import { getCspNonce, setNonce } from '@/lib/csp'

const nonce = getCspNonce()
if (nonce) setNonce(nonce)

Based on learnings: Context snippet 1 shows web/src/lib/csp.ts only exports getCspNonce() and does not define or export setNonce(). The checklist references a function that does not exist in the specified file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/index.html` around lines 11 - 12, The checklist references a non-existent
setNonce; fix by importing setNonce from the get-nonce package into the csp
module and re-exporting it alongside getCspNonce so callers can access it (i.e.,
export setNonce from the csp module). Then, in your app initialization (e.g.,
App.tsx), call setNonce(getCspNonce()) or only call setNonce if getCspNonce()
returns a value to wire nonces into the react-style-singleton injector.
Reference symbols: setNonce, getCspNonce, get-nonce, App.tsx.

@Aureliolo Aureliolo merged commit 43dfab3 into main Apr 1, 2026
34 checks passed
@Aureliolo Aureliolo deleted the research/v060-gates branch April 1, 2026 14:33
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview April 1, 2026 14:33 — with GitHub Actions Inactive
Aureliolo added a commit that referenced this pull request Apr 1, 2026
🤖 I have created a release *beep* *boop*
---


## [0.5.5](v0.5.4...v0.5.5) (2026-04-01)


### Features

* add workflow configs to builtin templates ([#963](#963)) ([b7fe6e3](b7fe6e3))
* implement Kanban board and Agile sprints workflow types ([#960](#960)) ([f511e1d](f511e1d))
* personality preset support in template YAML schema ([#959](#959)) ([97ca81e](97ca81e))


### Documentation

* LMEB embedding evaluation + CSP accepted risk ([#695](#695), [#925](#925)) ([#962](#962)) ([43dfab3](43dfab3))


### CI/CD

* bump wrangler from 4.78.0 to 4.79.0 in /.github in the all group across 1 directory ([#955](#955)) ([18b4cb1](18b4cb1))


### Maintenance

* bump mypy from 1.19.1 to 1.20.0 in the all group across 1 directory ([#956](#956)) ([29cc419](29cc419))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Development

Successfully merging this pull request may close these issues.

  • research: CSP unsafe-inline removal blocked by Radix UI
  • research: LMEB-guided embedding model selection + domain fine-tuning for org memory

2 participants