feat: two-stage safety classifier and cross-provider uncertainty check for approval gates#1090
Walkthrough
Adds a two-stage LLM-backed safety classifier with information stripping, a cross-provider uncertainty checker, and an in-memory denial tracker to the approval-gate pipeline. Introduces Pydantic configs for safety and uncertainty, new observability event constants, public exports, and new modules.
🚥 Pre-merge checks | ✅ 5 passed
Code Review
This pull request implements a two-stage safety classifier and a cross-provider uncertainty checker for security escalations. The safety classifier removes sensitive data and uses an LLM to identify suspicious or blocked actions, while the uncertainty checker evaluates response agreement across multiple providers to detect hallucinations. Feedback identifies several critical issues, including incorrect Python 3 exception handling syntax in multiple files, potential malformed XML generation due to improper string truncation, and dead code resulting from a misunderstanding of the smoothed IDF formula.
        except MemoryError, RecursionError:
            raise
The except clause lists multiple exception types without parentheses. That form is only valid on Python 3.14 and later (PEP 758), where except MemoryError, RecursionError: catches both types. Python 3 releases before 3.14 reject it with a SyntaxError, and Python 2 read it as except MemoryError as RecursionError:, which catches only MemoryError and binds the exception object to the name RecursionError. The parenthesized form behaves the same on every Python 3 version.
Suggested change:
-        except MemoryError, RecursionError:
-            raise
+        except (MemoryError, RecursionError):
+            raise
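For reference, the parenthesized form can be exercised in isolation. The sketch below is not taken from the PR: `classify_with_failsafe`, `raise_recursion`, and `raise_value_error` are hypothetical names, and the `"suspicious"` return value only mirrors the fail-safe behavior described in the review summary. It shows a tuple-style except clause re-raising both system errors ahead of a broad `except Exception` handler.

```python
def classify_with_failsafe(risky):
    """Run `risky`; re-raise system errors, fail safe on anything else."""
    try:
        return risky()
    except (MemoryError, RecursionError):
        # Both types are matched by the tuple and propagate unchanged.
        raise
    except Exception:
        # Ordinary failures degrade to a conservative verdict.
        return "suspicious"


def raise_recursion():
    raise RecursionError("maximum recursion depth exceeded")


def raise_value_error():
    raise ValueError("bad provider response")


print(classify_with_failsafe(lambda: "safe"))
print(classify_with_failsafe(raise_value_error))
try:
    classify_with_failsafe(raise_recursion)
except RecursionError:
    print("RecursionError propagated")
```

The order of the handlers matters: both MemoryError and RecursionError are subclasses of Exception, so the narrow clause has to come first.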
        except MemoryError, RecursionError:
            raise
The except clause lists multiple exception types without parentheses. except E1, E2: is accepted only on Python 3.14 and later (PEP 758), where it catches both types; earlier Python 3 releases raise a SyntaxError, and Python 2 read it as except E1 as E2:, which catches only the first type and binds it to a variable named after the second. The parenthesized form is valid on every Python 3 version.
Suggested change:
        except (MemoryError, RecursionError):
            raise
        safe_desc = html.escape(stripped_description)
        user_content = (
            "<action>\n"
            f" <tool>{safe_tool}</tool>\n"
            f" <type>{safe_type}</type>\n"
            f" <risk_level>{safe_risk}</risk_level>\n"
            f" <description>{safe_desc}</description>\n"
            "</action>"
        )

        max_chars = self._config.max_input_tokens * 4
        if len(user_content) > max_chars:
            user_content = user_content[:max_chars] + "\n... [truncated]"

        return [
            ChatMessage(role=MessageRole.SYSTEM, content=_SYSTEM_PROMPT),
            ChatMessage(role=MessageRole.USER, content=user_content),
        ]
Truncating the entire user_content string after it has been constructed can result in malformed XML if the truncation occurs in the middle of a tag or before the closing tags (e.g., cutting off </description></action>). It is safer to truncate the individual description field specifically before interpolating it into the XML structure.
Suggested change:
        safe_desc = html.escape(stripped_description)
        # Truncate description if it exceeds the token budget (approx 4 chars/token)
        max_desc_chars = self._config.max_input_tokens * 4
        if len(safe_desc) > max_desc_chars:
            safe_desc = safe_desc[:max_desc_chars] + "... [truncated]"
        user_content = (
            "<action>\n"
            f" <tool>{safe_tool}</tool>\n"
            f" <type>{safe_type}</type>\n"
            f" <risk_level>{safe_risk}</risk_level>\n"
            f" <description>{safe_desc}</description>\n"
            "</action>"
        )
        return [
            ChatMessage(role=MessageRole.SYSTEM, content=_SYSTEM_PROMPT),
            ChatMessage(role=MessageRole.USER, content=user_content),
        ]
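One further edge case is worth a look: slicing text that has already been escaped can cut an entity such as `&amp;` in half, which would again produce malformed XML. The sketch below is a variation on the suggestion, not the PR's code: it truncates the raw description first and escapes afterwards. `build_action_xml` and its parameters are hypothetical names, and only two of the four fields are shown.

```python
import html
import xml.etree.ElementTree as ET


def build_action_xml(tool: str, description: str, max_desc_chars: int) -> str:
    """Truncate the raw description first, then escape and interpolate."""
    marker = "... [truncated]"
    if len(description) > max_desc_chars:
        # Slice unescaped text so no entity can be split in two.
        description = description[:max_desc_chars] + marker
    safe_tool = html.escape(tool)
    safe_desc = html.escape(description)
    return (
        "<action>\n"
        f"  <tool>{safe_tool}</tool>\n"
        f"  <description>{safe_desc}</description>\n"
        "</action>"
    )


xml_text = build_action_xml("shell", "rm -rf a && b " * 50, max_desc_chars=20)
root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
print(root.find("description").text)
```

The trade-off is that the budget now applies to raw characters, so the escaped string can end up somewhat longer than `max_desc_chars`.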
src/synthorg/security/uncertainty.py
Outdated
    # Build TF-IDF vectors. When all documents contain the same
    # terms, IDF is zero for every term (log(N/N) = 0) and all
    # vectors are empty -- this means the documents are identical
    # (or near-identical), so return 1.0.
    tfidf_vecs: list[dict[str, float]] = []
    for tf in tf_docs:
        vec = {word: tf[word] * idf[word] for word in tf if idf[word] > 0}
        tfidf_vecs.append(vec)

    if all(len(v) == 0 for v in tfidf_vecs):
        return 1.0
The comment regarding IDF being zero is incorrect for the smoothed formula used, log(1 + N/(1 + df)), which is always greater than zero because N/(1 + df) is positive whenever there is at least one document. As a result, the check if all(len(v) == 0 for v in tfidf_vecs) is dead code: tfidf_vecs will always contain entries if the responses contain words (which is ensured by the vocab check at line 150).
Suggested change:
-    # Build TF-IDF vectors. When all documents contain the same
-    # terms, IDF is zero for every term (log(N/N) = 0) and all
-    # vectors are empty -- this means the documents are identical
-    # (or near-identical), so return 1.0.
-    tfidf_vecs: list[dict[str, float]] = []
-    for tf in tf_docs:
-        vec = {word: tf[word] * idf[word] for word in tf if idf[word] > 0}
-        tfidf_vecs.append(vec)
-    if all(len(v) == 0 for v in tfidf_vecs):
-        return 1.0
+    # Build TF-IDF vectors.
+    tfidf_vecs: list[dict[str, float]] = []
+    for tf in tf_docs:
+        vec = {word: tf[word] * idf[word] for word in tf}
+        tfidf_vecs.append(vec)
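The positivity argument can be checked numerically. The following is a minimal sketch, assuming raw term counts as TF and the smoothed IDF quoted in the comment; `smoothed_idf`, `tfidf_vector`, and `cosine` are hypothetical helpers, not the PR's functions, and the PR's actual TF normalization is not shown in this thread.

```python
import math
from collections import Counter


def smoothed_idf(docs: list[list[str]]) -> dict[str, float]:
    """IDF as log(1 + N / (1 + df)); positive for any N >= 1."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    return {w: math.log(1 + n_docs / (1 + df)) for w, df in doc_freq.items()}


def tfidf_vector(doc: list[str], idf: dict[str, float]) -> dict[str, float]:
    """Raw-count TF times IDF (assumption: the PR may normalize TF)."""
    return {w: count * idf[w] for w, count in Counter(doc).items()}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm = math.sqrt(sum(x * x for x in a.values())) * math.sqrt(
        sum(x * x for x in b.values())
    )
    return dot / norm if norm else 0.0


# Two identical documents: every term has df == N.
docs = [["deploy", "to", "prod"], ["deploy", "to", "prod"]]
idf = smoothed_idf(docs)
vecs = [tfidf_vector(doc, idf) for doc in docs]
print(idf)
print(cosine(vecs[0], vecs[1]))
```

Even in the df == N case every weight stays positive, so the vectors are non-empty and identical documents reach a similarity of 1.0 through the ordinary cosine path rather than the early return.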
Dependency Review
✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.
Snapshot Warnings
Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice.
Scanned Files
None
Pull request overview
Adds two optional pre-review safety layers to the approval-gate escalation flow: (1) a two-stage safety classifier with information stripping + LLM classification, and (2) a cross-provider uncertainty check that computes agreement/confidence across multiple LLM providers. Results are propagated via ApprovalItem.metadata and surfaced in the approvals UI.
Changes:
- Introduces SafetyClassifier (PII/secret stripping + LLM classification) and UncertaintyChecker (Jaccard + TF‑IDF similarity → confidence score).
- Integrates both signals into SecOpsService._handle_escalation() and wires them via the security factory/config.
- Updates approvals UI to display suspicious/low-confidence signals; adds unit/integration tests for the new components.
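To make the Jaccard half of the similarity signal concrete, here is a minimal sketch of keyword overlap between provider responses. The tokenizer, the function names, and the choice to average over pairs are all assumptions for illustration; how the PR tokenizes, combines this with TF-IDF, and maps the result to a confidence score is not shown in this thread.

```python
import itertools
import re


def keyword_set(text: str) -> set[str]:
    """Lowercased alphanumeric tokens (hypothetical tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Size of the intersection over size of the union of the keyword sets."""
    set_a, set_b = keyword_set(a), keyword_set(b)
    if not set_a and not set_b:
        return 1.0  # two empty responses agree trivially
    return len(set_a & set_b) / len(set_a | set_b)


def mean_pairwise_jaccard(responses: list[str]) -> float:
    """Average Jaccard similarity over all unordered response pairs."""
    pairs = list(itertools.combinations(responses, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


print(mean_pairwise_jaccard([
    "Delete the staging database",
    "delete the staging database",
    "Restart the web server",
]))
```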
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| web/src/pages/approvals/ApprovalDetailDrawer.tsx | Shows safety banner, confidence %, and uses stripped description when available. |
| web/src/pages/approvals/ApprovalCard.tsx | Adds “Suspicious” badge and “Low confidence” indicator on approval cards. |
| src/synthorg/security/safety_classifier.py | New two-stage stripping + LLM tool-call based safety classifier. |
| src/synthorg/security/uncertainty.py | New cross-provider uncertainty checker with pure-Python TF‑IDF + keyword overlap. |
| src/synthorg/security/service.py | Runs classifier/checker during escalation and attaches results to approval metadata. |
| src/synthorg/security/config.py | Adds SafetyClassifierConfig + UncertaintyCheckConfig and extends SecurityConfig. |
| src/synthorg/engine/_security_factory.py | Wires optional LLM-based components when provider infra is available. |
| src/synthorg/observability/events/security.py | Adds event constants for classifier/uncertainty observability. |
| src/synthorg/security/__init__.py | Exposes new configs/classes from the security package. |
| tests/unit/security/test_information_stripper.py | Tests stripping behavior for credentials/PII/IDs/emails. |
| tests/unit/security/test_safety_classifier.py | Tests classifier tool-call parsing, stripping-before-LLM, and fail-safe behavior. |
| tests/unit/security/test_uncertainty_checker.py | Tests similarity metrics, skip conditions, and error handling. |
| tests/unit/security/test_service_safety_integration.py | Verifies SecOpsService metadata enrichment + blocked auto-reject behavior. |
| tests/unit/engine/test_security_factory_safety.py | Verifies factory wiring for classifier/checker based on config + deps. |
    async def classify(
        self,
        description: str,
        action_type: str,
        tool_name: str,
        risk_level: ApprovalRiskLevel,
    ) -> SafetyClassifierResult:
        """Run two-stage safety classification.

        Args:
            description: The escalation reason / action description.
            action_type: The action type (``category:action``).
            tool_name: The tool being invoked.
            risk_level: The risk level from the security verdict.

        Returns:
            A ``SafetyClassifierResult`` with the classification,
            stripped description, and reason.
        """
        start = time.monotonic()
        logger.info(
            SECURITY_SAFETY_CLASSIFY_START,
            tool_name=tool_name,
            action_type=action_type,
            risk_level=risk_level.value,
        )

        # Stage 1: information stripping.
        stripped = self._stripper.strip(description)

        # Stage 2: LLM classification.
        try:
            return await self._classify_via_llm(
                stripped,
                action_type,
                tool_name,
                risk_level,
                start,
            )
        except MemoryError, RecursionError:
            raise
        except Exception:
            duration_ms = (time.monotonic() - start) * 1000
            logger.exception(
                SECURITY_SAFETY_CLASSIFY_ERROR,
                tool_name=tool_name,
                action_type=action_type,
                duration_ms=duration_ms,
            )
            return SafetyClassifierResult(
                classification=SafetyClassification.SUSPICIOUS,
                stripped_description=stripped,
                reason="Safety classification failed (fail-safe: suspicious)",
                classification_duration_ms=duration_ms,
            )

    async def _classify_via_llm(
        self,
        stripped_description: str,
        action_type: str,
        tool_name: str,
        risk_level: ApprovalRiskLevel,
        start: float,
    ) -> SafetyClassifierResult:
        """Send stripped description to LLM for classification."""
        provider_name, driver = self._select_provider()
        if provider_name is None or driver is None:
            duration_ms = (time.monotonic() - start) * 1000
            return SafetyClassifierResult(
                classification=SafetyClassification.SUSPICIOUS,
                stripped_description=stripped_description,
                reason="No provider available for safety classification",
                classification_duration_ms=duration_ms,
            )

        model = self._select_model(provider_name)
        messages = self._build_messages(
            stripped_description,
            action_type,
            tool_name,
            risk_level,
        )

        response = await asyncio.wait_for(
            driver.complete(
                messages,
                model,
                tools=[_SAFETY_VERDICT_TOOL],
                config=CompletionConfig(
                    temperature=0.0,
                    max_tokens=256,
                ),
            ),
            timeout=self._config.timeout_seconds,
        )

        return self._parse_response(
            response,
            stripped_description,
            start,
        )

    def _select_provider(
        self,
    ) -> tuple[str | None, BaseCompletionProvider | None]:
        """Select a provider for safety classification.

        Prefers a cross-family provider. Falls back to the first
        available provider if no cross-family option exists.
        """
        available = self._registry.list_providers()
        if not available:
            return None, None

        # Try cross-family selection.
        for name in available:
            family = get_family(name, self._configs)
            cross = providers_excluding_family(family, self._configs)
            cross = tuple(p for p in cross if p in available)
            if cross:
                selected = cross[0]
                return selected, self._registry.get(selected)

        # Fallback: use first available.
        name = available[0]
        return name, self._registry.get(name)
SafetyClassifier claims cross-family provider selection, but _select_provider() doesn't know the evaluated agent's provider family (SecurityContext has agent_provider_name) and instead just picks any provider from a different family than some other provider. This can end up selecting the same family as the agent, undermining the intended cross-family safety check. Consider passing agent_provider_name (or agent_family) into classify()/ _select_provider() and excluding that family, mirroring LlmSecurityEvaluator’s selection logic; fall back with an explicit warning when no cross-family option exists.
    def __init__(
        self,
        *,
        provider_registry: ProviderRegistry,
        provider_configs: Mapping[str, ProviderConfig],
        model_resolver: ModelResolver,
        config: UncertaintyCheckConfig,
    ) -> None:
        self._registry = provider_registry
        self._configs = provider_configs
        self._resolver = model_resolver
        self._config = config
UncertaintyChecker stores provider_configs on self._configs but never uses it anywhere. This makes the API misleading and adds dead state. Either remove provider_configs from the constructor (and factory wiring) or use it (e.g., for provider filtering/family constraints/telemetry).
        async def _call_provider(candidate: ResolvedModel) -> str | None:
            """Call a single provider.

            Inside a TaskGroup, all exceptions must be caught to
            avoid ExceptionGroup propagation (even MemoryError /
            RecursionError -- re-raising them would wrap in an
            ExceptionGroup that escapes outer except clauses).
            """
            driver: BaseCompletionProvider = self._registry.get(
                candidate.provider_name,
            )
            try:
                response = await asyncio.wait_for(
                    driver.complete(
                        messages,
                        candidate.model_id,
                        config=config,
                    ),
                    timeout=self._config.timeout_seconds,
                )
            except Exception:
                logger.exception(
                    SECURITY_UNCERTAINTY_CHECK_ERROR,
                    provider=candidate.provider_name,
                    model=candidate.model_id,
                )
                return None
_collect_responses() catches all Exception, which includes MemoryError/RecursionError, and converts them into a logged provider failure. Swallowing these critical exceptions can leave the process in an undefined state while still returning a confidence score. Align with the rest of the security code by re-raising MemoryError/RecursionError (or handling ExceptionGroup explicitly after the TaskGroup) and only swallowing ordinary provider errors/timeouts.
src/synthorg/security/uncertainty.py
Outdated
    # Build TF-IDF vectors. When all documents contain the same
    # terms, IDF is zero for every term (log(N/N) = 0) and all
    # vectors are empty -- this means the documents are identical
    # (or near-identical), so return 1.0.
The TF-IDF comment block is internally inconsistent: it describes smoothed IDF to avoid shared-term IDF=0, but the next comment claims IDF becomes zero when all docs share terms (log(N/N)=0). With the smoothed formula used here, IDF is > 0 even when df==N. Please update/remove the misleading comment so future changes don’t rely on incorrect assumptions.
Suggested change:
-    # Build TF-IDF vectors. When all documents contain the same
-    # terms, IDF is zero for every term (log(N/N) = 0) and all
-    # vectors are empty -- this means the documents are identical
-    # (or near-identical), so return 1.0.
+    # Build TF-IDF vectors. With the smoothed IDF above, even terms
+    # shared by all documents retain a positive weight, so identical
+    # documents do not normally produce empty vectors. The empty-vector
+    # check below is kept as a defensive fallback for degenerate cases.
    # Build optional LLM-based services when provider infrastructure
    # is available.
    has_providers = provider_registry is not None and provider_configs is not None

    llm_evaluator = None
    if has_providers and cfg.llm_fallback.enabled:
        from synthorg.security.llm_evaluator import (  # noqa: PLC0415
            LlmSecurityEvaluator,
        )

        llm_evaluator = LlmSecurityEvaluator(
            provider_registry=provider_registry,  # type: ignore[arg-type]
            provider_configs=provider_configs,  # type: ignore[arg-type]
            config=cfg.llm_fallback,
        )

    safety_classifier = None
    if has_providers and cfg.safety_classifier.enabled:
        from synthorg.security.safety_classifier import (  # noqa: PLC0415
            SafetyClassifier,
        )

        safety_classifier = SafetyClassifier(
            provider_registry=provider_registry,  # type: ignore[arg-type]
            provider_configs=provider_configs,  # type: ignore[arg-type]
            config=cfg.safety_classifier,
        )

    uncertainty_checker = None
    if has_providers and model_resolver is not None and cfg.uncertainty_check.enabled:
        from synthorg.security.uncertainty import (  # noqa: PLC0415
When safety_classifier/uncertainty_check are enabled in config but provider infrastructure (provider_registry/provider_configs and/or model_resolver) is not passed, the features will silently be disabled (objects remain None). This makes it easy to misconfigure production. Consider logging a warning (or raising in strict mode) when cfg.safety_classifier.enabled or cfg.uncertainty_check.enabled but required dependencies are missing.
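A small pre-flight check along these lines could surface the misconfiguration. This is only a sketch: `FeatureFlags` and `disabled_feature_warnings` are hypothetical, the messages are illustrative, and in the project the returned strings would presumably be emitted through its structured logger rather than printed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureFlags:
    """Hypothetical stand-in for the two `enabled` config fields."""

    safety_classifier_enabled: bool
    uncertainty_check_enabled: bool


def disabled_feature_warnings(
    flags: FeatureFlags,
    *,
    has_providers: bool,
    has_model_resolver: bool,
) -> list[str]:
    """List features that are enabled in config but cannot be built."""
    warnings: list[str] = []
    if flags.safety_classifier_enabled and not has_providers:
        warnings.append(
            "safety_classifier is enabled but provider registry/configs are missing"
        )
    if flags.uncertainty_check_enabled and not (has_providers and has_model_resolver):
        warnings.append(
            "uncertainty_check is enabled but providers or model resolver are missing"
        )
    return warnings


flags = FeatureFlags(safety_classifier_enabled=True, uncertainty_check_enabled=True)
for message in disabled_feature_warnings(
    flags, has_providers=True, has_model_resolver=False
):
    print("WARNING:", message)
```

Returning the messages instead of logging directly keeps the check easy to unit test and leaves the warn-versus-raise decision (for example in a strict mode) to the caller.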
  const riskColor = getRiskLevelColor(approval.risk_level)
  const urgencyColor = getUrgencyColor(approval.urgency_level)
  const isPending = approval.status === 'pending'
  const isSuspicious = approval.metadata.safety_classification === 'suspicious'
UI only treats safety_classification === 'suspicious'. Backend can also emit 'blocked' when auto_reject_blocked is disabled, but the card will show no badge even though it's higher severity. Consider handling 'blocked' explicitly (e.g., danger-styled badge) and keeping 'suspicious' as warning-styled.
Suggested change:
-  const isSuspicious = approval.metadata.safety_classification === 'suspicious'
+  const safetyClassification = approval.metadata.safety_classification
+  const isBlocked = safetyClassification === 'blocked'
+  const isSuspicious = safetyClassification === 'suspicious'
+  const safetyBadgeLabel = isBlocked ? 'Blocked' : isSuspicious ? 'Suspicious' : null
+  const safetyBadgeClasses = isBlocked
+    ? 'border-red-200 bg-red-50 text-red-700'
+    : isSuspicious
+      ? 'border-amber-200 bg-amber-50 text-amber-700'
+      : null
      {/* Safety warning banner */}
      {approval.metadata.safety_classification === 'suspicious' && (
        <div className="flex items-center gap-2 rounded-lg border border-warning/30 bg-warning/10 px-3 py-2">
          <AlertTriangle className="size-4 text-warning shrink-0" aria-hidden="true" />
          <span className="text-sm text-warning">
            This action has been flagged as suspicious by the safety classifier.
          </span>
          <p className="mt-1 text-sm text-secondary">{approval.description}</p>
        </div>
      )}
Detail drawer banner only renders for safety_classification === 'suspicious'. If 'blocked' approvals are allowed (auto_reject_blocked=false), they won’t show a prominent warning despite being the most severe classification. Consider adding a separate banner style/message for 'blocked'.
        </div>
      )}

      {/* Description (with stripped view toggle when available) */}
The inline comment says “with stripped view toggle when available”, but DescriptionSection always shows stripped_description (or description) with no toggle UI. Either implement the toggle (raw vs stripped) or update the comment to match current behavior.
Suggested change:
-      {/* Description (with stripped view toggle when available) */}
+      {/* Description */}
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1090 +/- ##
==========================================
+ Coverage 89.50% 89.53% +0.02%
==========================================
Files 752 755 +3
Lines 44132 44576 +444
Branches 4427 4487 +60
==========================================
+ Hits 39501 39910 +409
- Misses 3841 3862 +21
- Partials 790 804 +14
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/src/pages/approvals/ApprovalCard.tsx`:
- Around line 31-34: The frontend currently hardcodes the 0.5 threshold when
computing showLowConfidence using approval.metadata.confidence_score; update the
logic to read a backend-provided threshold or flag instead. Change the check
that computes showLowConfidence (references: confidenceRaw, confidenceScore,
showLowConfidence, isSuspicious) to first look for either
approval.metadata.low_confidence_flagged (preferred boolean) or
approval.metadata.low_confidence_threshold (number) and use that value to
determine low confidence, falling back to 0.5 only if neither field exists;
ensure parsing of confidenceRaw and the threshold is robust (parseFloat and
Number.isNaN checks) before comparison.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 12510270-299e-4ffa-a685-bd6712e2b6c9
📒 Files selected for processing (14)
src/synthorg/engine/_security_factory.py
src/synthorg/observability/events/security.py
src/synthorg/security/__init__.py
src/synthorg/security/config.py
src/synthorg/security/safety_classifier.py
src/synthorg/security/service.py
src/synthorg/security/uncertainty.py
tests/unit/engine/test_security_factory_safety.py
tests/unit/security/test_information_stripper.py
tests/unit/security/test_safety_classifier.py
tests/unit/security/test_service_safety_integration.py
tests/unit/security/test_uncertainty_checker.py
web/src/pages/approvals/ApprovalCard.tsx
web/src/pages/approvals/ApprovalDetailDrawer.tsx
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Agent
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web
- GitHub Check: Build Sandbox
- GitHub Check: Build Backend
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (7)
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
web/src/**/*.{ts,tsx}: TypeScript files in web dashboard must reuse existing components from web/src/components/ui/ before creating new ones.
React dashboard must use TypeScript 6.0+, React 19, react-router, shadcn/ui, Base UI, Tailwind CSS 4, Zustand, @tanstack/react-query. No hardcoded styles -- use design tokens.
Linting and pre-commit checks must not be bypassed—ESLint web dashboard (zero warnings) is non-negotiable.
web/src/**/*.{ts,tsx}: Use Tailwind semantic classes (text-foreground, bg-card, text-accent, text-success, bg-danger, etc.) or CSS variables (var(--so-*)) for colors; NEVER hardcode hex values in .tsx/.ts files
Use font-sans or font-mono (Geist tokens) for typography; NEVER set fontFamily directly in .tsx/.ts files
Use density-aware tokens (p-card, gap-section-gap, gap-grid-gap) or standard Tailwind spacing; NEVER hardcode pixel values for layout spacing in components
Use token variables (var(--so-shadow-card-hover), border-border, border-bright) for shadows and borders; NEVER hardcode values in .tsx/.ts files
Use @/lib/motion presets for Framer Motion transition durations; NEVER hardcode transition durations
CSS side-effect imports in TypeScript 6 require type declarations -- add /// <reference types="vite/client" /> at the top of files with CSS imports
Files:
web/src/pages/approvals/ApprovalCard.tsx
web/src/pages/approvals/ApprovalDetailDrawer.tsx
web/src/**/*.{ts,tsx,css}
📄 CodeRabbit inference engine (CLAUDE.md)
web/src/**/*.{ts,tsx,css}: Never hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions in web dashboard code -- use design tokens and @/lib/motion presets.
Web dashboard scripts/check_web_design_system.py enforces component reuse and design token usage on every Edit/Write to web/src/.
Files:
web/src/pages/approvals/ApprovalCard.tsx
web/src/pages/approvals/ApprovalDetailDrawer.tsx
web/src/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx,js,jsx}: Always use createLogger from @/lib/logger -- never bare console.warn/console.error/console.debug in application code
Logger variable name must always be const log (e.g. const log = createLogger('module-name'))
Pass dynamic/untrusted values as separate arguments to logger methods (not interpolated into the message string) so they go through sanitizeArg
Attacker-controlled fields inside structured objects must be wrapped in sanitizeForLog() before embedding in log calls
Files:
web/src/pages/approvals/ApprovalCard.tsx
web/src/pages/approvals/ApprovalDetailDrawer.tsx
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do NOT use from __future__ import annotations in Python code -- Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses) in Python 3.14 code -- ruff enforces this.
All public functions in Python must have type hints. Use mypy strict mode for type-checking.
Use Google-style docstrings on all public classes and functions in Python. This is enforced by ruff D rules.
Use NotBlankStr (from core.types) for all identifier/name fields in Python—including optional (NotBlankStr | None) and tuple variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in Python—prefer structured concurrency over bare create_task.
Python line length must be 88 characters (enforced by ruff).
Python functions must be under 50 lines, files under 800 lines.
Handle errors explicitly in Python, never silently swallow exceptions.
Always use variable name logger (not _logger or log) for the logging instance in Python.
Lint Python with uv run ruff check src/ tests/. Auto-fix with uv run ruff check src/ tests/ --fix. Format with uv run ruff format src/ tests/.
Type-check Python with uv run mypy src/ tests/ (strict mode).
Files:
src/synthorg/engine/_security_factory.py
tests/unit/security/test_safety_classifier.py
tests/unit/security/test_information_stripper.py
src/synthorg/security/config.py
tests/unit/security/test_uncertainty_checker.py
tests/unit/security/test_service_safety_integration.py
src/synthorg/security/service.py
src/synthorg/security/uncertainty.py
src/synthorg/security/safety_classifier.py
tests/unit/engine/test_security_factory_safety.py
src/synthorg/observability/events/security.py
src/synthorg/security/__init__.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Every Python module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging, logging.getLogger(), or print() in Python application code. Exceptions: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, observability/http_handler.py may use stdlib logging and print.
Use event name constants from synthorg.observability.events domain modules (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Use structured logging with logger.info(EVENT, key=value) syntax in Python -- never logger.info('msg %s', val)
All error paths in Python must log at WARNING or ERROR with context before raising.
All state transitions in Python must log at INFO.
DEBUG logging is for object creation, internal flow, and entry/exit of key functions in Python.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned Python code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small aliases. Exceptions: Operations design page, .claude/ skill files, third-party imports, provider presets (user-facing runtime data).
Library reference in docs/api/ is auto-generated via mkdocstrings + Griffe (AST-based).
Files:
src/synthorg/engine/_security_factory.py
src/synthorg/security/config.py
src/synthorg/security/service.py
src/synthorg/security/uncertainty.py
src/synthorg/security/safety_classifier.py
src/synthorg/observability/events/security.py
src/synthorg/security/__init__.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/engine/_security_factory.py
src/synthorg/security/config.py
src/synthorg/security/service.py
src/synthorg/security/uncertainty.py
src/synthorg/security/safety_classifier.py
src/synthorg/observability/events/security.py
src/synthorg/security/__init__.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Test markers in Python: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Python test coverage must be 80% minimum (enforced in CI).
Use @pytest.mark.parametrize for testing similar cases in Python.
Use test-provider, test-small-001, etc. vendor-agnostic names in Python tests.
Use Hypothesis property-based testing in Python with @given + @settings decorators. Configure profiles: ci (deterministic, max_examples=10, derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Control via HYPOTHESIS_PROFILE env var.
When Hypothesis finds a failure in Python tests, fix the underlying bug and add an @example(...) decorator to permanently cover the case in CI.
Never skip, dismiss, or ignore flaky Python tests -- fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep(). For tasks that must block indefinitely, use asyncio.Event().wait() instead of asyncio.sleep(large_number).
Run Python unit tests with uv run python -m pytest tests/ -m unit -n 8.
Run Python integration tests with uv run python -m pytest tests/ -m integration -n 8.
Run Python e2e tests with uv run python -m pytest tests/ -m e2e -n 8.
Files:
tests/unit/security/test_safety_classifier.py
tests/unit/security/test_information_stripper.py
tests/unit/security/test_uncertainty_checker.py
tests/unit/security/test_service_safety_integration.py
tests/unit/engine/test_security_factory_safety.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
tests/unit/security/test_safety_classifier.py, tests/unit/security/test_information_stripper.py, tests/unit/security/test_uncertainty_checker.py, tests/unit/security/test_service_safety_integration.py, tests/unit/engine/test_security_factory_safety.py
🧠 Learnings (37)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
📚 Learning: 2026-03-27T12:44:29.466Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-27T12:44:29.466Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from `web/src/components/ui/` (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup/StaggerItem) before creating new ones
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx, web/src/pages/approvals/ApprovalDetailDrawer.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from web/src/components/ui/ (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast/ToastContainer, Skeleton variants, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup/StaggerItem, Drawer, form fields, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor) before creating new components
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx, web/src/pages/approvals/ApprovalDetailDrawer.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.{tsx,ts} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup, Drawer, InputField, SelectField, SliderField, ToggleField, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor, TagInput, MetadataGrid, ProjectStatusBadge, ContentTypeBadge)
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx, web/src/pages/approvals/ApprovalDetailDrawer.tsx
📚 Learning: 2026-03-31T14:31:11.894Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:31:11.894Z
Learning: Applies to web/src/**/*.{ts,tsx} : Use React 19, TypeScript 6.0+, and design system tokens from shadcn/ui + Tailwind CSS 4 + Radix UI in web dashboard
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.{ts,tsx} : Do NOT build card-with-header layouts from scratch; use `<SectionCard>`
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/src/**/*.{ts,tsx} : Web dashboard shadows/borders: use token variables (var(--so-shadow-card-hover), border-border, border-bright); never hardcode shadow or border values
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to web/src/**/*.{ts,tsx} : React dashboard must use TypeScript 6.0+, React 19, react-router, shadcn/ui, Base UI, Tailwind CSS 4, Zustand, tanstack/react-query. No hardcoded styles—use design tokens.
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.{ts,tsx} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones; refer to design system inventory (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, etc.)
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.{tsx,ts} : Use `color?` and `animated?` props for Sparkline component (inline SVG trend lines)
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/components/ui/**/*.tsx : Use design tokens exclusively in new components -- no hardcoded colors, fonts, or spacing
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Applies to web/src/components/ui/**/*.{ts,tsx} : Import `cn` from `@/lib/utils` for conditional class merging in component files
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Applies to web/src/components/ui/**/*.{ts,tsx} : For Base UI primitives, import from specific subpaths (e.g. `import { Dialog } from 'base-ui/react/dialog'`) and use the component's `render` prop for polymorphism
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-27T22:32:26.927Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-27T22:32:26.927Z
Learning: Applies to web/src/components/ui/*.{tsx,ts} : For new shared React components: place in web/src/components/ui/ with kebab-case filename, create .stories.tsx with all states, export props as TypeScript interface, use design tokens exclusively
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-01T20:43:51.878Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T20:43:51.878Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from `web/src/components/ui/` before creating new ones. Never hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions -- use design tokens and `@/lib/motion` presets.
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Do NOT recreate status dots inline -- use `<StatusBadge>` from `@/components/ui/status-badge`
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Applied to files:
src/synthorg/engine/_security_factory.py, tests/unit/security/test_safety_classifier.py, tests/unit/security/test_information_stripper.py, src/synthorg/security/config.py, tests/unit/security/test_service_safety_integration.py, src/synthorg/security/service.py, src/synthorg/security/uncertainty.py, src/synthorg/security/safety_classifier.py, tests/unit/engine/test_security_factory_safety.py, src/synthorg/observability/events/security.py, src/synthorg/security/__init__.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.
Applied to files:
src/synthorg/engine/_security_factory.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Applied to files:
src/synthorg/engine/_security_factory.py, tests/unit/security/test_safety_classifier.py, tests/unit/security/test_information_stripper.py, src/synthorg/security/config.py, tests/unit/security/test_service_safety_integration.py, src/synthorg/security/service.py, src/synthorg/security/uncertainty.py, src/synthorg/security/safety_classifier.py, tests/unit/engine/test_security_factory_safety.py, src/synthorg/observability/events/security.py, src/synthorg/security/__init__.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Applied to files:
src/synthorg/engine/_security_factory.py, tests/unit/security/test_service_safety_integration.py, src/synthorg/security/service.py, src/synthorg/security/safety_classifier.py, src/synthorg/security/__init__.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-04-01T09:37:49.451Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:37:49.451Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/api/**/*.py : Authentication uses JWT + API key. Approval gate integration for high-risk operations.
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain modules (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module
Applied to files:
src/synthorg/observability/events/security.py
🔇 Additional comments (53)
src/synthorg/security/config.py (3)
211-238: LGTM! Well-structured safety classifier configuration. The config follows established patterns: frozen Pydantic model, `NotBlankStr | None` for optional model identifier, appropriate field constraints (`gt=0.0` for timeout, `gt=0` for tokens), and conservative defaults (`enabled=False`, `auto_reject_blocked=True`).
240-268: LGTM! Clean uncertainty check configuration. Field constraints are well-designed: `min_providers >= 2` ensures meaningful comparison, `low_confidence_threshold` is bounded to `[0.0, 1.0]`, and the docstring clearly explains the skip behavior when insufficient providers are available.
314-319: LGTM! Integration into SecurityConfig follows existing patterns. Using `Field(default_factory=...)` maintains consistency with `rule_engine` and `llm_fallback` fields. Both features default to disabled, preserving backward compatibility.
web/src/pages/approvals/ApprovalCard.tsx (1)
139-159: LGTM! Clean badge implementation with proper accessibility. The conditional badges use semantic color tokens (`text-warning`, `border-warning/30`, `bg-warning/10`) and include appropriate `aria-label` attributes. The `AlertTriangle` icon is correctly marked `aria-hidden="true"`.
web/src/pages/approvals/ApprovalDetailDrawer.tsx (3)
258-266: LGTM! Accessible safety warning banner. The banner uses semantic color tokens and includes the `AlertTriangle` icon with `aria-hidden="true"`. The warning text clearly explains the classification status.
296-305: LGTM! Metadata fields properly integrated. The confidence percentage and safety classification are displayed using the existing `MetaField` pattern. The `Shield` icon choice is appropriate for security-related metadata.
425-437: LGTM! Clean helper component for description display. `DescriptionSection` correctly prioritizes `stripped_description` when available, falling back to the original description. The component is appropriately scoped as a local helper.
src/synthorg/engine/_security_factory.py (2)
102-143: LGTM! Conditional instantiation logic is correct. The `has_providers` guard ensures LLM-based services are only created when provider infrastructure is available. The conditional imports avoid loading heavy dependencies when features are disabled.
48-57: Critical: `safety_classifier` and `uncertainty_checker` features are inoperative in AgentEngine's security interceptor path. The factory function accepts `provider_registry`, `provider_configs`, and `model_resolver` to enable LLM-based security features. However:
- AgentEngine stores only `provider_registry` but lacks `provider_configs` and `model_resolver` entirely; these are not parameters in `AgentEngine.__init__`.
- The only production call site in `_make_security_interceptor()` (lines 1605-1610) passes only `security_config`, `audit_log`, `approval_store`, and `effective_autonomy`.
- Result: `safety_classifier` and `uncertainty_checker` will always be `None` when the security interceptor is created through AgentEngine.
This means these core security features are non-functional in the main agent execution path. Either `AgentEngine` needs to accept and pass `provider_configs` and `model_resolver`, or the factory signature should reflect that these features are only available when explicitly wired by external callers.
tests/unit/security/test_safety_classifier.py (3)
1-50: LGTM! Well-structured test helpers. The helpers follow test conventions: vendor-agnostic names (`test-small-001`, `provider-a`), clear separation of concerns between tool call creation and completion creation.
90-139: LGTM! Comprehensive classification result tests. Tests cover all three classification outcomes (SAFE, SUSPICIOUS, BLOCKED) with appropriate assertions on both classification and reason fields.
213-234: LGTM! Proper timeout test implementation. Using `asyncio.Event().wait()` instead of `asyncio.sleep()` for the blocking call is the correct approach per project guidelines, ensuring the test doesn't have timing-related flakiness.
tests/unit/security/test_uncertainty_checker.py (3)
85-137: LGTM! Thorough coverage of similarity functions. The tests cover important edge cases: identical texts, completely different texts, partial overlap, single response, and empty responses. Using `pytest.approx` for float comparisons is appropriate.
187-209: LGTM! Skip condition tests verify graceful degradation. Tests correctly verify that insufficient providers (single provider or no `model_ref`) result in `confidence_score=1.0` and `provider_count=0`, matching the expected skip behavior documented in the config.
271-301: LGTM! Model validation tests ensure data integrity. Tests verify the `UncertaintyResult` model is frozen (immutable) and rejects out-of-bounds confidence scores (negative and >1.0), aligning with the Pydantic field constraints.
tests/unit/security/test_information_stripper.py (3)
17-30: LGTM! Clean text preservation tests. Tests verify that non-sensitive content, empty strings, and normal log output pass through unchanged. This prevents over-stripping that could obscure legitimate information.
36-68: LGTM! Comprehensive credential stripping coverage. Tests cover major credential types: AWS access keys, SSH private keys, bearer tokens, generic API keys, and GitHub PATs. Each test verifies both removal of the sensitive value and presence of the `[CREDENTIAL]` placeholder.
166-188: LGTM! Mixed content tests verify stripping precision. The multi-pattern test ensures all sensitive categories are independently detected and replaced. The context preservation test (`/src/config.py` intact) confirms the stripper doesn't over-aggressively remove legitimate content.
src/synthorg/security/__init__.py (3)
34-40: LGTM! Config imports properly extended. New configuration types (`SafetyClassifierConfig`, `UncertaintyCheckConfig`) are imported alongside existing config types, maintaining the established grouping pattern.
71-81: LGTM! New safety and uncertainty types exported. The new public API additions (`SafetyClassifier`, `SafetyClassification`, `SafetyClassifierResult`, `InformationStripper`, `UncertaintyChecker`, `UncertaintyResult`) are properly imported and will be available for external consumers.
93-127: LGTM! `__all__` maintains alphabetical ordering. New exports are inserted in the correct alphabetical positions, maintaining consistency with the existing list organization.
tests/unit/security/test_service_safety_integration.py (6)
1-83: LGTM: Well-structured test helpers and imports. The helper functions provide clean abstractions for building test fixtures. Using real `RuleEngine` and `RiskClassifier` with empty rules ensures the service behaves as in production while isolating the safety/uncertainty features under test.
88-117: LGTM: Blocked auto-reject test correctly verifies DENY verdict and no store call. The test properly validates that `BLOCKED` classification with default `auto_reject_blocked=True` returns `DENY` and bypasses the approval store.
118-188: LGTM: Metadata enrichment and fail-safe tests are thorough. Tests correctly verify:
- `SUSPICIOUS` classification populates all expected metadata keys
- `SAFE` classification proceeds normally with metadata
- Classifier errors still create approval items (fail-safe behavior)
193-240: LGTM: Uncertainty checker integration tests verify metadata and error handling. The tests correctly validate that confidence scores and similarity metrics are stored as strings in metadata, and that checker failures still allow approval item creation.
245-331: LGTM: Combined feature tests verify correct interaction. Key behaviors verified:
- Both features contribute metadata simultaneously
- `BLOCKED` classification skips uncertainty check (optimization)
- No features configured leaves metadata without safety/uncertainty keys
333-370: LGTM: `auto_reject_blocked=False` test validates configurable behavior. Correctly verifies that with `auto_reject_blocked=False`, a `BLOCKED` classification creates an approval item with `safety_classification="blocked"` instead of auto-rejecting.
src/synthorg/security/safety_classifier.py (10)
1-59: LGTM: Module structure and imports are well-organized. The module docstring clearly documents the two-stage design and invariants. Reusing existing credential and PII patterns ensures consistency with the rest of the security subsystem.
60-100: LGTM: Pattern definitions and control character handling are comprehensive. The control character regex covers ASCII control codes, Unicode bidi overrides (U+202A-202E, U+2066-2069), zero-width characters, and BOM, properly mitigating text injection attacks in LLM-returned reasons.
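For readers without the diff open, the control-character handling described here could look roughly like the sketch below. The pattern, the function name `sanitize_reason`, and the 200-character cap are illustrative assumptions, not the PR's actual code; in particular, keeping tab, LF, and CR is a choice made for this sketch.

```python
import re

# Hypothetical pattern covering the character classes named in the review.
_CONTROL_CHARS = re.compile(
    "["
    r"\x00-\x08\x0b\x0c\x0e-\x1f\x7f"  # ASCII control codes (tab, LF, CR kept)
    r"\u202a-\u202e"  # bidi embedding/override characters
    r"\u2066-\u2069"  # bidi isolate characters
    r"\u200b-\u200d"  # zero-width space, non-joiner, joiner
    r"\ufeff"  # byte order mark
    "]"
)


def sanitize_reason(reason: str, max_length: int = 200) -> str:
    """Drop control characters from an LLM-returned reason, then cap its length."""
    return _CONTROL_CHARS.sub("", reason)[:max_length]
```

Removing the characters before truncating means the length cap applies to visible text only.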
102-138: LGTM: Enum and result model are well-defined. `SafetyClassification` uses `StrEnum` for JSON-friendly serialization. `SafetyClassifierResult` is frozen with `allow_inf_nan=False` and proper field constraints.
140-191: LGTM: InformationStripper applies patterns in correct order. Credentials first (most specific), then PII, UUIDs, internal IDs, and finally emails (to avoid double-matching email-like credential patterns). The structured logging with original/stripped lengths aids debugging without leaking sensitive data.
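A minimal sketch of ordered stripping, assuming deliberately simplified patterns. Only the `[CREDENTIAL]` placeholder appears in the PR's tests; the other placeholders, the regexes, and the name `strip_sensitive` are hypothetical, and the internal-ID stage is omitted because its format is not shown here.

```python
import re

# Hypothetical stages, applied in the order the review describes:
# credentials, PII, UUIDs, then emails last.
_STAGES: tuple[tuple[re.Pattern[str], str], ...] = (
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[CREDENTIAL]"),  # AWS-style access key
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[PII]"),  # SSN-like number
    (
        re.compile(
            r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
            re.IGNORECASE,
        ),
        "[UUID]",
    ),
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[EMAIL]"),
)


def strip_sensitive(text: str) -> str:
    """Apply each stage in order; later stages see earlier replacements."""
    for pattern, placeholder in _STAGES:
        text = pattern.sub(placeholder, text)
    return text
```

Because each stage runs on the output of the previous one, a credential that happens to contain an `@` is already replaced before the email pattern runs.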
193-248: LGTM: Tool schema and system prompt are well-crafted. The prompt explicitly warns the LLM that field values are sanitized and instructs it not to follow embedded instructions, which is a good defense against prompt injection. The tool schema enforces required fields and `additionalProperties: false`.
250-333: LGTM: SafetyClassifier.classify has proper fail-safe behavior. Errors default to `SUSPICIOUS` classification (neither auto-reject nor mark safe), matching the documented fail-safe invariant. The `except MemoryError, RecursionError: raise` pattern correctly propagates system errors.
334-379: LGTM: LLM classification with timeout and cross-family provider selection. The `asyncio.wait_for` with `timeout_seconds` from config ensures bounded execution. Provider selection prefers cross-family for independence.
380-416: Provider fallback returns `(name, None)` when registry.get fails. If `self._registry.get(selected)` or `self._registry.get(name)` returns `None` (e.g., provider registered but driver unavailable), the code returns a tuple with `(name, None)`. The caller at line 344 checks `driver is None` so this is handled, but note that `provider_name` would be non-None in the fallback path.
417-451: LGTM: Message building with XML-escaping and input stripping. All interpolated values go through both `InformationStripper.strip()` and `html.escape()`, providing defense-in-depth against injection. Truncation with `... [truncated]` marker preserves context for the LLM.
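One ordering detail is worth illustrating: truncating after escaping can cut an entity such as `&amp;` in half. The sketch below truncates first and escapes second; `render_field`, the tag names, and the 500-character limit are assumptions for illustration, and the value is assumed to have been stripped already.

```python
import html

_TRUNCATION_MARKER = "... [truncated]"


def render_field(tag: str, value: str, max_chars: int = 500) -> str:
    """Truncate the raw value, then escape it, so no entity is split."""
    if len(value) > max_chars:
        value = value[:max_chars] + _TRUNCATION_MARKER
    return f"<{tag}>{html.escape(value)}</{tag}>"
```

For example, `render_field("reason", "a < b & c")` yields `<reason>a &lt; b &amp; c</reason>`.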
452-517: LGTM: Response parsing with validation and sanitization. Invalid classifications default to `SUSPICIOUS`. The reason is stripped of control characters and truncated to `_MAX_REASON_LENGTH`. Missing tool calls also default to `SUSPICIOUS` with appropriate logging.
tests/unit/engine/test_security_factory_safety.py (2)
1-93: LGTM: SafetyClassifier wiring tests cover all conditional paths. Tests correctly verify:
- Wired when `enabled=True` and providers available
- Not wired when `enabled=False`
- Not wired when providers missing (even with `enabled=True`)
The mock setup is minimal and sufficient.
95-147: LGTM: UncertaintyChecker wiring tests validate resolver dependency. Tests correctly verify that the uncertainty checker requires all three: `enabled=True`, provider infrastructure, AND `model_resolver`. The explicit comment `# model_resolver not provided` at line 143 documents the test intent clearly.
src/synthorg/security/uncertainty.py (7)
1-51: LGTM: Module structure and design invariants are clearly documented. The docstring explicitly states the stdlib-only constraint for TF-IDF, skip behaviors, and timeout handling. Word tokenization regex is simple and effective.
53-82: LGTM: UncertaintyResult model with proper validation. Fields have appropriate range constraints (`ge=0.0, le=1.0` for scores), optional similarity metrics default to `None`, and the model is frozen with `allow_inf_nan=False`.
84-124: LGTM: Keyword overlap (Jaccard similarity) handles edge cases correctly. Returns 1.0 for single response, empty responses, or all-empty word sets. Pairwise comparison is O(n²) but acceptable given the small number of providers (typically 2-3).
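A stdlib-only sketch of pairwise Jaccard overlap with the edge cases listed above. The function name, the `\w+` tokenizer, and averaging the pairwise scores are assumptions; the PR's implementation may aggregate differently.

```python
import re
from itertools import combinations

_WORD_RE = re.compile(r"\w+")


def keyword_overlap(responses: list[str]) -> float:
    """Mean pairwise Jaccard similarity of the responses' word sets."""
    if len(responses) < 2:
        return 1.0  # nothing to disagree with
    word_sets = [set(_WORD_RE.findall(text.lower())) for text in responses]
    if not any(word_sets):
        return 1.0  # every response is empty
    scores = []
    for a, b in combinations(word_sets, 2):  # O(n^2) pairs
        union = a | b
        scores.append(len(a & b) / len(union) if union else 1.0)
    return sum(scores) / len(scores)
```

With two responses `"a b"` and `"b c"`, the intersection is `{b}` and the union is `{a, b, c}`, giving 1/3.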
126-199: LGTM: TF-IDF cosine similarity with smoothed IDF. The smoothed IDF formula `log(1 + N / (1 + df))` correctly handles the 2-document case where standard IDF would zero out shared terms. Edge cases (empty vocab, all-zero vectors) return 1.0.
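To make the smoothing concrete: with N = 2 and a term present in both documents (df = 2), `log(1 + 2/3)` is about 0.51, so shared terms keep a positive weight. The sketch below is a stdlib-only illustration under assumptions (raw term counts as TF, mean of pairwise cosines, and 0.0 for an empty response paired with a non-empty one); it is not the PR's code.

```python
import math
import re
from collections import Counter
from itertools import combinations

_WORD_RE = re.compile(r"\w+")


def tfidf_cosine_similarity(responses: list[str]) -> float:
    """Mean pairwise cosine similarity over smoothed TF-IDF vectors."""
    if len(responses) < 2:
        return 1.0
    docs = [Counter(_WORD_RE.findall(text.lower())) for text in responses]
    vocab = sorted(set().union(*docs))
    if not vocab:
        return 1.0  # empty vocabulary: nothing to compare
    n = len(docs)
    # Smoothed IDF stays strictly positive, even when df == n.
    idf = {
        term: math.log(1 + n / (1 + sum(1 for doc in docs if term in doc)))
        for term in vocab
    }
    vectors = [[doc[term] * idf[term] for term in vocab] for doc in docs]

    similarities = []
    for u, v in combinations(vectors, 2):
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_u == 0.0 and norm_v == 0.0:
            similarities.append(1.0)  # two empty responses agree trivially
        elif norm_u == 0.0 or norm_v == 0.0:
            similarities.append(0.0)  # assumption: empty vs non-empty disagrees
        else:
            dot = sum(x * y for x, y in zip(u, v))
            similarities.append(dot / (norm_u * norm_v))
    return sum(similarities) / len(similarities)
```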
201-277: LGTM: UncertaintyChecker initialization and skip paths. Skip paths are well-documented:
- No `model_ref` configured → confidence 1.0
- Insufficient providers → confidence 1.0
Both log appropriate events with context.
278-337: LGTM: Main check logic with low-confidence warning. The confidence formula weights embedding similarity (0.6) higher than keyword overlap (0.4), which is reasonable given TF-IDF captures more semantic information. Low-confidence results trigger a WARNING log with full context.
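As a worked example of the weighting, a similarity of 0.9 and an overlap of 0.5 combine to 0.6 × 0.9 + 0.4 × 0.5 = 0.74. The helper names below are hypothetical, and both the clamping and the strict `<` comparison against the threshold are assumptions.

```python
def combine_confidence(embedding_similarity: float, keyword_overlap: float) -> float:
    """Weighted blend using the 0.6 / 0.4 split described in the review."""
    score = 0.6 * embedding_similarity + 0.4 * keyword_overlap
    return min(1.0, max(0.0, score))  # keep the result inside [0.0, 1.0]


def is_low_confidence(score: float, low_confidence_threshold: float) -> bool:
    """Assumed comparison: strictly below the configured threshold."""
    return score < low_confidence_threshold
```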
338-405: Broad exception catch in TaskGroup subtask is intentional but should be documented. The comment at lines 359-362 explains that catching all exceptions (including `MemoryError`/`RecursionError`) is necessary to prevent `ExceptionGroup` propagation from `TaskGroup`. This is a correct and intentional pattern for structured concurrency with fail-safe semantics, though it differs from the rest of the codebase.
src/synthorg/observability/events/security.py (1)
54-68: LGTM: New event constants follow established naming conventions. Safety classifier and uncertainty check events are properly namespaced under `security.*` and use the same `Final[str]` pattern as existing events. The event names are descriptive and align with their usage in the new security modules.
src/synthorg/security/service.py (6)
22-38: LGTM: New observability event imports for safety and uncertainty features.The imports align with the newly added constants in
events/security.pyand are used appropriately in the error/blocked logging paths.
63-73: LGTM: TYPE_CHECKING imports for new classifier and checker types.Proper use of
TYPE_CHECKINGblock to avoid circular imports while providing type hints.
101-166: LGTM: Constructor extended with optional safety/uncertainty dependencies.Docstrings clearly document the purpose and behavior of each new parameter. The dependencies are stored as instance attributes for use in
_handle_escalation.
529-640: LGTM:_handle_escalationintegrates safety classification and uncertainty check.The implementation correctly:
- Runs safety classification first (can short-circuit on BLOCKED)
- Uses
stripped_descriptionfor theApprovalItem.descriptionand uncertainty check (prevents PII broadcast)- Auto-rejects BLOCKED before creating approval item
- Falls back to
verdict.reasonwhen no stripped description is availableThe mutable
metadatadict pattern is clean and efficient.
642-686: LGTM:_run_safety_classifierwith proper fail-safe behavior.Returns
Trueonly for BLOCKED + auto_reject_blocked. On error, returnsFalsewithout modifying metadata—escalation proceeds to human review. Thetype: ignore[union-attr]is justified since the caller guards withif self._safety_classifier is not None.
687-711: LGTM:_run_uncertainty_checkwith error metadata sentinel.On error, sets
metadata["uncertainty_check_error"] = "true"to indicate the check failed—useful for frontend display and debugging. The optional fields are only added when present, keeping metadata clean.
| const isSuspicious = approval.metadata.safety_classification === 'suspicious' | ||
| const confidenceRaw = approval.metadata.confidence_score | ||
| const confidenceScore = confidenceRaw ? parseFloat(confidenceRaw) : NaN | ||
| const showLowConfidence = !Number.isNaN(confidenceScore) && confidenceScore < 0.5 |
🧹 Nitpick | 🔵 Trivial
Consider exposing the threshold from backend metadata.
The 0.5 threshold is hardcoded here but also defined in UncertaintyCheckConfig.low_confidence_threshold on the backend. If the server threshold changes, the frontend will display inconsistent "low confidence" indicators.
Consider having the backend include low_confidence_flagged: boolean in metadata, or include the threshold value for the frontend to compare against.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/src/pages/approvals/ApprovalCard.tsx` around lines 31 - 34, The frontend
currently hardcodes the 0.5 threshold when computing showLowConfidence using
approval.metadata.confidence_score; update the logic to read a backend-provided
threshold or flag instead. Change the check that computes showLowConfidence
(references: confidenceRaw, confidenceScore, showLowConfidence, isSuspicious) to
first look for either approval.metadata.low_confidence_flagged (preferred
boolean) or approval.metadata.low_confidence_threshold (number) and use that
value to determine low confidence, falling back to 0.5 only if neither field
exists; ensure parsing of confidenceRaw and the threshold is robust (parseFloat
and Number.isNaN checks) before comparison.
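On the backend side, the suggestion could be sketched roughly as follows. The helper name `uncertainty_metadata` and the exact string formatting are assumptions for illustration; the only facts taken from the review are that metadata values are strings and that the threshold is configured as `low_confidence_threshold`.

```python
def uncertainty_metadata(
    confidence_score: float,
    low_confidence_threshold: float,
) -> dict[str, str]:
    """Build string-valued metadata so the frontend need not hardcode 0.5."""
    flagged = confidence_score < low_confidence_threshold
    return {
        "confidence_score": f"{confidence_score:.4f}",
        # Hypothetical keys mirroring the names proposed in the comment above.
        "low_confidence_threshold": str(low_confidence_threshold),
        "low_confidence_flagged": "true" if flagged else "false",
    }
```

The frontend could then read `low_confidence_flagged` directly and fall back to its own comparison only when the field is absent.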
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/security/safety_classifier.py`:
- Around line 180-218: The strip method in InformationStripper never applies the
_CONTROL_CHAR_RE pattern so control characters (bidi/zero-width) leak through;
update the strip(self, text: str) implementation to run
_CONTROL_CHAR_RE.sub(...) on the working string near the start of the function
(Stage 1) — e.g., immediately after the empty-check and before applying
_CREDENTIAL_STRIP_PATTERNS — so control characters are removed from result
before other substitutions (reference symbols: strip, _CONTROL_CHAR_RE,
_CREDENTIAL_STRIP_PATTERNS, result).
In `@src/synthorg/security/service.py`:
- Around line 582-593: The current fan-out uses verdict.reason when metadata
lacks "stripped_description", which can leak raw escalation text; change the
block around self._uncertainty_checker to never pass verdict.reason into
_run_uncertainty_check — instead only use metadata["stripped_description"] (or
another explicitly redacted field) and if that's missing, skip the uncertainty
check (or pass a fixed redacted placeholder) and log/record the absence; update
references to check_text, metadata, and the call to _run_uncertainty_check
accordingly so no unredacted verdict.reason is sent to providers.
- Around line 728-759: The ESCALATE branch in _handle_blocked_denial currently
returns True which causes auto-rejection and prevents human escalation; instead
call the escalation handler and return its result. Replace the final "return
True" inside the if action == DenialAction.ESCALATE block with a call to
self._handle_escalation(agent_id, tool_name, reason, metadata) (preserving the
existing metadata entries and log), so that DenialAction.ESCALATE triggers
_handle_escalation and its boolean outcome rather than forcibly returning True.
- Around line 778-792: The code is incorrectly treating a "skipped" or
single-provider check as a genuine confidence score 1.0; update the block that
sets metadata to (1) include result.provider_count in metadata (e.g.,
metadata["provider_count"] = str(result.provider_count)) and (2) only serialize
result.confidence_score into metadata["confidence_score"] when
result.provider_count > 1 and result.skipped is false (or when a provided flag
indicates the check was actually performed); otherwise set
metadata["confidence_score"] to a sentinel like "skipped" (or omit it) so
callers can distinguish true high-confidence from skipped/single-provider cases.
Ensure you reference UncertaintyChecker.check()'s returned fields
(result.confidence_score, result.provider_count, result.skipped) when making
this change.
In `@src/synthorg/security/uncertainty.py`:
- Around line 243-270: The list returned by self._resolver.resolve_all(...)
contains model variants, not unique providers, so deduplicate candidates by
candidate.provider_name before doing the min-provider check and before creating
fan-out tasks; update the logic in the Uncertainty check (around the call to
resolve_all and before computing len(candidates) and before calling
self._collect_responses) to build a unique_providers list/map keyed by
provider_name, use its length for the min_providers comparison and pass only the
unique provider entries to _collect_responses (and apply the same deduplication
at the other occurrence noted near lines 382-383); reference symbols:
resolve_all, self._config.min_providers, UncertaintyResult creation, logger
entries SECURITY_UNCERTAINTY_CHECK_SKIPPED/START, and _collect_responses.
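The provider deduplication requested in the last comment can be sketched as below. `Candidate` is a hypothetical stand-in for whatever `resolve_all` returns, and `unique_by_provider` is an illustrative helper name; only the `provider_name` attribute comes from the review text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for a resolved model candidate."""

    provider_name: str
    model_id: str


def unique_by_provider(candidates: list[Candidate]) -> list[Candidate]:
    """Keep the first candidate seen for each provider, preserving order."""
    first_seen: dict[str, Candidate] = {}
    for candidate in candidates:
        first_seen.setdefault(candidate.provider_name, candidate)
    return list(first_seen.values())


candidates = [
    Candidate("example-provider", "example-large-001"),
    Candidate("example-provider", "example-small-001"),
    Candidate("test-provider", "test-small-001"),
]
providers = unique_by_provider(candidates)
```

Under this sketch, the `min_providers` comparison would use `len(providers)` rather than `len(candidates)`, so two model variants from one provider no longer count as cross-provider agreement.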
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 88fc3908-9205-42ce-b5a5-6a2a9ed77374
📒 Files selected for processing (14)
- src/synthorg/engine/_security_factory.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/observability/events/security.py
- src/synthorg/security/__init__.py
- src/synthorg/security/config.py
- src/synthorg/security/denial_tracker.py
- src/synthorg/security/safety_classifier.py
- src/synthorg/security/service.py
- src/synthorg/security/uncertainty.py
- tests/unit/security/test_denial_tracker.py
- tests/unit/security/test_service_safety_integration.py
- tests/unit/security/test_uncertainty_checker.py
- web/src/pages/approvals/ApprovalCard.tsx
- web/src/pages/approvals/ApprovalDetailDrawer.tsx
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Web
- GitHub Check: Build Backend
- GitHub Check: Build Sandbox
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (7)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`**/*.py`: Do NOT use `from __future__ import annotations` in Python code; Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: `except A, B:` (no parentheses) in Python 3.14 code; ruff enforces this.
All public functions in Python must have type hints. Use mypy strict mode for type-checking.
Use Google-style docstrings on all public classes and functions in Python. This is enforced by ruff D rules.
Use NotBlankStr (from core.types) for all identifier/name fields in Python—including optional (NotBlankStr | None) and tuple variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in Python—prefer structured concurrency over bare create_task.
Python line length must be 88 characters (enforced by ruff).
Python functions must be under 50 lines, files under 800 lines.
Handle errors explicitly in Python, never silently swallow exceptions.
Always use variable name `logger` (not `_logger` or `log`) for the logging instance in Python.
Lint Python with `uv run ruff check src/ tests/`. Auto-fix with `uv run ruff check src/ tests/ --fix`. Format with `uv run ruff format src/ tests/`.
Type-check Python with `uv run mypy src/ tests/` (strict mode).
Files:
- src/synthorg/security/__init__.py
- tests/unit/security/test_denial_tracker.py
- tests/unit/security/test_uncertainty_checker.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/security/denial_tracker.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/uncertainty.py
- src/synthorg/security/config.py
- tests/unit/security/test_service_safety_integration.py
- src/synthorg/observability/events/security.py
- src/synthorg/security/service.py
- src/synthorg/security/safety_classifier.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`src/synthorg/**/*.py`: Every Python module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
Never use `import logging`, `logging.getLogger()`, or `print()` in Python application code. Exceptions: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, observability/http_handler.py may use stdlib logging and print.
Use event name constants from synthorg.observability.events domain modules (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Use structured logging with `logger.info(EVENT, key=value)` syntax in Python; never `logger.info('msg %s', val)`
All error paths in Python must log at WARNING or ERROR with context before raising.
All state transitions in Python must log at INFO.
DEBUG logging is for object creation, internal flow, and entry/exit of key functions in Python.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned Python code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small aliases. Exceptions: Operations design page, .claude/ skill files, third-party imports, provider presets (user-facing runtime data).
Library reference in docs/api/ is auto-generated via mkdocstrings + Griffe (AST-based).
Files:
- src/synthorg/security/__init__.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/security/denial_tracker.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/uncertainty.py
- src/synthorg/security/config.py
- src/synthorg/observability/events/security.py
- src/synthorg/security/service.py
- src/synthorg/security/safety_classifier.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
- src/synthorg/security/__init__.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/security/denial_tracker.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/uncertainty.py
- src/synthorg/security/config.py
- src/synthorg/observability/events/security.py
- src/synthorg/security/service.py
- src/synthorg/security/safety_classifier.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`tests/**/*.py`: Test markers in Python: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Python test coverage must be 80% minimum (enforced in CI).
Use `@pytest.mark.parametrize` for testing similar cases in Python.
Use test-provider, test-small-001, etc. vendor-agnostic names in Python tests.
Use Hypothesis property-based testing in Python with `@given` + `@settings` decorators. Configure profiles: ci (deterministic, max_examples=10, derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Control via HYPOTHESIS_PROFILE env var.
When Hypothesis finds a failure in Python tests, fix the underlying bug and add an `@example(...)` decorator to permanently cover the case in CI.
Never skip, dismiss, or ignore flaky Python tests—fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep(). For tasks that must block indefinitely, use asyncio.Event().wait() instead of asyncio.sleep(large_number).
Run Python unit tests with `uv run python -m pytest tests/ -m unit -n 8`.
Run Python integration tests with `uv run python -m pytest tests/ -m integration -n 8`.
Run Python e2e tests with `uv run python -m pytest tests/ -m e2e -n 8`.
Files:
- tests/unit/security/test_denial_tracker.py
- tests/unit/security/test_uncertainty_checker.py
- tests/unit/security/test_service_safety_integration.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
- tests/unit/security/test_denial_tracker.py
- tests/unit/security/test_uncertainty_checker.py
- tests/unit/security/test_service_safety_integration.py
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
web/src/**/*.{ts,tsx}: TypeScript files in web dashboard must reuse existing components from web/src/components/ui/ before creating new ones.
React dashboard must use TypeScript 6.0+, React 19, react-router, shadcn/ui, Base UI, Tailwind CSS 4, Zustand, `@tanstack/react-query`. No hardcoded styles; use design tokens.
Linting and pre-commit checks must not be bypassed—ESLint web dashboard (zero warnings) is non-negotiable.
`web/src/**/*.{ts,tsx}`: Use Tailwind semantic classes (`text-foreground`, `bg-card`, `text-accent`, `text-success`, `bg-danger`, etc.) or CSS variables (`var(--so-*)`) for colors; NEVER hardcode hex values in `.tsx`/`.ts` files
Use `font-sans` or `font-mono` (Geist tokens) for typography; NEVER set `fontFamily` directly in `.tsx`/`.ts` files
Use density-aware tokens (`p-card`, `gap-section-gap`, `gap-grid-gap`) or standard Tailwind spacing; NEVER hardcode pixel values for layout spacing in components
Use token variables (`var(--so-shadow-card-hover)`, `border-border`, `border-bright`) for shadows and borders; NEVER hardcode values in `.tsx`/`.ts` files
Use `@/lib/motion` presets for Framer Motion transition durations; NEVER hardcode transition durations
CSS side-effect imports in TypeScript 6 require type declarations -- add `/// <reference types="vite/client" />` at the top of files with CSS imports
Files:
- web/src/pages/approvals/ApprovalCard.tsx
- web/src/pages/approvals/ApprovalDetailDrawer.tsx
web/src/**/*.{ts,tsx,css}
📄 CodeRabbit inference engine (CLAUDE.md)
`web/src/**/*.{ts,tsx,css}`: Never hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions in web dashboard code; use design tokens and `@/lib/motion` presets.
Web dashboard scripts/check_web_design_system.py enforces component reuse and design token usage on every Edit/Write to web/src/.
Files:
- web/src/pages/approvals/ApprovalCard.tsx
- web/src/pages/approvals/ApprovalDetailDrawer.tsx
web/src/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
`web/src/**/*.{ts,tsx,js,jsx}`: Always use `createLogger` from `@/lib/logger` -- never bare `console.warn`/`console.error`/`console.debug` in application code
Logger variable name must always be `const log` (e.g. `const log = createLogger('module-name')`)
Pass dynamic/untrusted values as separate arguments to logger methods (not interpolated into the message string) so they go through `sanitizeArg`
Attacker-controlled fields inside structured objects must be wrapped in `sanitizeForLog()` before embedding in log calls
Files:
- web/src/pages/approvals/ApprovalCard.tsx
- web/src/pages/approvals/ApprovalDetailDrawer.tsx
🧠 Learnings (61)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/api/**/*.py : Authentication uses JWT + API key. Approval gate integration for high-risk operations.
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Applied to files:
- src/synthorg/security/__init__.py
- tests/unit/security/test_denial_tracker.py
- src/synthorg/security/denial_tracker.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/uncertainty.py
- src/synthorg/security/config.py
- tests/unit/security/test_service_safety_integration.py
- src/synthorg/observability/events/security.py
- src/synthorg/security/service.py
- src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Applied to files:
- src/synthorg/security/__init__.py
- tests/unit/security/test_denial_tracker.py
- src/synthorg/security/denial_tracker.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/uncertainty.py
- src/synthorg/security/config.py
- tests/unit/security/test_service_safety_integration.py
- src/synthorg/observability/events/security.py
- src/synthorg/security/service.py
- src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Applied to files:
- src/synthorg/security/__init__.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/config.py
- tests/unit/security/test_service_safety_integration.py
- src/synthorg/security/service.py
- src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-27T12:44:29.466Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-27T12:44:29.466Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from `web/src/components/ui/` (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup/StaggerItem) before creating new ones
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from web/src/components/ui/ (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast/ToastContainer, Skeleton variants, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup/StaggerItem, Drawer, form fields, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor) before creating new components
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.{tsx,ts} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup, Drawer, InputField, SelectField, SliderField, ToggleField, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor, TagInput, MetadataGrid, ProjectStatusBadge, ContentTypeBadge)
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to web/src/**/*.{ts,tsx} : React dashboard must use TypeScript 6.0+, React 19, react-router, shadcn/ui, Base UI, Tailwind CSS 4, Zustand, tanstack/react-query. No hardcoded styles—use design tokens.
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.{ts,tsx} : Do NOT build card-with-header layouts from scratch; use `<SectionCard>`
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-31T14:31:11.894Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:31:11.894Z
Learning: Applies to web/src/**/*.{ts,tsx} : Use React 19, TypeScript 6.0+, and design system tokens from shadcn/ui + Tailwind CSS 4 + Radix UI in web dashboard
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/src/**/*.{ts,tsx} : Web dashboard shadows/borders: use token variables (var(--so-shadow-card-hover), border-border, border-bright); never hardcode shadow or border values
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.{ts,tsx} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones; refer to design system inventory (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, etc.)
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.{tsx,ts} : Use `color?` and `animated?` props for Sparkline component (inline SVG trend lines)
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-27T22:32:26.927Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-27T22:32:26.927Z
Learning: Applies to web/src/**/*.{tsx,ts} : Use token variables (var(--so-shadow-card-hover), border-border, border-bright) for shadows/borders; never hardcode values
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Applies to web/src/components/ui/**/*.{ts,tsx} : Import `cn` from `@/lib/utils` for conditional class merging in component files
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Applies to web/src/components/ui/**/*.{ts,tsx} : For Base UI primitives, import from specific subpaths (e.g. `import { Dialog } from 'base-ui/react/dialog'`) and use the component's `render` prop for polymorphism
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-27T22:32:26.927Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-27T22:32:26.927Z
Learning: Applies to web/src/components/ui/*.{tsx,ts} : For new shared React components: place in web/src/components/ui/ with kebab-case filename, create .stories.tsx with all states, export props as TypeScript interface, use design tokens exclusively
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Do NOT recreate status dots inline -- use `<StatusBadge>` from `@/components/ui/status-badge`
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
tests/unit/security/test_uncertainty_checker.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-04-01T09:09:43.948Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:09:43.948Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement in non-Pydantic internal collections (registries, BaseTool)
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Applied to files:
src/synthorg/engine/agent_engine.py
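The deep-copy plus read-only-wrapper pattern named in these learnings can be sketched as follows. `ToolRegistry` and its `tools` property are hypothetical names for illustration, not the project's actual registry class.

```python
import copy
from collections.abc import Mapping
from types import MappingProxyType


class ToolRegistry:
    """Hypothetical registry that snapshots its input at construction."""

    def __init__(self, tools: dict[str, dict[str, str]]) -> None:
        # deepcopy detaches the registry from later caller-side mutation;
        # MappingProxyType rejects item assignment on the top-level mapping.
        self._tools: Mapping[str, dict[str, str]] = MappingProxyType(
            copy.deepcopy(tools)
        )

    @property
    def tools(self) -> Mapping[str, dict[str, str]]:
        return self._tools


source = {"search": {"risk": "low"}}
registry = ToolRegistry(source)
source["search"]["risk"] = "high"  # does not reach the registry's copy
```

Note that `MappingProxyType` only guards the outer mapping; nested values stay mutable unless they are wrapped or frozen as well.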
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.
Applied to files:
- src/synthorg/engine/agent_engine.py
- src/synthorg/engine/_security_factory.py
📚 Learning: 2026-03-31T21:07:37.469Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.469Z
Learning: Applies to src/synthorg/providers/**/*.py : Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-16T19:13:36.562Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:36.562Z
Learning: Applies to src/synthorg/providers/**/*.py : RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code — it's handled by the base class.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Never implement retry logic in provider subclasses or calling code — it is handled automatically by `BaseCompletionProvider` with `RetryConfig` and `RateLimiterConfig` per-provider
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-17T18:52:05.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T18:52:05.142Z
Learning: Applies to src/synthorg/providers/**/*.py : All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code — it's handled by the base class.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/**/*.py : All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/providers/**/*.py : All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically; never implement retry logic in driver subclasses or calling code
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-04-01T09:37:49.451Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:37:49.451Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to **/*.py : Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain modules (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) in Python 3.14 code—ruff enforces this.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to **/*.py : Handle errors explicitly in Python, never silently swallow exceptions.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.
Applied to files:
src/synthorg/security/service.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/api/**/*.py : Authentication uses JWT + API key. Approval gate integration for high-risk operations.
Applied to files:
src/synthorg/security/safety_classifier.py
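Several of the learnings above ask for `copy.deepcopy()` at construction plus `MappingProxyType` wrapping on non-Pydantic internal collections. A minimal sketch of that pattern, assuming a hypothetical `ToolRegistry` class (the name and the dict shape are illustrative and do not come from the repository):

```python
import copy
from types import MappingProxyType


class ToolRegistry:
    """Hypothetical registry used only to illustrate the pattern."""

    def __init__(self, tools: dict[str, dict[str, str]]) -> None:
        # Deep-copy first so later mutation of the caller's dict cannot
        # leak into the registry, then expose only a read-only view.
        self._tools = MappingProxyType(copy.deepcopy(tools))

    def get(self, name: str) -> dict[str, str]:
        return self._tools[name]


source = {"git": {"risk": "low"}}
registry = ToolRegistry(source)

# Mutating the caller's dict after construction has no effect on the copy.
source["git"]["risk"] = "high"

# Writing through the proxy is rejected with TypeError.
try:
    registry._tools["new"] = {}  # type: ignore[index]
except TypeError:
    write_blocked = True
else:
    write_blocked = False
```

Note that the proxy only guards the top-level mapping; nested values returned by `get()` are still ordinary mutable dicts unless they are frozen separately.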
🔇 Additional comments (12)
web/src/pages/approvals/ApprovalCard.tsx (2)
31-37: LGTM with minor note on threshold consistency.
The parsing logic is correct: `parseFloat` handles string metadata values, and the `NaN` checks are properly applied. The string comparison `low_confidence === 'true'` is type-safe since `metadata` is `Record<string, string>`.
The fallback check `confidenceScore < 0.5` provides defensive rendering when the backend's pre-computed `low_confidence` flag is absent, which is reasonable for backward compatibility.
142-174: LGTM!
The safety badges use semantic Tailwind classes (`text-danger`, `bg-danger/10`, `text-warning`, etc.) and include proper `aria-label` attributes for accessibility. The badge placement in the header row is consistent with the existing urgency badge pattern.
src/synthorg/observability/events/security.py (1)
54-77: LGTM!
The new event constants follow the established naming conventions and value patterns. The hierarchical grouping (safety_classify, uncertainty_check, denial, tier) is clear and consistent with the existing constants in this file.
src/synthorg/security/denial_tracker.py (1)
1-149: LGTM!
Well-structured implementation with clear documentation. The thread-safety claim is accurate for asyncio's single-threaded event loop model. Good use of `__slots__` for the internal counts class, appropriate logging levels (WARNING for escalation, INFO for recording, DEBUG for reset), and explicit validation in the constructor.
src/synthorg/engine/agent_engine.py (1)
335-336: LGTM!
The new `provider_configs` and `model_resolver` parameters are correctly wired through to `make_security_interceptor`. The optional typing with `None` defaults maintains backward compatibility, and the factory handles missing infrastructure gracefully.
Also applies to: 354-355, 1618-1620
src/synthorg/security/__init__.py (1)
34-41: LGTM!
The new security components are properly exported and added to `__all__` in alphabetical order. The re-exports enable clean imports like `from synthorg.security import SafetyClassifier, DenialTracker`.
Also applies to: 72-83, 94-97, 105-105, 112-115, 126-128
tests/unit/security/test_denial_tracker.py (1)
1-140: LGTM!
Comprehensive test coverage for `DenialTracker` including boundary conditions, reset semantics, agent independence, and invalid input validation. Good use of `@pytest.mark.parametrize` for testing various limit combinations efficiently.
web/src/pages/approvals/ApprovalDetailDrawer.tsx (3)
67-69: LGTM!
Confidence score parsing follows the same robust pattern as `ApprovalCard.tsx`: parsing from string metadata with `NaN` fallback, then formatting as a percentage only when valid.
258-279: LGTM!
The safety warning banners use consistent semantic Tailwind classes (`border-danger/30`, `bg-danger/10`, `text-danger`) matching the `ApprovalCard` badges. Good UX to show prominent banners in the detail view for blocked/suspicious classifications.
433-450: LGTM!
Clean extraction of `DescriptionSection` as a file-scoped component. The fallback logic (`stripped_description || description`) is correct, and the "(PII redacted)" marker provides clear user feedback when information stripping was applied.
src/synthorg/engine/_security_factory.py (2)
102-154: LGTM!
Clean conditional wiring pattern. The `has_providers` guard ensures LLM-based features are only instantiated when provider infrastructure is available. The `# type: ignore[arg-type]` comments are acceptable here since mypy cannot narrow through the boolean variable after the `has_providers` check. The warning function provides good operator visibility when features are configured but infrastructure is missing.
225-243: LGTM!
The `_warn_disabled_features` helper provides clear diagnostic logging when LLM-based features are enabled in config but cannot be activated due to missing provider infrastructure. The warning message explicitly lists which features are affected.
    def strip(self, text: str) -> str:
        """Replace sensitive data with tagged placeholders.

        Args:
            text: The input text to sanitize.

        Returns:
            The text with sensitive patterns replaced by
            ``[CREDENTIAL]``, ``[PII]``, ``[ID]``, or ``[EMAIL]``.
        """
        if not text:
            return text

        result = text

        # Credentials first (most specific patterns).
        for _label, pattern in _CREDENTIAL_STRIP_PATTERNS:
            result = pattern.sub(_CREDENTIAL_PLACEHOLDER, result)

        # PII patterns.
        for _label, pattern in _PII_STRIP_PATTERNS:
            result = pattern.sub(_PII_PLACEHOLDER, result)

        # UUIDs.
        result = _UUID_PATTERN.sub(_ID_PLACEHOLDER, result)

        # Internal IDs (agent-xxx, task-xxx).
        result = _INTERNAL_ID_PATTERN.sub(_ID_PLACEHOLDER, result)

        # Emails (after credentials to avoid double-matching
        # patterns that look like email-with-token).
        result = _EMAIL_PATTERN.sub(_EMAIL_PLACEHOLDER, result)

        logger.debug(
            SECURITY_INFO_STRIP_COMPLETE,
            original_length=len(text),
            stripped_length=len(result),
        )
        return result
Strip control characters in Stage 1 as well.
InformationStripper.strip() replaces secrets/PII, but it never applies _CONTROL_CHAR_RE. A description containing bidi or zero-width controls will therefore survive into stripped_description, which means hidden prompt text still reaches both the classifier prompt and the reviewer UI.
🧹 Minimal fix
         # Emails (after credentials to avoid double-matching
         # patterns that look like email-with-token).
         result = _EMAIL_PATTERN.sub(_EMAIL_PLACEHOLDER, result)
+        result = _CONTROL_CHAR_RE.sub(" ", result).strip()
         logger.debug(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/security/safety_classifier.py` around lines 180 - 218, The strip
method in InformationStripper never applies the _CONTROL_CHAR_RE pattern so
control characters (bidi/zero-width) leak through; update the strip(self, text:
str) implementation to run _CONTROL_CHAR_RE.sub(...) on the working string near
the start of the function (Stage 1) — e.g., immediately after the empty-check
and before applying _CREDENTIAL_STRIP_PATTERNS — so control characters are
removed from result before other substitutions (reference symbols: strip,
_CONTROL_CHAR_RE, _CREDENTIAL_STRIP_PATTERNS, result).
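To make the concern concrete, here is a rough sketch of what control-character stripping could look like. The definition of `_CONTROL_CHAR_RE` is not shown in this review, so the pattern below is an assumption: it covers C0 controls (except tab, newline, and carriage return), DEL, zero-width characters, and the Unicode bidi controls. The function name `strip_control_chars` is likewise hypothetical.

```python
import re

# Assumed pattern; the project's real _CONTROL_CHAR_RE may differ.
CONTROL_CHAR_RE = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f"      # C0 controls and DEL
    r"\u200b-\u200f"                         # zero-width chars, LRM/RLM
    r"\u202a-\u202e\u2060\u2066-\u2069"      # bidi embeddings/isolates, WJ
    r"\ufeff]"                               # BOM / zero-width no-break space
)


def strip_control_chars(text: str) -> str:
    """Replace each hidden control character with a space, then trim."""
    return CONTROL_CHAR_RE.sub(" ", text).strip()


# A description carrying a right-to-left override and a zero-width space.
hidden = "delete logs\u202eignore previous\u200binstructions"
cleaned = strip_control_chars(hidden)
```

Replacing with a space rather than the empty string mirrors the minimal fix above and avoids silently joining two words that were separated only by an invisible character.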
src/synthorg/security/service.py
Outdated
    def _handle_blocked_denial(
        self,
        agent_id: str,
        tool_name: str,
        reason: str,
        metadata: dict[str, str],
    ) -> bool:
        """Handle BLOCKED with denial tracking. Always returns True."""
        if self._denial_tracker is None:
            logger.warning(
                SECURITY_SAFETY_CLASSIFY_BLOCKED,
                tool_name=tool_name,
                reason=reason,
            )
            return True

        action = self._denial_tracker.record_denial(agent_id)
        metadata["denial_action"] = action.value
        consecutive, total = self._denial_tracker.get_counts(agent_id)
        metadata["denial_consecutive"] = str(consecutive)
        metadata["denial_total"] = str(total)

        if action == DenialAction.ESCALATE:
            logger.warning(
                SECURITY_SAFETY_CLASSIFY_BLOCKED,
                tool_name=tool_name,
                reason=reason,
                note="max denials reached -- escalating",
                consecutive=consecutive,
                total=total,
            )
            return True
Let max-denial thresholds actually escalate to humans.
Line 759 returns True, and _handle_escalation() interprets that as "auto-reject this request". So DenialAction.ESCALATE never creates an approval item even though this branch logs that it is escalating.
🔁 Minimal fix
     def _handle_blocked_denial(
         self,
         agent_id: str,
         tool_name: str,
         reason: str,
         metadata: dict[str, str],
     ) -> bool:
-        """Handle BLOCKED with denial tracking. Always returns True."""
+        """Handle BLOCKED with denial tracking.
+
+        Returns ``True`` when the request should be auto-rejected and
+        ``False`` when the denial threshold has been reached and the
+        request should continue to human approval.
+        """
         if self._denial_tracker is None:
             logger.warning(
                 SECURITY_SAFETY_CLASSIFY_BLOCKED,
                 tool_name=tool_name,
                 reason=reason,
@@
         if action == DenialAction.ESCALATE:
             logger.warning(
                 SECURITY_SAFETY_CLASSIFY_BLOCKED,
                 tool_name=tool_name,
                 reason=reason,
                 note="max denials reached -- escalating",
                 consecutive=consecutive,
                 total=total,
             )
-            return True
+            return False
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    def _handle_blocked_denial(
        self,
        agent_id: str,
        tool_name: str,
        reason: str,
        metadata: dict[str, str],
    ) -> bool:
        """Handle BLOCKED with denial tracking.

        Returns ``True`` when the request should be auto-rejected and
        ``False`` when the denial threshold has been reached and the
        request should continue to human approval.
        """
        if self._denial_tracker is None:
            logger.warning(
                SECURITY_SAFETY_CLASSIFY_BLOCKED,
                tool_name=tool_name,
                reason=reason,
            )
            return True

        action = self._denial_tracker.record_denial(agent_id)
        metadata["denial_action"] = action.value
        consecutive, total = self._denial_tracker.get_counts(agent_id)
        metadata["denial_consecutive"] = str(consecutive)
        metadata["denial_total"] = str(total)

        if action == DenialAction.ESCALATE:
            logger.warning(
                SECURITY_SAFETY_CLASSIFY_BLOCKED,
                tool_name=tool_name,
                reason=reason,
                note="max denials reached -- escalating",
                consecutive=consecutive,
                total=total,
            )
            return False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/security/service.py` around lines 728 - 759, The ESCALATE branch
in _handle_blocked_denial currently returns True which causes auto-rejection and
prevents human escalation; instead call the escalation handler and return its
result. Replace the final "return True" inside the if action ==
DenialAction.ESCALATE block with a call to self._handle_escalation(agent_id,
tool_name, reason, metadata) (preserving the existing metadata entries and log),
so that DenialAction.ESCALATE triggers _handle_escalation and its boolean
outcome rather than forcibly returning True.
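The return-value contract proposed in the minimal fix above (`True` means auto-reject, `False` means continue to human approval) can be exercised in isolation. The sketch below is a simplified stand-in, not the project's code: the `RETRY` member, the constructor parameters `max_consecutive` and `max_total`, and the free function `handle_blocked_denial` are all assumptions made for illustration. Only `DenialAction.ESCALATE`, `record_denial()`, `get_counts()`, and the metadata keys come from the code under review.

```python
from enum import Enum
from typing import Optional


class DenialAction(Enum):
    RETRY = "retry"        # hypothetical member; not shown in the review
    ESCALATE = "escalate"


class DenialTracker:
    """Simplified in-memory tracker of (consecutive, total) denials."""

    def __init__(self, max_consecutive: int = 3, max_total: int = 20) -> None:
        if max_consecutive < 1 or max_total < 1:
            raise ValueError("limits must be >= 1")
        self._max_consecutive = max_consecutive
        self._max_total = max_total
        self._counts: dict[str, tuple[int, int]] = {}

    def record_denial(self, agent_id: str) -> DenialAction:
        consecutive, total = self._counts.get(agent_id, (0, 0))
        consecutive, total = consecutive + 1, total + 1
        self._counts[agent_id] = (consecutive, total)
        if consecutive >= self._max_consecutive or total >= self._max_total:
            return DenialAction.ESCALATE
        return DenialAction.RETRY

    def get_counts(self, agent_id: str) -> tuple[int, int]:
        return self._counts.get(agent_id, (0, 0))


def handle_blocked_denial(
    tracker: Optional[DenialTracker],
    agent_id: str,
    metadata: dict[str, str],
) -> bool:
    """Return True to auto-reject, False to continue to human approval."""
    if tracker is None:
        return True
    action = tracker.record_denial(agent_id)
    metadata["denial_action"] = action.value
    consecutive, total = tracker.get_counts(agent_id)
    metadata["denial_consecutive"] = str(consecutive)
    metadata["denial_total"] = str(total)
    # Below the threshold: auto-reject. At the threshold: hand off to a human.
    return action != DenialAction.ESCALATE


tracker = DenialTracker(max_consecutive=2)
meta: dict[str, str] = {}
first = handle_blocked_denial(tracker, "agent-a", meta)
second = handle_blocked_denial(tracker, "agent-a", meta)
```

This follows the diff's `return False` variant. The AI-agent prompt above instead suggests delegating to `_handle_escalation`, so whichever approach is adopted should be checked against how the caller interprets the boolean.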
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/security/safety_classifier.py`:
- Around line 457-464: The all_cross accumulation can contain duplicates and
bias secrets.choice; deduplicate before selecting by replacing the list built in
all_cross with a deduped sequence (e.g., use a set or preserve-order dedupe) of
candidates filtered from providers_excluding_family(family, self._configs) and
available, then call secrets.choice on that deduped collection and return
self._registry.get for the selected key; update references to all_cross,
secrets.choice, get_family, providers_excluding_family, available, and
_registry.get accordingly.
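A small sketch of the order-preserving dedupe suggested here, assuming the candidate keys are plain strings. The helper names `dedupe_available` and `pick_unbiased` and the provider names are illustrative only; `providers_excluding_family` and `_registry` from the real module are not reproduced.

```python
import secrets
from typing import Optional


def dedupe_available(candidates: list[str], available: set[str]) -> list[str]:
    """Keep only available candidates, dropping repeats in first-seen order."""
    # dict.fromkeys preserves insertion order and discards duplicate keys.
    return list(dict.fromkeys(c for c in candidates if c in available))


def pick_unbiased(candidates: list[str], available: set[str]) -> Optional[str]:
    """Pick uniformly among distinct available candidates, or None if empty."""
    unique = dedupe_available(candidates, available)
    if not unique:
        return None
    return secrets.choice(unique)


# "provider-b" appears three times; without dedupe it would be chosen with
# probability 3/4 instead of 1/2.
accumulated = ["provider-b", "provider-b", "provider-b", "provider-c", "provider-x"]
available = {"provider-b", "provider-c"}
unique = dedupe_available(accumulated, available)
choice = pick_unbiased(accumulated, available)
```

A `set` would also remove duplicates, but `secrets.choice` needs a sequence, and keeping first-seen order makes the candidate list stable for logging and tests.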
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7c3f3689-f9d6-4ee3-a455-bbc774c39c5e
📒 Files selected for processing (4)
src/synthorg/security/safety_classifier.py
src/synthorg/security/service.py
src/synthorg/security/uncertainty.py
tests/unit/security/test_service_safety_integration.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Sandbox
- GitHub Check: Build Web
- GitHub Check: Build Backend
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do NOT use `from __future__ import annotations` in Python code; Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: `except A, B:` (no parentheses) in Python 3.14 code; ruff enforces this.
All public functions in Python must have type hints. Use mypy strict mode for type-checking.
Use Google-style docstrings on all public classes and functions in Python. This is enforced by ruff D rules.
Use NotBlankStr (from core.types) for all identifier/name fields in Python—including optional (NotBlankStr | None) and tuple variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in Python—prefer structured concurrency over bare create_task.
Python line length must be 88 characters (enforced by ruff).
Python functions must be under 50 lines, files under 800 lines.
Handle errors explicitly in Python, never silently swallow exceptions.
Always use variable name `logger` (not `_logger` or `log`) for the logging instance in Python.
Lint Python with `uv run ruff check src/ tests/`. Auto-fix with `uv run ruff check src/ tests/ --fix`. Format with `uv run ruff format src/ tests/`.
Type-check Python with `uv run mypy src/ tests/` (strict mode).
Files:
tests/unit/security/test_service_safety_integration.py
src/synthorg/security/uncertainty.py
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Test markers in Python: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Python test coverage must be 80% minimum (enforced in CI).
Use `@pytest.mark.parametrize` for testing similar cases in Python.
Use test-provider, test-small-001, etc. vendor-agnostic names in Python tests.
Use Hypothesis property-based testing in Python with `@given` + `@settings` decorators. Configure profiles: ci (deterministic, max_examples=10, derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Control via HYPOTHESIS_PROFILE env var.
When Hypothesis finds a failure in Python tests, fix the underlying bug and add an `@example(...)` decorator to permanently cover the case in CI.
Never skip, dismiss, or ignore flaky Python tests—fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep(). For tasks that must block indefinitely, use asyncio.Event().wait() instead of asyncio.sleep(large_number).
Run Python unit tests with `uv run python -m pytest tests/ -m unit -n 8`.
Run Python integration tests with `uv run python -m pytest tests/ -m integration -n 8`.
Run Python e2e tests with `uv run python -m pytest tests/ -m e2e -n 8`.
Files:
tests/unit/security/test_service_safety_integration.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
tests/unit/security/test_service_safety_integration.py
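The guideline above about mocking `time.monotonic()` for timing-sensitive tests can be illustrated with the standard library alone. This is a sketch under assumptions: `Cooldown` is a hypothetical helper invented for the example, and the check is written as a plain function rather than a pytest test with the project's markers.

```python
import time
from unittest import mock


class Cooldown:
    """Hypothetical timing-sensitive helper used only for illustration."""

    def __init__(self, seconds: float) -> None:
        self._seconds = seconds
        self._started = time.monotonic()

    def expired(self) -> bool:
        return time.monotonic() - self._started >= self._seconds


def check_cooldown_expiry() -> tuple[bool, bool]:
    # Feed a scripted clock so the check never sleeps or depends on wall time.
    with mock.patch("time.monotonic", side_effect=[100.0, 101.0, 105.0]):
        cooldown = Cooldown(seconds=5.0)  # constructor reads 100.0
        early = cooldown.expired()        # reads 101.0, 1s elapsed
        late = cooldown.expired()         # reads 105.0, 5s elapsed
    return early, late
```

Because the clock values are scripted, the outcome is deterministic regardless of machine load, which is the point of the guideline.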
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Every Python module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
Never use `import logging`, `logging.getLogger()`, or `print()` in Python application code. Exceptions: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, observability/http_handler.py may use stdlib logging and print.
Use event name constants from synthorg.observability.events domain modules (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Use structured logging with `logger.info(EVENT, key=value)` syntax in Python; never `logger.info('msg %s', val)`
All error paths in Python must log at WARNING or ERROR with context before raising.
All state transitions in Python must log at INFO.
DEBUG logging is for object creation, internal flow, and entry/exit of key functions in Python.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned Python code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small aliases. Exceptions: Operations design page, .claude/ skill files, third-party imports, provider presets (user-facing runtime data).
Library reference in docs/api/ is auto-generated via mkdocstrings + Griffe (AST-based).
Files:
src/synthorg/security/uncertainty.py
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/security/uncertainty.py
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
🧠 Learnings (19)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/api/**/*.py : Authentication uses JWT + API key. Approval gate integration for high-risk operations.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Applied to files:
tests/unit/security/test_service_safety_integration.py
src/synthorg/security/uncertainty.py
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Applied to files:
tests/unit/security/test_service_safety_integration.py
src/synthorg/security/uncertainty.py
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
tests/unit/security/test_service_safety_integration.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
tests/unit/security/test_service_safety_integration.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Applied to files:
tests/unit/security/test_service_safety_integration.py
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/security/service.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) in Python 3.14 code—ruff enforces this.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-04-06T06:43:24.031Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:43:24.031Z
Learning: Applies to **/*.py : Handle errors explicitly in Python, never silently swallow exceptions.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.
Applied to files:
src/synthorg/security/service.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/api/**/*.py : Authentication uses JWT + API key. Approval gate integration for high-risk operations.
Applied to files:
src/synthorg/security/safety_classifier.py
🔇 Additional comments (21)
src/synthorg/security/safety_classifier.py (5)
1-61: LGTM: Module structure and imports are well-organized. The module docstring clearly documents the two-stage classification design, fail-safe behavior, and cross-family provider selection. Imports are properly organized with `TYPE_CHECKING` guard for type-only imports.
180-222: LGTM: Control character stripping now applied. The `strip()` method correctly applies `_CONTROL_CHAR_RE.sub()` at line 215, removing bidi overrides and zero-width characters that could hide prompt injection payloads. This addresses the previous review feedback.
490-526: LGTM: Secure prompt construction with proper escaping and truncation. The method correctly applies defense-in-depth:
- Strips sensitive data from `tool_name` and `action_type` (even though they should already be safe)
- XML-escapes all values before interpolation
- Truncates after escaping to prevent orphaned escape sequences
556-602: LGTM: Robust parsing with proper sanitization. The `_parse_tool_call` method correctly:
- Validates classification against the allowed enum values
- Strips control characters before whitespace to handle all-control-char reasons
- Provides a safe fallback when the reason becomes empty
- Truncates to prevent oversized metadata
368-391: LGTM: Correct fail-safe error handling with PEP 758 syntax. The exception handling correctly:
- Uses PEP 758 `except MemoryError, RecursionError:` syntax
- Re-raises system errors that shouldn't be caught
- Returns `SUSPICIOUS` on classification failure (fail-safe: neither auto-rejects nor marks as safe)
- Logs the exception with context before returning
src/synthorg/security/uncertainty.py (5)
1-80: LGTM: Well-structured module with clear design invariants. The module docstring clearly documents the no-external-dependencies constraint, graceful skip behavior, and individual timeout guards. The `UncertaintyResult` model has appropriate validation constraints.
85-186: LGTM: Correct similarity implementations with proper edge case handling. The similarity functions correctly handle edge cases:
- Single response → 1.0 (nothing to compare)
- All-empty responses → 1.0 (identical emptiness)
- Empty-vs-empty pairs → 1.0 (counted as identical)
- Smoothed IDF prevents zero weights for shared terms
242-270: LGTM: Provider deduplication correctly implemented. The code properly deduplicates resolved model variants by `provider_name` before checking against `min_providers`, ensuring genuine cross-provider comparison rather than counting aliases from the same provider multiple times.
350-389: Intentional catch-all inside TaskGroup is acceptable but worth noting. The catch-all `except Exception:` at line 370 includes system errors like `MemoryError` and `RecursionError`. The docstring correctly explains this is intentional to prevent `ExceptionGroup` propagation from `TaskGroup`. For a security feature, graceful degradation (fewer responses) is preferable to crashing the entire flow.
300-330: LGTM: Reasonable confidence scoring with appropriate weighting. The 60/40 weighting favoring TF-IDF over Jaccard is reasonable since TF-IDF better captures semantic similarity. The `min(1.0, ...)` guard prevents any floating-point edge cases from producing invalid confidence scores.
tests/unit/security/test_service_safety_integration.py (5)
1-87: LGTM: Well-structured test helpers. The helper functions provide clean abstractions for creating test fixtures. `_make_service` correctly assembles the `SecOpsService` with mock dependencies and real `RuleEngine`/`AuditLog` instances for realistic integration testing.
92-196: LGTM: Comprehensive safety classifier integration tests. The tests correctly verify:
- `BLOCKED` → `DENY` without creating approval items
- `SUSPICIOUS` and `SAFE` → `ESCALATE` with appropriate metadata
- Classifier errors still create approval items (fail-safe)
201-255: LGTM: Uncertainty checker integration tests with correct string metadata assertions. The tests correctly verify that uncertainty metrics are stored as strings in metadata and that checker errors don't prevent approval item creation.
425-528: LGTM: Comprehensive denial tracker integration tests. The tests correctly verify:
- Retry flow with denial tracking
- Escalation after max consecutive/total denials
- Consecutive count reset on SAFE/SUSPICIOUS
- Total limit enforcement across resets
- Fallback to immediate DENY without tracker
541-584: LGTM: Permission tier tests verify classifier bypass behavior. The tests correctly verify that `SAFE_TOOL` tier bypasses the classifier entirely while `CLASSIFIER_GATED` runs the full classification flow.
src/synthorg/security/service.py (6)
102-156: LGTM: Constructor correctly accepts new optional dependencies. The constructor properly declares and stores the three new dependencies (`safety_classifier`, `uncertainty_checker`, `denial_tracker`) with appropriate type hints and docstring documentation.
582-590: LGTM: Uncertainty check now only uses stripped description. The code correctly only runs the uncertainty check when `stripped_description` is available in metadata (lines 585-586), preventing raw `verdict.reason` from being broadcast to providers. This addresses the previous review feedback.
725-771: LGTM: Denial escalation now correctly proceeds to human approval. The `_handle_blocked_denial` method correctly returns:
- `True` (auto-reject) when no tracker or `RETRY`
- `False` (proceed to human approval) when `ESCALATE`
This addresses the previous review feedback about `DenialAction.ESCALATE` not creating approval items.
773-812: LGTM: Uncertainty serialization correctly distinguishes skipped from high-confidence. The method correctly:
- Always records `provider_count`
- Only serializes `confidence_score` when 2+ providers compared
- Sets `uncertainty_check_skipped` flag when insufficient data
- Handles errors gracefully with `uncertainty_check_error` flag
This addresses the previous review feedback about misleading 1.0 confidence scores.
641-690: LGTM: Safety classifier integration with proper tier handling and fail-safe. The method correctly:
- Bypasses classifier for `SAFE_TOOL` tier
- Populates metadata with classification results
- Fails safely to review (returns `False`) on errors
- Uses appropriate logging for all paths
692-723: LGTM: Classification processing with correct denial tracker integration. The method correctly:
- Routes `BLOCKED` + `auto_reject_blocked` to denial handling
- Resets consecutive count for `SAFE` and `SUSPICIOUS` (both indicate the agent tried something acceptable)
- Logs warning for `SUSPICIOUS` to surface concerning patterns
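The denial-tracking behavior described in these comments (retry until a consecutive or total limit is reached, then escalate to a human; reset only the consecutive count on SAFE/SUSPICIOUS) can be sketched as follows. This is a minimal in-memory illustration, not the PR's implementation: the field names, default limits, and the `record_denial`/`reset_consecutive` method names are assumptions; only `DenialAction`, `RETRY`, `ESCALATE`, and the True/False return contract come from the review text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class DenialAction(Enum):
    RETRY = "retry"
    ESCALATE = "escalate"


@dataclass
class _Counts:
    consecutive: int = 0
    total: int = 0


@dataclass
class DenialTracker:
    """In-memory sketch; limits and method names are assumptions."""

    max_consecutive: int = 3
    max_total: int = 5
    _counts: Dict[str, _Counts] = field(default_factory=dict)

    def record_denial(self, agent_id: str) -> DenialAction:
        counts = self._counts.setdefault(agent_id, _Counts())
        counts.consecutive += 1
        counts.total += 1
        over_consecutive = counts.consecutive >= self.max_consecutive
        over_total = counts.total >= self.max_total
        if over_consecutive or over_total:
            return DenialAction.ESCALATE
        return DenialAction.RETRY

    def reset_consecutive(self, agent_id: str) -> None:
        # Called on SAFE/SUSPICIOUS; the total count is deliberately kept.
        if agent_id in self._counts:
            self._counts[agent_id].consecutive = 0


def handle_blocked_denial(tracker: Optional[DenialTracker], agent_id: str) -> bool:
    """Return True to auto-reject, False to proceed to human approval."""
    if tracker is None:
        return True  # no tracker configured: immediate DENY
    return tracker.record_denial(agent_id) is DenialAction.RETRY
```

Keeping the total count across resets means an agent cannot avoid escalation indefinitely by interleaving acceptable actions between blocked ones.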
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/security/safety_classifier.py`:
- Around line 510-512: Truncate the raw description before HTML-escaping to
avoid cutting an escape sequence: compute max_desc_chars as currently done,
apply the truncation to the unescaped string (the variable used to build
safe_desc) so you shorten raw input (append the "... [truncated]" marker) and
only afterwards call html.escape(...) to produce safe_desc; update the logic
around max_desc_chars, safe_desc and the html.escape call in
safety_classifier.py (the variables max_desc_chars, safe_desc and the
html.escape invocation) accordingly.
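A minimal sketch of the ordering this comment asks for, assuming a standalone helper rather than the actual method in `safety_classifier.py`: truncate the raw text first, then escape. `build_safe_description` and the constant name are hypothetical; `max_desc_chars` and the "... [truncated]" marker are taken from the comment.

```python
import html

_TRUNCATION_MARKER = "... [truncated]"


def build_safe_description(raw: str, max_desc_chars: int) -> str:
    """Truncate the unescaped text, then escape the shortened result."""
    if len(raw) > max_desc_chars:
        raw = raw[:max_desc_chars] + _TRUNCATION_MARKER
    # Escaping last guarantees every entity (&lt;, &amp;, ...) is complete.
    return html.escape(raw)


def escape_then_truncate(raw: str, max_desc_chars: int) -> str:
    """Counter-example: slicing escaped text can cut an entity in half."""
    return html.escape(raw)[:max_desc_chars]
```

For the input `"a<b"` with a limit of 3, the counter-example yields `a&l`, an orphaned entity fragment. One consequence of truncating first is that the escaped output can be longer than `max_desc_chars`, since each escaped character expands to several characters.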
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8566c4ae-5913-4069-beed-abd839d4a0c2
📒 Files selected for processing (1)
src/synthorg/security/safety_classifier.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Deploy Preview
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: No `from __future__ import annotations` - Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: use `except A, B:` (no parentheses) - ruff enforces this on Python 3.14.
Type hints required on all public functions, with mypy strict mode enforcement.
Docstrings required on all public classes and functions using Google style - enforced by ruff D rules.
Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement. For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries.
Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Adopted conventions: use allow_inf_nan=False in all ConfigDict declarations; use `@computed_field` for derived values instead of storing + validating redundant fields; use NotBlankStr (from core.types) for all identifier/name fields.
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff enforced).
Functions must be < 50 lines, files must be < 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries (user input, external APIs, config files).
Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Variable name must always be `logger` (not `_logger`, not `log`).
Never use `import logging` / `logging.getLogger()` / `print()`...
Files:
src/synthorg/security/safety_classifier.py
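The immutability rule for non-Pydantic internal collections (`copy.deepcopy()` at construction plus `MappingProxyType` wrapping) might look like the sketch below. `ToolRegistry`, its `tools` property, and `with_tool` are hypothetical names used only for illustration; they are not taken from the project.

```python
import copy
from types import MappingProxyType
from typing import Any, Dict, Mapping


class ToolRegistry:
    """Hypothetical registry illustrating the read-only convention."""

    def __init__(self, tools: Mapping[str, Dict[str, Any]]) -> None:
        # Deep-copy at construction so later caller mutations do not leak in,
        # then wrap in a read-only view to block item assignment.
        self._tools = MappingProxyType(copy.deepcopy(dict(tools)))

    @property
    def tools(self) -> Mapping[str, Dict[str, Any]]:
        return self._tools

    def with_tool(self, name: str, spec: Dict[str, Any]) -> "ToolRegistry":
        # Create a new registry instead of mutating this one.
        return ToolRegistry({**self._tools, name: spec})
```

Note that `MappingProxyType` only makes the top-level mapping read-only; nested dicts remain mutable, which is why the deep copy at the boundary still matters.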
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/security/safety_classifier.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Package structure follows: api/ (REST + WebSocket, RFC 9457 errors, auth/, workflows, reports), backup/ (scheduler, retention, handlers), budget/ (cost tracking, quotas, risk scoring), communication/ (message bus, channels), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task engine, workspace, workflow execution), hr/ (hiring, agent registry, performance, evaluation), memory/ (MemoryBackend, retrieval pipeline, consolidation, embedding, procedural), persistence/ (PersistenceBackend, SQLite, repositories), observability/ (logging, events, redaction, shipping), providers/ (LLM abstraction, routing, health), security/ (rule engine, audit, policy, risk scoring), templates/ (company templates, presets, packs), tools/ (registry, built-in tools, MCP, sandbox).
Files:
src/synthorg/security/safety_classifier.py
🧠 Learnings (14)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/api/**/*.py : Authentication uses JWT + API key. Approval gate integration for high-risk operations.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-04-06T13:43:45.380Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T13:43:45.380Z
Learning: Applies to src/**/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) - ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions
Applied to files:
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.
Applied to files:
src/synthorg/security/safety_classifier.py
🔇 Additional comments (8)
src/synthorg/security/safety_classifier.py (8)
1-61: LGTM! Clean module setup with clear docstring explaining design invariants (LLM only sees stripped text, fail-safe to SUSPICIOUS). Logger and imports properly organized per project conventions.
63-107: LGTM! Good reuse of existing `CREDENTIAL_PATTERNS` and `PII_PATTERNS` from the detector modules. The control character regex comprehensively covers bidi overrides, zero-width chars, and invisible characters used in prompt injection attacks.
109-166: LGTM! Well-designed enums and result model. `SafetyClassifierResult` correctly uses frozen config with `allow_inf_nan=False`, `NotBlankStr` for the reason field, and a non-negative constraint on duration.
180-222: Control character stripping is now correctly included. The `strip()` method now applies `_CONTROL_CHAR_RE.sub()` at line 215, addressing the previous concern about bidi/zero-width characters surviving into the stripped description.
225-284: LGTM! Well-designed tool schema with chain-of-thought support via the optional "concerns" field. The system prompt appropriately warns the LLM about sanitized placeholders and instructs it to ignore embedded instructions in field values.
337-392: LGTM! The `classify()` method correctly implements fail-safe behavior: errors default to `SUSPICIOUS` rather than `SAFE` or `BLOCKED`, which is the right security posture. System errors (`MemoryError`, `RecursionError`) are appropriately re-raised per project convention.
443-468: Cross-family selection with deduplication is correctly implemented. The deduplication via `list(set(all_cross))` at line 463 addresses the previous concern about biased selection. The fallback to the first available provider when no cross-family option exists is appropriate for single-family deployments.
556-602: LGTM! The `_parse_tool_call` method properly sanitizes the LLM-returned reason: control characters are stripped, empty results fall back to a default message (preventing `NotBlankStr` validation failure), and the reason is truncated to a reasonable length. The fail-safe return of `SUSPICIOUS` for invalid classifications maintains the security posture.
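The sanitization steps described for `_parse_tool_call` (strip control characters, fall back to a default when nothing is left, truncate, and treat unknown classifications as `SUSPICIOUS`) can be sketched with the standard library alone. The character ranges, the default message, the 200-character limit, and the `Verdict` enum are assumptions made for illustration; the project's actual regex and limits are not shown in this review.

```python
import re
from enum import Enum

# Assumed character set, not the project's actual pattern.
_CONTROL_CHAR_RE = re.compile(
    r"[\u0000-\u001f\u007f-\u009f"  # C0 and C1 control characters
    r"\u200b-\u200f"  # zero-width characters and LRM/RLM marks
    r"\u202a-\u202e\u2066-\u2069"  # bidi embeddings, overrides, isolates
    r"\ufeff]"  # zero-width no-break space (BOM)
)
_DEFAULT_REASON = "No reason provided"  # hypothetical fallback text
_MAX_REASON_CHARS = 200  # hypothetical limit


class Verdict(Enum):
    """Hypothetical stand-in for the classifier's result enum."""

    SAFE = "safe"
    SUSPICIOUS = "suspicious"
    BLOCKED = "blocked"


def sanitize_reason(raw: object) -> str:
    """Strip control characters first, then whitespace, then truncate."""
    text = raw if isinstance(raw, str) else ""
    text = _CONTROL_CHAR_RE.sub("", text).strip()
    if not text:
        return _DEFAULT_REASON  # keeps a not-blank field valid
    return text[:_MAX_REASON_CHARS]


def parse_classification(value: object) -> Verdict:
    """Unknown or malformed values fail safe to SUSPICIOUS."""
    try:
        return Verdict(value)
    except ValueError:
        return Verdict.SUSPICIOUS
```

Stripping control characters before whitespace matters for inputs made entirely of invisible characters: they collapse to the empty string and hit the fallback instead of producing a blank reason.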
…k for approval gates

Stage 1 (InformationStripper): strips PII, secrets, UUIDs, emails, and internal IDs from the reviewer-facing description. Reuses existing credential and PII patterns from the rule engine detectors.

Stage 2 (SafetyClassifier): LLM classifier categorizes escalated actions as safe/suspicious/blocked. Blocked actions are auto-rejected. Suspicious actions get a warning badge in the reviewer UI.

Cross-provider uncertainty check (UncertaintyChecker): sends the same prompt to multiple providers, compares responses via keyword overlap (Jaccard) and TF-IDF cosine similarity, produces a confidence score. Low confidence signals potential hallucination.

Both features integrate into SecOpsService._handle_escalation() and propagate results to the frontend via ApprovalItem.metadata. Factory wiring in make_security_interceptor() with full backward compatibility (both features default to disabled).

Frontend: warning badge on ApprovalCard for suspicious actions, safety banner + confidence score + stripped description toggle on ApprovalDetailDrawer.

Closes #847
Closes #701
Pre-reviewed by 5 agents, 18 findings addressed:

Security (Critical):
- XML-escape all interpolated values in LLM prompt (prevent tag injection)
- Strip action_type/tool_name through InformationStripper before LLM
- Pass stripped description (not raw verdict.reason) to uncertainty checker
- Extend control-char regex to cover Unicode bidi overrides

Correctness (Major):
- Remove MemoryError/RecursionError re-raise inside TaskGroup (prevents ExceptionGroup propagation)
- Clamp confidence score to max 1.0 (floating-point edge case)
- Filter empty/None provider responses from similarity computation
- Add uncertainty_check_error sentinel to metadata on failure
- Fix auto_reject_blocked=False path (was always auto-rejecting)
- Change _parse_response param from object to CompletionResponse
- Change _run_safety_classifier return type to bool (clearer contract)

Frontend:
- Replace IIFEs with precomputed variables (ESLint React Compiler rule)
- Add NaN guard for parseFloat on confidence scores
- Remove misleading 'Show original' toggle (description IS stripped)

Tests:
- Add factory wiring tests for SafetyClassifier/UncertaintyChecker
- Add auto_reject_blocked=False test
- Fix timeout test: asyncio.Event().wait() instead of sleep(100)
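The XML-escaping finding can be illustrated with a short sketch. `xml_field` and its `max_len` default are hypothetical; the only library call used is `xml.sax.saxutils.escape`, which escapes `&`, `<`, and `>`. Truncating the raw value before escaping and wrapping is an assumption of this example, chosen so a cut can never split an entity or drop a closing tag.

```python
from xml.sax.saxutils import escape


def xml_field(tag: str, value: str, max_len: int = 200) -> str:
    """Wrap an untrusted value in <tag>...</tag> for an LLM prompt."""
    # Truncate first, then escape, then assemble the element.
    truncated = value[:max_len]
    return f"<{tag}>{escape(truncated)}</{tag}>"
```

With this in place, a value such as `</reason><x>` stays inert text inside the element instead of closing it early.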
…permission tiers

Review findings addressed:
- Fix NotBlankStr ValidationError from control-char-only LLM reason
- Wire provider_registry/configs/resolver through AgentEngine
- Move XML truncation before assembly to prevent orphaned tags
- Log SUSPICIOUS classification via SECURITY_SAFETY_CLASSIFY_SUSPICIOUS
- Add warning logs for missing provider/model fallback scenarios
- Warn when LLM features enabled but no provider infrastructure
- Randomize cross-family provider selection (secrets.choice)
- Remove unused provider_configs from UncertaintyChecker
- Fix misleading IDF comment and dead code in TF-IDF
- Handle 'blocked' classification in frontend (badge + banner)
- Fix confidenceRaw falsy check to handle '0' scores
- Add backend low_confidence flag for frontend threshold sync
- Extend _CONTROL_CHAR_RE with U+2028-2029, U+3164, U+2800
- Fix misleading 'toggle' comment in DescriptionSection
- Replace type: ignore with assert preconditions in service
- Use itertools.combinations for pairwise iteration
- Add PII redacted indicator in DescriptionSection
- Document unused concerns field in tool schema

New features (#847):
- DenialTracker: max 3 consecutive / 20 total denials per agent with RETRY/ESCALATE action routing
- PermissionTier: SAFE_TOOL bypasses classifier, CLASSIFIER_GATED runs full LLM classification
- SafetyClassifierConfig: max_consecutive_denials, max_total_denials, safe_tool_categories fields
- 8 new event constants for denial tracking and permission tiers
- 23 new tests (15 denial tracker + 8 integration)
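A minimal sketch of how a denial tracker with these limits could route actions. Only the limits (3 consecutive, 20 total) and the RETRY/ESCALATE names come from the commit message; the class layout, method names, the `>=` threshold comparison, and the reset-on-success rule are assumptions, and this version mutates state in place for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum


class DenialAction(Enum):
    RETRY = "retry"
    ESCALATE = "escalate"


@dataclass
class _Counts:
    consecutive: int = 0
    total: int = 0


@dataclass
class DenialTracker:
    """In-memory, per-agent denial counters (illustrative only)."""

    max_consecutive: int = 3
    max_total: int = 20
    _counts: dict[str, _Counts] = field(default_factory=dict)

    def record_denial(self, agent_id: str) -> DenialAction:
        counts = self._counts.setdefault(agent_id, _Counts())
        counts.consecutive += 1
        counts.total += 1
        if (
            counts.consecutive >= self.max_consecutive
            or counts.total >= self.max_total
        ):
            return DenialAction.ESCALATE
        return DenialAction.RETRY

    def record_success(self, agent_id: str) -> None:
        # Assumed rule: a success resets the streak but not the lifetime total.
        counts = self._counts.get(agent_id)
        if counts is not None:
            counts.consecutive = 0
```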
- Strip bidi/zero-width control chars in InformationStripper.strip()
- Never send raw verdict.reason to uncertainty fan-out (skip when no stripped_description available)
- DenialAction.ESCALATE now proceeds to human approval (return False) instead of auto-rejecting
- Only serialize confidence_score when 2+ providers compared (skipped checks no longer produce misleading 1.0)
- Deduplicate resolve_all candidates by provider_name before fan-out
46e8bc4 to 26357c5
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/synthorg/security/safety_classifier.py`:
- Around line 480-488: The fallback that returns the raw provider_name when no
model is configured can cause provider.reject errors; change the behavior in the
function that computes the model hint (the block that currently logs
SECURITY_SAFETY_CLASSIFY_ERROR and returns provider_name) to return the
SUSPICIOUS sentinel instead of provider_name when the provider has no models
list, and update the warning message accordingly; ensure callers that expect a
model hint (and any downstream use of driver.complete()) handle the SUSPICIOUS
value the same way as the "no-provider" case so no LLM call is attempted.
In `@web/src/pages/approvals/ApprovalCard.tsx`:
- Around line 31-37: ApprovalCard currently treats approval.metadata as
Record<string,string> and accesses safety_classification, confidence_score, and
low_confidence without narrowing the type; create a narrow type (e.g.,
SafetyMetadata with safety_classification?: 'safe' | 'suspicious' | 'blocked',
confidence_score?: string, low_confidence?: 'true' | 'false') in
web/src/api/types.ts or a local types file and then narrow or assert
approval.metadata to SafetyMetadata (or implement a simple type guard) before
computing safetyClassification, confidenceScore, isSuspicious, isBlocked, and
showLowConfidence so those variables use the narrowed types and avoid unsafe
string/undefined handling.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 49125ef1-280d-4632-a3d9-62c1e77cd836
📒 Files selected for processing (17)
- src/synthorg/engine/_security_factory.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/observability/events/security.py
- src/synthorg/security/__init__.py
- src/synthorg/security/config.py
- src/synthorg/security/denial_tracker.py
- src/synthorg/security/safety_classifier.py
- src/synthorg/security/service.py
- src/synthorg/security/uncertainty.py
- tests/unit/engine/test_security_factory_safety.py
- tests/unit/security/test_denial_tracker.py
- tests/unit/security/test_information_stripper.py
- tests/unit/security/test_safety_classifier.py
- tests/unit/security/test_service_safety_integration.py
- tests/unit/security/test_uncertainty_checker.py
- web/src/pages/approvals/ApprovalCard.tsx
- web/src/pages/approvals/ApprovalDetailDrawer.tsx
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Build Sandbox
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Test (Python 3.14)
- GitHub Check: Dashboard Test
- GitHub Check: Dependency Review
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
web/src/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx,js,jsx}: Always use `createLogger` from `@/lib/logger` -- never bare `console.warn`/`console.error`/`console.debug` in application code
Logger variable name must always be `const log` (e.g. `const log = createLogger('module-name')`)
Pass dynamic/untrusted values as separate arguments to logger methods (not interpolated into the message string) so they go through `sanitizeArg`
Attacker-controlled fields inside structured objects must be wrapped in `sanitizeForLog()` before embedding in log calls
Files:
- web/src/pages/approvals/ApprovalCard.tsx
- web/src/pages/approvals/ApprovalDetailDrawer.tsx
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx}: Use Tailwind semantic classes (`text-foreground`, `bg-card`, `text-accent`, `text-success`, `bg-danger`, etc.) or CSS variables (`var(--so-*)`) for colors; NEVER hardcode hex values in `.tsx`/`.ts` files
Use `font-sans` or `font-mono` (Geist tokens) for typography; NEVER set `fontFamily` directly in `.tsx`/`.ts` files
Use density-aware tokens (`p-card`, `gap-section-gap`, `gap-grid-gap`) or standard Tailwind spacing; NEVER hardcode pixel values for layout spacing in components
Use token variables (`var(--so-shadow-card-hover)`, `border-border`, `border-bright`) for shadows and borders; NEVER hardcode values in `.tsx`/`.ts` files
Use `@/lib/motion` presets for Framer Motion transition durations; NEVER hardcode transition durations
CSS side-effect imports in TypeScript 6 require type declarations -- add `/// <reference types="vite/client" />` at the top of files with CSS imports
ALWAYS reuse existing components from web/src/components/ui/ before creating new ones. NEVER hardcode hex colors, font-family, pixel spacing, or Framer Motion transitions - use design tokens and `@/lib/motion` presets. A PostToolUse hook (scripts/check_web_design_system.py) enforces these rules on every Edit/Write to web/src/.
Files:
- web/src/pages/approvals/ApprovalCard.tsx
- web/src/pages/approvals/ApprovalDetailDrawer.tsx
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: No `from __future__ import annotations` - Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: use `except A, B:` (no parentheses) - ruff enforces this on Python 3.14.
Type hints required on all public functions, with mypy strict mode enforcement.
Docstrings required on all public classes and functions using Google style - enforced by ruff D rules.
Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement. For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries.
Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Adopted conventions: use allow_inf_nan=False in all ConfigDict declarations; use `@computed_field` for derived values instead of storing + validating redundant fields; use NotBlankStr (from core.types) for all identifier/name fields.
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff enforced).
Functions must be < 50 lines, files must be < 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries (user input, external APIs, config files).
Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Variable name must always be `logger` (not `_logger`, not `log`).
Never use `import logging` / `logging.getLogger()` / `print()`...
Files:
- src/synthorg/security/__init__.py
- src/synthorg/security/denial_tracker.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/config.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/security/service.py
- src/synthorg/observability/events/security.py
- src/synthorg/security/uncertainty.py
- src/synthorg/security/safety_classifier.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
- src/synthorg/security/__init__.py
- src/synthorg/security/denial_tracker.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/config.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/security/service.py
- src/synthorg/observability/events/security.py
- src/synthorg/security/uncertainty.py
- src/synthorg/security/safety_classifier.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Package structure follows: api/ (REST + WebSocket, RFC 9457 errors, auth/, workflows, reports), backup/ (scheduler, retention, handlers), budget/ (cost tracking, quotas, risk scoring), communication/ (message bus, channels), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task engine, workspace, workflow execution), hr/ (hiring, agent registry, performance, evaluation), memory/ (MemoryBackend, retrieval pipeline, consolidation, embedding, procedural), persistence/ (PersistenceBackend, SQLite, repositories), observability/ (logging, events, redaction, shipping), providers/ (LLM abstraction, routing, health), security/ (rule engine, audit, policy, risk scoring), templates/ (company templates, presets, packs), tools/ (registry, built-in tools, MCP, sandbox).
Files:
- src/synthorg/security/__init__.py
- src/synthorg/security/denial_tracker.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/config.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/security/service.py
- src/synthorg/observability/events/security.py
- src/synthorg/security/uncertainty.py
- src/synthorg/security/safety_classifier.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`. Minimum coverage 80% (enforced in CI).
Async testing: asyncio_mode = 'auto' - no manual `@pytest.mark.asyncio` needed. Default timeout: 30 seconds per test (do not add per-file markers; non-default overrides like timeout(60) allowed). Parallelism: ALWAYS include -n 8 when running pytest locally, never run tests sequentially.
Prefer `@pytest.mark.parametrize` for testing similar cases. Use Hypothesis for property-based testing with `@given` + `@settings` decorators.
Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, test-provider, test-small-001. Vendor names may only appear in: Operations design page, .claude/ files, third-party import paths, provider presets (runtime user-facing data).
Property-based testing workflow: CI runs 10 deterministic examples per property test (derandomize=True). Random fuzzing locally: HYPOTHESIS_PROFILE=dev (1000 examples) or HYPOTHESIS_PROFILE=fuzz (10000 examples, no deadline). When Hypothesis finds a failure, fix the underlying bug and add `@example(...)` decorator to permanently cover the case in CI. Never skip flaky tests - fix them fundamentally by mocking time.monotonic()/asyncio.sleep() or using asyncio.Event().wait() for indefinite blocking.
Files:
- tests/unit/security/test_information_stripper.py
- tests/unit/security/test_safety_classifier.py
- tests/unit/security/test_denial_tracker.py
- tests/unit/security/test_uncertainty_checker.py
- tests/unit/engine/test_security_factory_safety.py
- tests/unit/security/test_service_safety_integration.py
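The `asyncio.Event().wait()` guideline for timeout tests can be sketched as follows. `never_responds` and `call_with_timeout` are hypothetical helpers written for this example; the point is that the fake provider blocks on an event that is never set, so the test does not depend on an arbitrary long sleep.

```python
import asyncio


async def never_responds() -> str:
    # Blocks until cancelled: the event is never set, unlike sleep(100),
    # which would eventually return and hide a broken timeout.
    await asyncio.Event().wait()
    return "unreachable"


async def call_with_timeout(timeout: float) -> str:
    """Return the provider's answer, or "timeout" if it does not respond in time."""
    try:
        return await asyncio.wait_for(never_responds(), timeout=timeout)
    except asyncio.TimeoutError:
        return "timeout"
```

A test can then assert on the result of `call_with_timeout(0.01)` and finish in milliseconds.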
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
- tests/unit/security/test_information_stripper.py
- tests/unit/security/test_safety_classifier.py
- tests/unit/security/test_denial_tracker.py
- tests/unit/security/test_uncertainty_checker.py
- tests/unit/engine/test_security_factory_safety.py
- tests/unit/security/test_service_safety_integration.py
🧠 Learnings (61)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/api/**/*.py : Authentication uses JWT + API key. Approval gate integration for high-risk operations.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
📚 Learning: 2026-03-27T12:44:29.466Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-27T12:44:29.466Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from `web/src/components/ui/` (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup/StaggerItem) before creating new ones
Applied to files:
- web/src/pages/approvals/ApprovalCard.tsx
- web/src/pages/approvals/ApprovalDetailDrawer.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/src/**/*.{ts,tsx} : Always reuse existing components from web/src/components/ui/ (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast/ToastContainer, Skeleton variants, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup/StaggerItem, Drawer, form fields, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor) before creating new components
Applied to files:
- web/src/pages/approvals/ApprovalCard.tsx
- web/src/pages/approvals/ApprovalDetailDrawer.tsx
📚 Learning: 2026-04-02T12:21:16.739Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-02T12:21:16.739Z
Learning: Applies to web/src/**/*.{tsx,ts} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, DeptHealthBar, ProgressGauge, StatPill, Avatar, Button, Toast, Skeleton, EmptyState, ErrorBoundary, ConfirmDialog, CommandPalette, InlineEdit, AnimatedPresence, StaggerGroup, Drawer, InputField, SelectField, SliderField, ToggleField, TaskStatusIndicator, PriorityBadge, ProviderHealthBadge, TokenUsageBar, CodeMirrorEditor, SegmentedControl, ThemeToggle, LiveRegion, MobileUnsupportedOverlay, LazyCodeMirrorEditor, TagInput, MetadataGrid, ProjectStatusBadge, ContentTypeBadge)
Applied to files:
- web/src/pages/approvals/ApprovalCard.tsx
- web/src/pages/approvals/ApprovalDetailDrawer.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.{ts,tsx} : Do NOT build card-with-header layouts from scratch; use `<SectionCard>`
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-30T10:20:08.544Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:20:08.544Z
Learning: Applies to web/src/**/*.{ts,tsx} : Web dashboard shadows/borders: use token variables (var(--so-shadow-card-hover), border-border, border-bright); never hardcode shadow or border values
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-30T10:41:40.176Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-30T10:41:40.176Z
Learning: Applies to web/src/**/*.{ts,tsx} : ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones; refer to design system inventory (StatusBadge, MetricCard, Sparkline, SectionCard, AgentCard, etc.)
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Applies to web/src/components/ui/**/*.{ts,tsx} : Import `cn` from `@/lib/utils` for conditional class merging in component files
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-31T14:31:11.894Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:31:11.894Z
Learning: Applies to web/src/**/*.{ts,tsx} : Use React 19, TypeScript 6.0+, and design system tokens from shadcn/ui + Tailwind CSS 4 + Radix UI in web dashboard
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Applies to web/src/components/ui/**/*.{ts,tsx} : For Base UI primitives, import from specific subpaths (e.g. `import { Dialog } from 'base-ui/react/dialog'`) and use the component's `render` prop for polymorphism
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T13:43:45.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T13:43:45.381Z
Learning: Applies to web/package.json : Web dashboard: Node.js 22+, TypeScript 6.0+. Key dependencies: React 19, react-router, shadcn/ui, Base UI, Tailwind CSS 4, Zustand, tanstack/react-query, xyflow/react, dagrejs/dagre, d3-force, dnd-kit, Recharts, Framer Motion, cmdk-base, js-yaml, Axios, Lucide React, Storybook 10, Vitest, Playwright, fast-check.
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-27T22:32:26.927Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-27T22:32:26.927Z
Learning: Applies to web/src/components/ui/*.{tsx,ts} : For new shared React components: place in web/src/components/ui/ with kebab-case filename, create .stories.tsx with all states, export props as TypeScript interface, use design tokens exclusively
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-04-06T06:45:22.965Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2026-04-06T06:45:22.965Z
Learning: Do NOT recreate status dots inline -- use `<StatusBadge>` from `@/components/ui/status-badge`
Applied to files:
web/src/pages/approvals/ApprovalCard.tsx
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Applied to files:
- src/synthorg/security/__init__.py
- tests/unit/security/test_information_stripper.py
- tests/unit/security/test_denial_tracker.py
- src/synthorg/security/denial_tracker.py
- tests/unit/engine/test_security_factory_safety.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/config.py
- src/synthorg/security/service.py
- src/synthorg/observability/events/security.py
- tests/unit/security/test_service_safety_integration.py
- src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Applied to files:
- src/synthorg/security/__init__.py
- tests/unit/security/test_denial_tracker.py
- src/synthorg/security/denial_tracker.py
- tests/unit/engine/test_security_factory_safety.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/config.py
- src/synthorg/security/service.py
- src/synthorg/observability/events/security.py
- tests/unit/security/test_service_safety_integration.py
- src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...
Applied to files:
- src/synthorg/security/__init__.py
- src/synthorg/engine/agent_engine.py
📚 Learning: 2026-04-06T13:43:45.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T13:43:45.381Z
Learning: Applies to src/synthorg/**/*.py : Package structure follows: api/ (REST + WebSocket, RFC 9457 errors, auth/, workflows, reports), backup/ (scheduler, retention, handlers), budget/ (cost tracking, quotas, risk scoring), communication/ (message bus, channels), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task engine, workspace, workflow execution), hr/ (hiring, agent registry, performance, evaluation), memory/ (MemoryBackend, retrieval pipeline, consolidation, embedding, procedural), persistence/ (PersistenceBackend, SQLite, repositories), observability/ (logging, events, redaction, shipping), providers/ (LLM abstraction, routing, health), security/ (rule engine, audit, policy, risk scoring), templates/ (company templates, presets, packs), tools/ (registry, built-in tools, MCP, sandbox).
Applied to files:
- src/synthorg/security/__init__.py
- src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/api/**/*.py : API package (api/): Litestar REST + WebSocket with controllers, guards, channels, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint, provider management endpoint (CRUD + test + presets), backup endpoint, RFC 9457 structured errors, AppState hot-reload slots, service auto-wiring (Phase 1 at construction, Phase 2 on startup), lifecycle helpers
Applied to files:
src/synthorg/security/__init__.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Applied to files:
- src/synthorg/security/__init__.py
- src/synthorg/engine/_security_factory.py
- src/synthorg/security/config.py
- src/synthorg/security/service.py
- tests/unit/security/test_service_safety_integration.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.
Applied to files:
- src/synthorg/engine/_security_factory.py
- src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-04-01T09:37:49.451Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:37:49.451Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-04-06T13:43:45.380Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T13:43:45.380Z
Learning: Applies to src/**/*.py : Models: use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Adopted conventions: use allow_inf_nan=False in all ConfigDict declarations; use computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr (from core.types) for all identifier/name fields.
Applied to files:
src/synthorg/security/config.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-04-01T09:09:43.948Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T09:09:43.948Z
Learning: Applies to **/*.py : Use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement in non-Pydantic internal collections (registries, BaseTool)
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-31T21:07:37.469Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.469Z
Learning: Applies to src/synthorg/providers/**/*.py : Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-16T19:13:36.562Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:36.562Z
Learning: Applies to src/synthorg/providers/**/*.py : RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code — it's handled by the base class.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Never implement retry logic in provider subclasses or calling code — it is handled automatically by `BaseCompletionProvider` with `RetryConfig` and `RateLimiterConfig` per-provider
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-17T18:52:05.142Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T18:52:05.142Z
Learning: Applies to src/synthorg/providers/**/*.py : All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code — it's handled by the base class.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/**/*.py : All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/providers/**/*.py : All provider calls go through `BaseCompletionProvider` which applies retry + rate limiting automatically; never implement retry logic in driver subclasses or calling code
Applied to files:
src/synthorg/engine/agent_engine.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-04-06T13:43:45.380Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T13:43:45.380Z
Learning: Applies to src/**/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) - ruff enforces this on Python 3.14.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.
Applied to files:
src/synthorg/security/service.py
src/synthorg/security/safety_classifier.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-04-06T13:43:45.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T13:43:45.381Z
Learning: Applies to src/**/*.py : Event names must always use constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-31T16:09:24.320Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:09:24.320Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging
Applied to files:
src/synthorg/observability/events/security.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
tests/unit/security/test_service_safety_integration.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
tests/unit/security/test_service_safety_integration.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).
Applied to files:
tests/unit/security/test_service_safety_integration.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/api/**/*.py : Authentication uses JWT + API key. Approval gate integration for high-risk operations.
Applied to files:
src/synthorg/security/safety_classifier.py
🔇 Additional comments (44)
src/synthorg/security/config.py (3)
211-249: LGTM! Well-structured configuration model.
The `SafetyClassifierConfig` follows project conventions with `frozen=True`, `allow_inf_nan=False`, proper field constraints (`ge=1` for denial limits, `gt=0.0` for timeout), and uses `NotBlankStr | None` for the optional model identifier. The `safe_tool_categories` default provides sensible safe-tool bypass examples.
251-279: LGTM! Proper configuration for cross-provider uncertainty checks.
The `UncertaintyCheckConfig` correctly constrains `min_providers` to `ge=2` (meaningful comparison requires multiple providers), bounds `low_confidence_threshold` within `[0.0, 1.0]`, and uses `NotBlankStr | None` for the model reference.
325-330: LGTM! Proper integration into SecurityConfig.
Using `Field(default_factory=...)` correctly instantiates nested frozen models, and both features default to disabled, matching the PR's stated behavior.
src/synthorg/engine/agent_engine.py (2)
329-330: LGTM! Proper wiring for LLM-based security infrastructure.
The new `provider_configs` and `model_resolver` parameters are correctly typed, stored as private attributes, and enable optional LLM-based security features (safety classifier, uncertainty checker) when provider infrastructure is available.
Also applies to: 348-349
1571-1573: LGTM! Correct forwarding to security factory.
The three provider-related parameters are correctly forwarded to `make_security_interceptor`, matching the factory signature shown in the context snippet (`_security_factory.py:48-70`).
web/src/pages/approvals/ApprovalCard.tsx (1)
142-174: LGTM! Proper badge rendering with design tokens and accessibility.
The badges correctly use semantic color tokens (`border-danger/30`, `bg-danger/10`, `text-warning`), include proper `aria-label` attributes for screen readers, and follow the component's existing styling patterns.
src/synthorg/security/__init__.py (2)
34-83: LGTM! Clean public API expansion.
All new security components (`SafetyClassifier`, `UncertaintyChecker`, `DenialTracker`, and their associated types/configs) are properly imported and will be accessible via `from synthorg.security import ...`.
94-128: LGTM! Complete `__all__` updates.
All new exports are correctly added to `__all__` in alphabetical order, maintaining the existing convention.
tests/unit/security/test_denial_tracker.py (1)
1-139: LGTM! Comprehensive test coverage for DenialTracker.
The test suite thoroughly covers:
- Basic denial flow (first denial → RETRY)
- Threshold triggers (consecutive and total limits → ESCALATE)
- Reset semantics (consecutive reset preserves total)
- Agent isolation (independent counts per agent)
- Input validation (invalid constructor args)
- Parametrized edge cases for various limit combinations
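The behaviors listed above can be illustrated with a minimal sketch. This is not the PR's `DenialTracker`: the class name `DenialTrackerSketch`, the default limits, and the use of `>=` for the threshold comparison are assumptions made for illustration; only the RETRY/ESCALATE outcomes, the consecutive/total counters, and the `(0, 0)` result for unknown agents come from the review text.

```python
from dataclasses import dataclass
from enum import Enum


class DenialAction(Enum):
    RETRY = "retry"
    ESCALATE = "escalate"


@dataclass
class _Counts:
    consecutive: int = 0
    total: int = 0


class DenialTrackerSketch:
    """Hypothetical in-memory tracker keyed by agent ID."""

    def __init__(self, max_consecutive: int = 3, max_total: int = 20) -> None:
        # Assumed defaults; the real limits come from SafetyClassifierConfig.
        if max_consecutive < 1 or max_total < 1:
            raise ValueError("limits must be >= 1")
        self._max_consecutive = max_consecutive
        self._max_total = max_total
        self._counts: dict[str, _Counts] = {}

    def record_denial(self, agent_id: str) -> DenialAction:
        counts = self._counts.setdefault(agent_id, _Counts())
        counts.consecutive += 1
        counts.total += 1
        # Escalate when either limit is reached (assumed to be inclusive).
        if (
            counts.consecutive >= self._max_consecutive
            or counts.total >= self._max_total
        ):
            return DenialAction.ESCALATE
        return DenialAction.RETRY

    def reset_consecutive(self, agent_id: str) -> None:
        # Only the consecutive counter is cleared; the total is preserved.
        counts = self._counts.get(agent_id)
        if counts is not None:
            counts.consecutive = 0

    def get_counts(self, agent_id: str) -> tuple[int, int]:
        counts = self._counts.get(agent_id)
        if counts is None:
            return (0, 0)
        return (counts.consecutive, counts.total)
```

Because the total survives a reset, an agent that alternates between safe and blocked actions would still reach escalation eventually under this sketch.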
src/synthorg/security/denial_tracker.py (3)
50-76: LGTM! Well-designed denial tracker with proper validation.
The `DenialTracker` class correctly validates constructor inputs, uses appropriate data structures (`__slots__` for memory efficiency on the internal class), and the thread-safety claim is accurate for asyncio's single-threaded event loop model.
78-116: LGTM! Correct denial recording logic with appropriate logging.
The `record_denial` method properly initializes agent counts on first use, increments both counters atomically, and logs at appropriate levels: `INFO` for non-escalating denials, `WARNING` for escalation triggers. The escalation condition correctly checks both consecutive and total limits.
118-149: LGTM! Reset and query methods are correct.
The `reset_consecutive` method properly guards against unknown agents and already-zero counts, logging only when an actual reset occurs. The `get_counts` method safely returns `(0, 0)` for unknown agents.
tests/unit/security/test_safety_classifier.py (4)
52-84: LGTM! Well-structured test helpers with vendor-agnostic naming.
The `_make_classifier` helper correctly wires mock providers with vendor-agnostic names (`provider-a`, `family-a`, `model-a-1`) and provides clean configuration for various test scenarios.
90-139: LGTM! Classification result tests cover all enum values.
Tests verify correct mapping from LLM tool-call arguments (`"safe"`, `"suspicious"`, `"blocked"`) to the `SafetyClassification` enum, and that `reason` and `classification_duration_ms` are properly populated.
145-183: LGTM! PII stripping tests verify security boundary.
Tests confirm that sensitive data (SSN pattern) is not visible to the LLM and that the `stripped_description` in the result excludes detected identifiers (task IDs, emails).
189-293: LGTM! Comprehensive fail-safe error handling tests.
Tests cover all failure modes (provider exception, timeout, no providers, invalid classification value, missing tool call) and verify the fail-safe behavior returns `SUSPICIOUS` classification, aligning with the PR's conservative failure default strategy.
tests/unit/security/test_information_stripper.py (1)
1-188: LGTM! Thorough test coverage for InformationStripper.
The test suite comprehensively covers:
- Clean text: Unchanged passthrough for non-sensitive content
- Credentials: AWS keys, SSH keys, Bearer tokens, API keys, GitHub PATs
- PII: SSN and credit card patterns
- Identifiers: UUIDs (various formats), emails, internal IDs (agent-xxx, task-xxx)
- Mixed content: Multiple pattern types in a single string
- Context preservation: Non-sensitive structural text remains intact
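For readers unfamiliar with the stripping stage, the categories above can be approximated with a small regex pass. This is a sketch only: the function name `strip_sensitive`, the placeholder strings, and the exact patterns are assumptions and will differ from the PR's `InformationStripper`, which covers more cases (SSH keys, GitHub PATs, credit cards).

```python
import re

# (pattern, placeholder) pairs, applied in order. All hypothetical.
_PATTERNS = (
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters/digits.
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[CREDENTIAL]"),
    # HTTP bearer tokens.
    (re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+"), "[CREDENTIAL]"),
    # US Social Security number shape.
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[PII]"),
    # Canonical 8-4-4-4-12 UUIDs.
    (
        re.compile(
            r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
            r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
        ),
        "[ID]",
    ),
    # Email addresses (deliberately loose).
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[EMAIL]"),
    # Internal identifiers such as agent-xxx and task-xxx.
    (re.compile(r"\b(?:agent|task)-[A-Za-z0-9]+\b"), "[ID]"),
)


def strip_sensitive(text: str) -> str:
    """Replace sensitive-looking substrings with placeholders."""
    for pattern, placeholder in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

Replacing matches with placeholders rather than deleting them keeps the sentence structure intact, which is what the "context preservation" tests check for.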
web/src/pages/approvals/ApprovalDetailDrawer.tsx (3)
67-69: LGTM: Confidence score parsing handles edge cases correctly.
The nullish coalescing check on `confidenceRaw` combined with `Number.isNaN()` validation ensures safe rendering. The formatted percentage label is only shown when a valid score exists.
258-274: LGTM: Safety warning banners use semantic design tokens.
The banners correctly use `text-danger`/`bg-danger/10` and `text-warning`/`bg-warning/10` with consistent styling. The conditional rendering based on `safety_classification` metadata aligns with the backend's classification values.
433-450: LGTM: DescriptionSection safely renders stripped description.
The component correctly prefers `stripped_description` from metadata when available and adds a "(PII redacted)" indicator. Values are rendered via JSX text interpolation, avoiding XSS risks.
tests/unit/security/test_uncertainty_checker.py (4)
1-19: LGTM: Well-structured test module with proper imports and vendor-agnostic naming.
The test file correctly imports both public APIs (`UncertaintyChecker`, `UncertaintyResult`) and internal functions (`_compute_keyword_overlap`, `_compute_tfidf_cosine_similarity`) for comprehensive unit testing. Vendor-agnostic model names like `"test-small-001"` comply with project guidelines.
77-129: LGTM: Similarity function tests cover essential edge cases.
Good coverage of identical, disjoint, partial overlap, single-item, and empty-string scenarios for both Jaccard and TF-IDF computations. The `pytest.approx` usage with appropriate tolerances ensures stable floating-point comparisons.
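As a rough illustration of the Jaccard half of that comparison, the sketch below averages pairwise keyword overlap across responses. The tokenizer, the function names, and the choice to return 1.0 for empty or single-response inputs are assumptions, not the behavior of the PR's `_compute_keyword_overlap`.

```python
import re
from itertools import combinations


def keyword_set(text: str) -> set[str]:
    """Lowercase alphanumeric tokens; a deliberately simple tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """|A intersect B| / |A union B|, treating two empty sets as identical."""
    if not a and not b:
        return 1.0  # assumption: nothing to disagree about
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(responses: list[str]) -> float:
    """Average Jaccard similarity over every pair of responses."""
    sets = [keyword_set(r) for r in responses]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 1.0  # assumption: fewer than two responses means no signal
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

For example, `"red green blue"` and `"green blue yellow"` share two of four distinct keywords, giving an overlap of 0.5.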
179-201: LGTM: Skip condition tests verify graceful degradation.
Tests correctly verify that single-provider and no-model-ref scenarios return `confidence_score=1.0` with `provider_count=0`, matching the design invariant that insufficient data defaults to "no concern" rather than blocking.
255-285: LGTM: Result model validation tests verify Pydantic constraints.
Tests confirm immutability (frozen model) and boundary validation (score must be 0.0–1.0). Using `pytest.raises(Exception)` is acceptable here since the specific Pydantic exception type is an implementation detail.
tests/unit/engine/test_security_factory_safety.py (2)
31-93: LGTM: Factory safety classifier wiring tests are comprehensive.
Tests correctly verify that `_safety_classifier` is wired only when both `enabled=True` and provider infrastructure is available. The three test scenarios (enabled+providers, disabled, no providers) cover the conditional logic branches in the factory.
95-147: LGTM: Uncertainty checker wiring tests verify resolver dependency.
The tests correctly validate that `UncertaintyChecker` requires both provider infrastructure and `model_resolver` to be wired. The "no resolver" test confirms graceful degradation when the dependency is missing.
src/synthorg/engine/_security_factory.py (2)
102-155: LGTM: Conditional wiring of LLM-based security services is well-structured.
The `has_providers` guard correctly ensures all LLM-dependent components (`LlmSecurityEvaluator`, `SafetyClassifier`, `UncertaintyChecker`) are only instantiated when provider infrastructure is available. The additional `model_resolver` check for `UncertaintyChecker` aligns with its multi-provider design requirement.
225-243: LGTM: Warning helper provides clear operational feedback.
The `_warn_disabled_features` function logs which LLM-based features are configured but inactive due to missing provider infrastructure. This aids troubleshooting when features don't work as expected.
src/synthorg/security/uncertainty.py (3)
54-79: LGTM: UncertaintyResult model follows Pydantic v2 conventions.
The model correctly uses `ConfigDict(frozen=True, allow_inf_nan=False)`, bounded `Field` constraints for scores, and `NotBlankStr` for the reason field per project conventions.
242-270: LGTM: Provider deduplication ensures true cross-provider comparison.
The deduplication by `provider_name` before the `min_providers` check correctly ensures that multiple model variants from the same provider don't satisfy the cross-provider requirement. This addresses the concern from previous reviews.
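The ordering matters here: deduplicate first, then count. A minimal sketch of that idea follows; the `Candidate` type and `select_distinct_providers` helper are hypothetical names, and returning an empty list to signal "skip the check" is an assumption about how the caller is told to skip.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    provider_name: str
    model_id: str


def select_distinct_providers(
    candidates: list[Candidate],
    min_providers: int = 2,
) -> list[Candidate]:
    """Keep the first candidate per provider, then enforce the minimum."""
    first_per_provider: dict[str, Candidate] = {}
    for candidate in candidates:
        # setdefault keeps the earliest entry and preserves insertion order.
        first_per_provider.setdefault(candidate.provider_name, candidate)
    distinct = list(first_per_provider.values())
    # Two models from one provider do not count as a cross-provider check.
    return distinct if len(distinct) >= min_providers else []
```

Counting before deduplicating would let two model variants from `provider-a` pass a `min_providers=2` gate, which is exactly the case the review calls out.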
350-398: LGTM: TaskGroup exception handling prevents ExceptionGroup propagation.
The broad `except Exception` inside `_call_provider` is intentional and correctly documented: catching and logging all exceptions prevents them from propagating as an `ExceptionGroup` that would escape outer exception handlers. Individual provider failures degrade gracefully to fewer comparison responses.
tests/unit/security/test_service_safety_integration.py (2)
92-196: LGTM: Safety classifier integration tests cover all classification outcomes.
Tests comprehensively verify BLOCKED (auto-reject), SUSPICIOUS (metadata enrichment), and SAFE (normal flow) paths. The error handling test confirms fail-safe behavior where classifier failures still create approval items for human review.
425-528: LGTM: Denial tracker tests verify retry, escalation, and reset logic.
Excellent coverage of denial tracking behavior: retry with remaining attempts, escalation at threshold, consecutive reset on SAFE, total limit across resets, and immediate DENY without tracker. These tests validate the deny-and-continue mechanism per the linked issue #847.
src/synthorg/security/service.py (4)
114-148: LGTM: Constructor extended with safety subsystem dependencies.
The new optional parameters (`safety_classifier`, `uncertainty_checker`, `denial_tracker`) follow the existing dependency injection pattern. The updated docstring clearly documents each parameter's role.
559-590: LGTM: Safety classifier and uncertainty check integration protects against PII leakage.
The code correctly:
- Only runs the uncertainty check when `stripped_description` is available (line 586)
- Never broadcasts raw `verdict.reason` to providers
- Uses the stripped description for both the approval item and uncertainty check
This addresses the past review concern about leaking unredacted text.
725-771: LGTM: Denial tracking correctly distinguishes RETRY from ESCALATE.
The `_handle_blocked_denial` method now returns `False` for `DenialAction.ESCALATE` (line 761), allowing the request to proceed to human approval instead of auto-rejecting. This addresses the past review concern where max-denial thresholds weren't actually escalating to humans.
773-812: LGTM: Uncertainty metadata correctly distinguishes skipped from real confidence.
The code now:
- Always records `provider_count` (lines 784-786)
- Only serializes `confidence_score` when 2+ providers compared (lines 790-793)
- Marks single-provider/skipped cases with `uncertainty_check_skipped=true` (line 798)
This prevents reviewers from being misled by a fake 1.0 confidence score.
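The three rules above can be sketched as a small metadata builder. The key names `provider_count`, `confidence_score`, `uncertainty_check_skipped`, and `low_confidence` appear in this review; the function name, the two-decimal formatting, and the default threshold of 0.5 are assumptions for illustration.

```python
def build_uncertainty_metadata(
    confidence_score: float,
    provider_count: int,
    low_confidence_threshold: float = 0.5,  # assumed default
) -> dict[str, str]:
    """Serialize an uncertainty result into string-valued approval metadata."""
    # provider_count is always recorded, even when the check was skipped.
    metadata = {"provider_count": str(provider_count)}
    if provider_count >= 2:
        # A score is only meaningful when at least two providers were compared.
        metadata["confidence_score"] = f"{confidence_score:.2f}"
        is_low = confidence_score < low_confidence_threshold
        metadata["low_confidence"] = "true" if is_low else "false"
    else:
        # Do not emit the placeholder 1.0 score; flag the skip instead.
        metadata["uncertainty_check_skipped"] = "true"
    return metadata
```

With this shape, a UI that renders a confidence badge only when `confidence_score` is present never shows a misleading 100% for a check that did not run.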
src/synthorg/observability/events/security.py (1)
54-77: LGTM: New event constants follow established naming conventions.
The constants correctly use the `security.<subsystem>.<event>` naming pattern and `Final[str]` type annotations. The coverage includes lifecycle events (start/complete/error), outcome events (blocked/suspicious/low_confidence), and state transitions (recorded/escalated/reset).
src/synthorg/security/safety_classifier.py (6)
213-216: Control character stripping now properly applied in Stage 1.
The `_CONTROL_CHAR_RE.sub()` call at line 215 addresses the previous review feedback: bidi overrides and zero-width characters are now removed before the stripped description reaches the classifier or reviewer UI.
506-513: Truncation correctly applied before HTML escaping.
The truncation logic now operates on the raw description before `html.escape()` is called, preventing the possibility of cutting mid-escape-sequence (e.g., `&amp;` → `&am`). This addresses previous review feedback.
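The ordering issue is easy to reproduce with the standard library alone. The two helper names below are hypothetical and exist only to contrast the orderings; `html.escape` is the real stdlib function.

```python
import html


def truncate_then_escape(raw: str, limit: int) -> str:
    """Safe order: cut the raw text first, then escape what remains."""
    return html.escape(raw[:limit])


def escape_then_truncate(raw: str, limit: int) -> str:
    """Unsafe order: the cut can land inside an entity such as '&amp;'."""
    return html.escape(raw)[:limit]


raw = "a & b"
good = truncate_then_escape(raw, 5)  # whole entity survives
bad = escape_then_truncate(raw, 5)   # entity is cut in half
```

Here `good` is `"a &amp; b"` while `bad` is `"a &am"`, a dangling entity that would make the surrounding markup malformed. The trade-off is that the safe order can produce output longer than `limit` characters after escaping.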
149-166: Well-structured result model with appropriate validation.
The `SafetyClassifierResult` model correctly uses `frozen=True` and `allow_inf_nan=False` per project conventions, `NotBlankStr` for the reason field, and a proper `Field(ge=0.0)` constraint for the duration.
267-283: Effective prompt injection mitigations in system prompt.
The system prompt explicitly informs the LLM that field values are sanitized with placeholders and instructs it not to follow embedded instructions. Combined with the tool-only response requirement, this provides good defense-in-depth against prompt injection attempts.
376-391: Correct PEP 758 exception syntax and fail-safe error handling.
The exception handling correctly uses the PEP 758 comma-separated syntax for the `MemoryError, RecursionError` re-raise, and the generic exception handler implements proper fail-safe behavior: logging with full context via `logger.exception()` and returning a `SUSPICIOUS` classification rather than failing silently or marking as safe.
582-590: Robust reason sanitization with appropriate fallback.
The reason parsing correctly strips control characters before truncation, uses `.strip()` to normalize whitespace, and provides a sensible default when the result is empty. This ensures `NotBlankStr` validation will pass and prevents malformed or invisible-only reason strings from reaching the UI.
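The sanitization order described above (strip control characters, truncate, trim, fall back) can be sketched as follows. The regex, the 200-character limit, the default text, and the name `sanitize_reason` are all assumptions; the PR's `_CONTROL_CHAR_RE` may cover a different character set.

```python
import re

# ASCII control characters plus zero-width and bidi-control code points.
_CONTROL_CHARS = re.compile(
    r"[\x00-\x1f\x7f\u200b-\u200f\u202a-\u202e\u2066-\u2069]"
)
_MAX_REASON_LEN = 200                  # assumed limit
_DEFAULT_REASON = "No reason provided"  # assumed fallback text


def sanitize_reason(raw: object) -> str:
    """Return a non-blank, printable reason string."""
    text = raw if isinstance(raw, str) else ""
    # 1. Remove invisible/control characters first so they cannot pad the
    #    string past the truncation point.
    cleaned = _CONTROL_CHARS.sub("", text)
    # 2. Truncate, then 3. trim surrounding whitespace.
    cleaned = cleaned[:_MAX_REASON_LEN].strip()
    # 4. Fall back so a NotBlankStr-style validator always passes.
    return cleaned or _DEFAULT_REASON
```

Stripping before truncating means a reason made up entirely of zero-width characters collapses to the fallback instead of producing an invisible, non-empty string.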
| logger.warning( | ||
| SECURITY_SAFETY_CLASSIFY_ERROR, | ||
| note=( | ||
| f"No model configured for provider {provider_name!r}, " | ||
| "using provider name as model hint" | ||
| ), | ||
| provider_name=provider_name, | ||
| ) | ||
| return provider_name |
🧹 Nitpick | 🔵 Trivial
Verify fallback model hint is accepted by providers.
When no model is configured and the provider has no models list, the code falls back to using the provider name itself as the model hint. While this is logged as a warning, some providers may reject an invalid model identifier, causing driver.complete() to fail — which is handled by the fail-safe path but could be avoided.
Consider whether this fallback should instead return SUSPICIOUS immediately (similar to the no-provider case) rather than attempting an LLM call that's likely to fail.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/synthorg/security/safety_classifier.py` around lines 480 - 488, The
fallback that returns the raw provider_name when no model is configured can
cause provider.reject errors; change the behavior in the function that computes
the model hint (the block that currently logs SECURITY_SAFETY_CLASSIFY_ERROR and
returns provider_name) to return the SUSPICIOUS sentinel instead of
provider_name when the provider has no models list, and update the warning
message accordingly; ensure callers that expect a model hint (and any downstream
use of driver.complete()) handle the SUSPICIOUS value the same way as the
"no-provider" case so no LLM call is attempted.
| const safetyClassification = approval.metadata.safety_classification | ||
| const isSuspicious = safetyClassification === 'suspicious' | ||
| const isBlocked = safetyClassification === 'blocked' | ||
| const confidenceRaw = approval.metadata.confidence_score | ||
| const confidenceScore = confidenceRaw != null ? parseFloat(confidenceRaw) : NaN | ||
| const showLowConfidence = approval.metadata.low_confidence === 'true' | ||
| || (!Number.isNaN(confidenceScore) && confidenceScore < 0.5) |
🧹 Nitpick | 🔵 Trivial
Type-unsafe metadata access — consider narrowing the type.
The ApprovalItem.metadata is typed as Record<string, string> (per web/src/api/types.ts), but the code accesses specific properties (safety_classification, confidence_score, low_confidence) without type narrowing. This compiles because index access on Record<string, string> returns string | undefined, but the specific property names and expected values aren't validated.
Consider defining an extended interface or a type guard:
🛠️ Suggested type narrowing
// In web/src/api/types.ts or a local type file
interface SafetyMetadata {
  safety_classification?: 'safe' | 'suspicious' | 'blocked'
  confidence_score?: string
  low_confidence?: 'true' | 'false'
}
// Then in component:
const safetyMeta = approval.metadata as SafetyMetadata
const isSuspicious = safetyMeta.safety_classification === 'suspicious'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/src/pages/approvals/ApprovalCard.tsx` around lines 31 - 37, ApprovalCard
currently treats approval.metadata as Record<string,string> and accesses
safety_classification, confidence_score, and low_confidence without narrowing
the type; create a narrow type (e.g., SafetyMetadata with
safety_classification?: 'safe' | 'suspicious' | 'blocked', confidence_score?:
string, low_confidence?: 'true' | 'false') in web/src/api/types.ts or a local
types file and then narrow or assert approval.metadata to SafetyMetadata (or
implement a simple type guard) before computing safetyClassification,
confidenceScore, isSuspicious, isBlocked, and showLowConfidence so those
variables use the narrowed types and avoid unsafe string/undefined handling.
🤖 I have created a release *beep* *boop*

---

## [0.6.3](v0.6.2...v0.6.3) (2026-04-06)

### Features

* backend CRUD + multi-user permissions ([#1081](#1081), [#1082](#1082)) ([#1094](#1094)) ([93e469b](93e469b))
* in-dashboard team editing + budget rebalance on pack apply ([#1093](#1093)) ([35977c0](35977c0)), closes [#1079](#1079) [#1080](#1080)
* tiered rate limiting, NotificationSink protocol, in-dashboard notifications ([#1092](#1092)) ([df2142c](df2142c)), closes [#1077](#1077) [#1078](#1078) [#849](#849)
* two-stage safety classifier and cross-provider uncertainty check for approval gates ([#1090](#1090)) ([0b2edee](0b2edee)), closes [#847](#847) [#701](#701)

### Refactoring

* memory pipeline improvements ([#1075](#1075), [#997](#997)) ([#1091](#1091)) ([a048a4c](a048a4c))

### Documentation

* add OpenCode parity setup and hookify rule documentation ([#1095](#1095)) ([52e877a](52e877a))

### Maintenance

* bump vite from 8.0.3 to 8.0.4 in /web in the all group across 1 directory ([#1088](#1088)) ([1e86ca6](1e86ca6))
* tune ZAP DAST scan -- auth, timeouts, rules, report artifacts ([#1097](#1097)) ([82bf0e1](82bf0e1)), closes [#1096](#1096)

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Summary
Adds two pre-review safety layers to the approval gate system:
Two-stage safety classifier (feat: two-stage safety classifier with information stripping for approval gates #847): Stage 1 strips PII, secrets, UUIDs, emails, and internal IDs from the reviewer-facing description (reuses existing credential/PII patterns from rule engine detectors). Stage 2 runs an LLM classifier (cross-family provider selection) to categorize the escalated action as safe/suspicious/blocked. Blocked actions are auto-rejected. Suspicious actions get a warning badge in the reviewer UI.
Cross-provider uncertainty check (feat: cross-provider uncertainty check for hallucination detection at approval gates #701): Sends the same prompt to N providers (configurable, default 2), compares responses via Jaccard keyword overlap + TF-IDF cosine similarity, produces a confidence score (0-1). Low confidence signals potential hallucination and is surfaced in the reviewer UI.
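The two layers above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `safety_classifier.py` or `uncertainty.py`: the regex patterns, the smoothed IDF variant, the equal 0.5/0.5 weighting, and every function name here are hypothetical, and the sketch compares only two responses.

```python
from __future__ import annotations

import math
import re
from collections import Counter

# Hypothetical patterns; the PR reuses the rule engine's existing detectors.
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
_UUID = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def strip_sensitive(text: str) -> str:
    """Stage 1: replace emails and UUIDs with fixed placeholders."""
    return _UUID.sub("[UUID]", _EMAIL.sub("[EMAIL]", text))


def _tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(a: str, b: str) -> float:
    """Keyword overlap: size of intersection over size of union."""
    sa, sb = set(_tokens(a)), set(_tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def tfidf_cosine(a: str, b: str) -> float:
    """Cosine similarity of TF-IDF vectors built from the two responses."""
    docs = [Counter(_tokens(a)), Counter(_tokens(b))]
    vocab = set(docs[0]) | set(docs[1])
    n = len(docs)
    # Smoothed IDF (assumed variant): ln((1 + n) / (1 + df)) + 1, always > 0.
    idf = {
        t: math.log((1 + n) / (1 + sum(t in d for d in docs))) + 1.0
        for t in vocab
    }
    va, vb = [{t: d[t] * idf[t] for t in d} for d in docs]
    dot = sum(w * vb.get(t, 0.0) for t, w in va.items())
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def confidence(a: str, b: str) -> float:
    """Combine both similarities into a 0-1 score (equal weights assumed)."""
    return 0.5 * jaccard(a, b) + 0.5 * tfidf_cosine(a, b)
```

With this shape, two providers that return near-identical answers score close to 1, while answers with no shared keywords score 0, which is the case the reviewer UI would flag as low confidence.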
Both features integrate into SecOpsService._handle_escalation() and propagate results to the frontend via ApprovalItem.metadata. Both default to disabled (enabled=False) for full backward compatibility.
Changes
Backend (Python)
src/synthorg/security/safety_classifier.py -- InformationStripper, SafetyClassification enum, SafetyClassifierResult model, SafetyClassifier class
src/synthorg/security/uncertainty.py -- UncertaintyResult model, UncertaintyChecker class with pure-Python TF-IDF similarity
src/synthorg/security/config.py -- SafetyClassifierConfig, UncertaintyCheckConfig, extended SecurityConfig
src/synthorg/security/service.py -- _handle_escalation() integration, _run_safety_classifier(), _run_uncertainty_check()
src/synthorg/engine/_security_factory.py -- factory wiring for all three LLM-based services
src/synthorg/observability/events/security.py -- 11 new event constants
src/synthorg/security/__init__.py -- new public exports
Frontend (React)
web/src/pages/approvals/ApprovalCard.tsx -- suspicious warning badge, low confidence indicator
web/src/pages/approvals/ApprovalDetailDrawer.tsx -- safety warning banner, confidence score in metadata grid, stripped description display
Tests
tests/unit/security/test_information_stripper.py (20 tests)
tests/unit/security/test_safety_classifier.py (12 tests)
tests/unit/security/test_uncertainty_checker.py (18 tests)
tests/unit/security/test_service_safety_integration.py (10 tests)
tests/unit/engine/test_security_factory_safety.py (5 tests)
Security considerations
action_type and tool_name are stripped through InformationStripper before reaching the LLM
verdict.reason) to avoid broadcasting PII/secrets to all providers
auto_reject_blocked defaults to True (secure path is default)
No dangerouslySetInnerHTML in frontend -- all values rendered via JSX text interpolation
Test plan
uv run python -m pytest tests/ -n 8 -m unit)
npm --prefix web run test)
Review coverage
Pre-reviewed by 5 specialized agents (code-reviewer, security-reviewer, pr-test-analyzer, issue-resolution-verifier, silent-failure-hunter). 18 findings addressed across 2 commits.
Closes #847
Closes #701